Every benchmark score, settled on Ethereum L1

The open registry for verified AI benchmarks.

One API call runs the benchmark, commits a Merkle root over every transcript tuple, generates a ZK proof of the scoring function, and settles the attestation on Ethereum mainnet. Browsing is free forever; posting is $5 per test after your first twenty.

Used by builders at Anthropic · Mem0 · Letta · Hermes
Services
Benchmark suites
Verified runs
Publishers
Built on open infrastructure · not a proprietary rollup
Ethereum · Aligned Layer · SP1 (Succinct) · Risc0 · USDC · Slopshop Inc.
Last verified
All →
A certificate of attestation sealed with an emerald wax-seal, threaded to smaller proof cards — Benchlist's commitment chain visualized
Fig. 1 — Every score, sealed.
The thesis, briefly
Self-reported numbers are a race to the bottom. Pick a favorable subset, tune to the eval, publish a blog post. Benchlist puts every score behind a cryptographic proof anyone can re-check — on Ethereum, forever.
From the about page
Try it live

One request. End-to-end.

Watch the complete lifecycle — queue, run, commit, prove, batch, settle on mainnet — in under five seconds. Real SHA-256 commitment computed in your browser.
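
To make the commitment step concrete, here is a minimal sketch of hashing one transcript tuple with SHA-256. The compact, sorted-key JSON serialization and the `commit_tuple` helper are assumptions made for illustration; this page does not specify the encoding the registry actually uses.

```python
import hashlib
import json


def commit_tuple(prompt: str, response: str, judge: str) -> str:
    """Return a hex SHA-256 commitment to one transcript tuple.

    Assumption: the tuple is serialized as compact JSON with sorted keys.
    The registry's real encoding is not documented on this page.
    """
    payload = json.dumps(
        {"judge": judge, "prompt": prompt, "response": response},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Hypothetical example data, not taken from any real run.
digest = commit_tuple("What is 2 + 2?", "4", "correct")
print(digest)
```

The same inputs always produce the same digest, and changing any field produces a different one, which is what lets a third party re-check a stored transcript against its published commitment.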


      
State of AI · Weekly

This week on Benchlist.

A rolling seven-day digest of every attestation that landed on-chain. Unedited, unspun, computed live from the same JSON the registry serves.

Full leaderboard →
Attested: runs, 7 days
Gas burned: USD, Ethereum L1
Publishers: unique, this week
Median proof: minutes, commit→chain
Biggest scores, last seven days: top 5
Leader per benchmark: live
Browse by category

Sixteen categories, one standard.

From frontier LLMs to vector search, every listing comes with attested benchmark results.

Featured services

Recently verified.

View all
How it works

Benchmark → attest → publish.

The whole chain is open. You can replay any run bit-for-bit on your own hardware.

Step 1
Run
A trusted attestor runs the benchmark against the service. Full transcripts are stored.
Step 2
Commit
The runner computes a Merkle root over every (prompt, response, judge) tuple plus dataset and methodology hashes.
Step 3
Prove
A ZK proof of the scoring function over the commitment is submitted to Aligned Layer.
Step 4
Verify
Aligned batches the proof and verifies on Ethereum L1. The batch ID becomes the listing's credential.
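
Step 2 above can be sketched as follows. This is an illustrative construction only: the JSON leaf encoding, the 0x00/0x01 domain-separation prefixes (borrowed from RFC 6962), the duplicate-last-node rule for odd levels, and the way the dataset and methodology hashes are folded in are all assumptions, and the function names are hypothetical. Benchlist's actual commitment layout is not described on this page.

```python
import hashlib
import json
from typing import Sequence, Tuple


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(prompt: str, response: str, judge: str) -> bytes:
    # Assumption: each (prompt, response, judge) tuple is encoded as a JSON array.
    encoded = json.dumps(
        [prompt, response, judge], separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return sha256(b"\x00" + encoded)  # 0x00 prefix marks a leaf


def merkle_root(leaves: Sequence[bytes]) -> bytes:
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last node on odd levels
        # 0x01 prefix marks an interior node
        level = [
            sha256(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


def run_commitment(
    transcripts: Sequence[Tuple[str, str, str]],
    dataset_hash: bytes,
    methodology_hash: bytes,
) -> str:
    # Assumption: the final commitment binds the root to both context hashes.
    root = merkle_root([leaf_hash(*t) for t in transcripts])
    return sha256(root + dataset_hash + methodology_hash).hex()


# Hypothetical example data, not taken from any real run.
transcripts = [("Q1", "A1", "pass"), ("Q2", "A2", "fail"), ("Q3", "A3", "pass")]
dataset_hash = sha256(b"dataset-v1")
methodology_hash = sha256(b"methodology-v1")
commitment = run_commitment(transcripts, dataset_hash, methodology_hash)
print(commitment)
```

Anyone holding the stored transcripts can recompute the commitment and compare it with the published value; altering a single judge verdict changes the root. Duplicating the last node keeps the sketch short, but it allows two different leaf lists to share a root, so a production scheme would also commit to the leaf count or use an unambiguous tree shape.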
Aligned Layer

Benchlist uses Aligned Layer — a proof aggregation network on Ethereum — so any claim on this site is a signed, on-chain attestation. Read the integration spec →

Today's leaderboard

Top verified runs.

All benchmarks
For builders

Publish a listing buyers actually trust.

Run any benchmark. Get an on-chain proof. Post with a single API call — or fill out a form if you’d rather we do it for you.