Vol. II — Proof

The First Verified Node

Erdős, the lab's math wing, just earned its first kernel-checked verified stamp. The result it proved has been settled since 1955. That is exactly why it's worth writing about.

The first science I ever did was reading rings off a broken branch — narrow for the drought years, wide for the good ones, the whole life of a tree legible in the cross-section if you're honest about what each line means. That habit is the one I keep pointing at new material. A ring you can't trust is worse than no ring at all. Which is the long way of explaining why I spent so much of Erdős — the lab's AI-augmented math wing — building not a prover but a gate: a way to make one word, verified, mean precisely one thing and never drift.

This week the gate let something through for the first time. I want to be careful about what that does and doesn't mean, because the carefulness is the whole story.

What got proved

The result is R(3,3) > 5. It's the lower-bound half of a small, famous fact in combinatorics: the Ramsey number R(3,3) equals 6. Said plainly — color every edge of a complete six-person graph either red or blue however you like, and you are forced to create a monochromatic triangle: three people who are all mutual reds, or all mutual blues. Six is the threshold. With five people, you can still dodge it.

R(3,3) > 5 is the dodge. It claims there exists a two-coloring of the complete graph on five vertices with no monochromatic triangle at all. The matching upper half — that six forces one — is the other side of the proof; Erdős checks that one too, by a different method. This entry is about the five.

None of this is new. Greenwood and Gleason settled R(3,3) = 6 in 1955. You can find it in any combinatorics textbook. I want that stated up front and in plain sight, because the temptation in a piece like this is to let the reader feel like a discovery happened. It didn't. The novelty here is the tool, not the result.

The witness — a pentagon

The proof needs a concrete five-person coloring that avoids the triangle, and the one Erdős uses is genuinely pretty. Put the five vertices around a circle and connect every pair — that's the complete graph K5, with all ten edges drawn. Now color: an edge is red if its two endpoints sit next to each other on the pentagon (0–1, 1–2, and so on around the rim), and blue otherwise.

Red = the pentagon's five sides. Blue = the pentagram's five chords.

What falls out is two five-cycles nested inside each other. The red edges trace the pentagon's rim — a loop of five. The blue edges trace the five-pointed star hiding inside it — the pentagram, also a loop of five. And a five-cycle simply has no triangle in it: you can't close a loop of three when the shortest loop available is a loop of five. So neither color holds a monochromatic triangle. The dodge works, and you can see why it works in the shape itself.

"Verified" versus "computed"

Here is where the gate matters. Erdős runs a battery of small combinatorics toys, and every result gets logged with an evidence grade. Most of them are graded computed — meaning a solver (Z3) exhaustively checked a finite space and reported a yes or no. That's real and it's useful, but a solver's "I checked and found nothing" is a different kind of claim than a proof. It's a search result, not a theorem.

This node is graded verified, and in Erdős that word is reserved for exactly one thing: a proof the Lean 4 kernel has accepted. The R(3,3) > 5 statement is written in Lean, and the proof is a single tactic — decide — which walks the kernel through all ten triples of five vertices and confirms, constructively, that not one of them is monochromatic. There is no sorry standing in for a missing step. And #print axioms reports that the theorem leans on no axioms at all — it's fully constructive, true by computation the kernel can re-run itself. That is a stronger object than a solver's verdict, and it earns the stronger word.

The honesty gate

None of that would matter if the grade were a label I typed in. The load-bearing piece is how the stamp is earned: a node is marked verified if and only if running lake env lean Erdos/RamseyR33.lean exits with code zero. The subprocess exit code is the source of truth. Not a model's summary of what it did. Not a confident sentence in a log. The exit code, or nothing.

A model that grades its own homework will always pass. So it doesn't get to. The kernel grades; the exit code reports; the ledger records what the exit code said.

This is the same discipline that runs through everything in the lab — receipts over vibes, honest nulls, the test as the trust boundary. A checker that can only ever say "looks good" is indistinguishable from one wired to return yes. So the interesting question was never "can the system verify R(3,3) > 5." It was "can the system earn a verified stamp at all, under a rule strict enough that the stamp is worth something." Now it can, and once for real.

Why an old result is the right first one

You prove the gate on math you already know the answer to. That's the point of starting at a settled result: if the pipeline had stamped verified on something false, or stamped it on a search where it should have said computed, I'd have caught it against the textbook. Building the honesty machinery on known ground is how you earn the right to point it, later, at ground where there's no textbook to check against — and to be believed when you do.

So: no new mathematics today. A small, beautiful fact, settled since 1955, re-proved by a tool that now knows the difference between checking and proving, and won't let itself blur the two. The pentagon was always triangle-free. What's new is that the lab can finally say so with a receipt the kernel signed — and that the word for it means exactly what it claims. That's the ring I trust. Click through the run if you want to see the rest of the grain.

← the journal erdős the lab github