Also, *the verifier does not check that a proof only uses its own essential hypotheses.* This allows incorrect proofs such as $[ set.mm $] evil $p |- ph $= ( idi.1 ) B $. which turn hypotheses into conclusions.
On Tue, Sep 16, 2025 at 3:44 PM Matthew House <[email protected]> wrote: > A few issues I spotted, many of which are nitpicks and some of which are > more serious: > > - The verifier does not check that each file only uses printable > ASCII + "\t\n\f\r". This can be problematic if not checked, since > differences in how characters outside this set are handled (e.g., what > counts as white space) can lead to discrepancies between verifiers. > - The verifier does not check that keyword tokens are separated by > white space, e.g., it accepts $($) as a valid comment, or foo$a|- bar$. > as a valid $a statement. > - The verifier does not check that comments do not contain $( or $). > - The verifier does not check that ${ $} delimiters for blocks are > balanced within each file. > - The verifier does not check that math symbols do not contain $. > - The verifier accepts a meaningless dangling $ character at the end > of a file. > - The verifier does not check that active constants and variables > cannot be redeclared, nor that variables cannot be declared as constants or > vice versa. It has no support for variables becoming inactive at the end of > their containing block. > - The verifier does not check that math symbols in $d statements are > distinct active variables. > - The verifier does not check that all labels are unique, nor that all > labels are distinct from all declared math symbols. > - The verifier does not check that typecodes are active constants. > - The verifier does not check that $f statements use an active > variable, nor that all variables in an $e, $a, or $p statement have an > active $f statement. > - The verifier does not check that a variable has at most one active $f > statement at a time, nor that every $f statement for a variable has > the same typecode. > - The verifier does not ignore repeated inclusions of the same file. > For instance, a file test.mm containing $[ test.mm $] will lead to a > runaway memory leak. > - The verifier does not ignore comments in certain contexts in > statements. For instance, it will interpret foo $a $( baz $) |- bar $. > as a statement with typecode $( and expression $( baz $) |- bar, which > is nonsense and will lead to discrepancies between verifiers. > - The verifier does not check for integer overflow in compressed-label > steps. This allows multiple ways to write step numbers, as well as ways to > write the special code Z as a number, which will lead to discrepancies > between verifiers. > - The verifier has no support for unknown steps ?. After reading a > proof with a ? step, a typical verifier will continue verifying the > rest of the database as usual, but warn the user that it > contains incomplete proofs. > - *The verifier does not check that a proof cannot use its own label > in a step.* This allows incorrect proofs such as $[ set.mm $] evil $p > |- ph $= ( evil ) AB $. which insist upon themselves. > - *The verifier does not check that typecodes match when substituting > variables.* This allows incorrect proofs such as $[ set.mm $] evil $p > |- ph $= ( vx wn weq wal equid ax-gen ax6 pm2.24ii ) BCZJDZBEAKBJFGBJHI $. > which exploit syntax ambiguities. > > Writing a correct Metamath verifier requires very careful attention to > detail, which it seems that LLMs still struggle with, short of directed > hand-holding or extraordinarily thorough test suites. The original > mmverify.py code is quite compact for how many details it captures. > > Matthew House > > On Tue, Sep 16, 2025 at 5:03 AM Zarathustra Goertzel <[email protected]> > wrote: > >> I was curious if Codex could write a Metamath verifier, so asked ChatGPT >> for a list of verifiers and what languages it thought would be both *easy >> and interesting*: it chose Go. >> >> I started from my mmverify.py repo (where iIhave a too-slow verifier in >> another weird language, MeTTa <https://metta-lang.dev/>), so it had >> access to reference implementations. >> >> The task was completed pretty easily (perhaps in 40-60min of AI-work) >> with my feeding in some errors. It seems to pass David Wheeler's >> Metamath-test >> suite <https://github.com/zariuq/metamath-test>. >> >> It's here if anyone's curious: >> https://github.com/zariuq/mmverify.py/tree/mmverifyGo/go >> >> FWIW, I decided to try to see if it could do Metamath Zero, and I must >> say it struggles (even to just do .mm0 + .mmu verification). >> >> -- >> You received this message because you are subscribed to the Google Groups >> "Metamath" group. >> To unsubscribe from this group and stop receiving emails from it, send an >> email to [email protected]. >> To view this discussion visit >> https://groups.google.com/d/msgid/metamath/e55ea1ba-b3c2-4018-80a1-a46149ead54fn%40googlegroups.com >> <https://groups.google.com/d/msgid/metamath/e55ea1ba-b3c2-4018-80a1-a46149ead54fn%40googlegroups.com?utm_medium=email&utm_source=footer> >> . >> > -- You received this message because you are subscribed to the Google Groups "Metamath" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion visit https://groups.google.com/d/msgid/metamath/CADBQv9bxtqLqC2j%3D_7N3%3D95TZ2y5%2BkD_mLgEUhjPXZOTkFNc0Q%40mail.gmail.com.
