[Metamath] Introducing ProofScaffold: A Modular, AI-Friendly Toolchain for Metamath

Mingli Yuan Tue, 10 Feb 2026 23:05:10 -0800

Dear Metamath Community,

I am writing to share an open-source project we have been working on called
ProofScaffold (https://github.com/epistemic-frontier/proof-scaffold).


Our team has deep respect for set.mm and the rigorous foundation this
community has built. However, as we explore the intersection of Metamath
and Large Language Models (LLMs), we’ve encountered a persistent challenge:
feeding a 48MB monolithic file to an AI often leads to context dilution,
hallucinated imports, and noisy proof searches.

To solve this, we built ProofScaffold. It acts as a package manager and
linker (written in Python) for Metamath. It allows us to split massive
databases into composable, compilable, and independently verifiable
packages with explicit dependencies and exports—much like Cargo or NPM, but
for formal math.

Crucially, the Trust Computing Base (TCB) does not change. Python acts
strictly as an untrusted builder/linker. The final output is a standard,
flattened .mm transient monolith that is verified by metamath-exe or
metamath-knife.

We believe this modularity is the key to unlocking AI's true potential in
formal mathematics:

- Targeted Retrieval: By scoping the context to a specific package (e.g.,
just fol or zf), we drastically reduce noise for the LLM.

- Controlled Semantic Boundaries: Explicit exports provide the AI with a
strict subset of permissible symbols and axioms. This prevents hallucinated
reasoning, such as an AI accidentally employing the Axiom of Choice in a
strict ZF-only context.

- Verifiable Incremental Loops: An AI can generate a proof, verify it
locally within the package, and map any verifier errors directly back to
the specific package contract (e.g., missing $f or label conflicts). This
makes AI self-correction much more reliable.

- Curriculum Alignment: Modular packages naturally form a curriculum
(axioms → definitions → lemmas → theorems), providing high-quality gradient
data for training models, rather than overwhelming them with the entire
set.mm at once.

We have successfully ported a prelude and are currently porting the logic
package (aligned with the |- conventions of set.mm). Our next step is to
further subdivide logic into prop, fol, eq, and setvar packages, and to
generate machine-readable interface manifests to help AI planners.

A Question for the Community regarding PyPI Naming:

To make this ecosystem easily accessible for AI researchers and engineers,
we plan to publish these modularized databases as Python packages on PyPI.

I would like to ask if the community is comfortable with us using the
metamath- prefix for these packages (e.g., metamath-logic, metamath-zfc). I
want to be entirely respectful of the Metamath trademark/legacy and ensure
this doesn't cause confusion with official tools. If the community prefers
we use a different namespace (e.g., proof-scaffold-logic), we will gladly
do so.

I would love to hear your thoughts, feedback, or critiques on this approach.

Best regards,

Mingli Yuan

-- 
You received this message because you are subscribed to the Google Groups 
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/metamath/CAGDRuAimeskGzX6dNmBLnCfOg%2BvZr57YmLqUu%2B6bg5TzxYrm1A%40mail.gmail.com.

[Metamath] Introducing ProofScaffold: A Modular, AI-Friendly Toolchain for Metamath

Reply via email to