> On Jun 5, 2025, at 11:56 AM, 'Fabian Huch' via Metamath 
> <[email protected]> wrote:
> 
> I am currently creating a comparison of large libraries of ITPs. I am only 
> considering libraries with more than 1M (nonempty) LOC, of which I know (in 
> alphabetical order): ACL2, HOL4, Isabelle, Lean, Metamath, Mizar, PVS, Rocq.
> To that end, I want to approximate the number of definitions (incl. axioms; 
> excl. abbreviations, instances, fixed values/constants, examples) and proven 
> proper theorems (excl. automatically generated ones) by simple analysis of 
> the sources (without running the system).
> 
> For Metamath, my approach would be to look at tokens in `.mm` files and count 
> $a for axioms and $p for theorems. Is this reasonable or is there a better 
> way?


That makes *overall* sense but there are some details you'll want to address:

* A comment can refer to an "$a" or "$p" statement. It doesn't happen *often*,
  and checking for patterns should make that unimportant.
* In Metamath, definitions are also considered $a statements. The largest
  Metamath databases use naming conventions to distinguish them
  ("ax-" means axiom, "df-" means definition). I would provide numbers for
  those 3 cases (true axioms, definitions, and proven statements).
* Metamath has several databases. We normally distribute them compressed, which
  through the magic of compression makes them smaller. When uncompressed
  several are *easily* larger than 1M nonempty LOC. In size order (starting 
from largest)
  they are set.mm, iset.mm, nf.mm, and ql.mm. I would include at least
  set.mm and iset.mm, as they're the largest, in especially active development,
  and each has proofs the other doesn't. I'm not sure I'd include nf.mm or ql.mm
  given your goals.

Here's a quick shell script I wrote to do those estimates
(including nf.mm and ql.mm in case you want them):

for f in set.mm iset.mm nf.mm sqlmm; do
 echo "$f:"
 echo -n "Axioms: "; grep -E ' *ax-.* \$a ' "$f" | wc -l
 echo -n "Definitions: "; grep -E ' *df-.* \$a ' "$f" | wc -l
 echo -n "Proofs: "; grep -E ' [^ ]+ \$p ' "$f" | wc -l
 echo
done

set.mm:
Axioms:      124
Definitions:     1401
Proofs:    45488

iset.mm:
Axioms:       89
Definitions:      349
Proofs:    14462

nf.mm:
Axioms:       43
Definitions:      159
Proofs:     5981

ql.mm:
Axioms:       21
Definitions:       29
Proofs:     1178

The number of axioms shown is a little misleadingly large.
For example, in set.mm, ax-1cn is an axiom that 1 is a complex number.
This statement is proven in set.mm, but we restate it as an axiom so that later
proofs that depend on it won't necessarily depend on the construction of 1.

Still, that will give you numbers to work with.

(This is a resend, sorry if you got both.)

--- David A. Wheeler

-- 
You received this message because you are subscribed to the Google Groups 
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/metamath/5D5843F3-0658-4FDD-B124-E03DB67ADE1C%40dwheeler.com.

Reply via email to