On Sat, Nov 21, 2020 at 8:42 PM David A. Wheeler <[email protected]>
wrote:
> Providing a Scheme-like interface is a reasonable approach. It’s easy to
> implement & provides a lot of flexibility. The big problem is that MANY
> people hate their syntax, e.g., “lots of irritating silly parentheses”. As
> you know, I think it’d be *much* better if it supported neoteric syntax (at
> least), so that you can use syntax like f(5) and {2 + 3}, as that
> eliminates the main complaint about Lisps while retaining all their
> benefits.
>
I'm inclined not to spend much time on this complaint. Better to spend some
time with the language, get a sense for the way things are put together, and
*then* propose improvements. Alternate syntax is not out of the question,
but it needs to fit within the other design constraints of the language. In
particular, MM1 is in many ways an MVP for a metamath-based proof
assistant: enough to help me prove my big theorems and not much more. If
this is to become some big flagship product then I don't mind investing
more to make it convenient, but only after the more pressing problems are
dealt with.
> If you’re picky about types, you could even trivially implement a
> preprocessor to convert the Scheme-like language into Rust, and then
> compile it into the result. Then, once you’re happy with something, you
> could “freeze” into a faster Rust implementation.
>
Indeed, I don't recommend going overboard with the metaprogramming, because
the result isn't especially maintainable without static types. Complicated
tactics should be written as plugins, either in Rust or dynamically linked
from another language. The main strength of the language is micro-tactics,
for cases where two branches of a proof are almost but not quite the same,
and simple pattern-matching tactics like norm_num.
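To make the plugin idea concrete, a minimal Rust sketch of what such a hook
could look like follows; every name in it is hypothetical, none of this is
actual mm0-rs API:

    // Hypothetical sketch of a plugin interface for complicated tactics.
    // None of these types or names come from mm0-rs; they are stand-ins.
    struct ProofState;          // opaque stand-in for the current goal state
    struct ProofTerm;           // stand-in for a completed proof object
    struct TacticError(String); // stand-in for a tactic failure report

    trait TacticPlugin {
        // Name used to invoke the tactic from an .mm1 file.
        fn name(&self) -> &str;
        // Try to close the goal, producing a proof term or an error.
        fn run(&self, state: &ProofState) -> Result<ProofTerm, TacticError>;
    }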
>> I like this idea. Rust makes it easy to expose libraries to other programs
>> via crates.io, and I've considered doing this for mm0-rs, although I
>> don't have a particularly clear conception of what the most useful hooks
>> would be. Perhaps this can be tackled as the project gets going.
>
>
> I expect that’s the only way to tackle it. The key is to agree early on
> that “this is a library, with a minimal UI on top”. If you can change the
> UI without changing what’s underneath, then that’s a big success.
>
I don't have any plans to write an alternate UI for MM0 right now, which is
why there hasn't been too much movement on this front for mm0-rs. If/when
not-me people start using it, I'm sure this will become a bigger concern.
>>> 2. Implemented in a memory-safe language, and preferably statically
>>> typed. I like the confidence this could provide.
>
>> I think this is the main reason it's best to write in something like C++
>> or its better cousin Rust - it's a big project so you want static types,
>> modules, namespaces and all that jazz, but it's also performance sensitive
>> so you want as much native code as you can get. (Java is okay in this
>> regime, but these days I don't really see any reason to pick it over
>> alternatives.)
>
>
> Neither C nor C++ are memory-safe. I’d prefer to work in memory-safe land.
>
Memory safety is nice, but it's kind of a low bar: I want my programs to be
correct. Within the confines of existing languages, memory safety by
default is a nice-to-have but not essential to making a usable program. The
main issues I have with C++ are its ridiculous complexity and crufty
syntax. The C++ community is doing much better these days at encouraging
memory-safe programming, but Rust is simply put together better.
> Java has garbage collection & semantics that are very familiar to many
> (e.g., object orientation & references).
>
To an end user, Rust's approach to handling memory feels very similar to
Java's, but it's quite different under the hood (the latter is obvious if
you follow the press coverage, but the former might be more surprising). You
almost never have to explicitly drop things, and when you do it's usually
because you need to unlock a mutex or something similar that goes beyond
regular memory management.
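For instance (plain standard-library Rust, nothing mm0-rs specific), this is
about the only common case where an explicit drop earns its keep:

    use std::sync::Mutex;

    fn main() {
        let log = Mutex::new(Vec::new());
        let mut guard = log.lock().unwrap();
        guard.push("step 1");
        // Values are normally freed implicitly at the end of their scope;
        // the rare explicit drop is for effects like releasing a lock early.
        drop(guard);
        // The mutex is unlocked here, so we can lock it again.
        println!("{} entries", log.lock().unwrap().len());
    }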
>>> 3. Has a small kernel implementing “legal actions”, which is distinguished
>>> from the many tactics it includes that allow trying things out without
>>> producing an invalid state as long as the tactic obeys simple static rules
>>> (the state may be *useless* but not *wrong*).
>>
>
>> In MM1 this is mostly implemented through functions on the "Environment"
>> (the set of all proven theorems so far), which only does this work at the
>> unit of entire theorems. Tactics are permitted to construct arbitrary proof
>> terms and are expected to produce only well-formed proof terms, but you get
>> a runtime error if not. Unlike Lean and other LCF-style provers, I don't
>> make any significant attempt to isolate the "kernel", because the source of
>> truth is an external verifier. (If this verifier / proof assistant is to be
>> a proper, trustworthy verifier, the verifier part should probably be
>> isolated from the rest of the system.)
>
>
> That’s definitely different than an LCF-style prover. Using a separate
> trustworthy verifier instead does make sense, though wouldn’t that delay
> detection of problems?
>
> Are there really advantages to *not* having a kernel? I really do want to
> know.
>
Certainly if an error is missed by the front end and caught by the real
external verifier, the error reporting quality takes a dive, so this is
something to avoid. But the luxury this affords you is that it's not
*essential* that such errors are caught, only preferred, and it's okay if a
miss only happens in some weird, unlikely conjunction of circumstances. We
don't need to assume the worst of the author: we can help under the
assumption that they are trying to make a correct proof, in contrast to the
verifier's perspective, where the proof author is an antagonist and we must
defend against every possible misuse.
There are lots of verification-related technologies that don't take an
LCF-style approach; SAT and SMT solvers, model checkers, and even some proof
languages like PVS and NuPRL come to mind. But I think MM1 is qualitatively
different from these examples, in that MM1 is a tool for producing a proof
at the end of the day, while the others produce "correctness", with a
yea/nay answer at the end and nothing more. If that's all you have, you
really want to know that everything that goes into that yea/nay answer is
trustworthy, and that's just what our verifiers are doing. But with a proof
object, you can always just check it at your leisure.
In MM1, the proof objects are lisp data (of course) and you can build the
objects however you like. The "refine" tactic, which is the main workhorse
for unification, will report unification errors on the spot, and these
amount to 99% of the errors a user needs to deal with. But if you write
your own tactic to construct proofs from whole cloth there is no
unification check, and you can produce ill-formed proofs if you want. Also,
refine doesn't catch DV (disjoint variable) violations (it's a bit tricky
to work out DV conditions on a setvar that is a metavariable). However, all
proof constructions ultimately funnel through the Environment::add_decl()
function that adds a theorem to the environment; that is where
deduplication happens, and other typechecking can be included there as
well. So while there is no kernel in MM1, it is not too far from having
one, and this could be made more explicit if desired.
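As a rough Rust sketch of that funnel-point idea (the real
Environment::add_decl in mm0-rs has a different signature; every name below
is illustrative only):

    // Illustrative only: a single choke point through which all proof
    // constructions pass, so checks added here cannot be bypassed.
    struct Decl;              // stand-in for a theorem to be added
    struct ElabError(String); // stand-in for an elaboration error

    struct Environment {
        decls: Vec<Decl>, // the set of all proven theorems so far
    }

    impl Environment {
        fn add_decl(&mut self, d: Decl) -> Result<(), ElabError> {
            // Deduplication, typechecking, DV checks, etc. would go here,
            // applying uniformly however the proof object was built.
            self.decls.push(d);
            Ok(())
        }
    }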
Regarding advantages to not having a kernel, it is all a matter of degree.
In my list earlier I could have also included Coq, Lean, and Isabelle, all
of which have kernels that are not particularly small. (All of these are
"true kernels" in the sense that they produce yea/nay answers that must be
trusted for correctness. Even though Coq and Lean have proof terms, there
is a large amount of reduction standing between those terms and a proper
verification trace, and some intelligence is needed from the kernel to
avoid age-of-the-universe verification times on real databases.) Isabelle's
kernel in particular can be contrasted with a genuinely small-kernel LCF
system like HOL Light. Having a narrow API makes it easier to trust the
kernel, but it also means that all computation must filter through those
few functions, leading to more function calls overall, whereas implementing
more complicated composite steps lets you improve performance. For MM/MM0
this question does not come up, because the target is not a yea/nay answer
but a proof object satisfying a rigid external specification. You can use
whatever means you like to construct the proof, and the spec is linear-time
checkable on whatever you come up with.
In summary, given how many backup plans we have, I don't think it's
important to have a kernel (although any embedded verifier would be held to
the usual standards of correctness); but it can be done if we care to
pursue this.
>> For a library API, I think we should expose the main data structures, with
>> programmatic versions of the things I've mentioned. In MM1 there are two
>> main data structures: the Environment and the Elaborator. The Elaborator is
>> the state that changes in the course of processing an .mm1 file, while the
>> Environment is the "proof" content of the file; the elaborator is discarded
>> once the file is finished and the "result" is the final Environment. The
>> Elaborator stores things like the current status of file-local settings,
>> the errors we need to report, and the file AST. I think most of the
>> elaborator functions, corresponding to individual directives, tactics, and
>> statements in the file, can be exposed as API methods on the Environment to
>> external code. That way external code can say things like "add this
>> notation", "add this theorem", "check this theorem" and so on. Additionally
>> some proof state specific stuff should also be exposed so that external
>> programs can act on a given proof state, allowing for sledgehammer-like
>> tools.
>
>
> There might need to be some work so that the data structures are
> easy-to-navigate & not fragile. I wonder if methods would be useful (e.g.,
> what appear to be “values” are really computed). I would hope a REPL could
> query their state, and in some cases change them (though maybe only in
> specific ways).
>
In Rust, some focus is placed on privacy as an abstraction mechanism: you
can have a struct with private fields and provide public functions as
accessors to the data. This is especially important for anything around the
"kernel", like the Environment struct, since in a public API we probably
want to make it hard to misuse. Right now I think almost nothing in mm0-rs
is public, although a lot of things are pub(crate) (meaning accessible from
other code in the same crate but not from other crates). Making something
public is to some extent a stability promise; there are semver rules about
changing the layout of a struct to add public fields and so on, so each of
these things should be considered carefully once we have some user code
that wants to interface with it.
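A tiny sketch of that pattern (illustrative names, not the actual mm0-rs
definitions):

    pub struct Theorem; // stand-in for a proved statement

    pub struct Environment {
        theorems: Vec<Theorem>, // private: not directly writable from outside
    }

    impl Environment {
        // Read-only public access to the proven theorems.
        pub fn theorems(&self) -> &[Theorem] {
            &self.theorems
        }

        // Visible within this crate only, so external code cannot
        // bypass whatever checks the public entry points perform.
        pub(crate) fn add_theorem(&mut self, t: Theorem) {
            self.theorems.push(t);
        }
    }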
Mario