Re: [Metamath] New community verifier?

David A. Wheeler Sat, 21 Nov 2020 17:42:54 -0800

> 
> On Sat, Nov 21, 2020 at 12:47 PM vvs <[email protected] 
> <mailto:[email protected]>> wrote:
> I sense some confusion here. Let's define our dictionary first.
> 
> IDE is just a text editor with UI for syntax highlighting, autocompletion, 
> code and documentation searching and integrating other stuff. It also has to 
> be fast and robust. Verifier should just check soundness of definitions, 
> wellformedness of terms and correctness of proofs. Working incrementally is a 
> big advantage. Proof assistant should enable automated synthesis of proofs by 
> inferring missing information, automatically coercing terms, filling in gaps 
> and making good suggestions (according to Andrej Bauer).
> 
> mmj2 has a very basic IDE with integrated verifier. It's also a very 
> rudimentary PA as it has only a simple unification tool for synthesis. 
> Granted, there are some basic tactics written in JS but it isn't suited at 
> all to be programmed by an average user. It's also buggy, slow and 
> monolithic, it can't be easily used from other projects. I believe that's 
> what prompted Mario to start this thread in a first place, right? There is 
> also an established tradition to write PA in a functional programming 
> language.
> 
> I completely agree with this. Building a good IDE is a hard and time 
> consuming problem that we *should not* attempt. The major gains to be had are 
> in the "proof assistant" parts. I want to call out the MM1 "tactic language" 
> at this point, which I think is much better integrated than the JS macro 
> language in mmj2. MM1 uses a scheme-like programming language inside "do" 
> blocks, where you can just write stuff and they get evaluated on the spot as 
> you type. This makes it usable for querying the system, i.e. show me the 
> declaration/proof of "foo", what does this depend on, search for theorems 
> matching this string, etc. Proofs are also scheme expressions, and while a 
> lot of proofs are just big quoted literals containing the proof, you can 
> always just start writing a program that generates the proof literal instead, 
> or call a function that will be given the context and produces the proof. So 
> for example the arithmetic evaluator "norm_num" is defined here ( 
> https://github.com/digama0/mm0/blob/d07e5d64e68510095044f1eb12988f3b775c95ec/examples/peano_hex.mm1#L596-L601
>  
> <https://github.com/digama0/mm0/blob/d07e5d64e68510095044f1eb12988f3b775c95ec/examples/peano_hex.mm1#L596-L601>
>  ), and you can see on the succeeding lines how it is being applied to prove 
> a bunch of basic theorems.
> 
> The downside of a scripting language like this is that they tend to be slower 
> than native code, although I got my MM1 CI set up yesterday and the entire 
> library still compiles in under 2 seconds so it's tolerable (and still way 
> better than lean, which has a similar front end design to MM1).
> 
> Rust isn't a functional programming language, but the first implementation of 
> MM1 was in Haskell and it had some memory usage and speed issues that would 
> have required some significant concessions to imperative style to fix. I 
> think Rust actually gets the important things right when it comes to the 
> problems that FP languages solve, from a modeling and reasoning standpoint. I 
> guess scheme is a lisp which is FP? It's not a particularly pure FP: at least 
> my implementation uses references for metavariables so they get used fairly 
> heavily.


Providing a Scheme-like interface is a reasonable approach. It’s easy to 
implement & provides a lot of flexibility. The big problem is that MANY people 
hate their syntax, e.g., “lots of irritating silly parentheses”. As you know, I 
think it’d be *much* better if it supported neoteric syntax (at least), so that 
you can use syntax like f(5) and {2 + 3}, as that eliminates the main complaint 
about Lisps while retaining all their benefits.

If you’re picky about types, you could even trivially implement a preprocessor 
to convert the Scheme-like language into Rust, and then compile it into the 
result. Then, once you’re happy with something, you could “freeze” into a 
faster Rust implementation.

Scheme is a member of the Lisp family, and a *subset* of Scheme/Lisp is 
definitely in the set of FP languages (indeed, it’s the ancestor of all the 
rest). It’s not pure, but that doesn’t make it a bad thing.


> Here are some traits I’d like to see in a better Metamath proof assistant:
> 1. Easily usable as a programmatic *library* from other languages. I should 
> be able to import the library from most any other language, then use it to 
> load databases, do queries, etc. The higher-level capabilities should 
> implemented by calling that interface & ideally callable as well. That would 
> make it much easier to build other things.
> 
> I like this idea. Rust makes it easy to expose libraries to other programs 
> via crates.io <http://crates.io/>, and I've considered doing this for mm0-rs, 
> although I don't have a particularly clear conception of what the most useful 
> hooks would be. Perhaps this can be tacked as the project gets going.

I expect that’s the only way to tackle it. The key is to agree early on that 
“this is a library, with a minimal UI on top”. If you can change the UI without 
changing what’s underneath, then that’s a big success.

>  2. Implemented in a memory-safe language, and preferably statically typed. I 
> like the confidence this could provide.
> 
> I think this is the main reason it's best to write in something like C++ or 
> its better cousin Rust - it's a big project so you want static types, 
> modules, namespaces and all that jazz, but it's also performance sensitive so 
> you want as much native code as you can get. (Java is okay in this regime, 
> but these days I don't really see any reason to pick it over alternatives.)

Neither C nor C++ are memory-safe. I’d prefer to work in memory-safe land.

Java has garbage collection & semantics that are very familiar to many (e.g., 
object orientation & references).


> 3. Has a small kernel implementing “legal actions”, which is distinguished 
> from the many tactics it includes that allow trying things out without 
> producing an invalid state as long as the tactic obeys simple static rules 
> (the state may be *useless* but not *wrong*).
> 
> In MM1 this is mostly implemented through functions on the "Environment" (the 
> set of all proven theorems so far), which only does this work at the unit of 
> entire theorems. Tactics are permitted to construct arbitrary proof terms and 
> are expected to only produce well formed proof terms but you get a runtime 
> error if not. Unlike lean and other LCF-style provers, I don't make any 
> significant attempt to isolate the "kernel", because the source of truth is 
> an external verifier. (If this verifier / proof assistant is to be a proper, 
> trustworthy verifier, the verifier part should probably be isolated from the 
> rest of the system.)

That’s definitely different than an LCF-style prover. Using a separate 
trustworthy verifier instead does make sense, though wouldn’t that delay 
detection of problems?

Are there really advantages to *not* having a kernel? I really do want to know.

> ... 
> 5. Lots of automation (eventually).
> 
> I think the best way to get automation is to have a language that lets end 
> users write the automation.

Agreed.

> I’ve played with Rust but not tried to build a proof assistant system with 
> it. Would that be excessively hard? ...
> Obviously as the author of mm0-rs I can speak to the effectiveness of Rust 
> for writing proof assistants. For storing terms, I usually opt for a 
> recursive struct using reference-counted pointers; this limits you to acyclic 
> data structures but that's usually not a problem for a proof language. To 
> store a "baked" proof that is not being manipulated and is fully 
> deduplicated, I use a combination: the expression is really a tuple 
> (Vec<Node>, Node) where Node is a recursive struct with Arcs as mentioned, 
> but with a constructor Ref(i) that can be used to refer to a node on the 
> "heap" (the Vec<Node>), and elements in the heap can similarly refer to 
> earlier vectors on the heap. (See 
> https://github.com/digama0/mm0/blob/d07e5d64e68510095044f1eb12988f3b775c95ec/mm0-rs/src/elab/environment.rs#L135-L158
>  
> <https://github.com/digama0/mm0/blob/d07e5d64e68510095044f1eb12988f3b775c95ec/mm0-rs/src/elab/environment.rs#L135-L158>
>  .)
> 
> This is useful to explicitly represent subterm sharing, so that you can have 
> a highly redundant proof (for example if each step i+1 refers to step i twice 
> then the unshared proof will be exponential size) that you can still perform 
> operations on (unification, type checking, etc) in linear time. It's possible 
> to do the same thing with just Arcs, but you have to keep a hash table of 
> visited nodes and if you forget and just recurse naturally then the entire 
> system will grind to a halt. A lot of proof assistants suffer from this 
> problem (including metamath.exe!)

Good, that’s exactly what I wanted to know. Subterm sharing is especially 
important, and it sounds like the solution is well in hand.


> One problem with Rust is that there’s no built-in REPL. If it’s accessible 
> via an easily imported library that may be a non-problem (e.g., make it easy 
> to import into Python and use Python’s REPL).
> 
> As much as I like rust I don't think it lends itself well to a REPL, in the 
> sense that the rust language shouldn't be typed line by line, it benefits a 
> lot from scoping. MM1 uses Scheme instead for the REPL stuff, but all the 
> heavy lifting is done in Rust.

Make sense. There’s some work to try to have a Rust REPL, but it looks pretty 
dodgy.
As I noted elsewhere, I think it should support at least neoteric-expressions 
(an extension of S-expressions), but with that addition the biggest problems of 
Scheme syntax are resolved.

> I might be willing to help out if it’s Rust. But I have only fiddled with 
> Rust, not used it is seriously for this kind of application, so it might be 
> useful to hear if it’s a decent map for the purpose. To be honest, I’ve been 
> looking for an excuse to delve further into Rust, and this might be a good 
> excuse :-). You’ve inspired me, I think I’ll go de-rust my Rust in case this 
> goes anywhere.
> 
> However: Let’s imagine that the implementation is complete. What would 
> interacting with it look like? What is its human UI? What is an example of a 
> rough-draft library API?
> 
> Since most of my experience in this area is coming from MM1 and the mm0-rs 
> implementation, that would be my default answer to such questions, unless 
> others object. You interact through vscode or another LSP-enabled editor, 
> with a relatively small file (not set.mm <http://set.mm/>) containing "code" 
> in some TBD language that I will assume looks like MM1 :) .

I presume LSP = Language Server Protocol.

> The server runs in the background and provides point info and diagnostics. By 
> typing directives in a do block you can use it like a REPL. At the top of the 
> file is likely "import "set.mm <http://set.mm/>";" which says to preload 
> set.mm <http://set.mm/> and base the current development on it. (Perhaps it 
> should instead be something like "altering "set.mm <http://set.mm/>";" if the 
> goal of the file is to construct an alteration to set.mm <http://set.mm/> 
> that may involve inserting theorems in places, renaming things and other such 
> stuff.) The user can add lines to adjust set.mm <http://set.mm/> notation to 
> their preference, and the language can come with some presets that can be 
> included so that we can collect an approved set of alterations for set.mm 
> <http://set.mm/>.
> 
> If you want to add a new theorem, you can declare one in the file, with a 
> notation like MM1's but possibly with some modifications to support DV 
> conditions that are not expressible in MM0 (for example if two set variables 
> are not necessarily disjoint). You can write a literal expression if the 
> proof is simple enough, or you can call a tactic, or you can pop open a 
> (focus) block as in MM1 to enter tactic mode where you evolve a goal state 
> through the use of imperative commands like applying theorems, switching 
> between goals, calling automation and so on.
> 
> At the end of the day, you get a file that represents an alteration to set.mm 
> <http://set.mm/>, or a new .mm file. There is some command in the editor, or 
> you can use the console version of the program such as "mm0-rs compile 
> foo.mm1", and it will construct a new mm file as requested, making minimal 
> formatting changes to the file and otherwise adding your new MM theorems, 
> making some attempt to format things the way they would be formatted in a by 
> hand edit. The user can then make their own formatting tweaks if they want 
> and PR to set.mm <http://set.mm/> repo.

I think I’d need to see this in action (e.g., watch a video of someone doing 
this) to fully grok this, but it *seems* plausible.

> For a library API, I think we should expose the main data structures, with 
> programmatic versions of the things I've mentioned. In MM1 there are two main 
> data structures: the Environment and the Elaborator. The Elaborator is the 
> state that changes in the course of processing an .mm1 file, while the 
> Environment is the "proof" content of the file; the elaborator is discarded 
> once the file is finished and the "result" is the final Environment. The 
> Elaborator stores things like the current status of file-local settings, the 
> errors we need to report, and the file AST. I think most of the elaborator 
> functions, corresponding to individual directives, tactics, and statements in 
> the file, can be exposed as API methods on the Environment to external code. 
> That way external code can say things like "add this notation", "add this 
> theorem", "check this theorem" and so on. Additionally some proof state 
> specific stuff should also be exposed so that external programs can act on a 
> given proof state, allowing for sledgehammer-like tools.

There might need to be some work so that the data structures are 
easy-to-navigate & not fragile. I wonder if methods would be useful (e.g., what 
appear to be “values” are really computed). I would hope a REPL could query 
their state, and in some cases change them (though maybe only in specific ways).


> One issue is that MM1 is a proof assistant for MM0, it doesn't currently 
> support MM. One way I would like to support this is to be able to "import 
> "foo.mm <http://foo.mm/>";" in an MM1 file, which will convert the MM 
> theorems into MM1 and import them into the current file; this will make it 
> easy to build on set.mm <http://set.mm/> from within an MM0 development, but 
> it doesn't "give back" to the MM database, it is a one way conversion (at 
> least so far), and even if the reverse conversion were added it would 
> probably not be the inverse map, resulting in ugly proofs after 
> round-tripping.
> 
> I think that it may well be possible to extend mm0-rs to allow swapping out 
> the back end so that it is actually working with MM, and then most of the 
> additions mentioned here are incremental changes on that baseline. The main 
> goal with this new verifier is to attempt to address the needs of metamath 
> users / set.mm <http://set.mm/> contributors and not rock the boat so much by 
> replacing the foundations as is done with MM0, but I agree that this could be 
> incorporated as a sub-mode of mm0-rs usage.

You’re best suited to know how doable that is.

>   
> David's tutorials for mmj2 were super helpful for me for getting into using 
> it, something similar might be cool.
>  
> There are currently no tutorials for MM1, only a few scattered demos and a 
> spec document (https://github.com/digama0/mm0/blob/master/mm0-hs/mm1.md 
> <https://github.com/digama0/mm0/blob/master/mm0-hs/mm1.md>). It's starting to 
> get to a relatively stable place, so I don't mind moving to a more 
> professional packaging if people are interested. There are probably a number 
> of glaring problems for people who are not me that need to be addressed 
> though, so I would encourage folks to try it out, get hopelessly stuck and 
> then report your experience as a basis for documentation and improvements.

Okay. I’m good at getting hopelessly stuck :-).

--- David A. Wheeler

-- 
You received this message because you are subscribed to the Google Groups 
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/metamath/F4ACE9FB-3DAD-4C3E-BD06-BF94A72CF698%40dwheeler.com.

Re: [Metamath] New community verifier?

Reply via email to