On Wed, Jun 9, 2021 at 11:21 AM Jon P <[email protected]> wrote:
> There have been a couple of AI systems, gpt-f
> <https://arxiv.org/abs/2009.03393> and Holophrasm
> <https://arxiv.org/abs/1608.02644>, built on metamath whose aim is to
> generate proofs. Are there more I’m not aware of?

Not that I know of, at least for metamath. There are other systems for other languages: GPT-f also has a Lean version, HOList works for HOL Light, Tactician is a Coq-based ML architecture, and there are several others that don't immediately come to mind.

> The question “Is there a proof of any length which proves this
> proposition” is in the same complexity class as the halting problem I
> think. Whereas “Is there a proof of length less than L steps which
> proves the proposition” is NP-complete, so is easier to tackle I think.

Technically, L has to be written in unary for the problem to be NP-complete. If you write it normally (in binary) then it is NEXPTIME.

> Trying to find such a chain of substitution rules is essentially a tree
> search problem. If there are N theorems in the database then to answer
> the latter question comes down to searching a tree of size N^L. Is that
> somewhat right?

This is a slightly overly restricted setting. You probably want to be able to add new theorems to the database and then use them to compress future proofs. That means that the branching factor can grow over time, roughly (N+L)^L if we say that adding a theorem counts as a step. This is actually how DreamCoder works (https://arxiv.org/abs/2006.08381); you should take a look.

> So does this mean, in the abstract, given enough hardware and time you
> could find all proofs of length less than L just by brute force search?

Yes. A proof of length L is a string, and there are only finitely many strings of that length. Just verify them all and see if one checks.

> I am also interested in what the true branching factor is. For instance
> not all N theorems in the database can be applied to all hypotheses,
> only a subset can be. How much is the reduction? Can most theorems be
> applied to most statements or is it that only a very small subset can?
> Moreover I guess there is a reduction because it is sufficient to find
> only one proof whereas there will be many for any provable statement.

There are quite a few theorems, like ax-mp, that apply to any goal state, so the branching factor is always at least the number of theorems whose conclusion is |- ph. But it's true that most lemmas are not like this. Worse, one of ax-mp's hypotheses is |- ps where ps is unrelated to ph, which means that you can apply literally any theorem T in set.mm to the current goal in two steps, by using ax-mp and then T in the minor premise. Since all N theorems in the database are reachable within two steps, the average branching factor per step is at least sqrt(N).

> Is it true that any AI system built on metamath will need these same
> foundations? For instance it will need a parser for the database, a
> verifier and also it will need a function which: for any statement
> returns the possible substitutions/theorems which could be applied.

No. The brute force search method just described is not practical above about 15 steps or so. It has been used for parts of http://us.metamath.org/mmsolitaire/pmproofs.txt but it runs out of steam even before the end of that list. Most theorem provers are ultimately doing tree search, but the tree being searched may not literally be the tree of substitutions mentioned above. For example, they might instead use tactics (proof-producing programs) as building blocks and tree-search a way to fit them together.
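For concreteness, here is a minimal Python sketch of that kind of depth-limited tree search. It is a sketch only: `applicable` is a hypothetical stand-in for whatever produces the legal moves at a goal, whether raw theorem substitutions or tactics, and the names are mine.

```python
def search(goals, budget, applicable):
    """Return a list of step names proving all goals, or None."""
    if not goals:
        return []                    # nothing left to prove: done
    if budget == 0:
        return None                  # step budget exhausted on this branch
    goal, rest = goals[0], goals[1:]
    for step, subgoals in applicable(goal):
        # Applying one step consumes one unit of budget; its subgoals
        # are pushed onto the list of remaining obligations.
        tail = search(subgoals + rest, budget - 1, applicable)
        if tail is not None:
            return [step] + tail
    return None

def brute_force(goal, L, applicable):
    """Iterative deepening: try budgets 1..L-1, so shorter proofs win."""
    for budget in range(1, L):
        proof = search([goal], budget, applicable)
        if proof is not None:
            return proof
    return None
```

With branching factor N and budget L this is exactly the N^L blowup from the question, which is why it stops being practical past a dozen or so steps.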
This tactic-based method may not be complete, and may not produce the shortest proof, but it can eliminate a lot of the excess branching factor in the naive method.

> Is there any value in us (I don’t know if anyone would be interested in
> working on this) gathering some programs which already do this or
> writing some new code in a nice optimised way? For example in MineRL
> <https://minerl.io/docs/tutorials/first_agent.html#creating-an-environment>,
> a minecraft AI challenge, they provide an environment which offers the
> agent the state of the game and a list of options available to it. Maybe
> some sort of environment like this for metamath would help encourage
> people to tackle the AI side of these problems?

set.mm is already a fairly good source of hand-optimized proofs, since hand-optimization has been a running goal for a while. You should talk to the gpt-f folks about a metamath-based gym environment; I think they already have something like that.

> Another question I have is how much can be done with optimised code, or
> multithreading, or gpu parallelisation? For instance the unify command
> in MMJ2 presumably is searching for all possible proofs of length 1 to
> fill in a gap? Could something like this be optimised to run in the
> background on other threads on the cpu so it could fill in 2, 3, or more
> steps for you? Would this be a powerful thing to add to a proof
> assistant, a brute force solver which is always trying to help? If a gpu
> optimised parallel search could be performed on a large amount of
> hardware might it be able to actually generate some reasonable length
> proofs?

Certainly bigger small-scale brute force search is useful for proof authors, and smarter search is also useful. Unfortunately, most proof activities are not well suited to GPU parallelism, because proof search is a very discrete and nonlinear problem (as you already mentioned, not unlike the halting problem) and it's hard to find a piece of it that looks like "do the same thing to these 1000 data elements". Multithreading is fine though; you can have this stuff running in the background on other CPU threads.

> Finally for the general AI problem presumably if any kind of mild
> heuristic could be found then it might be possible to generate proofs
> much more efficiently? I think that’s what the above systems are doing,
> if I have understood correctly: given a list of possible substitutions,
> prefer some over others as more likely to lead in the right direction?
> Their hope is that the Neural Networks can develop some sort of
> intuition.

Yes: neural-network-guided tree search learns a heuristic for how promising a goal state looks, so that silly moves like applying ax-mp many times in a row are rejected for generating too many goals. You can still improve on this by giving the neural network human-level steps to work with: for example, not requiring the NN to fill in substitutions, and instead having it propose only theorem names, with the substitutions handled by unification.

> In terms of finding a heuristic there is some interesting work here
> <https://arxiv.org/abs/2005.12535> which might apply. The basic idea, as
> applied to metamath, would be to use some variant of word2vec to embed
> all the statements in the database into a high dimensional (say 100)
> vector space. Then when searching the tree of possible proofs,
> statements could be evaluated based on whether they were leading closer
> to the hypotheses or not. Might this approach have any value here? Is
> anyone interested in it?
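Concretely, the evaluation the question describes might look something like this sketch, assuming a hypothetical pretrained `embed` function mapping a statement string to a vector; the names here are illustrative, not from any of the cited systems.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def score(goal, hypotheses, embed):
    """Heuristic value of a goal: how close its embedding is to the
    nearest hypothesis, so the search expands promising nodes first."""
    g = embed(goal)
    return max(cosine(g, embed(h)) for h in hypotheses)
```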
> Are there other ways of generating a heuristic to help with search?

GPT-f is a transformer, but previous methods like Holophrasm and HOList are based on a tree embedding of the formulas. You kind of have to embed the formulas somehow, because NNs work on a finite dimensional latent space.
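A tree embedding of that general flavor might be sketched as follows; this is purely illustrative of the recursive structure, with random placeholder weights where a real system like Holophrasm would learn them.

```python
import numpy as np

DIM = 100                      # size of the latent space
rng = np.random.default_rng(0)
SYM = {}                       # per-symbol embedding vectors (placeholders)
MIX = {}                       # per-symbol combination matrices (placeholders)

def embed(tree):
    """Recursively embed a parse tree (symbol, children) into a
    fixed DIM-dimensional vector, bottom-up."""
    sym, children = tree
    if sym not in SYM:         # a real system would learn these weights
        SYM[sym] = rng.standard_normal(DIM)
        MIX[sym] = rng.standard_normal((DIM, DIM * (len(children) + 1))) / DIM
    parts = [SYM[sym]] + [embed(c) for c in children]
    return np.tanh(MIX[sym] @ np.concatenate(parts))

# e.g. embed(("->", [("ph", []), ("ps", [])])) yields one 100-dim vector
# for "ph -> ps", whatever the size of the formula.
```

Mario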
