There have been a couple of AI systems, gpt-f 
<https://arxiv.org/abs/2009.03393> and Holophrasm 
<https://arxiv.org/abs/1608.02644>, built on metamath whose aim is to 
generate proofs. Are there more I’m not aware of?

If I have understood correctly I think how the search process works is as 
follows:

You have some set of hypotheses and a proposition you want to prove.

You apply a series of already proven theorems, which are substitution 
rules, to the proposition until the resulting statements either match the 
hypotheses or match theorems already proven.

The question “Is there a proof of any length which proves this proposition” 
is in the same complexity class as the halting problem I think. Whereas “Is 
there a proof of length less than L steps which proves the proposition” is 
NP-complete, so is easier to tackle I think.

Trying to find such a chain of substitution rules is essentially a tree 
search problem. If there are N theorems in the database then to answer the 
latter question comes down to searching a tree of size N^L. Is that 
somewhat right?

So does this mean, in the abstract, given enough hardware and time you 
could find all proofs of length less than L just by brute force search? 

I am also interested in what the true branching factor is. For instance not 
all N theorems in the database can be applied to all hypotheses, only a 
subset can be. How much is the reduction? Can most theorems be applied to 
most statements or is it that only a very small subset can? Moreover I 
guess there is a reduction because it is sufficient to find only one proof 
whereas there will be many for any provable statement.

Is it true that any AI system built on metamath will need these same 
foundations? For instance it will need a parser for the database, a 
verifier and also it will need a function which: for any statement returns 
the possible substitutions/theorems which could be applied.

Is there any value in us (I don’t know if anyone would be interested in 
working on this) gathering some programs which already do this or writing 
some new code in a nice optimised way? For example in MineRL 
<https://minerl.io/docs/tutorials/first_agent.html#creating-an-environment>, 
a minecraft AI challenge, they provide an environment which offers the 
agent the state of the game and a list of options available to it. Maybe 
some sort of environment like this for metamath would help encourage people 
to tackle the AI side of these problems? 

Another question I have is how much can be done with optimised code, or 
multithreading, or gpu parallelisation? For instance the unify command in 
MMJ2 presumably is searching for all possible proofs of length 1 to fill in 
a gap? Could something like this be optimised to run in the background on 
other threads on the cpu so it could fill in 2, 3, or more steps for you? 
Would this be a powerful thing to add to a proof assistant, a brute force 
solver which is always trying to help? If a gpu optimised parallel search 
could be performed on a large amount of hardware might it be able to 
actually generate some reasonable length proofs?

Finally for the general AI problem presumably if any kind of mild heuristic 
could be found then it might be possible to generate proofs much more 
efficiently? I think that’s what the above systems are doing, if I have 
understood correctly, is given a list of possible substitutions to prefer 
some other overs as more likely to lead in the right direction? Their hope 
is that the Neural Networks can develop some sort of intuition.

In terms of finding a heuristic there is some interesting work here 
<https://arxiv.org/abs/2005.12535> which might apply. The basic idea, as 
applied to metamath, would be to use some variant of word2vec to embed all 
the statements in the database into a high dimensional (say 100) vector 
space. Then when searching the tree of possible proofs statements could be 
evaluated based on whether they were leading closer to the hypotheses or 
not. Might this approach have any value here? Is anyone interested in it? 
Are there other ways of generating a heuristic to help with search?

Thank you for taking the time to read this. I’d really value any feedback 
or input. I apologise in advance for the many misunderstandings I have and 
would love to be corrected on anything. 

Jon

-- 
You received this message because you are subscribed to the Google Groups 
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/metamath/027a6381-8caa-4a3b-b2a0-5435d55e0e77n%40googlegroups.com.

Reply via email to