> This seems to me to be a remarkable claim, especially since creating formal 
> proofs is something that even relatively early computers were doing. There 
> are many other automated provers. Coq proved the 4-color theorem, and prover9 
> found a previously unknown algebraic proof.

The claim is specifically about *deep learning* based automated
provers. "Classical" provers and hammers (which are very well
integrated in Coq/HOL4/...) have of course contributed numerous
proofs to other libraries. We've done quite a thorough review of deep
learning based systems, but of course we may have missed something.
The deep learning distinction is arguably quite arbitrary, but it
would have been a shame not to claim it :)

> I was intrigued that you could only add such a small percentage of synthetic 
> data before it started making things worse. That seems unfortunate, because 
> current machine learning algorithms generally need a large number of examples 
> to learn from. Is it clear what the problem is? Are there obvious ways to 
> overcome this? If the problem is that the synthetic data is too different 
> from the learned data, maybe follow-on work could try learning from the data 
> and generating new examples, to then feed back as more training data. Then 
> the new examples might be more similar to the existing set. This doesn't take 
> away from the paper at all, it's impressive, I'm just trying to think of ways 
> to make its results even better.

You're perfectly right. The synthetic data is quite
out-of-distribution compared to set.mm, which explains the phenomenon.
Training a model that conjectures statements "in-distribution" and
attempting to prove them would provide data that helps more
efficiently (we've done some work in that direction but are not ready
to publish it yet). We still kept the synthetic data because it helps
a bit, it enables the analysis in section 5.5 (which I find quite
interesting), and it makes the model much better when used as a proof
assistant on simple "math exercises" (which are not so common in
set.mm as of today).

If we were to evaluate on a benchmark of maths exercises, IMO
problems, etc., we would probably be able to pour in more data from
these generators and others (work for future us!).
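
To give a rough sense of the conjecture-and-prove loop I'm alluding
to above, here is a purely illustrative Python sketch. The
`conjecturer`, `prover` and `verifier` objects are hypothetical
placeholders (not our actual code), and the mixing ratio would still
need tuning, since too much out-of-distribution data is exactly what
hurt here:

    def augment_training_data(train_set, conjecturer, prover, verifier,
                              n_conjectures=1000, max_ratio=0.01):
        # Sample statements close to the set.mm distribution, keep only
        # the ones we manage to prove and verify, and mix a small
        # fraction of them back into the training set.
        new_examples = []
        for _ in range(n_conjectures):
            statement = conjecturer.sample(train_set)
            proof = prover.search(statement)
            if proof is not None and verifier.check(statement, proof):
                new_examples.append((statement, proof))
        budget = int(max_ratio * len(train_set))  # cap the synthetic share
        return train_set + new_examples[:budget]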

> In 4.3.2 Formal Verifier, first sentence, a word is missing (added in 
> brackets):
> > Performing such proof searches requires [us] to tightly couple a Metamath 
> > verifier with our models.

Will action :+1:
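
In case it helps readers of that section: "tightly couple" simply
means the Metamath verifier sits inside the search loop and checks
every step the model proposes before it can be expanded further. A
very rough, hypothetical sketch of that idea is below; it uses a
plain best-first search, which is not necessarily the exact search
procedure from the paper, and `model.propose_steps`,
`verifier.apply` and `verifier.is_proved` are placeholder names
rather than our actual interface:

    import heapq
    import itertools

    def prove(goal, model, verifier, max_expansions=1000):
        # Best-first search: lower cost = higher cumulative model
        # log-probability; the counter breaks ties between equal costs.
        counter = itertools.count()
        queue = [(0.0, next(counter), goal)]
        for _ in range(max_expansions):
            if not queue:
                break
            cost, _, state = heapq.heappop(queue)
            if verifier.is_proved(state):
                return state  # complete, verified proof
            for step, logprob in model.propose_steps(state):
                ok, next_state = verifier.apply(state, step)
                if ok:  # invalid steps never enter the search
                    heapq.heappush(
                        queue, (cost - logprob, next(counter), next_state))
        return None  # search budget exhausted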

> Nice work!

Thanks, it means the world!

-stan

On Wed, Sep 9, 2020 at 3:04 PM David A. Wheeler <[email protected]> wrote:
>
> On September 9, 2020 4:46:47 AM EDT, 'Stanislas Polu' via Metamath 
> <[email protected]> wrote:
> >Hi!
> >
> >We finally released a paper describing our methodology for the proof
> >assistant.
> >
> >It's available on arXiv here: https://arxiv.org/abs/2009.03393
> >
> >Any feedback, suggestions, or questions are obviously welcome!
> >
> >Special shout out to David Wheeler and Mario Carneiro in the
> >acknowledgements that I want to reiterate here:
> >
> >""
> >Finally, the authors would like to thank the whole Metamath community
> >for their support, feedback, and encouragement, in particular, David
> >A. Wheeler for his motivating enthusiasm and Mario Carneiro for his
> >precious help on a wide variety of technical questions.
> >""
> >
> >Hope you enjoy it.
>
> I definitely enjoyed it. I think that's the first time I've been cited for 
> motivating enthusiasm, but I'll take it :-).
>
>
> > GPT-f found new short proofs that were accepted into the main Metamath 
> > library, which is to our knowledge, the first time a deep learning based 
> > system has contributed proofs that were adopted by a formal mathematics 
> > community.
>
> This seems to me to be a remarkable claim, especially since creating formal 
> proofs is something that even relatively early computers were doing. There 
> are many other automated provers. Coq proved the 4-color theorem, and prover9 
> found a previously unknown algebraic proof.
>
> Yet I do not know of a counterexample. Does anyone?
>
> I was intrigued that you could only add such a small percentage of synthetic 
> data before it started making things worse. That seems unfortunate, because 
> current machine learning algorithms generally need a large number of examples 
> to learn from. Is it clear what the problem is? Are there obvious ways to 
> overcome this? If the problem is that the synthetic data is too different 
> from the learned data, maybe follow-on work could try learning from the data 
> and generating new examples, to then feed back as more training data. Then 
> the new examples might be more similar to the existing set. This doesn't take 
> away from the paper at all, it's impressive, I'm just trying to think of ways 
> to make its results even better.
>
> In 4.3.2 Formal Verifier, first sentence, a word is missing (added in 
> brackets):
> > Performing such proof searches requires [us] to tightly couple a Metamath 
> > verifier with our models.
>
> Nice work!
>
> --- David A. Wheeler
>
