> This seems to me to be a remarkable claim, especially since creating formal
> proofs is something that even relatively early computers were doing. There
> are many other automated provers. Coq proved the 4-color theorem, and prover9
> found a previously unknown algebraic proof.
The claim is about *deep learning* based automated provers. "Classical"
provers and hammers -- which are very well integrated into Coq/HOL4/... --
have obviously contributed numerous proofs to other libraries. We've done a
fairly thorough review of deep learning based systems, but we may well have
missed something. Arguably, the deep learning distinction is quite arbitrary,
but it would have been a shame not to take it :)

> I was intrigued that you could only add such a small percentage of synthetic
> data before it started making things worse. That seems unfortunate, because
> current machine learning algorithms generally need a large number of examples
> to learn from. Is it clear what the problem is? Are there obvious ways to
> overcome this? If the problem is that the synthetic data is too different
> from the learned data, maybe follow-on work could try learning from the data
> and generating new examples, to then feed back as more training data. Then
> the new examples might be more similar to the existing set. This doesn't take
> away from the paper at all, it's impressive, I'm just trying to think of ways
> to make its results even better.

You're perfectly right. The synthetic data is quite out-of-distribution
compared to set.mm, which explains the phenomenon. Training a model that
conjectures statements "in-distribution" and attempting to prove them would
provide data that helps more efficiently (we've done some work in that
direction, but are not ready to publish it yet). We still kept the synthetic
data because it helps a bit, it enables the analysis of section 5.5, which I
find quite interesting, and it makes the model much better when used as a
proof assistant to prove simple "math exercises" (which are not so common in
set.mm as of today). If we were to evaluate on a benchmark of math exercises /
IMO problems / etc., we would probably be able to pour in more data from these
generators and others (work for future us!).

> In 4.3.2 Formal Verifier, first sentence, a word is missing (added in
> brackets):
> > Performing such proof searches requires [us] to tightly couple a Metamath
> > verifier with our models.

Will action :+1:

> Nice work!

Thanks, it means the world!

-stan
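P.S. For anyone wondering what the "conjecture in-distribution statements,
then try to prove them" idea could look like mechanically, here is a very
rough Python sketch. It is purely illustrative and not taken from the paper:
sample_conjecture, attempt_proof, and verify are hypothetical placeholders for
a generative model, a bounded proof search, and a trusted Metamath verifier.

# Illustrative only: a toy "conjecture, then try to prove" data-generation loop.
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ProofExample:
    statement: str  # a conjectured statement
    proof: str      # a proof that the verifier accepted


def conjecture_and_prove(
    sample_conjecture: Callable[[], str],           # hypothetical: sample from a model
    attempt_proof: Callable[[str], Optional[str]],  # hypothetical: bounded proof search
    verify: Callable[[str, str], bool],             # hypothetical: trusted verifier check
    num_attempts: int = 1000,
) -> List[ProofExample]:
    """Collect verified (statement, proof) pairs to mix back into training data."""
    examples: List[ProofExample] = []
    for _ in range(num_attempts):
        statement = sample_conjecture()
        proof = attempt_proof(statement)
        # Keep only proofs the trusted verifier accepts, so the new data stays
        # sound even if the conjecturing model is unreliable.
        if proof is not None and verify(statement, proof):
            examples.append(ProofExample(statement, proof))
    return examples

The design point is simply that only verifier-checked proofs make it back into
the training set, which is what would keep such data closer to set.mm's
distribution than generator-based synthetic data.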
On Wed, Sep 9, 2020 at 3:04 PM David A. Wheeler <[email protected]> wrote:
>
> On September 9, 2020 4:46:47 AM EDT, 'Stanislas Polu' via Metamath
> <[email protected]> wrote:
> >Hi!
> >
> >We finally released a paper describing our methodology for the proof
> >assistant.
> >
> >It's available on arXiv here: https://arxiv.org/abs/2009.03393
> >
> >Any feedback/suggestions or questions is obviously welcome!
> >
> >Special shout out to David Wheeler and Mario Carneiro in the
> >acknowledgements that I want to reiterate here:
> >
> >""
> >Finally, the authors would like to thank the whole Metamath community
> >for their support, feedback, and encouragement, in particular, David
> >A. Wheeler for his motivating enthusiasm and Mario Carneiro for his
> >precious help on a wide variety of technical questions.
> >""
> >
> >Hope you enjoy it.
>
> I definitely enjoyed it. I think that's the first time I've been cited for
> motivating enthusiasm, but I'll take it :-).
>
> > GPT-f found new short proofs that were accepted into the main Metamath
> > library, which is to our knowledge, the first time a deep learning based
> > system has contributed proofs that were adopted by a formal mathematics
> > community.
>
> This seems to me to be a remarkable claim, especially since creating formal
> proofs is something that even relatively early computers were doing. There
> are many other automated provers. Coq proved the 4-color theorem, and prover9
> found a previously unknown algebraic proof.
>
> Yet I do not know of a counterexample. Does anyone?
>
> I was intrigued that you could only add such a small percentage of synthetic
> data before it started making things worse. That seems unfortunate, because
> current machine learning algorithms generally need a large number of examples
> to learn from. Is it clear what the problem is? Are there obvious ways to
> overcome this? If the problem is that the synthetic data is too different
> from the learned data, maybe follow-on work could try learning from the data
> and generating new examples, to then feed back as more training data. Then
> the new examples might be more similar to the existing set. This doesn't take
> away from the paper at all, it's impressive, I'm just trying to think of ways
> to make its results even better.
>
> In 4.3.2 Formal Verifier, first sentence, a word is missing (added in
> brackets):
> > Performing such proof searches requires [us] to tightly couple a Metamath
> > verifier with our models.
>
> Nice work!
>
> --- David A. Wheeler
