> 1) Maybe you mention it in the article and I missed it, but why is the model 
> called "GPT-f"? Does 'f' stand for "formal"?

Exactly, 'f' stands for "formal".

> 2) Have you tried augmenting the set.mm database by adding theorems with 
> renamed variables (like a version of id with |- ( ps -> ps ) instead of |- ( 
> ph -> ph ))? I think that some variables are used more often in set.mm than 
> others, so it seems plausible that the model will "learn less" about rare 
> variables. Does GPT-f have less luck proving statements which use rare 
> variables rather than often-used ones?

This is indeed something that happens quite often. We've run a few
experiments using de Bruijn indices instead of class/setvar names,
but it's easy to get wrong, and the uplift appeared limited compared
to the associated risks :) Shuffling variable names seems like a very
interesting data-augmentation strategy that we should explore. As for
statistics on GPT-f's performance based on the variables a theorem
contains, I unfortunately don't have numbers available, but I can
probably try to run them. I'm not sure how useful they would be,
though, as I presume there is a correlation between difficulty and
the appearance of specific variable names?
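
To make the shuffling idea concrete, here is a minimal Python sketch
of what such an augmentation pass could look like. The tokenization,
variable list, and function name are illustrative assumptions, not
the pipeline we actually use:

    import random

    # set.mm's usual wff variable names (an assumption for this
    # sketch; a real pass would also have to cover setvar and class
    # variables).
    WFF_VARS = ["ph", "ps", "ch", "th", "ta", "et", "ze", "si"]

    def shuffle_wff_vars(tokens, rng=random):
        """Rename the wff variables in a tokenized statement under a
        random permutation, keeping distinct variables distinct."""
        fresh = list(WFF_VARS)
        rng.shuffle(fresh)
        mapping = {}
        out = []
        for tok in tokens:
            if tok in WFF_VARS:
                if tok not in mapping:
                    mapping[tok] = fresh.pop(0)
                out.append(mapping[tok])
            else:
                out.append(tok)
        return out

    # e.g. shuffle_wff_vars("|- ( ph -> ph )".split()) might return
    # ['|-', '(', 'ps', '->', 'ps', ')'], the ps-version of id.

One subtlety that makes this kind of thing easy to get wrong: the
renaming has to be applied consistently across a theorem's hypotheses
and conclusion, and it must respect the theorem's distinct-variable
conditions.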

> 3) Do you have any plans to release GPT-f in the future as OpenAI did with 
> GPT-2?

No plans for now. We've released the proof assistant, and we will
probably release the code we use to generate our datasets from
set.mm. We'll probably also make the model available through our API.

> Nice work! I really hope that a publication from such a credible AI company 
> will bring more attention to metamath.

Thank you for your questions and encouragement!

-stan
