Re: [Metamath] Future website directions

Mario Carneiro Sun, 26 Feb 2023 20:04:38 -0800

FYI I did a fairly comprehensive HTML modernization in mm-web-rs [
https://github.com/digama0/mm-web-rs/blob/master/src/render.rs] (which is
basically a clone of the metamath-exe theorem page generator), which among
other things lowercased all the tags and made it pass the w3c validator
again (there is a link at the bottom of most pages but they don't pass more
modern HTML standards and still use things like <FONT>).


On Sun, Feb 26, 2023 at 11:52 AM David A. Wheeler <dwhee...@dwheeler.com>
wrote:

> I completely agree, a page on AI/ML would be great. The pages could do
> with a refresh anyway, many places suggest contacting Norm.
>
> Norm used an HTML 1.0 convention of using uppercase tags. I suggest that
> new text use lowercase tags as a style choice
> It doesn't matter technically, but it is the current convention, and it
> would make it easier to see which parts are unchanged.
>
> On February 25, 2023 5:33:02 PM EST, Mario Carneiro <di.g...@gmail.com>
> wrote:
>>
>> A page with short descriptions of papers and projects using Metamath for
>> AI would be great to put on the website. I think it would be good publicity
>> for both metamath and the AI systems, and maybe we can solicit the paper
>> authors to write the descriptions and provide related links as well.
>>
>> On Sat, Feb 25, 2023 at 5:27 PM Jon P <drjonpenn...@gmail.com> wrote:
>>
>>> Re the subject of LLMs being trained on metamath this new model called
>>> LLama from facebook is interesting. It seems to be smaller than other
>>> systems and it looks like they're already training on metamath.
>>>
>>> https://ai.facebook.com/blog/large-language-model-llama-meta-ai/
>>>
>>> It links to this page and this paper which talk about the matheamtical
>>> aspects
>>>
>>> https://ai.facebook.com/blog/ai-math-theorem-proving/
>>>
>>> https://arxiv.org/abs/2205.11491
>>>
>>> It's interesting they talk about the "metamath benchmark". Maybe one
>>> approach is to take that and have a page which presents the metamath
>>> database for training (kind of like the .txt files or json or whatever is
>>> preferred) and then has some notes about the papers that have already been
>>> published about it (gpt-f, LLama) and their results. Might be interesting.
>>>
>>> On Sunday, February 19, 2023 at 10:39:55 PM UTC David A. Wheeler wrote:
>>>
>>>>
>>>> > On Sun, Feb 19, 2023 at 2:19 PM David A. Wheeler <
>>>> dwhe...@dwheeler.com> wrote:
>>>> > My first choice when building a website, where it's easily achieved,
>>>> is static rendering (statically-generated pages) where the most important
>>>> data is visible without client-side JavaScript. Those are easily
>>>> understood, incredibly hard to attack, trivial to configure, trivial to
>>>> mirror, support security/privacy-conscious users, etc. Some sites *cannot*
>>>> be developed this way, of course!! But using that as the "starting point"
>>>> helps clarify "what else do you need?" - that is, it forces a discussion of
>>>> "what are the unspoken requirements?". It's also where we are in the
>>>> metamath site.
>>>> >
>>>> > With pre-rendering as a replacement for dynamic server side stuff, we
>>>> can still do most of the things. For example we could prerender all the
>>>> json queries and use client side JS to request them when doing cross
>>>> linking stuff. This is only made difficult if the set of all queries is
>>>> infinite, for example if we can do a search query.
>>>>
>>>>
>>>> > On Feb 19, 2023, at 3:13 PM, Mario Carneiro <di....@gmail.com>
>>>> wrote:
>>>> > The main advantages of dynamic rendering are:
>>>> > (1) quicker turnaround time, e.g. if we want to hastily put up a
>>>> mm100 theorem on the website;
>>>> > (2) faster iteration - it is possible to make changes to the server
>>>> and see the results ~immediately;
>>>> > (3) lower storage cost, since we won't have to generate all the HTML
>>>> for the web site in advance and just cache important stuff (I assume that
>>>> most of the work going into the website build is wasted, since no one looks
>>>> at that page before it is next regenerated);
>>>> > (4) the ability to have many more rendering styles (unicode, gif,
>>>> text, latex, mathml, json have all been considered and more are possible)
>>>> and more databases, which is a combinatorial problem for the website build
>>>> limited by storage space and generation time.
>>>>
>>>> Sure. As always, it's all about trade-offs :-).
>>>>
>>>> I don't know if quicker turnaround time & such is all that important.
>>>> A "local" server run (by, say, a tool that called metamath-knife) would
>>>> have the same
>>>> tooling problems: You'd need to set up additional tools to run the tool
>>>> locally.
>>>>
>>>> Of that list, the ability to have more rendering styles is possibly the
>>>> most compelling of that list.
>>>> But exactly what rendering styles?
>>>>
>>>>
>>>> > > We can obviously use a program to dynamically generate many of the
>>>> pages instead, but what would it need to do?
>>>> >
>>>> > For the first version, I would aim simply to replicate the behavior /
>>>> look and feel of the current metamath web pages. That's what I implemented
>>>> in https://github.com/digama0/mm-web-rs . Once we have something that
>>>> can potentially replace the functionality of metamath-exe it will be easier
>>>> to consider incremental improvements, in an architecture where we can do
>>>> fancy things if we want.
>>>>
>>>> Okay, but what is desirable that is *different* long term? I think
>>>> there are plausible good answers,
>>>> but it'd be good to have some idea about them.
>>>>
>>>> > One technique that I use in the MM0 web pages is to use the HTML as
>>>> the data source: put all the proofs in a table like they would be normally,
>>>> but perhaps not as marked up as they should be. Then postprocess it with JS
>>>> to add all the markup and make additional server requests if needed. That
>>>> solves the "make the JS-disabled experience palatable" problem, since you
>>>> just get the basic version of the page if no JS is running.
>>>>
>>>> Yup! Sounds like a great approach.
>>>>
>>>> > The sort of thing that might require additional data is parsing
>>>> support; for example with MM0 pages the formulas are printed in ascii but
>>>> ideally they would reflow using the layout algorithm, using correct
>>>> indentation. Metamath web pages are horrible at this; they get good
>>>> per-token rendering but reflowing is garbage so hopefully you just don't
>>>> have too-long lines. mmj2 has a nice indenting formatter (actually there
>>>> are several algorithms available); wouldn't it be nice if that could be
>>>> done in-browser so that it adapts to the viewport width? If the lines are
>>>> sent in HTML as token strings, then that means that the JS has to do the
>>>> parsing itself so it needs to know the signatures of declarations like
>>>> 'wi'.
>>>>
>>>> An alternative is to provide HTML in a table (or some other grid),
>>>> along with something that's simpler to parse but not shown to someone just
>>>> looking at the HTML. It could be in an attribute, something with "display:
>>>> none", whatever. If there's a client-side renderer being used, I would want
>>>> it to be *easy* to write, not make it parse complex HTML. The individual
>>>> theorems aren't big, so it'd be pretty easy to include both in one file.
>>>>
>>>> > Also expensive queries can be hidden behind an expandable drop-down
>>>> and a server request. We probably want to keep axioms-used and used-by
>>>> lists enabled by default even though they are somewhat expensive (although
>>>> if the server has some time to warm up and precompute some data structures
>>>> then it can cheaply answer these requests), but there are some other
>>>> requests along the same lines, like "how many steps would this proof be if
>>>> everything was expanded to axioms" (which is fun if you like to look at
>>>> numbers in the range of 10^50), which are accessible to metamath-exe but
>>>> not shown on the web page.
>>>>
>>>> Agreed. You can do the same with pregenerated static files, BTW, where
>>>> searches bring you to a *dynamic* program that does the search & provides
>>>> the results.
>>>>
>>>> An alternative to a lengthy warm-up each time would be to precompute
>>>> some data structures & answers, then store them in a database (probably
>>>> using SQLite, but MySQL or PostgreSQL would work, though perhaps even a
>>>> simple key/value store would be enough). Or do the lengthy warm-up only
>>>> when that .mm has changed, then store that in a database. Then server
>>>> restarts can be fast when the .mm file hasn't changed.
>>>>
>>>> > > I don't know if we need a JSON format for the .mm file as a whole.
>>>> ...
>>>> >
>>>> > I agree with this. The main use case for JSON is at the per-theorem
>>>> level; a JSON for the mm file as a whole would be huge and you would be
>>>> better off parsing MM directly at that point, since it would be more
>>>> efficient. Some other potential files are table-of-contents like things,
>>>> which cover the whole mm file but only represent one 'aspect' of the
>>>> theorems. So an actual table of contents (the section name hierarchy), the
>>>> theorem pages (i.e. give the statements of theorems 1000-1100), the
>>>> definition list, the grammar (syntax axioms). The bare list of all theorem
>>>> names, in order, might also be a useful query - I would just give this in
>>>> plain text, newline separated. Maybe include the $f/$e/$a/$p
>>>> classifications too.
>>>>
>>>> --- David A. Wheeler
>>>>
>>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Metamath" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to metamath+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/metamath/3fcc0b30-42ad-4bda-9627-73aa59c66fa2n%40googlegroups.com
>>> <https://groups.google.com/d/msgid/metamath/3fcc0b30-42ad-4bda-9627-73aa59c66fa2n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to metamath+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/metamath/CAFXXJSvC5DCjsxGFS7Kw3CVBCF8duMm332QvcB9CZbSk0EJqxQ%40mail.gmail.com.

Re: [Metamath] Future website directions

Reply via email to