FYI I did a fairly comprehensive HTML modernization in mm-web-rs [ https://github.com/digama0/mm-web-rs/blob/master/src/render.rs] (which is basically a clone of the metamath-exe theorem page generator), which among other things lowercased all the tags and made it pass the w3c validator again (there is a link at the bottom of most pages but they don't pass more modern HTML standards and still use things like <FONT>).
On Sun, Feb 26, 2023 at 11:52 AM David A. Wheeler <dwhee...@dwheeler.com> wrote: > I completely agree, a page on AI/ML would be great. The pages could do > with a refresh anyway, many places suggest contacting Norm. > > Norm used an HTML 1.0 convention of using uppercase tags. I suggest that > new text use lowercase tags as a style choice > It doesn't matter technically, but it is the current convention, and it > would make it easier to see which parts are unchanged. > > On February 25, 2023 5:33:02 PM EST, Mario Carneiro <di.g...@gmail.com> > wrote: >> >> A page with short descriptions of papers and projects using Metamath for >> AI would be great to put on the website. I think it would be good publicity >> for both metamath and the AI systems, and maybe we can solicit the paper >> authors to write the descriptions and provide related links as well. >> >> On Sat, Feb 25, 2023 at 5:27 PM Jon P <drjonpenn...@gmail.com> wrote: >> >>> Re the subject of LLMs being trained on metamath this new model called >>> LLama from facebook is interesting. It seems to be smaller than other >>> systems and it looks like they're already training on metamath. >>> >>> https://ai.facebook.com/blog/large-language-model-llama-meta-ai/ >>> >>> It links to this page and this paper which talk about the matheamtical >>> aspects >>> >>> https://ai.facebook.com/blog/ai-math-theorem-proving/ >>> >>> https://arxiv.org/abs/2205.11491 >>> >>> It's interesting they talk about the "metamath benchmark". Maybe one >>> approach is to take that and have a page which presents the metamath >>> database for training (kind of like the .txt files or json or whatever is >>> preferred) and then has some notes about the papers that have already been >>> published about it (gpt-f, LLama) and their results. Might be interesting. >>> >>> On Sunday, February 19, 2023 at 10:39:55 PM UTC David A. Wheeler wrote: >>> >>>> >>>> > On Sun, Feb 19, 2023 at 2:19 PM David A. Wheeler < >>>> dwhe...@dwheeler.com> wrote: >>>> > My first choice when building a website, where it's easily achieved, >>>> is static rendering (statically-generated pages) where the most important >>>> data is visible without client-side JavaScript. Those are easily >>>> understood, incredibly hard to attack, trivial to configure, trivial to >>>> mirror, support security/privacy-conscious users, etc. Some sites *cannot* >>>> be developed this way, of course!! But using that as the "starting point" >>>> helps clarify "what else do you need?" - that is, it forces a discussion of >>>> "what are the unspoken requirements?". It's also where we are in the >>>> metamath site. >>>> > >>>> > With pre-rendering as a replacement for dynamic server side stuff, we >>>> can still do most of the things. For example we could prerender all the >>>> json queries and use client side JS to request them when doing cross >>>> linking stuff. This is only made difficult if the set of all queries is >>>> infinite, for example if we can do a search query. >>>> >>>> >>>> > On Feb 19, 2023, at 3:13 PM, Mario Carneiro <di....@gmail.com> >>>> wrote: >>>> > The main advantages of dynamic rendering are: >>>> > (1) quicker turnaround time, e.g. if we want to hastily put up a >>>> mm100 theorem on the website; >>>> > (2) faster iteration - it is possible to make changes to the server >>>> and see the results ~immediately; >>>> > (3) lower storage cost, since we won't have to generate all the HTML >>>> for the web site in advance and just cache important stuff (I assume that >>>> most of the work going into the website build is wasted, since no one looks >>>> at that page before it is next regenerated); >>>> > (4) the ability to have many more rendering styles (unicode, gif, >>>> text, latex, mathml, json have all been considered and more are possible) >>>> and more databases, which is a combinatorial problem for the website build >>>> limited by storage space and generation time. >>>> >>>> Sure. As always, it's all about trade-offs :-). >>>> >>>> I don't know if quicker turnaround time & such is all that important. >>>> A "local" server run (by, say, a tool that called metamath-knife) would >>>> have the same >>>> tooling problems: You'd need to set up additional tools to run the tool >>>> locally. >>>> >>>> Of that list, the ability to have more rendering styles is possibly the >>>> most compelling of that list. >>>> But exactly what rendering styles? >>>> >>>> >>>> > > We can obviously use a program to dynamically generate many of the >>>> pages instead, but what would it need to do? >>>> > >>>> > For the first version, I would aim simply to replicate the behavior / >>>> look and feel of the current metamath web pages. That's what I implemented >>>> in https://github.com/digama0/mm-web-rs . Once we have something that >>>> can potentially replace the functionality of metamath-exe it will be easier >>>> to consider incremental improvements, in an architecture where we can do >>>> fancy things if we want. >>>> >>>> Okay, but what is desirable that is *different* long term? I think >>>> there are plausible good answers, >>>> but it'd be good to have some idea about them. >>>> >>>> > One technique that I use in the MM0 web pages is to use the HTML as >>>> the data source: put all the proofs in a table like they would be normally, >>>> but perhaps not as marked up as they should be. Then postprocess it with JS >>>> to add all the markup and make additional server requests if needed. That >>>> solves the "make the JS-disabled experience palatable" problem, since you >>>> just get the basic version of the page if no JS is running. >>>> >>>> Yup! Sounds like a great approach. >>>> >>>> > The sort of thing that might require additional data is parsing >>>> support; for example with MM0 pages the formulas are printed in ascii but >>>> ideally they would reflow using the layout algorithm, using correct >>>> indentation. Metamath web pages are horrible at this; they get good >>>> per-token rendering but reflowing is garbage so hopefully you just don't >>>> have too-long lines. mmj2 has a nice indenting formatter (actually there >>>> are several algorithms available); wouldn't it be nice if that could be >>>> done in-browser so that it adapts to the viewport width? If the lines are >>>> sent in HTML as token strings, then that means that the JS has to do the >>>> parsing itself so it needs to know the signatures of declarations like >>>> 'wi'. >>>> >>>> An alternative is to provide HTML in a table (or some other grid), >>>> along with something that's simpler to parse but not shown to someone just >>>> looking at the HTML. It could be in an attribute, something with "display: >>>> none", whatever. If there's a client-side renderer being used, I would want >>>> it to be *easy* to write, not make it parse complex HTML. The individual >>>> theorems aren't big, so it'd be pretty easy to include both in one file. >>>> >>>> > Also expensive queries can be hidden behind an expandable drop-down >>>> and a server request. We probably want to keep axioms-used and used-by >>>> lists enabled by default even though they are somewhat expensive (although >>>> if the server has some time to warm up and precompute some data structures >>>> then it can cheaply answer these requests), but there are some other >>>> requests along the same lines, like "how many steps would this proof be if >>>> everything was expanded to axioms" (which is fun if you like to look at >>>> numbers in the range of 10^50), which are accessible to metamath-exe but >>>> not shown on the web page. >>>> >>>> Agreed. You can do the same with pregenerated static files, BTW, where >>>> searches bring you to a *dynamic* program that does the search & provides >>>> the results. >>>> >>>> An alternative to a lengthy warm-up each time would be to precompute >>>> some data structures & answers, then store them in a database (probably >>>> using SQLite, but MySQL or PostgreSQL would work, though perhaps even a >>>> simple key/value store would be enough). Or do the lengthy warm-up only >>>> when that .mm has changed, then store that in a database. Then server >>>> restarts can be fast when the .mm file hasn't changed. >>>> >>>> > > I don't know if we need a JSON format for the .mm file as a whole. >>>> ... >>>> > >>>> > I agree with this. The main use case for JSON is at the per-theorem >>>> level; a JSON for the mm file as a whole would be huge and you would be >>>> better off parsing MM directly at that point, since it would be more >>>> efficient. Some other potential files are table-of-contents like things, >>>> which cover the whole mm file but only represent one 'aspect' of the >>>> theorems. So an actual table of contents (the section name hierarchy), the >>>> theorem pages (i.e. give the statements of theorems 1000-1100), the >>>> definition list, the grammar (syntax axioms). The bare list of all theorem >>>> names, in order, might also be a useful query - I would just give this in >>>> plain text, newline separated. Maybe include the $f/$e/$a/$p >>>> classifications too. >>>> >>>> --- David A. Wheeler >>>> >>>> -- >>> You received this message because you are subscribed to the Google >>> Groups "Metamath" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to metamath+unsubscr...@googlegroups.com. >>> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/metamath/3fcc0b30-42ad-4bda-9627-73aa59c66fa2n%40googlegroups.com >>> <https://groups.google.com/d/msgid/metamath/3fcc0b30-42ad-4bda-9627-73aa59c66fa2n%40googlegroups.com?utm_medium=email&utm_source=footer> >>> . >>> >> -- You received this message because you are subscribed to the Google Groups "Metamath" group. To unsubscribe from this group and stop receiving emails from it, send an email to metamath+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/metamath/CAFXXJSvC5DCjsxGFS7Kw3CVBCF8duMm332QvcB9CZbSk0EJqxQ%40mail.gmail.com.