Re: [Metamath] Re: MM0 -> MM translation challenges

Mario Carneiro Mon, 26 May 2025 13:55:36 -0700

On Mon, May 26, 2025 at 4:54 PM Gino Giotto <[email protected]>
wrote:

> > You can do that, but ideally this would be done as a change to the
> exporter, so that the theorems themselves can be stated using the notation,
> and also so that it stays up to date with set.mm. So the exporter will
> have to identify whether a notation is prefix or infix or a general
> notation, and there will be compromises needed on some set.mm notations
> that are mm0-ambiguous, like wa `( ph /\ ps )` and w3a `( ph /\ ps /\ th
> )`. In MM0 every notation has to start with a unique token.
>
> Ok, this sounds good to me. I will probably need one example to see how it
> works, but after that I should be able to figure out how to carry on with
> the notation. For w3a, one could disambiguate using a slightly different
> symbol, not already present in set.mm, like `( ph /\. ps /\. th )`. A
> problem with this approach is that someone might add a definition in
> set.mm with the `/\.` symbol in the future and then the translation would
> break because of a conflict between tokens. One could avoid it by using
> special hidden symbols, but this is probably against the
> transparency philosophy of MM0, so I'm sure there are plenty of reasons why
> this is bad.

The translator already has to do things like this when renaming theorems,
since MM allows dots, hyphens, and underscores in labels while only
underscores are legal in MM0, leading to potential name clashes. Currently
it will resolve overlaps by just appending a sequence number, but it means
you need to have the context of what has been translated so far to perform
back translation reliably. You could have a more sophisticated scheme that
is actually injective, but I think it would make the common case look worse.

> Another possibility could be to tell the exporter to automatically replace
> `(w3a ph ps ch)` with `(wa (wa ph ps) ch)` everywhere before introducing
> the notation. The original set.mm would not be preserved, but since the
> theorem being edited in MM1 would verify anyway in the original set.mm, I
> think it would not be a problem.
>

This is definitely going to result in more idiomatic results on the MM0
side, at the loss of injectivity.

> There are also df-op `<. A , B >.` and df-ot `<. A , B , C >.` that have
> the same obstacle. Dropping df-ot might be ok, but this approach becomes
> problematic for definitions like df-s1, df-s2, df-s3, df-s4, etc.. which
> really make statements about words shorter. I think dropping those might
> not be acceptable.
>

Currently, the nearest case of this occurring in MM1 is in lists and
(hexa)decimal numbers, and in this case the terms as written are just
stacks of "cons" operations like `a : b : c : d : 0`. (Not having to put
parentheses in this expression makes it much more palatable than the MM
equivalent.) In the case of numbers, there is also a special parser written
as a tactic which allows writing `,1234` which gets parsed into `x4 :x xd
:x x2` where `a :x b` is the infix operator 16a + b. The same mechanism can
parse string constants so you can write `,"hello"` and it becomes `ch x6 x8
': ch x6 x5 ': ch x6 xc ': ch x6 xc ': ch x6 xf ': s0` which is the
encoding of that string as a sequence of ASCII bytes.

That is all to say that if it were a native MM1 issue, I would be inclined
to solve it using a custom parser and keep the terms simple. But for
representing MM terms faithfully, the easy solution would be to add <"1 ...
<"8 constants, or maybe if you want to get clever, <', <", <"', <"" and
then <"5, ... <"8. (For the closing bracket, there is no requirement that
it be unique, you can use "> for all of them or bracket match or whatever.)

> I guess the easy ones to automate are class constants. Looking at
> peano.mm0, I notice that (all?) constants are defined as prefix operators
> with maximum precedence (e.g. def d1:  nat = $suc 0$; prefix d1:  $1$  prec
> max;). If that's the case, then it seems feasible to make a complete manual
> list of problematic definitions and decide how to handle them. The vast
> majority of set.mm definitions are just class symbols all different
> between each other.
>

I don't know whether this is a good idea, but for class constants there is
another approach, which is to just not have a notation at all, since unlike
metamath, mm0 lets you choose to use notation or not for any constant. So
if it's a constant with a name which is a valid label (like Grp) you can
just do `def Grp = $ ... $;` and then use `Grp` directly in later
statements. The examples with numbers are only there because numbers are
not valid labels (in MM0; MM lets you have names including initial numbers
and dots leading to things like the amusingly named 0.999...
<https://us.metamath.org/mpeuni/0.999....html>). But in constant symbols in
notations you can use almost any sequence of characters except for spaces
and dollar signs, which basically coincides with MM token rules.

On Mon, May 26, 2025 at 6:22 PM Jim Kingdon <[email protected]> wrote:

> A hybrid approach would translate to `(wa (wa ph ps) ch)` with some kind
> of annotation to enable round tripping but when I think of designing
> that sort of annotation mechanism I'm not sure it ends up really
> different from the /\. solution.
>

You could also not have an annotation and just opportunistically use it
when you think it will look better by some kind of analysis. This gets
really complicated really fast though, especially since you need to patch
up the proofs to make sure they remain coherent if you change your mind
about the interpretation of a subterm midway through a proof.

> I haven't been close enough to MM0/MM1 to have a fully fleshed out
> opinion about how to handle these issues, but I hope that it isn't
> preventing people from doing things with MM0/MM1 and set.mm. At least
> from a distance these issues look solvable to me.
>

Well, from the timeline on this thread, the real issue is that no one
actually has time to do the work. :) Or at least, I don't have time and
most others aren't or aren't yet in a position to do it themselves.

> > There are also df-op `<. A , B >.` and df-ot `<. A , B , C >.` that
> > have the same obstacle. Dropping df-ot might be ok, but this approach
> > becomes problematic for definitions like df-s1, df-s2, df-s3, df-s4,
> > etc.. which really make statements about words shorter. I think
> > dropping those might not be acceptable.
>
> ot is less used than 3an and 3or but I guess my mind first goes to
> automatically translating between the metamath notation and something
> which is better behaved in the ways we have been discussing. Maybe I'm
> just used to the metamath notation but given that it seems like it could
> be automatically translated, it doesn't seem like a major obstacle.
>

In MM0 the pair syntax is `a, b`, and this generalizes fairly well to
longer tuples as `a, b, c, d` (although note it's right associated unlike
metamath, because this behaves better when defining things inductively).
This is actually super useful, I make use of it all the time to have the
equivalent of df-opr, df-mpo and similar things without having to make N
versions of it for every size of ordered N-tuple, like this
<https://github.com/digama0/mm0/blob/ec331c36028025698b564593ac4f6af9cdbbbbf2/examples/mm0.mm1#L2723>
(which is defining tupled 5-ary function) or this
<https://github.com/digama0/mm0/blob/ec331c36028025698b564593ac4f6af9cdbbbbf2/examples/compiler-new.mm1#L270>
(a 6-ary class-valued function).

Now, I can think of other reasons why 3an and 3or might not be so great
> (basically the overhead of having both an and 3an when they do the same
> thing), but I don't know if I'd decide it based on notation.
>

Indeed I think this is one of the failings of metamath, it ties syntax and
semantics a bit *too* strongly, meaning that people will make definitions
to improve visual display. df-3an in particular I think does not pay for
itself, it instead generates a huge amount of combinatorial growth of
spellings for the simpl* and ad*ant* family. (df-3or I think doesn't matter
because no one ever uses it.) If you can declare a notation to say that two
ands in a particular arrangement can be written without parentheses, you
never need to have df-3an in the first place.

-- 
You received this message because you are subscribed to the Google Groups 
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/metamath/CAFXXJSumdYEOLFEQtigL-uyQDq%2BkkrgyDkp7oQXuZjidXe%3DmEw%40mail.gmail.com.

Re: [Metamath] Re: MM0 -> MM translation challenges

Reply via email to