On 17-01-30 06:20 PM, Dmitry Soshnikov wrote:
On Mon, Jan 30, 2017 at 2:22 PM, Michael Dyck <jmd...@ibiblio.org
<mailto:jmd...@ibiblio.org>> wrote:


                1. Using "No<Lookahead>" productions.

            The problem with this approach is that, in general, a
            lookahead-restriction is not a restriction on the nonterminal that
            immediately follows it in the production, it's a restriction on
            the next few input elements, which might be generated by more
            than just that nonterminal.

        Well, in this case it's not a "lookahead" anymore.

    It isn't a "lookahead" in the resulting grammar, but it is in the
    original, and the two grammars should define the same language. What I'm
    saying is that it's incorrect to assume that the effect of the lookahead
    (in the original) can be confined to the nonterminal that immediately
    follows, which is what the suggested transformation does.


A "lookahead" is a terminal token. A non-terminal is not a "lookahead".

Okay, so then in my previous statement, read "lookahead" as "lookahead-restriction". (But I did say "lookahead-restriction" in the statement before that.)


To reiterate, imaging you have:

```
• (x, y)
```

and

```
• (x, y) -> { return x + y; }
```

where `•` is a cursor position. Your lookaheads are "(", "x", "," "y", etc.
Usually 1 lookahead token is enough to route the parser, in this case "(".

What you're talking that you need not these lookaheads (at the beginning of
the cursor), but *after* the parsed non-terminal (i.e. after fully parsed
`(x, y)`), i.e. you're looking for "->", which goes at the end of the
non-terminal.

I'm not saying I don't need the lookahead tokens right after the cursor. But I am saying that the lookahead tokens that the hypothetical parser needs to examine (in order to enforce a lookahead-restriction), even if it's only 1 or 2, might extend past the 'next' nonterminal.

But that's an awkward way to express my objection.

To recap, I'm talking about the suggestion to replace:
    [lookahead != <some sequence of terminals>] Nonterminal
in a production in the original grammar with
    Nonterminal-that-does-not-start-with-that-sequence
(and a bunch of new productions) in the transformed grammar, and whether that's a valid transformation.

I'm saying it isn't valid when the Nonterminal can derive phrases with fewer terminals than the prohibited lookahead-sequence. (E.g., if Nonterminal is nullable, or if it can derive a phrase of length 1 when the prohibited lookahead-sequence is length 2.)

This is not possible, since it's not a lookahead at that
specific cursor position. It will be a lookahead after you have parsed the
non-terminal corresponding to "(x, y)".

So you can't have two non-terminals for this, since it'll be a
"reduce/reduce" conflict, and you need a cover grammar here.

Well, I'm saying you'd run into problems before you even tried to construct the LR automaton, but I think you're agreeing with me that this approach is not generally applicable?


        In this case a cover grammar is simpler, and that's why it's used
        starting since ES6 spec.

    But note that none of the ES5 lookahead-restrictions was replaced with a
    cover grammar in ES6. They're all still lookahead-restrictions.

Sure, because it's normally possible to do with the "No<Lookahead>" doubled
productions.

But they weren't replaced with "No" productions (or grammatical parameters) either! All the lookahead-restrictions in ES5 are still lookahead-restrictions in the current draft.

So this actually answers your initial question (it's a real practice).

No, I don't think it does.

----------------------------

                2. Another approach is to use Cover grammar.

        Usually it's a manual process of designing a grammar, and defining
        post-processing static semantics.

    Okay, I'm not really interested then. I'm looking for an algorithm
    (i.e., a modification of the LR algorithm) that starts with a
    grammar-with-lookahead-restrictions and ends with an LR parser.

OK, yeah, at this point I think we already clarified that it is possible to
parse ES grammar,  and handle it's lookahead restrictions with several
techniques described.

Well, ES implementations exist, so clearly there are techniques for handling those aspects of the ES grammar. But that doesn't mean that any of them are parsing *according to* the ES grammar. E.g., it's unlikely that they're finding exactly the parse tree indicated by the ES grammar. (Nor is there any requirement that they do so.)

----

By way of contrast, consider "opt" subscripts and grammatical parameters. If you look up the LR parser-construction algorithm, it doesn't talk about how to handle such things, but the spec says how to transform them into plain context-free productions, which the LR algorithm can handle. So it's an easy shorthand to say that the ES grammar is LR(1) iff the 'expanded' grammar is LR(1).

But the spec doesn't give that kind of CFG-equivalence for lookahead-restrictions. Until there's agreement about what they mean in a CFG or LR sense, it isn't really meaningful to ask if the ES grammar is LR(1).

-Michael

_______________________________________________
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss

Reply via email to