Re: LR(1) grammar/parser and lookahead-restrictions

Michael Dyck Tue, 31 Jan 2017 07:57:06 -0800

On 17-01-30 06:20 PM, Dmitry Soshnikov wrote:

On Mon, Jan 30, 2017 at 2:22 PM, Michael Dyck <jmd...@ibiblio.org
<mailto:jmd...@ibiblio.org>> wrote:



                1. Using "No<Lookahead>" productions.

            The problem with this approach is that, in general, a
            lookahead-restriction is not a restriction on the nonterminal that
            immediately follows it in the production, it's a restriction on
            the next few input elements, which might be generated by more
            than just that nonterminal.

        Well, in this case it's not a "lookahead" anymore.

    It isn't a "lookahead" in the resulting grammar, but it is in the
    original, and the two grammars should define the same language. What I'm
    saying is that it's incorrect to assume that the effect of the lookahead
    (in the original) can be confined to the nonterminal that immediately
    follows, which is what the suggested transformation does.


A "lookahead" is a terminal token. A non-terminal is not a "lookahead".

Okay, so then in my previous statement, read "lookahead" as"lookahead-restriction". (But I did say "lookahead-restriction" in thestatement before that.)

To reiterate, imaging you have:

```
• (x, y)
```

and

```
• (x, y) -> { return x + y; }
```

where `•` is a cursor position. Your lookaheads are "(", "x", "," "y", etc.
Usually 1 lookahead token is enough to route the parser, in this case "(".

What you're talking that you need not these lookaheads (at the beginning of
the cursor), but *after* the parsed non-terminal (i.e. after fully parsed
`(x, y)`), i.e. you're looking for "->", which goes at the end of the
non-terminal.

I'm not saying I don't need the lookahead tokens right after the cursor. ButI am saying that the lookahead tokens that the hypothetical parser needs toexamine (in order to enforce a lookahead-restriction), even if it's only 1or 2, might extend past the 'next' nonterminal.


But that's an awkward way to express my objection.

To recap, I'm talking about the suggestion to replace:
    [lookahead != <some sequence of terminals>] Nonterminal
in a production in the original grammar with
    Nonterminal-that-does-not-start-with-that-sequence

(and a bunch of new productions) in the transformed grammar, and whetherthat's a valid transformation.

I'm saying it isn't valid when the Nonterminal can derive phrases with fewerterminals than the prohibited lookahead-sequence. (E.g., if Nonterminal isnullable, or if it can derive a phrase of length 1 when the prohibitedlookahead-sequence is length 2.)

This is not possible, since it's not a lookahead at that
specific cursor position. It will be a lookahead after you have parsed the
non-terminal corresponding to "(x, y)".

So you can't have two non-terminals for this, since it'll be a
"reduce/reduce" conflict, and you need a cover grammar here.

Well, I'm saying you'd run into problems before you even tried to constructthe LR automaton, but I think you're agreeing with me that this approach isnot generally applicable?

        In this case a cover grammar is simpler, and that's why it's used
        starting since ES6 spec.

    But note that none of the ES5 lookahead-restrictions was replaced with a
    cover grammar in ES6. They're all still lookahead-restrictions.

Sure, because it's normally possible to do with the "No<Lookahead>" doubled
productions.

But they weren't replaced with "No" productions (or grammatical parameters)either! All the lookahead-restrictions in ES5 are stilllookahead-restrictions in the current draft.

So this actually answers your initial question (it's a real practice).


No, I don't think it does.

----------------------------

                2. Another approach is to use Cover grammar.

        Usually it's a manual process of designing a grammar, and defining
        post-processing static semantics.

    Okay, I'm not really interested then. I'm looking for an algorithm
    (i.e., a modification of the LR algorithm) that starts with a
    grammar-with-lookahead-restrictions and ends with an LR parser.

OK, yeah, at this point I think we already clarified that it is possible to
parse ES grammar,  and handle it's lookahead restrictions with several
techniques described.

Well, ES implementations exist, so clearly there are techniques for handlingthose aspects of the ES grammar. But that doesn't mean that any of them areparsing *according to* the ES grammar. E.g., it's unlikely that they'refinding exactly the parse tree indicated by the ES grammar. (Nor is thereany requirement that they do so.)


----

By way of contrast, consider "opt" subscripts and grammatical parameters. Ifyou look up the LR parser-construction algorithm, it doesn't talk about howto handle such things, but the spec says how to transform them into plaincontext-free productions, which the LR algorithm can handle. So it's an easyshorthand to say that the ES grammar is LR(1) iff the 'expanded' grammar isLR(1).

But the spec doesn't give that kind of CFG-equivalence forlookahead-restrictions. Until there's agreement about what they mean in aCFG or LR sense, it isn't really meaningful to ask if the ES grammar is LR(1).


-Michael

_______________________________________________
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss

Re: LR(1) grammar/parser and lookahead-restrictions

Reply via email to