On 17-01-30 06:20 PM, Dmitry Soshnikov wrote:
On Mon, Jan 30, 2017 at 2:22 PM, Michael Dyck <jmd...@ibiblio.org
<mailto:jmd...@ibiblio.org>> wrote:
1. Using "No<Lookahead>" productions.
The problem with this approach is that, in general, a
lookahead-restriction is not a restriction on the nonterminal that
immediately follows it in the production, it's a restriction on
the next few input elements, which might be generated by more
than just that nonterminal.
Well, in this case it's not a "lookahead" anymore.
It isn't a "lookahead" in the resulting grammar, but it is in the
original, and the two grammars should define the same language. What I'm
saying is that it's incorrect to assume that the effect of the lookahead
(in the original) can be confined to the nonterminal that immediately
follows, which is what the suggested transformation does.
A "lookahead" is a terminal token. A non-terminal is not a "lookahead".
Okay, so then in my previous statement, read "lookahead" as
"lookahead-restriction". (But I did say "lookahead-restriction" in the
statement before that.)
To reiterate, imaging you have:
```
• (x, y)
```
and
```
• (x, y) -> { return x + y; }
```
where `•` is a cursor position. Your lookaheads are "(", "x", "," "y", etc.
Usually 1 lookahead token is enough to route the parser, in this case "(".
What you're talking that you need not these lookaheads (at the beginning of
the cursor), but *after* the parsed non-terminal (i.e. after fully parsed
`(x, y)`), i.e. you're looking for "->", which goes at the end of the
non-terminal.
I'm not saying I don't need the lookahead tokens right after the cursor. But
I am saying that the lookahead tokens that the hypothetical parser needs to
examine (in order to enforce a lookahead-restriction), even if it's only 1
or 2, might extend past the 'next' nonterminal.
But that's an awkward way to express my objection.
To recap, I'm talking about the suggestion to replace:
[lookahead != <some sequence of terminals>] Nonterminal
in a production in the original grammar with
Nonterminal-that-does-not-start-with-that-sequence
(and a bunch of new productions) in the transformed grammar, and whether
that's a valid transformation.
I'm saying it isn't valid when the Nonterminal can derive phrases with fewer
terminals than the prohibited lookahead-sequence. (E.g., if Nonterminal is
nullable, or if it can derive a phrase of length 1 when the prohibited
lookahead-sequence is length 2.)
This is not possible, since it's not a lookahead at that
specific cursor position. It will be a lookahead after you have parsed the
non-terminal corresponding to "(x, y)".
So you can't have two non-terminals for this, since it'll be a
"reduce/reduce" conflict, and you need a cover grammar here.
Well, I'm saying you'd run into problems before you even tried to construct
the LR automaton, but I think you're agreeing with me that this approach is
not generally applicable?
In this case a cover grammar is simpler, and that's why it's used
starting since ES6 spec.
But note that none of the ES5 lookahead-restrictions was replaced with a
cover grammar in ES6. They're all still lookahead-restrictions.
Sure, because it's normally possible to do with the "No<Lookahead>" doubled
productions.
But they weren't replaced with "No" productions (or grammatical parameters)
either! All the lookahead-restrictions in ES5 are still
lookahead-restrictions in the current draft.
So this actually answers your initial question (it's a real practice).
No, I don't think it does.
----------------------------
2. Another approach is to use Cover grammar.
Usually it's a manual process of designing a grammar, and defining
post-processing static semantics.
Okay, I'm not really interested then. I'm looking for an algorithm
(i.e., a modification of the LR algorithm) that starts with a
grammar-with-lookahead-restrictions and ends with an LR parser.
OK, yeah, at this point I think we already clarified that it is possible to
parse ES grammar, and handle it's lookahead restrictions with several
techniques described.
Well, ES implementations exist, so clearly there are techniques for handling
those aspects of the ES grammar. But that doesn't mean that any of them are
parsing *according to* the ES grammar. E.g., it's unlikely that they're
finding exactly the parse tree indicated by the ES grammar. (Nor is there
any requirement that they do so.)
----
By way of contrast, consider "opt" subscripts and grammatical parameters. If
you look up the LR parser-construction algorithm, it doesn't talk about how
to handle such things, but the spec says how to transform them into plain
context-free productions, which the LR algorithm can handle. So it's an easy
shorthand to say that the ES grammar is LR(1) iff the 'expanded' grammar is
LR(1).
But the spec doesn't give that kind of CFG-equivalence for
lookahead-restrictions. Until there's agreement about what they mean in a
CFG or LR sense, it isn't really meaningful to ask if the ES grammar is LR(1).
-Michael
_______________________________________________
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss