Re: LR(1) grammar/parser and lookahead-restrictions

Dmitry Soshnikov Wed, 01 Feb 2017 10:25:57 -0800

On Wed, Feb 1, 2017 at 7:37 AM, Michael Dyck <jmd...@ibiblio.org> wrote:


> On 17-01-31 02:17 PM, Dmitry Soshnikov wrote:
>
>> The spec is a formal language definition (including for syntactic
>> grammar).
>>
>
> (Well, the extent to which the spec is formal is debatable. The grammar is
> the most formal part of it, but that's true of most programming language
> definitions.)
>
> It doesn't talk about specifics of implementation much, nor it's supposed
>> to
>> do so (so demanding this from spec isn't appropriate).
>>
>
> Right (and I'm pretty sure I didn't make such a demand).
>
> It doesn't tell one to implement an LR(1),
>>
>
> Nor should it.
>
> neither it describes the
>> techniques *how* one going to do so (even though as was confirmed in this
>> thread it's LR(1)-parsable). It just uses some formal notation for this,
>> e.g. `lookahead ∉ {{, function}`. Specific techniques *how exactly* you
>> going to achieve this, is not the purpose of the specification.
>>
>
> Right.



OK, I mostly meant that a spec can be fully abstract, even in abstract
definitions themselves, and use different notations instead of providing
fully "CFG-equivalence for lookahead restrictions".


>
>
> (but it's possible to clarify such techniques on discussions which we have
>> now -- please note though, the techniques described above are not an
>> "assumption", or a "guess", they are actual techniques used on practice to
>> achieve this in LR(1)).
>>
>
> I'm not disputing the actualness of the techniques. I'm saying:
>
>  - technique #1 (duplicated productions): As stated, looks insufficient
> for the ES grammar. Which may just mean that there's a more complicated
> version of the approach that *is* sufficient for ES.
>


Here's an example:
https://raw.githubusercontent.com/zaach/jison/master/examples/jscore.jison
(see how lookahead restriction for `{` in `ExpressionStatement` is
implemented using `NoBF` productions).



>
>  - technique #2 (cover grammar): Can you algorithmically (a) construct the
> cover grammar and (b) recover the parse according to the original grammar?
>

And this approach I chose (instead of duplicating productions) to handle
the same `{` lookahead restriction, what this comment explains:
https://github.com/DmitrySoshnikov/syntax/blob/master/examples/lang.bnf#L206-L222

As mentioned, Cover grammar is usually the process of the grammar design
itself (as in ES6 spec itself). I'm not aware of automatic transformations
for this (if you find any please let me know).



>
>  - technique #3 (parse-time actions): In the LR automaton, can you
> algorithmically distinguish action-avoided conflicts from 'true' conflicts?
>
>
> The spec in definitions is not consistent though. In one place it uses
>> explicit *technique* for lookahead restriction via the "No<lookahead>"
>> productions. In some cases it just uses formal description as  `lookahead
>> ∉
>> {{, function}`, leaving a specific technique to achieve this up to
>> implementations.
>>
>> In fact (for consistency), the rule from ES5:
>>
>> ```
>> for ( ExpressionNoIn ; Expression ; Expression ) Statement
>> ```
>>
>> could be described as:
>>
>> ```
>> for ( `lookahead ∉ {in}` Expression ; Expression ; Expression ) Statement
>> ```
>>
>> and would mean the same.
>>
>
> No, they wouldn't. ExpressionNoIn (or Expression[~In]) is a version of
> Expression that doesn't ("openly") use the 'in' operator (at the
> RelationalExpression level), i.e. it doesn't have the 'in' operator
> somewhere in the *middle*, whereas
>    [lookahead ∉ {in}] Expression
> prohibits an Expression that *starts* with the token 'in'. You can't
> express the 'NoIn' restriction with any finite amount of lookahead, because
> a RelationalExpression can be arbitrarily large.
>
>
I agree, the example not the best, I was leaning to, that a spec could use
even more abstract definition for this instead of explicit productions.



>
> You may search for ES5 LR(1) construction (there are several over
>> internets), they use the described above techniques for `lookahead ∉ {{,
>> function}` (adding `NoCurlyBrace`, or `NoFunction` productions).
>>
>
> But the chances are low that they've been algorithmically generated from
> the ES spec. And while it's conceivable I could figure out how to do it
> algorithmically by examining examples of it done manually, I'm doubtful of
> that too.
>
>
I'm not sure about "algorithmic" transformations, but yeah, how the grammar
is described in the spec, and which extra transformation you have to do in
the resulting LR(1), might be different things. Does it mean the spec
doesn't define an LR(1) grammar? Perhaps, but it doesn't really matter,
since again the spec uses formal notation, and for shortness may omit
actual techniques.



> That's also said, it could be rewritten using Cover grammar which adds
>> though post-processing overhead -- that might be the reason why some of
>> the `No<lookahead>` productions were removed from ES6 spec,
>>
>
>
Yeah, seems most of them were replaced with inverted notation (using [In]
suffix) -- still leaving the interpretation of this abstract notation up to
implementations.

P.S:

All this reduces to that practically today it might be easier to do just a
manual recursive decent/accent. Yes, it's a lot of recursion, yes, it might
be slower than table-driven parser. But gives you much control, and avoid
all those described techniques (hence a manual recursive descent is chosen
on practice by most of the implementations today). Or, the syntax of a
language could be different itself, and avoid ambiguities.

Dmitry

_______________________________________________
es-discuss mailing list
es-discuss@mozilla.org
https://mail.mozilla.org/listinfo/es-discuss

Re: LR(1) grammar/parser and lookahead-restrictions

Reply via email to