On Thu, Feb 2, 2017 at 3:23 PM, Waldemar Horwat <walde...@google.com> wrote:
> On 02/01/2017 10:25, Dmitry Soshnikov wrote: > >> As mentioned, Cover grammar is usually the process of the grammar design >> itself (as in ES6 spec itself). I'm not aware of automatic transformations >> for this (if you find any please let me know). >> > > Cover grammars are an ugly hack that we had to add when there was no other > practical choice. Avoid them as much as possible. They are only used in > situations where lookahead restrictions and parametrized grammar rules do > not work in any practical way. > > Yeah, absolutely agreed. The reason why I used cover grammar in that example to parse `{` as a `Block` vs. `ObjectLiteral`, and handle `ExpressionStatement` is to make the resulting grammar short, and don't introduce a bunch of `NoCurly`, etc extra productions for this. That's also said, this ugly hack also forces you to do post-processing overhead -- either validation of the nodes, or even re-interpretation (rewriting) to other nodes -- in my case I have to actually check that all nodes between `{` and `}` are properties, or vice-versa, statements, based on the expression/statement position. Practically, a cover grammar still can be slower because of such post-processing overhead (imaging you have a giant file with just object -- all those now are needed to be validated to contain property nodes). OTOH, a trade off with lookahead restrictions is introducing ~1.5x more parsing states. > When designing the grammar, the preferences are: > > - Use standard LR(1) productions > - Use parametrized productions > - Use lookahead restrictions if parametrized productions would introduce > too many parameters into rules > - Use a cover grammar if the grammar can't be reasonably expressed in any > other way. They're a last resort with lots of problems. > > Just to double-check, by "parametrized productions" you mean the `No<lookahead>` extra productions? > Lookahead restrictions fit very well into an LR(1) engine as long as > they're limited to only one token, and that's what I've implemented in the > validator. You need to be very careful with them if looking more than one > token ahead because lexing of the tokens can vary based on context. For > example, if the next few characters in front of the cursor are )/abc/i+, > then what is the second token? What is the third token? It's actually > context-dependent. > > The same problem is even worse for cover grammars. > > Yeah, that's also true. Thanks again for the confirmation, that such a validator even exists, it's good -- at least we know it's parsable in principle. Those, that's said, on real practice now implementing a manual recursive descent for ECMAScript is more approachable now. Dmitry
_______________________________________________ es-discuss mailing list es-discuss@mozilla.org https://mail.mozilla.org/listinfo/es-discuss