Re: Let's stop parser Hell

David Piepgrass Sat, 07 Jul 2012 14:45:20 -0700

On Saturday, 7 July 2012 at 20:39:18 UTC, Roman D. Boiko wrote:

On Saturday, 7 July 2012 at 20:26:07 UTC, David Piepgrass wrote:
I'd like to add that if we give tree parsing first-class treatment, I believe the most logical approach to parsing has three or more stages instead of the traditional two (lex+parse):
1. Lexer
2. Tree-ification
3. Parsing to AST (which may itself use multiple stages, e.g. parse the declarations first, then parse function bodies later)
The new stage two simply groups things that are in parentheses and braces. So an input stream such as the following:
I bet that after stage 2 you would have performed almost the same amount of work (in other words, spent almost the same time) as you would if you did full parsing. After that you would need to iterate the whole tree (possibly multiple times), modify (or recreate if the AST is immutable) its nodes, etc. Altogether this might be a lot of overhead.
My opinion is that tree manipulation is something that should be available to clients of parser-as-a-library or even of parser+semantic analyzer, but not necessarily advantageous for parser itself.

Hmm, you've got a good point there, although simple tree-ification is clearly less work than standard parsing, since statements like "auto x = y + z;" would be quickly "blitted" into the same node in phase 2, but would become multiple separate nodes in phase 3.
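
To make the phase 2 idea concrete, here is a minimal sketch in Python (not D, and not taken from any existing front-end). The names lex, treeify, TOKEN_RE and CLOSERS are hypothetical, and the token pattern is a deliberately crude assumption: identifiers, integers, or any single non-space character. It only illustrates that grouping by parentheses and braces needs a single pass with a stack, and that the tokens of a statement are copied into the enclosing node untouched.

```python
import re

# Assumed, simplified token pattern: identifier, integer, or one other character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")

# Only parentheses and braces are grouped, as described in the post.
CLOSERS = {"(": ")", "{": "}"}


def lex(source):
    """Stage 1: split source text into a flat list of token strings."""
    return TOKEN_RE.findall(source)


def treeify(tokens):
    """Stage 2: group tokens between matching brackets into nested lists.

    Each group is a list that starts with its opening bracket and ends
    with its closing bracket. All other tokens are appended unchanged.
    """
    root = []
    stack = [root]  # stack[-1] is the group currently being filled
    for tok in tokens:
        if tok in CLOSERS:
            group = [tok]
            stack[-1].append(group)
            stack.append(group)
        elif tok in CLOSERS.values():
            # A closer must match the opener of the innermost open group.
            if len(stack) == 1 or CLOSERS[stack[-1][0]] != tok:
                raise SyntaxError(f"unmatched {tok!r}")
            stack[-1].append(tok)
            stack.pop()
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise SyntaxError(f"unclosed {stack[-1][0]!r}")
    return root


tree = treeify(lex("void f(int a) { auto x = y + z; }"))
print(tree)
```

In this sketch the function body comes out as one node holding the eight tokens of "auto x = y + z;" between its braces, with no further structure; splitting that into declaration and expression nodes is left to phase 3.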

What I like about it is not its performance, but how it matches the way we think about languages. Humans tend to see overall structure first, and examine the fine details later. The tree parsing approach is similarly nonlinear and can be modularized in a way that might be more intuitive than traditional EBNF.

On the other hand, one could argue it is /too/ flexible, admitting so many different approaches to parsing that a front-end based on this approach is not necessarily intuitive to follow; and of course, not using a standard EBNF-type grammar could be argued to be bad.

Still... it's a fun concept, and even if the initial parsing ends up using the good-old lex-parse approach, semantic analysis and lowering can benefit from a tree parser. Tree parsing, of course, is just a generalization of linear parsing and so a tree parser generator (TPG) could work equally well for flat input.
