Re: [PEG] and-predicate with slightly different behavior?

.alyn.post. Sat, 11 Dec 2010 09:15:01 -0800

Will you explain in this example what you do with the tree object
afterwards?


In the example I gave, the production is converted to a number.

I think in your example, you're showing how to convert a parse tree
to a syntax tree, and I assume you later operator over the syntax
tree?

Thank you,

-Alan

On Sat, Dec 11, 2010 at 10:46:40AM +0100, Ondřej Bílka wrote:
> exp <- digit '*' digit EOF -> {(lambda (x y) (* x y))}
> This looks as step backwards
> How it is different from old yacc
> exp <- digit '*' digit EOF -> {$$ = (* $1 $3)}
> 
> My approach is different. I made tree structure explicit in rule by binding as
> in example
> 
> tree = '(' number:value tree?:left tree?:rigth ')' result(Tree)
> 
> result(Tree) creates Tree object and sets value,left,rigth fields to 
> corresponding values
> 
> On Fri, Dec 10, 2010 at 01:47:59PM +1300, Peter Cashin wrote:
> >    Mathias:
> >    A good plan to separate out the:
> >    - grammar operators
> >    - parse tree building
> >    - parser action code
> >    I agree with you, but in my case I want to have the grammar language
> >    totally independent of the implementation programming language.
> >    So the grammar has no parser action code, although this can be added 
> > later
> >    in a particular programming language implementation. Parboiled has much
> >    closer integration with the programming language, which has advantages,
> >    but I really want grammar specifications that are totally independent.
> >    So now for the grammar operators, and parse tree building annotations. 
> > For
> >    many years I kept these separate: tree building annotations in the rule
> >    head (ie they annotate the rule name) or definition syntax (= for 
> > interior
> >    nodes, : for leaf nodes or terminals). The right hand side of the rule 
> > had
> >    the grammar expression body withs grammar operations only. One more head
> >    notation allows pruning the tree.
> >    This works fine, and maybe I should have stuck to that separation, but I
> >    introduced the `x prefix notation because I found that it was an 
> > advantage
> >    to be able to see the children nodes that would be generated by looking 
> > at
> >    the parent rule alone, and not having to refer to the children rule
> >    definitions to see how they are annotated.
> >    I have found that about half of many grammars turn out to be leaf nodes,
> >    so the ":" rule definition notation has a huge payoff in tree size and
> >    performance.
> >    Because of this the `x rule is not vital, but over time I found it was
> >    convenient, and a good trade-off for me.
> >    Cheers,
> >    Peter.
> > 
> >    On Fri, Dec 10, 2010 at 11:44 AM, Mathias <[1][email protected]>
> >    wrote:
> > 
> >      Over the course of developing parboiled (PEG parsing engine for Java 
> > and
> >      Scala using internals DSLs for grammar specification) I found that it's
> >      best to clearly separate the following three things:
> >      - grammar operators
> >      - parse tree building
> >      - parser action code
> > 
> >      The operators, together with the rules they are applied to, define the
> >      language your grammar recognizes. Nothing more, nothing less.
> >      Whether your engine builds a parse tree for input of that language or
> >      whether it doesn't should likely not be specified at the operator 
> > level,
> >      but it some other way.
> >      In "parboiled for Java" operators and rules are modeled as method calls
> >      whereas parse tree building is controlled via annotations on those
> >      methods. You can selectively enable or disable the creation of parse
> >      tree nodes per grammar rule, which allows you to tweak parse tree
> >      building exactly to your needs.
> > 
> >      The interface between the parsing engine and custom parser action code
> >      is the thing that has changed the most over the course of parboileds
> >      life cycle so far.
> >      Initially parser action code had to access the parse tree nodes in 
> > order
> >      to get to matched input. Creating a custom object structure during the
> >      parsing run (e.g. an AST) was done by decorating the parse tree. So the
> >      parse tree was the "work bench" of the parsing process.
> >      Over time this heavy centering around the parse tree turned out to have
> >      two problems:
> > 
> >      1. Low Performance.
> >      Always having to create a parse tree even when all you care about is 
> > the
> >      AST is just wasteful. Especially since the parse tree can contain a 
> > huge
> >      number of nodes for larger inputs (sometimes more nodes than input
> >      characters).
> > 
> >      2. Less room for automatic optimizations.
> >      The structure of the parse tree is dictated by the structure of the
> >      grammar rules. If your parser action code is built under the assumption
> >      that the parse tree has a given structure there is little leeway for 
> > the
> >      parsing engine to apply automatic rule optimizations. Decoupling the
> >      parser action code from the parse tree opens up the possibility to 
> > apply
> >      all kinds of automatic grammar tweaking before running the parser the
> >      first time. The engine might decide to completely change the rule
> >      structure, as long as all changes do not change the recognized language
> >      and are transparent to the parser action code.
> > 
> >      Currently parboiled implements the interface between the parsing engine
> >      and the action code in the following way:
> >      1. Actions can appear anywhere in a rule.
> >      2. Actions can access the matched input text of the sub rule 
> > immediately
> >      preceding the action but not of any other rule (so there is no sub rule
> >      labeling required).
> >      3. For working with custom objects (e.g. AST nodes) the engine provides
> >      a "Value Stack", which is a simple stack structure that serves as a 
> > fast
> >      work bench for a parsing run. Actions can push objects onto this stack,
> >      pop them off, swap them around, and so on.
> > 
> >      This solution completely decouples the parse tree from everything else.
> >      You can enable or disable parse tree building without any effect on the
> >      rest of the parser. There is no need for addressing sub rules in action
> >      expressions and given a somewhat efficient value stack implementation
> >      the whole thing is quite fast. Additionally, in "parboiled for Scala",
> >      the action code with its manipulations of the value stack can be
> >      statically type-checked at compile time, which is a huge plus.
> > 
> >      In case you are interested in more details or broader explanations, the
> >      parboiled documentation is quite complete.
> > 
> >      Cheers,
> >      Mathias
> > 
> >      ---
> >      [2][email protected]
> >      [3]http://www.parboiled.org
> >      On 09.12.2010, at 21:01, Alan Post wrote:
> > 
> >      > I'm working on my PEG parser, in particular the interface between
> >      > the parse tree and the code one can attach to productions that
> >      > are executed on a successful parse.
> >      >
> >      > I've arranged for the two predicate operations, & and !, to not add
> >      > any output to the parse tree. �That means that the following
> >      > production:
> >      >
> >      > �rule <- &a !b "c"
> >      >
> >      > Produces the same parse tree as:
> >      >
> >      > �rule <- "c"
> >      >
> >      > Internally, this means that I recognize that the sequence operator
> >      > (which contains the productions '&a', '!b', and '"c"' in this
> >      > example) is being called with predicates in every position but one,
> >      > and rather than returning a list containing that single element,
> >      > I return just the single element.
> >      >
> >      > As I've been doing this, I've found that I want a new operator 
> > similar
> >      > to '&'. �'&' matches the production it is attached to, but it does 
> > not
> >      > advance the position of the input buffer.
> >      >
> >      > I'd like an operator that matches the production it is attached to,
> >      > advances the input buffer, but doesn't add anything to the parse
> >      > tree.
> >      >
> >      > Here's an example:
> >      >
> >      > �mulexp <- digit '*' digit EOF -> {(lambda (x y) (* x y))}
> >      >
> >      > the mulexp production is a sequence of four other rules, but only
> >      > two of them are needed by the associated code. �It would be nice
> >      > if I could write the code rule like it is above, rather than say
> >      > this:
> >      >
> >      > �(lambda (x op y EOF) (* x y))
> >      >
> >      > Having to account for all the rules in the sequence, but really
> >      > only caring about two of them. �Here is the example rewritten
> >      > with '^' expressing "match the rule, advance the input, but don't
> >      > modify the parse tree":
> >      >
> >      > �mulexp <- digit ^'*' digit ^EOF -> {(lambda (x y) (* x y))}
> >      >
> >      > Before I go inventing syntax for this use case, will you tell me if
> >      > this is already being done with other parsers? �Have any of you had
> >      > this problem and already solved it, and if so, what approach did you
> >      > take?
> >      >
> >      > -Alan
> >      > --
> >      > .i ko djuno fi le do sevzi
> >      >
> >      > _______________________________________________
> >      > PEG mailing list
> >      > [4][email protected]
> >      > [5]https://lists.csail.mit.edu/mailman/listinfo/peg
> > 
> >      _______________________________________________
> >      PEG mailing list
> >      [6][email protected]
> >      [7]https://lists.csail.mit.edu/mailman/listinfo/peg
> > 
> > References
> > 
> >    Visible links
> >    1. mailto:[email protected]
> >    2. mailto:[email protected]
> >    3. http://www.parboiled.org/
> >    4. mailto:[email protected]
> >    5. https://lists.csail.mit.edu/mailman/listinfo/peg
> >    6. mailto:[email protected]
> >    7. https://lists.csail.mit.edu/mailman/listinfo/peg
> 
> > _______________________________________________
> > PEG mailing list
> > [email protected]
> > https://lists.csail.mit.edu/mailman/listinfo/peg
> 
> 
> -- 
> 
> Plasma conduit breach
> 
> _______________________________________________
> PEG mailing list
> [email protected]
> https://lists.csail.mit.edu/mailman/listinfo/peg

-- 
.i ko djuno fi le do sevzi

_______________________________________________
PEG mailing list
[email protected]
https://lists.csail.mit.edu/mailman/listinfo/peg

Re: [PEG] and-predicate with slightly different behavior?

Reply via email to