Re: [PEG] and-predicate with slightly different behavior?

Ondřej Bílka Sat, 11 Dec 2010 13:22:51 -0800

On Sat, Dec 11, 2010 at 11:53:03AM -0700, Alan Post wrote:

>   expr <- #:left digit op #:right digit
>        -> {((eval (string->symbol (string op))) left right)}
> 
> Which is a very much a good suggestion.  I've added this to my
> backlog for this project.
> 
> Do you see anything else in these examples that you would do
> differently?
One of my PEG features is append operator which I denote by variable[]
that appends result to list(if result is list then concatenates them)
number:n[] ('+' number:n[] )*


I just have idea that we I collect results of rules to list rule+'s' and 
add as syntax sugar denote  n-th rule occurence as rule+n we can write
   expr <-  digit op digit
        -> {((eval (string->symbol (string op))) digit1 digit2)}

You could be interested at parametrized rules I use syntax
rule(params)
which translates to calling rule with params in host language. 
In params `expression` is translated to lambda matching given expression
For example 

listOf(rule,delim) =  apply(rule):res[] (apply(delim) apply(rule):res[])* -> res
                   | empty -> []

listOf(`number`,`"+"|"-"`)
> 
> Syntactically, I'm using the Racket-style #!key parameters (this is
> '#:left digit' vs. your example 'digit:left' and R5RS Scheme style
> strings ("foo\nbar") and characters (#\a or #\space), as that is my
> target language.  I consider those localization or style issues.  I
> very much appreciate comments on that decision, and I also
> appreciate comments on the design aspect of the interface between
> action and grammar.
If * collects results to list how looks #:n number* vs number*:n

> 
> Unlike parboiled, I don't have any explicit parse tree construction
> and manipulation.  I do have flexibility in the calling interface
> between the grammar and the parser actions, which is working on the
> same problem.  I have insufficient data to know whether I'll be
> developing a more explicit parse tree API as I continue to develop
> this parser, but the data I do have (from Mathias) has suggested it
> will be a natural extension of building a parser generator.
> 
> -Alan
> 
> On Sat, Dec 11, 2010 at 07:24:28PM +0100, Ondřej Bílka wrote:
> > Sorry I was replaying to two things at once. My objectinion about your 
> > example was first thing. 
> > My approach how do that is something like 
> > exp <- digit:x '*' digit:y EOF -> { (* x y)}
> > Then I started replaying to parboiled that we don't need build parse tree 
> > as syntax tree suffices.
> > Also in my peg you can refer to subrule used only once by its name.
> > On Sat, Dec 11, 2010 at 10:14:36AM -0700, .alyn.post. wrote:
> > > Will you explain in this example what you do with the tree object
> > > afterwards?
> > > 
> > > In the example I gave, the production is converted to a number.
> > > 
> > > I think in your example, you're showing how to convert a parse tree
> > > to a syntax tree, and I assume you later operator over the syntax
> > > tree?
> > > 
> > > Thank you,
> > > 
> > > -Alan
> > > 
> > > On Sat, Dec 11, 2010 at 10:46:40AM +0100, Ondřej Bílka wrote:
> > > > exp <- digit '*' digit EOF -> {(lambda (x y) (* x y))}
> > > > This looks as step backwards
> > > > How it is different from old yacc
> > > > exp <- digit '*' digit EOF -> {$$ = (* $1 $3)}
> > > > 
> > > > My approach is different. I made tree structure explicit in rule by 
> > > > binding as
> > > > in example
> > > > 
> > > > tree = '(' number:value tree?:left tree?:rigth ')' result(Tree)
> > > > 
> > > > result(Tree) creates Tree object and sets value,left,rigth fields to 
> > > > corresponding values
> > > > 
> > > > On Fri, Dec 10, 2010 at 01:47:59PM +1300, Peter Cashin wrote:
> > > > >    Mathias:
> > > > >    A good plan to separate out the:
> > > > >    - grammar operators
> > > > >    - parse tree building
> > > > >    - parser action code
> > > > >    I agree with you, but in my case I want to have the grammar 
> > > > > language
> > > > >    totally independent of the implementation programming language.
> > > > >    So the grammar has no parser action code, although this can be 
> > > > > added later
> > > > >    in a particular programming language implementation. Parboiled has 
> > > > > much
> > > > >    closer integration with the programming language, which has 
> > > > > advantages,
> > > > >    but I really want grammar specifications that are totally 
> > > > > independent.
> > > > >    So now for the grammar operators, and parse tree building 
> > > > > annotations. For
> > > > >    many years I kept these separate: tree building annotations in the 
> > > > > rule
> > > > >    head (ie they annotate the rule name) or definition syntax (= for 
> > > > > interior
> > > > >    nodes, : for leaf nodes or terminals). The right hand side of the 
> > > > > rule had
> > > > >    the grammar expression body withs grammar operations only. One 
> > > > > more head
> > > > >    notation allows pruning the tree.
> > > > >    This works fine, and maybe I should have stuck to that separation, 
> > > > > but I
> > > > >    introduced the `x prefix notation because I found that it was an 
> > > > > advantage
> > > > >    to be able to see the children nodes that would be generated by 
> > > > > looking at
> > > > >    the parent rule alone, and not having to refer to the children rule
> > > > >    definitions to see how they are annotated.
> > > > >    I have found that about half of many grammars turn out to be leaf 
> > > > > nodes,
> > > > >    so the ":" rule definition notation has a huge payoff in tree size 
> > > > > and
> > > > >    performance.
> > > > >    Because of this the `x rule is not vital, but over time I found it 
> > > > > was
> > > > >    convenient, and a good trade-off for me.
> > > > >    Cheers,
> > > > >    Peter.
> > > > > 
> > > > >    On Fri, Dec 10, 2010 at 11:44 AM, Mathias 
> > > > > <[1][email protected]>
> > > > >    wrote:
> > > > > 
> > > > >      Over the course of developing parboiled (PEG parsing engine for 
> > > > > Java and
> > > > >      Scala using internals DSLs for grammar specification) I found 
> > > > > that it's
> > > > >      best to clearly separate the following three things:
> > > > >      - grammar operators
> > > > >      - parse tree building
> > > > >      - parser action code
> > > > > 
> > > > >      The operators, together with the rules they are applied to, 
> > > > > define the
> > > > >      language your grammar recognizes. Nothing more, nothing less.
> > > > >      Whether your engine builds a parse tree for input of that 
> > > > > language or
> > > > >      whether it doesn't should likely not be specified at the 
> > > > > operator level,
> > > > >      but it some other way.
> > > > >      In "parboiled for Java" operators and rules are modeled as 
> > > > > method calls
> > > > >      whereas parse tree building is controlled via annotations on 
> > > > > those
> > > > >      methods. You can selectively enable or disable the creation of 
> > > > > parse
> > > > >      tree nodes per grammar rule, which allows you to tweak parse tree
> > > > >      building exactly to your needs.
> > > > > 
> > > > >      The interface between the parsing engine and custom parser 
> > > > > action code
> > > > >      is the thing that has changed the most over the course of 
> > > > > parboileds
> > > > >      life cycle so far.
> > > > >      Initially parser action code had to access the parse tree nodes 
> > > > > in order
> > > > >      to get to matched input. Creating a custom object structure 
> > > > > during the
> > > > >      parsing run (e.g. an AST) was done by decorating the parse tree. 
> > > > > So the
> > > > >      parse tree was the "work bench" of the parsing process.
> > > > >      Over time this heavy centering around the parse tree turned out 
> > > > > to have
> > > > >      two problems:
> > > > > 
> > > > >      1. Low Performance.
> > > > >      Always having to create a parse tree even when all you care 
> > > > > about is the
> > > > >      AST is just wasteful. Especially since the parse tree can 
> > > > > contain a huge
> > > > >      number of nodes for larger inputs (sometimes more nodes than 
> > > > > input
> > > > >      characters).
> > > > > 
> > > > >      2. Less room for automatic optimizations.
> > > > >      The structure of the parse tree is dictated by the structure of 
> > > > > the
> > > > >      grammar rules. If your parser action code is built under the 
> > > > > assumption
> > > > >      that the parse tree has a given structure there is little leeway 
> > > > > for the
> > > > >      parsing engine to apply automatic rule optimizations. Decoupling 
> > > > > the
> > > > >      parser action code from the parse tree opens up the possibility 
> > > > > to apply
> > > > >      all kinds of automatic grammar tweaking before running the 
> > > > > parser the
> > > > >      first time. The engine might decide to completely change the rule
> > > > >      structure, as long as all changes do not change the recognized 
> > > > > language
> > > > >      and are transparent to the parser action code.
> > > > > 
> > > > >      Currently parboiled implements the interface between the parsing 
> > > > > engine
> > > > >      and the action code in the following way:
> > > > >      1. Actions can appear anywhere in a rule.
> > > > >      2. Actions can access the matched input text of the sub rule 
> > > > > immediately
> > > > >      preceding the action but not of any other rule (so there is no 
> > > > > sub rule
> > > > >      labeling required).
> > > > >      3. For working with custom objects (e.g. AST nodes) the engine 
> > > > > provides
> > > > >      a "Value Stack", which is a simple stack structure that serves 
> > > > > as a fast
> > > > >      work bench for a parsing run. Actions can push objects onto this 
> > > > > stack,
> > > > >      pop them off, swap them around, and so on.
> > > > > 
> > > > >      This solution completely decouples the parse tree from 
> > > > > everything else.
> > > > >      You can enable or disable parse tree building without any effect 
> > > > > on the
> > > > >      rest of the parser. There is no need for addressing sub rules in 
> > > > > action
> > > > >      expressions and given a somewhat efficient value stack 
> > > > > implementation
> > > > >      the whole thing is quite fast. Additionally, in "parboiled for 
> > > > > Scala",
> > > > >      the action code with its manipulations of the value stack can be
> > > > >      statically type-checked at compile time, which is a huge plus.
> > > > > 
> > > > >      In case you are interested in more details or broader 
> > > > > explanations, the
> > > > >      parboiled documentation is quite complete.
> > > > > 
> > > > >      Cheers,
> > > > >      Mathias
> > > > > 
> > > > >      ---
> > > > >      [2][email protected]
> > > > >      [3]http://www.parboiled.org
> > > > >      On 09.12.2010, at 21:01, Alan Post wrote:
> > > > > 
> > > > >      > I'm working on my PEG parser, in particular the interface 
> > > > > between
> > > > >      > the parse tree and the code one can attach to productions that
> > > > >      > are executed on a successful parse.
> > > > >      >
> > > > >      > I've arranged for the two predicate operations, & and !, to 
> > > > > not add
> > > > >      > any output to the parse tree. �That means that the following
> > > > >      > production:
> > > > >      >
> > > > >      > �rule <- &a !b "c"
> > > > >      >
> > > > >      > Produces the same parse tree as:
> > > > >      >
> > > > >      > �rule <- "c"
> > > > >      >
> > > > >      > Internally, this means that I recognize that the sequence 
> > > > > operator
> > > > >      > (which contains the productions '&a', '!b', and '"c"' in this
> > > > >      > example) is being called with predicates in every position but 
> > > > > one,
> > > > >      > and rather than returning a list containing that single 
> > > > > element,
> > > > >      > I return just the single element.
> > > > >      >
> > > > >      > As I've been doing this, I've found that I want a new operator 
> > > > > similar
> > > > >      > to '&'. �'&' matches the production it is attached to, but it 
> > > > > does not
> > > > >      > advance the position of the input buffer.
> > > > >      >
> > > > >      > I'd like an operator that matches the production it is 
> > > > > attached to,
> > > > >      > advances the input buffer, but doesn't add anything to the 
> > > > > parse
> > > > >      > tree.
> > > > >      >
> > > > >      > Here's an example:
> > > > >      >
> > > > >      > �mulexp <- digit '*' digit EOF -> {(lambda (x y) (* x y))}
> > > > >      >
> > > > >      > the mulexp production is a sequence of four other rules, but 
> > > > > only
> > > > >      > two of them are needed by the associated code. �It would be 
> > > > > nice
> > > > >      > if I could write the code rule like it is above, rather than 
> > > > > say
> > > > >      > this:
> > > > >      >
> > > > >      > �(lambda (x op y EOF) (* x y))
> > > > >      >
> > > > >      > Having to account for all the rules in the sequence, but really
> > > > >      > only caring about two of them. �Here is the example rewritten
> > > > >      > with '^' expressing "match the rule, advance the input, but 
> > > > > don't
> > > > >      > modify the parse tree":
> > > > >      >
> > > > >      > �mulexp <- digit ^'*' digit ^EOF -> {(lambda (x y) (* x y))}
> > > > >      >
> > > > >      > Before I go inventing syntax for this use case, will you tell 
> > > > > me if
> > > > >      > this is already being done with other parsers? �Have any of 
> > > > > you had
> > > > >      > this problem and already solved it, and if so, what approach 
> > > > > did you
> > > > >      > take?
> > > > >      >
> > > > >      > -Alan
> > > > >      > --
> > > > >      > .i ko djuno fi le do sevzi
> > > > >      >
> > > > >      > _______________________________________________
> > > > >      > PEG mailing list
> > > > >      > [4][email protected]
> > > > >      > [5]https://lists.csail.mit.edu/mailman/listinfo/peg
> > > > > 
> > > > >      _______________________________________________
> > > > >      PEG mailing list
> > > > >      [6][email protected]
> > > > >      [7]https://lists.csail.mit.edu/mailman/listinfo/peg
> > > > > 
> > > > > References
> > > > > 
> > > > >    Visible links
> > > > >    1. mailto:[email protected]
> > > > >    2. mailto:[email protected]
> > > > >    3. http://www.parboiled.org/
> > > > >    4. mailto:[email protected]
> > > > >    5. https://lists.csail.mit.edu/mailman/listinfo/peg
> > > > >    6. mailto:[email protected]
> > > > >    7. https://lists.csail.mit.edu/mailman/listinfo/peg
> > > > 
> > > > > _______________________________________________
> > > > > PEG mailing list
> > > > > [email protected]
> > > > > https://lists.csail.mit.edu/mailman/listinfo/peg
> > > > 
> > > > 
> > > > -- 
> > > > 
> > > > Plasma conduit breach
> > > > 
> > > > _______________________________________________
> > > > PEG mailing list
> > > > [email protected]
> > > > https://lists.csail.mit.edu/mailman/listinfo/peg
> > > 
> > > -- 
> > > .i ko djuno fi le do sevzi
> > 
> > -- 
> > 
> > Internet exceeded Luser level, please wait until a luser logs off before 
> > attempting to log back on.
> > 
> > _______________________________________________
> > PEG mailing list
> > [email protected]
> > https://lists.csail.mit.edu/mailman/listinfo/peg
> 
> -- 
> .i ko djuno fi le do sevzi

-- 

emissions from GSM-phones

_______________________________________________
PEG mailing list
[email protected]
https://lists.csail.mit.edu/mailman/listinfo/peg

Re: [PEG] and-predicate with slightly different behavior?

Reply via email to