> From: Terrence Brannon [mailto:[EMAIL PROTECTED]] Sent: 2002/03/17 11:43 > On Thursday, March 14, 2002, at 11:58 AM, Orton, Yves wrote: > > BTW: This is basically how the red-dragon suggest one should handle > > symbols, (with a perlish flavour) > > > > Would someone mind mentioning exactly what LL(1) , LL(N) and > LR(1) and LR(N) mean > and which of these P::RD is?
Ok. Yesterdays response to this question was not what it sould have been. :( Sorry. Especially considering the following line from the docs:Parse::RecDescent incrementally generates top-down recursive-descent text parsers from simple yacc-like grammar specifications. It provides: So here goes a second time. First of all the terms LL and LR apply to both grammers and parsers. -LL parsers means what I said it means, Left to right Leftmost derivatation. LL grammers are special grammers where LL(k) grammer means that an unambiguous production may be selected from the list given _only_ K tokens of lookahead. (This implies that left recursion is forbidden) -LR parsers are Left to Right right most derivation in reverse. An LR(K) grammer is a grammer where we can recognize the right side of a production having seen all of what is derived from that right side with K tokens of lookahead. LR grammars can describe more languages than LL grammars can. A brief general taxonomy of parsers NOTE: Be aware that there exists a parser that may parse any unambiguous context free grammer (CFG) in N**3 time . However most computer languages grammers can be parsed with much less. Unversal Parsers Cocke-Younger-Kasami, Earleys algorithm. Can parse any algorithm Are generally slow and are not usually needed for real world application Top Down Parsers Recursive Descent Parsers P::RD is one of these. (Caveats below) According to the dragon rarely used due to speed issues. P::RD thus may be the most commonly used recursive decent parser. Fairly easy to write compared to non backtracking methods. Parse LL grammers Build a grammer tree and then walk the tree. Predictive Parser Special subset of recursive descent with no backtracking. Non backtracking. Parse subset of LL grammers. (LL(1)) Uses either tabular or tree based grammer representations. Bottom Up Parsers LR Parsers Parses a large subset of CFG's (LR grammers) Most gerneral nonbacktracking shift-reduce parsing method known yet effecient as well. LR parsers can parse anything a predictive parser can parse and more LR parsers detect errors as soon as it is possible in a left to right scan Difficult to write by hand as state transition tables are used and must be generated. 3 common forms :SLR LALR LR. (least powerful and cheapest to most powerful and most expensive) YACC produces an LALR compiler. Are often very fast. Operator Precedence Parsers Can only handle a subset of CFGs (known as Operator grammers) A grammer is an operator grammer IFF it has no production right side with two adjacent nonterminals. Easy to write by hand. Difficult to prove correct, only apply to a limited subset of grammers has problem with grammers where an operator has different precedence (eg unary and binary -) Often combined with recursive descent techniques where needed. P::RD is a topdown (i was confused on this issue yesterday) LL parser. It cannot handle any form of left recursion. P::RD is actually an interesting beast that doesnt fit into the catagories and definitions that ive read very well. For instance it does not use a tokenizer (it IS a tokenizer!) It could be used for predictive parsing using dynamic rules. It includes tree pruning abilities like <commit> it uses pragmas to make avoiding certain problems easy (<leftop:> <rightop:>). In fact in general following my review I can see (some reasons) why Damian did the things he did and why he hasnt done the things he hasnt. In all P::RD is a very perlish dynamic recursive descent parser. Some thoughts: Since P::RD can only handle LL grammers it would seem that a tutorial on eliminating left recursion would be useful. As would an automatic way to eliminate that recursion (there is a known algorithm for doing this) just as Damian say in the docs. Argh: here it is How to eliminate Left Recursion:(extracted from Red Dragon page 47) Consider the left recursive production A : A x | y (x and y are any production or terminal that _do_not_ begin with A eg "expr: expr '+' term | term") We can remove the left recursion by converting it to A : y R R : x R | e (e is the empty production) So the left recursive expr: expr '+' term | term Becomes the nonleft recursive expr : term term_tail term_tail : '+' term | {1} (Is there a better way than {1} to match e?) Converting more complicated examples is left as an exercise for the reader. :-) Hope this wasnt a complete waste of time....