Re: [sqlite] Lemon: Conflicts with repeated TERMINALS
Joe Wilson <[EMAIL PROTECTED]> wrote:

>The following grammar may be clearer to you:

Yes, it is, many thanks! I believe I am making progress! At least I can see the picture much more clearly now, and I was able to come up with the following grammar with just one conflict unresolved:

  %left NEWLINE.                   /* Do these matter here at all? */
  %nonassoc TEXT LINK.
  %left HEADING_START.
  %left HEADING_END.

  article ::= blocks.

  blocks ::= block.                /* EOF */
  blocks ::= blocks NEWLINE.       /* EOF */
  blocks ::= blocks NEWLINE NEWLINE block.

  block ::= .                      /* EOF */
  block ::= paragraph.
  block ::= heading.

  heading ::= HEADING_START text HEADING_END.

  paragraph ::= line.
  paragraph ::= paragraph NEWLINE line.

  line ::= text.

  text ::= textpiece.
  text ::= text textpiece.

  textpiece ::= TEXT.
  textpiece ::= LINK.

I of course appreciate any comments ;-)

My idea is that

* A block can be either a paragraph or a heading. Multiple blocks are separated by two NEWLINEs.

* A paragraph is made up of n >= 1 lines. Each line within a paragraph ends with a single NEWLINE. Two NEWLINEs start a new block (see above).

* A line consists of text, which can be TEXT or LINK.

Not everything works well with the grammar, and unfortunately I do not understand why. Given this input, for example:

  TEXT, NEWLINE

the parser gets stuck at

  paragraph ::= paragraph NEWLINE line.

instead of falling back to the line above

  paragraph ::= line.

to find the conditions of a paragraph fulfilled. Why does it not try the other alternatives? Or are there none in the grammar?

>Try reading some papers on parsing or search for the book
>"Compilers: Principles, Techniques, and Tools" (a.k.a.
>the dragon book).

I certainly will.

>Also try writing on paper random sequences of tokens and
>manually parse your grammar to see the conflicts firsthand.

As I throw different token sequences at my experimental parser, I am slowly starting to make sense of the debugger output.

Ralf

- To unsubscribe, send email to [EMAIL PROTECTED] -
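On the question of why the parser does not fall back to "paragraph ::= line.": Lemon generates LALR(1) parsers, which pick one action per lookahead token and never backtrack. The following is a minimal hand-written simulation, not Lemon output, of a cut-down version of the grammar above. It is a sketch under assumptions: headings, LINK and the empty block are left out, a line is a single TEXT, and the function name `parse` and its `on_conflict` switch are hypothetical. It shows what happens at the one conflicting point, where `paragraph` is on the stack and the next token is NEWLINE. Lemon resolves a shift-reduce conflict that precedence does not settle in favor of the shift, which corresponds to `on_conflict="shift"` here.

```python
# Cut-down grammar simulated below (an assumption made for illustration):
#   blocks    ::= paragraph.
#   blocks    ::= blocks NEWLINE.
#   paragraph ::= TEXT.
#   paragraph ::= paragraph NEWLINE TEXT.

def parse(tokens, on_conflict):
    """Hand-simulate a shift-reduce parse with one token of lookahead.

    on_conflict is "shift" or "reduce": the action taken when `paragraph`
    is on the stack and the lookahead is NEWLINE. The parser commits to
    that action and never revisits it.
    """
    stack = []
    rest = list(tokens) + ["$"]  # "$" marks the end of input
    while True:
        look = rest[0]
        if stack[-3:] == ["paragraph", "NEWLINE", "TEXT"]:
            stack[-3:] = ["paragraph"]        # paragraph ::= paragraph NEWLINE TEXT.
        elif stack == ["TEXT"]:
            stack[:] = ["paragraph"]          # paragraph ::= TEXT.
        elif stack[-2:] == ["blocks", "NEWLINE"]:
            stack[-2:] = ["blocks"]           # blocks ::= blocks NEWLINE.
        elif stack == ["paragraph"] and look == "NEWLINE":
            # The conflict: continue the paragraph, or close it?
            if on_conflict == "shift":
                stack.append(rest.pop(0))     # commit to "another line follows"
            else:
                stack[:] = ["blocks"]         # commit to "paragraph is finished"
        elif stack == ["paragraph"] and look == "$":
            stack[:] = ["blocks"]             # blocks ::= paragraph.
        elif stack == ["blocks"] and look == "$":
            return "accept"
        elif stack == [] and look == "TEXT":
            stack.append(rest.pop(0))         # shift first TEXT
        elif stack == ["paragraph", "NEWLINE"] and look == "TEXT":
            stack.append(rest.pop(0))         # shift TEXT of the next line
        elif stack == ["blocks"] and look == "NEWLINE":
            stack.append(rest.pop(0))         # shift trailing NEWLINE
        else:
            return f"syntax error at {look}"


print(parse(["TEXT", "NEWLINE"], "shift"))           # syntax error at $
print(parse(["TEXT", "NEWLINE"], "reduce"))          # accept
print(parse(["TEXT", "NEWLINE", "TEXT"], "shift"))   # accept
print(parse(["TEXT", "NEWLINE", "TEXT"], "reduce"))  # syntax error at TEXT
```

Neither fixed choice handles both inputs, because the right action depends on the token that follows the NEWLINE, which is one token further than the parser can see. A grammar in which the NEWLINE belongs to the paragraph rule itself (paragraph ::= text NEWLINE.), as in Joe's suggestion, appears to avoid this decision altogether.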
Re: [sqlite] Lemon: Conflicts with repeated TERMINALS
--- Ralf Junker <[EMAIL PROTECTED]> wrote:

> > paragraph ::= PARA text.
>
> I observed the new PARA terminal token (the clear separator!?). Unfortunately
> the lexer does not generate such a token. Paragraph repeats are also removed.

It was just an HTML-like example. I just wanted to demonstrate one possible way to remove the conflicts by adding a special tag. I'm not suggesting that you alter your grammar in this way.

> >Here's another:
> >
> >   article ::= blocks.
> >
> >   blocks ::= block.
> >   blocks ::= blocks block.
> >
> >   block ::= heading NEWLINE.
> >   block ::= paragraph NEWLINE.
> >
> >   heading ::= HEADING_START text HEADING_END.
> >   heading ::= HEADING_START text.
> >   heading ::= HEADING_START.
> >
> >   paragraph ::= text.
> >
> >   text ::= textpiece.
> >   text ::= text textpiece.
> >
> >   textpiece ::= TEXT.
> >   textpiece ::= LINK.
>
> This one also removes paragraph repeats, doesn't it? Unfortunately paragraphs
> need to repeat for my grammar. Is there a way to achieve this without conflicts?

In your original grammar, you could have random sequences of TEXT and LINK and NEWLINE tokens without any way of differentiating whether they were part of a "text" or "paragraph" or "heading", hence the conflict.

So I figured that a paragraph may as well be any combination of TEXT and LINK tokens ending with NEWLINE. The headings in my 2nd grammar also have to end with NEWLINE.

"paragraph" will not repeat, per se, but you can repeat "block" (see the "blocks" rules), where you can have several consecutive "block"s that happen to be of type paragraph, so you can achieve the same effect.

The following grammar may be clearer to you:

  article ::= blocks.

  blocks ::= block.
  blocks ::= blocks block.

  block ::= heading.
  block ::= paragraph.

  heading ::= HEADING_START text HEADING_END.
  heading ::= HEADING_START text NEWLINE.
  heading ::= HEADING_START NEWLINE.

  paragraph ::= NEWLINE.
  paragraph ::= text NEWLINE.

  text ::= textpiece.
  text ::= text textpiece.

  textpiece ::= TEXT.
  textpiece ::= LINK.

It's much the same as the previous grammar, but puts the NEWLINEs as part of the paragraph and heading rules instead of in the block rule. The difference is that heading no longer needs to end with NEWLINE in the one case where HEADING_END is encountered, as it is not ambiguous:

  heading ::= HEADING_START text HEADING_END.

and a paragraph in this grammar may also be empty, so you can parse consecutive NEWLINE tokens with this rule:

  paragraph ::= NEWLINE.

Again, this was just an example of how to disambiguate the grammar. You can find other ways.

> >Lemon generates an .out file for the .y file processed.
> >You can examine it for errors.
>
> I have tried to make sense of the .out file before. It tells me where to look
> for the problem, but not how to fix it ...

Try reading some papers on parsing or search for the book "Compilers: Principles, Techniques, and Tools" (a.k.a. the dragon book).

Also try writing on paper random sequences of tokens and manually parse your grammar to see the conflicts firsthand.
Re: [sqlite] Lemon: Conflicts with repeated TERMINALS
Many thanks, Joe,

>Your grammar is ambiguous. The text tokens run together for
>various rules because the grammar lacks clear separators between
>them.

OK, I begin to understand. The "clear separators" need to be TERMINALs, right? I believed that these were implicit because there are TEXT and LINK after all text tokens are fully expanded. Therefore I thought that the grammar would not be ambiguous.

>You can fix it a million ways by altering your grammar.

Thanks for the suggestions - I can see that they do not generate conflicts, but they certainly alter the grammar.

>Here is one way:
>
>  article ::= blocks.
>
>  blocks ::= block.
>  blocks ::= blocks block.
>
>  block ::= heading.
>  block ::= paragraph.
>
>  heading ::= HEADING_START text HEADING_END.
>  heading ::= HEADING_START text.
>  heading ::= HEADING_START.
>
>  paragraph ::= PARA text.
>
>  text ::= textpiece.
>  text ::= text textpiece.
>
>  textpiece ::= TEXT.
>  textpiece ::= LINK.

I observed the new PARA terminal token (the clear separator!?). Unfortunately the lexer does not generate such a token. Paragraph repeats are also removed.

>Here's another:
>
>  article ::= blocks.
>
>  blocks ::= block.
>  blocks ::= blocks block.
>
>  block ::= heading NEWLINE.
>  block ::= paragraph NEWLINE.
>
>  heading ::= HEADING_START text HEADING_END.
>  heading ::= HEADING_START text.
>  heading ::= HEADING_START.
>
>  paragraph ::= text.
>
>  text ::= textpiece.
>  text ::= text textpiece.
>
>  textpiece ::= TEXT.
>  textpiece ::= LINK.

This one also removes paragraph repeats, doesn't it? Unfortunately paragraphs need to repeat for my grammar. Is there a way to achieve this without conflicts?

>Lemon generates an .out file for the .y file processed.
>You can examine it for errors.

I have tried to make sense of the .out file before. It tells me where to look for the problem, but not how to fix it ...

I am sorry to appear stupid, but I still cannot make sense of it all. Can someone still help, please?

Ralf
Re: [sqlite] Lemon: Conflicts with repeated TERMINALS
--- Ralf Junker <[EMAIL PROTECTED]> wrote:

> article ::= blocks.
>
> blocks ::= block.
> blocks ::= blocks block.
>
> block ::= heading.
> block ::= paragraph.
>
> heading ::= HEADING_START text HEADING_END.
> heading ::= HEADING_START text.
> heading ::= HEADING_START.
>
> paragraph ::= text NEWLINE.
> paragraph ::= paragraph text NEWLINE.
> paragraph ::= text.
> paragraph ::= paragraph text.
>
> text ::= textpiece.
> text ::= text textpiece.
>
> textpiece ::= TEXT.
> textpiece ::= LINK.

Your grammar is ambiguous. The text tokens run together for various rules because the grammar lacks clear separators between them.

You can fix it a million ways by altering your grammar. Here is one way:

  article ::= blocks.

  blocks ::= block.
  blocks ::= blocks block.

  block ::= heading.
  block ::= paragraph.

  heading ::= HEADING_START text HEADING_END.
  heading ::= HEADING_START text.
  heading ::= HEADING_START.

  paragraph ::= PARA text.

  text ::= textpiece.
  text ::= text textpiece.

  textpiece ::= TEXT.
  textpiece ::= LINK.

Here's another:

  article ::= blocks.

  blocks ::= block.
  blocks ::= blocks block.

  block ::= heading NEWLINE.
  block ::= paragraph NEWLINE.

  heading ::= HEADING_START text HEADING_END.
  heading ::= HEADING_START text.
  heading ::= HEADING_START.

  paragraph ::= text.

  text ::= textpiece.
  text ::= text textpiece.

  textpiece ::= TEXT.
  textpiece ::= LINK.

Lemon generates an .out file for the .y file processed. You can examine it for errors.
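The ambiguity described in this reply can be checked by brute force. The sketch below is not part of Lemon; it is a small, hypothetical parse-tree counter (the names `count_parses`, `ORIGINAL` and `NEWLINE_BLOCKS` are made up for illustration) that counts how many distinct parse trees a grammar allows for a token sequence. It assumes a grammar without empty rules, which holds for the quoted grammar and for the second suggested one. More than one tree for the same input means the grammar is ambiguous, and a parser generator has to report conflicts for it.

```python
from functools import lru_cache

# Ralf's original grammar, as quoted at the top of this reply.
ORIGINAL = {
    "article": [["blocks"]],
    "blocks": [["block"], ["blocks", "block"]],
    "block": [["heading"], ["paragraph"]],
    "heading": [["HEADING_START", "text", "HEADING_END"],
                ["HEADING_START", "text"],
                ["HEADING_START"]],
    "paragraph": [["text", "NEWLINE"], ["paragraph", "text", "NEWLINE"],
                  ["text"], ["paragraph", "text"]],
    "text": [["textpiece"], ["text", "textpiece"]],
    "textpiece": [["TEXT"], ["LINK"]],
}

# The second suggested grammar: every block ends with NEWLINE.
NEWLINE_BLOCKS = dict(ORIGINAL)
NEWLINE_BLOCKS["block"] = [["heading", "NEWLINE"], ["paragraph", "NEWLINE"]]
NEWLINE_BLOCKS["paragraph"] = [["text"]]


def count_parses(grammar, tokens, start="article"):
    """Count the parse trees of `tokens` under `grammar` (no empty rules)."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def sym(symbol, i, j):
        # Number of trees in which `symbol` derives tokens[i:j].
        if symbol not in grammar:  # terminal: must match exactly one token
            return 1 if j - i == 1 and tokens[i] == symbol else 0
        return sum(seq(tuple(rhs), i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def seq(rhs, i, j):
        # Number of ways the symbol sequence `rhs` derives tokens[i:j].
        if len(rhs) == 1:
            return sym(rhs[0], i, j)
        total = 0
        # The first symbol takes tokens[i:k]; every remaining symbol
        # must still get at least one token.
        for k in range(i + 1, j - len(rhs) + 2):
            left = sym(rhs[0], i, k)
            if left:
                total += left * seq(rhs[1:], k, j)
        return total

    return sym(start, 0, len(tokens))


print(count_parses(ORIGINAL, ["TEXT", "TEXT"]))                   # 3
print(count_parses(ORIGINAL, ["TEXT", "TEXT", "NEWLINE"]))        # 3
print(count_parses(NEWLINE_BLOCKS, ["TEXT", "TEXT", "NEWLINE"]))  # 1
```

Under the original grammar, two TEXT tokens can be one text inside one paragraph, one paragraph built from two texts, or two separate paragraph blocks. With NEWLINE as a mandatory block terminator only the first reading remains.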
[sqlite] Lemon: Conflicts with repeated TERMINALS
I am trying to write a Wiki parser with Lemon. Lemon's features suit my needs perfectly, but I am unfortunately stuck with the problem of parsing conflicts. All conflicts seem to be caused by repeat constructs like this:

  text ::= textpiece.
  text ::= text textpiece.

The complete grammar follows below and results in 10 conflicts.

I have read the manual, looked at tutorials, and searched the mailing list, but nothing helped me to reduce the number of conflicts. Changing the token order even tends to cause more of them.

Reading similar grammars for Bison makes me wonder why Bison apparently has no problems with them but Lemon does. Am I doing something wrong, or is this simply not possible with Lemon?

Ralf

---

article ::= blocks.

blocks ::= block.
blocks ::= blocks block.

block ::= heading.
block ::= paragraph.

heading ::= HEADING_START text HEADING_END.
heading ::= HEADING_START text.
heading ::= HEADING_START.

paragraph ::= text NEWLINE.
paragraph ::= paragraph text NEWLINE.
paragraph ::= text.
paragraph ::= paragraph text.

text ::= textpiece.
text ::= text textpiece.

textpiece ::= TEXT.
textpiece ::= LINK.