On 3.): The parser just recognizes syntax and ignores semantic ambiguities. A first tree-walker pass then constructs the symbol tables; a second tree-walker pass uses the symbol table information to resolve ambiguities (deciding which of several interpretations of valid syntax is correct) and rewrites the tree into an unambiguous form. At that point you have finished recognizing C++ and can continue with further processing. Never, never, never would I suggest adding a feedback mechanism that couples the parser to the lexer.
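A minimal sketch of the multi-pass idea above, in Python rather than ANTLR tree grammars. Everything in it is an assumption made for illustration: the node classes (Typedef, Ambiguous, PointerDecl, Multiply) and the two pass functions are hypothetical, the "tree" is just a flat list of statements, and the symbol table is a flat set of type names that ignores C++ scoping and declaration order. It only shows the shape of the approach for the classic "T * x;" case, which is a pointer declaration if T names a type and a multiplication otherwise.

```python
from dataclasses import dataclass

# Hypothetical, heavily simplified tree nodes; this is not ANTLR's tree API.

@dataclass
class Typedef:
    name: str        # "typedef int T;" introduces the type name T

@dataclass
class Ambiguous:
    left: str        # a statement of the form "left * right;"
    right: str       # that the syntax-only parser could not classify

@dataclass
class PointerDecl:
    type_name: str   # "T * x;" read as: declare x as pointer to T
    var_name: str

@dataclass
class Multiply:
    lhs: str         # "a * b;" read as: expression statement a times b
    rhs: str


def collect_type_names(tree):
    """Pass 1: walk the tree and build a (flat, scope-free) symbol table."""
    return {node.name for node in tree if isinstance(node, Typedef)}


def resolve(tree, type_names):
    """Pass 2: rewrite every ambiguous node into its single valid reading."""
    resolved = []
    for node in tree:
        if isinstance(node, Ambiguous):
            if node.left in type_names:
                node = PointerDecl(node.left, node.right)
            else:
                node = Multiply(node.left, node.right)
        resolved.append(node)
    return resolved


# "typedef int T;  T * x;  a * b;" as the syntax-only parser might leave it.
tree = [Typedef("T"), Ambiguous("T", "x"), Ambiguous("a", "b")]
unambiguous = resolve(tree, collect_type_names(tree))
```

The parser never consults the symbol table here; all context-sensitive decisions are made in the second pass, so nothing feeds back into the lexer.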
--Loring

----- Original Message ----
> From: The Researcher <researcher0...@gmail.com>
> To: antlr-interest@antlr.org
> Sent: Mon, July 11, 2011 12:55:05 PM
> Subject: Re: [antlr-interest] Can one identify the type of parser needed
> for a given BNF grammar
>
> Loring,
>
> Thanks for the feedback.
>
> While I can now understand your answer, I still don't have enough
> experience to implement any approach correctly.
>
> As a help to others who are working on this in private or are interested
> in creating a C++ parser, I will gloss what Loring said, because last week
> it would have mostly been Greek to me. If I say something wrong, please
> correct me. I am not afraid to make mistakes in public so that others may
> learn from them.
>
> The big thing that one has to learn about parsing C++ is how to handle the
> ambiguities. That is the dragon to be slain here.
>
> 1. And then you have to figure out how to prune the GLR-generated
> "forests".
>
> GLR http://en.wikipedia.org/wiki/GLR_parser
>
> The reason GLR is considered for C++ is that, due to the ambiguities of
> C++, most parsers typically have to feed information from the parser back
> to the lexer and consult a symbol table in order to parse C++ correctly.
> This is contrary to what is taught in school and to most of the early
> examples of parsing you will come across. It allows them to produce an AST
> without ambiguities straight out of the parser. This is pushing the work
> to the front. GLR is one of the most powerful parsing strategies and can
> parse C++ without coupling the parser to the lexer, but it does so by
> generating multiple subtrees for each ambiguity, so you must later prune
> the ambiguous alternatives out of the forests. This is pushing the work to
> the back.
>
> 2. C++ is nasty; it can be parsed with ANTLR (as shown by NEXT and David
> Wigg's adaptations of that grammar).
>
> The ANTLR C++ grammar does so by creating a symbol table and using
> predicates during the parse. In my initial conversions of the ANTLR C
> grammar to C# I was able to convert it, but did not understand at first
> why it did certain things. After some research it became clear that many
> of the ambiguities of C++ arise when a name is encountered and its use
> (a type, an identifier, etc.) has to be known in order to take the
> appropriate branch. Hence the creation of the symbol tables and other
> tables and the use of predicates. This pushes the work to the front.
>
> 3. I believe that the right strategy with ANTLR is actually to use
> multi-pass recognition to sort out the ambiguities. That has not been done
> yet.
> The problem is that C++ cannot be adequately described with a context-free
> grammar; you have to do some context-sensitive processing to resolve the
> syntax that is semantically ambiguous.
>
> It could/should be possible to parse C++ with ANTLR without a symbol table
> and associated predicates, and then, in the AST analysis, verify that the
> input C++ syntax is valid C++ semantics.
> Remember that the BNF for a grammar does not guarantee valid
> code/semantics, only correct syntax. So the AST is analyzed to find
> invalid semantics. I am not sure whether this will require coupling the
> parser to the lexer or the AST to the parser, but if I go this route I
> will soon know.
>
> I know this went off topic, but it was worth it.
>
> Any answer to the original question is still appreciated.
>
> Thanks
>
> Eric
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe:
> http://www.antlr.org/mailman/options/antlr-interest/your-email-address