Those rules are ambiguous, so your lexer is broken. The sequence 'c' cannot be both a TERMINAL1 and a TERMINAL2, as this grammar is context free. Hence ANTLR happens to decide that TERMINAL2 is what you want. The lexer has to return the same sequence of characters as the same token type every time. It is not driven by the parser. The lexer runs and produces all the tokens, then the parser runs.
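To make the overlap concrete, here is a minimal sketch in ANTLR 3 syntax of the kind of rules under discussion. The grammar name and the body of rule2 are placeholders, not taken from the original grammar.

```antlr
grammar Ambiguous;              // hypothetical name

a     : TERMINAL1 rule2 ;
rule2 : 'x' ;                   // placeholder body, not from the original

// Both lexer rules can match the single character 'c'.
// The lexer tokenizes the input without consulting the parser,
// so 'c' comes out as one fixed token type whichever parser rule is active.
TERMINAL1 : TERMINAL2 | 'b' ;
TERMINAL2 : 'c' ;
```

One way to remove the overlap, assuming TERMINAL2 is never needed as a token on its own, is to mark it as a fragment so that it can only be used from inside other lexer rules:

```antlr
// 'b' and 'c' both reach the parser as TERMINAL1.
TERMINAL1          : TERMINAL2 | 'b' ;
fragment TERMINAL2 : 'c' ;
```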
So, you can only have:

TERMINAL : 'b' | 'c';

Or your TERMINAL2 should be a fragment rule, in which case you will always get TERMINAL1. If a particular sequence of characters means something, then you should process this with semantics (either during the parse or, better yet, after you have produced an AST that you can walk and analyze).

Jim

From: [email protected] [mailto:[email protected]]
Sent: Thursday, August 05, 2010 2:40 AM
To: Jim Idle; [email protected]
Cc: [email protected]
Subject: RE: [antlr-interest] missing tokens and strange behaviour regarding some chars

Thanks both Jim and Kevin

Kevin, I tried to use more LEXER expressions, but the problem when parsing was that the TOKEN code that the LEXER sends is different from that of the more general rule, because they are not fragments but full lexer rules, so it was not working. And yes, it is giving me a really hard time.

Jim, I am doing something similar to what you suggested. But I found the main error was in how I was mixing some TOKENS inside other LEXER rules and not only fragments, so the codes that were being sent were not the ones that I thought would work, because they were more general. The two problems that I had are now solved, and I am extending the grammar and continuing to test it.

Example

a: TERMINAL1 rule2
TERMINAL1: TERMINAL2 | 'b'
TERMINAL2: 'c'

If I tried to send

c rule2

I thought that it was going to work correctly, but it does not because, as I discovered while debugging (I don't know if this is a general case), it finds that 'c' is a TERMINAL2 TOKEN and so it doesn't match the rule a.

Is this assumption correct in general? Maybe it has worked for me until now, but I could run into another problem when extending, and I want to build a robust compiler.
Thanks for everything
Nieves

"Jim Idle" <[email protected]>
03/08/2010 18:18
To <[email protected]>, <[email protected]>
cc
Subject RE: [antlr-interest] missing tokens and strange behaviour regarding some chars

Your expression is still defined in an LALR manner, hence it will get confused. You need to define it as a cascading set of rules with higher precedence towards the bottom of the nest. That probably does not make a lot of sense to you as words, so the best thing to do is to read through the grammar for, say, Java or C and look at the expression rules. Then basically copy them and adapt them to your own operators.

Jim

> -----Original Message-----
> From: [email protected] [mailto:antlr-interest-[email protected]] On Behalf Of [email protected]
> Sent: Tuesday, August 03, 2010 12:37 AM
> To: [email protected]
> Subject: [antlr-interest] missing tokens and strange behaviour regarding some chars
>
> Hello to everyone!
>
> I am new to ANTLR but not to compilers. Before I explain the problem, I'll try to explain a little of the background.
>
> I am trying to design, for a custom language, first a syntax highlighter and second a module that can store the information in a DB (so in essence I would be creating a compiler with SQL queries as its output).
> My input language is defined in EBNF, thus it has left-recursion and ambiguity. In order to solve this, I have changed it a little to avoid those problems, and mostly I have managed it without using predicates or backtracking.
>
> Working with ANTLRWorks, I am debugging the grammar with different examples (just the parser) before adding the highlighting code in the StringTemplate, but I get these strange errors, mostly regarding NoViableAltException.
>
> One problem, for example, is trying to define negative expressions with the simple_factor rule.
> So when I debug expressions like 500 or +500 in the simple_factor, I don't get an error.
> But if I try -500, I get the NoViableAltException. Also, if I change - for another symbol like @, it also works when I try @500. I have traced all the alternatives in simple_factor, but in none of them can the first symbol be a negative symbol.
> And I am lost as to why this can happen. I am attaching the whole grammar because it is too big to paste.
>
> Another problem that appears is that sometimes tokens are missed when reading. For example, if I have an input beginning with 'initiate and confirm', ANTLR reads 'conf' and loses the first characters. This happens with the same grammar that I have posted. One example of this problem is the input 'initiate and confirm sys_stop of SCOE_1553 of LLCS of EGSE of System of ODB' with the rule initiate_and_confirm_step_statement.
>
> Thanks in advance for any input
>
> Nieves Salor Moral
>
> addition_operator: ADDITION_OPERATOR
>     ;
>
> ADDITION_OPERATOR
>     : '+'|'-'
>     ;
>
> UNSIGNED_INTEGER
>     : DIGIT+
>     ;
>
> simple_factor
>     : addition_operator simple_factor
>     | NEGATION_BOOLEAN_OPERATOR simple_factor
>     | constant
>     | '(' expression ')'
>     | function
>     | object_property_request
>     | OBJECT_TYPE partial_path
>     | 'ask user' '(' expression ('default' expression)? ')' ('expect' predefined_type)?
>     ;
>
> constant: BOOLEAN_CONSTANT
>     | UNSIGNED_INTEGER ( numeric_constant | RELATIVE_TIME_CONSTANT)
>     | RELATIVE_TIME_CONSTANT
>     | string_constant
>     | HEXADECIMAL_CONSTANT
>     ;
>
> real_constant
>     : ('.' UNSIGNED_INTEGER)? ('e' addition_operator? UNSIGNED_INTEGER)?
>     ;
>
> numeric_constant
>     : real_constant engineering_units?
>     ;
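
The "cascading set of rules" from Jim's earlier reply can be sketched roughly as below, in ANTLR 3 syntax with the Java target. This is a cut-down illustration rather than a fix for the posted grammar: the grammar name, the term rule, MULTIPLICATION_OPERATOR, DIGIT, WS and the reduced simple_factor are all assumptions, and the thread does not establish what actually caused the -500 failure.

```antlr
grammar Cascade;                // hypothetical name

// Lowest precedence at the top, highest at the bottom of the nest.
expression
    : term (ADDITION_OPERATOR term)*
    ;

term
    : simple_factor (MULTIPLICATION_OPERATOR simple_factor)*
    ;

// A unary sign binds tighter than the binary operators above.
simple_factor
    : ADDITION_OPERATOR simple_factor
    | UNSIGNED_INTEGER
    | '(' expression ')'
    ;

ADDITION_OPERATOR       : '+' | '-' ;
MULTIPLICATION_OPERATOR : '*' | '/' ;       // assumed, not in the excerpt
UNSIGNED_INTEGER        : DIGIT+ ;
fragment DIGIT          : '0'..'9' ;        // assumed definition
WS                      : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```

With these rules, an input such as -500 reaches the parser as ADDITION_OPERATOR followed by UNSIGNED_INTEGER and is matched by the first alternative of simple_factor. Each precedence level only refers to the level below it, which is the pattern used by the expression rules in the published Java and C grammars.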
