[Updated] What I see when I use the generated lexer and parser (generated from the LinearMath grammar below) in a Java application is that they do emit some kind of warning about two things:
1) extraneous input '<some_token>' expecting EOF
   (only when I append the EOF token at the end of the rule)
2) required (...)+ loop did not match anything at input '<some_token>'
   (only when I use the '+' quantifier)

where <some_token> is the actual offending token. The warnings are in fact just strings sent to standard error. The question is, again: how do I handle those errors so that they interrupt the normal flow as a real exception that I can catch and treat as one?

That is all for now. Sorry for the flood of emails! Thanks in advance.
Víctor

On 02/02/2011 11:22 p.m., Victor Giordano wrote:
> OK. So adding an EOF forces the parser to go to the end of the input
> in search of other tokens in the correct order.
>
> 1) But I still have a problem. Consider the following grammar:
>
> grammar LinearMath;
>
> tokens
> {
>     PLUS  = '+';
>     MINUS = '-';
>     MUL   = '*';
>     DIV   = '/';
> }
>
> inecuation: linexpr ((RELATIONSHIP) linexpr)+ EOF!;
> catch [UnwantedTokenException ute]
> {
>     System.out.println ("inecuation UnwantedTokenException " + ute.toString());
>     throw ute;
> }
>
> linexpr : (MINUS|PLUS)? linterm ((PLUS|MINUS) linterm)* EOF;
>
> linterm : factor? ID;
>
> expr returns [double value]
>     : e=term {$value = $e.value;}
>       ( PLUS e=term {$value += $e.value;}
>       | MINUS e=term {$value -= $e.value;}
>       )*;
>
> term returns [double value]
>     : f=factor {$value = $f.value;}
>       ( MUL f=factor {$value *= $f.value;}
>       | DIV f=factor {$value /= $f.value;}
>       )*;
>
> factor returns [double value]
>     : DOUBLE {$value = Double.parseDouble($DOUBLE.text);}
>     | '(' e=expr ')'{$value = $e.value;};
>
> ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
>
> DOUBLE
>     : ('0'..'9')+
>     | ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
>     | '.' ('0'..'9')+ EXPONENT?
>     | ('0'..'9')+ EXPONENT
>     ;
>
> fragment EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
>
> NEWLINE:'\r'? '\n' { $channel = HIDDEN; };
>
> WS : (' '|'\t'|'\n'|'\r')+ { $channel = HIDDEN; };
>
> RELATIONSHIP : '<'|'<='|'='|'>'|'>=';
>
> and the following input: "x< y x"
> That is not a valid inequality, because "y x" needs a binary
> arithmetic operator (PLUS or MINUS) between the two terms. The parser
> does its job very well: it consumes the "x", then "<", then "y", and
> when it reaches the second "x" it emits an "UnwantedTokenException".
> The thing is, I am not able to catch it and display an error to the
> end user. Note that I am using the "inecuation" rule to parse that
> input.
>
> Hope someone can help me with this again.
>
> 2) The other thing is about invalid tokens. I managed to handle them
> by overriding a lexer member function called nextToken(), like this:
>
> @lexer::members
> {
>     @Override
>     public Token nextToken()
>     {
>         while (true) {
>             state.token = null;
>             state.channel = Token.DEFAULT_CHANNEL;
>             state.tokenStartCharIndex = input.index();
>             state.tokenStartCharPositionInLine = input.getCharPositionInLine();
>             state.tokenStartLine = input.getLine();
>             state.text = null;
>             if ( input.LA(1)==CharStream.EOF ) {
>                 return Token.EOF_TOKEN;
>             }
>             try {
>                 mTokens();
>                 if ( state.token==null ) {
>                     emit();
>                 }
>                 else if ( state.token==Token.SKIP_TOKEN ) {
>                     continue;
>                 }
>                 return state.token;
>             }
>             catch (RecognitionException re) {
>                 reportError(re);
>                 throw new RuntimeException("Invalid Character : " + (char) (re.c));
>                 // or throw Error
>             }
>         }
>     }
> }
>
> Is that the correct way?
>
> Well, that is all!
> Thanks in advance!
> Victor
>
> On 02/02/2011 05:32 p.m., John B. Brodie wrote:
>> Your grammar does not mention the EOF token. (more below...)
>> On Wed, 2011-02-02 at 16:18 -0300, Victor Giordano wrote:
>>> Hi there. I am having trouble with the error handling.
>>> I have a grammar for recognizing linear expressions, and it works great!
>>> The grammar for a linear expression is the following:
>>>
>>> tokens
>>> {
>>>     PLUS  = '+';
>>>     MINUS = '-';
>>>     MUL   = '*';
>>>     DIV   = '/';
>>> }
>>>
>>> linexpr : (MINUS|PLUS)? linterm ((PLUS|MINUS) linterm)*;
>>> linterm : factor? ID;
>>>
>>> expr returns [double value]
>>>     : e=term {$value = $e.value;}
>>>       ( PLUS e=term {$value += $e.value;}
>>>       | MINUS e=term {$value -= $e.value;}
>>>       )*;
>>>
>>> term returns [double value]
>>>     : f=factor {$value = $f.value;}
>>>       ( MUL f=factor {$value *= $f.value;}
>>>       | DIV f=factor {$value /= $f.value;}
>>>       )*;
>>>
>>> factor returns [double value]
>>>     : DOUBLE {$value = Double.parseDouble($DOUBLE.text);}
>>>     | '(' e=expr ')'{$value = $e.value;};
>>>
>>> ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
>>>
>>> DOUBLE
>>>     : ('0'..'9')+
>>>     | ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
>>>     | '.' ('0'..'9')+ EXPONENT?
>>>     | ('0'..'9')+ EXPONENT
>>>     ;
>>>
>>> fragment EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
>>>
>>> NEWLINE:'\r'? '\n' { $channel = HIDDEN; };
>>>
>>> WS : (' '|'\t'|'\n'|'\r')+ { $channel = HIDDEN; };
>>>
>>> But the problem occurs when, for example, I have:
>>> "x x x"
>>>
>>> Then the parser stops after processing the first "x".
>>> How do I correctly emit an invalid syntax error?
>>> I tried catching EarlyExitException, but it doesn't work.
>>> I want to catch this inside my Java application and show it to the
>>> end user.
>>> Something like this...
>>> // line is equal to the user input...
>>>
>>> CharStream cs = new ANTLRStringStream(line);
>>> LinearExpressionLexer lexer = new LinearExpressionLexer(cs);
>>> CommonTokenStream tokens = new CommonTokenStream(lexer);
>>> LinearExpressionParser parser = new LinearExpressionParser(tokens);
>>> res = parser.linexpr(); // here it is supposed to fail, but it doesn't.
>>>
>>> Actually, linexpr returns data whose type is a custom class called
>>> LinearExpresion. I omitted the return from the linexpr parser rule
>>> to simplify things.
>>>
>>> Hope someone can help me.
>>> Greetings, and thanks in advance.
>>
>> Greetings!
>>
>> By design, ANTLR parsers stop after consuming the longest possible
>> VALID input sequence. I believe the rationale for this is that any
>> remaining input will be available for some other tool to process.
>>
>> If you want ANTLR to try to process the entire input, reporting and
>> recovering from syntax errors in the input, you must tell it to do
>> that.
>>
>> Referring to the EOF token (a special built-in token) in your
>> top-most rule will cause ANTLR to consume the entire input string.
>> That is, the parse will not have a valid input until the EOF is seen,
>> and so it will consume all of the input sentence.
>>
>> I suggest adding a top-level rule similar to:
>>
>> start : linexpr EOF! ;
>>
>> and then calling parser.start() instead of parser.linexpr() in your
>> driver.
>>
>> (Note: the ! meta-character after the EOF token above will keep the
>> EOF out of any AST produced, but you do not seem to be building an
>> AST, so it won't make any difference...)
>>
>> Hope this helps...
>> -jbb
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe:
> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
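
John's point about EOF can be illustrated outside ANTLR. What follows is a minimal hand-written Java sketch, not ANTLR-generated code: the class LinExprSketch, its SyntaxError exception, and the methods parsePrefix and parseAll are hypothetical names, and linterm is simplified to an optional number followed by an ID. Under those assumptions it shows that a rule without EOF silently accepts the longest valid prefix of "x x x", while a start rule that demands end of input can turn the leftover token into a real exception the application can catch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hand-written stand-in for the thread's linexpr rule (hypothetical, not ANTLR output). */
final class LinExprSketch {

    /** Hypothetical unchecked exception used to surface syntax errors to the caller. */
    static final class SyntaxError extends RuntimeException {
        SyntaxError(String message) { super(message); }
    }

    // Simplified token set: ID, unsigned number, PLUS, MINUS.
    private static final Pattern TOKEN =
        Pattern.compile("[A-Za-z_][A-Za-z0-9_]*|[0-9]+(?:\\.[0-9]*)?|[+-]");

    private final List<String> tokens;
    private int pos = 0;

    private LinExprSketch(String input) { this.tokens = tokenize(input); }

    /** Splits the input into tokens, skipping whitespace and throwing on invalid characters. */
    static List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int i = 0;
        while (i < input.length()) {
            if (Character.isWhitespace(input.charAt(i))) { i++; continue; }
            m.region(i, input.length());
            if (!m.lookingAt()) {
                throw new SyntaxError("invalid character '" + input.charAt(i) + "' at index " + i);
            }
            out.add(m.group());
            i = m.end();
        }
        return out;
    }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : null; }

    private static boolean isSign(String t) { return "+".equals(t) || "-".equals(t); }

    // linexpr : (MINUS|PLUS)? linterm ((PLUS|MINUS) linterm)* ;
    private void linexpr() {
        if (isSign(peek())) pos++;
        linterm();
        while (isSign(peek())) { pos++; linterm(); }
    }

    // linterm : NUMBER? ID ;   (simplification of "factor? ID")
    private void linterm() {
        String t = peek();
        if (t != null && Character.isDigit(t.charAt(0))) { pos++; t = peek(); }
        if (t == null || Character.isDigit(t.charAt(0)) || isSign(t)) {
            throw new SyntaxError("expected ID but found " + (t == null ? "EOF" : "'" + t + "'"));
        }
        pos++;
    }

    /** Like calling the rule without EOF: returns how many tokens the valid prefix consumed. */
    static int parsePrefix(String input) {
        LinExprSketch p = new LinExprSketch(input);
        p.linexpr();
        return p.pos;
    }

    /** Like "start : linexpr EOF": any leftover token becomes an exception. */
    static void parseAll(String input) {
        LinExprSketch p = new LinExprSketch(input);
        p.linexpr();
        if (p.peek() != null) {
            throw new SyntaxError("extraneous input '" + p.peek() + "' expecting EOF");
        }
    }

    public static void main(String[] args) {
        System.out.println(parsePrefix("x x x")); // prints 1: stops after the first "x"
        try {
            parseAll("x x x");
        } catch (SyntaxError e) {
            System.out.println(e.getMessage()); // prints: extraneous input 'x' expecting EOF
        }
    }
}
```

The sketch throws where ANTLR 3 would normally print a message and try to recover. With ANTLR 3 itself, a commonly suggested way to get the same effect is to override reportError(RecognitionException) in an @members block so that it wraps the error in an unchecked exception instead of printing it; that is an assumption about the runtime in use and is not confirmed anywhere in this thread.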