[il-antlr-interest: 35034] [antlr-interest] Use of the range operator inside parser rules
Hi all,

I seem to remember that the `..` operator inside parser rules matches token ranges. For example:

    grammar T;

    parse
     :  A..C
     ;

    A : 'a';
    X : 'x';
    C : 'c';
    D : 'd';

The `parse` rule would match the tokens A, X or C. (Needless to say, I'd never use it in this way, but I thought it was possible.)

However, when trying to generate a lexer and parser from the grammar with ANTLR 3.1, ANTLR 3.2 and ANTLR 3.3, I get the following error:

    error(100): T.g:4:6: syntax error: antlr: T.g:4:6: unexpected token: A

And using ANTLR 3.4, the following:

    error(10): internal error: T.g : java.util.NoSuchElementException: can't look backwards more than one token in this stream

My question(s):

- Was the `..` inside parser rules ever supported to match token ranges? If so, when was this dropped? (I already searched the 3.x release notes, but couldn't find anything.)
- Or am I mistaken, and was it never supported?

Regards,

Bart.

List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address

--
You received this message because you are subscribed to the Google Groups il-antlr-interest group.
To post to this group, send email to il-antlr-inter...@googlegroups.com.
To unsubscribe from this group, send email to il-antlr-interest+unsubscr...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/il-antlr-interest?hl=en.
[il-antlr-interest: 35027] Re: [antlr-interest] Possible bug with backtrack-generated predicate methods
Hi Franck,

On Sat, Nov 26, 2011 at 8:54 AM, franck102 franck...@yahoo.com wrote:

> The grammar below won't compile, this looks like a bug to me?
> ...

No bug: syntactic predicates and rule parameters can't be mixed. You can use rule scopes instead:

    grammar Test;

    options {
      output=AST;
      backtrack=true;
      ASTLabelType=CommonTree;
    }

    program
    scope { String x; }
    @init { $program::x = null; }
      : 'raw'? (ID {$program::x=$ID.text;} -> ID) (rule -> rule)*
      | 'raw' ID
      ;

    rule
      : 'some' ID {System.out.println("called from: " + $program::x);}
      ;

    ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
    WS : (' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;};

If you now parse "raw A some B", then "called from: A" will be printed to your console.

Also, you're trying to pass the tree of `program` as a parameter, but that tree hasn't been constructed yet, AFAIK (and will therefore be `null`). That's why my example shows how to use rule scopes with a simple string.

Regards,

Bart.
[il-antlr-interest: 35029] Re: [antlr-interest] Possible bug with backtrack-generated predicate methods
On Sat, Nov 26, 2011 at 9:58 AM, franck102 franck...@yahoo.com wrote:

> In fact the tree has been constructed by the leading (ID->ID),

That tree only exists inside your parentheses, AFAIK. You can't reference it outside them (well, you can, but it will be `null`).

> So here is exactly what I am trying to do, there is probably a better way than what I have (untested pseudo-grammar, but you should get the idea):
>
>     expr
>       : ( prefix -> prefix )
>         ( suffix -> /* insert prefix as first child of suffix and return suffix */ )*
>
>     prefix
>       : ID;
>
>     suffix
>       : DOT ID       -> ^( DOT ID )
>       | '[' expr ']' -> ^( INDEX expr )
>
> I guess I could use a scope to pass down the prefix; or have suffix return both the root type and a flat list and build the tree in expr; but both seem painful to get right typing wise...

Yes, it's a pain compared to simply passing the tree as a parameter, but that's the penalty for turning on global backtracking [1].

Bart.

[1] http://www.antlr.org/wiki/display/ANTLR3/How+to+remove+global+backtracking+from+your+grammar
[il-antlr-interest: 35021] Re: [antlr-interest] Matching compound keywords in the lexer
Hi Franck,

On Fri, Nov 25, 2011 at 9:47 PM, franck102 franck...@yahoo.com wrote:

> ...
>     containOperator : CONTAINS_TEXT | CONTAINS_MATCH
>
>     CONTAINS_TEXT : 'contains' WS+ ( 'match' { $type=CONTAINS_MATCH } | 'text' ) ;
>     // CONTAINS_MATCH:; // causes "token definitions can never be matched" error

Add CONTAINS_MATCH to your `tokens { ... }` section and create an empty fragment rule called CONTAINS_MATCH to silence the warning:

    tokens {
      CONTAINS_MATCH;
    }

    ...

    CONTAINS_TEXT
      : 'contains' WS+ ( 'match' CONTAINS_MATCH | 'text' )
      ;

    ...

    fragment CONTAINS_MATCH : ;

Regards,

Bart.
[il-antlr-interest: 35022] Re: [antlr-interest] Matching compound keywords in the lexer
On Fri, Nov 25, 2011 at 9:54 PM, Bart Kiers bki...@gmail.com wrote:

> ...
>     tokens {
>       CONTAINS_MATCH;
>     }
>     ...
>     CONTAINS_TEXT
>       : 'contains' WS+ ( 'match' CONTAINS_MATCH | 'text' )
>       ;
>     ...
>     fragment CONTAINS_MATCH : ;

Sorry, that snippet should've looked like:

    tokens {
      CONTAINS_MATCH;
    }

    ...

    CONTAINS_TEXT
      : 'contains' WS+ ( 'match' {$type=CONTAINS_MATCH;} | 'text' )
      ;

    ...

    fragment CONTAINS_MATCH : ;
[il-antlr-interest: 35004] Re: [antlr-interest] Eliminate characters in TOKEN
Hi Rampon,

On Wed, Nov 23, 2011 at 10:54 AM, Rampon Jerome ramponjer...@yahoo.fr wrote:

> ...
> it complained on output option to be AST.
> If I add it in my grammar options it complains and still returns an error.
> It seems it automatically adds it if not there, but later on still returns an error???
> Is that normal?

Yes, the `!` to exclude characters from lexer rules (as was possible in v2) is no longer valid in v3 grammars.

> Any simple way to bypass rather than a later replaceAll?
> I would prefer to keep it target independent.

No, that's not possible.

Regards,

Bart.
[il-antlr-interest: 35009] Re: [antlr-interest] Lexer error reporting
Hi Bill,

On Wed, Nov 23, 2011 at 5:41 PM, Bill Andersen bill.ander...@mac.com wrote:

> Hi Folks...
>
> Been trying to figure out how to shut off default Lexer behavior to print messages to System.err, such as:
>
>     line 2:4 no viable alternative at character ' '
>
> Instead, I'd like to catch these and do something with them. Overriding reportError(RecognitionException) doesn't work and no other option seems obvious.

Both the lexer and parser have a `reportError(...)` method, and my guess is that you did something like this:

    @members {
      @Override
      public void reportError(RecognitionException e) ...
    }

which is a short-hand for:

    @parser::members { // note the `parser::`
      @Override
      public void reportError(RecognitionException e) ...
    }

But since a "no viable alternative at character" error is something that comes from the lexer, you need to explicitly override the lexer method like this:

    @lexer::members {
      @Override
      public void reportError(RecognitionException e) {
        System.out.println("CUSTOM ERROR...");
      }
    }

Regards,

Bart.
[il-antlr-interest: 34962] Re: [antlr-interest] Failure to ignore newline
On Fri, Nov 18, 2011 at 7:55 AM, David Riddle da...@mcgilly.com wrote:

> Hi -
>
> This should be a very simple thing - I'm attempting to have my grammar hide newlines, carriage returns, etc. However, every conceivable form of a grammar that attempts to skip over these things or send them to the hidden channel seems to fail for me. Here's a very basic example:
>
>     grammar Test;
>
>     prog: ID+;
>
>     ID: 'a'..'z'+;
>     WS: '\n'+ {$channel=HIDDEN;};
>
>     // Input:  a \n b
>     // Output: a n b

I'm guessing there's no 0xA (new line char) in your input, but a backslash followed by an 'n'.

Regards,

Bart.
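The distinction Bart is guessing at can be checked outside ANTLR. The following is only a minimal plain-Java sketch (the class name `NewlineVsBackslashN` and its helper are made up for illustration, they are not from the thread): it compares a string holding a real line feed (0x0A) with one holding the two characters backslash and 'n'. Only the first contains something the `WS: '\n'+` rule could match; in the second, the 'n' is an ordinary letter, which would fit the reported output "a n b".

```java
// Hypothetical demo class, not part of the original thread.
class NewlineVsBackslashN {

    // True if the text contains a real line-feed character (0x0A).
    static boolean hasLineFeed(String s) {
        return s.indexOf('\n') >= 0;
    }

    public static void main(String[] args) {
        String realNewline = "a \n b";  // 5 chars: a, space, LF, space, b
        String twoChars = "a \\n b";    // 6 chars: a, space, backslash, n, space, b

        System.out.println(realNewline.length() + " " + hasLineFeed(realNewline));
        System.out.println(twoChars.length() + " " + hasLineFeed(twoChars));
    }
}
```

If the input is typed into a test field or passed on a command line, it is usually the second form that ends up in the stream, so dumping the input's characters like this is a quick way to rule that out.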
[il-antlr-interest: 34964] Re: [antlr-interest] String concatenation expression rule
On Fri, Nov 18, 2011 at 12:39 PM, franck102 franck...@yahoo.com wrote:

> I am writing a grammar for a fairly complex expression language, and in particular I need to support string concatenation which is performed simply by separating string literals with a space; and which automatically converts other expressions to a string if needed to concatenate:
>
>     "a" "b"  -> "ab"
>     2+3 "mm" -> "5mm"
>
> I suspect I could use predicates to write a rule like this:
>
>     concatExpression
>       : ( expression | STRING_LITERAL )+ { apply only if at least one of the elements is a string literal }?
>
> Is there a way to achieve this? The alternative formulations I can think of are pretty messy...

As far as I understand it, you don't need any predicate. I see a concat-expression has a lower precedence than addition, in which case this could do the trick:

    grammar T;

    options {
      output=AST;
    }

    tokens {
      ROOT;
      CONCAT;
    }

    parse
      : (expression ';')* EOF -> ^(ROOT expression*)
      ;

    expression
      : (add -> add) (add+ -> ^(CONCAT add+))?
      ;

    add
      : atom (('+' | '-')^ atom)*
      ;

    atom
      : Number
      | String
      | '(' expression ')' -> expression
      ;

    Number : '0'..'9'+ ('.' '0'..'9'+)?;
    String : '"' ~'"'* '"';
    Space  : ' ' {skip();};

You can test it with the following class:

    import org.antlr.runtime.*;
    import org.antlr.runtime.tree.*;
    import org.antlr.stringtemplate.*;

    public class Main {
      public static void main(String[] args) throws Exception {
        String src = "42 - 2; 2 + 3 \"mm\"; \"a\" \"b\" 4-3-2 \"c\"; \"pi = \" 3.14159;";
        TLexer lexer = new TLexer(new ANTLRStringStream(src));
        TParser parser = new TParser(new CommonTokenStream(lexer));
        CommonTree root = (CommonTree)parser.parse().getTree();
        // print the AST in DOT format
        System.out.println(new DOTTreeGenerator().toDOT(root));
      }
    }

Regards,

Bart.
[il-antlr-interest: 34967] Re: [antlr-interest] Failure to ignore newline
On Fri, Nov 18, 2011 at 4:01 PM, David Riddle da...@mcgilly.com wrote:

> Hi Bart -
>
> Yes, it's a \n, and I thought I told the grammar to set '\n' to a hidden channel. So, why is it not hidden?

Assuming you mean a new line char, then it _is_ being sent to the HIDDEN channel as Norman already mentioned.

Regards,

Bart.
[il-antlr-interest: 34937] Re: [antlr-interest] valid grammar does not compile
On Thu, Nov 17, 2011 at 9:14 AM, D. Frej dieter_f...@gmx.net wrote:

> Hi,
>
> I built the following grammar with ANTLRWorks:
>
>     grammar questionmark;
>
>     horef
>       : '\?' ('a'..'z')
>       ;
>
> ANTLRWorks tells me "check grammar succeeded". However, debugging does not work because the generated code does not compile!?

You should not escape the question mark.

Regards,

Bart.
[il-antlr-interest: 34939] Re: [antlr-interest] valid grammar does not compile
On Thu, Nov 17, 2011 at 10:14 AM, D. Frej dieter_f...@gmx.net wrote:

> and still: the compilation error stays even if I do not quote the question mark

Ah, hold on, you're using the `..` (range) operator inside a parser rule (horef). Either create a lexer rule matching '?' 'a'..'z':

    Horef
      : '?' 'a'..'z'
      ;

or move 'a'..'z' to a lexer rule:

    horef
      : '?' Letter
      ;

    Letter
      : 'a'..'z'
      ;

But still, the question mark should not be escaped.

Regards,

Bart.
[il-antlr-interest: 34916] Re: [antlr-interest] Having trouble with creating a parser for my desired grammar. Running afoul of multiple alternatives warnings
Hi John,

On Tue, Nov 15, 2011 at 11:46 PM, John B. Brodie j...@acm.org wrote:

> Greetings!
> ...
> I do not think you want to recognize floating point values in the parser. any tokens you send to the HIDDEN $channel (or skip();) will be silently accepted before and after the '.' of the float.
>
> change your INTEGER rule to this:

I fully agree...

>     fragment FLOAT: ;
>     INTEGER : DIGIT+ ('.' DIGIT+ {$type=FLOAT;} )? ;
>
> and use FLOAT in the number rule.

... however, Jarrod's grammar allows the input to end with `expression '.'`, which could be "123." (an INTEGER followed by a DOT). The lexer would trip over such input. A possible fix could look like:

    INTEGER
      : DIGIT+ ( {input.LA(1)=='.' && input.LA(2)>='0' && input.LA(2)<='9'}?=> '.' DIGIT+ {$type=FLOAT;} )?
      ;

I.e., only match a '.' if the character after the '.' is a digit.

Regards,

Bart.
[il-antlr-interest: 34927] Re: [antlr-interest] Having trouble with creating a parser for my desired grammar. Running afoul of multiple alternatives warnings
Hi,

On Wed, Nov 16, 2011 at 8:21 PM, Jarrod Roberson jar...@vertigrated.com wrote:

> actually thanks to Bart I need the FLOAT rule as a parser rule with the predicate because I want to be able to match

But John raises a valid point that I didn't mention: by promoting such a rule to a parser rule, you run the risk that the parser matches a `number` rule for the input source:

    123 . 5

(spaces around the '.') because the parser ignores the white spaces.

Regards,

Bart.
[il-antlr-interest: 34928] Re: [antlr-interest] Having trouble with creating a parser for my desired grammar. Running afoul of multiple alternatives warnings
On Wed, Nov 16, 2011 at 8:38 PM, Bart Kiers bki...@gmail.com wrote:

> But John raises a valid point that I didn't mention: by promoting such a rule to a parser rule, you run the risk that the parser matches a `number` rule for the input source:
>
>     123 . 5
>
> (spaces around the '.') because the parser ignores the white spaces.

Or even the input:

    123 /* some comments */ . /* more comments */ 5

would be a valid `number`... :)
[il-antlr-interest: 34930] Re: [antlr-interest] Having trouble with creating a parser for my desired grammar. Running afoul of multiple alternatives warnings
On Wed, Nov 16, 2011 at 8:45 PM, Jarrod Roberson jar...@vertigrated.com wrote:

> > Or even the input:
> >
> >     123 /* some comments */ . /* more comments */ 5
> >
> > would be a valid `number`... :)
>
> Is there a way to support both a - 1. b - 1.1. in a pure lexer rule then, I didn't think there was?

See my earlier reply in this thread: http://antlr.markmail.org/message/wtwq2vbmhedek2cn

    1.     would become: INTEGER, DOT
    1.1.   would become: FLOAT, DOT
    1 . 1  would become: INTEGER SPACE DOT SPACE INTEGER

Regards,

Bart.
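To make the "only match a '.' if the character after it is a digit" idea concrete, here is a small hand-written tokenizer in plain Java. It is a sketch under assumptions, not ANTLR-generated code: the class `NumberDotSketch` and its methods are hypothetical, and tokens are reduced to their type names (INTEGER, FLOAT, DOT, SPACE as used in the reply above). The one-character lookahead in the middle plays the role of the gated predicate.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written illustration only; names are hypothetical.
class NumberDotSketch {

    // True if position i exists and holds an ASCII digit.
    static boolean digitAt(String s, int i) {
        return i < s.length() && s.charAt(i) >= '0' && s.charAt(i) <= '9';
    }

    // Returns the token type names for the given input.
    static List<String> tokenize(String in) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < in.length()) {
            char c = in.charAt(i);
            if (digitAt(in, i)) {
                String type = "INTEGER";
                while (digitAt(in, i)) i++;
                // Lookahead: consume '.' only when a digit follows it.
                if (i < in.length() && in.charAt(i) == '.' && digitAt(in, i + 1)) {
                    i++; // the '.'
                    while (digitAt(in, i)) i++;
                    type = "FLOAT";
                }
                out.add(type);
            } else if (c == '.') {
                out.add("DOT");
                i++;
            } else if (c == ' ') {
                out.add("SPACE");
                i++;
            } else {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("1."));
        System.out.println(tokenize("1.1."));
        System.out.println(tokenize("1 . 1"));
    }
}
```

Because the decision is made on characters, before any whitespace is dropped, "1 . 1" can never be glued into a FLOAT, which is the point John and Bart make about keeping this in the lexer.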
[il-antlr-interest: 34871] Re: [antlr-interest] Using range operator (INT .. INT)
On Sun, Nov 13, 2011 at 6:59 PM, Jiwon Seo seoji...@gmail.com wrote:

> Thanks for the reply!
> I'm trying to do it without extending lexer since I think my definition of FLOAT should not be a problem with the range operator.

But it _is_ a problem if the `..` is preceded by an INT: FLOAT consumes the `('0'..'9')+ '.'` part and then cannot match `('0'..'9')+ EXPONENT?`, resulting in the MismatchedTokenException.

Regards,

Bart.
[il-antlr-interest: 34763] Re: [antlr-interest] Empty ifs in Java
Hi Patrick,

On Sun, Nov 6, 2011 at 1:40 PM, Patrick Zimmermann patr...@zakweb.de wrote:

> ...
> I still think that a scannerless parser might be a better alternative.
> Are there any good reasons against switching (apart from ANTLR being a great tool in general)?

Nope, sorry, I don't have any real experience with PEGs.

Regards,

Bart.
[il-antlr-interest: 34766] Re: [antlr-interest] Empty ifs in Java
You're welcome Patrick!

Regards,

Bart.

On Sun, Nov 6, 2011 at 10:15 PM, Patrick Zimmermann patr...@zakweb.de wrote:

> Hi,
>
> Ok. I think I'll at least have a look what the alternatives would be.
> Thank you a lot for your patient answers again.
>
> Regards,
> Patrick
[il-antlr-interest: 34746] Re: [antlr-interest] Empty ifs in Java
Hi Patrick,

The range operator, `..`, only works in lexer rules, not in parser rules as you're doing. Capitalize the `letter` rule to make it a lexer rule:

    grammar failuretest;

    start:  Letter+;
    Letter: 'A' .. 'B'; // or did you mean 'A'..'Z'?

and then try again.

Regards,

Bart.

On Sat, Nov 5, 2011 at 1:41 PM, Patrick Zimmermann patr...@zakweb.de wrote:

> Hello,
>
> I'm new to this mailing list, so hello everyone.
>
> I currently try to create a grammar to parse a textual wiki syntax. I'm using ANTLR 3.4 and the Java generator. I've reached a situation where ANTLR creates empty ifs in the generated Java code and I boiled it down to the following grammar:
>
>     grammar failuretest;
>     start: letter+;
>     letter: 'A' .. 'B';
>
> The resulting Java code contains the following statements:
>
>     ...
>     if ( () ) {
>     ...
>     if ( ) {
>     ...
>
> As far as I know this is no legal Java. And the grammar looks fine to me, apart from that the second rule should probably be a lexer rule. So could this actually be a bug in antlr?
>
> Many thanks in advance,
> Patrick Zimmermann
[il-antlr-interest: 34748] Re: [antlr-interest] Empty ifs in Java
Hi,

On Sat, Nov 5, 2011 at 4:16 PM, Patrick Zimmermann patr...@zakweb.de wrote:

> Hi,
>
> thank you a lot. Using a lexer rule does in fact solve this problem.
> And now I am already on the next, stripped down to:
>
>     start : ('{' 'ab' '}')* '{a}';
>
> using input:
>
>     {ab}{a}
>
> Will not list '{ab' on the input stream in ANTLRWorks and thus fails to parse the input. I suspect this is another "should be done with the lexer" thing.

No, the literals in your parser rule are implicit lexer rules, although it's better to create explicit rules instead of mixing them inside your parser rules:

    ABraced : '{a}';
    OBrace  : '{';
    CBrace  : '}';
    AB      : 'ab';
    A       : 'a';

If the lexer now tries to tokenize the input "{ab", it will see "{a" and expect a "}", but there's a "b" instead, so an error is emitted.

> I'm currently thinking about whether ANTLR is the right tool for my job: In many cases the input I have is character wise context sensitive. I have some areas (the free text area) where '(' and ')' have a specific meaning and others (the note area) where '(' ')' are simply normal text. Or whitespace which is important in the text and to be ignored in tags and similar constructs.
> If I'm not mistaken the lexer runs completely before the parser and constructs tokens. Those tokens are then matched by the parser. So if an input would match several tokens (e.g. text not containing parentheses) and the wrong one is chosen by the lexer the parser is screwed, right?

Yes, the parser has no control over what tokens the lexer produces.

> I currently realize that I am forced to use lexer rules for certain constructs (like `..`) because I need character ranges to define the chars that are allowed (unicode, only certain languages).
> Do you think ANTLR is the right tool for this job and I'm just not seeing the point in how to do it, or should I better use something else? What?

You could let the lexer simply create single-character tokens and create parser rules that match a certain sequence of tokens (like the `ab` rule below):

    start
      : OBrace ab CBrace OBrace A CBrace EOF
      ;

    ab
      : A B
      ;

    OBrace : '{';
    CBrace : '}';
    A      : 'a';
    B      : 'b';

> Thanks so far,
> Patrick

Regards,

Bart.

PS. could you use the list for communication please?
[il-antlr-interest: 34728] Re: [antlr-interest] about range float and stuff
I don't understand what you mean.

Bart.

On Fri, Nov 4, 2011 at 5:33 PM, Jim Idle j...@temporal-wave.com wrote:

> It won't make it more difficult, and the lexer already does what Fabien asks.
>
> Jim
>
> -----Original Message-----
> From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-boun...@antlr.org] On Behalf Of Bart Kiers
> Sent: Friday, November 04, 2011 1:48 AM
> To: Fabien Hermenier
> Cc: antlr-interest@antlr.org
> Subject: Re: [antlr-interest] about range float and stuff
>
> Hi Fabien,
>
> Handling this in the parser will make your life much harder than it has to be. Doing it in the lexer, you will need a bit of custom code, but I'd go for something similar to this (something like it is on the Wiki somewhere, but I can't find it...):
>
>     grammar RangeDemo;
>
>     @lexer::members {
>       // tokens waiting to be handed to the parser
>       java.util.Queue<Token> tokens = new java.util.LinkedList<Token>();
>
>       public void offer(int ttype, String ttext) {
>         emit(new CommonToken(ttype, ttext));
>       }
>
>       @Override
>       public void emit(Token t) {
>         state.token = t;
>         tokens.offer(t);
>       }
>
>       @Override
>       public Token nextToken() {
>         super.nextToken();
>         return tokens.isEmpty() ? Token.EOF_TOKEN : tokens.poll();
>       }
>     }
>
>     parse
>       : (t=. {System.out.printf("\%-10s \%s\n", tokenNames[$t.type], $t.text);})* EOF
>       ;
>
>     FLOAT
>       : INT '..'   {offer(INT, $INT.text); offer(RANGE, "..");}
>       | OCTAL '..' {offer(OCTAL, $OCTAL.text); offer(RANGE, "..");}
>       | '.' DIGITS
>       | DIGITS '.' DIGITS?
>       ;
>
>     RANGE
>       : '..'
>       ;
>
>     INT
>       : '1'..'9' DIGIT*
>       | '0'
>       ;
>
>     OCTAL
>       : '0' ('0'..'7')+
>       ;
>
>     fragment DIGITS : DIGIT+;
>     fragment DIGIT  : '0'..'9';
>
>     SPACE
>       : (' ' | '\t' | '\r' | '\n') {skip();}
>       ;
>
> And if you run the class:
>
>     import org.antlr.runtime.*;
>
>     public class Main {
>       public static void main(String[] args) throws Exception {
>         String src = "..07..8.5 1.9..02 1..3.4";
>         RangeDemoLexer lexer = new RangeDemoLexer(new ANTLRStringStream(src));
>         RangeDemoParser parser = new RangeDemoParser(new CommonTokenStream(lexer));
>         System.out.println("Parsing: '" + src + "'");
>         parser.parse();
>       }
>     }
>
> You'll see the following being printed to the console:
>
>     Parsing: '..07..8.5 1.9..02 1..3.4'
>     RANGE      ..
>     OCTAL      07
>     RANGE      ..
>     FLOAT      8.5
>     FLOAT      1.9
>     RANGE      ..
>     OCTAL      02
>     INT        1
>     RANGE      ..
>     FLOAT      3.4
>
> Regards,
>
> Bart.
>
> On Fri, Nov 4, 2011 at 7:28 AM, Fabien Hermenier hermenierfab...@gmail.com wrote:
>
>> Hi
>>
>> In an earlier version of my language, I had to parse ranges of integers in various bases. Now I want to include floats. I have read
>> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point,+dot,+range,+time+specs
>> but I've still got some questions.
>>
>> All the work seems to be done at the lexer level, so the types of the following tokens will be, for example:
>>
>>     5    : DECIMAL_LITTERAL
>>     07   : OCTAL_LITTERAL
>>     7.5  : FLOATING_POINT_LITTERAL
>>     5..7 : DOTDOT
>>
>> In the last example, the result is not very convenient because I will still have to extract the bounds and compute their type by myself, which seems quite redundant with the job performed by the lexer. Maybe I am missing something?
>>
>> I would rather be able to express the range at the parser level, which seems much more convenient to me:
>>
>>     range: FLOATING_POINT_LITTERAL DOTDOT FLOATING_POINT_LITTERAL.
>>
>> In this way, I will also be able to manage the possible spaces between the bounds and the DOTDOT.
>>
>> So, am I right to try to parse ranges at the parser level? Or is there a solution to easily extract the bounds with their type if I am doing the job at the lexer level?
>>
>> Thanks in advance,
>> Fabien.
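The core of the RangeDemo grammar is that one lexer match may hand back more than one token, buffered in a queue and drained by `nextToken()`. The plain-Java sketch below imitates that idea without the ANTLR runtime. Everything in it is an assumption made for illustration (the class `QueueLexerSketch`, tokens as "TYPE text" strings, `null` standing in for EOF); it only mirrors the token kinds FLOAT, RANGE, INT and OCTAL from the grammar in this thread.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hand-written illustration of the "queue of pending tokens" idea.
// It does not use the ANTLR runtime; all names here are hypothetical.
class QueueLexerSketch {
    private final String in;
    private int pos = 0;
    private final Queue<String> pending = new ArrayDeque<>();

    QueueLexerSketch(String in) { this.in = in; }

    private boolean digitAt(int i) {
        return i < in.length() && in.charAt(i) >= '0' && in.charAt(i) <= '9';
    }

    // "0" and "1".."9" digit* are INT; "0" followed by octal digits is OCTAL.
    private static String intOrOctal(String digits) {
        if (digits.length() > 1 && digits.charAt(0) == '0') {
            if (!digits.matches("0[0-7]+")) {
                throw new IllegalArgumentException("bad octal: " + digits);
            }
            return "OCTAL " + digits;
        }
        return "INT " + digits;
    }

    // One scan step; it may enqueue one token or two.
    private void scan() {
        while (pos < in.length() && in.charAt(pos) == ' ') pos++; // skip spaces
        if (pos >= in.length()) return;
        if (in.startsWith("..", pos)) {
            pending.add("RANGE ..");
            pos += 2;
        } else if (in.charAt(pos) == '.' && digitAt(pos + 1)) {
            int start = pos++;
            while (digitAt(pos)) pos++;
            pending.add("FLOAT " + in.substring(start, pos));
        } else if (digitAt(pos)) {
            int start = pos;
            while (digitAt(pos)) pos++;
            String digits = in.substring(start, pos);
            if (in.startsWith("..", pos)) {
                // digits followed by "..": emit two tokens from one match
                pending.add(intOrOctal(digits));
                pending.add("RANGE ..");
                pos += 2;
            } else if (pos < in.length() && in.charAt(pos) == '.') {
                pos++;
                while (digitAt(pos)) pos++;
                pending.add("FLOAT " + in.substring(start, pos));
            } else {
                pending.add(intOrOctal(digits));
            }
        } else {
            throw new IllegalArgumentException("unexpected character at " + pos);
        }
    }

    // Returns the next token, or null at end of input.
    String nextToken() {
        if (pending.isEmpty()) scan();
        return pending.poll();
    }

    static List<String> tokenize(String src) {
        QueueLexerSketch lexer = new QueueLexerSketch(src);
        List<String> out = new ArrayList<>();
        for (String t = lexer.nextToken(); t != null; t = lexer.nextToken()) out.add(t);
        return out;
    }

    public static void main(String[] args) {
        for (String t : tokenize("..07..8.5 1.9..02 1..3.4")) System.out.println(t);
    }
}
```

The design choice is the same as in the grammar: the ambiguity between "1." as the start of a float and "1" followed by ".." is resolved where the characters are still visible, and the parser only ever sees already-typed bounds.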
[il-antlr-interest: 34729] Re: [antlr-interest] about range float and stuff
For what it's worth, I found the Wiki entry I based my suggestion on: http://www.antlr.org/wiki/pages/viewpage.action?pageId=3604497

Regards, Bart.

On Fri, Nov 4, 2011 at 3:11 PM, Bart Kiers bki...@gmail.com wrote:

You're welcome Fabien, but note that it most likely looks a lot like something I found on the ANTLR Wiki, so I can't claim credit for it (perhaps a small part! :)). I'll have a look later on and see if I can dig up the Wiki page.

Regards, Bart.

On Fri, Nov 4, 2011 at 3:04 PM, Fabien Hermenier hermenierfab...@gmail.com wrote:

Thanks Bart, I think I have understood your approach and indeed, it seems beautiful and simple. I will try your solution during the weekend.

Fabien.

On 04/11/11 02:48, Bart Kiers wrote:

Hi Fabien,

Handling this in the parser will make your life much harder than it has to be. Doing it in the lexer, you will need a bit of custom code, but I'd go for something similar to this (something like it is on the Wiki somewhere, but I can't find it...):

    grammar RangeDemo;

    @lexer::members {
      java.util.Queue<Token> tokens = new java.util.LinkedList<Token>();

      public void offer(int ttype, String ttext) {
        emit(new CommonToken(ttype, ttext));
      }

      @Override
      public void emit(Token t) {
        state.token = t;
        tokens.offer(t);
      }

      @Override
      public Token nextToken() {
        super.nextToken();
        return tokens.isEmpty() ? Token.EOF_TOKEN : tokens.poll();
      }
    }

    parse
      : (t=. {System.out.printf("\%-10s \%s\n", tokenNames[$t.type], $t.text);})* EOF
      ;

    FLOAT
      : INT '..'   {offer(INT, $INT.text); offer(RANGE, "..");}
      | OCTAL '..' {offer(OCTAL, $OCTAL.text); offer(RANGE, "..");}
      | '.' DIGITS
      | DIGITS '.' DIGITS?
      ;

    RANGE : '..' ;
    INT : '1'..'9' DIGIT* | '0' ;
    OCTAL : '0' ('0'..'7')+ ;
    fragment DIGITS : DIGIT+;
    fragment DIGIT : '0'..'9';
    SPACE : (' ' | '\t' | '\r' | '\n') {skip();} ;

And if you run the class:

    import org.antlr.runtime.*;

    public class Main {
      public static void main(String[] args) throws Exception {
        String src = "..07..8.5 1.9..02 1..3.4";
        RangeDemoLexer lexer = new RangeDemoLexer(new ANTLRStringStream(src));
        RangeDemoParser parser = new RangeDemoParser(new CommonTokenStream(lexer));
        System.out.println("Parsing: '" + src + "'");
        parser.parse();
      }
    }

You'll see the following being printed to the console:

    Parsing: '..07..8.5 1.9..02 1..3.4'
    RANGE ..
    OCTAL 07
    RANGE ..
    FLOAT 8.5
    FLOAT 1.9
    RANGE ..
    OCTAL 02
    INT 1
    RANGE ..
    FLOAT 3.4

Regards, Bart.
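The token-queue idea in the grammar above (one lexer match may produce two tokens, which are buffered and handed out one at a time) can be sketched without ANTLR. This is only an illustration: the class `RangeQueueSketch`, its regular expressions and the `{type, text}` string pairs are assumptions made for this sketch, not code generated by ANTLR or taken from the Wiki page.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone model of the token-queue technique; not ANTLR code.
final class RangeQueueSketch {
    // "digits followed by '..'" is tried first, mirroring the first FLOAT alternatives.
    private static final Pattern NUMBER_THEN_RANGE = Pattern.compile("(\\d+)\\.\\.");
    private static final Pattern FLOAT = Pattern.compile("\\.\\d+|\\d+\\.\\d*");
    private static final Pattern RANGE = Pattern.compile("\\.\\.");
    private static final Pattern NUMBER = Pattern.compile("\\d+");
    private static final Pattern SPACE = Pattern.compile("\\s+");

    private final String input;
    private int pos = 0;
    private final Queue<String[]> pending = new ArrayDeque<>();

    RangeQueueSketch(String input) { this.input = input; }

    // Returns the next {type, text} pair, or null at the end of the input.
    String[] nextToken() {
        while (pending.isEmpty() && pos < input.length()) {
            scanOnce();
        }
        return pending.poll();
    }

    // One match may queue zero tokens (whitespace), one token, or two tokens.
    private void scanOnce() {
        Matcher m;
        if ((m = at(NUMBER_THEN_RANGE)) != null) {
            offer(numberType(m.group(1)), m.group(1));
            offer("RANGE", "..");
        } else if ((m = at(FLOAT)) != null) {
            offer("FLOAT", m.group());
        } else if ((m = at(RANGE)) != null) {
            offer("RANGE", m.group());
        } else if ((m = at(NUMBER)) != null) {
            offer(numberType(m.group()), m.group());
        } else if ((m = at(SPACE)) == null) {
            throw new IllegalArgumentException("Unexpected character at " + pos);
        }
        pos = m.end();
    }

    private Matcher at(Pattern p) {
        Matcher m = p.matcher(input);
        m.region(pos, input.length());
        return m.lookingAt() ? m : null;
    }

    private void offer(String type, String text) {
        pending.add(new String[] {type, text});
    }

    // Simplification: a leading zero means OCTAL; octal digits are not validated.
    private static String numberType(String digits) {
        return digits.length() > 1 && digits.charAt(0) == '0' ? "OCTAL" : "INT";
    }

    static List<String> tokenize(String src) {
        RangeQueueSketch lexer = new RangeQueueSketch(src);
        List<String> out = new ArrayList<>();
        for (String[] t = lexer.nextToken(); t != null; t = lexer.nextToken()) {
            out.add(t[0] + " " + t[1]);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : tokenize("..07..8.5 1.9..02 1..3.4")) {
            System.out.println(line);
        }
    }
}
```

The point of the queue is that the caller still pulls exactly one token per `nextToken()` call, even when a single match such as `1..` had to be split into an INT and a RANGE.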
[il-antlr-interest: 34730] Re: [antlr-interest] about range float and stuff
If your (Jim) definition of "without code" means "no @members section", then I find it a bit of an odd definition, since the lexer rules from http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point,+dot,+range,+time+specs are littered with `{ ... }` code blocks: not what I'd call "without code". I much prefer the solution proposed by Terence in http://www.antlr.org/wiki/pages/viewpage.action?pageId=3604497 (which I based my suggestion on): far less verbose than the first option, IMO.

Bart.

On Fri, Nov 4, 2011 at 5:59 PM, Bart Kiers bki...@gmail.com wrote:

The only wiki link posted in this thread is http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point,+dot,+range,+time+specs which contains Java code, so you must mean something else (of which I have no idea)...

Bart.

On Fri, Nov 4, 2011 at 5:47 PM, Jim Idle j...@temporal-wave.com wrote:

The example on the Wiki already does all of this in the lexer, but without any code.

Jim
[il-antlr-interest: 34733] Re: [antlr-interest] about range float and stuff
I only know that Terence's solution solves the OP's problem AFAIK, whereas I am not sure yours does: I simply find it too verbose to fully grasp by only reading through it. Sorry.

Bart.

On Fri, Nov 4, 2011 at 6:18 PM, Jim Idle j...@temporal-wave.com wrote:

You may prefer whatever solution you like of course (though these are not the same solution), but you should be accurate about the other solutions and take the time to read through them.

Jim
[il-antlr-interest: 34734] Re: [antlr-interest] about range float and stuff
And if you really meant that the code on http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point,+dot,+range,+time+specs is "without any code", then I disagree with that definition. Since you didn't comment on that anymore, I presume that _is_ what you were talking about.

Bart.
[il-antlr-interest: 34737] Re: [antlr-interest] about range float and stuff
Jim, this reply is far different from the clipped one-liners you have contributed to this discussion so far. You can call my responses pedantic, but IMO you yourself are a part of the problem: because your answers lack detail, I find it hard to comprehend what you mean. You must see the difference between this last reply of yours and the ones before it, no? Thank you for this last one, btw.

Bart.

On Fri, Nov 4, 2011 at 6:50 PM, Jim Idle j...@temporal-wave.com wrote:

I meant that the code it uses is only for predicates. There are no methods called to do the parse (though I never personally object to that) or emit the tokens. The other code that is there serves as examples of how you might handle errors or range checks and so on. As you said you did not grasp it by reading it, then you clearly cannot win by trying to make pedantic arguments about whether there is any code or not.

Anyway, my original point was that:

a) The OP quoted the example I commented on;
b) He asked it to do something that it already did;
c) The example originally quoted covers all combinations of the use of '.', including 1.method(), range and lots more, which is why it seems verbose.

So, I don't know where you are going with the pedantry, but it is not worth my time to follow it any more.

Jim
[il-antlr-interest: 34674] Re: [antlr-interest] How to Parse a datastream of tokens and values
Hi David,

ANTLR's lexer greedily matches characters: the input "PRCLINTON" is being tokenized as a single VALUE token, not as a PR token followed by a VALUE token.

Regards, Bart.

On Mon, Oct 31, 2011 at 6:24 PM, Weiler-Thiessen, David, SASKATOON, Engineering david.weiler-thies...@purina.nestle.com wrote:

Hi

I am trying to parse a string that is a collection of tokens and values. For example:

PRCLINTON

where PR is my token, and CLINTON is the value for the token. I have started a simple grammar, see below, but it won't parse the sample above.

    message : productionReceipt ;
    productionReceipt : PR VALUE ;
    PR : 'PR' ;
    VALUE : ('a'..'z'|'A'..'Z')+ ;

What am I doing wrong? I get a MismatchedTokenException in ANTLRWorks.

David Weiler-Thiessen
Nestlé Purina PetCare
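To make the greedy behaviour described above concrete, here is a small stand-alone sketch. It is a deliberately simplified model (try every rule at the current position, keep the longest match, prefer the earlier rule on a tie); ANTLR 3 itself generates a prediction DFA rather than running a loop like this, and the class name `GreedyLexSketch` and the regex-based rules are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical simplified model of greedy tokenization; not how ANTLR is implemented.
final class GreedyLexSketch {
    // Rules in definition order, mirroring PR and VALUE from the grammar.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("PR", Pattern.compile("PR"));
        RULES.put("VALUE", Pattern.compile("[a-zA-Z]+"));
    }

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestRule = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // A strictly longer match wins; on a tie the earlier rule is kept.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestRule = rule.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestRule == null) {
                throw new IllegalArgumentException("No rule matches at " + pos);
            }
            tokens.add(bestRule + ":" + input.substring(pos, bestEnd));
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("PRCLINTON")); // VALUE swallows the leading "PR"
        System.out.println(tokenize("PR"));        // same length, so PR is kept
    }
}
```

Under this model the parser never sees a PR token for "PRCLINTON", which is consistent with the mismatched-token error reported in the question.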
[il-antlr-interest: 34676] Re: [antlr-interest] How to Parse a datastream of tokens and values
Sure, you can check, using a gated semantic predicate, that there is no "PR" ahead when matching the VALUE token. Something like this:

    grammar T;

    @lexer::members {
      // Returns true if the upcoming characters in the input equal `text`.
      private boolean ahead(String text) {
        for(int i = 0; i < text.length(); i++) {
          if(text.charAt(i) != input.LA(i + 1)) {
            return false;
          }
        }
        return true;
      }
    }

    message : productionReceipt EOF ;
    productionReceipt : PR VALUE ;
    PR : 'PR';
    VALUE : {!ahead("PR")}?=> ('a'..'z'|'A'..'Z')+ ;

Regards, Bart.

On Mon, Oct 31, 2011 at 10:01 PM, Weiler-Thiessen,David,SASKATOON,Engineering david.weiler-thies...@purina.nestle.com wrote:

Hi

Yes, I can see how that is happening.

So, in my case, because I have token-value pairs, and the values are not terminated by something deterministic, I can't use ANTLR to lex the input stream. Is that correct?

It turns out that the input stream is in a fixed-length format, so it can be parsed in other ways. I was just thinking that this might be a problem space that ANTLR could address as well.

David Weiler-Thiessen
Nestlé Purina PetCare
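The effect of gating VALUE on "no PR ahead" can be modelled with a hand-written scanner. This is a sketch under assumptions: `GatedValueSketch` and its methods are hypothetical names, the lookahead is done on a plain `String` instead of ANTLR's `input.LA(...)`, and in this sketch the gate is checked only where a token starts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written scanner modelling the gated VALUE rule; not ANTLR code.
final class GatedValueSketch {
    // Same idea as ahead(String) in the grammar, on a String at a given offset.
    static boolean ahead(String input, int pos, String text) {
        return input.startsWith(text, pos);
    }

    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (ahead(input, pos, "PR")) {
                // VALUE is switched off here, so PR is the only candidate.
                tokens.add("PR:PR");
                pos += 2;
            } else if (isAsciiLetter(input.charAt(pos))) {
                int start = pos;
                // The gate was checked at the token start only; the loop is plain greedy.
                while (pos < input.length() && isAsciiLetter(input.charAt(pos))) {
                    pos++;
                }
                tokens.add("VALUE:" + input.substring(start, pos));
            } else {
                throw new IllegalArgumentException("Unexpected character at " + pos);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("PRCLINTON"));
    }
}
```

One consequence worth noting in this model: a "PR" that appears inside a value (for example "CLINTONPR") stays part of the VALUE token, because the check is not repeated inside the letter loop.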
[il-antlr-interest: 34614] [antlr-interest] Fwd: Fwd: Rule precedence works differently when using a predicate?
Apologies Jim, forgot to send to the list...

---------- Forwarded message ----------
From: Bart Kiers bki...@gmail.com
Date: Thu, Oct 27, 2011 at 9:21 PM
Subject: Re: [antlr-interest] Fwd: Rule precedence works differently when using a predicate?
To: Jim Idle j...@temporal-wave.com

On Thu, Oct 27, 2011 at 8:54 PM, Jim Idle j...@temporal-wave.com wrote:

> As I said earlier you need more predicates:

Sorry Jim, I did not know you had replied to my message below before.

> But you also need to not use .+, which essentially matches anything anyway once it is triggered.

Err, no, not with a predicate, AFAIK (see the rule ANY_EXEPT_B in my example below, which does not match just anything).

> Try something like this.
>
> fragment KEY : ;
> ANY : {!test()}?=> 'KEY') | ({test()}?=> . ) ;
>
> But once you take out .+ , then it might just work as it was anyway.
>
> Jim

Thanks for your suggestion, but I know how to make it work. My question was more about why, when two rules match the same number of characters, the rule defined later in the grammar is used to create a token. Let me give another example grammar:

    grammar T;

    @parser::members {
      public static void main(String[] args) throws Exception {
        TLexer lexer = new TLexer(new ANTLRStringStream("aaaBaa"));
        TParser parser = new TParser(new CommonTokenStream(lexer));
        parser.parse();
      }
    }

    @lexer::members {
      private boolean noBAhead() {
        return input.LA(1) != 'B';
      }
    }

    parse
      : (t=. {System.out.printf("\%-15s \%s\n", tokenNames[$t.type], $t.text);})+ EOF
      ;

    MANY_A : 'a'+ ;
    B : 'B' ;
    ANY_EXEPT_B : ({noBAhead()}?=> . )+ ;

If you run the TParser class, you will see the following output when parsing "aaaBaa":

    ANY_EXEPT_B aaa
    B B
    ANY_EXEPT_B aa

I.e., although the rule MANY_A also matches both "aaa" and "aa", ANY_EXEPT_B matches them, where I thought the rule defined first (MANY_A) would match them.

Regards, Bart.

-----Original Message-----
From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-boun...@antlr.org] On Behalf Of Bart Kiers
Sent: Thursday, October 27, 2011 10:56 AM
To: antlr-interest@antlr.org interest
Subject: [antlr-interest] Fwd: Rule precedence works differently when using a predicate?

Just a little bump, in case it got buried under some of the newer posts. And in case my previous grammar wasn't entirely clear, the following grammar:

    grammar T;

    @lexer::members {
      private boolean test() {
        return true;
      }
    }

    parse : KEY EOF ;
    KEY : 'key' ;
    ANY : ({test()}?=> . )+ ;

with the test class:

    import org.antlr.runtime.*;

    public class Main {
      public static void main(String[] args) throws Exception {
        TLexer lexer = new TLexer(new ANTLRStringStream("key"));
        TParser parser = new TParser(new CommonTokenStream(lexer));
        parser.parse();
      }
    }

produces the following error:

    line 1:0 mismatched input 'key' expecting KEY

In other words, 'key' is being tokenized as ANY instead of KEY. Is this expected behavior or a bug? And if it's expected behavior, could someone point me to the documentation (book) or wiki link that explains this?

Cheers regards, Bart.

---
From: Bart Kiers bki...@gmail.com
Date: Mon, Oct 24, 2011 at 11:46 AM
Subject: Rule precedence works differently when using a predicate?
To: antlr-interest@antlr.org interest antlr-interest@antlr.org

Hi all,

As I understand it, ANTLR's lexer matches rules from top to bottom in the .g grammar file, and when two rules match the same number of characters, the one that is defined first has precedence over the later one(s). However, take the following grammar:

    grammar T;

    @lexer::members {
      private boolean test() {
        return true;
      }
    }

    parse
      : (t=. {System.out.println(tokenNames[$t.type] + " :: " + $t.text);})* EOF
      ;

    KEY : 'key' ;
    ANY : ({test()}?=> . )+ ;

And the test class:

    import org.antlr.runtime.*;

    public class Main {
      public static void main(String[] args) throws Exception {
        TLexer lexer = new TLexer(new ANTLRStringStream("key"));
        TParser parser = new TParser(new CommonTokenStream(lexer));
        parser.parse();
      }
    }

I'd expected "KEY :: key" to be printed to the console; however, "ANY :: key" is printed instead. So the last rule is matched, while the KEY rule also matches the same input and is defined before ANY. Why?

Kind regards, Bart.
[il-antlr-interest: 34616] Re: [antlr-interest] Fwd: Rule precedence works differently when using a predicate?
Hi Jim, others,

Sorry, but I'd appreciate it if you (or someone else) could answer my question in a bit more detail, because I really don't understand you (Jim). You say `.+` matches forever, but in my example there is a predicate in front of the `.` causing it _not_ to match forever, as you can see for yourself. The input "aaaBaa" is tokenized into 3 different tokens: "aaa", "B" and "aa", and _not_ into one single token by the rule that has the `.+` and the predicate in it. Your last comment suggests to me that you imply that "aaaBaa" will be tokenized as a single token (which, again, is not the case).

My question therefore remains the same: why are "aaa" and "aa" from the input "aaaBaa" being tokenized as ANY_EXEPT_B instead of MANY_A, where MANY_A is defined before ANY_EXEPT_B and MANY_A matches exactly the same number of characters as ANY_EXEPT_B does?

To me, it's as if the input "while" would be matched by the ID rule instead of the WHILE rule in:

    WHILE : 'while';
    ID    : 'a'..'z'+;

(which is not the case, of course!)

Regards,

Bart.

On Thu, Oct 27, 2011 at 10:34 PM, Jim Idle j...@temporal-wave.com wrote:

> .+ matches forever
>
> Jim
>
> From: Bart Kiers [mailto:bki...@gmail.com]
> Sent: Thursday, October 27, 2011 12:22 PM
> To: Jim Idle
> Subject: Re: [antlr-interest] Fwd: Rule precedence works differently when using a predicate?
>
> On Thu, Oct 27, 2011 at 8:54 PM, Jim Idle j...@temporal-wave.com wrote:
>
>> As I said earlier you need more predicates:
>
> Sorry Jim, I did not know you replied to my message below before.
>
>> But you also need to not use .+, which essentially match anything anyway once it is triggered.
>
> Err, no, not with a predicate, AFAIK (see the rule ANY_EXEPT_B in my example below which does not match anything).
>
>> Try something like this.
>>
>>     fragment KEY : ;
>>     ANY : {!test()}?=> 'KEY') | ({test()}?=> . ) ;
>>
>> But once you take out .+ , then it might just work as it was anyway.
>>
>> Jim
>
> Thanks for your suggestion, but I know how to make it work. My question was more about why, when two rules match the same number of characters, the rule defined later in the grammar is used to create a token. Let me give another example grammar:
>
>     grammar T;
>
>     @parser::members {
>       public static void main(String[] args) throws Exception {
>         TLexer lexer = new TLexer(new ANTLRStringStream("aaaBaa"));
>         TParser parser = new TParser(new CommonTokenStream(lexer));
>         parser.parse();
>       }
>     }
>
>     @lexer::members {
>       private boolean noBAhead() {
>         return input.LA(1) != 'B';
>       }
>     }
>
>     parse
>       : (t=. {System.out.printf("\%-15s \%s\n", tokenNames[$t.type], $t.text);})+ EOF
>       ;
>
>     MANY_A      : 'a'+ ;
>     B           : 'B' ;
>     ANY_EXEPT_B : ({noBAhead()}?=> . )+ ;
>
> If you run the TParser class, you will see the following output when parsing "aaaBaa":
>
>     ANY_EXEPT_B     aaa
>     B               B
>     ANY_EXEPT_B     aa
>
> I.e., although the rule MANY_A also matches both "aaa" and "aa", ANY_EXEPT_B matches them, where I thought the rule defined first (MANY_A) would match them.
>
> Regards,
>
> Bart.
>
> -----Original Message-----
> From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-boun...@antlr.org] On Behalf Of Bart Kiers
> Sent: Thursday, October 27, 2011 10:56 AM
> To: antlr-interest@antlr.org interest
> Subject: [antlr-interest] Fwd: Rule precedence works differently when using a predicate?
>
> Just a little bump, in case it got buried under some of the newer posts. And in case my previous grammar wasn't entirely clear, the following grammar:
>
>     grammar T;
>
>     @lexer::members {
>       private boolean test() {
>         return true;
>       }
>     }
>
>     parse : KEY EOF ;
>
>     KEY : 'key' ;
>     ANY : ({test()}?=> . )+ ;
>
> with the test class:
>
>     import org.antlr.runtime.*;
>
>     public class Main {
>       public static void main(String[] args) throws Exception {
>         TLexer lexer = new TLexer(new ANTLRStringStream("key"));
>         TParser parser = new TParser(new CommonTokenStream(lexer));
>         parser.parse();
>       }
>     }
>
> produces the following error:
>
>     line 1:0 mismatched input 'key' expecting KEY
>
> In other words, 'key' is being tokenized as ANY instead of KEY. Is this expected behavior or a bug? And if it's expected behavior, could someone point me to the documentation (book) or wiki-link that explains this?
>
> Cheers regards,
>
> Bart.
[il-antlr-interest: 34532] Re: [antlr-interest] suggestion for the static initializer too big in java
On Mon, Oct 24, 2011 at 10:26 AM, Patrick Ericx patrick.er...@gmail.com wrote:

> ...
> PS: is that mailing list available online? How can I access it? Receiving all these mails in my mailbox becomes a little overhead for me.

Yes, here it is: http://antlr.markmail.org

Regards,

Bart.
[il-antlr-interest: 34533] [antlr-interest] Rule precedence works differently when using a predicate?
Hi all,

As I understand it, ANTLR's lexer matches rules from top to bottom in the .g grammar file, and when two rules match the same number of characters, the one that is defined first has precedence over the later one(s).

However, take the following grammar:

    grammar T;

    @lexer::members {
      private boolean test() {
        return true;
      }
    }

    parse
      : (t=. {System.out.println(tokenNames[$t.type] + " :: " + $t.text);})* EOF
      ;

    KEY : 'key' ;
    ANY : ({test()}?=> . )+ ;

And the test class:

    import org.antlr.runtime.*;

    public class Main {
      public static void main(String[] args) throws Exception {
        TLexer lexer = new TLexer(new ANTLRStringStream("key"));
        TParser parser = new TParser(new CommonTokenStream(lexer));
        parser.parse();
      }
    }

I'd expected "KEY :: key" to be printed to the console, however, "ANY :: key" is printed instead. So the last rule is matched, while the KEY rule also matches the same input and is defined before ANY. Why?

Kind regards,

Bart.
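The expectation described in this post (the longest match wins, and a tie goes to the rule listed first) can be modelled in a few lines of plain Java. This is only a sketch of that expectation, not of how ANTLR 3 actually builds or runs its lexer; the class name `TieBreakDemo`, the method `firstToken`, and the use of `java.util.regex` in place of lexer rules are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TieBreakDemo {
    // Rules in declaration order; LinkedHashMap preserves that order.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("KEY", Pattern.compile("key"));
        RULES.put("ANY", Pattern.compile(".+"));
    }

    // Returns the name of the rule that wins at the start of the input:
    // the longest match wins, and on a tie the earlier rule is kept.
    static String firstToken(String input) {
        String best = null;
        int bestLen = 0;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            // Only a strictly longer match replaces the current winner.
            if (m.lookingAt() && m.end() > bestLen) {
                best = rule.getKey();
                bestLen = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(firstToken("key"));  // tie: the first rule is kept
        System.out.println(firstToken("keys")); // the longer match wins
    }
}
```

Under this model "key" would come out as KEY, which is the behaviour the post expected and did not observe once the predicate was added.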
[il-antlr-interest: 34547] Re: [antlr-interest] False error 208?
On Mon, Oct 24, 2011 at 5:26 PM, Jim Idle j...@temporal-wave.com wrote:

> Oh – I did not see it was a lexer error. Reorder the rules:
>
>     IF  : 'if';
>     ID  : ('A'..'Z'|'a'..'z')+;
>     RAW : ({rawAhead()}?=> . )+;

Yeah, that is what I thought as well, but the above will cause the input "if" to be matched as a RAW-token, where I (and Boon) expected an IF-token...

See my post: http://antlr.markmail.org/message/mkrpypwuccs25bgd

Regards,

Bart.
[il-antlr-interest: 34548] Re: [antlr-interest] False error 208?
On Mon, Oct 24, 2011 at 5:33 PM, Bart Kiers bki...@gmail.com wrote:

> ... where I (and Boon) ...

Sorry, Bood, not Boon...
[il-antlr-interest: 34429] Re: [antlr-interest] Lexer grammar for filtering
Hi Balazs,

Since PC is not a parser rule, you need to account for the space(s) between 'PC_HASH_VALUE' and DIGIT. And since you've set `filter=true`, you don't need a fall-through rule ELSE, AFAIK.

Regards,

Bart.

On Mon, Oct 17, 2011 at 11:15 AM, Balazs Varnai bvar...@gmail.com wrote:

> Hi All,
>
> I have a simple grammar to collapse white-spaces and comments from a C source code input. Also I would like to filter out some variables with a specific name. These have a strict format, so no real C parsing is needed. It works fine, but for example a line "#define PC_HASH_VALUE 1" is not recognized. As far as I remember from previous ANTLR usage, this was working straight away. Any suggestions? Thanks!
>
>     /* [ CODE ] */
>     lexer grammar Collapse;
>
>     options {
>       language = Java;
>       filter = true;
>     }
>
>     @header {
>       package rewriter;
>       import java.util.*;
>       import java.io.*;
>     }
>
>     @members {
>       PrintStream out;
>       public Collapse(CharStream input, PrintStream out) {
>         this(input);
>         this.out = out;
>       }
>     }
>
>     PC : 'PC_HASH_VALUE' text=DIGIT {$channel=HIDDEN;};
>
>     fragment DIGIT : '0'..'9';
>
>     COMMENT
>       : '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
>       | '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
>       ;
>
>     WS : ( ' ' | '\t' | '\r' | '\n' ) {$channel=HIDDEN;} ;
>
>     ELSE : c=. {out.print((char)$c);} ; // match any char and emit
>     /* [ END ] */
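The point about the space(s) can be illustrated outside ANTLR with two regular expressions: one that mirrors the PC rule as posted, and one that allows blanks between the name and the digit. This is a sketch using `java.util.regex` rather than a lexer rule; the class name `HashValueDemo` and its members are made up for the illustration.

```java
import java.util.regex.Pattern;

final class HashValueDemo {
    // Mirrors the rule as posted: the literal followed directly by a digit.
    static final Pattern WITHOUT_SPACE = Pattern.compile("PC_HASH_VALUE[0-9]");
    // Accounts for the blanks between the name and the digit.
    static final Pattern WITH_SPACE = Pattern.compile("PC_HASH_VALUE[ \\t]+[0-9]");

    // True if the pattern occurs anywhere in the line.
    static boolean found(Pattern p, String line) {
        return p.matcher(line).find();
    }

    public static void main(String[] args) {
        String line = "#define PC_HASH_VALUE 1";
        System.out.println(found(WITHOUT_SPACE, line));
        System.out.println(found(WITH_SPACE, line));
    }
}
```

In the lexer the equivalent change would be to match the blanks inside the PC rule itself, since a lexer rule does not skip hidden-channel tokens the way a parser rule does.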
[il-antlr-interest: 34115] Re: [antlr-interest] Unexpected text
Hi Graham,

Inside your parser-grammar, $typeSpec is the object that is returned by the parser rule (RuleReturnScope). It includes, among others, the start- and end-token and the tree of the rule. Using `.text`, which is short for `getText()`, will get the source of your input from `start` to `end` for that rule. If you'd print `$typeSpec.tree` instead, you'd see the AST without the "as" text.

Regards,

Bart.

On Thu, Sep 22, 2011 at 6:41 AM, Graham Mer gd.an...@gmail.com wrote:

> Hi everybody! - Dr. Nick.
>
> I have a larger grammar, a tiny portion of which is attached below. Everything works as expected, except for one thing. In the tree grammar, note the following rule:
>
>     decl: ID typeSpec { System.out.println( "decl id=" + $ID + ";type=" + $typeSpec.text );}
>
> Given the following input:
>
>     Dim foo as String
>     Dim bar as Int
>
> I expect it to print:
>
>     decl id=foo;type=String
>     decl id=bar;type=Int
>
> But instead I get:
>
>     decl id=foo;type=as String
>     decl id=bar;type=as Int
>
> The string trees all look as expected, like (Dim foo String) (Dim bar Int), but I get the extra "as" in the type node, even though I exclude the AS node when building the tree in the following rule:
>
>     asType: AS! type;
>
> I get the same results when the parser is generated by ANTLR 3.3 and 3.4. What am I doing wrong? Here are the grammars:
>
>     grammar test;
>
>     options {
>       output=AST;
>     }
>
>     start  : varDef+ EOF;
>     varDef : DIM^ ID asType;
>     asType : AS! type;
>     type   : STRING_T | INT_T;
>
>     AS       : ('A'|'a')('S'|'s');
>     DIM      : ('D'|'d')('I'|'i')('M'|'m');
>     STRING_T : ('S'|'s')('T'|'t')('R'|'r')('I'|'i')('N'|'n')('G'|'g');
>     INT_T    : ('I'|'i')('N'|'n')('T'|'t');
>     ID       : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*;
>     WS       : ( ' ' | '\t' | '\r' | '\n' ) {$channel=HIDDEN;} ;
>
>     tree grammar testTree;
>
>     options {
>       tokenVocab=test;
>       ASTLabelType=CommonTree;
>       output=AST;
>     }
>
>     start     : statement+ ;
>     statement : ^(DIM decl) ;
>     decl      : ID typeSpec { System.out.println( "decl id=" + $ID + ";type=" + $typeSpec.text );} ;
>     typeSpec  : STRING_T | INT_T ;
[il-antlr-interest: 34116] Re: [antlr-interest] How to do iteration in Tree Grammar
Hi Yifan,

How about something like this:

    foo
      : ^(VIRTUAL_NODE (bar {if($bar.value) return;})* )
      ;

?

Regards,

Bart.

2011/9/22 轶凡 yifan@taobao.com

> Hi,
>
> I defined a tree grammar as below:
>
>     foo: ^(VIRTUAL_NODE bar*) { echo($bar.value); };
>     bar returns [boolean value] : //… Omitted
>
> The generated source of rule foo is like below:
>
>     public final void foo() throws XX {
>       boolean bar40 = false;
>       do {
>         //omitted
>         bar40=bar();
>         //omitted
>       } while (true)
>       echo(bar40)
>     }
>
> Actually in the rule 'foo', I want to do some actions against every 'bar', not only the final bar's value. The code I have in mind:
>
>     public final void foo() throws XX {
>       boolean bar40 = false;
>       do {
>         //omitted
>         bar40=bar();
>         if (bar40) {
>           echo(bar40);
>           return;
>         }
>         //omitted
>       } while (true)
>       echo(bar40)
>     }
>
> How to change the rule 'foo' to achieve my goal? Thanks for your help!
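The effect of placing the action inside the `(...)*` loop, rather than after it, can be shown with an ordinary Java loop. This is a hand-written stand-in for the generated code, not ANTLR output; `EarlyExitDemo` and `childrenVisited` are hypothetical names, and the list of booleans plays the role of the values returned by successive `bar` children.

```java
import java.util.List;

final class EarlyExitDemo {
    // Stands in for the generated loop over `bar*`: the check runs after
    // every child, not once after the loop has finished.
    static int childrenVisited(List<Boolean> barValues) {
        int visited = 0;
        for (boolean value : barValues) {
            visited++;
            if (value) {
                return visited; // same effect as the embedded `return;`
            }
        }
        return visited; // no child returned true: all were visited
    }
}
```

One thing to keep in mind with an early `return` inside a rule is that any code the generator would normally run at the end of the rule is skipped as well.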
[il-antlr-interest: 34098] Re: [antlr-interest] Parsing CSS accurately and fast
Hi Vivek,

On Mon, Sep 19, 2011 at 4:04 AM, Vivek Jhaveri vivekjhav...@hotmail.com wrote:

> ... To date, our efforts have yielded accuracy, but not performance.

Hard to comment without seeing the grammar(s). Do you by any chance have `backtrack=true;` in the options-section of your parser? If so, you could try to add `memoize=true;` to it: that might reduce the parse time (most probably will). However, it would be better to remove `backtrack=true;` and only add predicates to rules where needed.

> ... However, this double parsing creates a new instance of the CSS2.1 parser for each successfully parsed piece of the core grammar. This results in extremely slow parse times.

I would imagine things would go faster if you just parse once (core CSS), which would result in a proper AST, and then manipulate the AST according to some other spec of CSS.

Regards,

Bart.
[il-antlr-interest: 33903] Re: [antlr-interest] new Tree interfaces
Hi Ter, others,

Perhaps generics could be introduced in v4?

    public interface Tree<T> {
      Tree<T> getParent();
      T getPayload();
      Tree<T> getChild(int i);
      int getChildCount();
      String toStringTree();
    }

Regards,

Bart

On Sun, Sep 4, 2011 at 11:47 PM, Terence Parr pa...@cs.usfca.edu wrote:

> btw, if you want to take a look at the clean new tree interface:
>
> http://www.antlr.org/depot/antlr4/main/runtime/Java/src/org/antlr/v4/runtime/tree/Tree.java
>
> then sub interfaces
>
> http://www.antlr.org/depot/antlr4/main/runtime/Java/src/org/antlr/v4/runtime/tree/SyntaxTree.java
> http://www.antlr.org/depot/antlr4/main/runtime/Java/src/org/antlr/v4/runtime/tree/ParseTree.java
> http://www.antlr.org/depot/antlr4/main/runtime/Java/src/org/antlr/v4/runtime/tree/AST.java
>
> ...
>
> Ter
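To make the suggestion concrete, here is one way a generic tree interface with those five methods could be implemented. It is a sketch only: `SimpleTree` and its `add` method are invented for this example and are not part of any ANTLR runtime, and the LISP-style output of `toStringTree()` is an assumption modelled on how ANTLR 3 prints trees.

```java
import java.util.ArrayList;
import java.util.List;

// The interface as proposed in the post, with type parameter T for the payload.
interface Tree<T> {
    Tree<T> getParent();
    T getPayload();
    Tree<T> getChild(int i);
    int getChildCount();
    String toStringTree();
}

// Hypothetical implementation, for illustration only.
final class SimpleTree<T> implements Tree<T> {
    private final T payload;
    private SimpleTree<T> parent;
    private final List<SimpleTree<T>> children = new ArrayList<>();

    SimpleTree(T payload) { this.payload = payload; }

    // Appends a child, sets its parent, and returns this node for chaining.
    SimpleTree<T> add(SimpleTree<T> child) {
        child.parent = this;
        children.add(child);
        return this;
    }

    @Override public Tree<T> getParent() { return parent; }
    @Override public T getPayload() { return payload; }
    @Override public Tree<T> getChild(int i) { return children.get(i); }
    @Override public int getChildCount() { return children.size(); }

    // A leaf prints its payload; an inner node prints "(payload child1 child2 ...)".
    @Override public String toStringTree() {
        if (children.isEmpty()) return String.valueOf(payload);
        StringBuilder sb = new StringBuilder("(").append(payload);
        for (SimpleTree<T> c : children) sb.append(' ').append(c.toStringTree());
        return sb.append(')').toString();
    }
}
```

With a typed payload, a caller that builds a `SimpleTree<String>` gets a `String` back from `getPayload()` without a cast, which is the benefit generics would bring here.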
[il-antlr-interest: 33886] Re: [antlr-interest] org.antlr.runtime.Parser.getMissingSymbol (Line 70) throws NullPointerException
Hi,

On Fri, Sep 2, 2011 at 1:51 AM, Dejas Ninethousand dejas9...@gmail.com wrote:

> For this grammar with an empty string input:
>
> grammar PySON; ...

I think you forgot to ask your question.

Regards,

Bart.
[il-antlr-interest: 33817] Re: [antlr-interest] FEN grammar
On Sun, Aug 28, 2011 at 11:36 PM, Bart Kiers bki...@gmail.com wrote:

> ... Fragment rules are only for other *fragment* rules: the parser has no notion of them. ...

Correction, I meant to say: *Fragment rules are only for other lexer rules ...*

Note that you can also call a fragment from another fragment rule.

Regards,

Bart.
[il-antlr-interest: 33806] Re: [antlr-interest] Match token {n} times
Hi Jonne,

No, that is not possible with a quantifier like {x}. You'll have to do it the hard way:

    ACCOUNT
      : NUMBER NUMBER NUMBER NUMBER NUMBER NUMBER NUMBER NUMBER
      ;

Regards,

Bart.

On Sun, Aug 28, 2011 at 3:20 PM, Jonne Zutt jonne.zutt...@gmail.com wrote:

> Dear all,
>
> Is it possible with antlr to match a token exactly n times? Something like the following:
>
>     ACCOUNT : NUMBER{8};
>     NUMBER  : ZERO | (NONZERO DIGIT*) ;
>     ZERO    : '0';
>     NONZERO : '1'..'9';
>     DIGIT   : '0'..'9';
>
> Thanks,
>
> Jonne.
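Since the repetition has to be written out by hand, a small helper can generate the rule text when the count is large or changes often. This is a convenience sketch in plain Java, not an ANTLR feature; `RepeatRuleDemo` and `repeatedRule` are made-up names, and the output is just a string to paste into a grammar.

```java
import java.util.Collections;

final class RepeatRuleDemo {
    // Builds "NAME : ELEMENT ELEMENT ... ;" with `count` copies of ELEMENT.
    static String repeatedRule(String name, String element, int count) {
        return name + " : "
            + String.join(" ", Collections.nCopies(count, element))
            + " ;";
    }

    public static void main(String[] args) {
        // Prints the ACCOUNT rule with NUMBER repeated eight times.
        System.out.println(repeatedRule("ACCOUNT", "NUMBER", 8));
    }
}
```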
[il-antlr-interest: 33809] Re: [antlr-interest] FEN grammar
Hi Jonne,

On Sun, Aug 28, 2011 at 10:26 PM, Jonne Zutt jonne.zutt...@gmail.com wrote:

> ...
>
>     pieces : ('p'|'P' | 'n'|'N' | 'b'|'B' | 'r'|'R' | 'q'|'Q' | 'k'|'K' | '1'..'8')+;

In parser rules, you _can_ use literal tokens, but it's best to refrain from doing this: create separate lexer rules for them. And the `..` (range operator) is only valid inside lexer rules. So the rules `enPassant` and `pieces` are not doing what you think they are.

Change those things and try your grammar again.

Regards,

Bart.
[il-antlr-interest: 33810] Re: [antlr-interest] FEN grammar
Hi Jonne,

After skimming through the Wiki page you mentioned, I realize what you mean by overlap. Then no, making fragments will not help you since you will only know at parse-time if a digit should be a part of a `number` or a `rank` or a `move`. Fragment rules are only for other fragment rules: the parser has no notion of them.

I quickly hacked a FEN grammar together. I'm sure I goofed up some things, but perhaps it's a starting point:

    grammar Fen;

    parse
      : placement Space color Space castling Space enPassant Space number Space number EOF
      ;

    placement : place+ (FSlash place+)+ ;
    place     : piece | Rank ;
    color     : W_lc | B_lc ;
    castling
      : Hyphen
      | K_uc? Q_uc? K_lc? Q_lc? // Must match at least one though! You can do that by using a semantic predicate.
      ;
    enPassant : position | Hyphen ;
    position  : file Rank ;
    file      : A_lc | B_lc | C_H_lc ;
    number    : digit+ ;
    digit     : Zero | Rank | Nine ;
    piece     : rook | knight | bishop | queen | king | pawn ;
    rook      : R_lc | R_uc;
    knight    : N_lc | N_uc;
    bishop    : B_lc | B_uc;
    queen     : Q_lc | Q_uc;
    king      : K_lc | K_uc;
    pawn      : P_lc | P_uc;

    FSlash : '/';
    Space  : ' ';
    Zero   : '0';
    Rank   : '1'..'8';
    Nine   : '9';
    Hyphen : '-';
    W_lc   : 'w';
    R_lc   : 'r';
    N_lc   : 'n';
    A_lc   : 'a';
    B_lc   : 'b';
    C_H_lc : 'c'..'h';
    Q_lc   : 'q';
    K_lc   : 'k';
    P_lc   : 'p';
    R_uc   : 'R';
    N_uc   : 'N';
    B_uc   : 'B';
    Q_uc   : 'Q';
    K_uc   : 'K';
    P_uc   : 'P';

Regards,

Bart.

On Sun, Aug 28, 2011 at 10:26 PM, Jonne Zutt jonne.zutt...@gmail.com wrote:

> Hi all,
>
> I made my first attempts to use antlr today. Although I read some tutorials, example programs and a page about common pitfalls, I stepped into several pitfalls myself as well, I guess.
>
> Is there anybody who wants to shed some light on the below grammar to parse chess FEN strings (see http://en.wikipedia.org/wiki/Forsyth%E2%80%93Edwards_Notation).
>
> I am debugging with the string "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1", without the quotes (this is the initial position for chess).
>
> I have several problems:
>
> - I was using more tokens, but several are overlapping (e.g., for the enPassant rule I used to have FILE RANK where RANK was a lexer token '1'..'8', but that overlaps with the NUMBER token and also with pieces). I'm not sure how to deal with tokens that have overlap? Should they always be changed into fragments? I wanted to make tokens for each piece as well, such as KNIGHT : 'n' | 'N'; but the bishop turns out to be quite overloaded as well (with BLACK and FILE).
> - For some reason, 0 seems to match my NUMBER, but 1 does not match. This is what the debugger shows me. If I switch "0 1" into "1 0", the halfmoveClock is not matching.
> - If I press ctrl-Y in the AntlrWorks plugin, I lose all my data!! arghh. In IntelliJ that is my shortcut to delete a line.
>
> Below is my grammar. Any help / comments would be nice :)
>
> Thanks,
>
> Jonne.
>
>     grammar Fen;
>
>     input          : fen EOF;
>     fen            : piecePlacement WS activeColor WS castling WS enPassant WS halfmoveClock WS fullmoveNumber;
>     piecePlacement : pieces SEP pieces SEP pieces SEP pieces SEP pieces SEP pieces SEP pieces SEP pieces;
>     pieces         : ('p'|'P' | 'n'|'N' | 'b'|'B' | 'r'|'R' | 'q'|'Q' | 'k'|'K' | '1'..'8')+;
>     activeColor    : 'w' | 'b';
>     castling       : NONE | ('K' | 'Q' | 'k' | 'q')+;
>     enPassant      : NONE | FILE '1'..'8';
>     halfmoveClock  : NUMBER;
>     fullmoveNumber : NUMBER;
>
>     // LEXER
>     WS     : (' ' | '\t')+;
>     SEP    : '/';
>     NONE   : '-';
>     FILE   : 'a'..'h';
>     NUMBER : '0' | ('1'..'9' ('0'..'9')*);
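For comparison with the grammars in this thread, the six space-separated fields of a FEN string can also be checked with plain regular expressions. The sketch below is not a replacement for the ANTLR grammar and only checks the shape of each field (it does not verify that a rank adds up to eight squares); `FenFieldsDemo` and `looksLikeFen` are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

final class FenFieldsDemo {
    // Eight ranks of pieces and digits 1-8, separated by '/'.
    private static final Pattern PLACEMENT =
        Pattern.compile("[pnbrqkPNBRQK1-8]+(/[pnbrqkPNBRQK1-8]+){7}");
    private static final Pattern COLOR = Pattern.compile("[wb]");
    // '-' or any subset of KQkq in that order; emptiness is checked separately.
    private static final Pattern CASTLING = Pattern.compile("-|K?Q?k?q?");
    private static final Pattern EN_PASSANT = Pattern.compile("-|[a-h][1-8]");
    private static final Pattern NUMBER = Pattern.compile("0|[1-9][0-9]*");

    // Returns true if the string has six fields that each have a valid shape.
    static boolean looksLikeFen(String fen) {
        String[] f = fen.split(" ");
        return f.length == 6
            && PLACEMENT.matcher(f[0]).matches()
            && COLOR.matcher(f[1]).matches()
            && !f[2].isEmpty() && CASTLING.matcher(f[2]).matches()
            && EN_PASSANT.matcher(f[3]).matches()
            && NUMBER.matcher(f[4]).matches()
            && NUMBER.matcher(f[5]).matches();
    }
}
```

Splitting on the field separator first sidesteps the overlap problem discussed above: a digit is a rank count, an en passant rank, or part of a move number depending only on which field it sits in, which is the same context the parser rules provide in the grammar.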
[il-antlr-interest: 33791] Re: [antlr-interest] Problem with AST rewrite and variable
On Fri, Aug 26, 2011 at 1:40 PM, Adam Adamski thebigsma...@gmail.com wrote:

> Dear all ANTLR users,

Hi Adam,

> Is it a mistake to use variable like I used? I mean:
>
>     rule: IDN typ=(type1|type2|typ3) ;

Yes, you can't do that. Try this instead:

    rule
      : (ARRAY_OF)* intermediate -> ^(TYPE ^(ARRAY_OFLIST ARRAY_OF*) intermediate)
      ;

    intermediate
      : simpleType
      | systemType
      | objectType
      ;

> My final question: is it wrong to use same var name for many rules ? I used variable 'typ' in 4 rules. (I don't think It cause the problem, but I'm not sure)

No, there's nothing wrong with that.

Regards,

Bart.
[il-antlr-interest: 33769] Re: [antlr-interest] Rewriting an AST with the same type listed twice
Hi Garry,

You either use inline operators ^ and !, or use a rewrite rule, not both. To reference a rule, add a $ before its label:

    mapType
      : 'map' '<' k=anyType ',' v=anyType '>' -> ^('map' $k $v)
      ;

but I think this should work too:

    mapType
      : 'map' '<' anyType ',' anyType '>' -> ^('map' anyType anyType)
      ;

Regards,

Bart.

On Thu, Aug 25, 2011 at 5:22 PM, Garry Watkins ga...@dynafocus.com wrote:

>     mapType
>       : 'map' '<'! anyType ','! anyType '>'!
>       | 'map' k=anyType ','! v=anyType '>'! -> ^('map' k v)
>       ;
>
> I am trying to re-write the AST generated on this alternative:
>
>     | 'map' k=anyType ','! v=anyType '>'! -> ^('map' k v)
>
> How does one alias the anyTypes? I am getting an error on the code above.
>
> However, the bigger question is that my parser won't recognize "map<string,string>" if it uses just the first line in my rule. However, it will recognize "map <string,string>".
>
> Thanks
>
> Garry
[il-antlr-interest: 33715] Re: [antlr-interest] ignoring the rest of the file other than the rules defined
Could you post your grammar(s) and actual input? Example input containing is not really helpful.

Regards,

Bart.

On Mon, Aug 22, 2011 at 1:27 PM, Swathi V swat...@zinniasystems.com wrote:

> Would be thankful if anyone helps me out.
>
> Problem: I have a huge file with certain categories and properties in those. Ex:
>
>     CATEGORY1(XYZ) {
>       PROPERTY1 : .
>       PROPERTY2 :
>     }
>     CATEGORY2(XYZ) {
>       PROPERTY3 : .
>       PROPERTY4 :
>     }
>     ..
>
> What I need to do is: I need only a few categories, and in those only a few properties. The rest of the file I have to ignore.
>
> I have written
>
> 1. a lexer
> 2. a parser and AST creation for it
> 3. code to invoke the same.
>
> I have also used options { filter = true; } but i'm able to get the required thing. Can anyone help me out please, with an example for the above?
>
> Thank You..
>
> --
> Regards,
> Swathi.V.
[il-antlr-interest: 33645] Re: [antlr-interest] iniFile grammer
Hi Романов,

How about something like this:

    grammar INIFile;

    parse     : Comment* section+ EOF ;
    section   : SectionStart (property | Comment)* ;
    property  : NameOrValue (Assign valueList?)? eol ;
    valueList : (Separator | NameOrValue)+ ;
    eol       : LineBreak | Comment | EOF ;

    Comment      : '#' ~('\r' | '\n')* LineBreak ;
    SectionStart : '[' ~(']')+ ']' Spaces? LineBreak ;
    Separator    : ';' ;
    Assign       : '=' ;
    NameOrValue  : ~(' ' | '\t' | '=' | '\r' | '\n' | '#' | ';') ~('=' | '\r' | '\n' | '#' | ';')* ;
    LineBreak    : '\r'? '\n' | '\r' ;
    Spaces       : (' ' | '\t')+ {$channel=HIDDEN;} ;

Regards,

Bart.

On Wed, Aug 17, 2011 at 2:02 PM, Романов Артем artemi...@yandex.ru wrote:

> I am trying to define an iniFile grammar (keys can contain several subvalues). I derived it from these C# regexes:
>
>     Regex commentLine = new Regex(@"^\s*#(?<comment>.*)", RegexOptions.Compiled);
>     Regex sectionLine = new Regex(@"^\s*\[(?<section>.*)\].*", RegexOptions.Compiled);
>     //Regex recordLine = new Regex(@"^\s*(?<key>[^#[=\s]+)\s*=?\s*(?<values>[^#]*)(#(?<comment>.*))?", RegexOptions.Compiled);
>     Regex recordLine2 = new Regex(@"^\s*(?<key>[^#[=\s]+)\s*=?\s*((?<value>[^;\#]*);)*(?<endValue>[^;\#]*[^;\s\#]+)?\s*(#(?<comment>.*))?", RegexOptions.Compiled);
>     foreach(var c in recordLine2.Match(string).Groups["value"].Captures) //access to each value of key
>
> Sample ini structure:
>
>     #comment
>     [section]
>     key1
>     key2=
>     key3= # this and earlier lines contains 0 values
>     key4=a# 1 values
>     key5=;# 1 empty values
>     key6=a;f # 2 values
>     key7= a ; f;; # 3 values
>     [section2]
>     ..
>
> But I don't know how to implement endValue (without semicolon), and I get a lot of warnings from my grammar. This grammar returns a wrong parse tree ([section2] as keyLine). I tested the grammar in ANTLRWorks 1.4.3.
>
>     grammar test;
>
>     WS        : (' '|'\t') {$channel=HIDDEN;};
>     EOL       : ('\r\n'|'\n'|'\r') ;
>     SHARP     : '#' {System.out.println("#");};
>     EQUAL     : '=' {System.out.println("=");};
>     SEMICOLON : ';' {System.out.println(";");};
>     COMMENT   : SHARP .* EOL ;//{System.out.println("COM");};
>     SECTION   : '[' .* ']' {System.out.println("SEC");};
>     ANY       : . {System.out.println("ANY");};
>
>     iniFile     : section* EOF;
>     section     : commentLine* sectionLine COMMENT* (keyLine COMMENT?)*;
>     commentLine : COMMENT;
>     sectionLine : SECTION (EOL|COMMENT);
>     keyLine     : key keyValues* (EOL|COMMENT);
>     key         : ~('='|'#'|'['|EOL)+ {System.out.println("key");};
>     keyValues   : '=' (keyValue';')*;
>     keyValue    : ~(';'|'#'|EOL)* ;
[il-antlr-interest: 33495] Re: [antlr-interest] Line oriented parsing with Unix files on Windows machine
Hi Todd,

Well, `EOL : ('\r' | '\n')+;` matches a single `\n`. So my guess is that some other rule in your grammar matches a `\n` as well. Can you post a complete (small) grammar that shows the problem you're having?

Regards,

Bart.

On Thu, Aug 4, 2011 at 8:37 PM, Stevenson, Todd (GE Healthcare) todd.t.steven...@ge.com wrote:

> I have a line-oriented grammar that uses the following rule to define an end of line:
>
>   EOL : ('\r' | '\n')+;
>
> My grammar parses input files fine when they are in Windows format (lines end with '\r\n'), however the parser doesn't work when the input files are in Unix format (lines end with '\n' only). It seems to ignore the newline character and attach it to the subsequent rule.
>
> Is there something I need to do to process these files correctly? Any ideas?
>
> Thanks.
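As a quick sanity check of the claim that `('\r' | '\n')+` accepts a lone `\n`, here is a minimal Java sketch. It uses the regex `[\r\n]+` as a stand-in for the lexer rule; the class `EolCheck` and its method are made up for illustration and are not ANTLR-generated code.

```java
// Hypothetical stand-in for the lexer rule  EOL : ('\r' | '\n')+ ;
// A regex character class is used instead of a real ANTLR lexer.
class EolCheck {
    // True when the whole string consists of one or more CR/LF characters.
    static boolean isEol(String s) {
        return s.matches("[\\r\\n]+");
    }

    public static void main(String[] args) {
        System.out.println(isEol("\r\n")); // Windows line ending
        System.out.println(isEol("\n"));   // Unix line ending
        System.out.println(isEol("a\n"));  // text followed by a line ending
    }
}
```

If both line-ending forms are accepted here, the rule itself is unlikely to be the culprit, which supports the guess that another lexer rule is consuming the `\n`.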
[il-antlr-interest: 33356] Re: [antlr-interest] Cannot match strings combining terminals w/o spaces between
Hi Samuel,

Your input:

  "history":

is not tokenized as a STRING but as a WORD. You need to exclude the double quote from your WORD rule.

Also, you put '\n' on the HIDDEN channel, yet you use it in your parser rule 'command'. This will cause the rule to never match properly: you need to either remove the '\n' from the 'command' rule, or not put '\n' on the HIDDEN channel.

Regards,

Bart.

On Wed, Jul 27, 2011 at 11:35 AM, Samuel Lampa samuel.la...@scilifelab.uu.se wrote:

> I got problems matching the string:
>
>   "history":
>
> ... with the following ANTLR code (work in progress, really):
>
>   (STRING)':'
>
> Where I have the STRING terminal defined as:
>
>   STRING : '"'('a'..'z'|'A'..'Z')+'"' ;
>
> It works if I add the ending colon in the STRING definition itself, like so (and then remove it from the parent rule):
>
>   STRING : '"'('a'..'z'|'A'..'Z')+'"'':' ;
>
> ... but this of course makes for a less general string definition :/ ...
>
> Any ideas how I should go about this?
>
> Best regards
> // Samuel
>
> Addendum: The full input string and EBNF code is as follows:
>
> === Input string ===
>
>   sam_to_bam.py --input1=$source.input1 --dbkey=${input1.metadata.dbkey} #if $source.index_source == "history": --ref_file=$source.ref_file #else --ref_file=None #end if --output1=$output1 --index_dir=${GALAXY_DATA_INDEX_DIR}
>
> === ANTLR code ===
>
>   grammar GalaxyToolConfig;
>
>   options {output=AST;}
>
>   command     : binary param* ifstatement '\n' text? ELSE text? ENDIF text? ;
>   binary      : WORD ;
>   param       : '--' PARAMNAME '=' ( VARIABLE | STRING ) ;
>   ifstatement : IF ( STRING | VARIABLE ) EQ ( (STRING)':' | (VARIABLE)':' ) ;
>   text        : WORD WORD* ;
>
>   IF        : '#if' ;
>   ELSE      : '#else' ;
>   ENDIF     : '#end if' ;
>   EQ        : '==' ;
>   COLON     : ':' ;
>   PARAMNAME : ('a'..'z')('a'..'z'|'A'..'Z'|'0'..'9'|'.'|'_')* ;
>   STRING    : '"'('a'..'z'|'A'..'Z')+'"' ;
>   VARIABLE  : '$''{'?PARAMNAME'}'? ;
>   // CHAR   : ('a'..'z'|'A'..'Z'|'0'..'9'|'_'|'.'|'$'|'{'|'}'|'='|'"'|'-'|':'|';')
>   //        ;
>   WORD      : (~(' '|'\t'|'\r'|'\n'))+ ;
>   WS        : ( ' ' | '\t' | '\r' | '\n' ) {$channel=HIDDEN;} ;
>
> --
> System Expert / Bioinformatician
> SNIC-UPPMAX / SciLifeLab Uppsala
> Uppsala University, Sweden
> --
> E-mail: samuel.la...@scilifelab.uu.se
> Phone: +46 (0)18 - 471 1060
> WWW: http://www.uppmax.uu.se
> Uppnex: https://www.uppnex.uu.se
[il-antlr-interest: 33367] Re: [antlr-interest] Whitespace in identifiers
FYI: http://stackoverflow.com/questions/6847971/antlr-identifier-with-whitespace

On Wed, Jul 27, 2011 at 7:39 PM, Lukas Glowania lukas.glowa...@rub.de wrote:

> Hi,
>
> I want identifiers that can contain whitespace.
>
>   grammar WhitespaceInSymbols;
>
>   premise : ( options {greedy=false;} : 'IF' ) id=ID { System.out.println($id.text); };
>
>   ID : ('a'..'z'|'A'..'Z')+ (' '('a'..'z'|'A'..'Z')+)* ;
>   WS : ' '+ {skip();} ;
>
> When I test this with "IF statement analyzed" I get a MissingTokenException and the output "IF statement analyzed".
>
> I thought that by using greedy=false I could tell ANTLR to exit after 'IF' and take it as a token. But instead the IF is part of the ID.
>
> Is there a way to achieve my goal? I already tried some variations of the greedy=false option, but without success.
>
> Thanks in advance!
[il-antlr-interest: 33369] Re: [antlr-interest] Quoted String Literal - confused by greed=false behavior.
And by default, greedy=true (except with .* and .+), so in this case, one could simply write

  STRING_LITERAL : '"' ('""' | ~'"')* '"' ;

AFAIK.

Regards,

Bart.

On Wed, Jul 27, 2011 at 9:54 PM, Sam Harwell sharw...@pixelminegames.com wrote:

> You're reading the greedy option in reverse. :) I'd write the rule this way:
>
>   STRING_LITERAL : '"' ( options{greedy=true;} : '""' | ~'"' )* '"' ;
>
> Sam
>
> -----Original Message-----
> From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-boun...@antlr.org] On Behalf Of G. Richard Bellamy
> Sent: Wednesday, July 27, 2011 1:49 PM
> To: antlr-interest@antlr.org
> Subject: [antlr-interest] Quoted String Literal - confused by greed=false behavior.
>
> I've got a lexer rule that should be gobbling everything after the double quote '"' except for the last double quote - I basically stole the rule from a post from Jim Idle (http://www.antlr.org/pipermail/antlr-interest/2010-March/038051.html).
>
> I've also tried other variations on the same rule, and I'm a bit confused as it seems the {greedy=false;} option is being ignored.
>
> Any help is appreciated
>
> --- INPUT:
>
>   @(FOO=)
>
> ---
>
>   lexer grammar Lexer options { language=CSharp3; TokenLabelType=CommonToken; }
>
>   DQUOTE : '"';
>   STRING_LITERAL : DQUOTE (options { greedy = false; } : ( ( {input.LA(1) == '"' && input.LA(2) == '"'}? DQUOTE DQUOTE | ~DQUOTE )* ) ) DQUOTE ;
>
> --- LEXER TRACE (excerpt):
>
>   enter STRING_LITERAL line=1:7
>   enter DQUOTE line=1:7
>   exit DQUOTE ) line=1:8
>   enter DQUOTE ? line=1:9
>   exit DQUOTE ? line=1:9
>   exit STRING_LITERAL ? line=1:9
>   line 1:10 mismatched character 'EOF' expecting '"'
[il-antlr-interest: 33373] Re: [antlr-interest] Quoted String Literal - confused by greed=false behavior.
On Wed, Jul 27, 2011 at 10:06 PM, G. Richard Bellamy rbell...@pteradigm.com wrote:

> Thanks to both of you for your help. Clearly I understated things when I said I was confused.
>
> 1. I was under the impression that greedy=true was the default, in every case. For instance, in The Definitive ANTLR Reference...

By default, * and + are greedy, _except_ when preceded by the . (DOT). See: The Definitive ANTLR Reference, Ch 4, Extended BNF Subrules, page 86.

But as Jim mentioned, this is not the issue here. The rule:

  STRING : '"' ('""' | ~'"')* '"';

matches input like:

  "a ""b"" c"

just fine (as a single STRING).

Regards,

Bart.
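To see why a string rule with doubled-quote escapes accepts such input as a single token, here is a small Java sketch. It assumes the rule has the shape "opening quote, then any mix of doubled quotes or non-quote characters, then closing quote", and it uses a regex as a stand-in for the lexer rule; the class name `DoubledQuoteString` is invented for this example.

```java
// Hypothetical regex stand-in for a lexer rule of the form:
//   STRING : '"' ('""' | ~'"')* '"' ;
class DoubledQuoteString {
    // An opening quote, then any mix of doubled quotes or non-quote
    // characters, then a closing quote.
    static final String STRING_REGEX = "\"(\"\"|[^\"])*\"";

    static boolean isString(String s) {
        return s.matches(STRING_REGEX);
    }

    public static void main(String[] args) {
        System.out.println(isString("\"a \"\"b\"\" c\"")); // doubled quotes inside
        System.out.println(isString("\"\""));               // empty string literal
        System.out.println(isString("\""));                 // a lone quote
    }
}
```

The first two inputs are complete string literals and the third is not, which mirrors the lexer behaviour discussed in this thread: a lone quote cannot form a token.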
[il-antlr-interest: 33376] Re: [antlr-interest] Empty Quoted String Literal
Hi Richard,

Both ANTLRWorks' debugger, and my Java test rig:

  import org.antlr.runtime.*;

  public class Main {
    public static void main(String[] args) throws Exception {
      WhitespaceInSymbolsLexer lexer = new WhitespaceInSymbolsLexer(new ANTLRStringStream("\"\""));
      WhitespaceInSymbolsParser parser = new WhitespaceInSymbolsParser(new CommonTokenStream(lexer));
      parser.compileUnit();
    }
  }

produce no errors when parsing "" (2 double quotes) as input:

  bart@hades:~/Programming/ANTLR/Demos/WhitespaceInSymbols$ java -cp antlr-3.3.jar org.antlr.Tool WhitespaceInSymbols.g
  bart@hades:~/Programming/ANTLR/Demos/WhitespaceInSymbols$ javac -cp antlr-3.3.jar *.java
  bart@hades:~/Programming/ANTLR/Demos/WhitespaceInSymbols$ java -cp .:antlr-3.3.jar Main
  bart@hades:~/Programming/ANTLR/Demos/WhitespaceInSymbols$

Then there must be something going differently in the CSharp3 target than in the Java target (note that I am not able to test the CSharp3 target here...).

Regards,

Bart.

On Wed, Jul 27, 2011 at 11:01 PM, G. Richard Bellamy rbell...@pteradigm.com wrote:

> Sam, Bart & Jim,
>
> I really appreciate your help on this. Here's a more complete example, without the greedy confusion. I'm including the combined grammar and a test rig.
>
> I get:
>
>   CombinedLexer:line 1:2 mismatched character 'EOF' expecting '"'
>
> just before a NullReferenceException.
>
> GRAMMAR:
>
>   grammar Combined;
>
>   options { language=CSharp3; TokenLabelType=CommonToken; output=AST; ASTLabelType=CommonTree; }
>
>   @lexer::namespace{StringLiteralLexerTest}
>   @parser::namespace{StringLiteralLexerTest}
>
>   /*
>    * Parser Rules
>    */
>   public compileUnit : STRING ;
>
>   /*
>    * Lexer Rules
>    */
>   STRING : '"' ('""' | ~'"')* '"';
>
> TEST RIG:
>
>   static void Main() {
>     CombinedLexer lexer = new CombinedLexer(new ANTLRStringStream(@""""));
>     //lexer.TraceDestination = new ConsoleTextWriter(typeof(CombinedLexer));
>     CommonTokenStream tokenStream = new CommonTokenStream(lexer);
>     CombinedParser parser = new CombinedParser(tokenStream);
>     //parser.TraceDestination = new ConsoleTextWriter(typeof(CombinedParser));
>     CommonTree parseTree = parser.compileUnit().Tree;
>     Console.WriteLine(parseTree.ToStringTree());
>   }
[il-antlr-interest: 33378] Re: [antlr-interest] Empty Quoted String Literal
Richard,

Note that I am a C# illiterate, but isn't the C# literal:

  @""""

only just a single quote? If so, then it is no wonder you're getting errors. Parsing:

  WhitespaceInSymbolsLexer lexer = new WhitespaceInSymbolsLexer(new ANTLRStringStream("\""));

will also produce the error 'line 1:1 mismatched character 'EOF' expecting '"'' with me (of course), since a single quote is not a valid token.

Regards,

Bart.

On Wed, Jul 27, 2011 at 11:14 PM, G. Richard Bellamy rbell...@pteradigm.com wrote:

> Bart,
>
> When I escape the quotes things work on my end as well - you'll note that I'm passing a set of non-escaped quotes... (C# Verbatim String), so you'll not be able to test this on your end (since Java doesn't have an equivalent to the Verbatim String).
>
> Thanks again.
>
> -rb
[il-antlr-interest: 33380] Re: [antlr-interest] Empty Quoted String Literal
A quick test with Mono confirmed it, @"""" is just a single quote:

  Console.WriteLine(@"""");

prints 1 quote! That's the problem.

Bart.
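The same off-by-one-quote situation can be reproduced on the Java side of this thread. The sketch below is only an illustration with made-up names (`QuoteLiterals`, `ONE_QUOTE`, `TWO_QUOTES`): it compares the escaped literal that holds a single double quote with the one that holds two, which is the difference between the failing and the working lexer input.

```java
// Illustration only: the two Java literals that correspond to the
// failing input (one quote) and the intended input (two quotes).
class QuoteLiterals {
    static final String ONE_QUOTE = "\"";     // a single " character
    static final String TWO_QUOTES = "\"\"";  // an empty string literal: ""

    public static void main(String[] args) {
        System.out.println(ONE_QUOTE.length());
        System.out.println(TWO_QUOTES.length());
    }
}
```

Printing the length of the string handed to the lexer is a cheap way to rule out this kind of escaping mistake before suspecting the grammar or the target runtime.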
[il-antlr-interest: 33338] Re: [antlr-interest] Is it possible to do rewriting for lexer?
But you'd have to have a good reason to use a string literal as a parser rule instead of a lexer rule.

Bart.

On Tue, Jul 26, 2011 at 9:16 AM, Rob Aarnts r...@aarnts.com wrote:

> You're entirely right. I use this trick but with a parser rule instead of a lexer one. So SINGLE_QUOTED_STRING should be single_quoted_string according to the ANTLR convention.
>
> -
> Rob Aarnts
> phone: +31 2356 1
> mobile: +31 6 5582 2856
> fax: +31 8 4227
> -
>
> On 26 jul 2011 09:02 Gokulakannan Somasundaram gokul...@gmail.com wrote:
>
>> On Mon, Jul 25, 2011 at 4:44 PM, Rob Aarnts r...@aarnts.com wrote:
>>
>>> Or:
>>>
>>>   SINGLE_QUOTED_STRING returns [string result] : SQUOTE val=((~SQUOTE)*){ $result = val.Text; } SQUOTE ;
>>
>> Is this possible? To my knowledge, Lexer cannot return...
>>
>> Gokul.
[il-antlr-interest: 33319] Re: [antlr-interest] Is it possible to do rewriting for lexer?
Hi Qiao,

With Java as the target, you could do:

  SINGLE_QUOTED_STRING
  @after {
    String s = getText();
    setText(s.substring(1, s.length() - 1));
  }
    : SQUOTE (~SQUOTE)* SQUOTE
    ;

Regards,

Bart.

On Sun, Jul 24, 2011 at 2:43 PM, Mu Qiao qiao...@gmail.com wrote:

> I have some token rules like:
>
>   SINGLE_QUOTED_STRING: SQUOTE (~SQUOTE)* SQUOTE;
>
> I'd like to hide the first SQUOTE token and the last SQUOTE token from the parser grammar. Is there any way to do that? I've tried the hidden channel and the skip() method, but they work for the whole rule, not just for the tokens I need to hide.
>
> --
> Best wishes,
> Mu Qiao
> GnuPG fingerprint: 92B1 B0C4 8D14 F8C4 EFA5 3ACC 30B3 0DE4 17B1 57E9
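The string manipulation inside that `@after` block can be tried on its own, outside ANTLR. The following is a minimal sketch with a hypothetical helper class `QuoteStripper`; the length guard is an addition for safety and is not part of the action shown in the message.

```java
// Stand-alone version of the substring call used in the @after action.
// QuoteStripper and stripDelimiters are hypothetical names.
class QuoteStripper {
    // Drops the first and last character of the token text.
    // Texts shorter than two characters are returned unchanged (added guard).
    static String stripDelimiters(String tokenText) {
        if (tokenText.length() < 2) {
            return tokenText;
        }
        return tokenText.substring(1, tokenText.length() - 1);
    }

    public static void main(String[] args) {
        System.out.println(stripDelimiters("'hello world'"));
    }
}
```

Inside a lexer rule the guard is never triggered, because the rule only matches when both SQUOTE delimiters are present.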
[il-antlr-interest: 33278] Re: [antlr-interest] Left recursive grammar
Hi Luigi,

I'm not sure if this is possible with ANTLR, or any other LL parser generator. See this for a work-around: http://stackoverflow.com/questions/3799890/antlr-ast-generating-possible-madness

If it _is_ possible using some sort of fancy AST rewrite magic, I'm sure someone will correct me.

Regards,

Bart.

On Thu, Jul 21, 2011 at 3:00 PM, Luigi Iannone iann...@cs.manchester.ac.uk wrote:

> Hi all,
>
> I have this simple grammar:
>
>   grammar test;
>
>   options { language = Java; output = AST; }
>
>   a : a*B -> ^(B a*) | A ;
>   B : '.B' ;
>   A : 'A' ;
>
> and I get the following output when I try to generate the parser in ANTLRWorks:
>
>   [13:48:53] error(210): The following sets of rules are mutually left-recursive [a]
>
> I read on the Web that there are solutions to solve this, however they will mess up the associativity, which I need to keep instead. So, for instance, for the input
>
>   A.B.B
>
> the AST tree should be
>
>   ^(B ^(B A))
>
> Is there any way to change the grammar in order to eliminate the left recursion and obtain the above tree? I am afraid I do not get how to do it by just looking at what is there online about left-recursive grammars.
>
> Thanks a lot for your help,
>
> Luigi
[il-antlr-interest: 33280] Re: [antlr-interest] Left recursive grammar
Hi Sam,

But of course, with the inline tree rewrite operators it looks so straightforward! Nice one!

Regards,

Bart.

On Thu, Jul 21, 2011 at 3:52 PM, Sam Harwell sharw...@pixelminegames.com wrote:

> Your example is ambiguous as well as left recursive. I assume you meant one of the following:
>
>   a : a B | A;
>   a : A* B | A;
>
> The first can be written as:
>
>   a : A (B^)*;
>
> The second can be written as:
>
>   a : A (A* B^)? | A | B;
>
> Sam
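To illustrate how a rule of the form `a : A (B^)*;` can yield a left-nested tree, here is a rough Java sketch. It does not use ANTLR; it only mimics the effect of the `^` operator, where each B becomes the new root over everything matched so far. The class `LeftAssocTree` and the string rendering of the tree are assumptions made for this example.

```java
// Mimics the tree shape produced by  a : A (B^)* ;
// Each B wraps the tree built so far, so nesting grows to the left.
class LeftAssocTree {
    // bCount is the number of '.B' tokens following the initial 'A'.
    static String build(int bCount) {
        String tree = "A";
        for (int i = 0; i < bCount; i++) {
            tree = "^(B " + tree + ")"; // B becomes the new root
        }
        return tree;
    }

    public static void main(String[] args) {
        // Input A.B.B corresponds to two B tokens.
        System.out.println(build(2));
    }
}
```

For two B tokens this renders `^(B ^(B A))`, the tree requested in the original question, which is why the iterative rewrite keeps the left associativity.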
[il-antlr-interest: 33170] Re: [antlr-interest] changes between 3.2 and 3.3 that brake a 3.2 tutorial
Hi Sébastien,

Thanks!

Hmm, that is odd. If I change the parse rule into:

  parse
    : (t=. {System.out.printf("text: \%-7s type: \%s \n", $t.text, tokenNames[$t.type]);})* EOF
    ;

and the Main class into:

  import org.antlr.runtime.*;

  public class Main {
    public static void main(String[] args) throws Exception {
      TLLexer lexer = new TLLexer(new ANTLRFileStream(args[0]));
      CommonTokenStream tokens = new CommonTokenStream(lexer);
      TLParser parser = new TLParser(tokens);
      parser.parse();
    }
  }

the expected output _is_ printed to the console. Not sure why

  for(Object o : tokens.getTokens())

doesn't work... Will have a look later on.

Regards,

Bart.

2011/7/14 Sébastien Kirche sebastien.kir...@gmail.com

> Hi,
>
> I am currently playing with the antlr-3.3-complete.jar package.
>
> While trying to fix a lexer token priority problem in my grammar, I tried to test the token listing method described in the (very useful!) posts from Bart Kiers about his tutorial with the TL language.
>
> My problem is that his method applied to my grammar outputs nothing. I banged my head against my desk for a while looking for what I did wrong when reusing Bart's code, until I tested his code and example as-is and found that it outputs nothing too...
>
> As the TL tutorial is built with ANTLR 3.2, I tried to recompile with 3.2 instead of 3.3, and then it works as expected o_O
>
> I cannot find in the 3.3 release notes (http://www.antlr.org/wiki/display/ANTLR3/ANTLR+3.3+Release+Notes) what causes that behavior difference between 3.2 and 3.3.
>
> The specific code from Bart Kiers I have tried is from the post at http://bkiers.blogspot.com/2011/03/3-lexical-analysis-of-tl.html
>
> Thanks.
> --
> Sébastien Kirche
[il-antlr-interest: 33161] Re: [antlr-interest] ANTLR 3.1.2: Simplest action
Try:

  INTEGER : DIGIT+ { print($text) } ;

Regards,

Bart.

On Wed, Jul 13, 2011 at 9:57 PM, Udo Weik weikeng...@aol.com wrote:

> Hello,
>
> I just want to access the attributes of INTEGER and DIGIT - but how?
>
>   INTEGER: DIGIT+ { print( "L: (INTEGER): " ) } ;
>   fragment DIGIT: '0'..'9' { print( "L: (DIGIT): " ) } ;
>
> Thanks and greetings
> Udo
[il-antlr-interest: 33162] Re: [antlr-interest] ANTLR 3.1.2: Simplest action
Hi Udo,

`$text` is just a shorthand notation for `$XYZ.text` where `XYZ` is the rule you're currently in.

Regards,

Bart.

On Wed, Jul 13, 2011 at 10:18 PM, Udo Weik weikeng...@aol.com wrote:

> Hello Bart and others,
>
> yep, thanks, it works...
>
>> Try:
>>
>>   INTEGER : DIGIT+ { print($text) } ;
>
> 1. Why do I not need the name, here INTEGER, like $INTEGER.text?
>
> 2. With 12345 the result is
>
>   L: (DIGIT): 1
>   L: (DIGIT): 12
>   L: (DIGIT): 123
>   L: (DIGIT): 1234
>   L: (DIGIT): 12345
>   L: (INTEGER): 12345
>
> I expected
>
>   L: (DIGIT): 1
>   L: (DIGIT): 2
>   L: (DIGIT): 3
>   L: (DIGIT): 4
>   L: (DIGIT): 5
>   L: (INTEGER): 12345
>
> Many thanks and greetings
> Udo
[il-antlr-interest: 33163] Re: [antlr-interest] ANTLR 3.1.2: Simplest action
Oh, and since `DIGIT` is called from `INTEGER`, the `$text` from `DIGIT` references `$INTEGER.text`.

Regards, Bart.

On Wed, Jul 13, 2011 at 10:18 PM, Udo Weik weikeng...@aol.com wrote:
Hello Bart and others, yep, thanks, it works...

Try: INTEGER : DIGIT+ { print($text) } ;

1. Why do I not need the name, here INTEGER, like $INTEGER.text?
2. With 12345 the result is

  L: (DIGIT): 1
  L: (DIGIT): 12
  L: (DIGIT): 123
  L: (DIGIT): 1234
  L: (DIGIT): 12345
  L: (INTEGER): 12345

I expected

  L: (DIGIT): 1
  L: (DIGIT): 2
  L: (DIGIT): 3
  L: (DIGIT): 4
  L: (DIGIT): 5
  L: (INTEGER): 12345

Many thanks and greetings
Udo
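The behaviour described in this reply can be modelled without ANTLR. The following is only a minimal sketch, assuming the lexer keeps a single text buffer for the token being built and that the action in the fragment rule reads that same buffer; the class and method names (`SharedTextDemo`, `trace`) are made up for illustration and are not part of the ANTLR runtime.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a lexer: NOT ANTLR code, just a model of one shared text buffer per token. */
final class SharedTextDemo {

    /** Returns the lines the two actions would print for the given run of digits. */
    static List<String> trace(String digits) {
        List<String> out = new ArrayList<>();
        StringBuilder tokenText = new StringBuilder();   // text of the INTEGER token so far
        for (char c : digits.toCharArray()) {
            tokenText.append(c);                         // the DIGIT fragment consumes one character
            out.add("L: (DIGIT): " + tokenText);         // the action in DIGIT sees the whole buffer
        }
        out.add("L: (INTEGER): " + tokenText);           // the action at the end of INTEGER
        return out;
    }

    public static void main(String[] args) {
        for (String line : trace("12345")) {
            System.out.println(line);
        }
    }
}
```

Running `main` prints the growing prefixes 1, 12, 123, ... for DIGIT, which matches what Udo observed rather than what he expected.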
[il-antlr-interest: 33165] Re: [antlr-interest] ANTLR 3.1.2: Simplest action
Hi Udo,

Ah yes, apologies: you can only reference parser rules like that:

  parserRule : LexerRule { print($parserRule.text) } ;

and *not*:

  LexerRule : OtherLexerRule { print($LexerRule.text) } ;

Regards, Bart.

On Wed, Jul 13, 2011 at 10:27 PM, Udo Weik weikeng...@aol.com wrote:
Hello again Bart,

`$text` is just a shorthand notation for `$XYZ.text` where `XYZ` is the rule you're currently in.

$INTEGER.text doesn't work (rule INTEGER has no defined parameters), but your $text does. That's what I don't understand ;(.

Many thanks and greetings
Udo
[il-antlr-interest: 33092] Re: [antlr-interest] @members section in tree grammar
Hi Shane,

Inside a combined, lexer, or parser grammar, the `tokens` section should be placed before the `members` section(s):

  (lexer | parser)? grammar ...
  options { ... }
  tokens { ... }
  @header { ... }    (or: @parser::header { ... }, @lexer::header { ... })
  @members { ... }   (or: @parser::members { ... }, @lexer::members { ... })

But a tree grammar gets its tokens from the `tokenVocab` key:

  tree grammar ExprWalker;
  options { tokenVocab=Expr; ASTLabelType=CommonTree; }

So no `tokens` section inside a tree grammar.

Regards, Bart.

On Sun, Jul 10, 2011 at 9:31 PM, Shane srber...@gmail.com wrote:
I can put an @members section in a grammar without any problem, but when I put one in a tree grammar, I get a bunch of exceptions. It ignores everything after the @members section. BTW, I'm trying to get access to the error output, so I can show it to the user.

Exception: unexpected token: tokens {

  grammar Expr;
  options { output=AST; ASTLabelType=CommonTree; backtrack=true; }
  @members { public String getTokenErrorDisplay(Token t) { return t.toString(); } }
  tokens { DIV = '/' ; EQUAL = '==' ; GREATER_OR_EQUAL = '>=' ; GREATER_THAN = '>' ; ...

Can tree grammars handle @members sections? Or am I doing something wrong?

Thanks, srb
[il-antlr-interest: 33082] Re: [antlr-interest] Parsing Question
Hi Chris,

Very original! :)

Try to do more in lexer rules. Some of your keywords may also be part of the instruction phrase: you need to be aware of that. How about something like this:

  grammar KnittingGrammer;

  parse : instruction+ EOF ;
  instruction : section FullStop | castOn FullStop ;
  section : NumberDecoration Section | Section NumberDecoration ;
  castOn : CastOn Number Stitches+ anyWordExceptStitches anyWord* ;
  anyWordExceptStitches : NumberDecoration | Number | Section | Word ;
  anyWord : NumberDecoration | Number | Section | Stitches | Word ;

  NumberDecoration : Digit+ ('st' | 'nd' | 'rd' | 'th') ;
  Number : Digit+ ;
  FullStop : ':' | ';' | ',' | '.' | '\n' | '\r' ;
  Section : 'section' ;
  CastOn : 'cast' Space+ 'on' | 'co' ;
  Stitches : 'stitch' 'es'? | '(sts)' | 'sts' ;
  Space : (' ' | '\t') {skip();} ;
  Word : ('a'..'z' | 'A'..'Z')+ ;
  fragment Digit : '0'..'9';

which parses input like this properly:

  1st section: cast on 63 stitches (sts) and work in pattern section as follows:

Regards, Bart.

On Fri, Jul 8, 2011 at 2:15 AM, Chris Wegener ch...@wegenerconsulting.com wrote:
Dear Friends-

I am attempting to define a language that will let me parse knitting instructions. (Don't ask.) By and large it is a well understood convention with standard abbreviations and phrases. Occasionally the originator will insert a phrase in the instructions that is not directly relevant. What I would like is to parse out those words and deal with them around the issue of reading the instructions. I have tried:

  text :(letter)+;
  letter : ('a'..'z' | 'A'..'Z');
  WS :(' ' | '\n' | '\r');

And it doesn't work at all. I changed it to:

  text : (letter)+;
  letter :~('' | '\\');
  WS : (' ' | '\n' | '\r');

That works, but becomes unwieldy very quickly when I start including all of the things I do know to scan for. I have attached the KnittingGrammer.g file with my rules.

For example:

  1st Section: Cast on 63 stitches (sts) and work in pattern as follows:

is parsed into '1st Section:' and 'Cast on 63 stitches (sts)' which leaves the text until the colon which is the stop character. I would like to parse the 'and work in pattern as follows' into the parse tree under text so I can inspect it or lex it separately or even display it to the user.

What am I missing or doing wrong? My thanks for your help in advance.

Regards, Chris
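The advice to "do more in lexer rules" can be illustrated outside ANTLR with a plain regex tokenizer. This is only a rough sketch: ordered regex alternation is not how an ANTLR lexer chooses between rules, and the class name `KnittingTokens` and the `Type:text` output format are invented for this example. It does show the point made above, that a keyword such as `section` is tokenized as a keyword even when it appears inside the free-text part of an instruction, which is why the `anyWord` rule has to accept it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical regex stand-in for the lexer rules in the grammar above. */
final class KnittingTokens {

    // Token types, listed in the order the alternatives are tried.
    private static final String[] TYPES = {
        "NumberDecoration", "Number", "FullStop", "Section", "CastOn", "Stitches", "Word"
    };

    // Specific tokens come first; the generic Word comes last.
    private static final Pattern TOKEN = Pattern.compile(
          "(?<NumberDecoration>\\d+(?:st|nd|rd|th)\\b)"
        + "|(?<Number>\\d+)"
        + "|(?<FullStop>[:;,.\\n\\r])"
        + "|(?<Section>section\\b)"
        + "|(?<CastOn>cast[ \\t]+on\\b|co\\b)"
        + "|(?<Stitches>stitch(?:es)?\\b|\\(sts\\)|sts\\b)"
        + "|(?<Word>[a-zA-Z]+)");

    /** Returns tokens as "Type:text"; characters no alternative matches (spaces) are skipped. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            for (String type : TYPES) {
                if (m.group(type) != null) {   // exactly one named group matched
                    tokens.add(type + ":" + m.group(type));
                    break;
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "1st section: cast on 63 stitches (sts) and work in pattern section as follows:";
        for (String token : tokenize(input)) {
            System.out.println(token);
        }
    }
}
```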
[il-antlr-interest: 33033] Re: [antlr-interest] Line start vs non-line ...
Hi James,

Something like this would do the trick:

  lines : (LineStartingWithPlus | LineNotStartingWithPlus | LineBreak)* EOF ;

  LineStartingWithPlus : '+' ~('\r' | '\n')* ;
  LineNotStartingWithPlus : ~'+' ~('\r' | '\n')* ;
  LineBreak : ('\r'? '\n' | '\r') ;

Regards, Bart.

On Tue, Jul 5, 2011 at 12:18 PM, James Ladd james_l...@hotmail.com wrote:
Hi All,

I hope this is a simple request to answer. I have a simple preprocessor I want to write but I can't get the rules right. I think I am regex challenged. In simple terms I want to have this:

  lines : (lineStartingWithPlus | lineNotStartingWithPlus)* ;

If these were rules A and B respectively then ...

  + this line matches A
  this line matches B

Please can someone help?

Rgs, James.
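For comparison, the same line classification can be sketched in plain Java. This is an illustrative stand-in rather than generated ANTLR code; the class name `LineKinds` and the labels "A" and "B" (taken from James's question) are assumptions of this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: classifies lines the way the three lexer rules above would. */
final class LineKinds {

    /** Returns "A" for each line starting with '+', "B" for every other non-empty line. */
    static List<String> classify(String input) {
        List<String> kinds = new ArrayList<>();
        // Split on \r\n, \n or a lone \r, mirroring the LineBreak rule.
        for (String line : input.split("\\r?\\n|\\r")) {
            if (line.isEmpty()) {
                continue;                       // an empty line yields only LineBreak tokens
            }
            kinds.add(line.charAt(0) == '+' ? "A" : "B");
        }
        return kinds;
    }

    public static void main(String[] args) {
        System.out.println(classify("+ this line matches A\nthis line matches B\r\n"));
    }
}
```

As in the grammar, only the very first character of a line decides the outcome, so a line with leading whitespace before the '+' is classified as B.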
[il-antlr-interest: 32989] Re: [antlr-interest] rule parameter question
Hi Mark,

I presume you didn't see my answer on Stackoverflow: http://stackoverflow.com/questions/6529359/how-to-pass-commontree-parameter-to-an-antlr-rule ? If you did, is there anything that wasn't clear?

Regards, Bart.

On Thu, Jun 30, 2011 at 2:26 PM, Mark Truluck mark.trul...@cogiton.com wrote:
Hello,

I am trying to do what I think is a simple parameter passing to a rule in ANTLR 3.3:

  grammar rule_params;
  options { output = AST; }
  rule_params : outer;
  outer: outer_id '[' inner[$outer_id.tree] ']';
  inner[CommonTree parent]: inner_id '[' ']';
  outer_id: '#'! ID;
  inner_id: '$'! ID ;
  ID : ('a'..'z' | 'A'..'Z') ('a'..'z' | 'A'..'Z' | '0'..'9' | '_' )* ;

However the inner[CommonTree parent] generates the following:

  inner4=inner((outer_id2!=null?((Object)outer_id2.tree):null));

Resulting in this error:

  The method inner(CommonTree) in the type rule_paramsParser is not applicable for the arguments (Object)

As best I can tell, this is exactly the same as the example in the ANTLR book: classDefinition[CommonTree mod] (Kindle Location 3993) - sorry I don't know the page number but it is in the middle of the book in chapter 9, section labeled Creating Nodes with Arbitrary Actions.

Thanks for any help.
Mark Truluck
COGITON, Inc.
[il-antlr-interest: 32991] Re: [antlr-interest] rule parameter question
No problem: SO _does_ work with notifications, but they only go out once a day or so by default. Good to hear it worked.

Regards, Bart.

On Thu, Jun 30, 2011 at 3:16 PM, Mark Truluck mark.trul...@cogiton.com wrote:
Hi Bart – sorry – thought I'd get an email from them. That worked perfectly – thanks very much.
Mark
[il-antlr-interest: 32992] Re: [antlr-interest] rule parameter question
Just for the record, in case someone stumbles upon this post, the answer from SO:

If you don't explicitly specify the tree to be used in your grammar, .tree (which is short for getTree()) will return a java.lang.Object and a CommonTree will be used as default Tree implementation. To avoid casting, set the type of tree in your options { ... } section:

  options { output=AST; ASTLabelType=CommonTree; }
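The compile error in this thread comes down to ordinary Java typing, which can be shown with stand-in classes. This is a sketch under stated assumptions: `Node`, `UntypedReturn` and `TypedReturn` are hypothetical types invented here to play the roles of the tree class and the generated rule return values; they are not the classes ANTLR actually generates.

```java
/** Hypothetical model of why an Object-typed tree field forces a cast. */
final class LabelTypeDemo {

    /** Stand-in for a concrete tree node class. */
    static final class Node {
        final String text;
        Node(String text) { this.text = text; }
    }

    /** Models a rule return value when no tree type is configured: the field is a plain Object. */
    static final class UntypedReturn { Object tree; }

    /** Models the same return value once the tree type is configured. */
    static final class TypedReturn { Node tree; }

    /** Plays the role of a rule that takes a typed tree parameter. */
    static String inner(Node parent) {
        return "parent=" + parent.text;
    }

    static String viaUntyped(String id) {
        UntypedReturn r = new UntypedReturn();
        r.tree = new Node(id);
        // inner(r.tree) would not compile here: an Object is not a Node.
        return inner((Node) r.tree);           // explicit cast required
    }

    static String viaTyped(String id) {
        TypedReturn r = new TypedReturn();
        r.tree = new Node(id);
        return inner(r.tree);                  // no cast needed
    }

    public static void main(String[] args) {
        System.out.println(viaUntyped("outer"));
        System.out.println(viaTyped("outer"));
    }
}
```

Setting the tree type in the grammar options corresponds to moving from the `UntypedReturn` shape to the `TypedReturn` shape, so the call type-checks without a cast.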
[il-antlr-interest: 32998] Re: [antlr-interest] Ignore Mismatched input and No viable alternative
Can you give an example of your input, and explain what line you want to ignore?

Regards, Bart.

On Thu, Jun 30, 2011 at 10:33 PM, Nilo Roberto C Paim nilop...@gmail.com wrote:
Hi all,

I'm trying to parse a text file created by an application. There are lines in this text file whose contents and format I know perfectly, and these lines are correctly parsed. Otherwise, there are lines that can't be recognized by any syntax I can think of. When I parse the file with my syntax, I'm getting several occurrences of Mismatched input and No viable alternative on the lines that I can't recognize.

What I want is simply to throw away these invalid lines, as if they were absent from the text file. For my purposes, these lines are invisible. How can I do that, ignoring the lexer/parser errors?

TIA
Nilo - Brasil
[il-antlr-interest: 33000] Re: [antlr-interest] Ignore Mismatched input and No viable alternative
Perhaps, but Nilo is talking about discarding complete lines. Chances are that somewhere in these ignored lines, text exists that looks like tokens that _do_ need to be kept. That's why I asked for more info.

Regards, Bart.

On Thu, Jun 30, 2011 at 10:40 PM, Robin diabete...@gmail.com wrote:
You could use the filter=true grammar option to discard tokens that don't match anything
[il-antlr-interest: 32978] Re: [antlr-interest] Someting about range (to change)
Hi Fabien,

Can you post the grammar that produces the error?

Bart.

On Wed, Jun 29, 2011 at 5:57 AM, Fabien Hermenier hermenierfab...@gmail.com wrote:
Hi

I am sorry, but this is another problem related to sequences of integers. I don't understand my error(s), despite several passes over the wiki page related to this use case. I want to parse a string with a sequence parameter in it, such as toto-[2 ..3]-toto. Here is a short version of the grammar that works perfectly:

  grammar tryout;
  options { k=3; }
  fragment Digit : '0'..'9';
  DEC_NUMBER : '1'..'9' Digit*;
  fragment Letter : 'a'..'z'|'A'..'Z';
  WS : ('\n'|'\r'|'\t'|' ') {$channel=HIDDEN;};
  LEFTY : (Letter|Digit) (Letter|Digit|'-')* '[';
  RIGHTY : ']' (('-' (Letter|Digit))|Letter|Digit)* ;
  number : DEC_NUMBER; //HEX_NUMBER, OCT_NUMBER, ... are following but removed for this example.
  range : LEFTY number '..' number RIGHTY;

This grammar accepts toto-[1..3]-toto or toto-[1 .. 3]-toto. Now, I want to be able to accept the LEFTY token or the RIGHTY token even if they contain a '.' inside (not at the beginning). So I've modified LEFTY as follows:

  LEFTY : (Letter|Digit) ('.'|Letter|Digit|'-')* '[';

Now, ANTLR no longer accepts toto-[1..3]-toto. It requires at least one space between the first number and the range. I have read the wiki page related to range, integer, and so on. But in my case, I don't see where my grammar is ambiguous as no token can start with a '.'. So it seems there is a concept I don't get. Can anyone try to help me?

Thanks in advance,
Fabien.
[il-antlr-interest: 32948] Re: [antlr-interest] ANTLRWorks Interpreter
The interpreter is only suitable for very small grammars (without predicates!). For more complicated grammars, use ANTLRWorks' debugger instead.

Regards, Bart.

On Sat, Jun 25, 2011 at 10:25 PM, Mike Kappel mkap...@appfluent.com wrote:
I just downloaded ANTLRWorks and tried the example expression grammar. I type in a simple expression into the Interpreter and see the generated parse tree. Fine. I then load the SQL 2003 grammar and type in a simple Insert statement. I click the arrow and it never returns from Interpreting (Operation in progress). Can ANTLRWorks handle a more complex grammar?

Dr. Michael R. Kappel
mkap...@appfluent.com
[il-antlr-interest: 32878] Re: [antlr-interest] Token Stream Rewriting
Is it _really_ returning a String:

  private String merge(String s1, List lst, String s2) {
    ...
    return ...
  }

? Not an Object _you think_ is a String:

  private Object merge(String s1, List lst, String s2) {
    ...
    return ...
  }

? I ask because the stack trace you posted:

  Caused by: java.lang.ClassCastException: java.util.ArrayList

seems to suggest it is an ArrayList. Either way, a String or an ArrayList, both are wrong: that method needs to return a Tree.

Bart.

On Wed, Jun 22, 2011 at 8:39 AM, Fabien Hermenier hermenierfab...@gmail.com wrote:
merge(...) is returning a String.
Fabien.

Le 22/06/11 00:36, Bart Kiers a écrit :
Fabien, but what is the return type of this `merge(...)` method? Could you post the method? Or even better: post a SSCCE (http://sscce.org) that causes such an exception?
Regards, Bart.

On Wed, Jun 22, 2011 at 8:30 AM, Fabien Hermenier hermenierfab...@gmail.com wrote:
In fact, I misread the help. So yes, it is returning a String that should be tokenized (then translated into a tree, I suppose) at run time.

Le 22/06/11 00:28, Bart Kiers a écrit :
Is your `merge(String, List, String)` method returning a java.util.ArrayList instead of a Tree?
Regards, Bart.

On Wed, Jun 22, 2011 at 8:04 AM, Fabien Hermenier hermenierfab...@gmail.com wrote:
Hi

I have some trouble with token stream rewriting. Below is the piece of ANTLR code. I have a grammar, with an AST as output and Java as the target. I want to insert a sequence of tokens into the stream. I have followed the page http://www.antlr.org/wiki/display/~admin/2007/06/28/Token+stream+rewriting+with+rewrite+rules and adapted the example that interests me. A piece of the code follows.

Basically, in the alternative of 'explodedSet', I get the return values of other rules and do some stuff in the merge method. This one returns a list of String as explained in the online example.

  explodedSet
    : '{' (setContent (',' setContent)*)? '}' -> ^(EXPLODED_SET setContent+)
    | {List l = new LinkedList();} LEFTY r1=brace_content {l.add($r1.ret);} (',' r2=brace_content {l.add($r2.ret);})* RIGHTY -> { merge($LEFTY.text,l,$RIGHTY.text) }
    ;

  brace_content returns [List ret]
    : st=number ('..' ed=number)? {$ret = new LinkedList(); for (int i = $st.val; i <= $ed.val; i++) {$ret.add(i);}}
    | NAME {$ret = new LinkedList(); $ret.add($NAME.text);}
    ;

The code compiles well but at runtime, I get this exception:

  Caused by: java.lang.ClassCastException: java.util.ArrayList cannot be cast to org.antlr.runtime.tree.Tree
    at org.antlr.runtime.tree.BaseTreeAdaptor.addChild(BaseTreeAdaptor.java:107)
    at Parser.explodedSet(Parser.java:560)

So, the return value of merge does not seem to be converted into tokens or a Tree. Does someone have an idea?

Fabien.
-- Fabien Hermenier Postdoctoral researcher at Flux http://sites.google.com/site/hermenierfabien/home
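The ClassCastException in this thread can be reproduced in miniature. The sketch below is an assumption-laden model, not the ANTLR runtime: `Tree`, `Leaf` and `addChild` are hypothetical stand-ins for an adaptor that receives the action's result as an Object and casts it to its tree interface.

```java
import java.util.ArrayList;

/** Hypothetical model of an adaptor that casts whatever a rewrite action returns to a tree type. */
final class RewriteActionDemo {

    /** Stand-in for the tree interface the adaptor expects. */
    interface Tree {
        String text();
    }

    /** Stand-in for a concrete tree node. */
    static final class Leaf implements Tree {
        private final String text;
        Leaf(String text) { this.text = text; }
        public String text() { return text; }
    }

    /** Accepts any Object, like an untyped adaptor method, and casts it to Tree. */
    static String addChild(Object child) {
        return ((Tree) child).text();          // fails at run time for anything that is not a Tree
    }

    /** Reports whether the cast succeeded instead of letting the exception escape. */
    static String tryAdd(Object child) {
        try {
            return "ok: " + addChild(child);
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryAdd(new Leaf("a")));              // a real tree node is accepted
        System.out.println(tryAdd(new ArrayList<String>()));    // a list is rejected
        System.out.println(tryAdd("a"));                        // so is a plain String
    }
}
```

Because the parameter is typed as Object, the compiler cannot catch the mistake; it only surfaces at run time, which is consistent with the grammar compiling cleanly and then failing.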
[il-antlr-interest: 32844] Re: [antlr-interest] test release of antlr 3.4
The file `/META-INF/MANIFEST.MF` is missing the 'Main-Class' attribute. Add the line:

  Main-Class: org.antlr.Tool

to the file (inside the JAR) and all should be OK.

Regards, Bart.

On Mon, Jun 20, 2011 at 6:28 PM, Julien BLACHE j...@jblache.org wrote:
A Z asicaddr...@gmail.com wrote:
Hi, How is this jar different than 3.2? I tried the same command

  java -jar antlr-3.4.jar grammar.g

but I get an error message:

  Invalid or corrupt jarfile antlr-3.4.jar

Same issue here. It works when invoked this way

  java -cp antlr-3.4.jar org.antlr.Tool grammar.g

I'll leave it up to the Java-literate to investigate/explain/fix ;)
(Sun^WOracle Java 1.6.0_26 if it makes any difference)

JB.
-- Julien BLACHE http://www.jblache.org j...@jblache.org GPG KeyID 0xF5D65169
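Whether a manifest carries a Main-Class entry can be checked with the standard `java.util.jar` API. The following is a minimal sketch that parses manifest text held in a string; the class name `ManifestCheck` and the sample manifest contents are assumptions made for this example, not the contents of the actual antlr-3.4.jar.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

/** Hypothetical helper: reads the Main-Class attribute from manifest text. */
final class ManifestCheck {

    /** Returns the Main-Class value, or null when the attribute is absent. */
    static String mainClass(String manifestText) {
        try {
            Manifest manifest = new Manifest(
                new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);   // malformed manifest text
        }
    }

    public static void main(String[] args) {
        // Sample manifests made up for the demonstration.
        System.out.println(mainClass("Manifest-Version: 1.0\nMain-Class: org.antlr.Tool\n"));
        System.out.println(mainClass("Manifest-Version: 1.0\n"));
    }
}
```

For a JAR on disk, the same `Manifest` object can be obtained from `java.util.jar.JarFile.getManifest()` instead of being parsed from a string.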
[il-antlr-interest: 32770] Re: [antlr-interest] Facing NoSuchMethodError:CommonTreeNodeStream.getNodeIndex(obj) after upgrating ANTLR V3.1.1 to ANTLR v3.3
Meena,

You appear to be running a v3.1 lexer and parser with the run-time classes from v3.3. On Stackoverflow I recommended you generate a new lexer and parser using ANTLR v3.3 (and compile them!) which you can then use with the ANTLR v3.3 run-time classes. Did you do that already?

Bart.

On Tue, Jun 14, 2011 at 1:04 PM, meena.subraman...@wipro.com wrote:
Hi All,

Currently I am using Antlr v3.1.1. Now I have upgraded it to Antlr v3.3; after that I am getting the following error while compiling my java files with the latest version of Antlr. I am facing some compilation issues so I am not able to build. The error message is given below:

  UnexpectedException occurred: java.lang.NoSuchMethodError: Org.antlr.runtime.CommonTreeNodeStream.getNodeIndex(Ljava/lang/object;) I
  At *.expressoin.FocusCommonTreeNodeStream.getNodeIndex(FocusCommonTreeNodeStream.java)

Can you please suggest some solution for the above problem?

Observation: As per my analysis, the CommonTreeNodeStream.getNodeIndex(obj) method is present up to Antlr jar v3.1.2, but the releases after this version do not contain it. So v3.3 does not contain the method getNodeIndex(obj), and that is why it is throwing NoSuchMethodError. Can you please clarify: is there any equivalent method available in v3.3 for the getNodeIndex(obj) method?

Thanks in Advance!!!
Regards, Meena.
[il-antlr-interest: 32771] Re: [antlr-interest] Facing NoSuchMethodError:CommonTreeNodeStream.getNodeIndex(obj) after upgrating ANTLR V3.1.1 to ANTLR v3.3
Yes, you're running it all with v3.3, but did you generate a new lexer and parser from your grammar using v3.3? Bart. On Tue, Jun 14, 2011 at 1:43 PM, meena.subraman...@wipro.com wrote: Bart, Yes, I did the following to use ANTLR v3.3 (we download the ANTLR jars from Maven):
1. I changed my pom.xml from: <dependency> <groupId>org.antlr</groupId> <artifactId>antlr-runtime</artifactId> <version>3.1.1</version> </dependency> to: <dependency> <groupId>org.antlr</groupId> <artifactId>antlr-complete</artifactId> <version>3.3</version> </dependency>
2. Renamed the jar from antlr-3.3-complete to antlr-complete-3.3.
3. Ran the Maven command below to install the new jar into the local repository: mvn install:install-file -DgroupId=org.antlr -DartifactId=antlr-complete -Dversion=3.3 -Dpackaging=jar -Dfile=local\Path\to\antlr-complete-3.3.jar
4. Ran the Maven command below to compile: mvn install -DskipTests=true
After following the above steps I got the error below: Build Failure: Compilation Error: NoSuchMethodError… and the error message clearly indicates that the getNodeIndex(obj) method is missing. Thanks, Meena.
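A `NoSuchMethodError` like the one in this thread generally means that code compiled against one version of a class is running against another version that lacks the method. As a rough way to see what the classpath really provides, the sketch below uses plain reflection. `MethodCheck` and `hasPublicMethod` are hypothetical names invented for this illustration, not part of ANTLR, and the defaults use a JDK class so the sketch runs without the ANTLR jar.

```java
import java.lang.reflect.Method;

// Hypothetical diagnostic helper (not part of ANTLR): reports whether the
// version of a class found on the current classpath exposes a public method.
final class MethodCheck {

    // True if the class declares or inherits a public method with this name.
    static boolean hasPublicMethod(Class<?> type, String methodName) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // The defaults are JDK stand-ins so this runs without ANTLR. With the
        // ANTLR jar on the classpath, one could pass the class and method
        // names from the error message as the two arguments instead.
        String className = args.length > 0 ? args[0] : "java.util.ArrayList";
        String methodName = args.length > 1 ? args[1] : "size";
        Class<?> type = Class.forName(className);
        System.out.println(className + "." + methodName + " present: "
                + hasPublicMethod(type, methodName));
    }
}
```

If the method is reported as missing, regenerating and recompiling the lexer, parser and any subclasses against the same version as the run-time jar, as suggested above, is the usual remedy.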
[il-antlr-interest: 32703] Re: [antlr-interest] issue with antlr requiring a whitespace at a specific place
Well, your grammar is now quite different from the few rules you posted (more rules, and not everything is visible, so I can't test it myself). All I can say is that the interpreter in ANTLRWorks has quite a few odd quirks, so it is best not to use it. If something seems odd in the interpreter, either create a little test rig yourself or use ANTLRWorks' debugger to find out whether the error lies in your grammar or in the interpreter. Good luck! Regards, Bart. On Wed, Jun 8, 2011 at 1:04 PM, Olivier Sallou olivier.sal...@irisa.fr wrote: For the same input, I get a mismatched token. I simplified it as much as possible (see the first line of the attached screenshot). However, in the interpreter tab of my editor (ANTLRWorks) I see: Ignore rules: WHITESPACE. I wonder why; I did not ask for this, and I do not see how to remove it. Maybe this occurs in the generated code too. Olivier On 6/8/11 12:58 PM, Bart Kiers wrote: Hi Olivier, I can't reproduce it. I tested with ANTLRWorks 1.4.2 as well. See the attached screenshot. Regards, Bart. On Wed, Jun 8, 2011 at 11:23 AM, Olivier Sallou olivier.sal...@irisa.fr wrote: Hi, I have an issue with ANTLRWorks (1.4.2), where a specific grammar requires a whitespace. I upgraded from ANTLRWorks 1.1.7, where the same grammar did not require the whitespace. Example: '?' string | '%' string ':' percentage=INT | ... string: '' LOWID ''; LOWID: ('a'..'z'|'\-')+; INT : ('0'..'9')+ ; If I call my example rules with: ?\acgt\ it works fine, but if I call %\acgt\:30 it fails. If I add a whitespace between % and \acgt\, it works: % \acgt\:30 I really can't understand why a whitespace is required here, and only here. Thanks for your help, Olivier
[il-antlr-interest: 32685] Re: [antlr-interest] AST with optional parameters
Hi David, Try this: (ID GETS) => ID GETS expr SEMI? -> ^(GETS ID expr SEMI?) Regards, Bart. On Tue, Jun 7, 2011 at 1:41 PM, David Smith david.sm...@cc.gatech.edu wrote: I'm parsing a grammar in which the semicolon at the end of a line is optional. So two of the statement rules might be: | (ID GETS expr SEMI) => ID GETS expr SEMI -> ^(GETS ID expr SEMI) | (ID GETS) => ID GETS expr -> ^(GETS ID expr) Since this occurs with a number of different assignment statements, I would really like to collapse this into one rule that looks something like this: | (ID GETS) => ID GETS e=expr (s=SEMI)? -> ^(GETS ID $e $s) but every implementation I can think of either refuses to generate the grammar or, as in the case above, generates the grammar but decides that the variable 's' is unknown. Is there any way to achieve this? DMS David M. Smith http://www.cc.gatech.edu/fac/David.Smith Georgia Institute of Technology, College of Computing Sent from my ASR-33 Teletype
[il-antlr-interest: 32697] Re: [antlr-interest] AST with optional parameters
Jim, From an earlier message, David wrote: *Yes, the language is Matlab and a semicolon on the end of an assignment expression suppresses display of the result of the assignment. ...* Bart. On Tue, Jun 7, 2011 at 6:30 PM, Jim Idle j...@temporal-wave.com wrote: Why do you want the SEMI in your AST? Jim
[il-antlr-interest: 32660] Re: [antlr-interest] Any way to search the ANTLR interest archives?
Hi George, Sure, use Markmail: http://antlr.markmail.org/ Regards, Bart. On Sun, Jun 5, 2011 at 3:59 PM, George Spears geo...@woh.rr.com wrote: Hello, The ANTLR interest archives have a lot of valuable information in them (http://www.antlr.org/pipermail/antlr-interest/). Is there any way to search them (other than downloading everything to my local computer)? If not, this would be a nice feature to have. Thanks, George Spears
[il-antlr-interest: 32637] Re: [antlr-interest] Accentuated chars in brazilian portuguese
Hi Nilo, The grammar: grammar Brasil; parse : WORD EOF ; WORD : ('\u00c0'..'\u00ff' | 'a'..'z' | 'A'..'Z' | '-')+ ; parses the input não just fine in ANTLRWorks. I'm not really familiar with C#, so for the benefit of those who are, could you post *how* you are testing it? (Post a test rig that shows the behavior you describe.) Regards, Bart. On Wed, Jun 1, 2011 at 10:53 PM, Nilo Roberto C Paim nilop...@gmail.com wrote: Hi all, I'm a newbie with ANTLR and I'm facing a problem when trying to parse text that contains accented characters in Brazilian Portuguese. I've put a word definition in my grammar as follows: WORD : ( '\u00c0'..'\u00ff' | 'a'..'z' | 'A'..'Z' | '-' )+ ; But I have no success parsing. Words like não (no in Portuguese) cause the lexer to throw Antlr.Runtime.NoViableAltException. I'm trying to use C#. Any hint? TIA Nilo, from Brasil...
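Since the same WORD rule accepts não in one setup and rejects it in another, one thing worth ruling out is how the input is decoded before it reaches the lexer. The thread does not confirm this as the cause, so treat the following as a sketch under that assumption. It is written in Java rather than C#, `WordRangeCheck` is a hypothetical class that mirrors the character ranges of the WORD rule, and it shows what happens when UTF-8 bytes are read as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the WORD lexer rule; not ANTLR-generated code.
final class WordRangeCheck {

    // Same character set as the WORD rule in the grammar above.
    static boolean isWordChar(char c) {
        return (c >= '\u00c0' && c <= '\u00ff')
                || (c >= 'a' && c <= 'z')
                || (c >= 'A' && c <= 'Z')
                || c == '-';
    }

    // True if the text is one or more WORD characters, like the ( ... )+ loop.
    static boolean isWord(String text) {
        if (text.isEmpty()) {
            return false;
        }
        for (int i = 0; i < text.length(); i++) {
            if (!isWordChar(text.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The word is written with an escape so the source file encoding is irrelevant.
        String word = "n\u00e3o";
        byte[] utf8 = word.getBytes(StandardCharsets.UTF_8);
        // Simulates reading a UTF-8 file with the wrong charset.
        String misread = new String(utf8, StandardCharsets.ISO_8859_1);

        System.out.println(isWord(word));    // true
        System.out.println(isWord(misread)); // false: the second byte decodes to U+00A3
    }
}
```

The second check fails because the two UTF-8 bytes of ã decode to U+00C3 and U+00A3, and U+00A3 lies outside every range in the rule.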
[il-antlr-interest: 32609] Re: [antlr-interest] Do you need an ANTLR programmer?
On Mon, May 30, 2011 at 9:23 PM, ante...@freemail.hu wrote: On 5/30/2011 11:20 PM, Bart Kiers wrote: On Mon, May 30, 2011 at 9:13 PM, ante...@freemail.hu wrote: On 5/30/2011 10:41 PM, Bart Kiers wrote: Could you stop spamming the ANTLR mailing list please? Bart. You may not know this, but as I was worried that this mail could be perceived as spam, I asked Terence Parr whether it was OK to send a mail here. To my surprise, he said yes. How could I know? You might have included that information from the get-go. I am sure that I am not the only one annoyed by such messages. And are you planning to spam this mailing list on a regular basis? Or just once? Bart. Now you know. Sorry for not mentioning it in the mail (I considered it). I am not sure if you agree with me, but if it is allowed, it cannot be called spam. Well, the overall definition of spam is this: "Spam is the use of electronic messaging systems [...] to send unsolicited bulk messages indiscriminately." It _is_ unsolicited since no one asked for it. You may have gotten approval from someone, but that doesn't mean it's not unsolicited. So yes, it sure is spam. At the moment, just this once. I really hope so. If every self-employed developer starts spamming the list, it'd become a mess. Bart. PS. I cc-ed the list so that others are aware of the fact that it's now okay to advertise oneself here. Márton [1] http://en.wikipedia.org/wiki/Spam_(electronic)
[il-antlr-interest: 32611] Re: [antlr-interest] options greedy : getting the tokens consumed during the greedy match
Hi Vijay, You could grab all matched text in the `@after` block using the `getText()` method: COMMENT @init{ boolean isJavaDoc = false; System.out.println("Entering comment"); } @after { System.out.println("Leaving comment, matched: " + getText()); } : '/*' { if((char)input.LA(1) == '*') { isJavaDoc = true; } } (options {greedy=false;} : . )* '*/' ; Regards, Bart Kiers. On Mon, May 30, 2011 at 8:08 PM, Vijay Raj call.vijay...@yahoo.com wrote: Hi - I am trying to parse a given Java file with a code fragment that consumes comments, as below (code fragment taken from Java.g, posted on the ANTLR site, to give credit where it is due): COMMENT @init{ boolean isJavaDoc = false; System.out.println("Entering comment"); } : '/*' { if((char)input.LA(1) == '*'){ isJavaDoc = true; } } (options {greedy=false;} : . )* '*/' ... ; I am trying to get all the characters matched by the wildcard, as in the 'options greedy' line in the grammar file, and get the string into the Java world for further processing. What hidden system variables or grammar constructs should I use to do this?
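For readers who want to see the effect of the non-greedy loop outside a generated lexer, here is a rough analogy in plain `java.util.regex`. It is not what ANTLR generates. `CommentText` and its methods are hypothetical names, and the reluctant quantifier `.*?` plays the role that `greedy=false` plays in the COMMENT rule: stop at the first `*/` rather than the last one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical regex analogy for the COMMENT rule; not ANTLR-generated code.
final class CommentText {

    // "/*", then as few characters as possible, then "*/". DOTALL lets '.'
    // cross line breaks, as the lexer wildcard does.
    private static final Pattern BLOCK_COMMENT =
            Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL);

    // Returns the full text of every block comment, delimiters included,
    // which is what getText() would hold at the end of the lexer rule.
    static List<String> comments(String source) {
        List<String> found = new ArrayList<String>();
        Matcher m = BLOCK_COMMENT.matcher(source);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    // Mirrors the LA(1) == '*' test performed right after "/*" in the rule.
    static boolean isJavaDoc(String comment) {
        return comment.startsWith("/**");
    }

    public static void main(String[] args) {
        String source = "int a; /* one */ int b; /** two */";
        for (String c : comments(source)) {
            System.out.println(c + " javadoc=" + isJavaDoc(c));
        }
    }
}
```

With a greedy `.*` the pattern would swallow everything from the first `/*` to the last `*/` as one match, which is the behavior `greedy=false` avoids.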
[il-antlr-interest: 32612] Re: [antlr-interest] Do you need an ANTLR programmer?
On Mon, May 30, 2011 at 10:38 PM, Jim Idle j...@temporal-wave.com wrote: It has always been OK, but there is obvious common sense involved, such as not posting such messages every week. For a start, I make a lot of my living writing professional ANTLR grammars and occasionally, you need to ask for work... which reminds me... But, in general I would shy away from appointing yourself unofficial arbitrator of the list. The list is basically whatever Ter says it is; the poster was polite enough to ask if it was OK to post and so that's that. Jim As I said: he could have mentioned that Terence gave him permission. Note that he sent the message twice, in rapid succession. Bart.
[il-antlr-interest: 32565] Re: [antlr-interest] question about antlr
Hi Patrick, I have a hard time understanding what you're trying to do. Instead of springing the entire grammar on us (or me), could you extract only those rules that are causing the problem(s)? And could you include some example input you'd like to match/parse? Regards, Bart Kiers. On Thu, May 26, 2011 at 4:04 PM, Patrick Hofman patrick.hof...@invantive.com wrote: And now the grammar From: Patrick Hofman Sent: Thursday, May 26, 2011 15:35 To: antlr-interest@antlr.org Subject: question about antlr Hi all, I have bought the ANTLR book in order to learn ANTLR better, but I still don't see how to fix one thing. I hope you can help me with that. I have a grammar that is used to translate our custom formula format into an Excel formula. When filling an Excel worksheet, the string entered is something like '$C{D,.,.,.+1}', which means 'one cell to the right of the current cell'. Eventually we will get something like '$C{D,.,.,.+1} + $C{D,.,.,.+2}', which adds up the values of the first and second cells to the right (in Excel, when we are at A1, this results in '=B1 + B2'). The problem is this: in the grammar I cannot find a way to 'eat up' the text between two 'eca_kolom_expressie' strings (the '$C{...}' part). I have already tried 'EXCEL_FRAGMENT' in a hundred ways, but none of them seemed to work. ('TILDE (options {greedy=false;} : .) TILDE' seems to work, but when I remove the TILDEs it stops working.) I have included the grammar. Can you point me in the right direction? Regards, Patrick Hofman Senior Consultant Invantive B.V.
[il-antlr-interest: 32567] Re: [antlr-interest] Nasty LHS expression
Hi David, Can an `expr` match something that starts with `ID OPENP` and/or `ID GETS`? Perhaps you can post your entire grammar? Or at least the `expr` rule? Regards, Bart. On Thu, May 26, 2011 at 7:02 PM, David Smith david.sm...@cc.gatech.edu wrote: I am having a difficult time distinguishing two legal lines of code: ID = expression and ID(exp1, exp2, ...) = expression I tried this rule: stat: expr NEWLINE -> expr | (ID OPENP .* CLOSEP GETS) => ID OPENP actualParameters CLOSEP GETS expr NEWLINE -> ^(INDEX ID OPENP actualParameters CLOSEP expr) | ID GETS expr NEWLINE -> ^(GETS ID expr) | NEWLINE -> ; But it says that alternatives 2 and 3 can never be matched. How do I reactivate 2 and 3? DMS David M. Smith http://www.cc.gatech.edu/fac/David.Smith Georgia Institute of Technology, College of Computing Sent from my ASR-33 Teletype
[il-antlr-interest: 32513] Re: [antlr-interest] Collecting parameters
Hi David, Every root (or leaf) in the AST must be an instance of `org.antlr.runtime.tree.Tree`, so you can't create a node that is a `java.util.List`. By default, ANTLR creates its AST using `org.antlr.runtime.tree.CommonTree` objects, which inherit the `getChildren()` method from `org.antlr.runtime.tree.BaseTree`. That method returns a List containing `CommonTree` instances. So your VECTOR root already has a method to get the children `1,2,3,4`. Regards, Bart. On Sat, May 21, 2011 at 7:14 PM, David Smith david.sm...@cc.gatech.edu wrote: Your contributors have been very helpful with my novice questions, and I thank them. Here's another: I am trying to build an AST that processes text like: v4 = [1 2 3 4] The following rule works: term : (OPENB .+ CLOSEB) => OPENB vals CLOSEB -> ^(VECTOR vals) | OPENB CLOSEB -> EMPTY_VECTOR | DOUBLE | ID | '('! expr ')'! ; vals returns [List items] : vl+=expr (COMMA? vl+=expr)* {$items = $vl;} ; but it produces tree nodes like: (= v4 (VECTOR 1 2 3 4)) but I really want (= v4 (VECTOR values)) where 'values' is some kind of Java collection like an ArrayList. How do I do that? David M. Smith http://www.cc.gatech.edu/fac/David.Smith Georgia Institute of Technology, College of Computing Sent from my ASR-33 Teletype
[il-antlr-interest: 32499] Re: [antlr-interest] Can't figure this one out
Hi David, Your parser does not handle: ans = 3 * (-x + y) * 4 properly since `ans` is an ANS token and not an IDENT token. Therefore it is not matched by your `assignmentStatement` rule. Also, you should probably add EOF at the end of your `script` rule in your combined grammar. Regards, Bart. On Fri, May 20, 2011 at 3:47 AM, David Smith david.sm...@cc.gatech.edu wrote: I developed a tree parser by making minor changes to Scott Stanchfield's tutorial videos. I don't know where to start looking to explain the problem. Here are the pieces:

// The grammar: grammar GTMat; options { language = Java; output=AST; ASTLabelType=CommonTree; } tokens { NEGATION; } @header { package parser; } @lexer::header { package parser; } script : statement* ; statement : assignmentStatement ; assignmentStatement : IDENT GETS^ expression SEMI? ; actualParameters : expression (COMMA expression)* ; // expressions -- fun time! term : (IDENT OPENP ) => IDENT '(' actualParameters ')' | OPENP! expression CLOSEP! | INTEGER | IDENT ; unary : (PLUS! | negation^)* term ; negation : MINUS -> NEGATION ; mult : unary ((MULT^ | DIV^ ) unary)* ; add : mult ((PLUS^ | MINUS^) mult)* ; relation : add ((EQUALS^ | NOTEQ^ | LESS^ | LESSEQ^ | GT^ | GTEQ^) add)* ; expression : relation ((AND^ | OR^) relation)* ; GETS: '='; SWITCH : 'switch'; CASE: 'case'; OTHERWISE : 'otherwise'; IF : 'if'; ELSE: 'else'; ELSEIF : 'elseif'; END : 'end'; FOR : 'for'; WHILE : 'while'; ANS : 'ans'; COMMA : ','; OPENP : '('; CLOSEP : ')'; NOT : '~'; SEMI: ';'; PLUS: '+'; MINUS : '-'; MULT: '*'; DIV : '/'; EQUALS : '=='; NOTEQ : '!='; LESS: '<'; LESSEQ : '<='; GT : '>'; GTEQ: '>='; AND : '&&'; OR : '||'; SINGLE : '\''; fragment LETTER : ('a'..'z' | 'A'..'Z') ; fragment DIGIT : '0'..'9'; INTEGER : DIGIT+ ; IDENT : LETTER (LETTER | DIGIT)*; WS : (' ' | '\t' | '\n' | '\r' | '\f')+ {$channel = HIDDEN;}; COMMENT : '%' .* ('\n'|'\r') {$channel = HIDDEN;};

// The Walker Grammar: tree grammar EvaluatorWalker; options { language = Java; tokenVocab = GTMat; ASTLabelType = CommonTree; } @header { package parser; import java.util.Map; import java.util.HashMap; } @members { private Map<String, Integer> variables = new HashMap<String, Integer>(); } evaluator : assignment* EOF ; assignment : ^('=' IDENT e=expression) { variables.put($IDENT.text, e); } ; expression returns [int result] : ^('+' op1=expression op2=expression) { result = op1 + op2; } | ^('-' op1=expression op2=expression) { result = op1 - op2; } | ^('*' op1=expression op2=expression) { result = op1 * op2; } | ^('/' op1=expression op2=expression) { result = op1 / op2; } | ^(NEGATION e=expression) { result = -e; } | IDENT { result = variables.get($IDENT.text); } | INTEGER { result = Integer.parseInt($INTEGER.text); } ;

// The Test Program: package parser; import org.antlr.runtime.ANTLRFileStream; import org.antlr.runtime.CharStream; import org.antlr.runtime.CommonTokenStream; import org.antlr.runtime.RecognitionException; import org.antlr.runtime.TokenStream; import org.antlr.runtime.tree.CommonTreeNodeStream; import java.io.IOException; public class Test4 { public static void main(String[] args) throws RecognitionException, IOException { CharStream stream = new ANTLRFileStream("Test.m"); GTMatLexer lexer = new GTMatLexer(stream); TokenStream tokenStream = new CommonTokenStream(lexer); GTMatParser parser = new GTMatParser(tokenStream); GTMatParser.script_return evaluator = parser.script(); System.out.println(evaluator.tree.toStringTree()); CommonTreeNodeStream nodeStream = new CommonTreeNodeStream(evaluator.tree); EvaluatorWalker walker = new EvaluatorWalker(nodeStream); walker.evaluator(); System.out.println("ok"); } }

// The input code: x = 8 y = 2 + 3 ans = 3 * (-x + y) * 4

// When I run it, I get this: (= x 8) (= y (+ 2
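The point about `ans` can be illustrated without ANTLR. The sketch below is a hypothetical, whitespace-splitting toy, not the generated GTMat lexer. It assumes that a keyword rule wins over IDENT when both match the same text, which is the behavior the reply above describes for `ans`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical toy classifier; it only imitates the keyword-before-IDENT
// priority of the GTMat lexer rules and is not ANTLR-generated code.
final class KeywordPriority {

    private static final List<String> KEYWORDS = Arrays.asList(
            "switch", "case", "otherwise", "if", "else",
            "elseif", "end", "for", "while", "ans");

    // Keywords are checked first, so "ans" never becomes an IDENT.
    static String tokenType(String text) {
        if (KEYWORDS.contains(text)) {
            return text.toUpperCase(Locale.ROOT);
        }
        if (text.matches("[a-zA-Z][a-zA-Z0-9]*")) {
            return "IDENT";
        }
        if (text.matches("[0-9]+")) {
            return "INTEGER";
        }
        if (text.equals("=")) {
            return "GETS";
        }
        return "OTHER";
    }

    // Mirrors the start of assignmentStatement: IDENT GETS ...
    static boolean startsLikeAssignment(String line) {
        String[] words = line.trim().split("\\s+");
        return words.length >= 2
                && tokenType(words[0]).equals("IDENT")
                && tokenType(words[1]).equals("GETS");
    }

    public static void main(String[] args) {
        System.out.println(startsLikeAssignment("x = 8"));
        System.out.println(startsLikeAssignment("ans = 3 * 4"));
    }
}
```

In the real grammar, one way out would be to let the assignment rule accept either token, for example `(IDENT | ANS) GETS^ expression SEMI?`. That alternative is a suggestion of this sketch, not something stated in the thread.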
[il-antlr-interest: 32502] Re: [antlr-interest] AST Question
Hi, Your rule: targetsExpr : category ('CAND' targetsExpr)* -> ^('CAND' category targetsExpr*) ; is incorrect. You're always using `CAND` in your rewrite rule, but that rule could match just `category`. You'll probably want to do: targetsExpr : category ('CAND'^ targetsExpr)* ; (Note the ^ after CAND, which makes it the root.) Regards, Bart. On Fri, May 20, 2011 at 10:11 AM, massimiliano.m...@gmail.com massimiliano.m...@gmail.com wrote: Hello All, I'm more or less a newbie using ANTLR. I have a small issue with creating the AST using rewrite rules. I'm sorry if this is a FAQ or similar! :) I have the following productions (it's like an algebra with 3 operators with different priorities): targetsExpr : (category) ('CAND' targetsExpr)* -> ^('CAND' category targetsExpr*) ; category: (matchEl) ('OR' category)* -> ^('OR' matchEl category* ) ; matchEl : factor ('AND' factor)* -> ^('AND' factor*) ; factor : matchId OPAR targetValue COMMA targetName CPAR -> ^('FAC' matchId targetValue targetName) The problem is that the AST created contains productions such as: CAND -OR - OR- AND (FAC, FAC). The second OR is created because the ``category'' production is passed multiple times. Is there a way to avoid creating these kinds of nodes? You can see a sample of the AST created at http://www.mascanc.net/~max/policy.pdf. -- Massimiliano Masi http://www.mascanc.net/~max
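The difference between the two versions of `targetsExpr` can be sketched in plain Java. `CandTree` below is a hypothetical helper that only prints the tree shape one would expect from `category ('CAND'^ targetsExpr)*`, under the assumption that the `^` operator creates a CAND root only when a 'CAND' token was actually matched.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the tree shape; not ANTLR-generated code.
final class CandTree {

    // A single category stays a bare node; otherwise CAND becomes the root
    // of the first category and of the tree built from the remaining ones.
    static String toTree(List<String> categories) {
        if (categories.isEmpty()) {
            throw new IllegalArgumentException("at least one category is required");
        }
        String first = categories.get(0);
        if (categories.size() == 1) {
            return first;
        }
        List<String> rest = categories.subList(1, categories.size());
        return "(CAND " + first + " " + toTree(rest) + ")";
    }

    public static void main(String[] args) {
        System.out.println(toTree(Arrays.asList("a")));           // a
        System.out.println(toTree(Arrays.asList("a", "b", "c"))); // (CAND a (CAND b c))
    }
}
```

The original rewrite rule corresponds to always taking the second branch, which is why a lone `category` still ended up under an operator node.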
[il-antlr-interest: 32507] Re: [antlr-interest] AST Question
On Fri, May 20, 2011 at 9:54 PM, massimiliano.m...@gmail.com massimiliano.m...@gmail.com wrote: Hi, On Fri, May 20, 2011 at 10:55 AM, Bart Kiers bki...@gmail.com wrote: targetsExpr : category ('CAND' targetsExpr)* -> ^('CAND' category targetsExpr*) ; is incorrect. You're always using `CAND` in your rewrite rule, but that rule could match just `category`. You'll probably want to do: targetsExpr : category ('CAND'^ targetsExpr)* ; Thank you for your answer! It works now! But I have another question. When I traverse the tree using this function (is there an example of having a visitor created by ANTLR?) Yes, this is typically what tree grammars are for. But you can also walk it manually. See: http://www.antlr.org/article/1100569809276/use.tree.grammars.tml http://www.antlr.org/article/1170602723163/treewalkers.html Regards, Bart.
[il-antlr-interest: 32479] Re: [antlr-interest] Parsing comment-like sequences of arbitrary characters
Hi Rajesh,

Inside a parser rule, the `~` negates tokens, not characters. So if you have no lexer rule that tokenizes '%', '^' or '$', then ~SEMICOLON won't match any of those characters, because they never reach the parser as tokens.

Your grammar (with minor modifications):

    grammar Test;

    options { output=AST; }

    tokens { OPTION; OPTION_BLOCK; }

    query_options : OPTIONS^ option_block ;
    option_block  : L_BRACE option_def* R_BRACE -> ^(OPTION_BLOCK option_def*) ;
    option_def    : option_name option_value -> ^(OPTION option_name option_value) ;
    option_name   : ID (DOT^ ID)* ;
    option_value  : COLON^ (~SEMICOLON)* SEMICOLON! | option_block ;

    OPTIONS    : 'options';
    ID         : (LETTER | '_') (LETTER | DIGIT | '_')*;
    DOLLAR     : '$';
    PERCENT    : '%';
    CARET      : '^';
    DOT        : '.';
    L_BRACE    : '{';
    R_BRACE    : '}';
    COLON      : ':';
    SEMICOLON  : ';';
    DIGIT      : '0'..'9';
    SL_COMMENT : '#' ~('\r' | '\n')* { skip(); };
    WS         : (' ' | '\f' | '\r' | '\t')+ { skip(); };
    fragment LETTER : 'a'..'z' | 'A'..'Z';

parses the input:

    options { foo: $ % 1 2 45 ^ $ $$$; }

as follows:

    (options (OPTION_BLOCK (OPTION foo (: $ % 1 2 4 5 ^ $ $ $ $

as you can see after running the test rig:

    import org.antlr.runtime.*;
    import org.antlr.runtime.tree.*;
    import org.antlr.stringtemplate.*;

    public class Main {
        public static void main(String[] args) throws Exception {
            // Lex and parse the sample input with the generated classes.
            ANTLRStringStream in = new ANTLRStringStream("options { foo: $ % 1 2 45 ^ $ $$$; }");
            TestLexer lexer = new TestLexer(in);
            CommonTokenStream tokens = new CommonTokenStream(lexer);
            TestParser parser = new TestParser(tokens);
            TestParser.query_options_return returnValue = parser.query_options();
            CommonTree tree = (CommonTree)returnValue.getTree();

            // Print the AST as a DOT graph and as a flat bracketed string.
            DOTTreeGenerator gen = new DOTTreeGenerator();
            StringTemplate st = gen.toDOT(tree);
            System.out.println(st);
            System.out.println("---\n" + tree.toStringTree());
        }
    }

Regards, Bart.
On Wed, May 18, 2011 at 12:55 AM, Rajesh Raman r...@fb.com wrote:

Hello ANTLR-ites,

I'm trying to parse an options structure, like the following:

    options {
      foo {
        bar {
          ww: $32.50;
          xx: Jekyll Hyde;
        }
        yy.zz: @15% p/a;
      }
    }

(Please ignore the non-sensical values for ww, xx and yy.zz -- I'm just making a point, which will become clearer below). This options structure will be followed by a query expression whose grammar is more complicated, and includes ints/floats, identifiers, operators, etc. etc.

The grammar I have for parsing the options structure looks like the below. (The grammar for the query language is complicated and therefore omitted.)

<snip>

    // ... other stuff here
    tokens {
      // ... other ad hoc token values
      OPTION; OPTION_BLOCK; OPTION_VALUE;
    }
    // ...
    query_options : OPTIONS^ option_block ;
    option_block  : L_BRACE option_def* R_BRACE -> ^(OPTION_BLOCK option_def*) ;
    option_def    : option_name option_value -> ^(OPTION option_name option_value) ;
    option_name   : ID (DOT^ ID)* ;
    option_value  : COLON^ (~SEMICOLON)* SEMICOLON! | option_block ;
    //... other stuff here
    //...
    OPTIONS: 'options';
    ID: (LETTER | '_') (LETTER | DIGIT | '_')*;
    DOT: '.';
    L_BRACE: '{';
    R_BRACE: '}';
    COLON: ':';
    SEMICOLON: ';';
    SL_COMMENT: '#' ~('\r' | '\n')* NEWLINE { skip(); };
    WS: (' ' | '\f' | '\r' | '\t')+ { skip(); };
    ...

</snip>

As mentioned, the options clause is part of a larger grammar for a language that includes operators, identifiers, numbers, etc. However, within the options clause, I want the characters between the colon and the semicolon to be treated as a single string, regardless of the fact that it may contain characters that lex into other tokens used by the language.

This feels like I should be able to use the same techniques as used in comment-stripping (i.e., see the line that has COLON^...). But this doesn't seem to work:

- The stray characters that are not used elsewhere in the grammar are ignored and don't show up in the parse tree (e.g., $, @, %, , in the example above)
- Character sequences that form valid tokens for the rest of the language (like integers or identifiers) are lexed into those respective tokens instead of being slurped into a single string as intended.

E.g., when I input a string like

    options { foo: $ % 1 2 45 ^ $ $$$; }

and display the resulting tree.toStringTree(), I get

    (options (OPTION_BLOCK (OPTION foo (: 1 2 45

Any guidance you have on the above will be greatly appreciated. Thanks in advance.

++Rajesh
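The point made in the reply at the top of this message, that `~SEMICOLON` in a parser rule ranges over tokens and not over characters, can be simulated without ANTLR. The following is only a rough sketch under stated assumptions: the toy `lex` method emits one token per known character and silently drops anything else (roughly what happens to characters no lexer rule matches), and `untilSemicolon` stands in for `(~SEMICOLON)* SEMICOLON`. Both method names are hypothetical and nothing here is ANTLR API.

```java
import java.util.ArrayList;
import java.util.List;

// Toy simulation of "negation works on tokens, not characters".
// Assumption: every token is a single character; real lexers are richer.
final class TokenNegationSketch {

    // Emits one token per character that the "grammar" knows about.
    // Characters without a matching rule never become tokens.
    static List<String> lex(String input, String knownChars) {
        List<String> tokens = new ArrayList<>();
        for (char c : input.toCharArray()) {
            if (Character.isWhitespace(c)) continue;        // WS is skipped
            if (knownChars.indexOf(c) >= 0) tokens.add(String.valueOf(c));
            // else: no lexer rule for c, so the parser never sees it
        }
        return tokens;
    }

    // Parser-side stand-in for: (~SEMICOLON)* SEMICOLON
    static List<String> untilSemicolon(List<String> tokens) {
        List<String> matched = new ArrayList<>();
        for (String t : tokens) {
            if (t.equals(";")) break; // SEMICOLON ends the loop
            matched.add(t);           // any other token is accepted
        }
        return matched;
    }

    public static void main(String[] args) {
        String value = "$ % 1 2 45 ^ $ $$$;";

        // No rules for '$', '%', '^': they vanish before parsing.
        System.out.println(String.join(" ", untilSemicolon(lex(value, "0123456789;"))));
        // prints: 1 2 4 5

        // With DOLLAR, PERCENT and CARET rules added, they show up as tokens.
        System.out.println(String.join(" ", untilSemicolon(lex(value, "0123456789;$%^"))));
        // prints: $ % 1 2 4 5 ^ $ $ $ $
    }
}
```

The second call mirrors what adding the DOLLAR, PERCENT and CARET rules does in the modified grammar: the negated set can only pick up what the lexer has already turned into tokens.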
[il-antlr-interest: 32481] Re: [antlr-interest] Fragment rules inside parser rules
Hi Ben,

You cannot use fragment rules inside parser rules. So yes, you need to make LKU a normal token. If you don't want that (for whatever reason), you need to include the '' in the 'NAME' rule:

    NAME : '' ('a'..'z')+ ;

Regards, Bart.

On Wed, May 18, 2011 at 3:00 PM, Ben Corne ben.co...@gmail.com wrote:

Hello

Do I really need to make LKU in the example below a normal token rule, or is there a way to get this to work for the input 'foo;' without using literals inside the parser rule or real tokens?

    grammar Foo;

    program : (LKU NAME ';')+ ;

    fragment LKU : '' ;
    NAME : ('a'..'z')* ;

Regards
Ben C.
[il-antlr-interest: 32449] Re: [antlr-interest] Lexer code missing @header info
Hi David,

The snippet:

    @header {
      package parser;
      import java.util.HashMap;
    }

is short for:

    @parser::header {
      package parser;
      import java.util.HashMap;
    }

You'll need to do the following as well:

    @lexer::header {
      package parser;
      import java.util.HashMap;
    }

Regards, Bart.

On Fri, May 13, 2011 at 3:08 PM, David Smith david.sm...@cc.gatech.edu wrote:

I'm using ANTLRWorks to generate Java code from a grammar with the front-end listed below. The code generation works flawlessly, producing XXXLexer.java and XXXParser.java as expected, with one minor annoyance. The `package parser;` line makes it into XXXParser.java, but not into XXXLexer.java. I therefore have to manually edit XXXLexer.java. Not really a big issue, but is there an easy cure?

    grammar XXX;

    options { output=AST; ASTLabelType=CommonTree; }

    tokens { STATLIST; }

    @header {
      package parser;
      import java.util.HashMap;
    }

David M. Smith http://www.cc.gatech.edu/fac/David.Smith
Georgia Institute of Technology, College of Computing
Sent from my ASR-33 Teletype
[il-antlr-interest: 32430] Re: [antlr-interest] Getting all tokens from lexer / token stream
CommonTokenStream inherits getTokens(), which returns a List of Tokens. You'll need to cast each element to a Token (or something that extends Token), since it's a non-generic List:

    CommonTokenStream tokens = new CommonTokenStream(lexer);
    for(Object o : tokens.getTokens()) {
      Token t = (Token)o;
      System.out.println(t);
    }

Regards, Bart.

On Wed, May 11, 2011 at 1:40 PM, Lars von Wedel lars.vonwe...@gmail.com wrote:

Hello,

I am writing an interactive interpreter and I would like to obtain all tokens from a lexer or token stream to test whether the input is complete or continued on the next line. What is the easiest approach to do this? I tried a CommonTokenStream but I am not sure how to tell it to pull all tokens from the lexer.

Thanks and Regards, Lars
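As a rough illustration of what one might do with such an untyped token list in an interactive interpreter, the sketch below casts each element and uses brace depth as a crude "is the input complete?" test. Everything in it is an assumption: `Tok` is a hypothetical stand-in for ANTLR's Token so that the example runs without the ANTLR runtime, and counting `{` against `}` is just one possible completeness heuristic, not something prescribed in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: walk an untyped token list, cast each element, and decide
// whether the input looks complete. Tok is a hypothetical stand-in
// for a real token class.
final class TokenListSketch {

    static final class Tok {
        private final String text;
        Tok(String text) { this.text = text; }
        String getText() { return text; }
    }

    // Accepts List<?> so that a raw List can be passed in as well;
    // elements arrive as Object and need a cast.
    static boolean looksComplete(List<?> tokens) {
        int depth = 0;
        for (Object o : tokens) {
            Tok t = (Tok) o; // cast needed because the list carries no element type
            if (t.getText().equals("{")) depth++;
            else if (t.getText().equals("}")) depth--;
        }
        return depth == 0; // assumption: balanced braces mean "complete"
    }

    // Helper that builds a token list from plain strings.
    static List<Tok> toks(String... texts) {
        List<Tok> out = new ArrayList<>();
        for (String s : texts) out.add(new Tok(s));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(looksComplete(toks("options", "{", "foo")));      // false
        System.out.println(looksComplete(toks("options", "{", "foo", "}"))); // true
    }
}
```

In a real interpreter the same loop would run over the list returned by `getTokens()`, with the cast going to the actual token type.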
[il-antlr-interest: 32414] Re: [antlr-interest] Translating expressions - advice?
I get the impression you think that when creating ASTs, ANTLR inserts parentheses (brackets). This is not the case: I guess what you're seeing is just the tree's `toStringTree()` output, which displays these parentheses to make the hierarchy of the tree apparent. Or am I misinterpreting your question?

Regards, Bart.

On Mon, May 9, 2011 at 3:10 PM, Hans-Juergen Rennau hren...@yahoo.de wrote:

Hello People,

being an ANTLR beginner, I would very much appreciate advice concerning good practise for a rather simple task.

The task is the translation of a JPQL (Java Persistence Query Language) where clause into a proprietary query language. The clause has the well-known expression structure: operands connected by three operators: OR, AND and NOT, where precedence increases in that order. Example:

    a.x='1' AND (a.y='2' OR b.z='3') AND a.v like 'abc%'

An important point is that the translation result will have a similar structure, that is, it will also be operands connected by those operators. Example:

    x='1' AND (y='2' OR z='3') AND v='123*'

For this reason I am not sure if the classical approach for dealing with left-associative operators, as shown in the Definitive ANTLR Reference (3. A quick tour...), is the most appropriate one in this case. I mean rules like:

    conditional_term ('OR'^ conditional_term)*
    conditional_factor ('AND'^ conditional_factor)*

This creates deep trees, where each operator creates a new level. That is fine for processing the operations. But a straightforward translation of the tree into a similar sequence of operands and operators yields a result which is correct but can be ugly, due to superfluous brackets, example:

    (a OR (b OR (c AND d)))

One possibility is to process the tree, removing superfluous brackets - perhaps by passing the context operator into the rule as a parameter, so that the rule can decide whether to create brackets or not. This should not be too difficult, but my question is: is there a good practise for accomplishing the task? Would you recommend the approach just sketched, or a different tree representation to start with? (A tree I do want because there are other parts to be translated, not only the where clause, and a tree seems to me the way to deal with (possibly yet growing) complexity.)

Thank you very much for any suggestions.

-- Hans-Juergen
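The approach sketched in the question above (passing the context operator down so that each node can decide whether it needs brackets) might look roughly like this in plain Java. This is a minimal sketch, not ANTLR code: the `Node` class is a hypothetical stand-in for an AST node, the precedence numbers are chosen only so that OR < AND < NOT < operand, and it assumes AND and OR are associative so that equal precedence never needs brackets.

```java
import java.util.Arrays;
import java.util.List;

// Emits an expression tree with only the brackets that are needed,
// by comparing each node's precedence with that of its parent.
// Node is a hypothetical AST stand-in, not an ANTLR class.
final class MinimalParensSketch {

    static final class Node {
        final String text;
        final List<Node> children;
        Node(String text, Node... children) {
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    // Higher number binds tighter; leaves (operands) bind tightest.
    static int precedence(Node n) {
        if (n.children.isEmpty()) return 4;
        switch (n.text) {
            case "OR":  return 1;
            case "AND": return 2;
            case "NOT": return 3;
            default: throw new IllegalArgumentException("unknown operator: " + n.text);
        }
    }

    // parentPrec is the "context operator" handed down from the caller.
    static String emit(Node n, int parentPrec) {
        int prec = precedence(n);
        String s;
        if (n.children.isEmpty()) {
            s = n.text;                                   // operand
        } else if (n.text.equals("NOT")) {
            s = "NOT " + emit(n.children.get(0), prec);   // unary operator
        } else {
            StringBuilder sb = new StringBuilder();       // AND / OR with 2+ operands
            for (Node c : n.children) {
                if (sb.length() > 0) sb.append(' ').append(n.text).append(' ');
                sb.append(emit(c, prec));
            }
            s = sb.toString();
        }
        // Brackets only when this node binds more loosely than its context.
        return prec < parentPrec ? "(" + s + ")" : s;
    }

    static Node op(String text, Node... children) { return new Node(text, children); }
    static Node leaf(String text) { return new Node(text); }

    public static void main(String[] args) {
        Node deep = op("OR", leaf("a"), op("OR", leaf("b"), op("AND", leaf("c"), leaf("d"))));
        System.out.println(emit(deep, 0));   // a OR b OR c AND d

        Node mixed = op("AND", leaf("x"), op("OR", leaf("y"), leaf("z")), leaf("v"));
        System.out.println(emit(mixed, 0));  // x AND (y OR z) AND v
    }
}
```

In a tree grammar the same idea would be a rule parameter carrying the parent's precedence; the comparison at the end of `emit` is the only place where brackets are decided.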
[il-antlr-interest: 32415] Re: [antlr-interest] Translating expressions - advice?
Wait, I think I misunderstood. Your example `(a OR (b OR (c AND d)))` is just an example expression, right? In that case, yes, these parentheses are part of the token stream, but if you apply rewrite rules (or the AST operators `^` and `!`) properly, these parentheses are easily removed from your parse tree. See:

http://www.antlr.org/wiki/display/ANTLR3/Tree+construction

or:

http://stackoverflow.com/questions/4931346/how-to-output-the-ast-built-using-antlr

Regards, Bart.