Ah, a question that a fellow near-clueless user can help with. You might want to follow the example here: http://www.antlr.org/wiki/pages/viewpage.action?pageId=1782 and implement a case-insensitive file stream. That should make your rules more straightforward, since the lexer rules no longer have to spell out both cases of every letter; you then end up with rules like DATA : DATA_ ; LOOP : LOOP_ ;
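To illustrate the idea of a case-insensitive stream, here is a minimal, dependency-free sketch in Java. It is not the code from the wiki page and it does not use the ANTLR classes: the class name CaseInsensitiveStream and the helper methods lookingAt and substring are made up for this example. It only mimics the LA(int) lookahead convention of a character stream: lookahead returns case-folded characters, while the buffer keeps the original text. Folding to lower case is an arbitrary choice here; with it, the literals in the grammar would be written in lower case.

```java
// Hypothetical stand-in for a character stream; NOT an ANTLR class.
final class CaseInsensitiveStream {
    static final int EOF = -1;

    private final char[] data; // original text, case preserved
    private int p = 0;         // index of the next character to consume

    CaseInsensitiveStream(String input) {
        this.data = input.toCharArray();
    }

    /** Lookahead used for matching: the i-th upcoming character, lower-cased. */
    int LA(int i) {
        if (i == 0) return 0;  // LA(0) is undefined
        if (i < 0) i++;        // LA(-1) refers to the previously consumed character
        int idx = p + i - 1;
        if (idx < 0 || idx >= data.length) return EOF;
        return Character.toLowerCase(data[idx]);
    }

    /** Advance by one character. */
    void consume() {
        if (p < data.length) p++;
    }

    /** True if the upcoming characters spell the given lower-case keyword, ignoring input case. */
    boolean lookingAt(String lowerCaseKeyword) {
        for (int i = 0; i < lowerCaseKeyword.length(); i++) {
            if (LA(i + 1) != lowerCaseKeyword.charAt(i)) return false;
        }
        return true;
    }

    /** Original, case-preserved text in [start, stop), as token text would see it. */
    String substring(int start, int stop) {
        return new String(data, start, stop - start);
    }

    public static void main(String[] args) {
        CaseInsensitiveStream s = new CaseInsensitiveStream("DaTa_global\n");
        System.out.println(s.lookingAt("data_")); // prints "true"
        System.out.println(s.substring(0, 5));    // prints "DaTa_"
    }
}
```

In a real ANTLR 3 project the same trick would presumably go into a subclass of the stream class you already use, overriding only its lookahead method, so that matching ignores case but the token text keeps the spelling from the input file.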
--- [email protected] wrote:

From: Klaus Martinschitz <[email protected]>
To: [email protected]
Subject: [antlr-interest] ANTLR Basic Question
Date: Fri, 09 Jul 2010 21:10:55 +0200

Hi ANTLR Gurus,

A beginner's question. I want to write a compiler for the Crystallographic Information File format (CIF). I don't want to explain the syntax in detail, only the problem I am facing. The data starts with a token 'data_' followed by arbitrary characters and an EOL, e.g. data_global. There is also a token 'loop_'. Somewhere in my BNF I write something like

DATA : (('d'|'D')('a'|'A')('t'|'T')('a'|'A')'_') ;
LOOP : (('l'|'L')('o'|'O')('o'|'O')('p'|'P')'_') ;
dataBlockHeading : (DATA NONBLANCKCHAR+ EOL) ;
dataItem : (tag WHITESPACE value) | (LOOP loopHeader loopBody) ;

The first two expressions are tokens, the other two are rules. My problem is the following. The file starts with data_global, BUT the *lo* of data_g*lo*bal is matched by the LOOP token. How can this be if the parser is in the dataBlockHeading rule? The parser must know that the characters *lo* belong to NONBLANCKCHAR and not to LOOP, or not?

I have attached the whole syntax at the end of the file.

Thanks for help

Regards,
Klaus

grammar CIF1_1;

options {
    language=Java;
}

@lexer::header {
    package at.netcrystals.cif_1_1.parser;
}

@parser::header {
    package at.netcrystals.cif_1_1.parser;
}

DATA : (('d'|'D')('a'|'A')('t'|'T')('a'|'A')'_') ;

LOOP : (('l'|'L')('o'|'O')('o'|'O')('p'|'P')'_') ;

fragment ORDINARYCHAR
    : '!' | '%' | '&' | '(' | ')' | '*' | '+' | ',' | '-' | '.' | '/'
    | '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
    | ':' | '<' | '=' | '>' | '?' | '@'
    | 'A' | 'B' | 'C' | 'D' | 'E' | 'F' | 'G' | 'H' | 'I' | 'J' | 'K' | 'L' | 'M'
    | 'N' | 'O' | 'P' | 'Q' | 'R' | 'S' | 'T' | 'U' | 'V' | 'W' | 'X' | 'Y' | 'Z'
    | '\\' | '^' | '\`'
    | 'a' | 'b' | 'c' | 'd' | 'e' | 'f' | 'g' | 'h' | 'i' | 'j' | 'k' | 'l' | 'm'
    | 'n' | 'o' | 'p' | 'q' | 'r' | 's' | 't' | 'u' | 'v' | 'w' | 'x' | 'y' | 'z'
    | '{' | '|' | '}' | '~'
    ;

NONBLANCKCHAR : ORDINARYCHAR | '"' | '#' | '$' | '\'' | '_' | ';' | '[' | ']' ;

WHITESPACE : '\t' | ' ' ;

/************************************************************************************************
 WhiteSpace and Comments
 ************************************************************************************************/

EOL : '\n' | '\r\n' ;

/************************************************************************************************
 *
 * Root
 *
 ************************************************************************************************/

cif : (dataBlock) EOF ;

dataBlock : (dataBlockHeading dataItems) ;

dataBlockHeading : (DATA NONBLANCKCHAR+ EOL) ;

dataItems : dataItem* EOL ;

dataItem : (tag WHITESPACE value) | (LOOP loopHeader loopBody) ;

tag : NONBLANCKCHAR+ ;

value : '.' | '?' | charString ;

charString : singleQuotedString ;

singleQuotedString : '\'' NONBLANCKCHAR* '\'' ;

loopHeader : ( (WHITESPACE tag)+ ) ;

loopBody : value (WHITESPACE value)+ ;

List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
