Hi guys,

Below I paste my solution for the string LITERAL of our SQL grammar.

GOOD:
* Everything is done at the LEXER level.
* Uses the efficient GETCHARINDEX() + EMIT() approach for most literals.
* The complex algorithm runs only if a QUOTE QUOTE pair was found (a rare case in real life).

BAD:
* I don't know yet whether pTmpStr needs to be freed manually.
* I don't know yet whether this solution will work for UTF-16 lexer input.
* I have to access the produced token object directly to modify ITS text copy.
* I still think this solution is far less trivial than the ! operator of ANTLR v2.
* The solution is very target-oriented, IMO.

IMO: the ideal would be ANTLR's own syntax for controlling the lexer's output.

Can anybody give hints for a better solution? Before offering ideas, please check the STRING_LITERAL rule below carefully: **inside** a STRING_LITERAL, QUOTE QUOTE must be possible, and we should skip one of the two quotes.

Example: 'aa''bb''cc''dd' => aa'bb'cc'dd

    //-------------------------------------------------------------
    // String literals:

    fragment
    LETTER // caseSensitive = false, so we use only lowercase chars.
        : 'a'..'z'
        | '@'
        ;

    fragment
    ESCAPE_SEQUENCE // Escape for VSQL can be: \' \_ \%
        : '\\' ( QUOTE | '_' | '%' )
        ;

    STRING_LITERAL
    @init {
        int dquotes_count = 0;
        int theStart = $start;
    }
        : QUOTE { theStart = GETCHARINDEX(); }
          ( ESCAPE_SEQUENCE
          | ~('\'' | '\\')
          | QUOTE QUOTE { ++dquotes_count; }
          )*
          { $start = theStart; EMIT(); }
          QUOTE
          {
              if( dquotes_count > 0 ) // ONLY if '' was found
              {
                  pANTLR3_COMMON_TOKEN pToken  = LEXSTATE->token;
                  pANTLR3_STRING       pTmpStr = pToken->getText( pToken );
                  char* pBase  = (char*) pTmpStr->chars;
                  char* pStart = pBase;

                  while( dquotes_count > 0 ) // we make the string smaller in the same buffer.
                  {
                      char* pFirstQuote = strchr( pStart, '\'' );
                      if( pFirstQuote == NULL )
                          break;

                      if( *(pFirstQuote + 1) != '\'' ) // is the next char the second quote?
                      {
                          // A lone quote (for example from \'): step over it and keep searching.
                          pStart = pFirstQuote + 1;
                          continue;
                      }

                      // Example: 'aa''bb''cc''dd' => aa'bb'cc'dd
                      // Count from the start of the buffer; the move includes the terminating 0.
                      int CharsOnLeft = (int)( pFirstQuote - pBase ) + 1;
                      int CharsToMove = pTmpStr->len - CharsOnLeft;
                      ANTLR3_MEMMOVE( pFirstQuote + 1, pFirstQuote + 2, CharsToMove );

                      // prepare for a possible next loop:
                      pStart = pFirstQuote + 1;
                      pTmpStr->len--;
                      --dquotes_count;
                  }

                  pToken->setText( pToken, pTmpStr );
              }
          }
        ;
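
To try the quote-collapsing step outside the lexer, here is a minimal standalone sketch in plain C. The name collapse_doubled_quotes is hypothetical (it is not part of the ANTLR3 C runtime), it assumes an 8-bit, 0-terminated buffer, and it swaps the repeated strchr + memmove for a single pass with a read pointer and a write pointer. Like the action in the rule, it does not treat backslash escapes specially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not an ANTLR3 runtime function.
 * Collapses every '' pair into a single ' in place and
 * returns the new length (without the terminating 0). */
size_t collapse_doubled_quotes(char *s)
{
    char *src = s;  /* next char to read  */
    char *dst = s;  /* next slot to write */

    assert(s != NULL);

    while (*src != '\0') {
        if (src[0] == '\'' && src[1] == '\'')
            ++src;       /* drop the first quote of the pair */
        *dst++ = *src++; /* copy the second quote, or any other char */
    }
    *dst = '\0';
    return (size_t)(dst - s);
}
```

Inside the action one could presumably call this on (char*) pTmpStr->chars and store the result in pTmpStr->len. That holds only for 8-bit input; UTF-16 input would need a 16-bit version of the same loop.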