Hi,
I've been happily using ANTLR to provide a parser for a proprietary
language, but I've recently been challenged by the need to handle the
following (very much like C's '#define key value' preprocessor
directive):
Define MY_CONST someValue
Later in the source code, users can write $MY_CONST to reference
someValue, as defined above. Ideally (I believe) the lexer would
handle this, and I got it working with the following code:
grammar:
------------------------------------------
PRE_PROC_DEFINE
    : DEFINE WS key=IDENT WS value=UNTIL_EOL
      {
          // Drop the last character of the matched value before storing it.
          String valueStr = $value.text;
          preProcDefines.put($key.text,
                             valueStr.substring(0, valueStr.length() - 1));
      }
    ;

PRE_PROC_DEFINE_REF
    : '$' key=IDENT
    ;
------------------------------------------
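To make the intent of the PRE_PROC_DEFINE action concrete, here is a minimal sketch in plain Java (no ANTLR involved) of the bookkeeping it performs. It assumes the text matched by UNTIL_EOL ends with the line terminator, which is what the substring call above suggests. The class and method names (DefineTableSketch, stripTrailingEol, define, lookup) are made up for illustration. Unlike the fixed "length() - 1", this version strips only trailing '\r' and '\n' characters, so a CRLF ending or a Define on the last line of a file without a newline would not lose or keep a wrong character.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the map that the lexer action fills; not ANTLR API.
class DefineTableSketch {
    private final Map<String, String> defines = new HashMap<String, String>();

    // Remove any trailing '\r' or '\n' characters, however many were matched.
    static String stripTrailingEol(String raw) {
        int end = raw.length();
        while (end > 0 && (raw.charAt(end - 1) == '\n' || raw.charAt(end - 1) == '\r')) {
            end--;
        }
        return raw.substring(0, end);
    }

    // Record one 'Define key value' directive.
    void define(String key, String rawValue) {
        defines.put(key, stripTrailingEol(rawValue));
    }

    // Returns the defined value, or null if the key was never defined.
    String lookup(String key) {
        return defines.get(key);
    }

    public static void main(String[] args) {
        DefineTableSketch table = new DefineTableSketch();
        table.define("MY_CONST", "someValue\r\n");
        System.out.println("[" + table.lookup("MY_CONST") + "]"); // prints "[someValue]"
    }
}
```

Inside the grammar, the same idea would amount to calling a helper like stripTrailingEol($value.text) from the action instead of the substring expression.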
The experimental (and very inefficient) test driver code follows. To
summarize, the PRE_PROC_DEFINE lexer rule above populates a HashMap in
the lexer with each 'Define key value' found in the source code. The
test driver below attaches the lexer to a TokenRewriteStream. After
tokenizing the complete input, it walks the tokens sequentially and
checks whether each one starts with a '$'. If it does, the driver
queries the lexer's HashMap for a replacement value and replaces the
token with that text. Finally, it calls toString() to get the
rewritten stream and runs the whole thing through the lexer a second
time, and then through the parser.

Would it be better to solve this problem via the template/rewrite
rules described in The Definitive ANTLR Reference? Or am I headed, to
some degree, in the right direction? The problem I found with my
approach is that the lexer never knows the index of the tokens it is
generating (and that's probably by design), so the driver has to keep
its own counter while walking the token list.
------------------------------------------
CharStream stream = new ANTLRStringStream(testString);
TPLLexer lexer = new TPLLexer(stream);
TokenRewriteStream rawTokens = new TokenRewriteStream(lexer);

// First pass: tokenize everything; the lexer fills its define map as it goes.
List initTokens = rawTokens.getTokens();
int index = 0;
for (Object curObj : initTokens)
{
    CommonToken curToken = (CommonToken) curObj;
    String text = curToken.getText();
    if (text.startsWith("$"))
    {
        text = text.substring(1); // remove '$'
        if (lexer.getPreProcDefines().containsKey(text))
        {
            // Replace the '$KEY' token with the defined value.
            rawTokens.replace(index, lexer.getPreProcDefines().get(text));
        }
    }
    index++;
}

// Second pass: re-lex the rewritten text, then hand it to the parser.
lexer = new TPLLexer(new ANTLRStringStream(rawTokens.toString()));
CommonTokenStream preProcessedTokens = new CommonTokenStream(lexer);
TPLParser parser = new TPLParser(preProcessedTokens);
return parser;
------------------------------------------
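Stripped of the ANTLR classes, the substitution loop in the driver above boils down to the following sketch. It works on a plain list of token texts instead of a TokenRewriteStream, so it only illustrates the lookup logic, not the stream rewriting. The class name DefineSubstitutionSketch and the method substitute are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the '$KEY' replacement pass; not ANTLR API.
class DefineSubstitutionSketch {
    // Replace each "$KEY" text with its defined value. Tokens that do not
    // start with '$', and '$' references to unknown keys, pass through as-is.
    static List<String> substitute(List<String> tokenTexts, Map<String, String> defines) {
        List<String> out = new ArrayList<String>(tokenTexts.size());
        for (String text : tokenTexts) {
            String replacement = null;
            if (text.startsWith("$")) {
                replacement = defines.get(text.substring(1)); // null if undefined
            }
            out.add(replacement != null ? replacement : text);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> defines = new HashMap<String, String>();
        defines.put("MY_CONST", "someValue");
        List<String> tokens = Arrays.asList("x", "=", "$MY_CONST", ";", "$UNKNOWN");
        System.out.println(substitute(tokens, defines));
        // prints "[x, =, someValue, ;, $UNKNOWN]"
    }
}
```

One small design difference from the driver: a single get() with a null check replaces the containsKey()/get() pair, which does the same job as long as no define maps to null.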
Thanks in advance for any responses!