Hi Sebastian,
See below.
Looking at the SableCC error message, I see the problem. Your grammar is
indeed not LR(1); it is probably LR(2).
But IIRC for these classes this relation holds:
LR(0) < LR(1) = LR(2)
So it should be possible to transform my grammar, shouldn't it?
The set of languages that can be described by LR(1) grammars is
indeed identical to the set of languages that can be described by
LR(2) grammars. If I remember correctly, there exists an algorithm to
transform an LR(k) grammar (where k > 1) into an LR(1) grammar, but it
could make your grammar grow exponentially. In practical terms, the
equivalent LR(1) grammar is unlikely to be attractive to work with. :(
1- You probably want to start each case with a keyword. This is why C
and Java use 2 keywords: switch and case.
case_elem = case_keyword colon stmt+;
Unfortunately I can't change the language I'm trying to parse. It isn't defined
by me.
There is another approach: Is it possible to feed symbol table information back
to the SableCC lexer? My case labels are actually enums, but look syntactically
like identifiers.
(E.g., to parse the C expression (a)+b you have to know whether "a" is a type
or a variable, because the two result in different parse trees.)
You can play with the filter() methods of Lexer and Parser, but this is
quite tricky to get right. :(
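
To make the symbol-table idea concrete, here is a rough sketch in plain Java.
It does not use the SableCC API: EnumAwareFilter, Kind, Tok and reclassify()
are hypothetical names invented for illustration, and the set of enum names
stands in for whatever symbol table you maintain. A real filter() override
would apply the same test to the token the lexer has just produced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for generated token classes; not the SableCC API.
class EnumAwareFilter {
    enum Kind { IDENTIFIER, ENUM_CONSTANT, COLON, OTHER }

    static final class Tok {
        final Kind kind;
        final String text;

        Tok(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }

        @Override
        public String toString() {
            return kind + "(" + text + ")";
        }
    }

    // Reclassify every identifier whose text the symbol table lists as an
    // enum constant; all other tokens pass through unchanged.
    static List<Tok> reclassify(List<Tok> tokens, Set<String> enumConstants) {
        List<Tok> out = new ArrayList<>();
        for (Tok t : tokens) {
            boolean isEnum = t.kind == Kind.IDENTIFIER
                    && enumConstants.contains(t.text);
            out.add(isEnum ? new Tok(Kind.ENUM_CONSTANT, t.text) : t);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tok> tokens = List.of(
                new Tok(Kind.IDENTIFIER, "RED"),
                new Tok(Kind.COLON, ":"),
                new Tok(Kind.IDENTIFIER, "x"));
        System.out.println(reclassify(tokens, Set.of("RED")));
    }
}
```

The tricky part is the one the sketch assumes away: the symbol table has to
be filled in before the lexer reaches the first use of each enum name.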
If you'd like to experiment, you could play with a SableCC 4 generated
lexer and try to adapt it to SableCC 3. The SableCC 4 lexer syntax has a
lookahead operator that could help you identify a case_identifier,
something like:
case_identifier = identifier Look blank* ':';
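
The effect of such a lookahead rule can be approximated by a pass over the
token stream. The sketch below is plain Java, not SableCC 4 syntax or API:
CaseLookahead, Kind, Tok and markCaseIdentifiers() are hypothetical names,
and it assumes that the lookahead only inspects the following blanks and
colon without consuming them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical token model; this is not SableCC 4 lexer syntax or API.
class CaseLookahead {
    enum Kind { IDENTIFIER, CASE_IDENTIFIER, BLANK, COLON, OTHER }

    static final class Tok {
        final Kind kind;
        final String text;

        Tok(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    // An identifier becomes a CASE_IDENTIFIER when only blanks separate it
    // from a ':'. The blanks and the colon are inspected but stay in the
    // output stream.
    static List<Tok> markCaseIdentifiers(List<Tok> tokens) {
        List<Tok> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            Tok t = tokens.get(i);
            if (t.kind == Kind.IDENTIFIER) {
                int j = i + 1;
                // Skip blank* in the lookahead.
                while (j < tokens.size() && tokens.get(j).kind == Kind.BLANK) {
                    j++;
                }
                if (j < tokens.size() && tokens.get(j).kind == Kind.COLON) {
                    t = new Tok(Kind.CASE_IDENTIFIER, t.text);
                }
            }
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tok> tokens = List.of(
                new Tok(Kind.IDENTIFIER, "foo"),
                new Tok(Kind.BLANK, " "),
                new Tok(Kind.COLON, ":"),
                new Tok(Kind.IDENTIFIER, "bar"));
        for (Tok t : markCaseIdentifiers(tokens)) {
            System.out.println(t.kind + " '" + t.text + "'");
        }
    }
}
```

With the label already tagged as a distinct token, the parser no longer needs
a second token of lookahead to tell a case label from the start of a statement.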
...Yeah, yeah, yeah, I know: I, too, needed SableCC 4 yesterday. If only
there were more than 24 hours in a day!
Etienne
--
Etienne M. Gagnon, Ph.D.
SableCC: http://sablecc.org