Hi Sebastian,

See below.

Looking at the SableCC error message, I see the problem. Your grammar is 
effectively not LR(1); it probably is LR(2).
But IIRC for these classes this relation holds:

  LR(0) < LR(1) = LR(2)

So it should be possible to transform my grammar, shouldn't it?

The set of languages that can be described by LR(1) grammars is indeed identical to the set of languages that can be described by LR(2) grammars. If I remember correctly, there is an algorithm to transform an LR(k) grammar (where k > 1) into an LR(1) grammar, but it could grow your grammar exponentially. In practical terms, the equivalent LR(1) grammar is unlikely to be attractive to work with. :(

1- You probably want to start each case with a keyword. This is why C and Java use 2 keywords: switch and case.
case_elem = case_keyword colon stmt+;
Unfortunately I can't change the language I'm trying to parse. It isn't defined 
by me.

There is another approach: Is it possible to feed symbol table information back 
to the SableCC lexer? My case labels are actually enums, but look syntactically 
like identifiers.
(E.g., for parsing the C expression (a)+b you have to know whether "a" is a type 
or a variable, because they result in different parse trees.)

You can play with the filter() methods of Lexer and Parser. But this is quite tricky to get right. :(
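
To make the idea concrete, here is a minimal standalone sketch of symbol-table feedback, written in Python rather than against the generated SableCC classes. The names filter_tokens, enum_names and the token kinds are hypothetical and only illustrate the principle: each token is inspected before the parser sees it, and an identifier that is already known to be an enum constant is handed over as a different token kind.

```python
from typing import Iterable, List, Set, Tuple

# A token is a (kind, text) pair. The kind names are assumptions for this sketch.
Token = Tuple[str, str]


def filter_tokens(tokens: Iterable[Token], enum_names: Set[str]) -> List[Token]:
    """Reclassify identifiers that name a known enum constant.

    enum_names stands in for the symbol table; it must already contain
    a name by the time a token with that text is filtered.
    """
    out: List[Token] = []
    for kind, text in tokens:
        if kind == "identifier" and text in enum_names:
            # The parser now sees a distinct token and needs no extra lookahead.
            kind = "case_identifier"
        out.append((kind, text))
    return out
```

The tricky part is the ordering: this only works if every enum is declared (and entered into the table) before its first use as a case label.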

If you would like to experiment, you could play with a SableCC 4 generated lexer and try to adapt it to SableCC 3. The SableCC 4 lexer syntax has a lookahead operator that could help you identify a case_identifier, something like:

  case_identifier = identifier Look blank* ':';
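
The effect of such a lookahead can be approximated with a regular-expression lookahead. The following is only a sketch in Python, not SableCC syntax or output; the token names, the identifier pattern, and the choice of space and tab as blanks are assumptions. An identifier becomes a case_identifier only when optional blanks and a ':' follow it, and the blanks and the colon are not consumed.

```python
import re
from typing import List, Tuple

# Alternatives are tried in order, so case_identifier must come before identifier.
TOKEN_RE = re.compile(
    r"""
    (?P<case_identifier>[A-Za-z_]\w*(?=[ \t]*:))   # identifier, then blanks and ':' ahead
  | (?P<identifier>[A-Za-z_]\w*)
  | (?P<colon>:)
  | (?P<semicolon>;)
  | (?P<blank>[ \t\r\n]+)
    """,
    re.VERBOSE,
)


def tokenize(text: str) -> List[Tuple[str, str]]:
    """Split text into (kind, text) tokens, dropping blanks."""
    tokens: List[Tuple[str, str]] = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "blank":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Note that this purely syntactic test also fires for any other construct where an identifier is directly followed by a colon, so it only helps if the language has no such construct.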


...Yeah, yeah, yeah, I know: I need SableCC 4 yesterday, too. If only there were more than 24 hours/day!

Etienne

--
Etienne M. Gagnon, Ph.D.
SableCC:                                            http://sablecc.org


