Hi Christoph,
So, you're working on an incremental PG. That's very interesting! Can
you tell us a few thing about it, yet? I wonder if it could be added as
one of the possible outputs of SableCC 4.x (given the flexible back-end)
eventually .
See below for my answers.
distance: | 1 | 2 | 3 | 4
- ----------------------------------------------
left: | a | c,a | d,a | a,d,$end
right: | c,a | d,a | a,d,$end | d,$end
As you can see, the intersection will never be empty. The problem lies
in the fact, that SableCC always computes the union of all lookaheads
(even the unambiguous ones) and thus creates more options then the grammar.
This is correct. SableCC computes /linear-approximate LALR(K)/ tables.
It will eventually expand the automaton, on a need basis, to accept
/linear-approximate LR(K)/ grammars, eliminating any LALR-specific
(unintuitive) conflict. The linear approximation significantly reduces
the computation time and size of lookahead information.
The K lookahead helps with some grammars, but the linear approximation
is a limitation for some tricky cases, like the one you are showing here.
In the example, a valid lookahead 2 table should look like:
aa | ac | ad | cd | ca
- ------------------------------------------
reduce | shift | reduce | reduce | reduce
That is an LALR(2) table, without linear-approximation. The exponential
worst-case size (and computation time) of such tables is the main reason
you don't find many LR(K) and LALR(K) products out there. They very
easily explode as soon as you start feeding them moderately-sized grammars.
Would that still be LALR? If yes, I could try to make a patch that fixes
this issue.
There is nothing to fix. The grammar is not linear-approximate
[LA]LR(K), for any K. The solution is to rewrite the grammar without
modifying the accepted language.
Let's try the usual inlining trick, to delay the shift/reduce decision
causing the conflict:
A = b a C? a d |
b C? a d |
C? a d;
C = c d?
This would seem to work. No conflict anymore.
SableCC 4 has (a not-implemented yet) inlining directive, and it will
suggest adding the appropriate inlining directive in the conflict error
message. You wouldn't need to rewrite your grammar, other than adding
the directive like this:
A = B? C? a d;
B = b a?;
C = c d?;
Inline
A.B;
SableCC 3 already implements inlining, but it is not user-controlled.
Unfortunately, this leads to some exponential explosion problems.
SableCC 4 will require explicit inlining directives, instead. Who knows;
it might help (that's what I hope), or it might result in user-caused
explosions, instead of automatic ones... :)
I hope this helps you!
Etienne
--
Etienne M. Gagnon, Ph.D.
SableCC: http://sablecc.org
_______________________________________________
SableCC-Discussion mailing list
[email protected]
http://lists.sablecc.org/listinfo/sablecc-discussion