Hi Christoph,

So, you're working on an incremental PG. That's very interesting! Can you tell us a few thing about it, yet? I wonder if it could be added as one of the possible outputs of SableCC 4.x (given the flexible back-end) eventually .

See below for my answers.

distance: | 1    | 2     | 3        | 4
- ----------------------------------------------
left:     | a    | c,a   | d,a      | a,d,$end

right:    | c,a  | d,a   | a,d,$end | d,$end


As you can see, the intersection will never be empty. The problem lies
in the fact, that SableCC always computes the union of all lookaheads
(even the unambiguous ones) and thus creates more options then the grammar.

This is correct. SableCC computes /linear-approximate LALR(K)/ tables. It will eventually expand the automaton, on a need basis, to accept /linear-approximate LR(K)/ grammars, eliminating any LALR-specific (unintuitive) conflict. The linear approximation significantly reduces the computation time and size of lookahead information.

The K lookahead helps with some grammars, but the linear approximation is a limitation for some tricky cases, like the one you are showing here.

In the example, a valid lookahead 2 table should look like:

aa     | ac     | ad     | cd     | ca
- ------------------------------------------
reduce | shift  | reduce | reduce | reduce

That is an LALR(2) table, without linear-approximation. The exponential worst-case size (and computation time) of such tables is the main reason you don't find many LR(K) and LALR(K) products out there. They very easily explode as soon as you start feeding them moderately-sized grammars.

Would that still be LALR? If yes, I could try to make a patch that fixes
this issue.

There is nothing to fix. The grammar is not linear-approximate [LA]LR(K), for any K. The solution is to rewrite the grammar without modifying the accepted language.

Let's try the usual inlining trick, to delay the shift/reduce decision causing the conflict:

A = b a C? a d |
    b C? a d |
    C? a d;
C = c d?

This would seem to work. No conflict anymore.

SableCC 4 has (a not-implemented yet) inlining directive, and it will suggest adding the appropriate inlining directive in the conflict error message. You wouldn't need to rewrite your grammar, other than adding the directive like this:

A = B? C? a d;
B = b a?;
C = c d?;

Inline
  A.B;

SableCC 3 already implements inlining, but it is not user-controlled. Unfortunately, this leads to some exponential explosion problems. SableCC 4 will require explicit inlining directives, instead. Who knows; it might help (that's what I hope), or it might result in user-caused explosions, instead of automatic ones... :)

I hope this helps you!

Etienne

--
Etienne M. Gagnon, Ph.D.
SableCC:                                            http://sablecc.org

_______________________________________________
SableCC-Discussion mailing list
[email protected]
http://lists.sablecc.org/listinfo/sablecc-discussion

Reply via email to