Re: LALR lookahead

Etienne M. Gagnon Sat, 22 Jan 2011 14:39:11 -0800

Hi Christoph,

So, you're working on an incremental PG. That's very interesting! Canyou tell us a few thing about it, yet? I wonder if it could be added asone of the possible outputs of SableCC 4.x (given the flexible back-end)eventually .


See below for my answers.

distance: | 1    | 2     | 3        | 4
- ----------------------------------------------
left:     | a    | c,a   | d,a      | a,d,$end

right:    | c,a  | d,a   | a,d,$end | d,$end


As you can see, the intersection will never be empty. The problem lies
in the fact, that SableCC always computes the union of all lookaheads
(even the unambiguous ones) and thus creates more options then the grammar.

This is correct. SableCC computes /linear-approximate LALR(K)/ tables.It will eventually expand the automaton, on a need basis, to accept/linear-approximate LR(K)/ grammars, eliminating any LALR-specific(unintuitive) conflict. The linear approximation significantly reducesthe computation time and size of lookahead information.

The K lookahead helps with some grammars, but the linear approximationis a limitation for some tricky cases, like the one you are showing here.

In the example, a valid lookahead 2 table should look like:

aa     | ac     | ad     | cd     | ca
- ------------------------------------------
reduce | shift  | reduce | reduce | reduce

That is an LALR(2) table, without linear-approximation. The exponentialworst-case size (and computation time) of such tables is the main reasonyou don't find many LR(K) and LALR(K) products out there. They veryeasily explode as soon as you start feeding them moderately-sized grammars.

Would that still be LALR? If yes, I could try to make a patch that fixes
this issue.

There is nothing to fix. The grammar is not linear-approximate[LA]LR(K), for any K. The solution is to rewrite the grammar withoutmodifying the accepted language.

Let's try the usual inlining trick, to delay the shift/reduce decisioncausing the conflict:


A = b a C? a d |
    b C? a d |
    C? a d;
C = c d?

This would seem to work. No conflict anymore.

SableCC 4 has (a not-implemented yet) inlining directive, and it willsuggest adding the appropriate inlining directive in the conflict errormessage. You wouldn't need to rewrite your grammar, other than addingthe directive like this:


A = B? C? a d;
B = b a?;
C = c d?;

Inline
  A.B;

SableCC 3 already implements inlining, but it is not user-controlled.Unfortunately, this leads to some exponential explosion problems.SableCC 4 will require explicit inlining directives, instead. Who knows;it might help (that's what I hope), or it might result in user-causedexplosions, instead of automatic ones... :)


I hope this helps you!

Etienne

--
Etienne M. Gagnon, Ph.D.
SableCC:                                            http://sablecc.org

_______________________________________________
SableCC-Discussion mailing list
[email protected]
http://lists.sablecc.org/listinfo/sablecc-discussion

Re: LALR lookahead

Reply via email to