Joel, Thanks for the prompt reply.
I've now given your code a much closer look. This behavior is another manifestation of a complex Bison GLR problem that I've known about for a while and am still working on. Unfortunately, I don't believe I've previously posted about it. It touches on many areas of the GLR implementation, and I just haven't found the time yet to properly describe what I know.
Ok. It would be useful to add a warning to the documentation that this kind of thing sometimes happens (at least for the moment).
Unfortunately, it seems like the most obvious solution will worsen the parser's time complexity. I'm exploring a more sophisticated solution, but I'm afraid it'll be a while before I return to it.
I can parse + other not very complicated stuff all the .c files in the gcc/postgresql/openafs/openmotif/netscape (original C version) source trees in under 30 cpu seconds. Having this increased to 60 seconds would not be a problem.
If your %merge functions eliminate subtrees, it appears that you are right: this bug actually does increase subtree sharing for your example input because it delays elimination until higher up.
For me this is not really a problem because it appears to be rare and I can ignore the leaked memory caused by not doing the free's. A bigger potential problem is that the high level merge has to do some analysis to handle the fact that the expected disambiguation did not happen in a lower level production. Since my project involves measuring C source characteristics a small amount of 'random' error can be handled (ie, I don't worry about doing a correct disambiguation). See www.knosof.co.uk/cbook/cbook.html
However, node sharing is a general problem in GLR parsers that can't always be solved by writing such %merge functions. Specifically, if any subtrees (such as tokens) exist on a stack before a split and appear on the RHS of reductions during the split, those subtrees will be shared by both stacks and you will have to be careful not to doubly free them when the stacks are merged.
Would I be correct to say this subtree sharing would not occur if the grammar was left recursive?
My usual solution is node reference counting. Until this bug is fixed, it looks like you will need that too. I suspect however that you'll eventually need reference counting regardless of this bug. You may even need it now for tokens.
My quick fix was to not free up any parse trees inside the merge functions. -- Derek M. Jones tel: +44 (0) 1252 520 667 Knowledge Software Ltd mailto:[EMAIL PROTECTED] Applications Standards Conformance Testing http://www.knosof.co.uk _______________________________________________ help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison