On Fri, Feb 1, 2019 at 6:56 AM Akim Demaille <[email protected]> wrote:
> Hi Balázs!
>
> > On 31 Jan 2019, at 19:15, Scheidler, Balázs <[email protected]> wrote:
> >
> > Hi,
> >
> > We, in the syslog-ng project (https://github.com/balabit/syslog-ng) have a
> > bison grammar file that contains a number of unused non-terminals. The
> > reasons for this is complicated, which I could explain if needed.
>
> Yes, out of curiosity I'd be happy to know why it makes sense for you
> (there's no plan to refuse grammars with useless symbols!).

The reason is that we use bison to parse portions of our configuration file,
but the configuration language is extended with plugins that are loaded at
runtime.

The solution is:
- we have a main grammar that supports the basic configuration language and
  a way to trigger on-demand loading of plugins
- the plugin also has a bison grammar that "includes" rules from the main
  grammar file

For this reason:
- there are rules in the main grammar that are only used by plugins, and
  these show up as unused rules when compiling the main grammar
- when we include these rules into the plugin grammar file, not all of the
  included rules will be used by a specific plugin.

This means that both the main grammar and the plugin grammars will have some
unused rules.

We use a homegrown Python script that grabs the reusable rules and adds them
to the plugin grammar during compilation.

I am happy to elaborate if more information is needed.

> > [...]
> > Based on my debugging I've found this root cause:
> >
> > - rules are parsed as part of the grammar, and get an associated symbol
> >   number
> > - the RHS of rules reference terminal and non-terminal symbols using a
> >   symbol number. These are resolved at grammar read time and the symbol
> >   number is generated into the output eventually making it to m4.
> > - at this point reduce_grammar() happens, this removes the unused
> >   non-terminal rules, causing symbols to be renumbered.
> > - this makes an effort to update all symbol number references, however
> >   RHS of rules is not updated.
> > - RHS of rules that reference "old" numbers that are higher than the
> >   maximum, cause those ugly m4 errors that you see above
> > - At the same time, in such a case an RHS expression can easily
> >   reference the wrong symbol, if they got renumbered. A different
> >   manifestation of the same bug, where dollar actions (e.g. $1, $2, etc)
> >   start to use an invalid <tag> to reference the value in YYSTYPE.
>
> Thank you for the careful analysis! Yes, you pinpointed the issue.
>
> For the record, something that is very useful to debug such issues is the
> --trace option. In the present case, --trace=muscle would reveal the
> generated symbol numbers. Comparing 3.2 and 3.3 is instructive.

I used --trace=muscles while trying to understand what bison does. I was also
reading its code, which I found pretty easy to read, by the way.

> > This was triggered in our code-base, because macOs brew updated to bison
> > 3.3.1 recently. If at all possible it would be great if this problem would
> > not spread too far (e.g. Debian). bison 3.2 still seems to work properly.
>
> I'll try to address this asap and release the fix immediately.
> Sorry about this issue.
>
> Our test suite is already quite big, but I regularly discover missing
> cases...

It's an uphill battle, but still a useful one. I find that tests (especially
if they are fast) give me a lot of confidence when cutting releases :)

Bazsi
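
For anyone following the thread, the renumbering problem analysed above can be illustrated with a small Python sketch. This is a toy model and not bison's code: the function `toy_reduce`, the `update_rhs` switch, and the five-rule grammar are all made up for illustration, and the model treats tokens and non-terminals uniformly, which bison does not.

```python
# Toy model of "remove unused non-terminals, then renumber symbols".
# Everything here is hypothetical; it only mirrors the bug description above.

# Symbol table: the list index is the symbol number.
SYMBOLS = ["NUM", "PLUS", "start", "plugin_only", "expr", "term"]

# Rules as (lhs_number, [rhs_numbers]).
RULES = [
    (2, [4]),        # start: expr
    (3, [0]),        # plugin_only: NUM   (unused in this grammar)
    (4, [4, 1, 5]),  # expr: expr PLUS term
    (4, [5]),        # expr: term
    (5, [0]),        # term: NUM
]


def toy_reduce(symbols, rules, start, update_rhs):
    """Drop symbols unreachable from `start` and renumber the survivors.

    With update_rhs=False the right-hand sides keep their old numbers,
    which imitates the stale references described in the analysis.
    """
    # Fixed-point computation of the symbols reachable from the start symbol.
    reachable = {start}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs in reachable:
                for sym in rhs:
                    if sym not in reachable:
                        reachable.add(sym)
                        changed = True

    # Assign new, dense numbers to the surviving symbols.
    remap = {}
    kept = []
    for old, name in enumerate(symbols):
        if old in reachable:
            remap[old] = len(kept)
            kept.append(name)

    # Rewrite the surviving rules; the LHS is always remapped.
    new_rules = []
    for lhs, rhs in rules:
        if lhs not in reachable:
            continue
        new_rhs = [remap[s] for s in rhs] if update_rhs else list(rhs)
        new_rules.append((remap[lhs], new_rhs))
    return kept, new_rules


kept, good = toy_reduce(SYMBOLS, RULES, start=2, update_rhs=True)
_, bad = toy_reduce(SYMBOLS, RULES, start=2, update_rhs=False)

for lhs, rhs in bad:
    names = [kept[s] if s < len(kept) else f"<out of range: {s}>" for s in rhs]
    print(kept[lhs], "->", " ".join(names))
```

In the `bad` result, the rule for `start` still refers to number 4, which now names `term` instead of `expr` (the "wrong symbol" case), and the first `expr` rule refers to number 5, which no longer exists (the "higher than the maximum" case).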
