Re: Bison C++ mid-rule value lost with variants

Akim Demaille Sun, 17 Jun 2018 07:02:46 -0700

Hi all,

> On 29 June 2017, at 15:55, Piotr Marcińczyk <[email protected]> wrote:
> 
> I supposedly found a bug in lalr1.cc skeleton with variant semantic type.
> When using mid-rule action { $<type>$ = value; } to return a value that
> will be used in further semantic actions, it appears that the value on
> stack becomes zero. Tested with Bison version 3.0.4.


Piotr’s grammar file includes:

%token <int> NUM
%%
expr:
  NUM
| expr { $<int>$ = 42; } '+' NUM { std::cout << $<int>2 << '\n'; };

and one can see that when run, $<int>2 is not 42, but 0.

My opinion on this is somewhat different from the ones that
have been expressed so far.  IMHO, it has no good reason
to work.

Yes, it works with plain old unions.  But that’s unsafe, and
that’s because you hide things from your tool (Bison).  I
personally see these typed accesses to the semantic
values ($<int>2) as no different from a cast (and actually,
that’s exactly what they are with unions).

Yes, it works with std::variant, or even std::any, when
used as a store for semantic values.  But we are still lying
to the tool.
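
To make the "cast" point concrete, here is a minimal C++17 sketch.  It is
not Bison output: UnionValue, CheckedValue, union_as_int and holds_int are
hypothetical names chosen for illustration.  The union accessor trusts the
caller, much like $<int>2, while std::variant at least records which
alternative it currently holds:

```cpp
#include <variant>

// Hypothetical stand-in for a %union-style semantic value.
union UnionValue {
    int ival;
    double dval;
};

// Comparable to $<int>N with a union: pick a member by name, with no
// check that it is the member that was last written.
int union_as_int(const UnionValue& v) { return v.ival; }

// Hypothetical stand-in for a semantic value stored in a std::variant.
using CheckedValue = std::variant<int, double>;

// The variant tracks its active alternative, so the question
// "is this really an int?" can be answered at run time.
bool holds_int(const CheckedValue& v) {
    return std::holds_alternative<int>(v);
}
```

In both cases only the C++ side knows the type; the generator is never told.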

I think this example shows that the design of mid-rule
actions has some loopholes; in particular, what is
critically missing is the type of their semantic value.  Since
Bison does not know the type of the semantic value of a mid-rule
action:
- the user must use $<int>$, not $$
- the destructor will not work
- the printer won’t work either
- the user must also be explicit when using the value in
  the other actions ($<int>2).
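
The consequences listed above can be sketched in plain C++.  This is an
illustration under assumptions, not the code the skeleton actually
generates: Kind, Slot, move_slot and print_slot are hypothetical names, and
the sketch assumes a generator that emits one case per declared value type.
A symbol whose type was never declared then has no case of its own, so its
value is neither carried over nor printed, which would be consistent with
the 0 seen in the report:

```cpp
#include <string>

// Hypothetical symbol kinds.  Untyped stands for a mid-rule action whose
// value type was never declared to the generator.
enum class Kind { Num, Untyped };

// A stack slot: the kind the generator knows about, plus a payload.
struct Slot {
    Kind kind;
    int payload;
};

// Moves a slot the way a per-type dispatcher might: one case per
// declared type, nothing for symbols without one.
Slot move_slot(const Slot& from) {
    Slot to{from.kind, 0};
    switch (from.kind) {
    case Kind::Num:
        to.payload = from.payload;
        break;
    case Kind::Untyped:
        // No declared type, so there is nothing the dispatcher can copy.
        break;
    }
    return to;
}

// Same story for a printer: it can only be selected by declared type.
std::string print_slot(const Slot& s) {
    switch (s.kind) {
    case Kind::Num:
        return std::to_string(s.payload);
    case Kind::Untyped:
        return "<no printer>";
    }
    return "";
}
```

A destructor selected by declared type would be skipped in the same way.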

Sure, using std::variant or std::any we avoid the first three issues,
but that’s still by hiding things from Bison and relying on
magic on the language side.  And obviously that works only for
C++, and actually modern C++.

I think we should rather provide typed mid-rule actions.

How about:

expr:
  NUM
| expr %type<int>{ $$ = 42; } '+' NUM { std::cout << $2 << '\n'; };

It’s not clear whether it should be prefix or postfix:

expr:
  NUM
| expr { $$ = 42; } %type<int> '+' NUM { std::cout << $2 << '\n'; };

The regular use of %type is prefix (%type <int> expr), but the
directives in rules are usually presented as postfix
(exp: "if" exp exp %prec "then"), although it is not mandatory.


Or go for a lighter syntax:

expr:
  NUM
| expr <int>{ $$ = 42; } '+' NUM { std::cout << $2 << '\n'; };

or

expr:
  NUM
| expr { $$ = 42; }<int> '+' NUM { std::cout << $2 << '\n'; };


Personally, I prefer the prefix forms, but they don’t blend
nicely with named references:

expr:
  NUM
| expr <int>{ $$ = 42; }[val] '+' NUM { std::cout << $val << '\n'; };

Thoughts?





I wish we had chosen a prefix syntax for named references, say

expr:
  NUM
| expr val={ $<int>$ = 42; } '+' NUM { std::cout << $<int>val << '\n'; };

I think we discussed this with Joel E. Denny, but I don’t remember
the details.  We could have used:

expr:
  NUM
| expr val<int>={ $$ = 42; } '+' NUM { std::cout << $val << '\n'; };
