Re: Bison C++ mid-rule value lost with variants

2018-11-28 Thread Akim Demaille



> On 17 June 2018, at 16:33, Rici Lake wrote:
> 
> Although unrelated to this proposal, I would also favour allowing
> 
> %%
> <type> nonterminal: rhs
> 
> as an alternative to
> 
> %type <type> nonterminal
> %%
> nonterminal: rhs

FTR, the more I contemplate this idea, the more I like it.

Unfortunately we hit again the classical S/R conflict of the
naive YACC grammar: in a grammar such as

%%

exp
: "number"
| exp "+" exp { $$ = $1 + $3; }
| exp "*" exp { $$ = $1 * $3; }
| "(" exp ")" { $$ = $2; }
| "["  { $$ = 42; } "]"  { $$ = $2; }
;

a single lookahead will not suffice for the parser to know,
when it's reading the rhs of a rule, whether the <tag> coming
is that of a midrule action, so that we stay in the right-hand side
of the rule, or the tag of the lhs of the next rule.

GLR would of course solve the problem.  But I'm not eager to
sit Bison on top of GLR right now (we use LAC for instance,
not implemented in glr.c).


Re: Bison C++ mid-rule value lost with variants

2018-08-28 Thread Frank Heckenbach
Akim Demaille wrote:

> > IMHO, "active types" aren't really the problem (as std::variant or
> > an equivalent implementation can handle them), but indeed it's too
> > low-level and error-prone, though that would apply to all skeletons
> > (but of course, dropping it from C is impossible because of
> > backward-compatibility).
> 
> I agree.  There's just one place I don't know too well how to
> address, that for $-1, $-2, etc.  But again that's because as
> of today Bison is too limited to recover the type.

I wasn't even aware of $-1, just read about it in the manual. How is
this reconcilable with proper LR parsing? Is it really needed for
some grammars or could it be dropped too (at least in C++)?

> You seem to have a bottom-up approach: if I use this as a
> semantic value, what can I do with it.  I look at this the
> other way round.  Everything I know about LR shows that this
> is not needed, so I'll avoid it, _unless_, I'm given a counter
> example.

Not really bottom-up. I saw that Bison supported $<>, but buggy, and
tried to figure out if I can fix the bugs which std::variant allowed
me to do. Apparently you didn't see it as supported, but buggy, but
rather as unsupported.

> I'm teaching (well, I used to teach now) LR parsing, and I don't
> want my students (well, former) to have the impression I'm fooling
> them by using something different.

I thought the user actions were rather separate from parsing theory,
but I'll take your word for it. And we agree about dropping $<>
anyway. :)

> >>> When I did the coding, std::variant actually simplified things for
> >>> me (e.g., I could avoid adding move support in Bison's variant
> >>> implementation), so if I were you, I'd probably use it even if I
> >>> dropped $<>, but if you want to avoid its small runtime overhead,
> >>> that seems possible.
> >> 
> >> I don't think we can afford to simply drop all C++s pre 17.
> > 
> > As I wrote before, there are std::variant implementations for C++14
> > (which I'm actually using mostly so far) and I think also C++11
> > (I haven't tested this).
> 
> Yes, I know.  But then, I expect license nightmares to ship
> them, and also by pre 17, I also mean 98/03.

Both mparks's and Boost's implementation are released under the
Boost Software License, described on gnu.org as "[...] a lax,
permissive non-copyleft free software license, compatible with the
GNU GPL", so I see no big problems here, for both free and non-free
software developers.

As for pre-11, personally I'm not very interested (I shied away from
C++ for a long time, and IMHO it's only become useable with C++11),
but I see your point of view. So you'll need to make (internal)
moving dependent on the compiler version then. Maybe you can define
a template that does std::move for C++11 and NOP for older
compilers, to avoid sprinkling the code with ifdefs.
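
A minimal sketch of such a helper, assuming the name
move_if_supported (hypothetical, not something Bison provides): it
forwards to std::move when the compiler reports C++11 or newer, and
returns the object unchanged otherwise, so callers need no ifdefs of
their own.

```cpp
#include <cassert>
#include <string>
#if 201103L <= __cplusplus
# include <utility>
#endif

// Hypothetical helper; the name is an assumption for illustration only.
#if 201103L <= __cplusplus
// C++11 and newer: turn the lvalue into an rvalue so it can be moved from.
template <typename T>
T&& move_if_supported(T& value) { return std::move(value); }
#else
// Older compilers: a no-op that hands back the same lvalue (it gets copied).
template <typename T>
T& move_if_supported(T& value) { return value; }
#endif
```

Either way the call site reads the same, e.g.
"std::string b = move_if_supported (a);".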

> > But I've come to rely on some of the features I implemented there.
> > AFAICS, you haven't commented on them yet, so I don't know how you
> > think about them. If you really object to them (or equivalent
> > features), I might prefer to keep using my skeletons.
> 
> I didn't answer because I have already too many threads open,
> and I prefer to stay kinda focused on the issues I plan to address
> in the short term, keeping the rest for later.  I wish I could
> do more :/  And I'm sorry to keep you waiting.

No problem. As I said, for now I won't do much about it anyway. But
please understand that I'd like to have clarity about those issues
before I'll do substantial work.

> > - pre-action for empty rules with a user action: $$ is initialized
> >  to the default value of the correct type. With completely static
> >  variants, this might even be natural (or may need to call the
> >  default constructor in a switch), and should be officially
> >  documented.
> > 
> >  (In contrast, for non-empty rules with a user action, I
> >  pre-initialize $$ to an invalid variant, so if the user action
> >  forgets to set $$, a bad_variant_access will happen on access
> >  which may catch some errors in user actions. This won't be
> >  possible with static variants, but I can live without that.)
> 
> Never thought about this before.
> 
> Do you happen to have test cases?

No isolated ones (my parsers contain a number of them). Here's a
slightly simplified extract to build a formal parameter list:

  using TFormalParameters = vector <pair <TType, string>>;

  %type <TFormalParameters> fpar_list

  fpar_list:
      type_name identifier                { $$.emplace_back ($1, $2); }
    | fpar_list ',' type_name identifier  { ($$ = $1).emplace_back ($3, $4); };

As a side note, this also relies on automatic moving for all $n in
the example -- let's assume TType is a move-only type (it isn't
actually, but I have other rules with move-only types and don't want
to have too many different examples here), and also for performance
(string, vector).
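
To make the two actions concrete outside of a grammar, here is a
sketch in plain C++. The TType below is a hypothetical move-only
stand-in, and the functions first_parameter and next_parameter are
invented names that only mimic what the skeleton would do with a
default-initialized $$ and automatically moved $n; none of this is
generated code.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical move-only stand-in for TType.
struct TType {
  std::string name;
  explicit TType(std::string n) : name(std::move(n)) {}
  TType(TType&&) = default;
  TType& operator=(TType&&) = default;
  TType(const TType&) = delete;
  TType& operator=(const TType&) = delete;
};

using TFormalParameters = std::vector<std::pair<TType, std::string>>;

// fpar_list: type_name identifier
// Mirrors `$$.emplace_back ($1, $2)`; `result` plays the role of a
// default-initialized (empty) $$.
TFormalParameters first_parameter(TType type_name, std::string identifier) {
  TFormalParameters result;
  result.emplace_back(std::move(type_name), std::move(identifier));
  return result;
}

// fpar_list: fpar_list ',' type_name identifier
// Mirrors `($$ = $1).emplace_back ($3, $4)` with every $n moved.
TFormalParameters next_parameter(TFormalParameters list, TType type_name,
                                 std::string identifier) {
  TFormalParameters result;
  (result = std::move(list)).emplace_back(std::move(type_name),
                                          std::move(identifier));
  return result;
}
```

Without the moves, neither function would compile for a move-only
TType, which is the point of the side note.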

The first action assumes that $$ is default-initialized. Now I
realize I could also write it like this without 

Re: Bison C++ mid-rule value lost with variants

2018-08-28 Thread Frank Heckenbach
Hans Åberg wrote:

> > On 27 Aug 2018, at 22:10, Akim Demaille  wrote:
> > 
> >> Most of my porting work, apart from writing the new skeletons, was
> >> general grammar cleanup and conversion of semantic types from raw
> >> pointers and containers to smart pointers and other RAII classes
> >> (which was my main goal of the port, of course), and changes in the
> >> lexer (dropping flex, but that's another story).
> > 
> > I fought a lot with Flex, but it works ok in C++ too with lalr1.cc.
> > I have one parser here, 
> > https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/dot,
> > and another there 
> > https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/rat
> > for instance, using Flex.
> 
> That is probably versions before 2.6; the yyin and yyout have been
> changed in the C++ header so that they are no longer pointers, so
> it is not only incompatible with the header of older versions, but
> also with the code it writes, resulting in the issue [1].
> 
> 1. 
> https://stackoverflow.com/questions/34438023/openfoam-flex-yyin-rdbufstdcin-rdbuf-error

Though this wasn't actually my problem, I'll reply to this mail
rather than the main thread to keep it separate from the actual
Bison discussion.

For a start, I didn't have very good experience communicating with
Flex maintainer(s?) who seemed rather nonchalant WRT gcc warnings
etc. in the generated code, so over the years I'd been adjusting
various warning-suppression gcc options or doing dirty #define
tricks to avoid warnings, or sometimes even post-processing the
generated lexer with sed.

But the final straw was when, after changing to C++ Bison, I wanted
to switch to C++ Flex too and found this beautiful comment:

/* The c++ scanner is a mess. The FlexLexer.h header file relies on the
 * following macro. This is required in order to pass the c++-multiple-scanners
 * test in the regression suite. We get reports that it breaks inheritance.
 * We will address this in a future release of flex, or omit the C++ scanner
 * altogether. */

I know there are no guarantees in the future of free software
(neither of non-free software, of course), but such an
announcement/threat seemed too risky to me.

Meanwhile I'd often thought that all Flex actually does is matching
alternative regular expressions. Plain RE can do that as well, and
by capturing subexpressions I can find out which alternative was
matched.

Of course, it would be (and indeed turned out to be) somewhat slower (RE
built at runtime vs. compile time), but like parsing, lexing speed
is not a big issue to me. So I was ready to trade that in for
convenience of programming and one less dependence on a problematic
tool.

(Side note: Many years ago, on a different project, I dropped gperf
to recognize predefined identifiers for similar reasons, and put
them in a look-up table instead. Except for a tiny slowdown, that
had worked out well, so I was confident I could drop Flex, too. --
Now apparently the next one in line after dropping gperf and Flex
should be Bison, but don't worry, I don't see an easy way to replace
it, since Bison actually does some nontrivial stuff. :)

So I wrote a small library that builds that massive RE out of single
rules and maps subexpressions back to rules (even in the case that
rules contain subexpressions of their own), and that works for me.
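
The idea can be sketched with std::regex roughly as follows. This is
not the library mentioned above; Rule, RegexLexer and match are
hypothetical names. Each rule is wrapped in its own capture group,
and mark_count() is used to skip over groups that the rule itself
contains. One difference from Flex to keep in mind: ECMAScript
alternation takes the first alternative that matches, not the
longest one, so rule order matters more.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical rule: a name plus an ECMAScript regular expression.
struct Rule {
  std::string name;
  std::string pattern;
};

class RegexLexer {
public:
  explicit RegexLexer(const std::vector<Rule>& rules) : rules_(rules) {
    std::string combined;
    unsigned next_group = 1;  // group 0 is the whole match
    for (const Rule& r : rules_) {
      if (!combined.empty()) combined += '|';
      combined += '(' + r.pattern + ')';
      group_.push_back(next_group);
      // Skip this rule's wrapper group and any groups inside the rule.
      next_group += 1 + std::regex(r.pattern).mark_count();
    }
    re_.assign(combined);
  }

  // Tries to match a rule exactly at `pos`. Returns the rule name and
  // stores the match length, or returns "" if no rule matches there.
  std::string match(const std::string& input, std::size_t pos,
                    std::size_t& length) const {
    std::smatch m;
    if (!std::regex_search(input.begin() + pos, input.end(), m, re_,
                           std::regex_constants::match_continuous))
      return "";
    for (std::size_t i = 0; i < rules_.size(); ++i)
      if (m[group_[i]].matched) {
        length = static_cast<std::size_t>(m.length(0));
        return rules_[i].name;
      }
    return "";
  }

private:
  std::vector<Rule> rules_;
  std::vector<unsigned> group_;  // wrapper group index of each rule
  std::regex re_;
};
```

The regular expression is compiled once at construction, which is
where the runtime cost compared to a generated scanner comes from.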

Regards,
Frank



Re: Bison C++ mid-rule value lost with variants

2018-08-27 Thread Hans Åberg


> On 27 Aug 2018, at 22:10, Akim Demaille  wrote:
> 
>> Most of my porting work, apart from writing the new skeletons, was
>> general grammar cleanup and conversion of semantic types from raw
>> pointers and containers to smart pointers and other RAII classes
>> (which was my main goal of the port, of course), and changes in the
>> lexer (dropping flex, but that’s another story).
> 
> I fought a lot with Flex, but it works ok in C++ too with lalr1.cc.
> I have one parser here, 
> https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/dot,
> and another there 
> https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/rat
> for instance, using Flex.

That is probably versions before 2.6; the yyin and yyout have been changed in 
the C++ header so that they are no longer pointers, so it is not only 
incompatible with the header of older versions, but also with the code it 
writes, resulting in the issue [1].

1. 
https://stackoverflow.com/questions/34438023/openfoam-flex-yyin-rdbufstdcin-rdbuf-error





Re: Bison C++ mid-rule value lost with variants

2018-08-27 Thread Akim Demaille
Hi Frank!

Thanks for the feedback about 3.1!

> On 27 Aug 2018, at 00:39, Frank Heckenbach wrote:
> 
> Akim Demaille wrote:
> 
>>> On 19 June 2018, at 00:27, Frank Heckenbach wrote:
>> 
>> Yes, indeed.  $<> is really too low level a feature when
>> it comes to 'active types' such as we have in C++ compared
>> to C.  So we should not strive to have it work for dubious
>> cases (a type mismatch between the declared type and the one
>> passed to $<>), and allow to not use it (typed mid-rule actions).
> 
> I agree with dropping $<>, though not quite for the same reasons.

I’m not sure our reasons are _so_ different.  My view is that
the problems we had with variants and midrule actions show that
midrule actions are half-baked.  In the model of the parser
there is no need for a std-style variant, because the type is always
known.  Implementations of parsers in type-rich languages
(those with dependent types) do not need such an approach.

So rather than adding something I saw nowhere else in the implementation
of parsers, I’d rather fix the input.


> IMHO, "active types" aren't really the problem (as std::variant or
> an equivalent implementation can handle them), but indeed it's too
> low-level and error-prone, though that would apply to all skeletons
> (but of course, dropping it from C is impossible because of
> backward-compatibility).

I agree.  There’s just one place I don’t know too well how to
address, that for $-1, $-2, etc.  But again that’s because as
of today Bison is too limited to recover the type.


>> I'm willing to discourage the use of $<> in all the outputs,
>> but forbid them when there's really no way to support them.
> 
> Depends on your definition of "no way" (see above). ;)

You seem to have a bottom-up approach: if I use this as a
semantic value, what can I do with it.  I look at this the
other way round.  Everything I know about LR shows that this
is not needed, so I’ll avoid it, _unless_, I’m given a counter
example.

I’m teaching (well, I used to teach now) LR parsing, and I don’t
want my students (well, former) to have the impression I’m fooling
them by using something different.



> I saw your discussion with Hans about the calc++ examples. Cleaning
> that up as suggested certainly helps here.

Victor gave good ideas too.


> Most of my porting work, apart from writing the new skeletons, was
> general grammar cleanup and conversion of semantic types from raw
> pointers and containers to smart pointers and other RAII classes
> (which was my main goal of the port, of course), and changes in the
> lexer (dropping flex, but that’s another story).

I fought a lot with Flex, but it works ok in C++ too with lalr1.cc.
I have one parser here, 
https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/dot,
and another there 
https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/rat
for instance, using Flex.


> Otherwise, AFAIR, the biggest changes were the different "%token"
> and "%type" declarations with types instead of union members (which
> made things easier),

Yes!  I made this for variants, but even for unions it’s a clear
improvement (IMHO).


> rewriting the interface to the parser, largely
> changing (and in some cases, implementing :) the "%define"s in the
> grammar file, and some changes to my Makefiles due to the additional
> generated files. -- Actually, all of these changes were large in a
> relative sense, but small in an absolute sense,

:) :) :)

> as all those parts
> of the code (parser interface, defines, Makefile, etc.) are rather
> short (at least in my projects). So they may be more of a
> psychological hurdle, since at first glance it all looks completely
> different, and since one needs that before one can actually get
> started with the new skeleton.
> 
> I fear existing projects using the C skeleton differ too widely to
> offer a detailed porting guide for all (or most) cases, but some
> easy to use examples may help here (see above), if they show the
> important features (not too many to avoid confusing users, but not
> too few so they're actually useful to many; it's probably a delicate
> balance).

I’ll try to see if I can come up with at least a short guideline.


>>> When I did the coding, std::variant actually simplified things for
>>> me (e.g., I could avoid adding move support in Bison's variant
>>> implementation), so if I were you, I'd probably use it even if I
>>> dropped $<>, but if you want to avoid its small runtime overhead,
>>> that seems possible.
>> 
>> I don't think we can afford to simply drop all C++s pre 17.
> 
> As I wrote before, there are std::variant implementations for C++14
> (which I'm actually using mostly so far) and I think also C++11
> (I haven’t tested this).

Yes, I know.  But then, I expect license nightmares to ship
them, and also by pre 17, I also mean 98/03.



>> You did wonders with your C++17 skeleton, and it would be
>> great to port your effort in the current framework.  Would
>> you 

Re: Bison C++ mid-rule value lost with variants

2018-08-26 Thread Frank Heckenbach
Akim Demaille wrote:

> > On 19 June 2018, at 00:27, Frank Heckenbach wrote:
> > 
> > Akim Demaille wrote:
> > 
> >> Well, you can use $<> wherever you want, in regular actions
> >> too.
> > 
> > And that's just as unsafe and hiding things from Bison. From the
> > rest of your mail I now see you want to get rid of $<> completely in
> > C++. I hadn't gathered this from your previous mail.
> 
> Yes, indeed.  $<> is really too low level a feature when
> it comes to 'active types' such as we have in C++ compared
> to C.  So we should not strive to have it work for dubious
> cases (a type mismatch between the declared type and the one
> passed to $<>), and allow to not use it (typed mid-rule actions).

I agree with dropping $<>, though not quite for the same reasons.
IMHO, "active types" aren't really the problem (as std::variant or
an equivalent implementation can handle them), but indeed it's too
low-level and error-prone, though that would apply to all skeletons
(but of course, dropping it from C is impossible because of
backward-compatibility).

Declared types on mid-rule actions are safer in the normal case
(and if exotic uses of $<> will stop working, I wouldn't mind -- one
should just declare a more suitable type then, possibly itself a
variant if one does such complex things in the actions).

> I'm willing to discourage the use of $<> in all the outputs,
> but forbid them when there's really no way to support them.

Depends on your definition of "no way" (see above). ;)

> > Though it does help when moving from C to C++ which I did recently
> > (though my code was actually C++ the whole time, I had used the C
> > skeleton before). If I had used $<>, and it was not available in the
> > C++ skeleton, it would have been another hurdle at this point. In my
> > case, I'm talking doubly hypothetically; for other users (as the
> > original report indicates, there are people who use it) it may
> > become relevant, but some kind of porting guide may help.
> 
> That's a good idea.  But I'm not sure what it should cover :)

I saw your discussion with Hans about the calc++ examples. Cleaning
that up as suggested certainly helps here.

Most of my porting work, apart from writing the new skeletons, was
general grammar cleanup and conversion of semantic types from raw
pointers and containers to smart pointers and other RAII classes
(which was my main goal of the port, of course), and changes in the
lexer (dropping flex, but that's another story).

Otherwise, AFAIR, the biggest changes were the different "%token"
and "%type" declarations with types instead of union members (which
made things easier), rewriting the interface to the parser, largely
changing (and in some cases, implementing :) the "%define"s in the
grammar file, and some changes to my Makefiles due to the additional
generated files. -- Actually, all of these changes were large in a
relative sense, but small in an absolute sense, as all those parts
of the code (parser interface, defines, Makefile, etc.) are rather
short (at least in my projects). So they may be more of a
psychological hurdle, since at first glance it all looks completely
different, and since one needs that before one can actually get
started with the new skeleton.

I fear existing projects using the C skeleton differ too widely to
offer a detailed porting guide for all (or most) cases, but some
easy to use examples may help here (see above), if they show the
important features (not too many to avoid confusing users, but not
too few so they're actually useful to many; it's probably a delicate
balance).

> > When I did the coding, std::variant actually simplified things for
> > me (e.g., I could avoid adding move support in Bison's variant
> > implementation), so if I were you, I'd probably use it even if I
> > dropped $<>, but if you want to avoid its small runtime overhead,
> > that seems possible.
> 
> I don't think we can afford to simply drop all C++s pre 17.

As I wrote before, there are std::variant implementations for C++14
(which I'm actually using mostly so far) and I think also C++11
(I haven't tested this).

> You did wonders with your C++17 skeleton, and it would be
> great to port your effort in the current framework.  Would
> you contribute to that?

For the rest of this year I'll be quite busy (as you can see from
this late reply), but next year I might have some time to work on
Bison.

However, I've ported my bigger parsers to my new skeletons and use
them actively. So I have a solution that works for me, and the
slight overhead of storing both the static (by the skeleton) and
dynamic (by std::variant) types is no issue to me.

But I've come to rely on some of the features I implemented there.
AFAICS, you haven't commented on them yet, so I don't know how you
think about them. If you really object to them (or equivalent
features), I might prefer to keep using my skeletons.

These are especially the following features (described in more
detail in my original 

Re: Bison C++ mid-rule value lost with variants

2018-08-12 Thread Akim Demaille
Hi Frank,

Sorry, I missed that message.  I found it while exploring the
mid-rule action fiasco with variants.

> On 19 June 2018, at 00:27, Frank Heckenbach wrote:
> 
> Akim Demaille wrote:
> 
>> Well, you can use $<> wherever you want, in regular actions
>> too.
> 
> And that's just as unsafe and hiding things from Bison. From the
> rest of your mail I now see you want to get rid of $<> completely in
> C++. I hadn’t gathered this from your previous mail.

Yes, indeed.  $<> is really too low level a feature when
it comes to ‘active types’ such as we have in C++ compared
to C.  So we should not strive to have it work for dubious
cases (a type mismatch between the declared type and the one
passed to $<>), and allow to not use it (typed mid-rule actions).


> Though it does help when moving from C to C++ which I did recently
> (though my code was actually C++ the whole time, I had used the C
> skeleton before). If I had used $<>, and it was not available in the
> C++ skeleton, it would have been another hurdle at this point. In my
> case, I'm talking doubly hypothetically; for other users (as the
> original report indicates, there are people who use it) it may
> become relevant, but some kind of porting guide may help.

That’s a good idea.  But I’m not sure what it should cover :)


>> We tried to eliminate these runtime problems and make them
>> compile-time as much as possible.  A typical example
>> is the symbol constructors in C++, which forbid that in the
>> scanner you declare an INT and set yylval->float_val.
> 
> Provided one uses them. Currently, this is not enforced (in fact,
> Piotr’s grammar didn't), so not strictly forbidden.

And I don’t want to forbid them.  And since symbol constructors
appeared late, backward compatibility forbids that we require
them.  Yet I think they are the proper way to do it, so I promote
them.

> FWIW, I wouldn't
> mind strictly forbidding it (maybe by making other constructors
> private and adding friends as necessary or whatever is required).

If it were to be designed today, I would do that.



>> I'm sorry if I gave the impression I would not provide support
>> for modern C++, that's definitely not my point.  I want to
>> avoid _requiring_ it, but, if __cplusplus is modern enough,
>> we absolutely should support move semantics!  I'm focused on this
>> issue now just because I'm trying to catch up!  And it seems to
>> me that it shows we don't need to require std::variant.
> 
> If you're willing to drop $<> completely in C++ (both in mid-rule
> and regular actions), it’s probably possible to avoid std::variant.

I’m willing to discourage the use of $<> in all the outputs,
but forbid them when there’s really no way to support them.


> When I did the coding, std::variant actually simplified things for
> me (e.g., I could avoid adding move support in Bison's variant
> implementation), so if I were you, I'd probably use it even if I
> dropped $<>, but if you want to avoid its small runtime overhead,
> that seems possible.

I don’t think we can afford to simply drop all C++s pre 17.


You did wonders with your C++17 skeleton, and it would be
great to port your effort in the current framework.  Would
you contribute to that?


Re: Bison C++ mid-rule value lost with variants

2018-06-18 Thread Frank Heckenbach
Akim Demaille wrote:

> >> Piotr's grammar file includes:
> >> 
> >> %token <int> NUM
> >> %%
> >> expr:
> >>  NUM
> >> | expr { $<int>$ = 42; } '+' NUM { std::cout << $<int>2 << '\n'; };
> >> 
> >> and one can see that when run, $<int>2 is not 42, but 0.
> >> 
> >> My opinion on this is somewhat different from the ones that
> >> have been expressed so far.  IMHO, it has no good reason
> >> to work.
> >> 
> >> Yes, it works with plain old unions.  But that's unsafe, and
> >> that's because you hide things from your tool (Bison).
> > 
> > AFAIK, that's the only purpose of the $<> syntax in Bison which has
> > been around for I don't know how long. So claiming it has no reason
> > to work now seems a bit odd to me.
> 
> Well, you can use $<> wherever you want, in regular actions
> too.

And that's just as unsafe and hiding things from Bison. From the
rest of your mail I now see you want to get rid of $<> completely in
C++. I hadn't gathered this from your previous mail.

> And some people are doing nasty things with it in C,
> which forced us, for backward compatibility with YACC, to keep
> weird code.
> 
> And as you know, YACC does not support C++, and obviously
> not (Bison) variants.  So to expect a « feature » from C to
> naturally work for C++ is not so straightforward.

Though it does help when moving from C to C++ which I did recently
(though my code was actually C++ the whole time, I had used the C
skeleton before). If I had used $<>, and it was not available in the
C++ skeleton, it would have been another hurdle at this point. In my
case, I'm talking doubly hypothetically; for other users (as the
original report indicates, there are people who use it) it may
become relevant, but some kind of porting guide may help.

> We tried to eliminate these runtime problems and make them
> compile-time as much as possible.  A typical example
> is the symbol constructors in C++, which forbid that in the
> scanner you declare an INT and set yylval->float_val.

Provided one uses them. Currently, this is not enforced (in fact,
Piotr's grammar didn't), so not strictly forbidden. FWIW, I wouldn't
mind strictly forbidding it (maybe by making other constructors
private and adding friends as necessary or whatever is required).

> > What's more important for me, and the reason I worked on this, are
> > other features, most importantly move semantics. Of course, they
> > also require modern C++ (i.e., C++11 or newer), so if that's a
> > problem, I'll have to keep my own fork anyway.
> 
> I'm sorry if I gave the impression I would not provide support
> for modern C++, that's definitely not my point.  I want to
> avoid _requiring_ it, but, if __cplusplus is modern enough,
> we absolutely should support move semantics!  I'm focused on this
> issue now just because I'm trying to catch up!  And it seems to
> me that it shows we don't need to require std::variant.

If you're willing to drop $<> completely in C++ (both in mid-rule
and regular actions), it's probably possible to avoid std::variant.

When I did the coding, std::variant actually simplified things for
me (e.g., I could avoid adding move support in Bison's variant
implementation), so if I were you, I'd probably use it even if I
dropped $<>, but if you want to avoid its small runtime overhead,
that seems possible.

Regards,
Frank



Re: Bison C++ mid-rule value lost with variants

2018-06-18 Thread Akim Demaille



> On 18 June 2018, at 15:26, Frank Heckenbach wrote:
> 
> Akim Demaille wrote:

Hi Frank,

>> Piotr's grammar file includes:
>> 
>> %token <int> NUM
>> %%
>> expr:
>>  NUM
>> | expr { $<int>$ = 42; } '+' NUM { std::cout << $<int>2 << '\n'; };
>> 
>> and one can see that when run, $<int>2 is not 42, but 0.
>> 
>> My opinion on this is somewhat different from the ones that
>> have been expressed so far.  IMHO, it has no good reason
>> to work.
>> 
>> Yes, it works with plain old unions.  But that's unsafe, and
>> that's because you hide things from your tool (Bison).
> 
> AFAIK, that's the only purpose of the $<> syntax in Bison which has
> been around for I don't know how long. So claiming it has no reason
> to work now seems a bit odd to me.

Well, you can use $<> wherever you want, in regular actions
too.  And some people are doing nasty things with it in C,
which forced us, for backward compatibility with YACC, to keep
weird code.

And as you know, YACC does not support C++, and obviously
not (Bison) variants.  So to expect a « feature » from C to
naturally work for C++ is not so straightforward.

>> I personally see these typed accesses to the semantical
>> values ($2) are no different from a cast (and actually,
>> that's exactly what they are with unions).
> 
> I don't think so. C unions involve a hidden cast only when accessing
> a different member than was set, but that's not the case here.

Yes, I agree.  But the code does not know that.  And I mean
_statically_, not at runtime.

We tried to eliminate these runtime problems and make them
compile-time as much as possible.  A typical example
is the symbol constructors in C++, which forbid that in the
scanner you declare an INT and set yylval->float_val.

Here, it’s just another instance of the same class of issues.
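
As a toy illustration of what the symbol constructors buy us (this
is not the code Bison generates; Token, make_INT and make_FLOAT are
hypothetical names), the kind and the value can only be set
together, so the scanner has no way to announce an INT while
filling in the float member:

```cpp
#include <cassert>

// Toy token type: the only way to build one is through a typed factory.
class Token {
public:
  enum Kind { INT, FLOAT };

  // Each factory sets the kind and the matching union member together.
  static Token make_INT(int v) { Token t(INT); t.value_.i = v; return t; }
  static Token make_FLOAT(float v) { Token t(FLOAT); t.value_.f = v; return t; }

  Kind kind() const { return kind_; }

  // Accessors check the kind before touching the union.
  int as_int() const { assert(kind_ == INT); return value_.i; }
  float as_float() const { assert(kind_ == FLOAT); return value_.f; }

private:
  // Private constructor: no token exists without going through a factory.
  explicit Token(Kind k) : kind_(k) {}

  Kind kind_;
  union { int i; float f; } value_;
};
```

With a raw union, nothing prevents the equivalent of "kind = INT"
followed by "value.f = ..."; here that mistake cannot be written.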


> Apparently (and reasonably) this case is not supposed to work with
> $<>, so WRT whether or not to support $<> at all, I see no
> fundamental difference between C and C++, so if dropping support in
> C++, it would only be consistent to do it in C as well, and that
> might be nearly impossible due to backward-compatibility (or perhaps
> even Yacc compatibility, I don't know).

We are bound to POSIX for YACC, and POSIX forces us to accept

%union {
  int ival;
  float fval;
  char* sval;
}
%token <ival> NUM
foo: NUM NUM NUM { $1; $<fval>2; $<sval>3; }

which completely breaks the typing system of Bison.  If I were
to choose, I would forbid this and force the user to write

foo: NUM NUM NUM { $1; (float)$2; (const char*)$3; }

but the decision to allow this was made decades ago :)
At a time C functions were not even typed.

Since then people tend to be rigorous with types.


>> Sure, using std::variant or std::any we avoid the first three issues,
>> but that's still by hiding things from Bison and relying on
>> magic on the language side.  And obviously that works only for
>> C++, and actually modern C++.
> 
> As I wrote sometime before, I currently don't use any
> semantic-valued mid-rule actions, so I'm basically neutral on this
> proposal. In principle, I think a properly declared type for them
> seems reasonable, I just think this proposal is a few decades late.

I agree.



> What's more important for me, and the reason I worked on this, are
> other features, most importantly move semantics. Of course, they
> also require modern C++ (i.e., C++11 or newer), so if that's a
> problem, I'll have to keep my own fork anyway.

I’m sorry if I gave the impression I would not provide support
for modern C++, that’s definitely not my point.  I want to
avoid _requiring_ it, but, if __cplusplus is modern enough,
we absolutely should support move semantics!  I’m focused on this
issue now just because I’m trying to catch up!  And it seems to
me that it shows we don’t need to require std::variant.


Re: Bison C++ mid-rule value lost with variants

2018-06-18 Thread Frank Heckenbach
Akim Demaille wrote:

> Piotr's grammar file includes:
> 
> %token <int> NUM
> %%
> expr:
>   NUM
> | expr { $<int>$ = 42; } '+' NUM { std::cout << $<int>2 << '\n'; };
> 
> and one can see that when run, $<int>2 is not 42, but 0.
> 
> My opinion on this is somewhat different from the ones that
> have been expressed so far.  IMHO, it has no good reason
> to work.
> 
> Yes, it works with plain old unions.  But that's unsafe, and
> that's because you hide things from your tool (Bison).

AFAIK, that's the only purpose of the $<> syntax in Bison which has
been around for I don't know how long. So claiming it has no reason
to work now seems a bit odd to me.

> I personally see these typed accesses to the semantic
> values ($<int>2) as no different from a cast (and actually,
> that's exactly what they are with unions).

I don't think so. C unions involve a hidden cast only when accessing
a different member than was set, but that's not the case here.

C++ variants would throw in this case, and Bison's variants would
assert or invoke undefined behavior.
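
To make the difference concrete, here is a minimal sketch using C++17
std::variant. The SemanticValue alias and the read_as_int helper are made
up for illustration; they are not part of Bison or of any skeleton.

```cpp
#include <string>
#include <variant>

// Hypothetical stand-in for a parser's semantic value; not Bison's own type.
using SemanticValue = std::variant<int, std::string>;

// Reads the value as an int. A mismatch is reported to the caller instead
// of the stored bytes being silently reinterpreted, as a C union would do.
bool read_as_int(const SemanticValue& v, int& out) {
  try {
    out = std::get<int>(v);  // throws std::bad_variant_access on mismatch
    return true;
  } catch (const std::bad_variant_access&) {
    return false;
  }
}
```

With a value that was set as an int, read_as_int succeeds; with one that
was set as a std::string, the access throws and the helper returns false.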

Apparently (and reasonably) this case is not supposed to work with
$<>, so WRT whether or not to support $<> at all, I see no
fundamental difference between C and C++, so if dropping support in
C++, it would only be consistent to do it in C as well, and that
might be nearly impossible due to backward-compatibility (or perhaps
even Yacc compatibility, I don't know).

> Sure, using std::variant or std::any we avoid the first three issues,
> but that's still by hiding things from Bison and relying on
> magic on the language side.  And obviously that works only for
> C++, and actually modern C++.

As I wrote sometime before, I currently don't use any
semantic-valued mid-rule actions, so I'm basically neutral on this
proposal. In principle, I think a properly declared type for them
seems reasonable, I just think this proposal is a few decades late.

What's more important for me, and the reason I worked on this, are
other features, most importantly move semantics. Of course, they
also require modern C++ (i.e., C++11 or newer), so if that's a
problem, I'll have to keep my own fork anyway.

Regards,
Frank



Re: Bison C++ mid-rule value lost with variants

2018-06-17 Thread Hans Åberg


> On 17 Jun 2018, at 16:02, Akim Demaille  wrote:

> Or go for a lighter syntax...

Indeed.

> expr:
>  NUM
> | expr { $$ = 42; } '+' NUM { std::cout << $2 << '\n'; };

> Personally, I prefer the prefix forms, but they don’t blend
> nicely with named references:
> 
> expr:
>  NUM
> | expr <int>{ $$ = 42; }[val] '+' NUM { std::cout << $val << '\n'; };

This is in fact consistent with the order in the other declarations:
 .

> I wish we had chosen a prefix syntax for named references, say
> 
> expr:
>  NUM
> | expr val={ $$ = 42; } '+' NUM { std::cout << $val << '\n'; };

If the type is in the variable, it implies a runtime variant cast, which one 
might want for some reason.

(Just some bystander inputs.)





Re: Bison C++ mid-rule value lost with variants

2018-06-17 Thread Rici Lake
I enthusiastically support this proposal.

I agree with the preference for prefix positioning. The `%type` keyword is
just noise, imho, and thus unnecessary.

I like the `name={ code }` syntax, too, but it's probably too late
for that. Perhaps `{ code }[name]` would be a plausible alternative
syntax, although I'd still prefer `<type>` in prefix position.

Although unrelated to this proposal, I would also favour allowing

%%
<type> nonterminal: rhs

as an alternative to

%type <type> nonterminal
%%
nonterminal: rhs


Rici

On Sun, Jun 17, 2018, 09:02 Akim Demaille  wrote:

> Hi all,
>
> > On 29 June 2017 at 15:55, Piotr Marcińczyk wrote:
> >
> > I supposedly found a bug in lalr1.cc skeleton with variant semantic type.
> > When using mid-rule action { $$ = value; } to return a value that
> > will be used in further semantic actions, it appears that the value on
> > stack becomes zero. Tested with Bison version 3.0.4.
>
> Piotr’s grammar file includes:
>
> %token <int> NUM
> %%
> expr:
>   NUM
> | expr { $<int>$ = 42; } '+' NUM { std::cout << $<int>2 << '\n'; };
>
> and one can see that when run, $<int>2 is not 42, but 0.
>
> My opinion on this is somewhat different from the ones that
> have been expressed so far.  IMHO, it has no good reason
> to work.
>
> Yes, it works with plain old unions.  But that’s unsafe, and
> that’s because you hide things from your tool (Bison).  I
> personally see these typed accesses to the semantic
> values ($<int>2) as no different from a cast (and actually,
> that’s exactly what they are with unions).
>
> Yes, it works with C++ variants, or even std::any, when
> used as a store for semantical values.  But we are still lying
> to the tool.
>
> I think this example shows that the design of mid-rule
> actions has some loopholes, and in particular something which is
> critically missing is the type of its semantic value.  Since
> Bison does not know the type of the semantic value of a mid-rule
> action:
> - the user must use $<type>$, not $$
> - the destructor will not work
> - the printer won’t work either
> - the user must also be explicit when using the value in
>   the other actions ($<type>2).
>
> Sure, using std::variant or std::any we avoid the first three issues,
> but that’s still by hiding things from Bison and relying on
> magic on the language side.  And obviously that works only for
> C++, and actually modern C++.
>
> I think we should rather provide typed mid-rule actions.
>
> How about:
>
> expr:
>   NUM
> | expr %type<int>{ $$ = 42; } '+' NUM { std::cout << $2 << '\n'; };
>
> It’s not clear whether it should be prefix or postfix
>
> expr:
>   NUM
> | expr { $$ = 42; } %type<int> '+' NUM { std::cout << $2 << '\n'; };
>
> The regular use of %type is prefix (%type <int> expr), but the
> directives in rules are usually presented as postfix
> (exp: "if" exp exp %prec "then"), although it is not mandatory.
>
>
> Or go for a lighter syntax
>
> expr:
>   NUM
> | expr <int>{ $$ = 42; } '+' NUM { std::cout << $2 << '\n'; };
>
> or
>
> expr:
>   NUM
> | expr { $$ = 42; }<int> '+' NUM { std::cout << $2 << '\n'; };
>
>
> Personally, I prefer the prefix forms, but they don’t blend
> nicely with named references:
>
> expr:
>   NUM
> | expr <int>{ $$ = 42; }[val] '+' NUM { std::cout << $val << '\n'; };
>
> Thoughts?
>
>
>
>
>
> I wish we had chosen a prefix syntax for named references, say
>
> expr:
>   NUM
> | expr val={ $$ = 42; } '+' NUM { std::cout << $val << '\n'; };
>
> I think we discussed this with Joel E. Denny, but I don’t remember
> the details. We could have used
>
> expr:
>   NUM
> | expr val={ $$ = 42; } '+' NUM { std::cout << $val << '\n'; };
>
>
>
>
>


Re: Bison C++ mid-rule value lost with variants

2018-06-17 Thread Akim Demaille
Hi all,

> On 29 June 2017 at 15:55, Piotr Marcińczyk wrote:
> 
> I supposedly found a bug in lalr1.cc skeleton with variant semantic type.
> When using mid-rule action { $$ = value; } to return a value that
> will be used in further semantic actions, it appears that the value on
> stack becomes zero. Tested with Bison version 3.0.4.

Piotr’s grammar file includes:

%token <int> NUM
%%
expr:
  NUM
| expr { $<int>$ = 42; } '+' NUM { std::cout << $<int>2 << '\n'; };

and one can see that when run, $<int>2 is not 42, but 0.

My opinion on this is somewhat different from the ones that
have been expressed so far.  IMHO, it has no good reason
to work.

Yes, it works with plain old unions.  But that’s unsafe, and
that’s because you hide things from your tool (Bison).  I
personally see these typed accesses to the semantic
values ($<int>2) as no different from a cast (and actually,
that’s exactly what they are with unions).

Yes, it works with C++ variants, or even std::any, when
used as a store for semantical values.  But we are still lying
to the tool.

I think this example shows that the design of mid-rule
actions has some loopholes, and in particular something which is
critically missing is the type of its semantic value.  Since
Bison does not know the type of the semantic value of a mid-rule
action:
- the user must use $<type>$, not $$
- the destructor will not work
- the printer won’t work either
- the user must also be explicit when using the value in
  the other actions ($<type>2).

Sure, using std::variant or std::any we avoid the first three issues,
but that’s still by hiding things from Bison and relying on
magic on the language side.  And obviously that works only for
C++, and actually modern C++.

I think we should rather provide typed mid-rule actions.

How about:

expr:
  NUM
| expr %type<int>{ $$ = 42; } '+' NUM { std::cout << $2 << '\n'; };

It’s not clear whether it should be prefix or postfix

expr:
  NUM
| expr { $$ = 42; } %type<int> '+' NUM { std::cout << $2 << '\n'; };

The regular use of %type is prefix (%type <int> expr), but the
directives in rules are usually presented as postfix
(exp: "if" exp exp %prec "then"), although it is not mandatory.


Or go for a lighter syntax

expr:
  NUM
| expr <int>{ $$ = 42; } '+' NUM { std::cout << $2 << '\n'; };

or

expr:
  NUM
| expr { $$ = 42; }<int> '+' NUM { std::cout << $2 << '\n'; };


Personally, I prefer the prefix forms, but they don’t blend
nicely with named references:

expr:
  NUM
| expr <int>{ $$ = 42; }[val] '+' NUM { std::cout << $val << '\n'; };

Thoughts?





I wish we had chosen a prefix syntax for named references, say

expr:
  NUM
| expr val={ $$ = 42; } '+' NUM { std::cout << $val << '\n'; };

I think we discussed this with Joel E. Denny, but I don’t remember
the details. We could have used

expr:
  NUM
| expr val={ $$ = 42; } '+' NUM { std::cout << $val << '\n'; };






Re: Bison C++ mid-rule value lost with variants

2018-04-08 Thread Frank Heckenbach
Hans Åberg wrote:

> > On 29 Jun 2017, at 15:55, Piotr Marcinczyk  wrote:
> > 
> > I supposedly found a bug in lalr1.cc skeleton with variant semantic type.
> 
> You might check if std::variant, of C++17, can be used instead. Cf.
>   http://en.cppreference.com/w/cpp/utility/variant
> 
> > When using mid-rule action { $$ = value; } to return a value that
> > will be used in further semantic actions, it appears that the value on
> > stack becomes zero. Tested with Bison version 3.0.4.

This answer may come a bit late, but I had the same problem (and
others) recently, so I wrote a new Bison skeleton using std::variant
which, as Hans said, solves this problem.

You can find it here:

http://lists.gnu.org/archive/html/bug-bison/2018-04/msg00011.html

To use the new skeleton, change in parser.y:

-%skeleton "lalr1.cc"
+%skeleton "lalr1-c++17.cc"

Since tokens are now of std::variant type, compiling requires C++17,
e.g. gcc-7 with "--std=c++17" option, and tokens are built a bit
differently, so change in lexer.flex (or use make_int):

-yylval->build<int>(atoi(yytext));
+yylval->emplace<int>(atoi(yytext));
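
For reference, a standalone sketch of what such an emplace call does on a
plain std::variant. SemanticValue and store_number are hypothetical names
used only here; the skeleton generates its own semantic value type.

```cpp
#include <cstdlib>
#include <string>
#include <variant>

// Hypothetical semantic value type, assumed for this sketch only.
using SemanticValue = std::variant<std::monostate, int, std::string>;

// Mirrors the lexer action: convert the matched text and store it as the
// int alternative, replacing whatever the variant held before.
void store_number(SemanticValue& yylval, const char* yytext) {
  yylval.emplace<int>(std::atoi(yytext));
}
```

After the call the variant holds an int, so std::get<int> on it succeeds,
while std::get<std::string> would throw std::bad_variant_access.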

Regards,
Frank



Re: Bison C++ mid-rule value lost with variants

2017-06-29 Thread Hans Åberg

> On 29 Jun 2017, at 15:55, Piotr Marcińczyk  wrote:
> 
> I supposedly found a bug in lalr1.cc skeleton with variant semantic type.

You might check if std::variant, of C++17, can be used instead. Cf.
  http://en.cppreference.com/w/cpp/utility/variant

> When using mid-rule action { $$ = value; } to return a value that
> will be used in further semantic actions, it appears that the value on
> stack becomes zero. Tested with Bison version 3.0.4.

This happens also if moved out to a separate rule, perhaps overwritten by the 
token semantic value. So it does not look safe for use.




Re: Bison C++ mid-rule value lost with variants

2017-06-29 Thread Piotr Marcińczyk
Thanks for the workaround. Actually, I used marker tokens with actions
getting value from stack before the rule, e.g.:

%type <int> AnswerToLifeMarker

expr: expr AnswerToLifeMarker '+' NUM { std::cout << $2 << std::endl; };

AnswerToLifeMarker: { usePreviousExpr($0); $$ = 42; };

That's because I need a way to create and store labels for later jumps
between statements in generated code. So one global value is not
sufficient. I would need a separate stack for that. I tried this approach,
and the code seemed more error-prone and less maintainable.
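
A rough sketch of what such a separate label stack might look like, to
show where the maintenance burden comes from. LabelStack and its members
are hypothetical names, not taken from the actual code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: hands out fresh labels and remembers the ones that
// enclosing statements still need for their jumps.
class LabelStack {
 public:
  // Creates a new label and pushes it for the statement being entered.
  std::string push() {
    labels_.push_back("L" + std::to_string(next_id_++));
    return labels_.back();
  }

  // Label of the innermost statement that is still open.
  const std::string& top() const {
    assert(!labels_.empty());
    return labels_.back();
  }

  // Must be called exactly once per push(), when the statement is done.
  void pop() {
    assert(!labels_.empty());
    labels_.pop_back();
  }

  bool empty() const { return labels_.empty(); }

 private:
  std::vector<std::string> labels_;
  int next_id_ = 0;
};
```

Every push() in one action has to be matched by a pop() in another, by
hand, whereas values kept on the parser's own stack are popped by the
parser itself when the rule is reduced.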


Best regards,

  Piotr Marcińczyk

2017-06-29 20:07 GMT+02:00 Kaz Kylheku :

> On 29.06.2017 06:55, Piotr Marcińczyk wrote:
>
>> I supposedly found a bug in lalr1.cc skeleton with variant semantic type.
>> When using mid-rule action { $$ = value; } to return a value that
>> will be used in further semantic actions, it appears that the value on
>> stack becomes zero. Tested with Bison version 3.0.4.
>>
>
> You probably need your code to work with Bison installations
> that don't have a fix for this (which is all of them, currently,
> except maybe yours).
>
> Could the workaround be as simple as:
>
>/* plain old local variable, injected into yyparse */
>
>{ int whatever = 42; } + NUM { std::cout << whatever << std::endl; }
>
> Also, you can have a scratch space in your parser structure
> for carrying a value from one mid-rule action to another.
>
>/* assumes %parse-param{your_parser_type *parser} */
>
>{ parser->stash = 42; } + NUM { std::cout << parser->stash << std::endl; }
>
> I've done this latter sort of thing.
>
> The parser structure is basically your "this" object in the entire
> parse job, and yyparse is its "method". :)
>
> Cheers ...
>
>


Re: Bison C++ mid-rule value lost with variants

2017-06-29 Thread Kaz Kylheku

On 29.06.2017 06:55, Piotr Marcińczyk wrote:
> I supposedly found a bug in lalr1.cc skeleton with variant semantic type.
> When using mid-rule action { $$ = value; } to return a value that
> will be used in further semantic actions, it appears that the value on
> stack becomes zero. Tested with Bison version 3.0.4.


You probably need your code to work with Bison installations
that don't have a fix for this (which is all of them, currently,
except maybe yours).

Could the workaround be as simple as:

   /* plain old local variable, injected into yyparse */

   { int whatever = 42; } + NUM { std::cout << whatever << std::endl; }

Also, you can have a scratch space in your parser structure
for carrying a value from one mid-rule action to another.

   /* assumes %parse-param{your_parser_type *parser} */

   { parser->stash = 42; } + NUM { std::cout << parser->stash << std::endl; }


I've done this latter sort of thing.

The parser structure is basically your "this" object in the entire
parse job, and yyparse is its "method". :)
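
Outside of a grammar file, the stash idea can be sketched as plain C++.
ParserContext, mid_rule_action, final_action and run_rule are made-up
names; the two functions merely stand in for the two actions above.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical parser context, handed to the parser via %parse-param.
struct ParserContext {
  int stash = 0;  // scratch slot shared by the actions of one rule
};

// Stands in for the mid-rule action: { parser->stash = 42; }
void mid_rule_action(ParserContext* parser) { parser->stash = 42; }

// Stands in for the final action, which prints the stashed value.
void final_action(const ParserContext* parser, std::ostream& out) {
  out << parser->stash << '\n';
}

// Runs both actions in the order the parser would and returns the output.
std::string run_rule() {
  ParserContext ctx;
  std::ostringstream out;
  mid_rule_action(&ctx);
  final_action(&ctx, out);
  return out.str();
}
```

Note that a single slot like this gets overwritten if the same rule is
entered again before the final action has run, so it only fits cases
where one value at a time is enough.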

Cheers ...