Re: Bison C++ mid-rule value lost with variants
> On 17 June 2018 at 16:33, Rici Lake wrote:
>
> Although unrelated to this proposal, I would also favour allowing
>
> %%
> nonterminal: rhs
>
> as an alternative to
>
> %type nonterminal
> %%
> nonterminal: rhs

FTR, the more I contemplate this idea, the more I like it. Unfortunately we again hit the classical S/R conflict of the naive YACC grammar. In a grammar such as

%%
exp : "number"
    | exp "+" exp { $$ = $1 + $3; }
    | exp "*" exp { $$ = $1 * $3; }
    | "(" exp ")" { $$ = $2; }
    | "[" { $$ = 42; } "]" { $$ = $2; }
    ;

a single lookahead token does not suffice for the parser to know, while it is reading the rhs of a rule, whether the coming tag is that of a midrule action (so we stay in the right-hand side of the current rule) or the tag of the lhs of the next rule.

GLR would of course solve the problem. But I'm not eager to sit Bison on top of GLR right now (we use LAC for instance, which is not implemented in glr.c).
Re: Bison C++ mid-rule value lost with variants
Akim Demaille wrote:

> > IMHO, "active types" aren't really the problem (as std::variant or
> > an equivalent implementation can handle them), but indeed it's too
> > low-level and error-prone, though that would apply to all skeletons
> > (but of course, dropping it from C is impossible because of
> > backward-compatibility).
>
> I agree. There's just one place I don't know too well how to
> address, that for $-1, $-2, etc. But again that's because as
> of today Bison is too limited to recover the type.

I wasn't even aware of $-1; I just read about it in the manual. How is this reconcilable with proper LR parsing? Is it really needed for some grammars, or could it be dropped too (at least in C++)?

> You seem to have a bottom-up approach: if I use this as a
> semantic value, what can I do with it. I look at this the
> other way round. Everything I know about LR shows that this
> is not needed, so I'll avoid it, _unless_, I'm given a counter
> example.

Not really bottom-up. I saw that Bison supported $<>, albeit with bugs, and tried to figure out whether I could fix the bugs, which std::variant allowed me to do. Apparently you saw it not as supported but buggy, but rather as unsupported.

> I'm teaching (well, I used to teach now) LR parsing, and I don't
> want my students (well, former) to have the impression I'm fooling
> them by using something different.

I thought the user actions were rather separate from parsing theory, but I'll take your word for it. And we agree about dropping $<> anyway. :)

> >>> When I did the coding, std::variant actually simplified things for
> >>> me (e.g., I could avoid adding move support in Bison's variant
> >>> implementation), so if I were you, I'd probably use it even if I
> >>> dropped $<>, but if you want to avoid its small runtime overhead,
> >>> that seems possible.
> >>
> >> I don't think we can afford to simply drop all C++s pre 17.
> >
> > As I wrote before, there are std::variant implementations for C++14
> > (which I'm actually using mostly so far) and I think also C++11
> > (I haven't tested this).
>
> Yes, I know. But then, I expect license nightmares to ship
> them, and also by pre 17, I also mean 98/03.

Both mparks's and Boost's implementations are released under the Boost Software License, described on gnu.org as "[...] a lax, permissive non-copyleft free software license, compatible with the GNU GPL", so I see no big problems here, for both free and non-free software developers.

As for pre-11, personally I'm not very interested (I shied away from C++ for a long time, and IMHO it's only become usable with C++11), but I see your point of view. So you'll need to make (internal) moving dependent on the compiler version then. Maybe you can define a template that does std::move for C++11 and is a no-op for older compilers, to avoid sprinkling the code with ifdefs.

> > But I've come to rely on some of the features I implemented there.
> > AFAICS, you haven't commented on them yet, so I don't know how you
> > think about them. If you really object to them (or equivalent
> > features), I might prefer to keep using my skeletons.
>
> I didn't answer because I have already too many threads open,
> and I prefer to stay kinda focused on the issues I plan to address
> in the short term, keeping the rest for later. I wish I could
> do more :/ And I'm sorry to keep you waiting.

No problem. As I said, for now I won't do much about it anyway. But please understand that I'd like to have clarity about those issues before I do substantial work.

> > - pre-action for empty rules with a user action: $$ is initialized
> >   to the default value of the correct type. With completely static
> >   variants, this might even be natural (or may need to call the
> >   default constructor in a switch), and should be officially
> >   documented.
> >
> >   (In contrast, for non-empty rules with a user action, I
> >   pre-initialize $$ to an invalid variant, so if the user action
> >   forgets to set $$, a bad_variant_access will happen on access
> >   which may catch some errors in user actions. This won't be
> >   possible with static variants, but I can live without that.)
>
> Never thought about this before.
>
> Do you happen to have test cases?

No isolated ones (my parsers contain a number of them). Here's a slightly simplified extract that builds a formal parameter list:

using TFormalParameters = vector >;

%type fpar_list

fpar_list:
    type_name identifier
      { $$ .emplace_back ($1, $2); }
  | fpar_list ',' type_name identifier
      { ($$ = $1).emplace_back ($3, $4); };

As a side note, this also relies on automatic moving for all $n in the example -- let's assume TType is a move-only type (it isn't actually, but I have other rules with move-only types and don't want to have too many different examples here), and also for performance (string, vector).

The first action assumes that $$ is default-initialized. Now I realize I could also write it like this without
Re: Bison C++ mid-rule value lost with variants
Hans Åberg wrote:

> > On 27 Aug 2018, at 22:10, Akim Demaille wrote:
> >
> >> Most of my porting work, apart from writing the new skeletons, was
> >> general grammar cleanup and conversion of semantic types from raw
> >> pointers and containers to smart pointers and other RAII classes
> >> (which was my main goal of the port, of course), and changes in the
> >> lexer (dropping flex, but that's another story).
> >
> > I fought a lot with Flex, but it works ok in C++ too with lalr1.cc.
> > I have one parser here,
> > https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/dot,
> > and another there
> > https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/rat
> > for instance, using Flex.
>
> That is probably versions before 2.6; the yyin and yyout have been
> changed in the C++ header so that they are no longer pointers, so
> it is not only incompatible with the header of older versions, but
> also with the code it writes, resulting in the issue [1].
>
> 1. https://stackoverflow.com/questions/34438023/openfoam-flex-yyin-rdbufstdcin-rdbuf-error

Though this wasn't actually my problem, I'll reply to this mail rather than the main thread to keep it separate from the actual Bison discussion.

For a start, I didn't have very good experience communicating with the Flex maintainer(s?), who seemed rather nonchalant WRT gcc warnings etc. in the generated code, so over the years I'd been adjusting various warning-suppression gcc options or doing dirty #define tricks to avoid warnings, or sometimes even post-processing the generated lexer with sed.

But the final straw was when, after changing to C++ Bison, I wanted to switch to C++ Flex too and found this beautiful comment:

/* The c++ scanner is a mess. The FlexLexer.h header file relies on the
 * following macro. This is required in order to pass the c++-multiple-scanners
 * test in the regression suite. We get reports that it breaks inheritance.
 * We will address this in a future release of flex, or omit the C++ scanner
 * altogether.
 */

I know there are no guarantees about the future of free software (nor of non-free software, of course), but such an announcement/threat seemed too risky to me.

Meanwhile I'd often thought that all Flex actually does is match alternative regular expressions. A plain RE can do that as well, and by capturing subexpressions I can find out which alternative was matched. Of course, it would be (and indeed turned out to be) somewhat slower (RE built at runtime vs. compile time), but like parsing, lexing speed is not a big issue to me. So I was ready to trade that in for convenience of programming and one less dependency on a problematic tool.

(Side note: Many years ago, on a different project, I dropped gperf to recognize predefined identifiers for similar reasons, and put them in a look-up table instead. Except for a tiny slowdown, that had worked out well, so I was confident I could drop Flex, too. -- Now apparently the next one in line after dropping gperf and Flex should be Bison, but don't worry, I don't see an easy way to replace it, since Bison actually does some nontrivial stuff. :)

So I wrote a small library that builds that massive RE out of single rules and maps subexpressions back to rules (even in the case that rules contain subexpressions of their own), and that works for me.

Regards,
Frank
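
The approach Frank describes (one big alternation, with capture groups mapped back to rules) can be illustrated roughly as follows. This is only a minimal sketch built on std::regex, not Frank's library: the class name RegexLexer, the (token id, pattern) rule format, and the {-1, ""} "no match" result are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not Frank's actual library): joins per-rule
// regular expressions into a single alternation and maps the capture
// group that matched back to the rule it came from.
class RegexLexer {
public:
    // Each rule is a (token id, ECMAScript pattern) pair.
    explicit RegexLexer(const std::vector<std::pair<int, std::string>>& rules) {
        assert(!rules.empty());
        std::string combined;
        std::size_t next_group = 1;  // group 0 is always the whole match
        for (const auto& rule : rules) {
            if (!combined.empty()) combined += '|';
            combined += '(' + rule.second + ')';
            group_to_token_.emplace_back(next_group, rule.first);
            // Skip this rule's wrapper group plus any groups the rule
            // contains itself, so later group numbers stay in sync.
            next_group += 1 + std::regex(rule.second).mark_count();
        }
        regex_ = std::regex(combined);
    }

    // Try to match one token at the very start of `input`.
    // Returns {token id, lexeme}, or {-1, ""} if no rule matches there.
    std::pair<int, std::string> match(const std::string& input) const {
        std::smatch m;
        if (std::regex_search(input, m, regex_,
                              std::regex_constants::match_continuous)) {
            for (const auto& gt : group_to_token_)
                if (m[gt.first].matched)
                    return {gt.second, m[gt.first].str()};
        }
        return {-1, ""};
    }

private:
    std::regex regex_;
    // (capture group number of the rule's wrapper, token id)
    std::vector<std::pair<std::size_t, int>> group_to_token_;
};
```

Note that with ECMAScript alternation the first alternative that matches wins, whereas Flex prefers the longest match, so rule order matters more in such a scheme than in a Flex specification.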
Re: Bison C++ mid-rule value lost with variants
> On 27 Aug 2018, at 22:10, Akim Demaille wrote:
>
>> Most of my porting work, apart from writing the new skeletons, was
>> general grammar cleanup and conversion of semantic types from raw
>> pointers and containers to smart pointers and other RAII classes
>> (which was my main goal of the port, of course), and changes in the
>> lexer (dropping flex, but that’s another story).
>
> I fought a lot with Flex, but it works ok in C++ too with lalr1.cc.
> I have one parser here,
> https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/dot,
> and another there
> https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/rat
> for instance, using Flex.

That is probably versions before 2.6; the yyin and yyout have been changed in the C++ header so that they are no longer pointers, so it is not only incompatible with the header of older versions, but also with the code it writes, resulting in the issue [1].

1. https://stackoverflow.com/questions/34438023/openfoam-flex-yyin-rdbufstdcin-rdbuf-error
Re: Bison C++ mid-rule value lost with variants
Hi Frank! Thanks for the feedback about 3.1! > Le 27 août 2018 à 00:39, Frank Heckenbach a écrit : > > Akim Demaille wrote: > >>> Le 19 juin 2018 à 00:27, Frank Heckenbach a écrit >>> : >> >> Yes, indeed. $<> is really too low level a feature when >> it is comes to 'active types' such as we have in C++ compared >> to C. So we should not strive to have it work for dubious >> cases (a type mismatch between the declared type and the one >> passed to $<>), and allow to not use it (typed mid-rule actions). > > I agree with dropping $<>, though not quite for the same reasons. I’m not sure our reasons are _so_ different. My view is that the problem we had with variants and midrule action show that midrule actions are improperly baked. In the model of the parser there is no need for std style variant, because the type is always known. The implementation of parsers in type rich languages (those with dependent types) do not need such an approach. So rather than adding something I saw nowhere else in the implementation of parsers, I’d rather fix the input. > IMHO, "active types" aren't really the problem (as std::variant or > an equivalent implementation can handle them), but indeed it's too > low-level and error-prone, though that would apply to all skeletons > (but of course, dropping it from C is impossible because of > backward-compatibility). I agree. There’s just one place I don’t know too well how to address, that for $-1, $-2, etc. But again that’s because as of today Bison is too limited to recover the type. >> I'm willing to discourage the use of $<> in all the outputs, >> but forbid them when there's really no way to support them. > > Depends on your definition of "no way" (see above). ;) You seem to have a bottom-up approach: if I use this as a semantic value, what can I do with it. I look at this the other way round. Everything I know about LR shows that this is not needed, so I’ll avoid it, _unless_, I’m given a counter example. 
I’m teaching (well, I used to teach now) LR parsing, and I don’t want my students (well, former) to have the impression I’m fooling them by using something different. > I saw your discussion with Hans about the calc++ examples. Cleaning > that up as suggested certainly helps here. Victor gave good ideas too. > Most of my porting work, apart from writing the new skeletons, was > general grammar cleanup and conversion of semantic types from raw > pointers and containers to smart pointers and other RAII classes > (which was my main goal of the port, of course), and changes in the > lexer (dropping flex, but that’s another story). I fought a lot with Flex, but it works ok in C++ too with lalr1.cc. I have one parser here, https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/dot, and another there https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/rat for instance, using Flex. > Otherwise, AFAIR, the biggest changes were the different "%token" > and "%type" declarations with types instead of union members (which > made things easier), Yes! I made this for variants, but even for unions it’s a clear improvement (IMHO). > rewriting the interface to the parser, largely > changing (and in some cases, implementing :) the "%define"s in the > grammar file, and some changes to my Makefiles due to the additional > generated files. -- Actually, all of these changes were large in a > relative sense, but small in an absolute sense, :) :) :) > as all those parts > of the code (parser interface, defines, Makefile, etc.) are rather > short (at least in my projects). So they may be more of a > psychological hurdle, since at first glance it all looks completely > different, and since one needs that before one can actually get > started with the new skeleton. 
> > I fear existing projects using the C skeleton differ too widely to > offer a detailed porting guide for all (or most) cases, but some > easy to use examples may help here (see above), if they show the > important features (not too many to avoid confusing users, but not > too few so they're actually useful to many; it's probably a delicate > balance). I’ll try to see if I can come up with at least a short guideline. >>> When I did the coding, std::variant actually simplified things for >>> me (e.g., I could avoid adding move support in Bison's variant >>> implementation), so if I were you, I'd probably use it even if I >>> dopped $<>, but if you want to avoid its small runtime overhead, >>> that seems possible. >> >> I don't think we can afford to simply drop all C++s pre 17. > > As I wrote before, there are std::variant implementations for C++14 > (which I'm actually using mostly so far) and I think also C++11 > (I haven’t tested this). Yes, I know. But then, I expect license nightmares to ship them, and also by pre 17, I also mean 98/03. >> You did wonders with your C++17 skeleton, and it would be >> great to port your effort in the current framework. Would >> you
Re: Bison C++ mid-rule value lost with variants
Akim Demaille wrote: > > Le 19 juin 2018 à 00:27, Frank Heckenbach a écrit > > : > > > > Akim Demaille wrote: > > > >> Well, you can use $<> wherever you want, in regular actions > >> too. > > > > And that's just as unsafe and hiding things from Bison. From the > > rest of your mail I now see you want to get rid of $<> completely in > > C++. I hadn't gathered this from your previous mail. > > Yes, indeed. $<> is really too low level a feature when > it is comes to 'active types' such as we have in C++ compared > to C. So we should not strive to have it work for dubious > cases (a type mismatch between the declared type and the one > passed to $<>), and allow to not use it (typed mid-rule actions). I agree with dropping $<>, though not quite for the same reasons. IMHO, "active types" aren't really the problem (as std::variant or an equivalent implementation can handle them), but indeed it's too low-level and error-prone, though that would apply to all skeletons (but of course, dropping it from C is impossible because of backward-compatibility). Declared types on mid-rule actions are safer in the normal case (and if exotic uses of $<> will stop working, I wouldn't mind -- one should just declare a more suitable type then, possibly itself a variant if one does such complex things in the actions). > I'm willing to discourage the use of $<> in all the outputs, > but forbid them when there's really no way to support them. Depends on your definition of "no way" (see above). ;) > > Though it does help when moving from C to C++ which I did recently > > (though my code was actually C++ the whole time, I had used the C > > skeleton before). If I had used $<>, and it was not available in the > > C++ skeleton, it would have been another hurdle at this point. In my > > case, I'm talking doubly hypothetically; for other users (as the > > original report indicates, there are people who use it) it may > > become relevant, but some kind of porting guide may help. 
> > That's a good idea. But I'm not sure what it should cover :) I saw your discussion with Hans about the calc++ examples. Cleaning that up as suggested certainly helps here. Most of my porting work, apart from writing the new skeletons, was general grammar cleanup and conversion of semantic types from raw pointers and containers to smart pointers and other RAII classes (which was my main goal of the port, of course), and changes in the lexer (dropping flex, but that's another story). Otherwise, AFAIR, the biggest changes were the different "%token" and "%type" declarations with types instead of union members (which made things easier), rewriting the interface to the parser, largely changing (and in some cases, implementing :) the "%define"s in the grammar file, and some changes to my Makefiles due to the additional generated files. -- Actually, all of these changes were large in a relative sense, but small in an absolute sense, as all those parts of the code (parser interface, defines, Makefile, etc.) are rather short (at least in my projects). So they may be more of a psychological hurdle, since at first glance it all looks completely different, and since one needs that before one can actually get started with the new skeleton. I fear existing projects using the C skeleton differ too widely to offer a detailed porting guide for all (or most) cases, but some easy to use examples may help here (see above), if they show the important features (not too many to avoid confusing users, but not too few so they're actually useful to many; it's probably a delicate balance). > > When I did the coding, std::variant actually simplified things for > > me (e.g., I could avoid adding move support in Bison's variant > > implementation), so if I were you, I'd probably use it even if I > > dopped $<>, but if you want to avoid its small runtime overhead, > > that seems possible. > > I don't think we can afford to simply drop all C++s pre 17. 
As I wrote before, there are std::variant implementations for C++14 (which I'm actually using mostly so far) and I think also C++11 (I haven't tested this). > You did wonders with your C++17 skeleton, and it would be > great to port your effort in the current framework. Would > you contribute to that? For the rest of this year I'll be quite busy (as you can see from this late reply):, but next year I might have some time to work on Bison. However, I've ported my bigger parsers to my new skeletons and use them actively. So I have a solution that works for me, and the slight overhead of storing both the static (by the skeleton) and dynamic (by std::variant) types is no issue to me. But I've come to rely on some of the features I implemented there. AFAICS, you haven't commented on them yet, so I don't know how you think about them. If you really object to them (or equivalent features), I might prefer to keep using my skeletons. These are especially the following features (described in more detail in my original
Re: Bison C++ mid-rule value lost with variants
Hi Frank,

Sorry, I missed that message. I found it while exploring the mid-rule action fiasco with variants.

> On 19 June 2018 at 00:27, Frank Heckenbach wrote:
>
> Akim Demaille wrote:
>
>> Well, you can use $<> wherever you want, in regular actions
>> too.
>
> And that's just as unsafe and hiding things from Bison. From the
> rest of your mail I now see you want to get rid of $<> completely in
> C++. I hadn’t gathered this from your previous mail.

Yes, indeed. $<> is really too low-level a feature when it comes to ‘active types’ such as we have in C++ compared to C. So we should not strive to have it work for dubious cases (a type mismatch between the declared type and the one passed to $<>), and we should allow users not to use it (typed mid-rule actions).

> Though it does help when moving from C to C++ which I did recently
> (though my code was actually C++ the whole time, I had used the C
> skeleton before). If I had used $<>, and it was not available in the
> C++ skeleton, it would have been another hurdle at this point. In my
> case, I'm talking doubly hypothetically; for other users (as the
> original report indicates, there are people who use it) it may
> become relevant, but some kind of porting guide may help.

That’s a good idea. But I’m not sure what it should cover :)

>> We tried to eliminate these runtime problems and make them
>> compile-time as much as possible. A typical example
>> is the symbol constructors in C++, which forbid that in the
>> scanner you declare an INT and set yylval->float_val.
>
> Provided one uses them. Currently, this is not enforced (in fact,
> Piotr’s grammar didn't), so not strictly forbidden.

And I don’t want to forbid them. And since symbol constructors appeared late, backward compatibility forbids that we require them. Yet I think they are the proper way to do it, so I promote them.

> FWIW, I wouldn't
> mind strictly forbidding it (maybe by making other constructors
> private and adding friends as necessary or whatever is required).

If it were to be designed today, I would do that.

>> I'm sorry if I gave the impression I would not provide support
>> for modern C++, that's definitely not my point. I want to
>> avoid _requiring_ it, but, if __cplusplus is modern enough,
>> we absolutely should support move semantics! I'm focused on this
>> issue now just because I'm trying to catch up! And it seems to
>> me that it shows we don't need to require std::variant.
>
> If you're willing to drop $<> completely in C++ (both in mid-rule
> and regular actions), it’s probably possible to avoid std::variant.

I’m willing to discourage the use of $<> in all the outputs, but forbid them only when there’s really no way to support them.

> When I did the coding, std::variant actually simplified things for
> me (e.g., I could avoid adding move support in Bison's variant
> implementation), so if I were you, I'd probably use it even if I
> dropped $<>, but if you want to avoid its small runtime overhead,
> that seems possible.

I don’t think we can afford to simply drop all C++s pre 17.

You did wonders with your C++17 skeleton, and it would be great to port your effort to the current framework. Would you contribute to that?
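
The idea behind symbol constructors, and Frank's suggestion of making the other constructors private, can be illustrated outside of Bison. The following is a standalone sketch, not code generated by Bison: the Symbol class, the TokenKind enum, and the make_* factory names are hypothetical and chosen only for this example. It assumes C++17 for std::variant.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <variant>

// Hypothetical token kinds; the order matches the variant alternatives below.
enum class TokenKind { INT, FLOAT, IDENT };

// Hypothetical symbol type: the only way to build one is through a
// factory that ties the token kind to the type of its semantic value.
class Symbol {
public:
    static Symbol make_INT(int v) { return Symbol(TokenKind::INT, v); }
    static Symbol make_FLOAT(double v) { return Symbol(TokenKind::FLOAT, v); }
    static Symbol make_IDENT(std::string v) {
        return Symbol(TokenKind::IDENT, std::move(v));
    }

    TokenKind kind() const { return kind_; }

    // Typed read access; throws std::bad_variant_access on a mismatch.
    template <typename T>
    const T& as() const { return std::get<T>(value_); }

private:
    using Value = std::variant<int, double, std::string>;

    // Private, so a scanner cannot pair a kind with the wrong value type.
    Symbol(TokenKind k, Value v) : kind_(k), value_(std::move(v)) {
        // Invariant: the stored alternative corresponds to the kind.
        assert(static_cast<std::size_t>(kind_) == value_.index());
    }

    TokenKind kind_;
    Value value_;
};
```

With this shape, something like Symbol::make_INT("abc") is rejected at compile time, although implicit arithmetic conversions (for example passing a double to make_INT) would still compile, so the protection is not complete.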
Re: Bison C++ mid-rule value lost with variants
Akim Demaille wrote:

> >> Piotr's grammar file includes:
> >>
> >> %token NUM
> >> %%
> >> expr:
> >>   NUM
> >> | expr { $$ = 42; } '+' NUM { std::cout << $2 << '\n'; };
> >>
> >> and one can see that when run, $2 is not 42, but 0.
> >>
> >> My opinion on this is somewhat different from the ones that
> >> have been expressed so far. IMHO, it has no good reason
> >> to work.
> >>
> >> Yes, it works with plain old unions. But that's unsafe, and
> >> that's because you hide things from your tool (Bison).
> >
> > AFAIK, that's the only purpose of the $<> syntax in Bison which has
> > been around for I don't know how long. So claiming it has no reason
> > to work now seems a bit odd to me.
>
> Well, you can use $<> wherever you want, in regular actions
> too.

And that's just as unsafe and hiding things from Bison. From the rest of your mail I now see you want to get rid of $<> completely in C++. I hadn't gathered this from your previous mail.

> And some people are doing nasty things with it in C,
> which forced, for backward compatibility with YACC, to keep
> weird code.
>
> And as you know, YACC does not support C++, and obviously
> not (Bison) variants. So to expect a « feature » from C to
> naturally work for C++ is not so straightforward.

Though it does help when moving from C to C++, which I did recently (though my code was actually C++ the whole time, I had used the C skeleton before). If I had used $<>, and it was not available in the C++ skeleton, it would have been another hurdle at this point. In my case, I'm talking doubly hypothetically; for other users (as the original report indicates, there are people who use it) it may become relevant, but some kind of porting guide may help.

> We tried to eliminate these runtime problems and make them
> compile-time as much as possible. A typical example
> is the symbol constructors in C++, which forbid that in the
> scanner you declare an INT and set yylval->float_val.

Provided one uses them. Currently, this is not enforced (in fact, Piotr's grammar didn't), so not strictly forbidden. FWIW, I wouldn't mind strictly forbidding it (maybe by making other constructors private and adding friends as necessary or whatever is required).

> > What's more important for me, and the reason I worked on this, are
> > other features, most importantly move semantics. Of course, they
> > also require modern C++ (i.e., C++11 or newer), so if that's a
> > problem, I'll have to keep my own fork anyway.
>
> I'm sorry if I gave the impression I would not provide support
> for modern C++, that's definitely not my point. I want to
> avoid _requiring_ it, but, if __cplusplus is modern enough,
> we absolutely should support move semantics! I'm focused on this
> issue now just because I'm trying to catch up! And it seems to
> me that it shows we don't need to require std::variant.

If you're willing to drop $<> completely in C++ (both in mid-rule and regular actions), it's probably possible to avoid std::variant.

When I did the coding, std::variant actually simplified things for me (e.g., I could avoid adding move support in Bison's variant implementation), so if I were you, I'd probably use it even if I dropped $<>, but if you want to avoid its small runtime overhead, that seems possible.

Regards,
Frank
Re: Bison C++ mid-rule value lost with variants
> On 18 June 2018 at 15:26, Frank Heckenbach wrote:
>
> Akim Demaille wrote:

Hi Frank,

>> Piotr's grammar file includes:
>>
>> %token NUM
>> %%
>> expr:
>>   NUM
>> | expr { $$ = 42; } '+' NUM { std::cout << $2 << '\n'; };
>>
>> and one can see that when run, $2 is not 42, but 0.
>>
>> My opinion on this is somewhat different from the ones that
>> have been expressed so far. IMHO, it has no good reason
>> to work.
>>
>> Yes, it works with plain old unions. But that's unsafe, and
>> that's because you hide things from your tool (Bison).
>
> AFAIK, that's the only purpose of the $<> syntax in Bison which has
> been around for I don't know how long. So claiming it has no reason
> to work now seems a bit odd to me.

Well, you can use $<> wherever you want, in regular actions too. And some people are doing nasty things with it in C, which forced us, for backward compatibility with YACC, to keep weird code.

And as you know, YACC does not support C++, and obviously not (Bison) variants. So to expect a « feature » from C to naturally work for C++ is not so straightforward.

>> I personally see these typed accesses to the semantical
>> values ($2) are no different from a cast (and actually,
>> that's exactly what they are with unions).
>
> I don't think so. C unions involve a hidden cast only when accessing
> a different member than was set, but that's not the case here.

Yes, I agree. But the code does not know that. And I mean _statically_, not at runtime.

We tried to eliminate these runtime problems and make them compile-time as much as possible. A typical example is the symbol constructors in C++, which forbid that in the scanner you declare an INT and set yylval->float_val. Here, it’s just another instance of the same class of issues.

> Apparently (and reasonably) this case is not supposed to work with
> $<>, so WRT whether or not to support $<> at all, I see no
> fundamental difference between C and C++, so if dropping support in
> C++, it would only be consequent to do it in C as well, and that
> might be nearly impossible due to backward-compatibility (or perhaps
> even Yacc compatibility, I don't know).

We are bound to POSIX for YACC, and POSIX forces us to accept

%union { int ival; float fval; char* sval; }
%token NUM
foo: NUM NUM NUM { $1; $2; $3; }

which completely breaks the typing system of Bison. If I were to choose, I would forbid this and force the user to write

foo: NUM NUM NUM { $1; (float)$2; (const char*)$3; }

but the decision to allow this was made decades ago :) At a time when C functions were not even typed. Since then people tend to be rigorous with types.

>> Sure, using std::variant or std::any we avoid the first three issues,
>> but that's still by hiding things from Bison and relying on
>> magic on the language side. And obviously that works only for
>> C++, and actually modern C++.
>
> As I wrote sometime before, I currently don't use any
> semantic-valued mid-rule actions, so I'm basically neutral on this
> proposal. In principle, I think a properly declared type for them
> seems reasonable, I just think this proposal is a few decades late.

I agree.

> What's more important for me, and the reason I worked on this, are
> other features, most importantly move semantics. Of course, they
> also require modern C++ (i.e., C++11 or newer), so if that's a
> problem, I'll have to keep my own fork anyway.

I’m sorry if I gave the impression I would not provide support for modern C++, that’s definitely not my point. I want to avoid _requiring_ it, but, if __cplusplus is modern enough, we absolutely should support move semantics! I’m focused on this issue now just because I’m trying to catch up! And it seems to me that it shows we don’t need to require std::variant.
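
Supporting move semantics only when the compiler is modern enough can be done with a small shim selected on the value of __cplusplus. The following is a minimal sketch of that idea, not Bison's actual implementation: the names my_move, Slot and transfer are hypothetical. Note that some compilers report an old __cplusplus value unless configured otherwise, so a real skeleton might need a more careful test.

```cpp
#include <cassert>
#include <string>

#if __cplusplus >= 201103L
# include <utility>
// C++11 and later: turn the lvalue into an rvalue so it can be moved from.
template <typename T>
T&& my_move(T& x) { return std::move(x); }
#else
// C++98/03: there are no rvalue references, so this is a no-op and
// the caller ends up making an ordinary copy.
template <typename T>
T& my_move(T& x) { return x; }
#endif

// Hypothetical stand-in for one slot of a parser's semantic-value stack.
struct Slot {
    std::string value;
};

// Transfer a value between two distinct slots: a move under C++11 and
// later, a copy under older standards, with no #ifdef at the call site.
inline void transfer(Slot& to, Slot& from) {
    assert(&to != &from);
    to.value = my_move(from.value);
}
```

The point of the wrapper is the one made elsewhere in the thread: the conditional lives in one place, and the rest of the code calls my_move unconditionally.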
Re: Bison C++ mid-rule value lost with variants
Akim Demaille wrote:

> Piotr's grammar file includes:
>
>     %token <int> NUM
>     %%
>     expr:
>       NUM
>     | expr { $<int>$ = 42; } '+' NUM { std::cout << $<int>2 << '\n'; };
>
> and one can see that when run, $<int>2 is not 42, but 0.
>
> My opinion on this is somewhat different from the ones that
> have been expressed so far. IMHO, it has no good reason
> to work.
>
> Yes, it works with plain old unions. But that's unsafe, and
> that's because you hide things from your tool (Bison).

AFAIK, that's the only purpose of the $<> syntax in Bison, which has been around for I don't know how long. So claiming it has no reason to work now seems a bit odd to me.

> I personally see these typed accesses to the semantical
> values ($<int>2) as no different from a cast (and actually,
> that's exactly what they are with unions).

I don't think so. C unions involve a hidden cast only when accessing a different member than the one that was set, but that's not the case here. In that situation, C++ variants would throw, and Bison's variants would assert or invoke undefined behavior. Apparently (and reasonably) that situation is not supposed to work with $<>, so as to whether to support $<> at all, I see no fundamental difference between C and C++. If support is dropped in C++, it would only be consistent to drop it in C as well, and that might be nearly impossible due to backward compatibility (or perhaps even Yacc compatibility, I don't know).

> Sure, using std::variant or std::any we avoid the first three issues,
> but that's still by hiding things from Bison and relying on
> magic on the language side. And obviously that works only for
> C++, and actually modern C++.

As I wrote some time ago, I currently don't use any semantic-valued mid-rule actions, so I'm basically neutral on this proposal. In principle, a properly declared type for them seems reasonable; I just think this proposal is a few decades late.

What's more important for me, and the reason I worked on this, are other features, most importantly move semantics. Of course, they also require modern C++ (i.e., C++11 or newer), so if that's a problem, I'll have to keep my own fork anyway.

Regards,
Frank
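To make the union-versus-variant point concrete, here is a minimal C++17 sketch. It is not code from Bison or from any skeleton; `SemanticValue` and `read_as_int` are made-up names for illustration. It shows that reading a `std::variant` through the wrong alternative throws `std::bad_variant_access` instead of silently reinterpreting the storage, as a C union would.

```cpp
#include <optional>
#include <string>
#include <variant>

// Hypothetical stand-in for a parser's semantic value type.
using SemanticValue = std::variant<int, std::string>;

// Returns the int alternative, or std::nullopt when another alternative
// is currently active.
std::optional<int> read_as_int(const SemanticValue& v) {
    try {
        return std::get<int>(v);  // throws std::bad_variant_access on mismatch
    } catch (const std::bad_variant_access&) {
        return std::nullopt;
    }
}
```

With a union, the equivalent mismatched read would compile and run without any diagnostic, which is the safety gap being discussed.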
Re: Bison C++ mid-rule value lost with variants
> On 17 Jun 2018, at 16:02, Akim Demaille wrote:
>
> Or go for a lighter syntax...

Indeed.

>     expr:
>       NUM
>     | expr <int>{ $$ = 42; } '+' NUM { std::cout << $2 << '\n'; };
>
> Personally, I prefer the prefix forms, but they don't blend
> nicely with named references:
>
>     expr:
>       NUM
>     | expr <int>{ $$ = 42; }[val] '+' NUM { std::cout << $val << '\n'; };

This is in fact consistent with the order in the other declarations: .

> I wish we had chosen a prefix syntax for named references, say
>
>     expr:
>       NUM
>     | expr val={ $$ = 42; } '+' NUM { std::cout << $val << '\n'; };

If the type is in the variable, it implies a runtime variant cast, which one might want for some reason.

(Just some bystander input.)
Re: Bison C++ mid-rule value lost with variants
I enthusiastically support this proposal.

I agree with the preference for prefix positioning. The `%type` keyword is just noise, imho, and thus unnecessary.

I like the `name={ code }` syntax, too, but it's probably too late for that. Perhaps `{ code }[name]` would be a plausible alternative syntax, although I'd still prefer `<type>` in prefix position.

Although unrelated to this proposal, I would also favour allowing

    %%
    <type> nonterminal: rhs

as an alternative to

    %type <type> nonterminal
    %%
    nonterminal: rhs

Rici

On Sun, Jun 17, 2018, 09:02 Akim Demaille wrote:

> Hi all,
>
> > On 29 June 2017 at 15:55, Piotr Marcińczyk wrote:
> >
> > I supposedly found a bug in lalr1.cc skeleton with variant semantic type.
> > When using mid-rule action { $<type>$ = value; } to return a value that
> > will be used in further semantic actions, it appears that the value on
> > stack becomes zero. Tested with Bison version 3.0.4.
>
> Piotr's grammar file includes:
>
>     %token <int> NUM
>     %%
>     expr:
>       NUM
>     | expr { $<int>$ = 42; } '+' NUM { std::cout << $<int>2 << '\n'; };
>
> and one can see that when run, $<int>2 is not 42, but 0.
>
> My opinion on this is somewhat different from the ones that
> have been expressed so far. IMHO, it has no good reason
> to work.
>
> Yes, it works with plain old unions. But that's unsafe, and
> that's because you hide things from your tool (Bison). I
> personally see these typed accesses to the semantical
> values ($<int>2) as no different from a cast (and actually,
> that's exactly what they are with unions).
>
> Yes, it works with C++ variants, or even std::any, when
> used as a store for semantical values. But we are still lying
> to the tool.
>
> I think this example shows that the design of mid-rule
> actions has some loopholes, and in particular something which is
> critically missing is the type of its semantical value.
>
> Since Bison does not know the type of the semantical value of a
> mid-rule action:
> - the user must use $<type>$, not $$
> - the destructor will not work
> - the printer won't work either
> - the user must also be explicit when using the value in
>   the other actions ($<int>2).
>
> Sure, using std::variant or std::any we avoid the first three issues,
> but that's still by hiding things from Bison and relying on
> magic on the language side. And obviously that works only for
> C++, and actually modern C++.
>
> I think we should rather provide typed mid-rule actions.
>
> How about:
>
>     expr:
>       NUM
>     | expr %type<int>{ $$ = 42; } '+' NUM { std::cout << $2 << '\n'; };
>
> It's not clear whether it should be prefix or postfix
>
>     expr:
>       NUM
>     | expr { $$ = 42; } %type<int> '+' NUM { std::cout << $2 << '\n'; };
>
> The regular use of %type is prefix (%type <int> expr), but the
> directives in rules are usually presented as postfix
> (exp: "if" exp exp %prec "then"), although it is not mandatory.
>
> Or go for a lighter syntax
>
>     expr:
>       NUM
>     | expr <int>{ $$ = 42; } '+' NUM { std::cout << $2 << '\n'; };
>
> or
>
>     expr:
>       NUM
>     | expr { $$ = 42; }<int> '+' NUM { std::cout << $2 << '\n'; };
>
> Personally, I prefer the prefix forms, but they don't blend
> nicely with named references:
>
>     expr:
>       NUM
>     | expr <int>{ $$ = 42; }[val] '+' NUM { std::cout << $val << '\n'; };
>
> Thoughts?
>
> I wish we had chosen a prefix syntax for named references, say
>
>     expr:
>       NUM
>     | expr val={ $$ = 42; } '+' NUM { std::cout << $val << '\n'; };
>
> I think we discussed this with Joel E. Denny, but I don't remember
> the details. We could have used
>
>     expr:
>       NUM
>     | expr val={ $$ = 42; } '+' NUM { std::cout << $val << '\n'; };
Re: Bison C++ mid-rule value lost with variants
Hi all,

> On 29 June 2017 at 15:55, Piotr Marcińczyk wrote:
>
> I supposedly found a bug in lalr1.cc skeleton with variant semantic type.
> When using mid-rule action { $<type>$ = value; } to return a value that
> will be used in further semantic actions, it appears that the value on
> stack becomes zero. Tested with Bison version 3.0.4.

Piotr's grammar file includes:

    %token <int> NUM
    %%
    expr:
      NUM
    | expr { $<int>$ = 42; } '+' NUM { std::cout << $<int>2 << '\n'; };

and one can see that when run, $<int>2 is not 42, but 0.

My opinion on this is somewhat different from the ones that have been expressed so far. IMHO, it has no good reason to work.

Yes, it works with plain old unions. But that's unsafe, and that's because you hide things from your tool (Bison). I personally see these typed accesses to the semantical values ($<int>2) as no different from a cast (and actually, that's exactly what they are with unions).

Yes, it works with C++ variants, or even std::any, when used as a store for semantical values. But we are still lying to the tool.

I think this example shows that the design of mid-rule actions has some loopholes, and in particular something which is critically missing is the type of its semantical value.

Since Bison does not know the type of the semantical value of a mid-rule action:
- the user must use $<type>$, not $$
- the destructor will not work
- the printer won't work either
- the user must also be explicit when using the value in the other actions ($<int>2).

Sure, using std::variant or std::any we avoid the first three issues, but that's still by hiding things from Bison and relying on magic on the language side. And obviously that works only for C++, and actually modern C++.

I think we should rather provide typed mid-rule actions.

How about:

    expr:
      NUM
    | expr %type<int>{ $$ = 42; } '+' NUM { std::cout << $2 << '\n'; };

It's not clear whether it should be prefix or postfix

    expr:
      NUM
    | expr { $$ = 42; } %type<int> '+' NUM { std::cout << $2 << '\n'; };

The regular use of %type is prefix (%type <int> expr), but the directives in rules are usually presented as postfix (exp: "if" exp exp %prec "then"), although it is not mandatory.

Or go for a lighter syntax

    expr:
      NUM
    | expr <int>{ $$ = 42; } '+' NUM { std::cout << $2 << '\n'; };

or

    expr:
      NUM
    | expr { $$ = 42; }<int> '+' NUM { std::cout << $2 << '\n'; };

Personally, I prefer the prefix forms, but they don't blend nicely with named references:

    expr:
      NUM
    | expr <int>{ $$ = 42; }[val] '+' NUM { std::cout << $val << '\n'; };

Thoughts?

I wish we had chosen a prefix syntax for named references, say

    expr:
      NUM
    | expr val={ $$ = 42; } '+' NUM { std::cout << $val << '\n'; };

I think we discussed this with Joel E. Denny, but I don't remember the details. We could have used

    expr:
      NUM
    | expr val={ $$ = 42; } '+' NUM { std::cout << $val << '\n'; };
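As an illustration of why a value whose type the generator does not know can come back as zero, here is a toy C++ model. It is an assumption-laden sketch, not Bison's generated code: `Kind`, `Slot`, `push` and `midrule_value` are invented names, and the real skeleton may lose the value by a different route. The idea it demonstrates is only that a stack which copies payloads according to a declared kind will copy nothing for an undeclared one.

```cpp
#include <cstring>

// Toy model only: each stack slot pairs raw storage with the kind of
// symbol the parser generator believes the slot holds.
enum class Kind { Untyped, Int };

struct Slot {
    Kind kind = Kind::Untyped;
    alignas(int) unsigned char raw[sizeof(int)] = {};  // zero-initialized
};

// What an action does when it writes an int through a typed access.
void store_int(Slot& s, int v) { std::memcpy(s.raw, &v, sizeof v); }

int load_int(const Slot& s) {
    int v;
    std::memcpy(&v, s.raw, sizeof v);
    return v;
}

// Pushing a value onto the stack: the payload is copied only when the
// declared kind says there is one.
Slot push(const Slot& from) {
    Slot to;
    to.kind = from.kind;
    if (from.kind == Kind::Int) store_int(to, load_int(from));
    return to;
}

// Simulates a mid-rule action that stores 42, followed by a later read
// of the pushed slot.
int midrule_value(Kind declared) {
    Slot tmp;
    tmp.kind = declared;
    store_int(tmp, 42);
    return load_int(push(tmp));
}
```

In this model, declaring the kind is what makes the payload survive the push, which is the role a typed mid-rule action would play.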
Re: Bison C++ mid-rule value lost with variants
Hans Åberg wrote:

> > On 29 Jun 2017, at 15:55, Piotr Marcinczyk wrote:
> >
> > I supposedly found a bug in lalr1.cc skeleton with variant semantic type.
>
> You might check if std::variant, of C++17, can be used instead. Cf.
> http://en.cppreference.com/w/cpp/utility/variant
>
> > When using mid-rule action { $<type>$ = value; } to return a value that
> > will be used in further semantic actions, it appears that the value on
> > stack becomes zero. Tested with Bison version 3.0.4.

This answer may come a bit late, but I had the same problem (and others) recently, so I wrote a new Bison skeleton using std::variant which, as Hans said, solves this problem. You can find it here:

http://lists.gnu.org/archive/html/bug-bison/2018-04/msg00011.html

To use the new skeleton, change in parser.y:

    -%skeleton "lalr1.cc"
    +%skeleton "lalr1-c++17.cc"

Since tokens are now of std::variant type, compiling requires C++17, e.g. gcc-7 with the "--std=c++17" option. Tokens are also built a bit differently, so change in lexer.flex (or use make_int):

    -yylval->build<int>(atoi(yytext));
    +yylval->emplace<int>(atoi(yytext));

Regards,
Frank
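For readers unfamiliar with the `emplace` call in the lexer change, here is a small standalone C++17 sketch of the same operation on a plain `std::variant`. The names `TokenValue` and `lex_number` are hypothetical and the alternative list is an assumption; a real skeleton generates its own value type.

```cpp
#include <cstdlib>
#include <string>
#include <variant>

// Hypothetical semantic value type; std::monostate makes it
// default-constructible in an "empty" state.
using TokenValue = std::variant<std::monostate, int, std::string>;

// Mimics a lexer action for a NUM token: convert the matched text and
// construct the int alternative in place.
void lex_number(TokenValue& yylval, const char* yytext) {
    yylval.emplace<int>(std::atoi(yytext));
}
```

Note that `emplace` needs the alternative spelled out as a template argument (a type or an index), which is why the type tag matters in the lexer line.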
Re: Bison C++ mid-rule value lost with variants
> On 29 Jun 2017, at 15:55, Piotr Marcińczyk wrote:
>
> I supposedly found a bug in lalr1.cc skeleton with variant semantic type.

You might check if std::variant, of C++17, can be used instead. Cf.
http://en.cppreference.com/w/cpp/utility/variant

> When using mid-rule action { $<type>$ = value; } to return a value that
> will be used in further semantic actions, it appears that the value on
> stack becomes zero. Tested with Bison version 3.0.4.

This also happens if the action is moved out to a separate rule; perhaps the value is overwritten by the token's semantic value. So it does not look safe to use.
Re: Bison C++ mid-rule value lost with variants
Thanks for the workaround. Actually, I used marker nonterminals (empty rules whose actions get a value from the stack before the rule), e.g.:

    %type <int> AnswerToLifeMarker

    expr: expr AnswerToLifeMarker '+' NUM { std::cout << $2 << std::endl; };
    AnswerToLifeMarker: { usePreviousExpr($0); $$ = 42; };

That's because I need a way to create and store labels for later jumps between statements in generated code, so one global value is not sufficient. I would need a separate stack for that. I tried this approach, and the code seemed more error-prone and less maintainable.

Best regards,
Piotr Marcińczyk

2017-06-29 20:07 GMT+02:00 Kaz Kylheku:

> On 29.06.2017 06:55, Piotr Marcińczyk wrote:
>
>> I supposedly found a bug in lalr1.cc skeleton with variant semantic type.
>> When using mid-rule action { $<type>$ = value; } to return a value that
>> will be used in further semantic actions, it appears that the value on
>> stack becomes zero. Tested with Bison version 3.0.4.
>
> You probably need your code to work with Bison installations
> that don't have a fix for this (which is all of them, currently,
> except maybe yours).
>
> Could the workaround be as simple as:
>
>     /* plain old local variable, injected into yyparse */
>
>     { int whatever = 42; } '+' NUM { std::cout << whatever << std::endl; }
>
> Also, you can have a scratch space in your parser structure
> for carrying a value from one mid-rule action to another.
>
>     /* assumes %parse-param{your_parser_type *parser} */
>
>     { parser->stash = 42; } '+' NUM { std::cout << parser->stash << std::endl; }
>
> I've done this latter sort of thing.
>
> The parser structure is basically your "this" object in the entire
> parse job, and yyparse is its "method". :)
>
> Cheers ...
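The "separate stack" alternative mentioned in this reply could look roughly like the C++ sketch below. This is only an assumed shape, not Piotr's actual code: `LabelStack`, `open` and `close` are invented names, and labels are modelled as plain integers. It shows why a stack, unlike a single global, keeps the labels of nested statements apart.

```cpp
#include <vector>

// Hypothetical side stack for jump labels of nested statements.
struct LabelStack {
    std::vector<int> labels;  // labels of statements still being parsed
    int next = 0;             // next fresh label number

    // Called when a statement starts: allocate and remember a label.
    int open() {
        labels.push_back(next);
        return next++;
    }

    // Called when a statement ends: recover the label allocated at its
    // start. Precondition: a matching open() happened earlier.
    int close() {
        int label = labels.back();
        labels.pop_back();
        return label;
    }
};
```

The cost is that every `open` must be paired with a `close` by hand in the grammar actions, which is the maintenance burden the reply alludes to.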
Re: Bison C++ mid-rule value lost with variants
On 29.06.2017 06:55, Piotr Marcińczyk wrote:

> I supposedly found a bug in lalr1.cc skeleton with variant semantic type.
> When using mid-rule action { $<type>$ = value; } to return a value that
> will be used in further semantic actions, it appears that the value on
> stack becomes zero. Tested with Bison version 3.0.4.

You probably need your code to work with Bison installations that don't have a fix for this (which is all of them, currently, except maybe yours).

Could the workaround be as simple as:

    /* plain old local variable, injected into yyparse */

    { int whatever = 42; } '+' NUM { std::cout << whatever << std::endl; }

Also, you can have a scratch space in your parser structure for carrying a value from one mid-rule action to another.

    /* assumes %parse-param{your_parser_type *parser} */

    { parser->stash = 42; } '+' NUM { std::cout << parser->stash << std::endl; }

I've done this latter sort of thing.

The parser structure is basically your "this" object in the entire parse job, and yyparse is its "method". :)

Cheers ...
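The scratch-space idea can be tried outside a grammar with a few lines of C++. In this sketch the two functions stand in for the bodies of the mid-rule action and the final action; `ParserContext`, `midrule_action`, `final_action` and `run_rule` are hypothetical names, and an `std::ostringstream` replaces `std::cout` so the output can be inspected.

```cpp
#include <sstream>
#include <string>

// Hypothetical parser context, as would be passed via %parse-param.
struct ParserContext {
    int stash = 0;           // scratch space shared between actions
    std::ostringstream out;  // stands in for std::cout
};

// Body of the mid-rule action: { parser->stash = 42; }
void midrule_action(ParserContext* parser) { parser->stash = 42; }

// Body of the final action: prints the stashed value.
void final_action(ParserContext* parser) {
    parser->out << parser->stash << '\n';
}

// Runs both actions in the order the parser would execute them.
std::string run_rule() {
    ParserContext ctx;
    midrule_action(&ctx);
    final_action(&ctx);
    return ctx.out.str();
}
```

Because the value lives in the context object rather than on the semantic value stack, it does not depend on how the skeleton copies stack slots.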