Re: Assistance Needed: Stuck at Version 2.3 on MacOS

2023-11-20 Thread Hans Åberg


> On Nov 20, 2023, at 01:05, Tiago  wrote:

> I am writing to inform you that I am currently stuck on version 2.3, and I am 
> unsure why this is happening.

This is the one provided by the system, which is in /usr/bin/. See this by 
typing 'which bison' in Terminal.

> I have tried everything, as I am using a MacOS software. Please let me know 
> if you can assist me.

You need to install a later version. You also need M4. 

This can be done conveniently with a package manager like MacPorts [1], which 
helps to keep programs up to date, or directly from sources [2–3].

1. https://www.macports.org/
2. https://www.gnu.org/software/bison/
3. https://www.gnu.org/software/m4/





Re: About how reduce/reduce conflicts are managed

2023-11-19 Thread Hans Åberg


> On Nov 19, 2023, at 10:57, Domingo Alvarez Duarte  wrote:

> … I found that bison/byacc/kmyacc report 35 reduce/reduce conflicts but 
> parsertl reports them as resolved.
…
> … how best to handle it, including questioning how bison does it.
> 
> Can someone help in clarifying this ?

See the Bison manual, sec 5.6:

Bison resolves a reduce/reduce conflict by choosing the rule that appears first
in the grammar, but the manual recommends not relying on this and rewriting the
grammar instead.





Re: glr-parser hands me the wrong token

2023-04-09 Thread Hans Åberg


> On 9 Apr 2023, at 00:43, Adam Wozniak  wrote:
> 
> On inspection, i think the problem is not Bison, but my lack of
> understanding of the rule of yytext.
> (i should be using yylval instead)

The Flex lexer only produces a pointer to a buffer that will change on
subsequent reads, so if the string it points to is used, it is necessary to hand
over a copy of it to the Bison parser.
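
A minimal sketch of why the copy matters. It does not use Flex itself: fake_scanner and next_token are hypothetical stand-ins for a scanner whose text pointer, like yytext, refers to one buffer that is reused on every read.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for a Flex scanner: like yytext, `text` points into
// one buffer that is overwritten on every read.
struct fake_scanner {
  char buffer[32];
  const char* text = buffer;

  void next_token(const char* lexeme) {
    std::strncpy(buffer, lexeme, sizeof buffer - 1);
    buffer[sizeof buffer - 1] = '\0';
  }
};

// Shows that a saved pointer follows the buffer, while a copy does not.
inline void demo_copy_lexeme() {
  fake_scanner scan;

  scan.next_token("alpha");
  const char* saved_pointer = scan.text;  // what handing over yytext amounts to
  std::string saved_copy = scan.text;     // what the parser should receive

  scan.next_token("beta");                // reading a lookahead overwrites the buffer

  assert(std::strcmp(saved_pointer, "beta") == 0);  // the pointer now sees the new token
  assert(saved_copy == "alpha");                    // the copy is unaffected
}
```

In a C scanner the usual equivalent is to duplicate yytext in the action (for example with strdup) and store the duplicate in yylval before returning the token.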




C++ std::move issue

2022-09-21 Thread Hans Åberg
In Bison 3.8.1, there may be a double application of std::move on the semantic 
value under some circumstances, which may be valid C++, but not intended. A 
std::move should leave the object in a valid but unspecified state, which 
probably means that assignments should be still possible.


In the parser, there is a std::move in:
/* Initialize the stack.  The initial state will be set in
   yynewstate, since the latter expects the semantical and the
   location values to have been already stored, initialize these
   stacks with a primary value.  */
yystack_.clear ();
yypush_ (YY_NULLPTR, 0, YY_MOVE (yyla));

Then there follows code that gets a new yyla.value:
// Read a lookahead token.
if (yyla.empty ())
  {
  …
  }

However, if this condition is false, there is a second std::move at:
// Shift the lookahead token.
yypush_ ("Shifting", state_type (yyn), YY_MOVE (yyla));
goto yynewstate;
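
A small sketch of the pattern described above, assuming YY_MOVE expands to std::move. The names push, lookahead and demo_double_move are hypothetical and this is not the skeleton code; it only illustrates that a second move is well defined once a fresh value has been assigned in between.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Pushes a value onto a stack by moving from it.
inline void push(std::vector<std::string>& stack, std::string&& value) {
  stack.push_back(std::move(value));
}

inline std::vector<std::string> demo_double_move() {
  std::vector<std::string> stack;
  std::string lookahead = "first";

  push(stack, std::move(lookahead));  // first move: lookahead is valid but unspecified

  // Assigning a fresh value restores a known state; this corresponds to
  // reading a new lookahead token.
  lookahead = "second";
  push(stack, std::move(lookahead));  // second move, of a freshly assigned value

  // Without the assignment in between, the second push would store whatever
  // the moved-from object happens to contain.
  assert(stack.size() == 2);
  return stack;
}
```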





Re: Flex size_t sizes

2021-11-14 Thread Hans Åberg


> On 14 Nov 2021, at 01:55, Kaz Kylheku  wrote:
> 
> On 2021-11-13 13:18, Hans Åberg wrote:
>> This works as long as nobody tries to compile the .ll file with an
>> incompatible Flex version even in the case the header is shipped.
> 
> Your build system has to handle that situation. If the downstream
> user builds your program in such a way that the .ll file is processed
> by Flex, rather than using the shipped scanner, then in that situation,
> that system's FlexLexer.h has to be pulled in or referenced; the build
> obviously cannot be using the shipped FlexLexer.h, in conjunction with
> the freshly generated lex.yy.cc.
> 
> I'd have it so that when the shipped code is being prepared, then
> the #include <FlexLexer.h> line is replaced by the contents of the
> header, right there, in place. It then has no references to anything.
> 
> And so then if a fresh local build is done, then a lex.yy.cc will
> be generated whose #include <FlexLexer.h> line is left alone and
> refers to that system's FlexLexer.h. (The special editing happens
> only when a certain makefile target is invoked like, say,
> "make shipped-scanner").
> 
> If someone has multiple incompatible copies of Flex, and/or the
> <FlexLexer.h> header, that is their problem; if that user complains,
> you can point your finger to your shipped scanner and tell them to
> just stick to that if they have a problem with Flex. And also that
> regenerating the scanner is a maintainer activity, and that maintainers
> must have a development system with all the right tools, reliably
> installed if they are to build the program entirely from scratch and
> work on it.

Those installing it may do it differently, like compiling the .ll file even 
though they should not. GNU 'make distcheck' puts the sources in a read-only 
directory and forces compilation with flex and bison.

>> It should have been as in Bison, which always includes the correct
>> header. But Flex isn't developed, so it is what it is.
> 
> I'm not aware that Bison-generated parser sources depend on any
> Bison-specific external headers.

No, it generates the header when compiling and adds any additional ones needed
to the same directory. So no problems there.

> (Then again, I wasn't aware that Flex had this problem in the C++ mode,
> and I don't have experience with every possible mode of using Bison.)

I thought it would suffice to be compilable with the latest Flex, but then it
turns out that there are home-brewed versions out there.





Re: Flex size_t sizes

2021-11-13 Thread Hans Åberg


> On 13 Nov 2021, at 22:04, Kaz Kylheku  wrote:
> 
> [Replying to HTML with HTML] 

Normally, text is expected on these types of lists, even though many probably 
can use styled text.

> On 2021-11-13 02:17, Hans Åberg wrote: 
> 
>> On 12 Nov 2021, at 23:41, Kaz Kylheku  wrote: 
>> 
>>> If there must be a FlexLexer.h, the thing to do is to arrange for the build
>>> process to use a local copy of FlexLexer.h that is in your tree. As part of
>>> generating the scanner, your Makefile (or whatever) steps should hunt down
>>> this header file, stick it into your tree, and make the code refer to that
>>> copy.
>>> 
>>> Check that copy into version control, and make sure downstream users have it
>>> as part of the distribution, and that they can build the scanner without
>>> having any portion of Flex installed on their system.
>> 
>> So this is a way, but if you do not want to ship it, then it must be present 
>> elsewhere. And if you ship it, and the .ll file is compiled with an 
>> incompatible Flex version, it won't compile.
> 
> There is a subtlety here I might have buried in all my verbiage. 
> 
> This #include <FlexLexer.h> is an issue EVEN IF YOU TAKE THE PRUDENT
> STEP OF SHIPPING THE GENERATED SCANNER.
> 
> That's the problem! 
> 
> There should not be a problem if you ship the lex source code, and
> expect the user to have Flex and run it to generate the code. Their
> installation of the utility should match their <FlexLexer.h> header. If
> not, they have a bad/inconsistent installation. 
> Shipping the scanner should be immune to this kind of problem.

No, it isn't. If you decide to ship the original header, then a compiled
version may not work. Akim, though, made a header that checks the Flex version
and adapts; see his reply in this thread.

> But even if you ship Flex-generated code, it is not self-contained by
> default: that generated code wants a Flex-specific C++ header from the
> system. That requires Flex to be installed, and the right version, which
> spoils the whole idea of just shipping the C code.
> 
> Though that is quite ridiculous, you can easily work around it by
> shipping that header and somehow making sure that your lex.yy.cc file
> finds that copy of the header and not some system installed one.
> 
> That's what I'm saying. 

This works as long as nobody tries to compile the .ll file with an incompatible 
Flex version even in the case the header is shipped.

It should have been as in Bison, which always includes the correct header. But 
Flex isn't developed, so it is what it is.





Re: Flex size_t sizes

2021-10-20 Thread Hans Åberg


> On 20 Oct 2021, at 05:39, Akim Demaille  wrote:
> 
> Hi Hans,
> 
>> Le 14 oct. 2021 à 15:23, Hans Åberg  a écrit :
>> 
>> Hi Akim,
>> 
>> Saw you have edited Flex, so I take it up here, even though not strictly a 
>> Bison topic:
>> 
>> The Apple flex version has been edited to admit size_t sizes, 64-bit on the 
>> platform, and perhaps it might be a good idea for regular flex, which uses 
>> int, only 32-bit there. If using '%option c++' and mixing the versions, then 
>> the FlexLexer.h header is incompatible with the C++ source code, generating 
>> a compile error.
>> 
>> On MacOS with MacPorts and Xcode, the files are in
>> /opt/local/include/FlexLexer.h
>> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/FlexLexer.h
> 
> I have already reported how I dealt with that.  I used a wrapper.
> 
> https://github.com/akimd/vcsn/blob/master/build-aux/bin/flex%2B%2B.in
> 
> I'm no longer working on this now.
> 
> And I don't plan to spend time on flex.

It was intended as FYI, just happened to get this form. :-)

I believe that yours is a different issue:

I rewrote for flex 2.6.4, switching to the streambuf pointer that istream
holds, rather than an istream pointer as in flex 2.5.*. In some sense this is
more logical.

Then the only issue I found is the one I reported.





Flex size_t sizes

2021-10-14 Thread Hans Åberg
Hi Akim,

Saw you have edited Flex, so I take it up here, even though not strictly a 
Bison topic:

The Apple flex version has been edited to admit size_t sizes, 64-bit on the 
platform, and perhaps it might be a good idea for regular flex, which uses int, 
only 32-bit there. If using '%option c++' and mixing the versions, then the 
FlexLexer.h header is incompatible with the C++ source code, generating a 
compile error.

On MacOS with MacPorts and Xcode, the files are in
/opt/local/include/FlexLexer.h
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/FlexLexer.h





C++ location_type error

2021-10-01 Thread Hans Åberg
If one, in C++, uses "%define api.location.type {location_type}", then it will 
cause a scoping error in GCC but not Clang. Curiously, both compilers are 
correct: the C++ standard [basic.scope.class] says this is an error, but no 
diagnostic is required.
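
The class-scope rule can be illustrated in isolation. The sketch below is not Bison's generated code, and it is only an assumption that the generated parser runs into the rule in this way; the global location_type, parser_bad and parser_ok are hypothetical names. The ill-formed variant is left in a comment so that the example compiles everywhere.

```cpp
#include <cassert>

// A user-supplied location type at namespace scope (hypothetical).
struct location_type {
  int line = 1;
};

// Ill-formed, no diagnostic required: the name location_type is first used to
// mean ::location_type and is then redeclared as a member, which changes what
// the earlier use means in the completed class scope.
//
//   struct parser_bad {
//     location_type loc;
//     typedef ::location_type location_type;
//   };

// Well-formed: the member typedef is declared before any unqualified use, and
// the outer type is named with an explicit ::.
struct parser_ok {
  typedef ::location_type location_type;
  location_type loc;
};

inline int demo_location_line() {
  parser_ok p;
  assert(p.loc.line == 1);
  return p.loc.line;
}
```

Since no diagnostic is required, a compiler may reject the commented-out form or accept it silently, which matches the difference observed between GCC and Clang.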





Re: 'make clean' not working

2021-09-23 Thread Hans Åberg


> On 23 Sep 2021, at 18:42, Akim Demaille  wrote:
> 
>> Le 23 sept. 2021 à 09:31, Hans Åberg  a écrit :
>> 
>>> I can't reproduce your problem.  I have run configure, make, make clean,
>>> make, without any problem (autoconf was not called).  Maybe something was
>>> touched in your tree, I don't know.
>> 
>> I can reproduce the problem by modifying a file by touching it after the 
>> build (tried with an m4 file) and making sure autoconf is not in the PATH.
> 
> That's expected: if you touch something that requires Autoconf, of course 
> autoconf is fired.
> 
> Andrea said he did nothing like that, yet autoconf was called.

Then it does not happen here: I moved a file, made a copy to the original name, 
and the problem showed. Then I deleted the copy, and moved the original back, 
and then it did not happen.





Re: 'make clean' not working

2021-09-23 Thread Hans Åberg


> On 23 Sep 2021, at 07:31, Akim Demaille  wrote:
> 
> Hi Andrea,
> 
>> Le 21 sept. 2021 à 16:45, Andrea Monaco  a 
>> écrit :
>> 
>> Hello,
>> 
>> I'm trying to rebuild a bison-3.8 release tree, but "make clean" aborts
>> with this error:
>> 
>> CDPATH="${ZSH_VERSION+.}:" && cd . && /bin/bash 
>> '/root/bison-3.8/build-aux/missing' autoconf
>> /root/bison-3.8/build-aux/missing: line 81: autoconf: command not found
>> WARNING: 'autoconf' is missing on your system.
>>You should only need it if you modified 'configure.ac',
>>or m4 files included by it.
>>The 'autoconf' program is part of the GNU Autoconf package:
>>
>>It also requires GNU m4 and Perl in order to run:
>>
>>
>> Makefile:3894: recipe for target 'configure' failed
>> make: *** [configure] Error 127
>> 
>> Is it expected?
…
> I can't reproduce your problem.  I have run configure, make, make clean,
> make, without any problem (autoconf was not called).  Maybe something was
> touched in your tree, I don't know.

I can reproduce the problem by modifying a file by touching it after the build 
(tried with an m4 file) and making sure autoconf is not in the PATH.




Re: Bison 3.7.5 released

2021-01-26 Thread Hans Åberg


> On 26 Jan 2021, at 06:54, Paul Eggert  wrote:
> 
> On 1/25/21 9:41 PM, Akim Demaille wrote:
> 
>> Thanks for the report.  I'll see if there is interest in fixing it
>> in gnulib.  As a matter of fact, they might have fixed it already,
> 
> Fixed by Jim Meyering in Gnulib commit 
> 7fa203018d02d8f2b71a6be1240c3963a63cab1c (2020-12-29).
> 
> In the meantime you can ignore the warning: it's the typical clang balderdash 
> where they want you to put in an unnecessary cast. Don't they know that casts 
> are bad in C?

They want the C/C++ languages to be something other than what they actually
are. Some examples of that are dangling-else warnings and forcing unnecessary,
Pascal-style, logical operator parentheses. A workaround is putting in:
#pragma clang diagnostic ignored "-Wlogical-op-parentheses"
#pragma clang diagnostic ignored "-Wdangling-else"
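
A short sketch of the kind of code these two warnings are about, with the pragmas in place. The guard on __clang__ is an addition here so that other compilers do not see pragmas they do not know; either and classify are hypothetical names.

```cpp
#include <cassert>

// The pragmas only mean something to Clang, so they are guarded here.
#ifdef __clang__
#pragma clang diagnostic ignored "-Wlogical-op-parentheses"
#pragma clang diagnostic ignored "-Wdangling-else"
#endif

// && binds tighter than ||, so this is (a && b) || c.
inline bool either(bool a, bool b, bool c) {
  return a && b || c;
}

// The else belongs to the nearest if, that is, to `if (inner)`.
inline int classify(bool outer, bool inner) {
  int result = 0;
  if (outer)
    if (inner)
      result = 1;
    else
      result = 2;
  return result;
}

inline void demo_language_rules() {
  assert(either(false, false, true));   // would be false if parsed as a && (b || c)
  assert(classify(true, false) == 2);   // the else pairs with the inner if
  assert(classify(false, false) == 0);  // and not with the outer one
}
```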





Re: Bison 3.7.5 released

2021-01-25 Thread Hans Åberg


> On 24 Jan 2021, at 09:04, Akim Demaille  wrote:
> 
> This release fixes several issues…

When I compiled using clang-11 of MacPorts, I got some warnings.
../bison-3.7.5/configure CXX=/opt/local/bin/clang++-mp-11 
CC=/opt/local/bin/clang-mp-11 CPPFLAGS="-g -I /usr/local/include" LDFLAGS="-L 
/usr/local/lib" 

make -j
…
  AR   lib/liby.a
../bison-3.7.5/lib/hash.c:501:11: warning: implicit conversion from 'unsigned 
long' to 'float' changes value from 18446744073709551615 to 
18446744073709551616 [-Wimplicit-const-int-float-conversion]
  if (SIZE_MAX <= new_candidate)
  ^~~~ ~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/stdint.h:173:27:
 note: expanded from macro 'SIZE_MAX'
#define SIZE_MAX  UINTPTR_MAX
  ^~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/stdint.h:154:27:
 note: expanded from macro 'UINTPTR_MAX'
#define UINTPTR_MAX   18446744073709551615UL
  ^~
../bison-3.7.5/lib/hash.c:964:15: warning: implicit conversion from 'unsigned 
long' to 'float' changes value from 18446744073709551615 to 
18446744073709551616 [-Wimplicit-const-int-float-conversion]
  if (SIZE_MAX <= candidate)
  ^~~~ ~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/stdint.h:173:27:
 note: expanded from macro 'SIZE_MAX'
#define SIZE_MAX  UINTPTR_MAX
  ^~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/stdint.h:154:27:
 note: expanded from macro 'UINTPTR_MAX'
#define UINTPTR_MAX   18446744073709551615UL
  ^~
2 warnings generated.
  AR   lib/libbison.a
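
The value change reported in the warning can be reproduced in isolation. This is a sketch of the arithmetic only, not the Gnulib code or its fix; it assumes the default round-to-nearest mode, and max_rounds_up_to_2_pow_64 is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// float has a 24-bit significand, so 2^64 - 1 is not representable; the
// conversion rounds to the nearest float, which is 2^64.
inline bool max_rounds_up_to_2_pow_64() {
  const std::uint64_t max64 = UINT64_MAX;            // 18446744073709551615
  const float as_float = static_cast<float>(max64);  // explicit, not implicit, conversion
  return as_float == std::ldexp(1.0f, 64);           // 18446744073709551616
}

inline void demo_size_max_rounding() {
  assert(max_rounds_up_to_2_pow_64());
}
```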






Re: Error UTF-8 strings

2020-06-24 Thread Hans Åberg


> On 24 Jun 2020, at 16:05, Ken Moffat via Bug reports for Bison, the GNU 
> parser generator  wrote:
> 
> UTF-8 on its own is not a valid locale.
> 
> A quick search on google suggests that LC_CTYPE will, among other
> things, control what is a valid letter, and lowercase|uppercase
> conversions.

POSIX [1], sec. LC_CTYPE, only requires that the ASCII letters are converted.

1. https://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html
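
As an illustration of the minimal case, the sketch below uppercases a UTF-8 string byte by byte in the "C" locale, where only the ASCII letters are lowercase letters. The helper name upper_bytes is hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <clocale>
#include <string>

// Uppercases byte by byte using the current C library locale.
inline std::string upper_bytes(std::string s) {
  for (char& c : s)
    c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
  return s;
}

inline void demo_c_locale_upper() {
  std::setlocale(LC_CTYPE, "C");
  // "é" in UTF-8 is the two bytes \303\251; in the "C" locale neither byte is
  // a lowercase letter, so only the ASCII letters change.
  assert(upper_bytes("caf\303\251") == "CAF\303\251");
}
```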





Re: Error UTF-8 strings

2020-06-24 Thread Hans Åberg


> On 24 Jun 2020, at 16:05, Ken Moffat via Bug reports for Bison, the GNU 
> parser generator  wrote:
> 
> On Wed, Jun 24, 2020 at 10:20:48AM +0200, Hans Åberg wrote:
>> 
>> I pointed that out: There is a double bug, locale dependent generation 
>> of the parser file, and relying on software that can't handle LC_CTYPE=UTF-8.
> 
> On (at least) linux using glibc, LC_CTYPE requires a valid locale.
> And UTF-8 on its own is not a valid locale.
> 
> A quick search on google suggests that LC_CTYPE will, among other
> things, control what is a valid letter, and lowercase|uppercase
> conversions.

I have found no information about what POSIX says is a valid locale.

> Taking an easy case, with languages written in latin alphabets, what
> is the uppercase of 'i' ?  In Turkey it is İ (with a dot), because
> in turkish dotted-i and dotless-i are different letters.

There is a LANG variable that might be set.





Re: Error UTF-8 strings

2020-06-24 Thread Hans Åberg


> On 24 Jun 2020, at 07:27, Akim Demaille  wrote:
> 
>> Le 23 juin 2020 à 11:12, Hans Åberg  a écrit :
>> 
>> The other errors occurred because I report errors in the grammar actions as:
>>   throw syntax_error(@x, "Name " + $x.text + " already defined in this 
>> scope as "
>> + yytnamerr_(yytname_[x0->first - 255]) + ".");
>> 
>> So perhaps you made some API for that?
> 
> yysymbol_name(x0->first).

I have to write, as x0->first is an int,
  symbol_name((symbol_kind_type)x0->first)

— When you made that complicated name, perhaps you did not consider it spilling 
into the user namespace.

Also, changing tokens and symbols to 'enum class' worked fine, except that the 
lexer returns an int, so one must put in conversions there.
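
A minimal sketch of the conversion meant here. The enumerators and both function names are hypothetical; a real parser would take the token kinds from the Bison-generated header.

```cpp
#include <cassert>

// Hypothetical token kinds declared as a scoped enumeration.
enum class token_kind : int { end = 0, number = 258, name = 259 };

// Hypothetical stand-in for a lexer that still returns a plain int.
inline int legacy_lex() { return 258; }

// An enum class does not convert implicitly from int, so the boundary between
// the lexer and the parser needs an explicit cast.
inline token_kind next_token_kind() {
  return static_cast<token_kind>(legacy_lex());
}

inline void demo_token_conversion() {
  assert(next_token_kind() == token_kind::number);
}
```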

> If you need something else, you'll have to
> read the doc about parse.error=custom.

I saw it, and think of using it.





Re: Error UTF-8 strings

2020-06-24 Thread Hans Åberg


> On 24 Jun 2020, at 13:24, Akim Demaille  wrote:
> 
>> The problem is that I haven't changed my environment variable.
> 
> You claim something has changed with 3.6, I have shown nothing changed.

I may have used LC_CTYPE=en_US.UTF-8 at some point in time, but on MacOS, the 
default is LC_CTYPE=UTF-8.

> But leave Bison alone, it's unrelated to it.

Apart from the fact that Bison compiles the parser file differently in those
two cases.

>>> LC_ALL=UTF-8
>> …
>>> LC_ALL=fr_FR.UTF-8
>> 
>> I pointed that out: There is a double bug, locale dependent generation 
>> of the parser file, and relying on software that can't handle LC_CTYPE=UTF-8.
> 
> I won't even send again the pointers about this.  I've already spent enough
> time on falsified claims.

Well, it is sneaky to have different compiles depending on the locale, as it is 
supposed to be platform independent.

> That was my last message in this thread.

As you wish.

> Cheers!

I will try!





Re: Error UTF-8 strings

2020-06-24 Thread Hans Åberg


> On 24 Jun 2020, at 07:22, Akim Demaille  wrote:
> 
> hi Hans,

Hello,

> Still no reproducible difference.  As I said, nothing has changed here.
> You probably changed something in your environment.

The problem is that I haven't changed my environment variable.

> LC_ALL=UTF-8
…
> LC_ALL=fr_FR.UTF-8

I pointed that out: There is a double bug, locale dependent generation of 
the parser file, and relying on software that can't handle LC_CTYPE=UTF-8.





Re: Error UTF-8 strings

2020-06-23 Thread Hans Åberg

> On 23 Jun 2020, at 07:47, Akim Demaille  wrote:
> 
> give me a reproducible example.

Here is the calc++ example from Bison 3.2.1, where I have changed the 
assignment symbol to "≔", and no other changes, compiled with Bison 3.6.3 using 
LC_CTYPE=UTF-8. In the yytname_, there is "≔":
  "\"\\342\\211\\224\""



Re: Error UTF-8 strings

2020-06-23 Thread Hans Åberg


> On 23 Jun 2020, at 07:47, Akim Demaille  wrote:
> 
>> The question is if that helps, as it is the yytname_ that is translated 
>> according to the LC_CTYPE environment variable.
>> 
>> This also introduces a locale dependency in the Bison compilation, so that 
>> the generated parser no longer is platform independent.
> 
> Yes, that is indeed exactly what I meant: verbose is bad, and always was.
> Use "detailed" instead.

From what I can see by comparing the outputs, UTF-8 is still converted, though 
the writeout might be correct. With "verbose", I get in yytname_ an entry
  "\"\\342\\210\\216\""
whereas with "detailed" in yy_sname
  "\342\210\216"

An improvement if you are not supposed to read the parser source code.
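
The difference between the two entries can be checked directly. The sketch below copies the two string literals quoted above; verbose_entry and detailed_entry are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <string>

// The entry quoted for "verbose": the backslashes are themselves escaped, so
// the C++ string holds the escape sequences as plain text.
inline std::string verbose_entry() { return "\"\\342\\210\\216\""; }

// The entry quoted for "detailed": the compiler decodes the octal escapes
// into the three UTF-8 bytes of U+220E.
inline std::string detailed_entry() { return "\342\210\216"; }

inline void demo_escaping() {
  assert(verbose_entry().size() == 14);  // two quotes plus three 4-character escapes
  assert(detailed_entry().size() == 3);  // the bytes E2 88 8E
  assert(detailed_entry() == "\xE2\x88\x8E");
}
```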

The other errors occurred because I report errors in the grammar actions as:
throw syntax_error(@x, "Name " + $x.text + " already defined in this 
scope as "
  + yytnamerr_(yytname_[x0->first - 255]) + ".");

So perhaps you made some API for that?





Re: Error UTF-8 strings

2020-06-23 Thread Hans Åberg


> On 23 Jun 2020, at 07:47, Akim Demaille  wrote:
> 
>> The question is if that helps, as it is the yytname_ that is translated 
>> according to the LC_CTYPE environment variable.
>> 
>> This also introduces a locale dependency in the Bison compilation, so that 
>> the generated parser no longer is platform independent.
> 
> Yes, that is indeed exactly what I meant: verbose is bad, and always was.
> Use "detailed" instead.

This is worse, I get (in Bison 3.6.3):
error: use of undeclared identifier 'yytname_'; did you mean 'yytable_'?
  + yytnamerr_(yytname_[x0->first - 255]) + ".");

>>> So, there is no new bug in 3.6 here, just something that is well known for
>>> ages, about which you and I already discussed.
>> 
>> Yes, there is, translation dependent on LC_CTYPE, which was not before.
> 
> I believe you are mistaken, and you won't find any difference between 3.6
> and 3.5 on this regard.
> 
> To prove me wrong, give me a reproducible example.

It worked in an earlier 3.x version. I do not try out error reporting that
often.





Re: Error UTF-8 strings

2020-06-22 Thread Hans Åberg


> On 22 Jun 2020, at 07:59, Akim Demaille  wrote:
> 
>> Le 21 juin 2020 à 15:24, Hans Åberg  a écrit :
>> 
>> 
>>> On 21 Jun 2020, at 14:25, Hans Åberg  wrote:
>>> 
>>>> On 21 Jun 2020, at 11:45, Akim Demaille  wrote:
>>>> 
>>>> What locale are you using?
>>> 
>>> LC_CTYPE=UTF-8
>> 
>> The error goes away if setting LC_CTYPE=en_US.UTF-8 before recompiling the 
>> .yy file.
>> 
>> UTF-8 is language independent, so MacOS uses LC_CTYPE=UTF-8, but there is 
>> software that requires a prefix.
> 
> Hans,
> 
> This double-escaping of the UTF-8 characters is a well-known problem
> of parse.error=verbose, that resulted in the introduction of "detailed"
> parse.error.  That was discussed extensively on Bison's lists, and is
> documented in NEWS of 3.6:
> 
> 
> 
> *** Improved syntax error messages
> 
>  Two new values for the %define parse.error variable offer more control to
>  the user.  Available in all the skeletons (C, C++, Java).
> 
>  %define parse.error detailed
> 
>  The behavior of "%define parse.error detailed" is closely resembling that
>  of "%define parse.error verbose" with a few exceptions.  First, it is safe
>  to use non-ASCII characters in token aliases (with 'verbose', the result
>  depends on the locale with which bison was run).  Second, a yysymbol_name
>  function is exposed to the user, instead of the yytnamerr function and the
>  yytname table.  Third, token internationalization is supported (see
>  below).

The question is if that helps, as it is the yytname_ that is translated 
according to the LC_CTYPE environment variable.

This also introduces a locale dependency in the Bison compilation, so that the 
generated parser no longer is platform independent.

> Besides, I have recently posted that Bison 3.7 will also make another step:
> 
> 
> 
> *** String aliases are faithfully propagated
> 
>  Bison used to interpret user strings (i.e., decoding backslash escapes)
>  when reading them, and to escape them (i.e., issue non-printable
>  characters as backslash escapes, taking the locale into account) when
>  outputting them.  As a consequence non-ASCII strings (say in UTF-8) ended
>  up "ciphered" as sequences of backslash escapes.  This happened not only
>  in the generated sources (where the compiler will reinterpret them), but
>  also in all the generated reports (text, xml, html, dot, etc.).  Reports
>  were therefore not readable when string aliases were not pure ASCII.
>  Worse yet: the output depended on the user's locale.
> 
>  Now Bison faithfully treats the string aliases exactly the way the user
>  spelled them.  This fixes all the aforementioned problems.  However, now,
>  string aliases semantically equivalent but syntactically different (e.g.,
>  "A", "\x41", "\101") are considered to be different.

This might help as well.

> So, there is no new bug in 3.6 here, just something that is well known for
> ages, about which you and I already discussed.

Yes, there is: the translation now depends on LC_CTYPE, which it did not before.




Re: Error UTF-8 strings

2020-06-21 Thread Hans Åberg


> On 21 Jun 2020, at 14:25, Hans Åberg  wrote:
> 
>> On 21 Jun 2020, at 11:45, Akim Demaille  wrote:
>> 
>> What locale are you using?
> 
> LC_CTYPE=UTF-8

The error goes away if setting LC_CTYPE=en_US.UTF-8 before recompiling the .yy 
file.

UTF-8 is language independent, so MacOS uses LC_CTYPE=UTF-8, but there is 
software that requires a prefix.





Re: Error UTF-8 strings

2020-06-21 Thread Hans Åberg


> On 21 Jun 2020, at 11:45, Akim Demaille  wrote:
> 
> Hi Hans,

Hello,

>> Le 21 juin 2020 à 10:19, Hans Åberg  a écrit :
>> 
>> Use
>> %token  ""
>> where  has non-ASCII characters (high bit set), and trigger it in a 
>> syntax error, as the failing token or one of the expected ones.
> 
> I need details.  What skeleton?  

%skeleton "lalr1.cc"

> What options for parse.error?  

%define parse.error verbose

> What locale are you using?

LC_CTYPE=UTF-8

> If you want some support, give us something we can actually work on.
> 
> Give me something reproducible.  I'll disregard anything else.

It is embedded in a grammar about the size of C++98, so it will take some time 
to figure out a smaller example.





Re: Error UTF-8 strings

2020-06-21 Thread Hans Åberg


> On 21 Jun 2020, at 07:59, Akim Demaille  wrote:
> 
> Hi Hans,
> 
>> Le 20 juin 2020 à 22:54, Hans Åberg  a écrit :
>> 
>> It seems that the writeout of error strings with UTF-8 has regressed, 
>> because with Bison 3.6.3 I get:
>> error: syntax error, unexpected "\302\260"
>> 
>> I wrote a workaround code for this, and removed it when the issue seemed 
>> fixed, but now perhaps it should be put back again.
> 
> Could you please give more details?  How can I reproduce that?

Use
  %token  ""
where  has non-ASCII characters (high bit set), and trigger it in a 
syntax error, as the failing token or one of the expected ones.




Error UTF-8 strings

2020-06-20 Thread Hans Åberg
It seems that the writeout of error strings with UTF-8 has regressed, 
because with Bison 3.6.3 I get:
  error: syntax error, unexpected "\302\260"

I wrote a workaround code for this, and removed it when the issue seemed fixed, 
but now perhaps it should be put back again.
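
The workaround code itself is not shown in this message. Purely as an illustration of what such a workaround could look like, and not as the code actually used, the hypothetical helper unescape_octal below turns backslash-octal sequences in an error message back into the bytes they denote.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Replaces each backslash followed by exactly three octal digits (value at
// most 0377) with the byte it denotes; everything else is copied unchanged.
inline std::string unescape_octal(const std::string& in) {
  auto is_octal = [](char c) { return c >= '0' && c <= '7'; };
  std::string out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (in[i] == '\\' && i + 3 < in.size() &&
        in[i + 1] >= '0' && in[i + 1] <= '3' &&
        is_octal(in[i + 2]) && is_octal(in[i + 3])) {
      const int value =
          (in[i + 1] - '0') * 64 + (in[i + 2] - '0') * 8 + (in[i + 3] - '0');
      out += static_cast<char>(value);
      i += 3;  // skip the three digits just consumed
    } else {
      out += in[i];
    }
  }
  return out;
}

inline void demo_unescape() {
  // \302\260 are the UTF-8 bytes of U+00B0, the degree sign.
  assert(unescape_octal("unexpected \"\\302\\260\"") == "unexpected \"\302\260\"");
}
```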





Re: Bison 3.5.1 released

2020-01-30 Thread Hans Åberg


> On 20 Jan 2020, at 06:05, Akim Demaille  wrote:
> 
>> Le 19 janv. 2020 à 21:52, Hans Åberg  a écrit :
>> 
>> —I expected you having access to the platform, or maybe it is MacOS 10.15 
>> specific, so I removed it and would need to compile it again.
> 
> No, I don't see any failure on 10.14.6 (I don't want to experience the 
> mail migration failure you pointed me to).

It may have been fixed in the now released MacOS 10.15.3.





Re: Bison 3.5.1 released

2020-01-24 Thread Hans Åberg


> On 20 Jan 2020, at 06:05, Akim Demaille  wrote:
> 
>> —I expected you having access to the platform, or maybe it is MacOS 10.15 
>> specific, so I removed it and would need to compile it again.
> 
> No, I don't see any failure on 10.14.6 (I don't want to experience the 
> mail migration failure you pointed me to).

Rumor is that it is fixed in the upcoming MacOS 10.15.3.





Re: Bison 3.5.1 released

2020-01-20 Thread Hans Åberg


> On 20 Jan 2020, at 13:37, Akim Demaille  wrote:
> 
> Hi Hans,

Hello Akim,

> Please keep the list in CC.

It was for the attachment, to avoid public cluttering.

>> Le 20 janv. 2020 à 10:48, Hans Åberg  a écrit :
>> 
>> Attached test suite logs 483-489.
> 
> Thanks!

You are welcome!

> But the errors are
> 
> 484. torture.at:271: testing State number type: 129 states ...
> ../../bison-3.5.1/tests/torture.at:271: ruby $abs_top_srcdir/tests/linear 129 
> >input.y || exit 77
> --- /dev/null 2020-01-20 10:37:01.0 +0100
> +++ 
> /usr/local/src/bison/build-bison-3.5.1-apple-clang/tests/testsuite.dir/at-groups/484/stderr
>2020-01-20 10:37:01.0 +0100
> @@ -0,0 +1 @@
> +/System/Library/Frameworks/Ruby.framework/Versions/2.6/usr/lib/ruby/2.6.0/universal-darwin19/rbconfig.rb:229:
>  warning: Insecure world writable dir /usr/local/src/bison in PATH, mode 
> 040777
> 484. torture.at:271: 484. State number type: 129 states (torture.at:271): 
> FAILED (torture.at:271)
> 
> and I don't think they are related to Bison.  I have no idea why you would 
> have only these tests fail, and not many more.

It is a permissions issue: One can put the sources in /usr/local/src/, but as 
one should not compile as root, only for the install, I allowed others to write.

Then the component does not admit that for any intermediate directory.

The issue does not show up when changing that, or using the home directory.





Re: Bison 3.5.1 released

2020-01-20 Thread Hans Åberg


> On 20 Jan 2020, at 06:05, Akim Demaille  wrote:
> 
>> Le 19 janv. 2020 à 21:52, Hans Åberg  a écrit :
>> 
>>> On 19 Jan 2020, at 21:34, Akim Demaille  wrote:
>>> 
>>>> Le 19 janv. 2020 à 18:48, Hans Åberg  a écrit :
>>>> 
>>>>> On 19 Jan 2020, at 14:53, Akim Demaille  wrote:
>>>>> 
>>>>> Bison 3.5.1 fixes a few minor issues from Bison 3.5.
>>>> 
>>>> I got some make check errors, MacOS 10.15.2, same for (MacPorts) gcc9 
>>>> 9.2.0, clang 9.0.1, and Apple clang version 11.0.0:
>>>> 
>>>> 483: State number type: 128 states   FAILED 
>>>> (torture.at:270)
>>>> 484: State number type: 129 states   FAILED 
>>>> (torture.at:271)
>>>> 485: State number type: 256 states   FAILED 
>>>> (torture.at:272)
>>>> 486: State number type: 257 states   FAILED 
>>>> (torture.at:273)
>>>> 487: State number type: 32768 states FAILED 
>>>> (torture.at:274)
>>>> 488: State number type: 65536 states FAILED 
>>>> (torture.at:275)
>>>> 489: State number type: 65537 states FAILED 
>>>> (torture.at:276)
>>> 
>>> Thanks for the report.
>>> 
>>> Could you please send the testsuite.log file?  It should probably be
>>> zipped, it's long.
>> 
>> Which compiler?
> 
> I guess the problem is the same in all cases, so any for a start.

I use the Apple clang, as people are most likely to use that one. Which log 
files do you want?

>> —I expected you having access to the platform, or maybe it is MacOS 10.15 
>> specific, so I removed it and would need to compile it again.
> 
> No, I don't see any failure on 10.14.6 (I don't want to experience the 
> mail migration failure you pointed me to).

There have been a number of updates, and that was the only case I saw. There
were some problems with MacPorts at first, but that seems to have stabilized.





Re: Bison 3.5.1 released

2020-01-19 Thread Hans Åberg


> On 19 Jan 2020, at 21:34, Akim Demaille  wrote:
> 
>> Le 19 janv. 2020 à 18:48, Hans Åberg  a écrit :
>> 
>>> On 19 Jan 2020, at 14:53, Akim Demaille  wrote:
>>> 
>>> Bison 3.5.1 fixes a few minor issues from Bison 3.5.
>> 
>> I got some make check errors, MacOS 10.15.2, same for (MacPorts) gcc9 9.2.0, 
>> clang 9.0.1, and Apple clang version 11.0.0:
>> 
>> 483: State number type: 128 states   FAILED (torture.at:270)
>> 484: State number type: 129 states   FAILED (torture.at:271)
>> 485: State number type: 256 states   FAILED (torture.at:272)
>> 486: State number type: 257 states   FAILED (torture.at:273)
>> 487: State number type: 32768 states FAILED (torture.at:274)
>> 488: State number type: 65536 states FAILED (torture.at:275)
>> 489: State number type: 65537 states FAILED (torture.at:276)
> 
> Thanks for the report.
> 
> Could you please send the testsuite.log file?  It should probably be
> zipped, it's long.

Which compiler? —I expected you having access to the platform, or maybe it is 
MacOS 10.15 specific, so I removed it and would need to compile it again.





Re: Bison 3.5.1 released

2020-01-19 Thread Hans Åberg
Great! —This is a feature one might want to enable without checking directly. 


> On 19 Jan 2020, at 18:56, Adrian Vogelsgesang  
> wrote:
> 
> Seems like a bug to me. We should only accept "none" and "full".
> I am working on it right now, it should be easy to fix. 
> 
> Thanks for reporting!
> 
> On 19/01/2020, 15:27, "bug-bison on behalf of Hans Åberg" 
>  haber...@telia.com> wrote:
> 
> 
>> On 19 Jan 2020, at 14:53, Akim Demaille  wrote:
>> 
>> Adrian Vogelsgesang contributed lookahead correction for C++.
> 
>How do you know this has been enabled? —For example, this is legal:
>  %define parse.lac anything
> 
> 
> 
> 
> 
> 




Re: Bison 3.5.1 released

2020-01-19 Thread Hans Åberg


> On 19 Jan 2020, at 14:53, Akim Demaille  wrote:
> 
> Bison 3.5.1 fixes a few minor issues from Bison 3.5.

I got some make check errors, MacOS 10.15.2, same for (MacPorts) gcc9 9.2.0, 
clang 9.0.1, and Apple clang version 11.0.0:

483: State number type: 128 states   FAILED (torture.at:270)
484: State number type: 129 states   FAILED (torture.at:271)
485: State number type: 256 states   FAILED (torture.at:272)
486: State number type: 257 states   FAILED (torture.at:273)
487: State number type: 32768 states FAILED (torture.at:274)
488: State number type: 65536 states FAILED (torture.at:275)
489: State number type: 65537 states FAILED (torture.at:276)





Re: Bison 3.5 released [stable]

2019-12-14 Thread Hans Åberg


> On 14 Dec 2019, at 17:19, Akim Demaille  wrote:
> 
>> Le 13 déc. 2019 à 18:57, Hans Åberg  a écrit :
>> 
>> 
>>> On 13 Dec 2019, at 18:15, Akim Demaille  wrote:
>>> 
>>> https://ftp.gnu.org/gnu/bison/bison-3.5.tar.xz
>> 
>> Still, 'make install-pdf' does not work, because the rule is not in the 
>> Makefile of a number of *po directories.
> 
> Gee...  Some day, some month, some year, some century, the full series of 
> tools will be fixed on this regard.

BTW, recent Automake seems fine figuring out include files from the compiler. I 
do not know how it would work on a complex project like Bison, though.




Re: Bison 3.5 released [stable]

2019-12-13 Thread Hans Åberg


> On 13 Dec 2019, at 18:15, Akim Demaille  wrote:
> 
> https://ftp.gnu.org/gnu/bison/bison-3.5.tar.xz

Still, 'make install-pdf' does not work, because the rule is not in the 
Makefile of a number of *po directories.





Re: [PATCH 0/3] yacc: compute the best type for the state number

2019-10-26 Thread Hans Åberg


> On 26 Oct 2019, at 09:05, Akim Demaille  wrote:
> 
>> Le 25 oct. 2019 à 18:13, Paul Eggert  a écrit :
>> 
>> On 10/25/19 7:15 AM, Théophile Ranquet wrote:
>>> 
>>> This sounds interesting and I would love reading what people have to
>>> say about this. However, I have failed at finding any such discussion
>>> or source. Could you perhaps share a few pointers?
>> 
>> 
>> I don't know of a good central email thread about this, but here's a style 
>> guideline:
>> 
>> https://www.gnu.org/software/emacs/manual/html_node/elisp/C-Integer-Types.html
> 
> About C++, this page has a good summary of the trend, which is "run away from 
> unsigned".
> 
> https://stackoverflow.com/questions/18795453/why-prefer-signed-over-unsigned-in-c
> 
> This talk is about undefined behavior, and why it's good to have some (in 
> particular because, as Paul already reported, this allows tools such as 
> sanitizers to catch these errors).
> 
> https://youtu.be/yG1OZ69H_-o

Undefined behavior also allows modern optimization, so it is important to stick 
to the language specification, see:

http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_21.html





Re: Error compiling bison 3.4.2 on Solaris

2019-10-23 Thread Hans Åberg


> On 23 Oct 2019, at 22:45, Paul Eggert  wrote:
> 
> On 10/22/19 11:08 PM, Akim Demaille wrote:
> 
>> /opt/local/lib/gcc9/gcc/x86_64-apple-darwin18/9.2.0/include-fixed/os/base.h
>> ...
>> #ifndef __has_builtin
>> #define __has_builtin(x) 0
>> #endif
> rg/bugzilla/show_bug.cgi?id=66970
>> ... GCC folks want to emulate that compiler.  By emasculating theirs.
> 
> Actually, the GCC installation procedure derives its os/base.h from macOS's, 
> so this problem is due to Apple: they don't particularly want you to use GCC, 
> and they are in effect emasculating GCC in their .

On MacOS 10.14, the system headers in /usr/include are removed, so for C++ I had 
to use the following, with g++ and gcc set to the real ones:
../mli-root/configure CXX=g++ CXXFLAGS=-g CC=gcc CFLAGS=-g \
  CPPFLAGS="-isysroot $(xcrun --show-sdk-path) -I /usr/local/include" \
  LDFLAGS="-L /usr/local/lib"





Re: [PATCH 0/3] yacc: compute the best type for the state number

2019-10-07 Thread Hans Åberg


> On 7 Oct 2019, at 07:15, Akim Demaille  wrote:
> 
>> Le 2 oct. 2019 à 15:58, Paul Eggert  a écrit :
>> 
>> On 10/1/19 10:27 PM, Akim Demaille wrote:
>>> I don't understand why you shy away from -128 and -32768 though.
>> 
>> The C Standard doesn't guarantee support for -128 and -32768. This 
>> originally was for ones' complement machines such as the Unisys ClearPath 
>> Dorado enterprise servers (still in use via firmware translation to Intel 
>> Xeon, and they have a C compiler),
> 
> Wow!  I was unaware of this.  Ain't it this kind of machines that is still in 
> use to run ancient COBOL programs?  Are they still used in production for 
> programs written in C?

The C/C++ compiler is still allowed to use it for optimizations, even though I 
think all current hardware uses modulo 2^k representations; there is an 
example here:
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html#signed_overflow

This link then gives an example of how the optimizer can remove an overflow 
check, causing a security issue:
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html
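
As a rough illustration of the point about overflow checks (not taken from the linked articles), a test of the form "a + b < a" is performed after the overflow has already happened, so the compiler may assume it is never true for signed operands. A minimal sketch of a check done before the addition, using only defined operations, could look like this; the helper name checked_add is hypothetical:

```cpp
#include <climits>

// Hypothetical helper: stores a + b in *result and returns true, or returns
// false (leaving *result untouched) if the sum does not fit in an int.
// The test runs before the addition, so no signed overflow is ever evaluated.
bool checked_add(int a, int b, int* result) {
  if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
    return false;
  *result = a + b;
  return true;
}
```

A caller would test the return value and treat false as the error case, instead of inspecting the sum afterwards.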





Re: [PATCH 0/3] yacc: compute the best type for the state number

2019-10-02 Thread Hans Åberg


> On 2 Oct 2019, at 00:43, Paul Eggert  wrote:
> 
> However, we should avoid unsigned types that are 'unsigned' or wider, as they 
> have too many issues. I doubt whether there are practical uses of Bison with 
> more than INT_MAX states; but if there are, we should use ptrdiff_t to count 
> states, not int, because any application likely to exceed INT_MAX is also 
> pretty likely to exceed UINT_MAX.
> 
> There is an area where Bison uses 'int' when it should use a wider type, 
> presumably intmax_t. This is for line and column numbers, which these come 
> from input files and can exceed INT_MAX in some practical cases. Fixing that 
> could be a subject of a different patch.
> 
> There are no doubt other uses of 'int' in Bison for indexes should be changed 
> to ptrdiff_t, for things like stack depth where Bison should not impose 
> arbitrary limits. I think I should take a look at that next; this will 
> probably entail improvements to the patch I proposed.

For C++, I use
  typedef std::ptrdiff_t int_type;
  typedef std::size_t size_type;

In C++ one uses std::fpos_t for file positions [1], in functions like 
std::fgetpos [2]. 

1. https://en.cppreference.com/w/cpp/io/c
2. https://en.cppreference.com/w/cpp/io/c/fgetpos





Re: [PATCH 0/3] yacc: compute the best type for the state number

2019-10-01 Thread Hans Åberg


> On 1 Oct 2019, at 21:21, Kaz Kylheku  wrote:
> 
> On 2019-10-01 06:53, Hans Åberg wrote:
>> One should note that the unsigned types are required to be 2’s
>> complement C/C++, unlike the signed ones, cf.
> 
> That is untrue by definition, since two's complement is strictly
> a mechanism for representing negative values, which the unsigned
> types do not do. They follow a "pure binary enumeration"
> (I think that's the wording).

Sorry, it should be integers modulo 2^n, where n is the number of bits, which 
2’s complement is equivalent to with a certain choice of integer 
representatives.
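
A small sketch of what arithmetic modulo 2^n means in practice for unsigned types; the function names are made up for the illustration:

```cpp
#include <cstdint>

// The sum is computed in int after integer promotion; converting it back to
// std::uint8_t reduces it modulo 2^8, which is well defined.
std::uint8_t wrap_add(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>(a + b);
}

// Unsigned subtraction wraps around: 0u - x is the representative of -x
// modulo 2^n, where n is the number of value bits in unsigned.
unsigned negate(unsigned x) {
  return 0u - x;
}
```

Here wrap_add(255, 1) gives 0, and negate(1) gives the largest unsigned value, which is the same bit pattern that 2's complement uses for -1.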





Re: [PATCH 0/3] yacc: compute the best type for the state number

2019-10-01 Thread Hans Åberg


> On 1 Oct 2019, at 10:35, Paul Eggert  wrote:
> 
> In other GNU applications, we've been moving away from using unsigned types 
> due to their confusing behavior (particularly when comparing signed vs 
> unsigned), and because modern tools such as 'gcc -fsanitize=undefined' can 
> check for signed integer overflow but (obviously) not for unsigned integer 
> overflow. In this newer style, it's OK to use unsigned types for bit vectors 
> and hash values since these typically don't involve integer comparisons and 
> integer overflow is typically harmless; for indexes, though, unsigned types 
> are so error-prone that they should be avoided.

One should note that the unsigned types are required to be 2’s complement in 
C/C++, unlike the signed ones, cf. [1]. Also, in C++, index types are required 
to be large enough to hold all values, so on 64-bit machines they are 64 bits 
wide even for strings, which usually are quite small.

1. 
https://en.wikipedia.org/wiki/Integer_overflow#Methods_to_mitigate_integer_overflow_problems





Re: Character error not reported

2019-07-03 Thread Hans Åberg


> On 3 Jul 2019, at 07:24, Akim Demaille  wrote:
> 
>> Le 2 juil. 2019 à 14:15, Hans Åberg  a écrit :
>> 
>>> On 2 Jul 2019, at 07:08, Akim Demaille  wrote:
>>> 
>>>> Le 18 juin 2019 à 18:09, Hans Åberg  a écrit :
>>>> 
>>>> As 8-bit character tokens are not useful with UTF-8, I have replaced it 
>>>> with:
>>>> %token token_error "token error"
>>>> 
>>>> . { return my_parser::token::token_error; }
>>>> 
>>>> Please let me know if there is a better way to generate a parser error.
>>> 
>>> I personally prefer to throw an exception.
>>> 
>>> .   throw parser::syntax_error(loc, "invalid character: "s + yytext);
>> 
>> I changed to that too, wording it to look as though thrown by the 
>> parser:
>> . { throw my_parser::syntax_error(yylloc, "syntax error, unexpected my_parser token error."); }
>> 
>> When the match is only a part of a UTF-8 sequence, it is not useful to 
>> report what it is.
> 
> You have a point.  I would still report the culprit, but improve the pattern.

As for Bison, I thought of it mainly as a suggestion for better diagnostics.

> /* UTF-8 Encoded Unicode Code Point, from Flex's documentation. */
> mbchar  [\x09\x0A\x0D\x20-\x7E]|[\xC2-\xDF][\x80-\xBF]|\xE0[\xA0-\xBF][\x80-\xBF]|[\xE1-\xEC\xEE\xEF]([\x80-\xBF]{2})|\xED[\x80-\x9F][\x80-\xBF]|\xF0[\x90-\xBF]([\x80-\xBF]{2})|[\xF1-\xF3]([\x80-\xBF]{3})|\xF4[\x80-\x8F]([\x80-\xBF]{2})
> 
> %%
> 
> {mbchar}  throw parser::syntax_error(loc, "invalid character: "s + yytext);
> . throw parser::syntax_error(loc, "invalid byte: "s + yytext);

Thanks for the suggestion. I made a Haskell program generating such regex 
patterns for UTF-8 and UTF-32 character classes, and also a C++ version.

I am thinking, though, of testing my own software, which I mentioned before, as 
a replacement for Flex.
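
For readers who want to check the byte ranges in the quoted pattern by hand, here is a hedged sketch in C++. It is not the generator program mentioned above; utf8_length and its helpers are hypothetical names. It follows the usual table of well-formed UTF-8 byte sequences, and unlike the quoted Flex pattern it accepts every byte from 0x00 to 0x7F as a one-byte sequence.

```cpp
#include <cstddef>

namespace {
bool in_range(unsigned char b, unsigned char lo, unsigned char hi) {
  return lo <= b && b <= hi;
}
bool is_cont(unsigned char b) { return in_range(b, 0x80, 0xBF); }
}  // namespace

// Returns the length (1 to 4) of the well-formed UTF-8 sequence starting at p,
// or 0 if the first bytes do not form one. n is the number of bytes available.
std::size_t utf8_length(const unsigned char* p, std::size_t n) {
  if (n == 0) return 0;
  const unsigned char b0 = p[0];
  if (b0 <= 0x7F) return 1;
  if (in_range(b0, 0xC2, 0xDF))
    return (n >= 2 && is_cont(p[1])) ? 2 : 0;
  if (in_range(b0, 0xE0, 0xEF)) {
    if (n < 3 || !is_cont(p[2])) return 0;
    // E0 excludes overlong forms, ED excludes surrogates.
    const unsigned char lo = (b0 == 0xE0) ? 0xA0 : 0x80;
    const unsigned char hi = (b0 == 0xED) ? 0x9F : 0xBF;
    return in_range(p[1], lo, hi) ? 3 : 0;
  }
  if (in_range(b0, 0xF0, 0xF4)) {
    if (n < 4 || !is_cont(p[2]) || !is_cont(p[3])) return 0;
    // F0 excludes overlong forms, F4 excludes code points above U+10FFFF.
    const unsigned char lo = (b0 == 0xF0) ? 0x90 : 0x80;
    const unsigned char hi = (b0 == 0xF4) ? 0x8F : 0xBF;
    return in_range(p[1], lo, hi) ? 4 : 0;
  }
  return 0;  // stray continuation byte, C0, C1, or F5..FF
}
```

A result of 0 corresponds to the "invalid byte" rule in the quoted lexer, and a nonzero result to a complete character that can be reported as a whole.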





Re: Character error not reported

2019-07-02 Thread Hans Åberg


> On 2 Jul 2019, at 07:08, Akim Demaille  wrote:
> 
> Hi Hans,

Hello,

>> Le 18 juin 2019 à 18:09, Hans Åberg  a écrit :
>> 
>> As 8-bit character tokens are not useful with UTF-8, I have replaced it with:
>> %token token_error "token error"
>> 
>> . { return my_parser::token::token_error; }
>> 
>> Please let me know if there is a better way to generate a parser error.
> 
> I personally prefer to throw an exception.
> 
>  .   throw parser::syntax_error(loc, "invalid character: "s + yytext);

I changed to that too, wording it to look as though thrown by the parser:
. { throw my_parser::syntax_error(yylloc, "syntax error, unexpected my_parser token error."); }

When the match is only a part of a UTF-8 sequence, it is not useful to report 
what it is.

The token-error token may still be needed, though, as I store token values on 
the symbol table.





Re: Character error not reported

2019-06-18 Thread Hans Åberg


> On 17 Jun 2019, at 18:06, Akim Demaille  wrote:
> 
> Hi Hans,

Hi,

>> Le 17 juin 2019 à 15:12, Hans Åberg  a écrit :
>> 
>> When the lexer returns a byte with the high bit set that is not used in the 
>> grammar, the parser generated by Bison 3.4.1 does not report an error; it 
>> does so only if the high bit is not set.
> 
> This is hard to believe.  I suspect your problem is elsewhere.
> 
>> This occurs if one sets a Flex default rule
>> . { return yytext[0]; }
>> and the lexer finds a stray UTF-8 byte.
> 
> I would say that here, you return a char (yytext[0]) with "a high bit set", 
> on an architecture where char is signed, so you are actually returning a 
> negative int (when the 8th bit is set).  And for Bison, any negative token 
> number stands for end-of-file.

Indeed, likely the case.

> You should actually write:
> 
> . { return (unsigned char) yytext[0]; }

As 8-bit character tokens are not useful with UTF-8, I have replaced it with:
  %token token_error "token error"

. { return my_parser::token::token_error; }

Please let me know if there is a better way to generate a parser error.
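
The sign issue described in the quoted explanation can be reproduced outside Flex. The sketch below uses two hypothetical helper functions that mirror the two lexer actions; whether plain char is signed is implementation-defined, so the first one may or may not return a negative value for a byte such as 0xC3.

```cpp
// Mirrors ". { return yytext[0]; }": where char is signed, a byte with the
// high bit set becomes a negative int.
int token_from_plain_char(char c) {
  return c;
}

// Mirrors ". { return (unsigned char) yytext[0]; }": the byte is always
// mapped to a value in the range 0..255.
int token_from_unsigned_char(char c) {
  return static_cast<unsigned char>(c);
}
```

On a platform with 8-bit signed char, token_from_plain_char('\xC3') is -61, while token_from_unsigned_char('\xC3') is 195 everywhere.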






Character error not reported

2019-06-17 Thread Hans Åberg
When the lexer returns a byte with the high bit set that is not used in the 
grammar, the parser generated by Bison 3.4.1 does not report an error; it does 
so only if the high bit is not set. This occurs if one sets a Flex default rule
  . { return yytext[0]; }
and the lexer finds a stray UTF-8 byte.







Re: Bison 3.4.1 released [stable]

2019-05-22 Thread Hans Åberg


> On 22 May 2019, at 22:27, Akim Demaille  wrote:
> 
>> Le 22 mai 2019 à 22:26, Hans Åberg  a écrit :
>> 
>>> On 22 May 2019, at 21:34, Akim Demaille  wrote:
>>> 
>>> The Bison team … Bison 3.4.1.
>> 
>> A bug, that was also in 3.4:
>> 
>> # make install-pdf
>> Making install-pdf in po
>> make[1]: *** No rule to make target `install-pdf'.  Stop.
>> make: *** [install-pdf-recursive] Error 1
> 
> Yes, known issue Hans.  I wait until a modern gettext hits my environment.

Who knows, perhaps I am the only one using it. :-)

> Thanks.

You are welcome.





Re: Bison 3.4.1 released [stable]

2019-05-22 Thread Hans Åberg


> On 22 May 2019, at 21:34, Akim Demaille  wrote:
> 
> The Bison team … Bison 3.4.1.

A bug, that was also in 3.4:

# make install-pdf
Making install-pdf in po
make[1]: *** No rule to make target `install-pdf'.  Stop.
make: *** [install-pdf-recursive] Error 1





Re: (non)Use of C++ 11 constructs in skeleton

2019-05-19 Thread Hans Åberg


> On 19 May 2019, at 18:09, Akim Demaille  wrote:
> 
>> Le 19 mai 2019 à 14:34, Hans Åberg  a écrit :
>> 
>>> I'll revert that for 3.4.1 then…
>> 
>> You must choose what you feel is best.
> 
> With regards to C++, I have no doubt, as proved by Frank's pointer. 

Almost all uses of “copyable” come from C++11 it seems: The C++98 standard only 
has two occurrences. A net search with it and C++ gives 45 million hits, 
without C++ only a hundredth of that.





Re: Bison 3.4 released [stable]

2019-05-19 Thread Hans Åberg


> On 19 May 2019, at 14:01, Akim Demaille  wrote:
> 
> We are happy to announce the release of Bison 3.4.

An old problem with installing PDF docs is back:

# make install-pdf
Making install-pdf in po
make[1]: *** No rule to make target `install-pdf'.  Stop.





Re: (non)Use of C++ 11 constructs in skeleton

2019-05-19 Thread Hans Åberg


> On 19 May 2019, at 13:38, Akim Demaille  wrote:
> 
>> Le 19 mai 2019 à 12:58, Frank Heckenbach  a écrit :
>> 
>> Akim Demaille wrote:
>> 
>>>> Le 19 mai 2019 à 11:02, Hans Åberg  a écrit :
>>>> 
>>>> Also a spelling error: copiable.
>>> 
>>> I'm installing this.  Thanks a lot Hans!
>>> 
>>>   fix: use copiable, not copyable
>> 
>> Am I missing something? Seems like "copyable" is a valid alternative
>> form:
>> 
>> https://en.wiktionary.org/wiki/copiable
>> 
>> and commonly used in C++:
>> 
>> https://en.cppreference.com/w/cpp/types/is_trivially_copyable
> 
> Bummer.  Reading 'copiable' felt so weird…  

I felt the opposite. :-)

> But the dictionary I checked had it, and not 'copyable’.

Probably the same as I did (see my other post). It has varied historically [1].

> I'll revert that for 3.4.1 then…

You must choose what you feel is best. Here is a video on the topic of English 
spelling history.
  https://www.youtube.com/watch?v=EqLiRu34kWo

1. 
https://books.google.com/ngrams/graph?content=copiable%2Ccopyable_insensitive=on_start=1800_end=2000=15=3=_url=t4%3B%2Ccopiable%3B%2Cc0%3B%2Cs0%3B%3Bcopiable%3B%2Cc0%3B%3BCopiable%3B%2Cc0%3B.t4%3B%2Ccopyable%3B%2Cc0%3B%2Cs0%3B%3Bcopyable%3B%2Cc0%3B%3BCopyable%3B%2Cc0





Re: (non)Use of C++ 11 constructs in skeleton

2019-05-19 Thread Hans Åberg


> On 19 May 2019, at 12:58, Frank Heckenbach  wrote:
> 
> Akim Demaille wrote:
> 
>>> Le 19 mai 2019 à 11:02, Hans Åberg  a écrit :
>>> 
>>> Also a spelling error: copiable.
>> 
>> I'm installing this.  Thanks a lot Hans!
>> 
>>fix: use copiable, not copyable
> 
> Am I missing something? Seems like "copyable" is a valid alternative
> form:
> 
>  https://en.wiktionary.org/wiki/copiable
> 
> and commonly used in C++:
> 
>  https://en.cppreference.com/w/cpp/types/is_trivially_copyable

It has varied historically, it seems [1]. I got it from NOAD/ODE only listing 
‘copiable’, whereas AmH lists both.

1. 
https://books.google.com/ngrams/graph?content=copiable%2Ccopyable_insensitive=on_start=1800_end=2000=15=3=_url=t4%3B%2Ccopiable%3B%2Cc0%3B%2Cs0%3B%3Bcopiable%3B%2Cc0%3B%3BCopiable%3B%2Cc0%3B.t4%3B%2Ccopyable%3B%2Cc0%3B%2Cs0%3B%3Bcopyable%3B%2Cc0%3B%3BCopyable%3B%2Cc0#t4%3B%2Ccopiable%3B%2Cc0%3B%2Cs0%3B%3Bcopiable%3B%2Cc0%3B%3BCopiable%3B%2Cc0%3B.t4%3B%2Ccopyable%3B%2Cc0%3B%2Cs0%3B%3Bcopyable%3B%2Cc0%3B%3BCopyable%3B%2Cc0





Re: (non)Use of C++ 11 constructs in skeleton

2019-05-19 Thread Hans Åberg


> On 19 May 2019, at 11:19, Akim Demaille  wrote:
> 
>>> Bison 3.4 is about to be published, I'm not going to do that now, but we'll 
>>> do that afterwards.
>> 
>> Also a spelling error: copiable.
> 
> Doh...  I was rolling 3.4, and I just pushed the big red button right on time.

:-)

> I'm installing this.  Thanks a lot Hans!

You are welcome!





Re: Bison 3.3.1 released [stable]

2019-02-04 Thread Hans Åberg


> On 27 Jan 2019, at 16:43, Akim Demaille  wrote:
> 
> In Bison 3.3, the new option --update replaces deprecated features with
> their modern spelling, but also applies fixes such as eliminating duplicate
> directives, etc.

It didn't capture %debug.





Re: Bison 3.3.1 released [stable]

2019-01-27 Thread Hans Åberg


> On 27 Jan 2019, at 19:32, Akim Demaille  wrote:
> 
>> Le 27 janv. 2019 à 18:26, Hans Åberg  a écrit :
>> 
>> 
>>> On 27 Jan 2019, at 16:43, Akim Demaille  wrote:
>>> 
>>> In Bison 3.3, the new option --update replaces deprecated features with
>>> their modern spelling, but also applies fixes such as eliminating duplicate
>>> directives, etc.
>> 
>> This is a nice feature. You might add information about it along with the 
>> warnings when it discovers something it can fix.
> 
> I think what you mean is done, and documented in the NEWS too.
> 
> *** Generation of fix-its for IDEs/Editors
> 
>   When given the new option -ffixit (aka -fdiagnostics-parseable-fixits),
>   bison now generates machine readable editing instructions to fix some
>   issues.  Currently, this is mostly limited to updating deprecated
>   directives and removing duplicates.

I get that without the option.

>   See the "fix-it:" lines below:
...
> foo.y: warning: fix-its can be applied.  Rerun with option '--update'. 
> [-Wother]

It was not prominent enough for me to notice. :-)

Maybe:
Run 'bison --update ' to update deprecated features; see the Bison manual 
for further information.

I put the essential information first. Just an input. :-)

>   This uses the same output format as GCC and Clang.

I tried it, and it caught one instance. I haven't tried it with GCC and Clang, 
though.





Re: Bison 3.3.1 released [stable]

2019-01-27 Thread Hans Åberg


> On 27 Jan 2019, at 16:43, Akim Demaille  wrote:
> 
> In Bison 3.3, the new option --update replaces deprecated features with
> their modern spelling, but also applies fixes such as eliminating duplicate
> directives, etc.

This is a nice feature. You might add information about it along with the 
warnings when it discovers something it can fix.





Re: Porting to typed C++ parser (was: Dynamic token kinds)

2018-12-22 Thread Hans Åberg
>> I do not reply to anybody specific, instead based on contents, but will try 
>> to remember not to reply to you.
> 
> Do as you like, but that's not what I said. I just thought it funny
> that the person who replies to just about any mail here says he
> might not see my mail. No offense! :)

I usually wait for others to reply first, if any. A direct address is used for 
attention. If you don't want to have personal copies, you'll have to change the 
subscription settings or set the reply-to field to the mailing list.





Re: Porting to typed C++ parser (was: Dynamic token kinds)

2018-12-22 Thread Hans Åberg
>> Well, I apologize for replying to your email, but it seemed directed to me.
> 
> This one, yes. But so far you have replied to almost any of my
> mails, whether directed to you or someone else. Which is fine on a
> mailing list, but it makes your "threat" of not noticing my posts a
> bit less urgent. ;)

I do not reply to anybody specific, instead based on contents, but will try 
to remember not to reply to you.





Re: Porting to typed C++ parser (was: Dynamic token kinds)

2018-12-19 Thread Hans Åberg


> On 19 Dec 2018, at 15:38, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>>> By default, I keep recipients (list) and CCs (Akim) as is.
>> 
>> With a reply to all, it would come out right.
> 
> Depends on what you consider right. It's a mailing list. Those who
> write here are expected to read via the list (subscription or
> archive; I know you do) or take their own measures (Reply-To or CC).
> With "reply to all", I get all of your mails twice, doesn't seem
> right to me (though only slightly annoying these days, as opposed to
> 20 years ago when every KB of internet data cost a fortune ;).

That can be changed in your list subscription. The emails I get have both your 
address and bug-bison, so you perhaps already get two copies anyway on the 
reply-alls.

>>> If you
>>> want a private copy, set a Reply-To.
>> 
>> Those are generally bad.
> 
> I don't agree. Some years ago there was a somewhat famous rant
> against them, but it was full of BS and misunderstandings.
> 
> Or you can set a CC to yourself if you wish.

I simply did not know why you had this funny cc'ing to some, but not to me. 
Don't worry about it.

>> I might not notice your posts, so do as you wish.
> 
> Honestly, you *not* noticing my posts is one of my smallest worries. ;)

Well, I apologize for replying to your email, but it seemed directed to me.

>>>>>> Actually, I pass the semantic value through a class object to
>>>>>> which the lexer function belongs, so the extra arguments are not
>>>>>> needed. So I must think more about this.
>>>>> 
>>>>> FWIW, my lexer function is also part of a "driver" object, but I
>>>>> don't think that's relevant here. Maybe you think of the "%param"
>>>>> arguments, but it's not about those.
>>>> 
>>>> No, I use the Flex C++ parser right now, which for some reason
>>>> requires an empty argument, so this works without patching the
>>>> header, that is, until they decided to bomb it in 2.6 and later.
>>> 
>>> I don't use flex anymore, and never used flex C++, so I don't know
>>> about that. But again, that's unrelated to the current proposal.
>> 
>> It was in response to your comment.
> 
> Which was in response to your comment about some mysterious "extra
> arguments".

I do not use %param, don't worry about the other.

>> There might be a problem with the Bison variant if the lexer returns
>> an object of a type other than the token type and expects a
>> conversion.
> 
> There is no conversion, and in most cases, there couldn't be any
> when types are too different; but even if types are similar enough
> that a safe automatic conversion would be possible, say char to int,
> I know of no variant or union that would do so. That's not specific
> to Bison's variant.
> 
> So such a mismatch is an error. The only difference is how this
> error is handled. std::variant throws an exception on usage; Bison's
> variant so far basically makes it UB (which is a valid answer; many
> things in C++ are UB), the current proposal would report an error on
> construction already, so slightly better than std::variant (in case
> the mismatching value is silently discarded).

Two lookup table variations: The values are stored, with the token value, as a 
pointer to a base class, or as variants.
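
To illustrate the std::variant behaviour mentioned in the quoted text (a mismatch is reported when the value is used, not when it is stored), here is a small sketch. It requires C++17, says nothing about Bison's own variant, and the names semantic_value and int_access_throws are made up for the example.

```cpp
#include <string>
#include <variant>

// Stand-in for a semantic value that holds either an int or a string.
using semantic_value = std::variant<int, std::string>;

// Returns true if reading v as an int is rejected with an exception,
// that is, if v currently holds the other alternative.
bool int_access_throws(const semantic_value& v) {
  try {
    (void)std::get<int>(v);
    return false;
  } catch (const std::bad_variant_access&) {
    return true;
  }
}
```

Storing a string and later asking for an int is therefore detected only at the point of access; if the value is never read, the mismatch goes unnoticed.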





Re: Porting to typed C++ parser (was: Dynamic token kinds)

2018-12-18 Thread Hans Åberg



> On 18 Dec 2018, at 13:20, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>>> On 17 Dec 2018, at 19:09, Frank Heckenbach  wrote:
>> 
>> [Note: you don't cc me, only others, for some reason.]
> 
> By default, I keep recipients (list) and CCs (Akim) as is.

With a reply to all, it would come out right.

> If you
> want a private copy, set a Reply-To.

Those are generally bad. I might not notice your posts, so do as you wish.

>>>> Actually, I pass the semantic value through a class object to
>>>> which the lexer function belongs, so the extra arguments are not
>>>> needed. So I must think more about this.
>>> 
>>> FWIW, my lexer function is also part of a "driver" object, but I
>>> don't think that's relevant here. Maybe you think of the "%param"
>>> arguments, but it's not about those.
>> 
>> No, I use the Flex C++ parser right now, which for some reason
>> requires an empty argument, so this works without patching the
>> header, that is, until they decided to bomb it in 2.6 and later.
> 
> I don't use flex anymore, and never used flex C++, so I don't know
> about that. But again, that's unrelated to the current proposal.

It was in response to your comment.

>>> In my proposal:
>>> 
>>> symbol_type make_symbol (token_type type, int &&);
>>> 
>>> the second parameter is actually the semantic value (and the first
>>> one the token kind, of course), so there are no extra arguments, no
>>> driver, no lexer, not even the parser (no hidden "this" parameter,
>>> since these are static functions), so I think there's nothing to
>>> worry about. It's just about building a token from these two
>>> parameters, as you'd expect to, by basically calling its
>>> constructor, and (as per the proposal) adding checks to help avoid
>>> mismatches between the two of them.
>>> 
>>> But if you tell us more about how you (plan to) build your tokens,
>>> we could say if there are any potential problems.
>> 
>> I use a polymorphic (virtual) pointer class, a string, a token
>> number, or a combination thereof depending on context. For the
>> first, it might be useful with typecasts that simplify the code.
> 
> A variant is (in this regard) somewhat similar to a polymorphic
> pointer (except that the pointer needs memory management, as you
> know), so it seems you should be able to use a variant instead.
> 
> With variants, you usually don't use type-casts, but accessors, such
> as get<> with std::variant or as<> with Bison's variant, though in
> Bison parsers, that usually happens automatically with $n.
> 
> So I see no reason why you shouldn't be able to use variants, but of
> course, without seeing any details, I can just talk generally.

There might be a problem with the Bison variant if the lexer returns an object 
of a type other than the token type and expects a conversion.





Re: Porting to typed C++ parser (was: Dynamic token kinds)

2018-12-17 Thread Hans Åberg


> On 17 Dec 2018, at 19:09, Frank Heckenbach  wrote:

[Note: you don't cc me, only others, for some reason.]

>>> On 17 Dec 2018, at 18:37, Frank Heckenbach  wrote:
>>> 
>>> Do you actually use Bison's variant? Otherwise, what you do is
>>> irrelevant here (sorry), as this is a proposal specifically about
>>> Bison's variant.
>> 
>> As I said, I do not use it now, but I wanted to know whether I
>> could use it before actually attempting to convert to it, which
>> may be irrelevant in your programming approach. I have used a
>> typed C++ parser I wrote myself before Akim started to write one,
>> but then it wasn't very useful.
> 
> OK, that's a different question (I think the issue about adding
> type-checks is settled now, just waiting for Akim's confirmation);

Specifically, I only have use for it to simplify casts and type checks, not as 
an optimization.

> so:
> 
>> Actually, I pass the semantic value through a class object to
>> which the lexer function belongs, so the extra arguments are not
>> needed. So I must think more about this.
> 
> FWIW, my lexer function is also part of a "driver" object, but I
> don't think that's relevant here. Maybe you think of the "%param"
> arguments, but it's not about those.

No, I use the Flex C++ parser right now, which for some reason requires an 
empty argument, so this works without patching the header, that is, until they 
decided to bomb it in 2.6 and later.

> In my proposal:
> 
>  symbol_type make_symbol (token_type type, int &&);
> 
> the second parameter is actually the semantic value (and the first
> one the token kind, of course), so there are no extra arguments, no
> driver, no lexer, not even the parser (no hidden "this" parameter,
> since these are static functions), so I think there's nothing to
> worry about. It's just about building a token from these two
> parameters, as you'd expect to, by basically calling its
> constructor, and (as per the proposal) adding checks to help avoid
> mismatches between the two of them.
> 
> But if you tell us more about how you (plan to) build your tokens,
> we could say if there are any potential problems.

I use a polymorphic (virtual) pointer class, a string, a token number, or a 
combination thereof depending on context. For the first, it might be useful 
with typecasts that simplify the code.





Re: Dynamic token kinds

2018-12-17 Thread Hans Åberg


> On 17 Dec 2018, at 18:37, Frank Heckenbach  wrote:
> 
> Do you actually use Bison's variant? Otherwise, what you do is
> irrelevant here (sorry), as this is a proposal specifically about
> Bison's variant.

As I said, I do not use it now, but I wanted to know whether I could use it 
before actually attempting to convert to it, which may be irrelevant in your 
programming approach. I have used a typed C++ parser I wrote myself before Akim 
started to write one, but then it wasn't very useful.





Re: Dynamic token kinds

2018-12-17 Thread Hans Åberg


> On 17 Dec 2018, at 11:17, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>>> On 17 Dec 2018, at 10:48, Frank Heckenbach  wrote:
>> 
>> Might Bison generate a function with a switch statement that produces the 
>> right return value for the lexer to use?
> 
> Different semantic types need separate functions since C++ is
> strongly typed. Perhaps an example makes it clearer:
> 
> Say we have tokens V_FOO and V_BAR with no semantic type, I_BAZ and
> I_QUX with semantic type int and S_BLA with type string. (BTW, I'm
> no fan of Hungarian notation, just use it here for the sake of
> example.) So far Bison generates (roughly speaking):
> 
>  symbol_type make_V_FOO ();
>  symbol_type make_V_BAR ();
>  symbol_type make_I_BAZ (int &&);
>  symbol_type make_I_QUX (int &&);
>  symbol_type make_S_BLA (string &&);

I thought something like that from looking at the calculator example.

> What I suggest to add (without changing the above), is:
> 
>  symbol_type make_symbol (token_type type);
>  // checks at runtime that type is V_FOO or V_BAR
> 
>  symbol_type make_symbol (token_type type, int &&);
>  // checks at runtime that type is I_BAZ or I_QUX
> 
>  symbol_type make_symbol (token_type type, string &&);
>  // checks at runtime that type is S_BLA
> 
> These runtime checks might be implemented via a switch if that's
> easier to auto-generate (it might be in fact) or with a simple
> "if (... || ...)" statement, that's an implementation detail.

Actually, I pass the semantic value through a class object to which the lexer 
function belongs, so the extra arguments are not needed. So I must think more 
about this.
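
As a rough sketch only, the checked overloads proposed in the quoted text above might look like the following. It assumes a std::variant-backed symbol_type as a stand-in (Bison's own variant does not store the type), and the choice of std::invalid_argument is arbitrary. None of this is code that Bison generates.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>

// Token kinds from the example in the quoted mail.
enum token_type { V_FOO, V_BAR, I_BAZ, I_QUX, S_BLA };

// Hypothetical stand-in for the generated symbol_type.
struct symbol_type {
  token_type type;
  std::variant<std::monostate, int, std::string> value;
};

// Untyped tokens: checks at runtime that type is V_FOO or V_BAR.
symbol_type make_symbol(token_type type) {
  if (type != V_FOO && type != V_BAR)
    throw std::invalid_argument("token kind carries a semantic value");
  return symbol_type{type, std::monostate{}};
}

// int tokens: checks at runtime that type is I_BAZ or I_QUX.
symbol_type make_symbol(token_type type, int&& v) {
  if (type != I_BAZ && type != I_QUX)
    throw std::invalid_argument("token kind does not carry an int");
  return symbol_type{type, v};
}

// string tokens: checks at runtime that type is S_BLA.
symbol_type make_symbol(token_type type, std::string&& v) {
  if (type != S_BLA)
    throw std::invalid_argument("token kind does not carry a string");
  return symbol_type{type, std::move(v)};
}
```

A lexer with a dynamically looked-up kind would then call, say, make_symbol(kind, 42), and a kind that does not carry an int is rejected at the call rather than later in the parser.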

>>> It's not that bad actually. Again, my lexers work fine as is.
>>> I just brought this up because Akim proposed to call the function
>>> "unsafe_..." which I thought was too harsh and proposed
>>> "unchecked_..." -- but by adding the checks, it would be neither
>>> unsafe nor unchecked. :)
>> 
>> This worries me.
> 
> That's why I suggest to add the check. :)

There must be some guard against programming errors, I think.

>> But so does having to use something more complex to be returned by the 
>> lexer than a value on the lookup table.
> 
> The lexer returns a token which contains the token kind (an enum)
> and the semantic value (a union value). A mismatch is bad. The
> make_FOO functions avoid a mismatch and are suitable for statically
> known token kinds. The direct constructor call can be used for
> dynamic token kinds, but allows a mismatch. The functions I propose
> to generate instead could be used for dynamic token kinds and avoid
> a mismatch.
> 
> Everything clear now?

Yes, it is the requirement of returning the semantic value that causes the 
issue. Then this requirement is perhaps born out of the condition of not storing 
the type in the Bison variant.





Re: Dynamic token kinds

2018-12-17 Thread Hans Åberg


> On 17 Dec 2018, at 10:48, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>>> On 16 Dec 2018, at 15:48, Frank Heckenbach  wrote:
>>> 
>>> Hans Åberg wrote:
>>> 
>>>> The idea would be to return, rather than the token value,
>>>> something that does not need translation. I don't know if that
>>>> helps the static approach, though.
>>> 
>>> Not sure what you mean here. Please elaborate.
>> 
>> I couldn't see the details when I looked at it. I don't use the typed 
>> parser, but might if it is safe.
> 
> The parser is type safe. This is only about an alternative way of
> creating tokens by the lexer, which is also type safe when used
> properly (as mine does). It's only about adding an additional safety
> net.

Right.

>>> I think Akim made it clear enough that he doesn't like the overhead.
>>> (I don't mind so much, as I used std::variant in my implementation,
>>> but I only have a few parsers to care about.)
>> 
>> In that case, my impression was that he thought he could do without it.
> 
> Well, he can. :)

In that case.

>>> One might validly say that preventing it is the job of the lexer
>>> (and my lexer does so), not Bison's, just like other kinds of
>>> undefined or wrong behaviour outside of the generated code, also
>>> dynamic token types are a somewhat special usage anyway, so Bison
>>> can just do nothing about it, that's fine.
>> 
>> I use the same thing, returning the token value found on a lookup
>> table, but I would not want to use the typed parser if I would
>> have to add translations for every possibility. The information
>> about it is in Bison, therefore it should not be put on the
>> writing of the lexer.
> 
> I think we agree here, and that was actually my concern when I
> started this thread. I don't want to have to write a separate case
> for each token kind in my lexer. Of course, we need a separate case
> for each semantic type because that involves a different type in the
> constructor/builder call already, but these are relatively few,
> compared to token kinds, in my lexers.

Might Bison generate a function with a switch statement that produces the right 
return for the lexer to use?

>>> I also suggested an approach in my previous mail with a few more
>>> generated functions that help runtime checking. I'd prefer Bison to
>>> add them, and then we'd have runtime checking as good as we'd have
>>> with std::variant in this respect.
>> 
>> Maybe an option. Akim perhaps hasn't used this dynamic token
>> lookup.
> 
> I guess he hasn't. But I don't think we need an option. These would
> just be additional functions that one can use or not.

The point of an option would be that those who do not need this feature could 
use a more optimal variant.

>> Those that do might prefer not risking the program to bomb.
> 
> It's not that bad actually. Again, my lexers work fine as is.
> I just brought this up because Akim proposed to call the function
> "unsafe_..." which I thought was too harsh and proposed
> "unchecked_..." -- but by adding the checks, it would be neither
> unsafe nor unchecked. :)

This worries me. But so does having to use something more complex to be 
returned by the lexer than a value on the lookup table.





Re: Dynamic token kinds

2018-12-16 Thread Hans Åberg


> On 16 Dec 2018, at 15:48, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>> The idea would be to return, rather than the token value,
>> something that does not need translation. I don't know if that
>> helps the static approach, though.
> 
> Not sure what you mean here. Please elaborate.

I couldn't see the details when I looked at it. I don't use the typed parser, 
but might if it is safe.

>>>> Personally, I am already at C++17, so I have no reason for using a
>>>> simpler variant. Having the type stored just adds a size_t, and
>>>> that is used a lot in other circumstances, so no overhead to worry
>>>> about.
>>> 
>>> Well, we had this discussion recently (as far as Bison is
>>> concerned).
>> 
>> Indeed, but that was in a case that did not seem to require the
>> type to be stored in the variant. This situation might be different
>> in that respect.
>> 
>> Here, not making sure the type is properly returned may bomb the
>> program, so preventing that seems more important than a rather
>> small overhead.
> 
> I think Akim made it clear enough that he doesn't like the overhead.
> (I don't mind so much, as I used std::variant in my implementation,
> but I only have a few parsers to care about.)

In that case, my impression was that he thought he could do without it.

> One might validly say that preventing it is the job of the lexer
> (and my lexer does so), not Bison's, just like other kinds of
> undefined or wrong behaviour outside of the generated code, also
> dynamic token types are a somewhat special usage anyway, so Bison
> can just do nothing about it, that's fine.

I use the same thing, returning the token value found on a lookup table, but I 
would not want to use the typed parser if I would have to add translations for 
every possibility. The information about it is in Bison, therefore it should 
not be put on the writing of the lexer.
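
To make the use case concrete, here is a minimal sketch of such a dynamic lookup, with the token kind found in a table at run time rather than known statically at each call site. All names (token_kind, symbol_table, declare, lookup) are hypothetical and not part of Bison.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical token kinds: names that parse differently once declared.
enum token_kind { IDENTIFIER, TYPE_NAME, FUNCTION_NAME };

// Table consulted by the lexer for every name it reads.
class symbol_table {
public:
  // Called from parser actions when a declaration is seen.
  void declare(const std::string& name, token_kind kind) {
    kinds_[name] = kind;
  }

  // Called from the lexer; unknown names are plain identifiers.
  token_kind lookup(const std::string& name) const {
    auto it = kinds_.find(name);
    return it == kinds_.end() ? IDENTIFIER : it->second;
  }

private:
  std::unordered_map<std::string, token_kind> kinds_;
};
```

The lexer just returns whatever lookup() yields, which is why a separate hand-written case per token kind is unwelcome here.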

> I also suggested an approach in my previous mail with a few more
> generated functions that help runtime checking. I'd prefer Bison to
> add them, and then we'd have runtime checking as good as we'd have
> with std::variant in this respect.

Maybe an option. Akim perhaps hasn't used this dynamic token lookup. Those 
that do might prefer not risking the program to bomb.





Re: Dynamic token kinds

2018-12-16 Thread Hans Åberg


> On 16 Dec 2018, at 11:13, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>> Perhaps an entirely static approach can be achieved by the type
>> being a part of the token_type. On the other hand, the use is for
>> dynamic token lookup, so it may be too much of an effort for that.
> 
> Not sure what you mean with "part of", but with Bison's variant the
> semantic type is determined by token_type, if that's what you mean.

The idea would be to return, rather than the token value, something that does 
not need translation. I don't know if that helps the static approach, though.

>> Personally, I am already at C++17, so I have no reason for using a
>> simpler variant. Having the type stored just adds a size_t, and
>> that is used a lot in other circumstances, so no overhead to worry
>> about.
> 
> Well, we had this discussion recently (as far as Bison is
> concerned).

Indeed, but that was in a case that did not seem to require the type to be 
stored in the variant. This situation might be different in that respect.

Here, not making sure the type is properly returned may bomb the program, so 
preventing that seems more important than a rather small overhead.





Re: Dynamic token kinds

2018-12-16 Thread Hans Åberg



> On 16 Dec 2018, at 10:02, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>>> On 30 Nov 2018, at 00:40, Frank Heckenbach  wrote:
>>> 
>>> Hans Åberg wrote:
>>> 
>>>> It seems pretty standard to have lookup tokens with different
>>>> syntactic behavior, for example when they are declared of
>>>> different type in a language. So it is worrisome that the typed
>>>> parser deems the use unsafe.
>>> 
>>> What is potentially unsafe is that the actual type may not match the
>>> declared type in the grammar. With std::variant, a mismatch would
>>> cause an exception to be thrown. With Bison's static variant, a
>>> mismatch might lead to UB.
>>> 
>>> So perhaps this function could actually do a type check (which
>>> probably requires another auto-generated switch) and also throw or
>>> (if this is not desired) call std::terminate or so on mismatch,
>>> Akim?
>> 
>> The C++17 std::variant stores the type as an index. So perhaps
>> there should be an additional table storing the type, with a
>> symbol constructor that constructs the right value from the token.
> 
> Unlike std::variant, Bison's variant does not store the type at all.
> So the type check I suggested isn't actually that easy. By the time
> the symbol_type constructor is called, the variant value has already
> been constructed and type information lost.
> 
> So to make it safe, we might need something like this:
> 
>  static inline
>  symbol_type
>  make_symbol (token_type type, b4_locations_if([const location_type& l, ])T&& 
> v);
> 
> auto-generated for each semantic type T of any token (plus one
> without the "v" parameter for untyped tokens) that checks (at
> runtime) the "type" parameter against the (statically known) valid
> token types for T.

I felt that one should just return the token value, in part because otherwise 
the lookup data must be changed, and in part because the type is internal to 
the parser. Then the constructor must be able to add the type, and so I arrived 
at the suggestion of an additional table, for use with std::variant.

Perhaps an entirely static approach can be achieved by the type being a part of 
the token_type. On the other hand, the use is for dynamic token lookup, so it 
may be too much of an effort for that.

Personally, I am already at C++17, so I have no reason for using a simpler 
variant. Having the type stored just adds a size_t, and that is used a lot in 
other circumstances, so no overhead to worry about.





Re: Dynamic token kinds

2018-12-15 Thread Hans Åberg


> On 30 Nov 2018, at 00:40, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>>> Sure, though for my taste "unsafe" sounds a bit harsh, perhaps
>>> "unchecked"? If you put in the next release, I'll change my code to
>>> use it.
>> 
>> It seems pretty standard to have lookup tokens with different
>> syntactic behavior, for example when they are declared of
>> different type in a language. So it is worrisome that the typed
>> parser deems the use unsafe.
> 
> What is potentially unsafe is that the actual type may not match the
> declared type in the grammar. With std::variant, a mismatch would
> cause an exception to be thrown. With Bison's static variant, a
> mismatch might lead to UB.
> 
> So perhaps this function could actually do a type check (which
> probably requires another auto-generated switch) and also throw or
> (if this is not desired) call std::terminate or so on mismatch,
> Akim?

The C++17 std::variant stores the type as an index. So perhaps there should be 
an additional table storing the type, with a symbol constructor that 
constructs the right value from the token.
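
A minimal sketch of that idea, assuming C++17 and a hypothetical table value_index that maps each token kind to the std::variant alternative it carries. The names are illustrative only and nothing here is generated by Bison.

```cpp
#include <cstddef>
#include <string>
#include <variant>

// Hypothetical token kinds: one untyped, one int, one string.
enum token_kind { V_FOO, I_BAZ, S_BLA, token_count };

using semantic_value = std::variant<std::monostate, int, std::string>;

// The additional table: variant index expected for each token kind.
constexpr std::size_t value_index[token_count] = {0, 1, 2};

// Constructs a default value of the right alternative from the token kind.
semantic_value make_value(token_kind k) {
  switch (value_index[k]) {
    case 1:  return semantic_value(std::in_place_index<1>);
    case 2:  return semantic_value(std::in_place_index<2>);
    default: return semantic_value(std::in_place_index<0>);
  }
}

// Runtime check that a value matches the type declared for its token kind.
bool matches(token_kind k, const semantic_value& v) {
  return v.index() == value_index[k];
}
```

With the index stored, a mismatched access such as std::get<int> on a value holding a string throws std::bad_variant_access instead of being undefined behaviour.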





Re: Dynamic token kinds

2018-11-29 Thread Hans Åberg


> On 29 Nov 2018, at 01:12, Frank Heckenbach  wrote:
> 
> Akim Demaille wrote:
> 
>> Wrt to the symbol constructor, you are right to be worried: I don't
>> consider it (so far?) to be part of the public API.  I do understand
>> something like it is needed, but I don't like that it looks safe
>> to use.
>> 
>> Would you be ok with parser::unsafe_make_symbol, or something like
>> this?
> 
> Sure, though for my taste "unsafe" sounds a bit harsh, perhaps
> "unchecked"? If you put in the next release, I'll change my code to
> use it.

It seems pretty standard to have lookup tokens with different syntactic 
behavior, for example when they are declared of different type in a language. 
So it is worrisome that the typed parser deems the use unsafe.





Re: 3.2.1.0...  bison is released [stable]

2018-11-09 Thread Hans Åberg


> On 9 Nov 2018, at 07:06, Akim Demaille  wrote:
> 
> We would have been happy not to have to announce the release of Bison 3.2.1,
> which fixes portability issues of Bison 3.2.

On MacOS 10.13.6 with the inhouse clang, I got some warnings. The obstack one 
is old, but I don't recall seeing the no symbols warning before.

--
$ ../bison-3.2.1/configure CC=/usr/bin/clang CXX=/usr/bin/clang++ 
…
$ make
…
  CC   lib/obstack.o
../bison-3.2.1/lib/obstack.c:351:31: warning: incompatible pointer types 
initializing 'void (*)(void) __attribute__((noreturn))' with an expression of 
type
  'void (void)' [-Wincompatible-pointer-types]
__attribute_noreturn__ void (*obstack_alloc_failed_handler) (void)
  ^
1 warning generated.
…
  AR   lib/libbison.a
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(binary-io.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(bitrotate.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(c-ctype.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(fd-hook.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(xtime.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(getprogname.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(math.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(sig-handler.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(stat-time.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(threadlib.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(timespec.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(unistd.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(wctype-h.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(xsize.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(localtime-buffer.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(binary-io.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(bitrotate.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(c-ctype.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(fd-hook.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(xtime.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(getprogname.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(math.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(sig-handler.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(stat-time.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(threadlib.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(timespec.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(unistd.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(wctype-h.o) has no symbols
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib:
 file: lib/libbison.a(xsize.o) has no symbols

Re: bison-3.2 released [stable]

2018-10-29 Thread Hans Åberg


> On 29 Oct 2018, at 21:33, Akim Demaille  wrote:
> 
> We are very happy to announce the release of Bison 3.2!

Actually, I got two warnings on make with the MacOS inhouse clang: the obstack 
one, same as with 3.1 :-), and one with the bitset. All tests passed, though.

--

MacOS 10.13.6.

$ gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr 
--with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 10.0.0 (clang-1000.11.45.2)
Target: x86_64-apple-darwin17.7.0
Thread model: posix
InstalledDir: 
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin


$ make
…

  CC   lib/bitset.o
../bison-3.2/lib/bitset.c:356:16: warning: using the result of an assignment as 
a condition without parentheses [-Wparentheses]
while (num = bitset_list (src, list, BITSET_LIST_SIZE, ))
   ^~
../bison-3.2/lib/bitset.c:356:16: note: place parentheses around the assignment 
to silence this warning
while (num = bitset_list (src, list, BITSET_LIST_SIZE, ))
   ^
   ( )
../bison-3.2/lib/bitset.c:356:16: note: use '==' to turn this assignment into 
an equality comparison
while (num = bitset_list (src, list, BITSET_LIST_SIZE, ))
   ^
   ==
1 warning generated.

  CC   lib/obstack.o
../bison-3.2/lib/obstack.c:351:31: warning: incompatible pointer types 
initializing 'void (*)(void) __attribute__((noreturn))' with an expression of 
type
  'void (void)' [-Wincompatible-pointer-types]
__attribute_noreturn__ void (*obstack_alloc_failed_handler) (void)
  ^
1 warning generated.
--





Re: bison-3.2 released [stable]

2018-10-29 Thread Hans Åberg


> On 29 Oct 2018, at 21:33, Akim Demaille  wrote:
> 
> We are very happy to announce the release of Bison 3.2!

I got a warning on make with MacOS inhouse clang.

--

MacOS 10.13.6.

$ gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr 
--with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 10.0.0 (clang-1000.11.45.2)
Target: x86_64-apple-darwin17.7.0
Thread model: posix
InstalledDir: 
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin


$ make
…

../bison-3.1/lib/obstack.c:351:31: warning: incompatible pointer types 
initializing 'void (*)(void) __attribute__((noreturn))' with an expression of 
type
  'void (void)' [-Wincompatible-pointer-types]
__attribute_noreturn__ void (*obstack_alloc_failed_handler) (void)
  ^
1 warning generated.

--


Re: Bison 3.1 api.value.type {semantic_type} error in gcc8

2018-10-17 Thread Hans Åberg


> On 16 Oct 2018, at 18:21, Akim Demaille  wrote:
> 
>> Le 16 oct. 2018 à 14:57, Hans Åberg  a écrit :
>> 
>> The minimal example below compiles with clang++6, but not with g++8. One 
>> would think that it should define a qualified name B::A, used as
>> mu::B::A a;
>> as in the other cases below. I probably have it only as a legacy from when 
>> Bison used YYSTYPE alone, though.
> 
> Concretely, it is my understanding that this is irrelevant
> to Bison, isn’t it?

One can use templates—in the example below, B is the parser, and A the semantic 
type. That would be using C++ paradigms rather than writing the code directly.

--
namespace mu {
  class A {};

  template<class T>
  class B {
  public:
    typedef T A;
  };
}

int main () {
  mu::B<mu::A>::A a;
  mu::A b;

  return 0;
}
--




Re: Bison 3.1 api.value.type {semantic_type} error in gcc8

2018-10-16 Thread Hans Åberg


> On 16 Oct 2018, at 18:21, Akim Demaille  wrote:
> 
>> Le 16 oct. 2018 à 14:57, Hans Åberg  a écrit :
>> 
>> The minimal example below compiles with clang++6, but not with g++8. One 
>> would think that it should define a qualified name B::A, used as
>> mu::B::A a;
>> as in the other cases below. I probably have it only as a legacy from when 
>> Bison used YYSTYPE alone, though.
> 
> Concretely, it is my understanding that this is irrelevant
> to Bison, isn’t it?

It was reported in 2011, and based on [basic.scope.class] they decided that 
this is how it should be [1], though it looks as though [dcl.typedef] says 
otherwise. But it is too limited to worry about in Bison, I think.

1. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50418





Re: Bison 3.1 api.value.type {semantic_type} error in gcc8

2018-10-16 Thread Hans Åberg


> On 16 Oct 2018, at 13:19, Akim Demaille  wrote:
> 
>> Le 16 oct. 2018 à 11:20, Hans Åberg  a écrit :
>> 
>> In Bison 3.1,
>> %define api.value.type {semantic_type}
> 
> Actually, what are you trying to achieve?  semantic_type is the name
> of the typedef used by bison.  So, of course, it’s quite a bad idea
> to use that name.  Unless you do mean to use that name, but not
> Bison’s, rather some other definition, coming from some other place,
> in which case you must provide the namespace/path to it.

The minimal example below compiles with clang++6, but not with g++8. One would 
think that it should define a qualified name B::A, used as
  mu::B::A a;
as in the other cases below. I probably have it only as a legacy from when 
Bison used YYSTYPE alone, though.

A draft version of the C++ standard says:

7.1.3 The typedef specifier [dcl.typedef]
4. In a given class scope, a typedef specifier can be used to redefine any 
class-name declared in that scope that is not also a typedef-name to refer to 
the type to which it already refers. [Example:
  struct S {
    typedef struct A { } A; // OK
    typedef struct B B;     // OK
    typedef A A;            // error
  };
— end example]

So in the example below, g++8 accepts:
  typedef class A A;
  typedef mu::A A;

--
namespace mu {
  class A {};

  class B {
  public:
    typedef A A;
  };
}

int main () {
  mu::B::A a;

  return 0;
}
--




Re: Bison 3.1 api.value.type {semantic_type} error in gcc8

2018-10-16 Thread Hans Åberg


> On 16 Oct 2018, at 12:10, Akim Demaille  wrote:
> 
> Hi Hans,

Hello,

>> Le 16 oct. 2018 à 11:20, Hans Åberg  a écrit :
>> 
>> In Bison 3.1,
>> %define api.value.type {semantic_type}
>> produces an error in gcc8, though accepted in clang6, by the parser header 
>> typedef
>> #ifndef YYSTYPE
>>   /// Symbol semantic values.
>>   typedef semantic_type semantic_type;
>> #else
>>   typedef YYSTYPE semantic_type;
>> #endif
> 
> Please, be more specific.  Provide an input file, and the complete error from 
> the compiler.

I just happened to discover it, so I do not know much about it yet. It may have 
to do with -std=c++17 on g++ 8.2.0, and I have a class semantic_type.

A workaround is to make the typedef'ed name qualified:
  typedef ::semantic_type semantic_type;

So if I change to
  %define api.value.type {::semantic_type}
then it passes.
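
As a self-contained illustration of the workaround (a sketch only: the yy::parser class here is a hand-written stand-in for the generated header, not Bison output), the qualified form refers to the global class without the unqualified name being used before it is redeclared in the class scope:

```cpp
#include <type_traits>

// The user's own semantic value class, at global scope.
class semantic_type {};

namespace yy {
  // Hypothetical stand-in for the generated parser class.
  class parser {
  public:
    // "typedef semantic_type semantic_type;" is what g++ 8 rejected;
    // qualifying the name on the right-hand side avoids the problem.
    typedef ::semantic_type semantic_type;
  };
}

// The member typedef names the same type as the global class.
static_assert(
    std::is_same<yy::parser::semantic_type, ::semantic_type>::value,
    "parser::semantic_type should alias ::semantic_type");
```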





Bison 3.1 api.value.type {semantic_type} error in gcc8

2018-10-16 Thread Hans Åberg
In Bison 3.1,
  %define api.value.type {semantic_type}
produces an error in gcc8, though accepted in clang6, by the parser header 
typedef
  #ifndef YYSTYPE
    /// Symbol semantic values.
    typedef semantic_type semantic_type;
  #else
    typedef YYSTYPE semantic_type;
  #endif





Unused explicitly named references warning

2018-09-18 Thread Hans Åberg
Just a thought, Bison might issue a warning if an explicitly named reference is 
not used in the action. For example, in
  exp: exp[x] "/" exp[y] {…}
if the action does not have $x and/or $y.





Re: Automatially move from $n (was: C++11 move semantics)

2018-09-18 Thread Hans Åberg



> On 18 Sep 2018, at 04:48, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>>>> C++ does not support the implementation of a (tracing) GC, because
>>>> the information needed, though available to the compiler, is not
>>>> available from the language. The fact that it does not have a GC
>>>> is another topic.
>>> 
>>> I know and (again) I don't care because I don't want to use GC
>>> anyway. You're basically discussing with yourself here.
>> 
>> It exemplifies the limitations of C++. See below.
> 
> Nope. Again: compile-time != runtime.

It is funny, below you give a link of people wanting to add unavailable runtime 
information to compile time.

>>> My option? I'm not having this problem.
>> 
>> So then why bother bringing it up in the first place?
> 
> Funnily, I preemptively answered this question just in the next
> sentence that you also quoted:

Hilariously, you rather introduced this new topic, when I discussed something 
else.

>>> Again, I only brought up the
>>> make_pair thing as a counterexample to your claim that an automatic
>>> break would avoid any possible double-move.
>> 
>> You gave a counterexample of something else than what I discussed.
> 
> Nope, you said: "So a way to make it safe is to jump out of the
> action statement.",

That referred to that other feature.

> I replied: "Not necessarily: You could use it
> twice within one expression (unsafe even with jump)", and you
> replied: "That would probably not be possible, if one gets an
> automated break after it." Then I gave an example of what I had said
> before, that it *is* possible to use it twice in one statement,
> counter to your claim. I never said it occurred in my code, just
> explained why break isn't the solution.
> 
> You can check it here:
> http://lists.gnu.org/archive/html/bug-bison/2018-09/msg00036.html

Then when you clarified, I decided to follow your topic. 

>>> But it might be a solution to someone else's problem (or another
>>> problem of mine some other day ;).
>> 
>> Until that day, it's too esoteric to be worth implementing, in my opinion.
> 
> Esoteric I don't know, but certainly not hard to implement.
> Basically a flag that is set when moved from and causes an exception
> on any subsequent access (except assignment to, and destruction, of
> course). Seems like a nice student project ... :)

So then implement this compile time check yourself, rather than having lengthy 
discussions here.
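
For reference, the flag described in the quoted text is a runtime check, and might be sketched as below. The class name move_checked and the use of std::logic_error are assumptions made for illustration, not anything proposed for Bison itself.

```cpp
#include <stdexcept>
#include <utility>

// Wrapper that remembers whether its value has been moved from.
template <class T>
class move_checked {
public:
  explicit move_checked(T v) : value_(std::move(v)) {}

  // Moving from an already moved-from object throws.
  move_checked(move_checked&& other) : value_(std::move(other.checked())) {
    other.moved_ = true;
  }

  // Assignment is allowed on a moved-from object and makes it valid again.
  move_checked& operator=(move_checked&& other) {
    if (this != &other) {
      value_ = std::move(other.checked());
      moved_ = false;
      other.moved_ = true;
    }
    return *this;
  }

  // Any other access after a move throws.
  T& get() { return checked(); }

private:
  T& checked() {
    if (moved_) throw std::logic_error("use after move");
    return value_;
  }

  T value_;
  bool moved_ = false;
};
```

Wrapping each semantic type this way in a debug build would turn a double move in an action into an exception at the point of the second use.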

> What would be much more interesting would be a generic decorator
> that does this and can be applied to any type one wants to, but
> that's more tricky and I think impossible in current C++ because one
> can't wrap "any method" generically. Maybe if Herb Sutter's
> metaclass proposal takes off, one (far) day ...
> https://herbsutter.com/2017/07/26/metaclasses-thoughts-on-generative-c/
> 
>>>> Even though the compiler may have access to the information to
>>>> check that, you don't have access to that from the language
>>>> itself.
>>> 
>>> - Within(!) the compiler, more general, but a job for compiler
>>> experts, not me.
>> 
>> You won't get that for the same reason as in the GC case above.
> 
> And why would that be? You said yourself that "the compiler may have
> access to the information to check that".

Why don't you check with the GCC people if they want to implement it.





Re: Automatially move from $n (was: C++11 move semantics)

2018-09-17 Thread Hans Åberg


> On 18 Sep 2018, at 00:20, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>>>> Yes, indeed C++ does not support that,
>>> 
>>> Are you replying to your own statement now? I never claimed (or
>>> cared) whether C++ supports GC.
>> 
>> C++ does not support the implementation of a (tracing) GC, because
>> the information needed, though available to the compiler, is not
>> available from the language. The fact that it does not have a GC
>> is another topic.
> 
> I know and (again) I don't care because I don't want to use GC
> anyway. You're basically discussing with yourself here.

It exemplifies the limitations of C++. See below.

>>> At runtime, yes. Basically an extended unique_ptr could detect this
>>> automatically.
>> 
>> This looks like becoming your option. You might use it for
>> debugging only.
> 
> My option? I'm not having this problem.

So then why bother bringing it up in the first place?

> Again, I only brought up the
> make_pair thing as a counterexample to your claim that an automatic
> break would avoid any possible double-move.

You gave a counterexample of something else than what I discussed.
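
For readers following the make_pair point, a small sketch of a double move within one expression. It assumes, hypothetically, that both occurrences of $x in an action expand to std::move of the same object, and it uses std::unique_ptr because its moved-from state is guaranteed to be null.

```cpp
#include <memory>
#include <utility>

// Stand-in for an action like: $$ = std::make_pair($x, $x);
// with both $x replaced by std::move of one semantic value.
std::pair<std::unique_ptr<int>, std::unique_ptr<int>> double_move() {
  std::unique_ptr<int> x(new int(7));
  // Both arguments refer to the same object, so only one member of the
  // resulting pair can receive the pointer; the other ends up null.
  return std::make_pair(std::move(x), std::move(x));
}
```

No statement boundary lies between the two moves, so a jump or break inserted after the action cannot prevent the second one.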

> But it might be a solution to someone else's problem (or another
> problem of mine some other day ;).

Until that day, it's too esoteric to be worth implementing, in my opinion.

>>> I'd rather see a compile-time check, even if it's a
>>> bit primitive, i.e. gives false positives.
>> 
>> Even though the compiler may have access to the information to
>> check that, you don't have access to that from the language
>> itself. Parsing the language is a long haul, even though there
>> were some here wanting help with that.
> 
> No, I don't want to go there. As said before, I see two viable
> compile-time options so far:
> 
> - Within Bison, hopefully rather easy to implement, but with some
>  false positives.
> 
> - Within(!) the compiler, more general, but a job for compiler
>  experts, not me.

You won't get that for the same reason as in the GC case above.





Re: Automatially move from $n (was: C++11 move semantics)

2018-09-17 Thread Hans Åberg


> On 17 Sep 2018, at 23:27, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>>>> This illustrates the problem with make_pair($x, $x): one may try a
>>>> reference count or GC, but C++ does not support the implementation
>>>> of a GC, even though the compiler has the required information, it
>>>> is not accessible from the language.
>>> 
>>> Hold it! We were discussing a static compiler check, and within two
>>> paragraphs you divert to runtime checks (which are less reliable)
>>> and then to GC which I dislike for many reasons. To me it seems like
>>> a last effort: if you can't do proper resource management (like
>>> RAII), let the garbage collector pick up the pieces.
>> 
>> Yes, indeed C++ does not support that,
> 
> Are you replying to your own statement now? I never claimed (or
> cared) whether C++ supports GC.

C++ does not support the implementation of a (tracing) GC, because the 
information needed, though available to the compiler, is not available from the 
language. The fact that it does not have a GC is another topic.

>>> Fact is, I can
>>> do proper resource management in most cases, and just said it would
>>> be nice to have an extra check if it's not too hard to implement.
>> 
>> It looks like it is not hard to implement such a check against double
>> moves, and that might be the best solution, though it calls for
>> more careful runtime testing.
> 
> At runtime, yes. Basically an extended unique_ptr could detect this
> automatically.

This looks like becoming your option. You might use it for debugging only. By 
contrast, a reference count cannot be optimized away, so this might be better.

> I'd rather see a compile-time check, even if it's a
> bit primitive, i.e. gives false positives.

Even though the compiler may have access to the information to check that, you 
don't have access to that from the language itself. Parsing the language is a 
long haul, even though there were some here wanting help with that. There are 
LALR C++ grammars out there at least for some earlier language version.





Re: Automatially move from $n (was: C++11 move semantics)

2018-09-16 Thread Hans Åberg


> On 16 Sep 2018, at 19:16, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>>>> Maybe not for you, but for a more normal use, with types having
>>>> both copy and move. Then one would like to use move to $$ whenever
>>>> possible. Doesn't matter with me, as it is just some pointers and
>>>> integers.
>>> 
>>> As I said, this already happens in "$$ = foo ();" (automatially) and
>>> "$$ = std::move ($k);" (whether the move is explicit or
>>> automatically inserted as we're discussing). So if $k is covered, no
>>> special handling for $$ seems necessary.
>> 
>> You can do it by hand,
> 
> Sorry, but how does this answer anything? The subject (also thread
> subject) is explicitly: "Automatially[sic, sorry for the typo ;]
> move from $n".
> 
> So saying "You can do it by hand" is just giving up (which is
> pointless, since I already have a solution).

It looks as though you are suggesting doing it by hand by writing $$ = 
std::move($k) instead of $$ = $k whenever necessary. I considered having it 
done automatically.

>> but the case I considered was to find an automated approach
>> without explicitly calling std::move in $$ = $k. But your case is
>> different.
> 
> It's more general. "$$ = $k" is strictly a subset of the general
> case. (And IMHO not a very important one for k > 1. For k = 1, we
> can have a default action, but how often do you really need
> "$$ = $2"? In a rule like "expr = '(' expr ')';" typically, but
> that's about it.)

It happens every now and again. I have
  identifier_declaration definition[x]
where the first just defines the names and does not produce any value.

> I can't help but view your insisting on $$ as diverting from the
> actual topic. If you want to discuss any issues with moving to $$
> (whatever those issues may be; I don't see any), may I suggest you
> start a new thread, please?

You are the one who keeps coming back to the issue, even though I told you that 
I now see that you are interested in something else.

>>>> No, but it seems me it is a hard problem. A compiler optimizer can
>>>> recognize such things if it has sufficient information about the
>>>> types, by tracing the code flow.
>>> 
>>> Yes, perhaps we should ignore the issue for now and hope for
>>> compilers to offer such a warning in the future (which would be more
>>> useful anyway, since it would work for all code, not only Bison
>>> grammars).
>> 
>> You might make, for debugging purposes, a simplified version of a
>> reference count, a boolean that tells whether the object has been
>> moved, and issue an error if moved again, or otherwise just have a
>> moved from state with such a property.
>> 
>> This illustrates the problem with make_pair($x, $x): one may try a
>> reference count or GC, but C++ does not support the implementation
>> of a GC, even though the compiler has the required information, it
>> is not accessible from the language.
> 
> Hold it! We were discussing a static compiler check, and within two
> paragraphs you divert to runtime checks (which are less reliable)
> and then to GC which I dislike for many reasons. To me it seems like
> a last effort: if you can't do proper resource management (like
> RAII), let the garbage collector pick up the pieces.

Yes, indeed C++ does not support that, and it looks like it is the same with 
your problem for the same reasons, which is why it is so difficult to find 
solutions within the language.

> Fact is, I can
> do proper resource management in most cases, and just said it would
> be nice to have an extra check if it's not too hard to implement.

It looks like it is not hard to implement such a check against double moves, 
and that might be the best solution, though it calls for more careful runtime 
testing.

> Also note that "make_pair($x, $x)" was just a counterexample to your
> claim that an automatic break after the statement would avoid the
> problem. It's not something I actually need to do. If I did, I could
> choose between copying in this case (if possible), using a
> shared_ptr, or whatever.

Yes of course, I have a polymorphic (virtual) ref GC type that emulates T& 
holding just a non-null pointer, and one might need to clone the value 
sometimes.





Re: Automatially move from $n (was: C++11 move semantics)

2018-09-16 Thread Hans Åberg


> On 16 Sep 2018, at 17:38, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:

>>> (*) Nitpick: Except in a case like "$$ = $2;", but then I'd argue
>>>   it's the passing of $2 to $$'s assignment operator that's the
>>>   issue. :)
>> 
>> Maybe not for you, but for a more normal use, with types having
>> both copy and move. Then one would like to use move to $$ whenever
>> possible. Doesn't matter with me, as it is just some pointers and
>> integers.
> 
> As I said, this already happens in "$$ = foo ();" (automatially) and
> "$$ = std::move ($k);" (whether the move is explicit or
> automatically inserted as we're discussing). So if $k is covered, no
> special handling for $$ seems necessary.

You can do it by hand, but the case I considered was to find an automated 
approach without explicitly calling std::move in $$ = $k. But your case is 
different.

>> No, but it seems me it is a hard problem. A compiler optimizer can
>> recognize such things if it has sufficient information about the
>> types, by tracing the code flow.
> 
> Yes, perhaps we should ignore the issue for now and hope for
> compilers to offer such a warning in the future (which would be more
> useful anyway, since it would work for all code, not only Bison
> grammars).

You might make, for debugging purposes, a simplified version of a reference 
count, a boolean that tells whether the object has been moved, and issue an 
error if moved again, or otherwise just have a moved-from state with such a 
property.
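
A minimal sketch of such a debugging flag, assuming C++11 or later. The names 
move_checked, take and detects_double_move are hypothetical and not part of 
Bison or the standard library; the wrapper only illustrates the idea of 
reporting a second move at runtime.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical debugging wrapper (not part of Bison): holds a value and a
// flag recording whether the value has already been handed out as an rvalue.
template <class T>
class move_checked {
public:
  explicit move_checked(T value) : value_(std::move(value)) {}

  // Hands the value out as an rvalue; a second call is reported as an error.
  T&& take() {
    if (moved_)
      throw std::logic_error("value moved from twice");
    moved_ = true;
    return std::move(value_);
  }

  bool moved() const { return moved_; }

private:
  T value_;
  bool moved_ = false;
};

// Returns true if the second take() on the same object is detected.
inline bool detects_double_move() {
  move_checked<std::string> s(std::string("abc"));
  assert(!s.moved());
  std::string first = s.take();  // first move: fine
  try {
    s.take();                    // second move: expected to throw
  } catch (const std::logic_error&) {
    return first == "abc";
  }
  return false;
}
```

Since the check only fires on the paths actually executed, it calls for test 
inputs that exercise every rule.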

This illustrates the problem with make_pair($x, $x): one may try a reference 
count or GC, but C++ does not support the implementation of a GC, even though 
the compiler has the required information, it is not accessible from the 
language.

>>> On the same token, Bison could then also (optionally) warn if some
>>> $k which has a type is not used. I don't know if there is interest
>>> in such a feature. It could also be useful for other languages.
>>> 
>>> It might not be too hard to do for someone who's familiar with
>>> Bison's internals. (Unfortunately, I'm not very much, and don't have
>>> much free time right now.)
>> 
>> How would this be different from the current static type system?
> 
> The type system wouldn't change. Just an additional check:
> 
>  expr: expr '+' expr { $$ = $1; };
> 
> Obviously one forgot to use $3 here which has a type (unlike $2
> which doesn't have a type and so is not expected to be used).
> 
> Bison could detect and warn about this (proably optional, since I
> guess some people declare semantic types that are only meant to be
> used sometimes).

I use explicitly named variables, which guards against using the wrong $k 
number when changing a rule:
  expr: expr[x] "+" expr[y] { $$ = $x + $y; };

It would then, I gather, be easy to warn if such explicit names are unused, as 
Bison already checks if one is using undefined names.





Re: Automatially move from $n (was: C++11 move semantics)

2018-09-15 Thread Hans Åberg


> On 16 Sep 2018, at 01:01, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>> I realize now that you want to wrap $k with std::move and guard against 
>> reuse of that! (See below)
> 
> Yes, see $subject.
> 
>>> Sure it's possible: "make_pair ($1, $1)". Don't do that with
>>> automatic move!
>> 
>> Ah, your idea is to wrap std::move around the $k values! My idea
>> is to let the C++ language recognise that, and then automate the
>> assignment to $$.
> 
> Again, the assignment to $$ is *NOT* the issue! (*)
> It's the passing (moving) of $k.
> 
> (*) Nitpick: Except in a case like "$$ = $2;", but then I'd argue
>it's the passing of $2 to $$'s assignment operator that's the
>issue. :)

Maybe not for you, but for a more normal use, with types having both copy and 
move. Then one would like to use move to $$ whenever possible. Doesn't matter 
with me, as it is just some pointers and integers.

>> Maybe there is some more general C++ code checking program that
>> can be linked into Bison to check the  actions.
> 
> You mean some kind of code analyzer? This might be possible, but may
> be overkill.

Yes, you might check how people guard against that problem, and whether there 
is some program doing that. Then one might get ideas of how to get it into 
Bison.

Perhaps it might be possible to have some DLL or external program and invoke 
that.

> Bison could keep track if any $k is used several times
> and warn about that; though it may give false positives in cases
> such as:
> 
>  $$ = $1 ? foo (move ($2)) : bar (move ($2));  // safe
> 
> But the user could work around such (hopefully) rare cases with a
> temporary variable holding move ($2).
> 
> Or can code analyzers recognize the above as safe? (Do you have
> experience with any?)

No, but it seems to me it is a hard problem. A compiler optimizer can recognize 
such things if it has sufficient information about the types, by tracing the 
code flow.

> On the same token, Bison could then also (optionally) warn if some
> $k which has a type is not used. I don't know if there is interest
> in such a feature. It could also be useful for other languages.
> 
> It might not be too hard to do for someone who's familiar with
> Bison's internals. (Unfortunately, I'm not very much, and don't have
> much free time right now.)

How would this be different from the current static type system?





Re: Automatially move from $n (was: C++11 move semantics)

2018-09-15 Thread Hans Åberg


> On 16 Sep 2018, at 00:18, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>>case k:
>>  auto action_k = [...](){ ... return ...; };
>>  $$ = std::move(action_k(...));
> 
> I think you're at "$$" again rather than "$1" etc. I think we had
> cleared that up.

I realize now that you want to wrap $k with std::move and guard against reuse 
of that! (See below)

>>>> So a way to make it safe is to jump out of the action statement.
>>> 
>>> Not necessarily: You could use it twice within one expression
>>> (unsafe even with jump)
>> 
>> That would probably not be possible, if one gets an automated break after it.
> 
> Sure it's possible: "make_pair ($1, $1)". Don't do that with
> automatic move!

Ah, your idea is to wrap std::move around the $k values! My idea is to let the 
C++ language recognise that, and then automate the assignment to $$.

If you have a move-only type and want it to be safe as above, then you need 
some more general C++ checking mechanism.
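
To make the hazard concrete, here is a small sketch of what "make_pair ($1, 
$1)" amounts to once every $k is wrapped in std::move. It assumes C++14 (for 
std::make_unique); the function names are made up for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Mimics an action "make_pair($1, $1)" after automatic std::move wrapping:
// the same object is moved from twice within one expression.
inline std::pair<std::unique_ptr<int>, std::unique_ptr<int>>
pair_from_same_value(std::unique_ptr<int> v) {
  assert(v != nullptr);
  return std::make_pair(std::move(v), std::move(v));
}

// Counts how many members of the resulting pair still own a value.
inline int owners_after_double_move() {
  auto p = pair_from_same_value(std::make_unique<int>(42));
  return (p.first ? 1 : 0) + (p.second ? 1 : 0);
}
```

The code compiles without complaint, yet only one member of the pair ends up 
owning the value, and a break after the statement does not help, since both 
moves happen inside it.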

>>> or just not use it twice (safe even without
>>> jump).
>> 
>> But then one has to find a way to guard against that.
> 
> So far, the guard would have to be the programmer. (Which is not
> completely different from explicit std::move where one also needs to
> be careful. C++ is not BASIC. ;)
> 
> If Bison can detect it automatically, that would be nice, but
> otherwise demanding some responsibility by the programmer seems
> acceptable.

Maybe there is some more general C++ code checking program that can be linked 
into Bison to check the actions.

>>>> But that just applies it always, which might be safe for your move only 
>>>> type,
>>> 
>>> In fact it's safe for copy-only types, and possibly unsafe precisely
>>> for movable types (move-only or move-and-copyable).
>> 
>> It is always unsafe with moves, unless one can find a guard against 
>> unintentional reuse of a moved-from element.
> 
> "always ... unless" = "possibly" :)

Indeed, you have a lot of code, and sometime in the future somebody will have 
to check it, perhaps yourself, having forgotten about the details.





Re: Automatially move from $n (was: C++11 move semantics)

2018-09-15 Thread Hans Åberg


> On 15 Sep 2018, at 23:26, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>> But you can't safely or in general have Bison writing $$ =
>> std::move(a) directly as one might do something else to a
>> afterwards.
> 
> It would be safe if Bison checked that it's used only once in the
> action.

But in view of that being complicated, I was playing with the idea of somehow 
using C++ to recognize it. The only way seems to be to use function calls and 
returns, as assignments can be applied more than once. So the idea is that if 
Bison now writes
  switch (yyn) {
    …
    case k:
    {
      …
    }
  }
where the action may have $$ = …, one has
    case k:
      $$ = std::move(action_k(…));
where action_k is some function in which one returns the value instead. I'm 
not sure exactly how to put it together, perhaps by fitting a lambda capture
    case k:
      auto action_k = […](){ … return …; };
      $$ = std::move(action_k(…));
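
As a rough, self-contained illustration of that shape (not what any Bison 
skeleton generates), the sketch below uses hypothetical stand-ins: value_type 
for the semantic type, reduce for the switch over rules, lhs for $$ and rhs1 
for $1. It assumes C++14.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the semantic value type.
using value_type = std::unique_ptr<std::string>;

// Hypothetical stand-in for the generated switch over rule numbers.
inline value_type reduce(int rule, value_type rhs1) {
  assert(rhs1 != nullptr);
  value_type lhs;  // plays the role of $$
  switch (rule) {
    case 1: {
      // The action body is a lambda that returns its result instead of
      // assigning to $$; rhs1 (playing $1) is captured by reference.
      auto action_1 = [&]() -> value_type {
        return std::make_unique<std::string>("(" + *rhs1 + ")");
      };
      // A call result is already an rvalue, so this is a move assignment
      // even without an explicit std::move around the call.
      lhs = action_1();
      break;
    }
    default:
      lhs = std::move(rhs1);  // stands in for the default action $$ = $1
      break;
  }
  return lhs;
}
```

Capturing by reference avoids listing the variables the action uses, but it 
does nothing by itself to prevent a $k from being used twice inside the lambda.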

>> So a way to make it safe is to jump out of the action statement.
> 
> Not necessarily: You could use it twice within one expression
> (unsafe even with jump)

That would probably not be possible, if one gets an automated break after it.

> or just not use it twice (safe even without
> jump).

But then one has to find a way to guard against that.

>>> What I want (or actually have, since I imeplented it :) is a way to
>>> make Bison apply std::move automatically.
>> 
>> But that just applies it always, which might be safe for your move only type,
> 
> In fact it's safe for copy-only types, and possibly unsafe precisely
> for movable types (move-only or move-and-copyable).

It is always unsafe with moves, unless one can find a guard against 
unintentional reuse of a moved-from element.





Re: Automatially move from $n (was: C++11 move semantics)

2018-09-15 Thread Hans Åberg


> On 15 Sep 2018, at 22:56, Frank Heckenbach  wrote:
> 
> You don't need h at all.
> 
> Simply "b = std::move (a);" will do the same. All it does is convert
> a to an rvalue reference. If A has a move assignment operator, this
> will be chosen, if it doesn't but a copy assignment operator, that
> one will be chosen. That's all standard C++ behaviour.

But you can't safely or in general have Bison writing $$ = std::move(a) 
directly as one might do something else to a afterwards. So a way to make it 
safe is to jump out of the action statement. Using functions and returns is 
probably not a good idea because one would have to capture variables in the 
action.

> What I want (or actually have, since I imeplented it :) is a way to
> make Bison apply std::move automatically.

But that just applies it always, which might be safe for your move only type, 
but is not safe in general, right?




Re: Automatially move from $n (was: C++11 move semantics)

2018-09-15 Thread Hans Åberg


> On 15 Sep 2018, at 22:10, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>>>>>> The idea would be to write something equivalent to
>>>>>> return make_unique($1, $2, $3);
>>>>>> and the Bison writes something like
>>>>>> $$ = std::move(action_k(...return make_unique($1, $2, $3);...))
>>>>> 
>>>>> I don't follow you. What is action_k, and how would that cause
>>>>> moving from $1 etc.?
>>>> 
>>>> Action k in the switch statement.
>>> 
>>> Huh? I really don't get what your proposed syntax is supposed to
>>> mean. Is action_k supposed to be a lambda (what else could appear in
>>> an expression and contain a statement inside)? What would it do?
>> 
>> Just produce an r-value.
> 
> Again:
> 
> - make_unique already produces an rvalue
> 
> - (I'll ignore the "...return", since you didn't comment on it, I
>  assume it's a typo)
> 
> - Then, you say, action_k produces an rvalue, from an rvalue?
> 
> - Finally, std::move takes this rvalue and turns it into an rvalue
>  (because that's what std::move does).
> 
> Do you want a triple-r-value?
> 
> Sorry if I'm a bit cynical meanwhile, but I said I don't follow what
> you intend to do, so it would be nice to explain it with something
> more than half a sentence, really.

If I write:

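#include <cstdlib>  // EXIT_SUCCESS
#include <utility>  // std::move

struct A {};  // placeholder: any type with copy and/or move assignment

// Lvalue overload: passes the reference through unchanged.
A& h(A& a) {
  return a;
}

// Rvalue overload: passes the argument on as an rvalue.
A&& h(A&& a) {
  return std::move(a);
}

int main() {
  A a, b;

  b = std::move(h(a));
  b = std::move(h(std::move(a)));

  return EXIT_SUCCESS;
}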

Then if A only has copy assignment, that will be used, but if it has that and 
move assignment or only move assignment, then move assignment will be used. No 
copying occurs with copy elision. Isn't that what you want?





Re: Automatially move from $n (was: C++11 move semantics)

2018-09-15 Thread Hans Åberg


> On 15 Sep 2018, at 21:57, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>>>> The idea would be to write something equivalent to
>>>> return make_unique($1, $2, $3);
>>>> and the Bison writes something like
>>>> $$ = std::move(action_k(...return make_unique($1, $2, $3);...))
>>> 
>>> I don't follow you. What is action_k, and how would that cause
>>> moving from $1 etc.?
>> 
>> Action k in the switch statement.
> 
> Huh? I really don't get what your proposed syntax is supposed to
> mean. Is action_k supposed to be a lambda (what else could appear in
> an expression and contain a statement inside)? What would it do?

Just produce an r-value.

>> Move operators were originally designed to avoid copying in returns.
> 
> I don't know if this was so or not originally, but I'm talking about
> moving arguments, not return values. That's what I've been saying
> the whole time, including the thread subject! Moving the return
> value is no big problem most of the time: "$$ = make_unique ..."
> works without any std::move because a function result(*) is
> automatically an rvalue.

The idea is to create an r-value situation, which then translates into a move 
assignment.




Re: Automatially move from $n (was: C++11 move semantics)

2018-09-15 Thread Hans Åberg


> On 15 Sep 2018, at 21:25, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:

>> The idea would be to write something equivalent to
>>  return make_unique($1, $2, $3);
>> and the Bison writes something like
>>  $$ = std::move(action_k(...return make_unique($1, $2, $3);...))
> 
> I don't follow you. What is action_k, and how would that cause
> moving from $1 etc.?

Action k in the switch statement. Move operators were originally designed to 
avoid copying in returns.

>> Even in view of copy elision, default in C++17 [1], this would be safe, 
>> because one cannot move an already moved object by mistake.
> 
> Why not?

Because it breaks the execution path, so one cannot apply it twice to the same 
value.

>> As the point is breaking out of the execution path, one might use your 
>> suggestion of a special operator in combination with an immediately 
>> following break in the action switch statement. So writing say
>>  $$$(make_unique($1, $2, $3));
>> translates into
>>  $$ = std::move(make_unique($1, $2, $3));
>>  break; 
> 
> What if I want to write:
> 
>  $$ = make_unique  ($1, $2, $3);
>  print ($$);

Then you can't use the proposed $$$, but have to use the old syntax.





Re: Automatially move from $n (was: C++11 move semantics)

2018-09-15 Thread Hans Åberg


> On 15 Sep 2018, at 20:19, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>> One idea might be to wrap the actions in inlines, and use return instead, as 
>> C++ can recognize r-values in such situations.
> 
> I think we discussed this before, but this would only cover the case
> "$$ = $N" (which is covered by the default action for N = 1 anyway).
> 
> More interesting are cases such as:
> 
>  $$ = make_unique  ($1, $2, $3);

The idea would be to write something equivalent to
  return make_unique($1, $2, $3);
and then Bison writes something like
  $$ = std::move(action_k(…return make_unique($1, $2, $3);…))

Even in view of copy elision, default in C++17 [1], this would be safe, because 
one cannot move an already moved object by mistake.

As the point is breaking out of the execution path, one might use your 
suggestion of a special operator in combination with an immediately following 
break in the action switch statement. So writing say
  $$$(make_unique($1, $2, $3));
translates into
  $$ = std::move(make_unique($1, $2, $3));
  break; 


1. https://en.cppreference.com/w/cpp/language/copy_elision





Re: Automatially move from $n (was: C++11 move semantics)

2018-09-15 Thread Hans Åberg


> On 15 Sep 2018, at 19:15, Frank Heckenbach  wrote:
> 
> Akim Demaille wrote:
> 
>>> By default it's empty, so it's like before, but one can e.g. add
>>> 
>>> %define api.rhs.access {std::move}
>> 
>> I would like to have your opinion on this, a few months after
>> having practiced the idea.  It looks great, but some ideas look
>> great first, and them show some limitations.
>> 
>> Would you recommend that we really import this into Bison?
> 
> I would. My grammar file is much more readable with it, as it saves
> me multiple std::move calls in most rules.
> 
> Of course, there's a danger of using a moved-from value. When I
> introduces it, I had to check my grammars for multiple uses of the
> same value which I did to a first approximation with a regex search
> (something like "(\$[0-9]+).*\1"), then checked the few remaining
> cases manually (but my grammar rules are rather simple; most of the
> actual work is done in external functions, often just make_unique<>,
> sometimes self-written ones).
> 
> As I wrote, if Bison could detect multiple uses and warn, that would
> be great, but I didn't look into it as I didn't want to patch Bison
> itself.
> 
> Another syntax (just for the sake of example "#1" for moving, while
> keeping "$1" as is) might be an idea, but is still dangerous if one
> uses $1 after #1, so probably not worth it.
> 
> So, lacking other ideas, I'd stay with api.rhs.access, which was
> easy to implement and does the job for me. I certainly don't want to
> put std::move everywhere in my grammar. -- In fact, if I'd
> ultimately have to, I'd make make up something like "#1" and
> preprocess my grammar with sed before feeding it to Bison, to keep
> it readable. Seeing as Bison does lots of processing of the source
> anyway, this would seem overly complicated and bizarre to me.

One idea might be to wrap the actions in inlines, and use return instead, as 
C++ can recognize r-values in such situations.





Re: Bison lexer

2018-09-15 Thread Hans Åberg


> On 15 Sep 2018, at 18:34, Akim Demaille  wrote:
> 
>> Le 15 sept. 2018 à 14:51, Hans Åberg  a écrit :
>> 
>> 
>>> On 15 Sep 2018, at 07:07, Akim Demaille  wrote:
>>> 
>>> So, while I totally understand Frank’s point, I’m less worried than
>>> he is, and use Flex’s C++ backend.
>> 
>> Which Flex version? It only works before 2.6.0. See:
>> 
>> https://stackoverflow.com/questions/34438023/openfoam-flex-yyin-rdbufstdcin-rdbuf-error
> 
> I’ve been using all the versions of Flex for a while now.  See:
> 
> https://gitlab.lrde.epita.fr/vcsn/vcsn/commit/dcdf2b3cae7353d99520448a7b22a4334a72bc5d

You patch up the Flex header and then use it locally? — I used that approach, 
but then I got it working without patching in 2.5.37 by adding a class in .yy, 
but then it broke in 2.6. Another method I used in the past was my own skeleton 
file, but then it must be maintained.





Re: Bison lexer

2018-09-15 Thread Hans Åberg


> On 15 Sep 2018, at 07:07, Akim Demaille  wrote:
> 
>> Le 31 août 2018 à 23:39, Hans Åberg  a écrit :
>> 
>>>>> But the final straw was when, after changing to C++ Bison, I wanted
>>>>> to switch to C++ Flex too and found this beautiful comment:
>>>>> 
>>>>> /* The c++ scanner is a mess. The FlexLexer.h header file relies on the
>>>>>  * following macro. This is required in order to pass the 
>>>>> c++-multiple-scanners
>>>>>  * test in the regression suite. We get reports that it breaks 
>>>>> inheritance.
>>>>>  * We will address this in a future release of flex, or omit the C++ 
>>>>> scanner
>>>>>  * altogether. */
>>>> 
>>>> It has been like that since the 1990s, I believe.
>>> 
>>> Even better! :(
>>> 
>>> Especially since C++ in the 1990s was totally different from modern
>>> C++, so I have no idea if anything of this comment is still
>>> relevant, or maybe even more relevant, today compared to then.
>> 
>> Indeed, very old.
> 
> So, while I totally understand Frank’s point, I’m less worried than
> he is, and use Flex’s C++ backend.

Which Flex version? It only works before 2.6.0. See:

https://stackoverflow.com/questions/34438023/openfoam-flex-yyin-rdbufstdcin-rdbuf-error

> It seems that the resources developments of Flex are scarce.  They
> easily agree on issues, but even for the most trivial ones (e.g.,
> delete three lines, https://github.com/westes/flex/issues/379),
> they ask for a patch.
> 
> But, then, who am I to discuss about the maintenance resources :-(

The issue above was discussed on their new mailing list in 2016 or so, but no 
fix yet.





Re: Bison lexer

2018-09-15 Thread Hans Åberg


> On 15 Sep 2018, at 07:02, Akim Demaille  wrote:
> 
>> Le 29 août 2018 à 15:56, Hans Åberg  a écrit :
>> 
>> I did that, too: I wrote some DFA/NFA code, and incidentally found the most 
>> efficient method make action matches via a reverse NFA lookup, cf. [1-3]. 
>> Also, I have made UTF-8/32 to octet character class translations. 
>> 
>> 1. https://gcc.gnu.org/ml/libstdc++/2018-04/msg00032.html
>> 2. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85472
>> 3. https://gcc.gnu.org/ml/libstdc++/2018-05/msg00015.html
> 
> That was interesting.  

Thanks. I wanted a dynamic lexer and at least a partially dynamic parser so 
that users can define their own operators. One thing that remains with the 
lexer is the backreferences (see below).

> I found that Tim Shen exposed his work on
>  https://www.youtube.com/watch?v=N_rkHzhXueo.

I haven't seen all.

> When it comes to conversion from expressions to automaton, I’m
> a big fan of Brzozozski’s derivatives, that, in addition, easily
> supported complement and intersection.  

I just have a C++ automaton class that builds the NFA directly through 
operators corresponding to those of a regular expression. The NFA then has no 
empty transitions, and a set of start states, which corresponds to the single 
DFA start state in the subset construction. For example, for alternation just 
take the union of both the NFAs and their start state sets. A working regex 
implementation then has some additions, such as loops for count matches.
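
A minimal sketch of this representation, assuming C++11. The struct NFA and 
the functions symbol, operator| and matches are invented for illustration and 
are not the class described above; only single characters and alternation are 
covered.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

// NFA without empty transitions and with a set of start states.
struct NFA {
  int size = 0;            // states are numbered 0..size-1
  std::set<int> starts;    // set of start states
  std::set<int> accepts;   // accepting states
  std::map<std::pair<int, char>, std::set<int>> next;  // (state, char) -> states
};

// NFA accepting exactly the one-character string c.
inline NFA symbol(char c) {
  NFA n;
  n.size = 2;
  n.starts = {0};
  n.accepts = {1};
  n.next[std::make_pair(0, c)] = {1};
  return n;
}

// Alternation: disjoint union of both automata and of their start state sets.
inline NFA operator|(const NFA& a, const NFA& b) {
  NFA r = a;
  const int off = a.size;  // renumber b's states after a's
  r.size = a.size + b.size;
  for (int s : b.starts) r.starts.insert(s + off);
  for (int s : b.accepts) r.accepts.insert(s + off);
  for (const auto& t : b.next)
    for (int target : t.second)
      r.next[std::make_pair(t.first.first + off, t.first.second)]
          .insert(target + off);
  return r;
}

// Runs the NFA by tracking the set of current states, which is the subset
// construction carried out on the fly for one input string.
inline bool matches(const NFA& n, const std::string& input) {
  std::set<int> current = n.starts;
  for (char c : input) {
    std::set<int> following;
    for (int s : current) {
      auto it = n.next.find(std::make_pair(s, c));
      if (it != n.next.end())
        following.insert(it->second.begin(), it->second.end());
    }
    current = std::move(following);
  }
  for (int s : current)
    if (n.accepts.count(s)) return true;
  return false;
}
```

With this, symbol('a') | symbol('b') has two start states and accepts "a" and 
"b" but not "ab"; the initial value of current in matches plays the role of 
the single DFA start state.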

> No idea about group
> captures, and certainly not backward references.

Then when building the NFA, its start and end states form a group, which can be 
identified with a unique number, if you will. I think of backreferences as 
working so that when one appears, one looks up its value by the reverse NFA 
method I gave, and then inserts it as a dynamic NFA. Strictly, the value of the 
backreference may then change as one comes to a new one, but I suspect those 
who invented the concept have not considered that.

> redgrep implements this approach.  This talks touches the case
> of capturing groups.
> 
> https://www.youtube.com/watch?v=CMhqlRBfVX4=8s=pl%2Cwn

The method I give in effect yields the sub-NFA that the input string uses, so 
the group capture is automatic. Then, working towards DFA minimization, it 
turns out that one cannot even use the DFA, because different group markers may 
be merged into the same DFA state. Any DFA minimization must then keep track of 
that. So it may be similar to LALR state merging, where one must keep track of 
the whole rules.





Re: bison-2.7 and MacOS 10.13

2018-09-06 Thread Hans Åberg


> On 6 Sep 2018, at 12:43, David Barto  wrote:
> 
> For some reason I can’t get Bison-2.7 to run on MacOS 10.13. As I’ve posted 
> in the past, I need the older version of bison (for the time being) for our 
> older grammar files.
> 
> Anyone understand why this is happening? The same version of bison running on 
> 10.10 is just fine.

It might be this issue. Older versions of Bison are unsupported.

https://lists.gnu.org/archive/html/bug-bison/2017-09/msg2.html





Re: Bison lexer

2018-08-31 Thread Hans Åberg


> On 1 Sep 2018, at 00:12, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>>> I haven't used gcc-8 yet, but how is this relevant? If anything, I
>>> expect newer gcc versions to produce more warnings (usually useful)
>>> which flex might also suffer from.
>> 
>> Maybe the Flex lexers errors is due to using C89 to compile it or something.
> 
> No, the warnings seemed legit.

It uses "register", which was deprecated in C++11 and removed in C++17.

>>> Interesting, thanks. Fortunately, my REs are not so complex, so the
>>> bug you reported won't affect me and lexing speed is not so
>>> important for me, so (at least for now) I can just use the library
>>> as is. But if I ever need something more sophisticated, I'll keep
>>> this in mind.
>> 
>> If that is what you are using, note that it is recursive, so the function 
>> stack might overflow. But perhaps the rewrite it someday.
> 
> I don't think my lexing REs should cause much recursion. No nested
> repetitions or such.

It is in the backtracking, which it does instead of a DFA iteration, in the GCC 
regex library, that is. Some examples in the links I gave illustrate that.





Re: Bison lexer

2018-08-31 Thread Hans Åberg


> On 31 Aug 2018, at 22:26, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>>> For a start, I didn't have very good experience communicating with
>>> Flex maintainer(s?) who seemed rather nonchalant WRT gcc warnings
>>> etc. in the generated code, so over the years I'd been adjusting
>>> various warning-suppression gcc options or doing dirty #define
>>> tricks to avoid warnings, or sometimes even post-processing the
>>> generated lexer with sed.
>> 
>> GCC 8.2 uses C17 as default.
> 
> I haven't used gcc-8 yet, but how is this relevant? If anything, I
> expect newer gcc versions to produce more warnings (usually useful)
> which flex might also suffer from.

Maybe the Flex lexer errors are due to using C89 to compile it or something.

>>> But the final straw was when, after changing to C++ Bison, I wanted
>>> to switch to C++ Flex too and found this beautiful comment:
>>> 
>>>   /* The c++ scanner is a mess. The FlexLexer.h header file relies on the
>>>* following macro. This is required in order to pass the 
>>> c++-multiple-scanners
>>>* test in the regression suite. We get reports that it breaks 
>>> inheritance.
>>>* We will address this in a future release of flex, or omit the C++ 
>>> scanner
>>>* altogether. */
>> 
>> It has been like that since the 1990s, I believe.
> 
> Even better! :(
> 
> Especially since C++ in the 1990s was totally different from modern
> C++, so I have no idea if anything of this comment is still
> relevant, or maybe even more relevant, today compared to then.

Indeed, very old.

> Lesson (as if anyone was listening): Always put a date on such
> messages.

Probably just a hack, never actually developed.

>>> So I wrote a small library that builds that massive RE out of single
>>> rules and maps subexpressions back to rules (even in the case that
>>> rules contain subexpressions of their own), and that works for me.
>> 
>> I did that, too: I wrote some DFA/NFA code, and incidentally found
>> the most efficient method make action matches via a reverse NFA
>> lookup, cf. [1-3]. Also, I have made UTF-8/32 to octet character
>> class translations.
>> 
>> 1. https://gcc.gnu.org/ml/libstdc++/2018-04/msg00032.html
>> 2. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85472
>> 3. https://gcc.gnu.org/ml/libstdc++/2018-05/msg00015.html
> 
> Interesting, thanks. Fortunately, my REs are not so complex, so the
> bug you reported won't affect me and lexing speed is not so
> important for me, so (at least for now) I can just use the library
> as is. But if I ever need something more sophisticated, I'll keep
> this in mind.

If that is what you are using, note that it is recursive, so the function stack 
might overflow. But perhaps they will rewrite it someday.





Bison lexer

2018-08-29 Thread Hans Åberg


> On 29 Aug 2018, at 00:31, Frank Heckenbach  wrote:
> 
> Hans Åberg wrote:
> 
>>> On 27 Aug 2018, at 22:10, Akim Demaille  wrote:
>>> 
>>>> Most of my porting work, apart from writing the new skeletons, was
>>>> general grammar cleanup and conversion of semantic types from raw
>>>> pointers and containers to smart pointers and other RAII classes
>>>> (which was my main goal of the port, of course), and changes in the
>>>> lexer (dropping flex, but that's another story).
>>> 
>>> I fought a lot with Flex, but it works ok in C++ too with lalr1.cc.
>>> I have one parser here, 
>>> https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/dot,
>>> and another there 
>>> https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/rat
>>> for instance, using Flex.
>> 
>> That is probably versions before 2.6; the yyin and yyout have been
>> changed in the C++ header so that they are no longer pointers, so
>> it is not only incompatible with the header of older versions, but
>> also with the code it writes, resulting in the issue [1].
>> 
>> 1. https://stackoverflow.com/questions/34438023/openfoam-flex-yyin-rdbufstdcin-rdbuf-error
> 
> Though this wasn't actually my problem, I'll reply to this mail
> rather than the main thread to keep it separate from the actual
> Bison discussion.

One can change the subject. :-)

> For a start, I didn't have very good experience communicating with
> Flex maintainer(s?) who seemed rather nonchalant WRT gcc warnings
> etc. in the generated code, so over the years I'd been adjusting
> various warning-suppression gcc options or doing dirty #define
> tricks to avoid warnings, or sometimes even post-processing the
> generated lexer with sed.

GCC 8.2 uses C17 as default.

> But the final straw was when, after changing to C++ Bison, I wanted
> to switch to C++ Flex too and found this beautiful comment:
> 
>   /* The c++ scanner is a mess. The FlexLexer.h header file relies on the
>    * following macro. This is required in order to pass the c++-multiple-scanners
>    * test in the regression suite. We get reports that it breaks inheritance.
>    * We will address this in a future release of flex, or omit the C++ scanner
>    * altogether. */

It has been like that since the 1990s, I believe.

> I know there are no guarantees in the future of free software
> (neither of non-free software, of course), but such an
> announcement/threat seemed too risky to me.

Indeed, it seems broken now.

> Meanwhile I'd often thought that all Flex actually does is matching
> alternative regular expressions. Plain RE can do that as well, and
> by capturing subexpressions I can find out which alternative was
> matched.
> 
> Of course, it would (indeed turn out to be) somewhat slower (RE
> built at runtime vs. compile time), but like parsing, lexing speed
> is not a big issue to me. So I was ready to trade that in for
> convenience of programming and one less dependence on a problematic
> tool.
> 
> (Side note: Many years ago, on a different project, I dropped gperf
> to recognize predefined identifiers for similar reasons, and put
> them in a look-up table instead. Except for a tiny slowdown, that
> had worked out well, so I was confident I could drop Flex, too. --
> Now apparently the next one in line after dropping gperf and Flex
> should be Bison, but don't worry, I don't see an easy way to replace
> it, since Bison actually does some nontrivial stuff. :)
> 
> So I wrote a small library that builds that massive RE out of single
> rules and maps subexpressions back to rules (even in the case that
> rules contain subexpressions of their own), and that works for me.

I did that, too: I wrote some DFA/NFA code, and incidentally found the most 
efficient method to make action matches via a reverse NFA lookup, cf. [1-3]. 
Also, I have made UTF-8/32 to octet character class translations. 

1. https://gcc.gnu.org/ml/libstdc++/2018-04/msg00032.html
2. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85472
3. https://gcc.gnu.org/ml/libstdc++/2018-05/msg00015.html
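
A minimal sketch of the combined-RE approach described in the quoted text, 
assuming C++ std::regex with its default ECMAScript grammar. The names Rule, 
CombinedLexer and match are hypothetical and are not taken from the library 
mentioned in the thread. Each rule is wrapped in its own capture group, and 
mark_count() is used to skip over a rule's inner groups, so the outer group 
that matched can be mapped back to its rule. Note that ECMAScript alternation 
takes the first alternative that matches, not the longest one as Flex does, so 
rule order matters here.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical rule description: a name and an ECMAScript pattern.
struct Rule {
    std::string name;
    std::string pattern;
};

// Hypothetical helper: joins all rules into one RE, "(r0)|(r1)|...".
class CombinedLexer {
public:
    explicit CombinedLexer(const std::vector<Rule>& rules) : rules_(rules) {
        std::string combined;
        std::size_t next_group = 1;  // group 0 is the whole match
        for (const Rule& r : rules_) {
            if (!combined.empty()) combined += '|';
            combined += '(' + r.pattern + ')';
            outer_group_.push_back(next_group);
            // Skip this rule's outer group plus its own subexpressions.
            next_group += 1 + std::regex(r.pattern).mark_count();
        }
        re_ = std::regex(combined);
    }

    // Tries to match at input[pos]. Returns the index of the matching rule
    // and stores the match length, or returns -1 if no rule matches there.
    int match(const std::string& input, std::size_t pos,
              std::size_t& length) const {
        std::smatch m;
        auto first = input.begin() + static_cast<std::ptrdiff_t>(pos);
        if (!std::regex_search(first, input.end(), m, re_,
                               std::regex_constants::match_continuous))
            return -1;
        for (std::size_t i = 0; i < outer_group_.size(); ++i) {
            if (m[outer_group_[i]].matched) {
                length = static_cast<std::size_t>(m.length(0));
                return static_cast<int>(i);
            }
        }
        return -1;
    }

private:
    std::vector<Rule> rules_;
    std::vector<std::size_t> outer_group_;  // outer group index per rule
    std::regex re_;
};
```

A lexer loop would call match() repeatedly, advancing pos by the returned 
length and running the action of the returned rule index.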





Re: Bison C++ mid-rule value lost with variants

2018-08-27 Thread Hans Åberg


> On 27 Aug 2018, at 22:10, Akim Demaille  wrote:
> 
>> Most of my porting work, apart from writing the new skeletons, was
>> general grammar cleanup and conversion of semantic types from raw
>> pointers and containers to smart pointers and other RAII classes
>> (which was my main goal of the port, of course), and changes in the
>> lexer (dropping flex, but that’s another story).
> 
> I fought a lot with Flex, but it works ok in C++ too with lalr1.cc.
> I have one parser here, 
> https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/dot,
> and another there 
> https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/rat
> for instance, using Flex.

That is probably versions before 2.6; the yyin and yyout have been changed in 
the C++ header so that they are no longer pointers, so it is not only 
incompatible with the header of older versions, but also with the code it 
writes, resulting in the issue [1].

1. https://stackoverflow.com/questions/34438023/openfoam-flex-yyin-rdbufstdcin-rdbuf-error
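
To illustrate the kind of breakage: a stream held by value cannot be 
reassigned the way a pointer can, but its buffer can be swapped with rdbuf(). 
The sketch below uses a hypothetical Scanner struct as a stand-in for a lexer 
class that owns its input stream as an object. It is not Flex's API, only an 
assumption-laden model of the pointer-versus-object difference.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a scanner that holds its input by value.
struct Scanner {
    std::istream in;  // a stream object, not a std::istream*

    // Share the source's buffer instead of storing a pointer to the stream.
    explicit Scanner(std::istream& src) : in(src.rdbuf()) {}

    // With a pointer member one would write "in = &src;". With an object
    // member, only the underlying buffer can be replaced. rdbuf(sb) also
    // clears the error state, so reading can resume after an earlier EOF.
    void switch_input(std::istream& src) { in.rdbuf(src.rdbuf()); }

    // Reads one whitespace-delimited word; empty string at end of input.
    std::string next_word() {
        std::string w;
        in >> w;
        return w;
    }
};
```

Code that assigns to or dereferences the stream as a pointer will fail to 
compile against such a class, which matches the kind of error in [1].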





Re: Enhancement request: enabling Variant in C parsers

2018-08-26 Thread Hans Åberg


> On 26 Aug 2018, at 07:22, Akim Demaille  wrote:
> 
>> Le 25 août 2018 à 14:56, Hans Åberg  a écrit :
>> 
>> Then the example file could have comments like:
>> # Run calc++ 
>> # Causes a deliberate error. Should report ...
>> 
>> The idea is that someone coming new to Bison copies the calc++ directory, 
>> compiles it, sees there are some example files to run, and is ready to 
>> start experimenting.
> 
> The calculator is dumb: it expects a single expression, so
> there would be enough space to demonstrate all this.  And
> I want to keep it simple.
> 
> The README should suffice, shouldn’t it?

It depends on where you want to take it. From earlier in the thread, it scares 
off some, though we did not get to know why, whereas I think it is a good 
starting point for showing off some Bison features.

Maybe simple is good for a start, and more advanced features in another example 
some time in the future.




