Hi all,

This is the fifth (!!!) beta of Bison 3.6.  Many portability issues have
been addressed.
Bison 3.6 includes big changes prompted by user feature requests.  Dear
users, we *need* feedback about these new features, we *need* you to try
them on your project to make sure they address your needs, to make sure
your request was properly understood.

Cheers,

Here are the compressed sources:
  https://alpha.gnu.org/gnu/bison/bison-3.5.94.tar.gz   (5.1MB)
  https://alpha.gnu.org/gnu/bison/bison-3.5.94.tar.xz   (3.1MB)

Here are the GPG detached signatures[*]:
  https://alpha.gnu.org/gnu/bison/bison-3.5.94.tar.gz.sig
  https://alpha.gnu.org/gnu/bison/bison-3.5.94.tar.xz.sig

Use a mirror for higher download bandwidth:
  https://www.gnu.org/order/ftp.html

[*] Use a .sig file to verify that the corresponding file (without the
.sig suffix) is intact.  First, be sure to download both the .sig file
and the corresponding tarball.  Then, run a command like this:

  gpg --verify bison-3.5.94.tar.gz.sig

If that command fails because you don't have the required public key,
then run this command to import it:

  gpg --keyserver keys.gnupg.net --recv-keys 0DDCAA3278D5264E

and rerun the 'gpg --verify' command.

This release was bootstrapped with the following tools:
  Autoconf 2.69
  Automake 1.16.2
  Flex 2.6.4
  Gettext 0.19.8.1
  Gnulib v0.1-3382-g2ac33b29f

NEWS

* Noteworthy changes in release 3.5.94 (2020-05-06) [beta]

  Portability issues.

* Noteworthy changes in release 3.5.93 (2020-05-03) [beta]

  Portability issues.

* Noteworthy changes in release 3.5.92 (2020-05-03) [beta]

  Portability issues.  More documentation.  Backward compatibility
  issues with C++.

* Noteworthy changes in release 3.5.91 (2020-04-29) [beta]

** New features

*** Returning the error token

  When the scanner returns an invalid token or the undefined token
  (YYUNDEF), the parser generates an error message and enters error
  recovery.  Because of that error message, most scanners that find
  lexical errors generate an error message themselves, and then ignore
  the invalid input without entering error recovery.
  Scanners may now return YYerror, the error token, to enter
  error-recovery mode without triggering an additional error message.
  See the bistromathic for an example.

*** The bistromathic features internationalization

  The way it builds error messages is more general and is easy to reuse
  in other projects.

* Noteworthy changes in release 3.5.90 (2020-04-18) [beta]

** Backward incompatible changes

  TL;DR: replace "#define YYERROR_VERBOSE 1" by "%define parse.error
  verbose".

  The YYERROR_VERBOSE macro is no longer supported; the parsers that
  still depend on it will now produce Yacc-like error messages (just
  "syntax error").  It was superseded by the "%error-verbose" directive
  in Bison 1.875 (2003-01-01).  Bison 2.6 (2012-07-19) clearly announced
  that support for YYERROR_VERBOSE would be removed.  Note that since
  Bison 3.0 (2013-07-25), "%error-verbose" is deprecated in favor of
  "%define parse.error verbose".

** New features

*** Improved syntax error messages

  Two new values for the %define parse.error variable offer more control
  to the user.  Available in all the skeletons (C, C++, Java).

**** %define parse.error detailed

  The behavior of "%define parse.error detailed" closely resembles that
  of "%define parse.error verbose", with a few exceptions.  First, it is
  safe to use non-ASCII characters in token aliases (with 'verbose', the
  result depends on the locale with which bison was run).  Second, a
  yysymbol_name function is exposed to the user, instead of the
  yytnamerr function and the yytname table.  Third, token
  internationalization is supported (see below).

**** %define parse.error custom

  With this directive, the user forges and emits the syntax error
  message herself by defining the yyreport_syntax_error function.  A new
  type, yypcontext_t, captures the circumstances of the error, and
  provides the user with functions to get details, such as
  yypcontext_expected_tokens to get the list of expected token kinds.
  A possible implementation of yyreport_syntax_error is:

    int
    yyreport_syntax_error (const yypcontext_t *ctx)
    {
      int res = 0;
      YY_LOCATION_PRINT (stderr, *yypcontext_location (ctx));
      fprintf (stderr, ": syntax error");
      // Report the tokens expected at this point.
      {
        enum { TOKENMAX = 10 };
        yysymbol_kind_t expected[TOKENMAX];
        int n = yypcontext_expected_tokens (ctx, expected, TOKENMAX);
        if (n < 0)
          // Forward errors to yyparse.
          res = n;
        else
          for (int i = 0; i < n; ++i)
            fprintf (stderr, "%s %s",
                     i == 0 ? ": expected" : " or", yysymbol_name (expected[i]));
      }
      // Report the unexpected token.
      {
        yysymbol_kind_t lookahead = yypcontext_token (ctx);
        if (lookahead != YYSYMBOL_YYEMPTY)
          fprintf (stderr, " before %s", yysymbol_name (lookahead));
      }
      fprintf (stderr, "\n");
      return res;
    }

**** Token aliases internationalization

  When the %define variable parse.error is set to `custom` or
  `detailed`, one may specify which token aliases are to be translated
  using _().  For instance

    %token
        PLUS   "+"
        MINUS  "-"
      <double>
        NUM _("double precision number")
      <symrec*>
        FUN _("function")
        VAR _("variable")

  In that case the user must define _() and N_(), and yysymbol_name
  returns the translated symbol (i.e., it returns '_("variable")' rather
  than '"variable"').  In Java, the user must provide an i18n()
  function.

*** List of expected tokens (yacc.c)

  Push parsers may invoke yypstate_expected_tokens at any point during
  parsing (including even before submitting the first token) to get the
  list of possible tokens.  This feature can be used to propose
  autocompletion (see below the "bistromathic" example).

  It makes little sense to use this feature without enabling LAC
  (lookahead correction).

*** Deep overhaul of the symbol and token kinds

  To avoid the confusion with types in programming languages, we now
  refer to token and symbol "kinds" instead of token and symbol "types".
  The documentation and error messages have been revised.
  All the skeletons have been updated to use dedicated enum types rather
  than integral types.  Special symbols are now regular citizens,
  instead of being declared in ad hoc ways.

**** Token kinds

  The "token kind" is what is returned by the scanner, e.g., PLUS,
  NUMBER, LPAREN, etc.  While backward compatibility is of course
  ensured, users are nonetheless invited to replace their uses of "enum
  yytokentype" by "yytoken_kind_t".

  This type now also includes tokens that were previously hidden: YYEOF
  (end of input), YYUNDEF (undefined token), and YYERRCODE (error
  token).  They now have string aliases, internationalized when
  internationalization is enabled.  Therefore, by default, error
  messages now refer to "end of file" (internationalized) rather than
  the cryptic "$end", or to "invalid token" rather than "$undefined".

  Therefore in most cases it is now useless to define the end-of-file
  token as follows:

    %token T_EOF 0 "end of file"

  Rather simply use "YYEOF" in your scanner.

**** Symbol kinds

  The "symbol kind" is what the parser actually uses.  (Unless the
  api.token.raw %define variable is used, the symbol kind of a terminal
  differs from the corresponding token kind.)

  Symbol kinds are now exposed as an enum, "yysymbol_kind_t".

  This allows users to tailor the error messages the way they want, or
  to process some symbols in a specific way in autocompletion (see the
  bistromathic example below).

*** Modernize display of explanatory statements in diagnostics

  Since Bison 2.7, output was indented four spaces for explanatory
  statements.  For example:

    input.y:2.7-13: error: %type redeclaration for exp
    input.y:1.7-11:     previous declaration

  Since the introduction of caret-diagnostics, this became less clear.
  This indentation has been removed and submessages are now displayed
  as in GCC:

    input.y:2.7-13: error: %type redeclaration for exp
        2 | %type <float> exp
          |       ^~~~~~~
    input.y:1.7-11: note: previous declaration
        1 | %type <int> exp
          |       ^~~~~

  Contributed by Victor Morales Cayuela.
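  As an illustration of the "Token kinds" and "Returning the error
  token" entries above, here is a minimal sketch of a hand-written
  scanner that returns YYEOF at end of input and YYerror after
  reporting a lexical error itself.  It is not taken from the Bison
  sources.  The enum is a stand-in for the one Bison generates, its
  numeric values are assumptions made for this sketch, and the scan
  function and the NUM and PLUS tokens are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Stand-in for the token-kind enum that Bison generates in the parser
   header.  The numeric values are assumptions made for this sketch.  */
typedef enum
{
  YYEOF = 0,     /* End of input.  */
  YYerror = 256, /* Error token.  */
  YYUNDEF = 257, /* Undefined token.  */
  NUM = 258,     /* Hypothetical user token.  */
  PLUS = 259     /* Hypothetical user token.  */
} yytoken_kind_t;

/* Hypothetical scanner over a NUL-terminated string.  *CURSOR is
   advanced past the token that was read; *VALUE receives the value of
   a NUM.  */
static yytoken_kind_t
scan (const char **cursor, int *value)
{
  assert (cursor && *cursor && value);
  const char *p = *cursor;

  /* Skip blanks.  */
  while (*p == ' ' || *p == '\t')
    ++p;

  if (*p == '\0')
    {
      /* No need for a user-defined T_EOF token.  */
      *cursor = p;
      return YYEOF;
    }
  if (isdigit ((unsigned char) *p))
    {
      int n = 0;
      while (isdigit ((unsigned char) *p))
        n = n * 10 + (*p++ - '0');
      *value = n;
      *cursor = p;
      return NUM;
    }
  if (*p == '+')
    {
      *cursor = p + 1;
      return PLUS;
    }

  /* Lexical error: report it here, skip the offending character, and
     return the error token so that the parser enters error recovery
     without issuing a second message.  */
  fprintf (stderr, "lexical error: unexpected character '%c'\n", *p);
  *cursor = p + 1;
  return YYerror;
}
```

  In a real project the enum comes from the generated header, and the
  scanner has whatever signature the chosen yylex interface requires.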
*** C++

  The token and symbol kinds are yy::parser::token_kind_type and
  yy::parser::symbol_kind_type.

  The symbol_type::kind() member function returns the kind of a symbol.
  This can be used to write unit tests for scanners, e.g.,

    yy::parser::symbol_type t = make_NUMBER ("123");
    assert (t.kind () == yy::parser::symbol_kind::S_NUMBER);
    assert (t.value.as<int> () == 123);

** Documentation

*** User Manual

  In order to avoid ambiguities with "type" as in "typing", we now refer
  to the "token kind" (e.g., `PLUS`, `NUMBER`, etc.) rather than the
  "token type".  We now also refer to the "symbol kind" (e.g., `PLUS`,
  `expr`, etc.).

*** Examples

  There are now two examples in examples/java: a very simple calculator,
  and one that tracks locations to provide accurate error messages.

  The lexcalc example (a simple example in C based on Flex and Bison)
  now also demonstrates location tracking.

  A new C example, bistromathic, is a fully featured interactive
  calculator using many Bison features: pure interface, push parser,
  autocompletion based on the current parser state (using
  yypstate_expected_tokens), location tracking, internationalized custom
  error messages, lookahead correction, rich debug traces, etc.

  It shows how to depend on the symbol kinds to tailor autocompletion.
  For instance it recognizes the symbol kind "VARIABLE" to propose
  autocompletion on the existing variables, rather than on the word
  "variable".
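  The idea of tailoring autocompletion to symbol kinds can be sketched
  as follows.  This is not the bistromathic code: symbol_kind_t and
  symbol_name are hypothetical stand-ins for the generated
  yysymbol_kind_t and yysymbol_name, and the expected array plays the
  role of the list that yypstate_expected_tokens would fill in.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the Bison-generated yysymbol_kind_t.  */
typedef enum
{
  SYM_PLUS,
  SYM_NUMBER,
  SYM_VARIABLE
} symbol_kind_t;

/* Hypothetical stand-in for yysymbol_name.  */
static const char *
symbol_name (symbol_kind_t kind)
{
  switch (kind)
    {
    case SYM_PLUS:     return "+";
    case SYM_NUMBER:   return "number";
    case SYM_VARIABLE: return "variable";
    }
  return "?";
}

/* Append CANDIDATE to OUT if it starts with PREFIX and there is room.
   Return the new number of entries.  */
static int
add_if_match (const char *candidate, const char *prefix,
              const char **out, int n, int outmax)
{
  if (n < outmax && strncmp (candidate, prefix, strlen (prefix)) == 0)
    out[n++] = candidate;
  return n;
}

/* Fill OUT with at most OUTMAX completion candidates for PREFIX.  When
   a variable is expected, propose the names of the existing variables
   rather than the word "variable"; for any other expected symbol,
   propose its name.  Return the number of candidates.  */
static int
completions (const symbol_kind_t *expected, int nexpected,
             const char *const *vars, int nvars,
             const char *prefix, const char **out, int outmax)
{
  assert (outmax >= 0);
  int n = 0;
  for (int i = 0; i < nexpected; ++i)
    {
      if (expected[i] == SYM_VARIABLE)
        {
          for (int j = 0; j < nvars; ++j)
            n = add_if_match (vars[j], prefix, out, n, outmax);
        }
      else
        n = add_if_match (symbol_name (expected[i]), prefix, out, n, outmax);
    }
  return n;
}
```

  With "+" and a variable expected, the variables "pi", "phi" and "e"
  defined, and the prefix "p", this sketch proposes "pi" and "phi".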