(This is https://github.com/akimd/bison/pull/33).
I am installing this series of commits, which replaces the RFC I sent here:

https://lists.gnu.org/r/bison-patches/2020-03/msg00057.html

I will repeat below why I believe this is a good move.  But before that, a note about a change from the previous version of this branch: I went from "yysymbol_code_t" to "yysymbol_type_t", to mirror the "token type" (yytokentype) that the documentation refers to.  Of course what matters for the implementation is that these are numbers (aka codes), but the human is only concerned with the fact that it's an enumeration of all the different symbol types (where "type" should be understood as "sort", "kind", etc., not as in "typing").

If someone has a better idea than "symbol type" and "token type", I'm all ears!  But yes, as a consequence, in C++ we have symbol_type_type.

Cheers!

There are many advantages in exposing the symbol (internal) numbers:

- custom error messages can use them to decide how to represent a given symbol, or a set of symbols.  See my message about custom error messages for examples.
  (https://lists.gnu.org/archive/html/bison-patches/2020-01/msg00000.html)

- we need something similar in uses of yyexpected_tokens.  For instance, currently, bistromathic's completion() reads:

      int ntokens = expected_tokens (line, tokens, YYNTOKENS);
      [...]
      for (int i = 0; i < ntokens; ++i)
        if (tokens[i] == YYTRANSLATE (TOK_VAR))
          [...]
        else if (tokens[i] == YYTRANSLATE (TOK_FUN))
          [...]
        else
          [...]

- now that the symbol number is a compile-time expression, we can easily build static tables, use switch, etc.

- some users depended on the ability to get the token number from a symbol to write test cases for their scanners.  But Bison 3.5 removed the table this feature depended upon (a reverse yytranslate).  Now they can check against the actual symbol number, without having to pay (in space and time) for a conversion.  See
  https://lists.gnu.org/r/bug-bison/2020-01/msg00001.html, and
  https://lists.gnu.org/archive/html/bug-bison/2020-03/msg00015.html.
- it helps us clearly separate the internal symbol numbers from the external token numbers, whose difference is sometimes blurred in the code when values coincide (e.g. "yychar = yytoken = YYEOF").

- it allows us to get rid of ugly macros with inconsistent names such as YYUNDEFTOK and YYTERROR, and to group related definitions together.

- similarly it provides clean access to the $accept symbol (which proves convenient in an ongoing experiment of mine with several %start symbols).

I have left the 'regen' commits, because it's useful to see the impact of these changes.

I'm struggling with Java, where the concept of enum is completely different.

I would also like to move the lone #defines for YYEOF and YYEMPTY into yytokentype, so that we can use "strong typing" for both yytoken and yychar.  And get rid of more macros.

I'd be happy to receive comments, as usual.

Akim Demaille (21):
  style: comment changes about token numbers
  yacc.c: introduce an enum that defines the symbol's number
  regen
  yacc.c: use yysymbol_type_t instead of int for yytoken
  yacc.c: also define a symbol number for the empty token
  regen
  yacc.c: prefer YYSYMBOL_YYERROR to YYSYMBOL_error
  regen
  bistromathic: use symbol numbers instead of YYTRANSLATE
  yysymbol_type_t: always assign an enumerator
  regen
  yacc.c: revert to not using yysymbol_type_t in the yytranslate table
  regen
  yacc.c: fix more errors from make maintainer-check-g++
  regen
  glr.c: use yysymbol_type_t, YYSYMBOL_YYEOF etc.
  glr.c, yacc.c: propagate yysymbol_type_t
  regen
  glr.c: remove the yySymbol alias
  c++: also use symbol_type_type
  c++: replace symbol_number_type with symbol_type_type

 TODO                            |   5 +
 data/skeletons/bison.m4         |  34 ++++--
 data/skeletons/c++.m4           |  66 ++++++----
 data/skeletons/c.m4             |  49 ++++++--
 data/skeletons/glr.c            | 100 ++++++++--------
 data/skeletons/glr.cc           |  26 ++--
 data/skeletons/lalr1.cc         |  95 +++++++--------
 data/skeletons/yacc.c           | 105 ++++++++--------
 examples/c/bistromathic/parse.y |  30 ++---
 src/parse-gram.c                | 206 +++++++++++++++++++++++++-------
 src/parse-gram.h                |   2 +-
 src/parse-gram.y                |   6 +-
 tests/headers.at                |   2 +
 tests/local.at                  |  10 +-
 14 files changed, 474 insertions(+), 262 deletions(-)

--
2.26.0
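
PS: to make the "static tables, use switch" point from the list of advantages concrete, here is a minimal sketch.  It is not taken from the patches: the enum is hand-written as a stand-in for what a generated parser would define, and the enumerators YYSYMBOL_NUM, YYSYMBOL_VAR, YYSYMBOL_FUN, their values, EXAMPLE_NSYMBOLS and both helper functions are assumptions made up for the illustration, modeled on the YYSYMBOL_YYEOF naming used in this series.

```c
/* Hypothetical stand-in for the enum emitted in a generated parser.
   Enumerator names and values are assumptions for this sketch.  */
typedef enum
{
  YYSYMBOL_YYEOF = 0,
  YYSYMBOL_YYERROR = 1,
  YYSYMBOL_NUM = 2,
  YYSYMBOL_VAR = 3,
  YYSYMBOL_FUN = 4,
  EXAMPLE_NSYMBOLS = 5   /* Not a symbol, only the table size.  */
} yysymbol_type_t;

/* Enumerators are compile-time constants, so they can index a
   static table directly.  */
static const char *const symbol_label[EXAMPLE_NSYMBOLS] =
{
  [YYSYMBOL_YYEOF] = "end of input",
  [YYSYMBOL_YYERROR] = "error",
  [YYSYMBOL_NUM] = "number",
  [YYSYMBOL_VAR] = "variable",
  [YYSYMBOL_FUN] = "function"
};

/* Return a human-readable label for SYM, or "unknown" when SYM is
   outside the table.  */
const char *
symbol_label_of (yysymbol_type_t sym)
{
  int n = (int) sym;
  return 0 <= n && n < EXAMPLE_NSYMBOLS ? symbol_label[n] : "unknown";
}

/* Count the symbols in TOKENS that name a variable or a function,
   using a switch on symbol numbers instead of comparing against
   YYTRANSLATE (TOK_VAR) and YYTRANSLATE (TOK_FUN).  */
int
count_completable (const yysymbol_type_t *tokens, int ntokens)
{
  int res = 0;
  for (int i = 0; i < ntokens; ++i)
    switch (tokens[i])
      {
      case YYSYMBOL_VAR:
      case YYSYMBOL_FUN:
        ++res;
        break;
      default:
        break;
      }
  return res;
}
```

A completion routine in the spirit of bistromathic's could fill an array of yysymbol_type_t with the expected symbols and dispatch on it in the same way, with no conversion at run time.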
