Hans Åberg wrote:

> > On 17 Dec 2018, at 10:48, Frank Heckenbach <[email protected]> wrote:
> >
> > I think we agree here, and that was actually my concern when I
> > started this thread. I don't want to have to write a separate case
> > for each token kind in my lexer. Of course, we need a separate case
> > for each semantic type because that involves a different type in the
> > constructor/builder call already, but these are relatively few,
> > compared to token kinds, in my lexers.
>
> Might Bison generate a function with a switch statement, generate the right
> return for the lexer to use?

Different semantic types need separate functions since C++ is
strongly typed. Perhaps an example makes it clearer:

Say we have tokens V_FOO and V_BAR with no semantic type, I_BAZ and
I_QUX with semantic type int, and S_BLA with semantic type string.
(BTW, I'm no fan of Hungarian notation, I just use it here for the
sake of example.)

So far Bison generates (roughly speaking):

  symbol_type make_V_FOO ();
  symbol_type make_V_BAR ();
  symbol_type make_I_BAZ (int &&);
  symbol_type make_I_QUX (int &&);
  symbol_type make_S_BLA (string &&);

What I suggest adding (without changing the above) is:

  symbol_type make_symbol (token_type type);
  // checks at runtime that type is V_FOO or V_BAR

  symbol_type make_symbol (token_type type, int &&);
  // checks at runtime that type is I_BAZ or I_QUX

  symbol_type make_symbol (token_type type, string &&);
  // checks at runtime that type is S_BLA

These runtime checks might be implemented via a switch if that's
easier to auto-generate (it might be in fact) or with a simple
"if (... || ...)" statement; that's an implementation detail.

> >> Maybe an option. Akim perhaps haven't used this dynamic token
> >> lookup.
> >
> > I guess he hasn't. But I don't think we need an option. These would
> > just be additional functions that one can use or not.
>
> The with an option would be that those that do not need this feature could
> use a more optimal variant.

According to my proposal everyone could use any function. In fact,
my lexers do: they use the "safe" make_FOO functions by default, and
the (so far) unchecked ones for the dynamically looked-up tokens.

> >> Those that do might prefer not risking the program to bomb.
> >
> > It's not that bad actually. Again, my lexers work fine as is.
> > I just brought this up because Akim proposed to call the function
> > "unsafe_..." which I thought was too harsh and proposed
> > "unchecked_..." -- but by adding the checks, it would be neither
> > unsafe nor unchecked. :)
>
> This worries me.

That's why I suggest to add the check. :)

> But also having having to use something more complex to be returned by the
> lexer than a value on the lookup table .

The lexer returns a token which contains the token kind (an enum)
and the semantic value (a union value). A mismatch between the two
is bad.

The make_FOO functions avoid a mismatch and are suitable for
statically known token kinds. The direct constructor call can be
used for dynamic token kinds, but allows a mismatch. The functions I
propose to generate instead could be used for dynamic token kinds
and avoid a mismatch.

Everything clear now?

Regards,
Frank
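
P.S. To make the proposal a bit more concrete, here is a minimal
sketch of what the checked make_symbol overloads could look like.
It is only an illustration under assumptions: token_type and
symbol_type below are simplified stand-ins (a plain struct instead
of Bison's variant-based symbol), the choice of
std::invalid_argument for a failed check is mine, and
example_usage is a hypothetical helper. None of this is actual
Bison output.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Simplified stand-in for the generated token enum (assumption).
enum token_type { V_FOO, V_BAR, I_BAZ, I_QUX, S_BLA };

// Simplified stand-in for the generated symbol type (assumption):
// the real one stores the semantic value in a union/variant.
struct symbol_type {
  token_type type;
  int int_value;
  std::string string_value;
};

// Tokens without a semantic value: only V_FOO and V_BAR are accepted.
inline symbol_type make_symbol(token_type type) {
  switch (type) {
    case V_FOO:
    case V_BAR:
      return symbol_type{type, 0, std::string()};
    default:
      throw std::invalid_argument("token kind requires a semantic value");
  }
}

// Tokens with an int semantic value: only I_BAZ and I_QUX are accepted.
inline symbol_type make_symbol(token_type type, int&& value) {
  switch (type) {
    case I_BAZ:
    case I_QUX:
      return symbol_type{type, value, std::string()};
    default:
      throw std::invalid_argument("token kind does not take an int");
  }
}

// Tokens with a string semantic value: only S_BLA is accepted.
inline symbol_type make_symbol(token_type type, std::string&& value) {
  if (type != S_BLA)
    throw std::invalid_argument("token kind does not take a string");
  return symbol_type{type, 0, std::move(value)};
}

// Hypothetical usage: the token kind comes from a runtime lookup,
// the semantic type is still fixed by the overload that is called.
inline void example_usage(token_type looked_up_kind) {
  symbol_type s = make_symbol(looked_up_kind, 42);
  assert(s.type == looked_up_kind);
  assert(s.int_value == 42);
}
```

With this, a lexer that looks up the kind in a table still needs
one call site per semantic type, but not one per token kind, and a
kind/value mismatch is reported at runtime instead of silently
producing a bad symbol.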
