> On 29 Aug 2018, at 00:31, Frank Heckenbach <[email protected]> wrote:
>
> Hans Åberg wrote:
>
>>> On 27 Aug 2018, at 22:10, Akim Demaille <[email protected]> wrote:
>>>
>>>> Most of my porting work, apart from writing the new skeletons, was
>>>> general grammar cleanup and conversion of semantic types from raw
>>>> pointers and containers to smart pointers and other RAII classes
>>>> (which was my main goal of the port, of course), and changes in the
>>>> lexer (dropping flex, but that's another story).
>>>
>>> I fought a lot with Flex, but it works ok in C++ too with lalr1.cc.
>>> I have one parser here,
>>> https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/dot,
>>> and another there
>>> https://gitlab.lrde.epita.fr/vcsn/vcsn/tree/master/lib/vcsn/rat
>>> for instance, using Flex.
>>
>> That is probably versions before 2.6; the yyin and yyout have been
>> changed in the C++ header so that they are no longer pointers, so
>> it is not only incompatible with the header of older versions, but
>> also with the code it writes, resulting in the issue [1].
>>
>> 1.
>> https://stackoverflow.com/questions/34438023/openfoam-flex-yyin-rdbufstdcin-rdbuf-error
>
> Though this wasn't actually my problem, I'll reply to this mail
> rather than the main thread to keep it separate from the actual
> Bison discussion.

One can change the subject. :-)

> For a start, I didn't have very good experience communicating with
> Flex maintainer(s?) who seemed rather nonchalant WRT gcc warnings
> etc. in the generated code, so over the years I'd been adjusting
> various warning-suppression gcc options or doing dirty #define
> tricks to avoid warnings, or sometimes even post-processing the
> generated lexer with sed.

GCC 8.2 uses C17 as default.

> But the final straw was when, after changing to C++ Bison, I wanted
> to switch to C++ Flex too and found this beautiful comment:
>
> /* The c++ scanner is a mess. The FlexLexer.h header file relies on the
>  * following macro. This is required in order to pass the c++-multiple-scanners
>  * test in the regression suite. We get reports that it breaks inheritance.
>  * We will address this in a future release of flex, or omit the C++ scanner
>  * altogether. */

It has been like that since the 1990s, I believe.

> I know there are no guarantees in the future of free software
> (neither of non-free software, of course), but such an
> announcement/threat seemed too risky to me.

Indeed, it seems broken now.

> Meanwhile I'd often thought that all Flex actually does is matching
> alternative regular expressions. Plain RE can do that as well, and
> by capturing subexpressions I can find out which alternative was
> matched.
>
> Of course, it would (indeed turn out to be) somewhat slower (RE
> built at runtime vs. compile time), but like parsing, lexing speed
> is not a big issue to me. So I was ready to trade that in for
> convenience of programming and one less dependence on a problematic
> tool.
>
> (Side note: Many years ago, on a different project, I dropped gperf
> to recognize predefined identifiers for similar reasons, and put
> them in a look-up table instead. Except for a tiny slowdown, that
> had worked out well, so I was confident I could drop Flex, too.
> -- Now apparently the next one in line after dropping gperf and Flex
> should be Bison, but don't worry, I don't see an easy way to replace
> it, since Bison actually does some nontrivial stuff. :)
>
> So I wrote a small library that builds that massive RE out of single
> rules and maps subexpressions back to rules (even in the case that
> rules contain subexpressions of their own), and that works for me.

I did that, too: I wrote some DFA/NFA code, and incidentally found that
the most efficient method makes action matches via a reverse NFA lookup,
cf. [1-3]. Also, I have made UTF-8/32 to octet character class
translations.

1. https://gcc.gnu.org/ml/libstdc++/2018-04/msg00032.html
2. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85472
3. https://gcc.gnu.org/ml/libstdc++/2018-05/msg00015.html
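
For illustration, the combined-RE idea quoted above (one big alternation, with capture groups telling which rule matched) might be sketched with std::regex roughly as follows. This is only a minimal sketch, not the library described in the quoted mail: the names Rule, Token and CombinedLexer are made up for the example, and it assumes each rule is a valid ECMAScript pattern without back-references (which would be renumbered inside the combined RE).

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for this sketch: a named rule and a matched token.
struct Rule {
    std::string name;
    std::string pattern;  // ECMAScript syntax, no back-references assumed
};

struct Token {
    std::string rule;
    std::string text;
};

class CombinedLexer {
public:
    explicit CombinedLexer(std::vector<Rule> rules) : rules_(std::move(rules)) {
        std::string combined;
        std::size_t next_group = 1;  // group 0 is the whole match
        for (const Rule& r : rules_) {
            if (!combined.empty()) combined += '|';
            combined += '(' + r.pattern + ')';
            group_of_rule_.push_back(next_group);
            // Skip the wrapper group plus the rule's own subexpressions,
            // so later rules still map to the right group index.
            next_group += 1 + std::regex(r.pattern).mark_count();
        }
        re_.assign(combined);
    }

    // Splits the input into tokens; stops silently at the first position
    // where no rule matches (or where only an empty match is possible).
    std::vector<Token> tokenize(const std::string& input) const {
        std::vector<Token> out;
        auto pos = input.cbegin();
        std::smatch m;
        while (pos != input.cend() &&
               std::regex_search(pos, input.cend(), m, re_,
                                 std::regex_constants::match_continuous) &&
               m.length(0) > 0) {
            for (std::size_t i = 0; i < rules_.size(); ++i) {
                if (m[group_of_rule_[i]].matched) {
                    out.push_back({rules_[i].name, m.str(0)});
                    break;
                }
            }
            pos += m.length(0);
        }
        return out;
    }

private:
    std::vector<Rule> rules_;
    std::vector<std::size_t> group_of_rule_;  // wrapper group index per rule
    std::regex re_;
};
```

Note that ECMAScript alternation takes the first alternative that matches, whereas Flex picks the longest match, so the rule order matters more here than in a Flex specification. The RE is also built at runtime, which is the speed trade-off mentioned in the quoted text.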
