Hi Victor, We’ve made some progress in the last couple of months. Bison 3.2 is not released yet, but it should happen within a couple of weeks.
> Le 19 août 2018 à 22:11, Victor Khomenko <[email protected]> a > écrit : > > Hi Akim, > >>> I did look at it - this example scared me off c++ parsers :-( >> >> Really??? Then I failed. Can you be more specific? > > Ok, I'll try to make some suggestions for the manual and bison. Please note > that much of what I say might be due to lack of understanding - I’m just a > regular bison user, and don't understand the skeletons, internals, etc. So you are the right audience :) Something would be wrong if you had to be a Bison contributor to be a Bison user. > The context is that I have 8 bison C-style pure parsers in my program, most > of them are very simple, but a couple are fairly complicated, including a glr > one. This is already quite advanced! Do you happen to have some of them living in the same application/library? Or each time they are separate? > 1. The C++ section in the bison manual starts with the description of > generated files: position.hh, location.hh, stack.hh, file.hh and file.cc. > That's 5 generated files, so at this point many prospective users would just > run away screaming. I’d say the first three files are completely unnecessary > and can be dumped into file.hh (or even .cc) We made some progress: you can have from 1 to 3 files. Obviously parser.cc must always exist. parser.hh is on demand (and that’s not new, it’s been the case since a long time), bound to %defines. location.hh is nice to have if for instance you tag your ast with locations (I know you don’t, but that’s not typical). In which case it’s nice to have a means to reach the location type without having to include the whole parser.hh. You control this with api.location.file. > - note that the contents could be in some bison::detail namespace, so that > the types are exactly the same in all parsers and the linker can merge the > functions if they are declared as inline but were not actually inlined. The > work-around for multiple parsers suggested in Sect. 10.1.3.3 (making one > parser generate these includes in master/ and the other parsers use them) is > ugly and introduces unwanted include and build dependencies (e.g. there could > be two independent libraries within a project, each using a parser; this > technique would make them dependent both for includes and the built order). There’s not much to share. The location types _can_ be different, for instance in the way you handle filenames. And build systems are all about dependencies, so it should not be too much of a problem to get things right. However, you are totally free to design your own location type and to ask Bison to use it. > I’d say file.hh can also be suppressed in simple cases, e.g. if one replaces > the generated class by a function (see below) - it's then possible to declare > the parser function at the point of use. Yes, just don’t pass %defines. Like in C. > 2. int yylex (semantic type* yylval, location type* yylloc, type1 arg1, ...) > [Method on parser] > In C++ one would usually pass the parameters by references rather than > pointers - also to avoid handling null-pointers in yylex. That would be nicer, you are right, but backward compatibility won’t let us fix that. Besides, this interface is wrong and should not be used. You should use symbol constructors and return complete symbols rather than hoping for the returned type and the semantic value to be consistant. Actually, it would be a nice feature to add to C parsers too. > 4. calc++ is a rather complicated example, for warming up it would be better > to use something simple. Then calc++ would be good to illustrate various > intricacies like mutual #include dependencies and a flex scanner. The section about C++ was meant to be read by people who already know about the C interface. That could have been a bad choice. I do not plan to duplicate the whole chapters of introduction, they still should be read, but I agree we needed a simpler introduction. So I added one (Section 10.1.1, A Simple C++ Example). https://lists.gnu.org/archive/html/bison-patches/2018-10/msg00107.html > 3. the destructor of calcxx_driver should not be virtual; I don’t know what you are referring to here. It is not. It’s not even explicit. > in fact, the automatically generated destructor should be ok, so no need to > declare it at all. The constructor is not necessary either - since C++11 one > can give the default initial values to data members: > std::map<std::string, int> variables={ {"one", 1}, {"two", 2} }; > trace_scanning=false; > trace_parsing=false; We could modernize the example, sure, but it’s C++98, so that was not an option. I’m ok with that move. > Maybe some other functions within the class, like error(), can be defined at > the point of declaration. Bison cannot know how you want the error to be reported to the user: stderr? Or a log for a daemon? Or an alert modal window for an application with a GUI? etc. > 4. int calcxx_driver::parse (const std::string& f); > This is confusing, as the parser also has the function with the same name, > but no parameters. Maybe call it parse_file? The idea is that the driver is the exposed interface to the rest of the program, and ‘parse’ seems right. But I could live with parse_file if you really think it makes a difference. > 5. scan_begin() and scan_end() could be removed to simplify the interface, > and inlined into parse(…). > 6. implementing calcxx_driver::scan_begin() and calcxx_driver::scan_end() in > the lexer file is really bad - I guess one can easily implement them in the > calcxx_driver.cpp file - this does not cause any problems with #includes. I disagree. Flex’s interface is painful, and I don’t want to have to have it generate a header file. So you should leave Flex-related implementation details there. Here, it’s ‘just’ ‘yyin’ but in more ambitious projects, you will probably need to use yypush_buffer_state, yy_push_state, yy_switch_to_buffer, yypop_buffer_state, etc. My point here was to show the pattern I ended up following to be able to use Flex in C++ effectively. > 7. it would be better to use C++ streams rather than C-style fopen/fclose > (note that error() does use streams). Sure. But then again, that’s Flex, not Bison. I have put a tarball with the current state of the doc and of the examples in here: https://www.lrde.epita.fr/~akim/private/bison/bison-3.1.91.35-8fd5.tar.gz https://www.lrde.epita.fr/~akim/private/bison/bison-3.1.91.35-8fd5.tar.xz https://www.lrde.epita.fr/~akim/private/bison/bison.pdf Please, let me know what you think about it. Thanks again for your detailed report. Cheers!
