On Oct 3, 2007, at 12:49 PM, Benjamin Meyer wrote:
> The past few months I have been writing many tools with Roberto
> Raggi's c++ preprocessor and parser. It is very fast and I have
> enjoyed messing with it. As you (chris) already know exactly what
> you guys need/want and what would make a good parser I am very
> curious what you can say about it (where it is good/bad, what it is
> missing etc)
>
> The one I have been using can be found in this package:
>
> ftp://ftp.trolltech.com/qtjambi/source/qtjambi-gpl-src-4.3.0_01.tar.gz
>
> Located in: generator/parser/ and the preprocessor is in
> generator/parser/rpp

It is somewhat irritating to me that there are almost no comments in this code, because it seems well thought out and well written. Is there any out-of-line documentation available?

Overall, it is an impressive piece of work. There are some minor strange (to me) design decisions: for example, what is ConditionAST, and why does it exist? The ASTs produced seem to be a bit heavier-weight than the clang ASTs, and they rely on the entire lexed token stream being available to interpret the location info. However, in my first few minutes looking at it, I don't think that it shares the "fatal flaws" (from the clang perspective only, obviously) in its design or implementation that elsa has. As a matter of fact, while the details differ significantly, its design is somewhat similar to clang's, validating clang's design ;-).

One thing that is impossible for me to do from inspection is to determine how complete the parser is. Since I don't have it built and you do, here are some questions for you: :)

1) Looking at the preprocessor, the implementation doesn't look particularly speedy. It is using std::strings to push text around. Have you timed the preprocessor on large inputs to see how fast it really is?

2) The preprocessor seems to get the 90% case right, but doesn't seem to be fully conformant. Do you have any idea whether it has been tested against the hard cases in the standard? For example, the clang/test/Preprocessor directory has some example hard cases.

3) Does the code handle nasty features like trigraphs?

4) How good is the C++ support? It seems like there is significant coverage for a big chunk of the language, but it seems like pieces are missing. Without at least partial template instantiation support you can't correctly parse some C++ code, for example. Note that this requires full handling of template specialization etc. Are there known holes/deficiencies?

5) It looks like a lot of semantic checks are missing. Is there anything that talks about the current state of the parser?
It also reads and ignores lots of stuff, even simple things like break/continue/goto stmts.

-Chris
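
A minimal sketch of the issue behind question 4, assuming a hypothetical `Traits` template (the names here are illustrative and come from neither parser): the same token sequence is an expression or a declaration depending on which specialization is selected, so a parser that does not resolve the specialization cannot choose the right parse.

```cpp
// Hypothetical template, for illustration only.
// Primary template: `value` names an enumerator (a value).
template <typename T>
struct Traits {
    enum { value = 2 };
};

// Explicit specialization: `value` names a type instead.
template <>
struct Traits<char> {
    typedef int value;
};

int x = 21;

// `Traits<int>::value * x` is a multiplication, because the primary
// template is selected and `value` is an enumerator.
int as_expression() {
    return Traits<int>::value * x;
}

// `Traits<char>::value * x` is a declaration of a local `int *x`,
// because the specialization is selected and `value` is a type.
int as_declaration() {
    Traits<char>::value * x = 0;  // local pointer, shadows the global x
    return x == 0 ? 1 : 0;
}
```

Both function bodies begin with the same shape, `Traits<...>::value * x`, yet only a lookup through the selected specialization tells the parser whether it is looking at a multiplication or a pointer declaration.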
