Was reading Terence's web-log entry of today (http://www.antlr.org/blog/antlr3/lookahead.tml), which contained the following paragraph:
> The extra power of GLR and PEGs comes at the cost of time and space. > Grimm's experiments show SDF2 (traditional GLR) and Elkhound are roughly > 7 times slower than a more traditional LL(k) parser generator such as > JavaCC at parsing Java source code. Rats! is currently about 2 times > slower than JavaCC. The extra machinery beyond the LR and LL foundations > slows down parsing for even deterministic languages like Java. Has anyone come up with a series of benchmark tests for evaluating parsers. For instance, a set of grammars and language examples which are used in determining which parsing methods are faster with different kinds of inputs. Am thinking we might have examples of both programming languages (C, Java) and data languages (XML, HTML, X12). Then we would be able to have analyses like, "for simple, well-behaved languages like A and B, parser X performs better; but when the language becomes more complex, such as languages C and D, parser Y performs better." Then we could compare things like not only speed, but memory usage, and possibly even compile them all on a single platform and have a competition. If not, what examples do you use to test your parser? How do you measure speed and memory usage (if you do measure them)? How do you check correctness? Would help in the decision-making in deciding which parsing model to use, for instance. (E.g., should I use ANTLR or Rats! as a base for my application?) For that matter, a nice theoretical performance comparison would be helpful, too. For instance, I seem to recall Earley's algorithm has time complexity of about O(n^3) where n is the size of the input. Not sure what it's space complexity was. I think LL(k) is good ol' O(n). Not sure what others are. Am thinking a table which shows preprocessing (constructing parse tables, etc.) and parsing space and time complexities as functions of grammar and input sizes. Here's my beginning: Preprocessing Parsing Parse Method Space Time Space Time ------------------ ------- ------ ------- ------ Earley ? ? ? O(n^3) LL(1) ? ? ? O(n) PEG (backtracking) ? ? ? ? Rats! ? ? ? O(n) ANTLR... . . . Cheers, David Mercer _______________________________________________ PEG mailing list PEG@lists.csail.mit.edu https://lists.csail.mit.edu/mailman/listinfo/peg