Re: [PEG] Parser Performance Benchmarks

Bob Foster Thu, 19 Oct 2006 20:02:13 -0700

I would write this table:

                      Generation        Parsing
Parse Method         Space    Time    Space    Time
------------------  -------  ------  -------  ------
Earley                 ?        ?       ?     O(n^3)
LL(1)                  ?        ?       ?     O(n)
PEG (packrat)          ?        ?       ?     O(n)
LL(*)                  ?        ?       ?     ?


                      Generation        Parsing
Implementations      Space    Time    Space    Time
------------------  -------  ------  -------  ------
Pappy (PEGp)           ?        ?       ?     O(n) k=?
Rats! (PEGp)           ?        ?       ?     O(n) k=2
ANTLR (LL(*))          ?        ?       ?     O(n) k=1
JavaCC (LL(*)?)        ?        ?       ?     O(n) k=?

In the Parse Method table, the numbers would be a formal complexity measure, which says nothing about the constant factor. I don't know that generation time has even been studied formally.

In the Implementations table, the numbers would be based on measurements using a set of sample grammars and inputs, which would yield a complexity verification and, equally important, the constant factor.
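
As a rough illustration of how measurements could yield both a complexity check and a constant factor, here is a minimal sketch in Python. It assumes parse time follows t = c * n^k and fits c and k by least squares on the log-log data. The names fit_power_law and measure, and the parse / make_input callables, are hypothetical placeholders, not part of any parser generator mentioned here.

```python
import math
import time


def fit_power_law(sizes, times):
    """Fit t = constant * n**exponent by least squares on log-log data.

    Returns (constant, exponent). All sizes and times must be > 0.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx
    constant = math.exp(mean_y - exponent * mean_x)
    return constant, exponent


def measure(parse, make_input, sizes, repeats=5):
    """Best-of-`repeats` wall-clock time of parse(make_input(n)) per size.

    `parse` and `make_input` are supplied by the caller (assumed here);
    input construction is kept outside the timed region.
    """
    results = []
    for n in sizes:
        text = make_input(n)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            parse(text)
            best = min(best, time.perf_counter() - start)
        results.append(best)
    return results
```

A fitted exponent near 1 would support an O(n) claim for that grammar and input family, and the fitted constant is the figure that lets two linear-time implementations be compared. Inputs need to be large enough that the measured times are nonzero, since the fit takes logarithms.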

Note that e.g. LL(*) which provides infinite lookahead might have a formal complexity quite a bit higher than O(n), as would PEG without packrat. In this case, the actual complexity in real grammars might be considerably less than the formal.
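
To show why PEG without packrat can exceed O(n), here is a small sketch in Python that counts rule invocations for a toy grammar with and without memoization. The grammar and the function count_rule_calls are invented for this illustration and are not taken from Pappy, Rats!, or any other tool. The grammar is deliberately written so that each alternative re-parses the same prefix.

```python
def count_rule_calls(text, memoize):
    """Recognise `text` with a toy PEG; return (matched, rule_calls).

    Toy grammar (an assumption for this sketch):
        E <- T "+" E / T "-" E / T
        T <- "(" E ")" / "a"
    With memoize=True, each (rule, position) result is cached, which is
    the packrat idea; with memoize=False it is plain backtracking.
    """
    memo = {}
    calls = 0

    def apply(rule, pos):
        nonlocal calls
        key = (rule, pos)
        if memoize and key in memo:
            return memo[key]
        calls += 1  # count only real evaluations, not cache hits
        result = rule(pos)
        if memoize:
            memo[key] = result
        return result

    def lit(ch, pos):
        # Match one literal character; return the new position or None.
        if pos < len(text) and text[pos] == ch:
            return pos + 1
        return None

    def expr(pos):
        # Each alternative starts by parsing T at the same position.
        for op in "+-":
            after_t = apply(term, pos)
            if after_t is not None:
                after_op = lit(op, after_t)
                if after_op is not None:
                    end = apply(expr, after_op)
                    if end is not None:
                        return end
        return apply(term, pos)

    def term(pos):
        inner = lit("(", pos)
        if inner is not None:
            after_e = apply(expr, inner)
            if after_e is not None:
                end = lit(")", after_e)
                if end is not None:
                    return end
        return lit("a", pos)

    end = apply(expr, 0)
    return end == len(text), calls


nested = "(" * 5 + "a" + ")" * 5
print(count_rule_calls(nested, memoize=False))
print(count_rule_calls(nested, memoize=True))
```

On this input, each extra level of parentheses roughly triples the work of the unmemoized run, while the memoized run evaluates each rule at most once per position. Most real grammars are not this adversarial, which fits the point that actual complexity may be well below the formal bound.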

But it's very hard to compare apples to apples, as it is rare that two implementations accept the same grammar.


Bob

David Mercer wrote:
> Was reading Terence's web-log entry of today
> (http://www.antlr.org/blog/antlr3/lookahead.tml), which contained the
> following paragraph:
>
>
>>The extra power of GLR and PEGs comes at the cost of time and space.
>>Grimm's experiments show SDF2 (traditional GLR) and Elkhound are roughly
>>7 times slower than a more traditional LL(k) parser generator such as
>>JavaCC at parsing Java source code. Rats! is currently about 2 times
>>slower than JavaCC. The extra machinery beyond the LR and LL foundations
>>slows down parsing for even deterministic languages like Java.
>
>
> Has anyone come up with a series of benchmark tests for evaluating parsers.
> For instance, a set of grammars and language examples which are used in
> determining which parsing methods are faster with different kinds of inputs.
> Am thinking we might have examples of both programming languages (C, Java)
> and data languages (XML, HTML, X12).  Then we would be able to have analyses
> like, "for simple, well-behaved languages like A and B, parser X performs
> better; but when the language becomes more complex, such as languages C and
> D, parser Y performs better."  Then we could compare things like not only
> speed, but memory usage, and possibly even compile them all on a single
> platform and have a competition.
>
> If not, what examples do you use to test your parser?  How do you measure
> speed and memory usage (if you do measure them)?  How do you check
> correctness?
>
> Would help in the decision-making in deciding which parsing model to use,
> for instance.  (E.g., should I use ANTLR or Rats! as a base for my
> application?)  For that matter, a nice theoretical performance comparison
> would be helpful, too.  For instance, I seem to recall Earley's algorithm
> has time complexity of about O(n^3) where n is the size of the input.  Not
> sure what it's space complexity was.  I think LL(k) is good ol' O(n).  Not
> sure what others are.  Am thinking a table which shows preprocessing
> (constructing parse tables, etc.) and parsing space and time complexities as
> functions of grammar and input sizes.  Here's my beginning:
>
>                      Preprocessing       Parsing
> Parse Method         Space    Time    Space    Time
> ------------------  -------  ------  -------  ------
> Earley                 ?        ?       ?     O(n^3)
> LL(1)                  ?        ?       ?     O(n)
> PEG (backtracking)     ?        ?       ?       ?
> Rats!                  ?        ?       ?     O(n)
> ANTLR...
> . . .
>
> Cheers,
>
> David Mercer
>
>
> _______________________________________________
> PEG mailing list
> PEG@lists.csail.mit.edu
> https://lists.csail.mit.edu/mailman/listinfo/peg
>
>



