On Mon, Aug 11, 2014 at 5:52 PM, Marcus Brinkmann <
marcus.brinkm...@ruhr-uni-bochum.de> wrote:

> As I wrote at the time, it seems difficult (at least for me!) to come up
> with a proper example that actually shows non-linear behaviour on arbitrarily
> long input (rather than a small example grammar that only accepts a finite
> set of sentences).


As you know, my work involves parsing hundreds of thousands of lines of
COBOL, VB6, Natural, Java, etc. (it's legacy rescue).

I've done several experiments with pruning the memoization cache less
aggressively upon cut (they are in branches of the Grako repository, if
anyone is interested), and all of them had a negative impact on performance
(a rough sketch of what pruning upon cut means follows the list), probably
because:

   1. Performance degrades when the pruning algorithm is complex.
   2. Performance degrades as memory consumption increases (by pruning
   less).
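To make "pruning upon cut" concrete, here is a minimal sketch of the idea.
This is not Grako's actual code; the class and method names are invented for
illustration. Once a cut commits the parser at some input position, memoized
results for earlier positions are dropped:

    class MemoCache:
        """Packrat memoization table keyed by (rule name, input position)."""

        def __init__(self):
            self._memo = {}

        def get(self, rule, pos):
            return self._memo.get((rule, pos))

        def put(self, rule, pos, result):
            self._memo[(rule, pos)] = result

        def prune_before(self, pos):
            # Drastic pruning: once a cut commits the parse at `pos`, the
            # parser will not backtrack past it, so every entry for an
            # earlier input position is discarded.  "Pruning less" keeps
            # some of those entries around, trading memory for the chance
            # of a later cache hit.
            self._memo = {k: v for k, v in self._memo.items() if k[1] >= pos}

The experiments mentioned above essentially vary how much of that table
survives a cut.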

To empirically prove the linearity of Grako's drastic pruning approach I'd
have to find an example that is small enough that #2 doesn't kick in, and I'm
not interested, because I have, for instance, a 46 KLOC VB6 file to take care
of.

I'm even less interested now that Marcus is making progress with Grako++ (
https://github.com/lambdafu/grakopp).

The theoretical O(n) performance of PEG parsers implies worst-case O(n)
space. Grako's memoization uses Python's O(1) implementation of hashmaps,
yet my experience is that the O(1) is only theoretical, probably because
the multi-level caches of modern computers take many more misses as the
hashmap grows. Perhaps some experiments on performance as a function of
cache size are in order.
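A rough micro-benchmark along those lines (just a sketch, not tied to Grako)
would time random lookups in dicts of increasing size; even though each
lookup stays O(1), the per-lookup cost should rise once the dict outgrows the
CPU caches:

    import random
    import time

    def avg_lookup_ns(n, probes=200_000):
        # Build a dict with n entries and time `probes` random lookups.
        d = {i: i for i in range(n)}
        keys = [random.randrange(n) for _ in range(probes)]
        start = time.perf_counter()
        total = 0
        for k in keys:
            total += d[k]
        elapsed = time.perf_counter() - start
        return elapsed / probes * 1e9

    for n in (10_000, 100_000, 1_000_000, 2_000_000):
        print(f"{n:>9} entries: {avg_lookup_ns(n):.1f} ns per lookup")

Running something like that with dict sizes comparable to the memoization
cache of a large source file would tell whether cache misses are the culprit.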

Cheers,

-- 
Juancarlo *Añez*