[ruby.parslet] LR parsing

Kaspar Schiess Mon, 22 Aug 2011 07:04:03 -0700

Hi all,

Here's what I answered Li Xiao (xli), who has been so kind as to 
formulate a patch for parslet that introduces left recursive parsing as 
per the OMeta paper.


I am rejecting the patch now, because it makes parslet slower. And I 
might reject it in the future (even if speed improves) because I think 
it takes parslet further away from its goals.

I would like to discuss changes with profound impact like this one on 
the mailing list first. Is left-recursive (LR) parsing really the one 
thing that's missing for real-world use of parslet? Why have I never 
missed it? How can we sacrifice 'parsing is like top-down methods 
calling each other' and still not descend into the realm of LR(k) 
parsers and the associated fuzziness?
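
To make the 'top-down methods calling each other' picture concrete, here 
is a minimal sketch in plain Ruby. It is not parslet's API; the class 
and method names are made up for illustration. Each rule is a method. A 
left-recursive rule such as expr <- expr '-' number would call itself 
before consuming any input, so the sketch uses repetition instead and 
builds the left-associative tree by hand.

```ruby
require 'strscan'

# Hypothetical hand-written recursive descent parser, not parslet code.
# Grammar: expr <- number ('-' number)*
class TinyParser
  def initialize(source)
    @scanner = StringScanner.new(source)
  end

  # A left-recursive formulation (expr <- expr '-' number) would call
  # expr again before consuming any input and recurse forever.
  # Repetition yields the same left-associative result without that.
  def expr
    left = number
    while @scanner.scan(/\s*-\s*/)
      left = [:sub, left, number]
    end
    left
  end

  # Each rule is an ordinary method that consumes input or fails.
  def number
    digits = @scanner.scan(/\d+/) or raise "expected number at #{@scanner.pos}"
    digits.to_i
  end
end

tree = TinyParser.new("10 - 3 - 2").expr
```

The control flow of the parse is simply the Ruby call stack, which is 
the property that left-recursion support would have to give up or work 
around.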

(I hope I don't give the impression that I don't like the patch as such: 
I do. I just would like to reject what it does to parslet.)

Please comment!

kaspar

Below is the answer I gave to the pull request:

---- github issue answer -----

Ok, so here goes:

without your patch:

     benchmarks/001-treetop.rb                 0.450000   0.040000   0.490000 (  0.489828)
     benchmarks/002-http_parser.rb             2.100000   0.070000   2.170000 (  2.178904)
     benchmarks/003-smalltalk.rb               7.310000   0.300000   7.610000 (  7.593408)

with the patch:

     benchmarks/001-treetop.rb                 0.440000   0.040000   0.480000 (  0.474451)
     benchmarks/002-http_parser.rb             2.550000   0.040000   2.590000 (  2.585775)
     benchmarks/003-smalltalk.rb               9.100000   0.160000   9.260000 (  9.254780)

smalltalk is the most complex example, doing a lot of branches and tests 
for every char. Also, it is the closest to parsing programming languages.
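
For a rough sense of scale, the relative change can be worked out from 
the total (user + system) column of the two runs. This is only a sketch 
of the arithmetic on the figures quoted here, not part of the 
parslet-benchmark project; the variable names are made up.

```ruby
# Totals (user + system, in seconds) copied from the two runs quoted above.
totals = {
  "001-treetop"     => { master: 0.49, patched: 0.48 },
  "002-http_parser" => { master: 2.17, patched: 2.59 },
  "003-smalltalk"   => { master: 7.61, patched: 9.26 },
}

# Percentage change of the patched run relative to master.
slowdown = totals.transform_values do |t|
  ((t[:patched] / t[:master] - 1) * 100).round(1)
end

slowdown.each { |name, pct| puts format("%-16s %+.1f%%", name, pct) }
```

On these numbers the patch costs roughly a fifth more time on the two 
larger benchmarks, while 001-treetop is essentially unchanged.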

We need to get closer to master before this patch can go in, regardless 
of all the other issues I have with LR-enabled PEG parsers. If possible, 
I'd like to shave a few seconds off master, since treetop still 
beats it to a pulp:

           size   parslet   treetop
            245:   0.040     0.010
           1203:   0.130     0.050
           2281:   0.270     0.090
           3218:   0.430     0.090
           4131:   0.470     0.160
           5109:   0.670     0.140
           6099:   0.750     0.230
           7105:   0.850     0.270
           8062:   1.010     0.220
           9073:   1.150     0.340
          10091:   1.280     0.370

Also, while I kind of like your implementation (it is elegant and 
encapsulated), I still don't like the idea of LR-enabling parslet. It 
makes parsers behave in a counterintuitive manner IMHO.

(Results achieved using the parslet-benchmark project, bin/run and 
bin/compare respectively - this message also went to the mailing list)

--------------
