Hi all,
Here's what I answered Li Xiao (xli), who has been so kind as to
formulate a patch for parslet that introduces left recursive parsing as
per the OMeta paper.
I am rejecting the patch for now because it makes parslet slower. I
might also reject it in the future (even if speed improves), because I
think it takes parslet further away from its goals.
I would like to discuss changes with a profound impact like this one on
the mailing list first. Is left recursive (LR) parsing really the one
thing that's missing for real-world use of parslet? Why have I never
missed it? How can we sacrifice the idea that 'parsing is like top-down
methods calling each other' and still not descend into the realm of
LR(k) parsers and the associated fuzziness?
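
To make the 'top-down methods calling each other' point concrete, here is
a minimal hand-written sketch in plain Ruby. It is not parslet code and not
the patch: TinyParser, LeftRecursionError and the depth limit of 100 are
made up for illustration. It shows why a naive top-down method for the left
recursive rule expr <- expr '-' num / num never consumes input, and the
usual rewrite into a loop:

```ruby
# Hypothetical illustration only: a tiny recursive descent parser,
# where each grammar rule is a method that calls other rule methods.
class LeftRecursionError < StandardError; end

class TinyParser
  def initialize(input)
    @tokens = input.scan(/\d+|-/) # split into numbers and '-' signs
    @pos = 0
  end

  # Naive translation of: expr <- expr '-' num / num
  # The method calls itself before consuming any token, so @pos never
  # advances. The depth guard (an arbitrary limit) stops the recursion.
  def expr_left_recursive(depth = 0)
    if depth >= 100
      raise LeftRecursionError, "expr re-entered #{depth} times at position #{@pos}"
    end
    left = expr_left_recursive(depth + 1)
    @pos += 1 # would skip '-', but this line is never reached
    left - num
  end

  # Usual rewrite: expr <- num ('-' num)*
  # Same language, still left associative, and every step consumes input.
  def expr_iterative
    value = num
    while @tokens[@pos] == "-"
      @pos += 1
      value -= num
    end
    value
  end

  # num <- [0-9]+
  def num
    tok = @tokens[@pos]
    raise ArgumentError, "expected a number at token #{@pos}" unless tok && tok.match?(/\A\d+\z/)
    @pos += 1
    tok.to_i
  end
end

puts TinyParser.new("10-3-2").expr_iterative
```

The loop version keeps the one-rule-one-method reading of the grammar. A
left recursion extension has to change what a rule call means in order to
make the first version terminate, which is the trade-off in question.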
(I hope I don't give the impression that I don't like the patch as such:
I do. I just would like to reject what it does to parslet.)
Please comment!
kaspar
Below is the answer I gave to the pull request:
---- github issue answer -----
Ok, so here goes:
without your patch:
                                   user     system      total        real
benchmarks/001-treetop.rb      0.450000   0.040000   0.490000 (  0.489828)
benchmarks/002-http_parser.rb  2.100000   0.070000   2.170000 (  2.178904)
benchmarks/003-smalltalk.rb    7.310000   0.300000   7.610000 (  7.593408)
with the patch:
                                   user     system      total        real
benchmarks/001-treetop.rb      0.440000   0.040000   0.480000 (  0.474451)
benchmarks/002-http_parser.rb  2.550000   0.040000   2.590000 (  2.585775)
benchmarks/003-smalltalk.rb    9.100000   0.160000   9.260000 (  9.254780)
The smalltalk benchmark is the most complex example, doing a lot of
branches and tests for every character. It is also the closest to parsing
a programming language.
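
To put the two runs side by side, here is a small Ruby sketch that turns
the figures above into a relative change. It assumes the third number on
each benchmark line is total CPU time in seconds (user + system), as in
Ruby's Benchmark output; the TOTALS hash and percent_change helper are
made up for this illustration.

```ruby
# Third-column figures copied from the two benchmark runs above
# (assumed to be total CPU seconds).
TOTALS = {
  "001-treetop"     => { master: 0.49, patched: 0.48 },
  "002-http_parser" => { master: 2.17, patched: 2.59 },
  "003-smalltalk"   => { master: 7.61, patched: 9.26 },
}

# Relative change in percent, rounded to one decimal place.
# Positive means the patched run was slower than master.
def percent_change(before, after)
  ((after - before) / before * 100).round(1)
end

TOTALS.each do |name, t|
  puts format("%-16s %+.1f%%", name, percent_change(t[:master], t[:patched]))
end
```

On these numbers the patched run is about 19% slower on http_parser and
about 22% slower on smalltalk, while treetop is unchanged within noise.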
The patched version needs to get closer to master's speed before this
patch can go in, regardless of all the other issues I have with
LR-enabled PEG parsers. If possible, I'd like to shave a few seconds off
master as well, since treetop still beats it to a pulp:
  size   parslet   treetop
   245:    0.040     0.010
  1203:    0.130     0.050
  2281:    0.270     0.090
  3218:    0.430     0.090
  4131:    0.470     0.160
  5109:    0.670     0.140
  6099:    0.750     0.230
  7105:    0.850     0.270
  8062:    1.010     0.220
  9073:    1.150     0.340
 10091:    1.280     0.370
Also, while I kind of like your implementation (it is elegant and
encapsulated), I still don't like the idea of LR-enabling parslet. It
makes parsers behave in a counterintuitive manner IMHO.
(Results obtained using the parslet-benchmark project, with bin/run and
bin/compare respectively. This message also went to the mailing list.)
--------------