Hi Eric,

> For information, removing the "str('<%').absent?" from the :text rule and
> parsing the same string takes about 0.5 seconds.
>
> For erb like parsers, having long plain text is quite usual, and I was
> wondering what would be the most efficient way to parse it?

I am going to answer this on two levels. Both levels are implemented in 
the experiment 'optimizer.rb' in master. [1]

1) Currently, what I would suggest is to capture the pattern

   (str('<%').absent? >> any).repeat(1)

in a custom parser atom and use that instead of the above. This is 
illustrated by 'AbsentParser' in the referenced source code and makes 
the ERB parser already quite a bit faster.
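
To show why this helps: the pattern above runs a negative lookahead for 
every single character of plain text, whereas a custom atom can search 
for the delimiter once and consume everything before it in one step. 
Here is a rough sketch of that scanning idea in plain Ruby. Note that 
'consume_until' is a made-up helper for illustration only; it is neither 
parslet API nor the 'AbsentParser' from the experiment.

```ruby
# Hypothetical helper in plain Ruby; it is NOT parslet's API and not the
# AbsentParser from optimizer.rb. It only illustrates the scanning idea.
def consume_until(input, pos, stop = '<%')
  hit = input.index(stop, pos)     # one search instead of a lookahead per character
  stop_at = hit || input.length    # no delimiter left: the rest is plain text
  return nil if stop_at == pos     # repeat(1) needs at least one character
  [input[pos...stop_at], stop_at]  # matched text and the new position
end

text, pos = consume_until("Hello <%= name %>!", 0)
```

A real atom would wrap the same logic in whatever interface parslet 
expects from its atoms; see the referenced source for that part.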

In fact, there is really no limit to what custom parser atoms can do - 
parslet is exceptionally easy to extend that way. Unfortunately no one 
is doing it.

2) In the near future, the pattern will be to write an optimizer that 
optimizes the parser ahead of time. Using the optimized parser and the 
'AbsentParser' atom from solution 1) will then make parsing faster 
without making the parser description more complex.
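
As a rough sketch of what such an ahead-of-time pass could look like: 
walk the grammar tree once, and wherever the slow pattern shows up, swap 
in the fast node. The tree below is a toy representation built from 
nested arrays, and ':scan_until' is an invented node name; none of this 
is how parslet stores its atoms, and it is not the code in 
'optimizer.rb'. It needs Ruby 2.7 or later for pattern matching.

```ruby
# Toy grammar tree as nested arrays; NOT parslet's internal representation.
# :scan_until is a hypothetical node standing in for a custom atom.
def optimize(node)
  case node
  in [:repeat, 1, [:seq, [:absent, [:str, String => stop]], [:any]]]
    # (str(stop).absent? >> any).repeat(1) becomes a single fast node
    [:scan_until, stop]
  in [Symbol => tag, *children]
    # Any other node: keep it, but optimize its subtrees
    [tag, *children.map { |c| c.is_a?(Array) ? optimize(c) : c }]
  end
end

text = [:repeat, 1, [:seq, [:absent, [:str, '<%']], [:any]]]
erb  = [:repeat, 0, [:alt, text, [:seq, [:str, '<%'], [:str, '%>']]]]
optimized = optimize(erb)
```

The grammar description itself stays as it was written; only the tree 
handed to the parsing engine changes.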

Maybe someone will even jump at the possibility of implementing a 
parslet-optimizer that generalizes on the idea & implements a standard 
set of optimisations.


1) is available to programmers today, even with the released version of 
parslet. 2) will be implemented and available in the next version (or 
the one after that, depending on other factors).

Please note that 2) can be implemented with almost only the public and 
the semipublic API of parslet. Nothing stops the motivated programmer 
from doing it him/herself.

And in the far future, we might even compile down to C... Either way: 
Here's your answer. Hope it makes any sense at all.

regards,
kaspar

[1] https://github.com/kschiess/parslet/blob/872611321c4af390c8b89e5d54b24613b4280fba/experiments/optimizer.rb

