I have a fairly complex parslet parser that has been working great so
far. Input contains multiple "documents" and each document has
multiple sections with multiple subsections etc., so it's fairly
nested.

I'm having an issue now with input that is very large: one document has
a section with 5,000+ subsections, and the total input runs to over
500,000 lines.

My problem is that parslet consumes all the available RAM and CPU, so
I'm looking for a way to parse large input without killing the machine
it runs on. I can split the input into subsections and parse each one
separately, and that sort of works, but I'd rather not if I don't have
to.
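For reference, the split-and-parse workaround I mean looks roughly like
this (a minimal sketch; the delimiter regex and the SubsectionParser
grammar are hypothetical stand-ins for my real ones). The point is that
each `parse` call only ever sees one small chunk, so parslet's
backtracking state stays bounded:

```ruby
# Break the input into per-subsection chunks so the parser only handles
# one small piece at a time. The "== " line-start delimiter here is a
# made-up example, not my actual format.
def each_subsection(input, delimiter: /^(?=== )/)
  # Zero-width lookahead keeps the delimiter attached to its chunk;
  # reject(&:empty?) drops any empty leading field from the split.
  input.split(delimiter).reject(&:empty?)
end

# Usage (SubsectionParser being a hypothetical parslet grammar that
# parses a single subsection):
#   parser = SubsectionParser.new
#   trees  = each_subsection(big_input).map { |chunk| parser.parse(chunk) }
```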

I'm assuming that parslet is holding on to information so that it can
backtrack when a rule doesn't match. Does anyone have advice on how to
structure rules, or hint to parslet that it doesn't need to backtrack
past a certain point, so it can free the state it's holding?

-mj
