On 12/03/2012 10:18 PM, Solomon Gibbs wrote:
On Mon, Dec 3, 2012 at 9:03 PM, Tim Goddard <t...@goddard.net.nz> wrote:
You shouldn't need to do any form of lookahead there. Since the behaviour
for the general case is just to skip, why not try to parse the header and
use the error action to skip to the end if it fails?

This is a general question -- the JPEG example just happens to be a
simple illustration.

If I don't do lookahead, then every APP0 segment type (JFIF, EXIF,
unknown, etc.) will be entered; I will then have to do everything in
finishing transitions or implement logic to roll back everything
except the segment that succeeds. This also rules out the possibility
of seeking the input to a different location based on the content of
the segment.

Generally, it seems like lookahead should produce a simpler, faster machine.

Unlikely. As soon as you need to consider multiple characters, you need to coordinate a buffering strategy between the library and the host program, and you have more and larger tables to work with. You would also need a more complicated process for pausing the machine while you fetch more input. As it is, Ragel's state is a single integer; it can process every character in a buffer and never needs to backtrack. (It also doesn't require a callback mechanism for fetching more input, which I think is great!)
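
To illustrate the single-integer point, here is a minimal hand-written sketch in C. It is not Ragel output; the names jfif_feed, JFIF_START, JFIF_DONE and JFIF_ERROR are made up for this example. It matches the 5-byte "JFIF\0" identifier, and the only thing that has to survive between calls is one int, so the host program can hand over input in whatever chunks it happens to have:

```c
#include <stddef.h>

/* Hypothetical hand-written matcher standing in for generated code.
 * The entire machine state is one int: how many bytes of "JFIF\0"
 * have been matched so far (or JFIF_ERROR once a byte mismatches). */
enum { JFIF_ERROR = -1, JFIF_START = 0, JFIF_DONE = 5 };

/* Consume up to len bytes starting at p and return the new state.
 * Each byte is examined exactly once; nothing is buffered or re-read. */
static int jfif_feed(int cs, const unsigned char *p, size_t len)
{
    static const unsigned char id[5] = { 'J', 'F', 'I', 'F', 0 };
    const unsigned char *pe = p + len;

    for (; p != pe && cs != JFIF_ERROR && cs != JFIF_DONE; ++p)
        cs = (*p == id[cs]) ? cs + 1 : JFIF_ERROR;
    return cs;
}
```

The caller keeps cs between reads and calls jfif_feed again whenever more data arrives; there is no callback and no shared buffer to coordinate.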

There are lots of parser designs that use lookahead and only run actions for the parse tree that is actually found. However, parser algorithms generally run slower than state machines, which is why it is most common to use a scanner to produce tokens from characters and then parse the tokens.

Try a design where each possibility in the machine stores small bits of data in a struct, and then whichever finishing action happens first takes only the relevant bits from the struct and acts on them. Then use "fgoto" to prevent other finishing actions from running. If it gets too complicated, make the machine generate tokens for a parser and handle the complication there.
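
A rough sketch of that shape, written as plain C rather than a Ragel spec so it stands alone. Everything here (struct alt, struct app0, app0_feed, the skipping flag) is hypothetical and only simulates what the generated machine would do: two alternatives for an APP0 payload ("JFIF\0" plus two version bytes, "JFXX\0" plus one extension code byte) each write to their own scratch bytes, and whichever finishes first commits its data and sets skipping, which plays the role of fgoto to a skip state:

```c
#include <stddef.h>

#define ID_LEN 5

enum seg_kind { SEG_UNKNOWN, SEG_JFIF, SEG_JFXX };

/* One alternative: a 5-byte identifier followed by payload bytes. */
struct alt {
    const unsigned char *id; /* identifier, including the trailing NUL */
    int total;               /* identifier length plus payload length */
    int pos;                 /* bytes consumed so far, -1 once it has failed */
    unsigned char cap[2];    /* scratch: payload bytes seen so far */
};

struct app0 {
    struct alt jfif, jfxx;
    int skipping;            /* set by a finishing action, like "fgoto skip" */
    enum seg_kind kind;      /* committed results below */
    unsigned char major, minor, ext_code;
};

static void app0_init(struct app0 *s)
{
    static const unsigned char jfif_id[ID_LEN] = { 'J', 'F', 'I', 'F', 0 };
    static const unsigned char jfxx_id[ID_LEN] = { 'J', 'F', 'X', 'X', 0 };

    s->jfif.id = jfif_id; s->jfif.total = ID_LEN + 2; s->jfif.pos = 0;
    s->jfxx.id = jfxx_id; s->jfxx.total = ID_LEN + 1; s->jfxx.pos = 0;
    s->skipping = 0;
    s->kind = SEG_UNKNOWN;
    s->major = s->minor = s->ext_code = 0;
}

/* Advance one alternative by one byte; it only ever touches its own scratch. */
static void alt_step(struct alt *a, unsigned char c)
{
    if (a->pos < 0)
        return;
    if (a->pos < ID_LEN)
        a->pos = (c == a->id[a->pos]) ? a->pos + 1 : -1;
    else
        a->cap[a->pos++ - ID_LEN] = c;
}

static void app0_feed(struct app0 *s, const unsigned char *p, size_t len)
{
    for (size_t i = 0; i < len && !s->skipping; ++i) {
        alt_step(&s->jfif, p[i]);
        alt_step(&s->jfxx, p[i]);

        if (s->jfif.pos == s->jfif.total) {        /* JFIF finishing action */
            s->kind = SEG_JFIF;
            s->major = s->jfif.cap[0];
            s->minor = s->jfif.cap[1];
            s->skipping = 1;
        } else if (s->jfxx.pos == s->jfxx.total) { /* JFXX finishing action */
            s->kind = SEG_JFXX;
            s->ext_code = s->jfxx.cap[0];
            s->skipping = 1;
        } else if (s->jfif.pos < 0 && s->jfxx.pos < 0) {
            s->skipping = 1;                       /* unknown: commit nothing */
        }
    }
}
```

Nothing needs to be rolled back: a failed alternative simply leaves stale bytes in its own scratch, and only the finishing action that fires copies anything into the committed fields.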

That's the strategy that worked well for me.

-Mike

_______________________________________________
ragel-users mailing list
ragel-users@complang.org
http://www.complang.org/mailman/listinfo/ragel-users
