On 12/03/2012 10:18 PM, Solomon Gibbs wrote:
> On Mon, Dec 3, 2012 at 9:03 PM, Tim Goddard <t...@goddard.net.nz> wrote:
>> You shouldn't need to do any form of lookahead there. Since the behaviour
>> for the general case is just to skip, why not try to parse the header and
>> use the error action to skip to the end if it fails?
>
> This is a general question -- the JPEG example just happens to be a
> simple illustration.
>
> If I don't do lookahead, then every APP0 segment type (JFIF, EXIF,
> unknown, etc.) will be entered; I will then have to do everything in
> finishing transitions or implement logic to roll back everything
> except the segment that succeeds. This also rules out the possibility
> of seeking the input to a different location based on the content of
> the segment.
>
> Generally, it seems like lookahead should produce a simpler, faster machine.

Unlikely. As soon as you need to consider multiple characters, you need
to coordinate a buffering strategy between the library and the host
program, and you have more and larger tables to work with. Pausing the
machine while you fetch more input also becomes more complicated.
As it is, Ragel's state is a single integer and it can process every
character in a buffer and never needs to backtrack. (And, it doesn't
require a callback mechanism for fetching more input, which I think is
great!)
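
To make the "single integer" point concrete, here is a minimal hand-written
sketch in C. It is not Ragel output. The names feed, JFIF_ID, ST_MATCHED and
ST_FAILED are made up for illustration; in Ragel-generated code the
corresponding state variable is cs. The host program hands over whatever
chunk it happens to have, and the only thing carried between calls is the
returned int.

```c
#include <stddef.h>

/* Hypothetical matcher for the 5-byte identifier "JFIF\0".
 * The whole resumable state is one int: the number of identifier
 * bytes matched so far (0..4), ST_MATCHED, or ST_FAILED. */
enum { ID_LEN = 5, ST_MATCHED = 5, ST_FAILED = -1 };

static const unsigned char JFIF_ID[ID_LEN] = { 'J', 'F', 'I', 'F', 0 };

/* Feed one chunk and return the new state. Every byte is looked at
 * once, in order; nothing from earlier chunks is buffered. */
int feed(int state, const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (state == ST_MATCHED || state == ST_FAILED)
            break;                  /* terminal states ignore the rest */
        if (buf[i] == JFIF_ID[state])
            state++;                /* transition to the next state */
        else
            state = ST_FAILED;      /* error transition */
    }
    return state;
}
```

The caller starts with state 0 and calls feed() once per chunk, for example
first with "JF" and later with "IF\0". No callback is needed to fetch more
input, because the machine simply returns when the chunk is exhausted.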
There are lots of parser designs that use lookahead and only run
actions for the parse tree that is actually found. However, parser
algorithms generally run slower than state machines, which is why it is
most common to use a scanner to produce tokens from characters and then
parse the tokens.
Try a design where each possibility in the machine stores small bits of
data in a struct, and then whichever finishing action happens first
takes only the relevant bits from the struct and acts on them. Then use
"fgoto" to prevent other finishing actions from running. If it gets too
complicated, make the machine generate tokens for a parser and handle
the complication there.
That's the strategy that worked well for me.
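
A rough sketch of that struct-based design, again hand-written in C rather
than generated by Ragel. Everything named here (struct scratch,
struct seg_info, classify) is hypothetical. It assumes the payload starts
with either "JFIF\0" followed by two version bytes, or "JFXX\0" followed by
one extension-code byte. Each alternative writes only its own scratch
fields, and the first finishing action copies out just those fields and
then leaves the loop, which plays the role of fgoto here. In this toy case
the two identifiers diverge at the third byte, so at most one branch is
still alive by the time a finishing action runs.

```c
#include <stddef.h>

enum seg_kind { SEG_UNKNOWN = 0, SEG_JFIF, SEG_JFXX };

/* Scratch area: each alternative stores its own small bits of data. */
struct scratch {
    unsigned char jfif_major, jfif_minor; /* written by the JFIF branch */
    unsigned char jfxx_ext_code;          /* written by the JFXX branch */
};

struct seg_info {
    enum seg_kind kind;
    int major, minor;   /* valid only when kind == SEG_JFIF */
    int ext_code;       /* valid only when kind == SEG_JFXX */
};

struct seg_info classify(const unsigned char *p, size_t len)
{
    static const unsigned char jfif[5] = { 'J', 'F', 'I', 'F', 0 };
    static const unsigned char jfxx[5] = { 'J', 'F', 'X', 'X', 0 };
    struct scratch s = { 0, 0, 0 };
    struct seg_info out = { SEG_UNKNOWN, 0, 0, 0 };
    int jfif_alive = 1, jfxx_alive = 1;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = p[i];
        if (i < 5) {
            /* Both alternatives consume the same byte, no lookahead. */
            if (c != jfif[i]) jfif_alive = 0;
            if (c != jfxx[i]) jfxx_alive = 0;
            if (!jfif_alive && !jfxx_alive)
                break;              /* error case: caller skips the segment */
        } else if (i == 5) {
            if (jfif_alive)
                s.jfif_major = c;
            if (jfxx_alive) {
                s.jfxx_ext_code = c;
                /* Finishing action for JFXX: take only its own field. */
                out.kind = SEG_JFXX;
                out.ext_code = s.jfxx_ext_code;
                break;              /* stop other finishing actions */
            }
        } else {
            /* i == 6 is reachable only while the JFIF branch is alive. */
            s.jfif_minor = c;
            out.kind = SEG_JFIF;
            out.major = s.jfif_major;
            out.minor = s.jfif_minor;
            break;
        }
    }
    return out;
}
```

Nothing has to be rolled back when a branch dies: its scratch fields are
simply never read.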
-Mike