On 11/03/2012 07:51 PM, Michael Lachmann wrote:
> Just continuing the performance question.
> I tried to write a simple wc in ragel.
> The main part looks like this:
> --
> %%{
> machine wc;
> word = (^space)+ > { nwords++; };
> line = (space* . (word . space+)* word?) &
> (^'\n')* > { nlines++; };
> main := (line . '\n')** (line)? ;
> }%%
> --
> (NOTE: I think this is still not totally correct. How do I handle a
> line without '\n' at the end of the file correctly?)
If you want a machine closer in spirit to the C code, try:
%%{
machine wc;
word = (^space)+ >{ nwords++; };
line = (any-'\n')* '\n' >{ nlines++; };
words = (space* word space)*;
main := words | line*;
}%%
Keep in mind that ragel runs the sub-machines "words" and "line*" in
parallel, much like your C loop, so the actions of both fire regardless
of which one reaches a final state.
(Also, this machine doesn't bother to make the end of the file a valid
final state.)
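For comparison, here is a minimal sketch of the kind of single-pass C
loop being discussed. The original C code is not quoted in this message,
so the names count_wc and struct wc_counts are hypothetical, and the
loop is only my guess at its shape. Both counters advance in the same
pass over the input, which is what running "words" and "line*" side by
side imitates.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result type; not taken from the original post. */
struct wc_counts {
    unsigned long nwords;
    unsigned long nlines;
};

/* Count words and newlines in one pass over buf[0..len). */
struct wc_counts count_wc(const char *buf, size_t len)
{
    struct wc_counts c = {0, 0};
    int in_word = 0;              /* 1 while inside a run of non-space bytes */

    for (size_t i = 0; i < len; i++) {
        unsigned char ch = (unsigned char)buf[i];

        if (ch == '\n')
            c.nlines++;           /* same job as the action in "line" */

        if (isspace(ch)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            c.nwords++;           /* same job as the entering action on "word" */
        }
    }
    return c;
}
```

Like wc, this counts newline characters, so a final line without a
trailing '\n' still contributes its words but not a line.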
> Did I make any errors in the ragel coding? Could I have done it more efficiently?
> Or will hand coding always be a fair bit faster than ragel-generated code?
Well, you wrote a hand-crafted parser with essentially two states and
transitions so simple that you don't even need to check which state
you're in before you switch() on the next character. The assembly is
probably a dozen bytes long, and it could probably run at close to
1 GB/s, except that your hard drive and RAM can only deliver 130 MB/s.
Once you get into any significantly complex state machine, you'll need
to track more carefully which actions to fire, and you'll end up at
about the same speed as what ragel generates.
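To illustrate that last point, here is a rough sketch of what a
hand-written scanner tends to look like once the current state matters.
The quoted-string rule is a made-up extension of the wc problem (text
inside double quotes stays part of one word even if it contains
spaces), and the names count_words_quoted and enum scan_state are
hypothetical. This is not actual ragel output, only the general
dispatch-on-state-then-character shape.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical states; the quoted-string rule is an illustration only. */
enum scan_state { ST_SPACE, ST_WORD, ST_QUOTED };

/* Count words in buf[0..len), treating a double-quoted run as part of
 * one word even when it contains spaces. */
unsigned long count_words_quoted(const char *buf, size_t len)
{
    enum scan_state st = ST_SPACE;
    unsigned long nwords = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char ch = (unsigned char)buf[i];

        switch (st) {             /* first dispatch: the current state */
        case ST_SPACE:
            if (ch == '"') {
                nwords++;         /* a quote starts a new word */
                st = ST_QUOTED;
            } else if (!isspace(ch)) {
                nwords++;         /* a plain character starts a new word */
                st = ST_WORD;
            }
            break;
        case ST_WORD:
            if (ch == '"')
                st = ST_QUOTED;   /* quote opens inside a word */
            else if (isspace(ch))
                st = ST_SPACE;
            break;
        case ST_QUOTED:
            if (ch == '"')
                st = ST_WORD;     /* closing quote; the word may continue */
            break;
        }
    }
    return nwords;
}
```

The same space character now means different things depending on the
state, so the state check can no longer be skipped, and the per-byte
work starts to resemble a generated machine.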