On 11/03/2012 07:51 PM, Michael Lachmann wrote:
Just continuing the performance question.
I tried to write a simple wc in ragel.

The main part looks like this:
--
%%{
   machine wc;
   word = (^space)+ > { nwords++; };
   line = (space* . (word . space+)* word?) &
      (^'\n')*  > { nlines++; };
     main := (line . '\n')** (line)?  ;
}%%
--
(NOTE: I think this is still not totally correct. How do I handle a
line without '\n' at the end of the file correctly?)

If you want a machine closer in spirit to the C code, try:
%%{
    machine wc;
    word = (^space)+ >{ nwords++; };
    line = (any-'\n')* '\n' >{ nlines++; };
    words = (space* word space)*;
    main := words | line*;
}%%

Keep in mind that ragel runs the sub-machines "words" and "line*" in parallel, much like your C loop, so all actions occur regardless of which one reaches a final state.

(Also, this machine doesn't bother to make the end of the file a valid final state.)

Did I make any mistakes in the ragel code? Could I have written it more efficiently?
Or will hand-written code always be a fair bit faster than ragel-generated code?

Well, you wrote a hand-crafted parser with essentially two states, and transitions so simple that you don't even need to check which state you're in before you switch() on the next character. The assembly is probably a dozen bytes long, and it would probably run at closer to 1 GB/s, except that your hard drive and RAM can only deliver 130 MB/s.
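
For readers who don't have the original C program in front of them (it is not included in this message), here is a minimal sketch of what such a two-state counter might look like. The function name count_words_lines and its signature are made up for illustration. The only state is a single "inside a word?" flag, and both counters are updated in the same pass over the buffer.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the original post: counts words
 * and newline characters in buf[0..len) in a single pass. */
static void count_words_lines(const char *buf, size_t len,
                              size_t *nwords, size_t *nlines)
{
    int in_word = 0;   /* the whole "state machine": 0 = between words, 1 = inside one */

    *nwords = 0;
    *nlines = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];

        if (c == '\n')
            (*nlines)++;          /* line counting ignores the word state entirely */

        if (isspace(c)) {
            in_word = 0;          /* any whitespace ends the current word */
        } else if (!in_word) {
            in_word = 1;          /* first non-space character starts a new word */
            (*nwords)++;
        }
    }
}
```

Like wc, this counts newline characters rather than lines, so a final line with no trailing '\n' still contributes its words but does not bump the line count.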

Once you get into any significantly complex state machine, you'll need to track more carefully which actions to fire, and you'll end up at about the same speed as what ragel generates.
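
To make the contrast concrete, here is a rough sketch of the same counter written as an explicit state machine, where each character is dispatched on the current state before any action fires. This is not actual ragel output. It only borrows ragel's conventional variable names (cs, p, pe), and the names wc_state, wc_counts, and wc_explicit are invented for the example. With two states the bookkeeping is trivial, but this is the part that grows as the machine does.

```c
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum wc_state { ST_SPACE, ST_WORD };

struct wc_counts {
    size_t nwords;
    size_t nlines;
};

static struct wc_counts wc_explicit(const char *p, size_t len)
{
    struct wc_counts n = { 0, 0 };
    enum wc_state cs = ST_SPACE;      /* current state */
    const char *pe = p + len;         /* one past the end of the input */

    for (; p < pe; p++) {
        /* Same character class as isspace() in the "C" locale. */
        int is_sp = (*p == ' '  || *p == '\t' || *p == '\n' ||
                     *p == '\r' || *p == '\v' || *p == '\f');

        switch (cs) {                 /* look at the state first ... */
        case ST_SPACE:
            if (!is_sp) {             /* ... then decide which action fires */
                n.nwords++;           /* entering a word */
                cs = ST_WORD;
            }
            break;
        case ST_WORD:
            if (is_sp)
                cs = ST_SPACE;        /* leaving a word, no action */
            break;
        }

        if (*p == '\n')
            n.nlines++;               /* fires in every state */
    }
    return n;
}
```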


_______________________________________________
ragel-users mailing list
ragel-users@complang.org
http://www.complang.org/mailman/listinfo/ragel-users
