In message <[EMAIL PROTECTED]>, "Robert Edgar" writes:
>which has got me up to about 450 lines a second, but that is still slow.
>I am still using the readline, but using readline and a string tokenizer I
>can get 10x this speed, which seems to me to indicate that the readline is
>not really a bottleneck but that the regex still is. Or is regex not really
>designed for this sort of processing, and would I be better off just doing
>a simple hard-coded parse of the string?

You'll definitely do better than you are now by reading the entire input
into a char array, but now that you bring it up, yes, in general, any
tokenization or parsing task that can be done in an application-specific
manner without regular expressions will be a good bit faster than with
regular expressions (and this is far more true with the performance
enhancements in JDK 1.4).  Right now you're probably paying a good deal
of overhead related to the saved groups.  If I had been paying more
attention I would have noticed what you were doing and suggested just
using StreamTokenizer (I usually suggest that people avoid using regular
expressions when they don't need them).  Anyway, it's kind of like when
people use Util.substitute() when StringBuffer.replace() will do.  Although
now that there's a String.replaceAll() and split() supporting regular
expressions in JDK 1.4 ... but that would be going off topic.
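
As a rough illustration of the two regex-free alternatives mentioned
above, here is a minimal sketch. It assumes the input is nothing more than
whitespace-separated fields, which may not match the actual data; the class
name, method names, and sample input are all made up for the example.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; the field format (whitespace-separated) is an assumption.
public class TokenizeSketch {

    // Splits whitespace-separated fields with StreamTokenizer, no regex involved.
    public static List<String> withStreamTokenizer(String input) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        st.resetSyntax();            // drop the default number, quote and comment handling
        st.wordChars(33, 255);       // everything above space is part of a field
        st.whitespaceChars(0, ' ');  // control characters and space separate fields
        List<String> fields = new ArrayList<String>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    fields.add(st.sval);
                }
            }
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new IllegalStateException(e);
        }
        return fields;
    }

    // Hand-coded scan over a char array that holds the entire input.
    public static List<String> withCharScan(char[] buf) {
        List<String> fields = new ArrayList<String>();
        int i = 0;
        while (i < buf.length) {
            while (i < buf.length && buf[i] <= ' ') i++;  // skip separators
            int start = i;
            while (i < buf.length && buf[i] > ' ') i++;   // consume one field
            if (i > start) {
                fields.add(new String(buf, start, i - start));
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = "alpha 42 gamma\nbeta 7 delta\n";
        System.out.println(withStreamTokenizer(sample));
        System.out.println(withCharScan(sample.toCharArray()));
    }
}
```

Both methods return the same list of fields for the sample input. The
char-array version avoids per-line String allocation from readLine, which
is why it tends to pay off on large inputs, but it is only worth the extra
code if profiling shows the tokenizer is still too slow.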

daniel


