On 17.08.2009, at 16:29, Markus Wiederkehr wrote:
If you are interested I could write a regex-based version which will not
reintroduce the double space bug.
I'd use the regex to extract charset, encoding and encoded string in one
go. I think it will be at least as fast as the current method.
However, java.util.regex requires Java 1.4; if that's a no-go I won't
bother.
Regex wouldn't be a problem since Mime4j already depends on Java 5.
I'm not sure how a regex solution could compete with a few indexOf and
substring calls in terms of speed though. I mean Pattern.compile()
alone has to build a DFA from the input string.
That's why the Pattern.compile() call is only executed once when the
class is loaded:
final static Pattern regex = Pattern.compile("...");
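For illustration, here is a minimal sketch of what such a precompiled
pattern might look like. The actual pattern string is elided above, so the
one below is only an assumption based on the RFC 2047 encoded-word layout
(=?charset?encoding?text?=). The class and method names are hypothetical
and are not taken from Mime4j.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EncodedWordSketch {
    // Compiled once when the class is loaded, so the compilation cost
    // is not paid on every call.
    // Assumed pattern: =?charset?encoding?encoded-text?=
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?([BbQq])\\?([^?]*)\\?=");

    /**
     * Returns {charset, encoding, encodedText} for the first encoded-word
     * found in the input, or null if there is none.
     */
    static String[] extractFirst(String input) {
        Matcher m = ENCODED_WORD.matcher(input);
        if (!m.find()) {
            return null;
        }
        // All three parts come out of a single match.
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = extractFirst("Subject: =?ISO-8859-1?Q?Andr=E9?= Pirard");
        System.out.println(parts[0] + " / " + parts[1] + " / " + parts[2]);
    }
}
```

This sketch only extracts the three fields. Decoding the text and handling
whitespace between adjacent encoded-words are left out.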
From what I can see, the indexOf calls seem to be quite optimized, so
I do not expect a noticeable speed improvement by switching to regular
expressions.
I'd like to give it a try by refactoring and fixing the existing code.
Fine with me!