Re: Java > Scala

Dmitry Olshansky Fri, 02 Dec 2011 13:45:25 -0800

On 03.12.2011 1:08, Marco Leise wrote:

Cool, thx for your answers. The source code for OpenJDK can be
downloaded if you want to take a look at it. You are probably right
about them not decoding the characters lazily since their strings are
UTF-16.
The commented version of opIndex is a bit faster on my Core 2. This is
the first time that I witnessed such speed differences between
processors. :)

Wow. I knew something was wrong with the non-BT test code: from what I heard it should have been faster, but it wasn't for me :)

Also I found that the trie is usually queried twice for each matching
character in the input string. You can't optimize opIndex any further
(but try size_t in there instead of uint, it helped here) unless you
make some changes on the larger scale. So if you should find out that
the second query isn't required, that would help more than anything else.
I said it on IRC today: This library will be my reference for compile
time code generation in D. There is a lot of expertise in it, good work!
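
For readers following along, here is a minimal sketch of the kind of lookup being discussed: a two-level table indexed by code point, with every index held in `size_t` so that no 32-bit value has to be widened before the address arithmetic on a 64-bit target. This is an illustration written in C++, not the library's actual trie. The class name `CodepointSet`, the 8-bit page size, and the layout are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical two-level set of code points (0..0x10FFFF).
// Level 1 maps the high bits of a code point to a page number,
// level 2 stores one byte per code point inside that page.
class CodepointSet {
public:
    static constexpr std::size_t kPageBits = 8;
    static constexpr std::size_t kPageSize = std::size_t{1} << kPageBits;
    static constexpr std::size_t kMaxCodepoint = 0x10FFFF;

    // Page 0 is a shared all-zero page used by every empty range.
    CodepointSet()
        : index_((kMaxCodepoint >> kPageBits) + 1, 0),
          pages_(kPageSize, 0) {}

    // Caller must pass cp <= kMaxCodepoint.
    void add(std::size_t cp) {
        const std::size_t hi = cp >> kPageBits;
        if (index_[hi] == 0) {
            // First member in this range: give it a private page.
            index_[hi] = static_cast<std::uint16_t>(pages_.size() / kPageSize);
            pages_.resize(pages_.size() + kPageSize, 0);
        }
        pages_[index_[hi] * kPageSize + (cp & (kPageSize - 1))] = 1;
    }

    // The lookup itself: two loads, all index math in size_t.
    bool operator[](std::size_t cp) const {
        const std::size_t page = index_[cp >> kPageBits];
        return pages_[page * kPageSize + (cp & (kPageSize - 1))] != 0;
    }

private:
    std::vector<std::uint16_t> index_;  // code point high bits -> page number
    std::vector<std::uint8_t> pages_;   // concatenated pages, one byte per code point
};
```

Whether `size_t` actually beats a 32-bit index depends on the compiler and the processor, which fits the differing timings reported above. Either way, dropping the second query per matching character would save a whole lookup rather than an instruction or two.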


There I have two options to work through:

- separate negative and positive character classes; it would kill possible branching here.
- and now, looking at test_11 in your profile output, I see the likely culprit: I should re-think lookahead tests; they used to reduce the number of savepoints during matching.
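
To make the first option concrete, here is a rough sketch of how a negated class can be complemented once at build time, so the match step is a single table lookup with no "is this class negated?" branch. It is written in C++ and works on bytes only. The names `CharClass`, `makePositive` and `makeNegative` are made up for the example and are not taken from the regex library, which works on Unicode code points.

```cpp
#include <bitset>
#include <cassert>
#include <string>

// Hypothetical byte-oriented character class: one bit per possible byte.
struct CharClass {
    std::bitset<256> members;

    // Single lookup; the class does not remember whether it was negated.
    bool matches(unsigned char c) const { return members[c]; }
};

// Build a positive class such as [abc].
inline CharClass makePositive(const std::string& chars) {
    CharClass cc;
    for (unsigned char c : chars) cc.members.set(c);
    return cc;
}

// Build a negative class such as [^abc] by flipping every bit once,
// at construction time instead of on every match attempt.
inline CharClass makeNegative(const std::string& chars) {
    CharClass cc = makePositive(chars);
    cc.members.flip();
    return cc;
}
```

With this layout `makeNegative("abc").matches('x')` is true and `makeNegative("abc").matches('b')` is false, through the same code path as a positive class.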

P.S.: I'm fine with treating anything that is escaped, but not special,
as is. \w did cause an infinite loop though, so you may want to test


Hm, can't reproduce.

with the original regex. For \. you can assert(false, "\. is not a valid
escape sequence")

No, that was a bad idea ... and I planned to change that exception. Now I'm more inclined to ignore the backslash.


or just ignore the backslash. Personally I usually
don't escape anything just to be on the safe side. :p


Worthy of a small community poll.
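
For anyone weighing in, a small sketch of the "ignore the backslash" behaviour may help. It is a toy tokenizer in C++, not the library's parser. The `Kind` and `Token` types, the `tokenize` function, and the particular sets of metacharacters and shorthands are assumptions chosen for illustration. An escaped character that has no special meaning is simply emitted as a literal.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Kind { Literal, Meta, Shorthand };

struct Token {
    Kind kind;
    char ch;
};

// Hypothetical tokenizer: "\x" becomes the literal 'x' unless x is a
// known shorthand such as \w or \d. Nothing is rejected as invalid.
inline std::vector<Token> tokenize(const std::string& pattern) {
    const std::string metas = ".*+?()[]{}|^$";  // assumed metacharacter set
    const std::string shorthands = "dDwWsS";    // assumed shorthand classes
    std::vector<Token> out;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const char c = pattern[i];
        if (c == '\\' && i + 1 < pattern.size()) {
            const char next = pattern[++i];
            if (shorthands.find(next) != std::string::npos) {
                out.push_back({Kind::Shorthand, next});
            } else {
                // Escaped metacharacter or ordinary character: take it as is.
                out.push_back({Kind::Literal, next});
            }
        } else if (metas.find(c) != std::string::npos) {
            out.push_back({Kind::Meta, c});
        } else {
            // Ordinary character, or a lone trailing backslash.
            out.push_back({Kind::Literal, c});
        }
    }
    return out;
}
```

Under these rules `\.` and `\q` both come out as plain literals, `\w` stays a shorthand class, and an unescaped `.` stays a metacharacter. The stricter alternative would replace the literal fallback with an error for unknown escapes.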
