Hi Ivan,

Though the new code has a good effect, the asymmetry and duplication seem unnecessary. Can it be structured to have a single copy of the loop comparing the available range
and still get the desired performance improvement?

Like:

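boolean match(Matcher matcher, int i, CharSequence seq) {
    int[] buf = buffer;
    int len = buf.length;
    // Compare only the part of the slice that fits before matcher.to.
    int limit = Math.min(len, matcher.to - i);
    for (int j = 0; j < limit; j++) {
        if (buf[j] != seq.charAt(i + j))
            return false;
    }
    if (limit < len) {
        // The input ended before the whole slice could be compared.
        matcher.hitEnd = true;
        return false;
    }
    return next.match(matcher, i + len, seq);
}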

Regards, Roger


On 10/28/19 9:03 PM, Ivan Gerasimov wrote:
Hello!

When building a Pattern object, the regex parser recognizes "slices": contiguous char subsequences that all have to be matched either case-sensitively or case-insensitively. Matching against such a slice is implemented as a simple loop over a portion of the input.

In the current implementation, each iteration of the loop checks whether we have hit the end of the input (which is an uncommon case).

This check can be done only once, before the loop, which will make the loop lighter.
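
As a rough illustration of the idea (a standalone sketch, not the code from the webrev), the two loop shapes could look like the following. The class name, the method names and the small Result type are hypothetical; the parameter `to` stands in for matcher.to, and the `hitEnd` field stands in for matcher.hitEnd.

```java
class SliceMatchSketch {
    /** Outcome of comparing a slice; hitEnd plays the role of matcher.hitEnd. */
    static final class Result {
        final boolean matched;
        final boolean hitEnd;

        Result(boolean matched, boolean hitEnd) {
            this.matched = matched;
            this.hitEnd = hitEnd;
        }
    }

    /** Checks for the end of the input on every iteration of the loop. */
    static Result matchPerIteration(int[] slice, CharSequence seq, int i, int to) {
        for (int j = 0; j < slice.length; j++) {
            if (i + j >= to) {
                return new Result(false, true);   // ran out of input
            }
            if (slice[j] != seq.charAt(i + j)) {
                return new Result(false, false);  // ordinary mismatch
            }
        }
        return new Result(true, false);
    }

    /** Computes the comparable range once, so the loop body has a single test. */
    static Result matchHoisted(int[] slice, CharSequence seq, int i, int to) {
        int limit = Math.min(slice.length, to - i);
        for (int j = 0; j < limit; j++) {
            if (slice[j] != seq.charAt(i + j)) {
                return new Result(false, false);  // ordinary mismatch
            }
        }
        // If the input was shorter than the slice, report that the end was hit.
        return (limit < slice.length) ? new Result(false, true)
                                      : new Result(true, false);
    }

    public static void main(String[] args) {
        int[] slice = {'a', 'b', 'c'};
        Result full = matchHoisted(slice, "xabcx", 1, 5);
        Result truncated = matchHoisted(slice, "xab", 1, 3);
        System.out.println(full.matched + " " + full.hitEnd);
        System.out.println(truncated.matched + " " + truncated.hitEnd);
    }
}
```

Both methods are meant to return the same result for any input; the second only moves the end-of-input test out of the loop body.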

The benchmark shows up to a 4% improvement in throughput for case-insensitive matching.

Would you please help review the enhancement?

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8225466
WEBREV: http://cr.openjdk.java.net/~igerasim/8225466/00/webrev/


----------- benchmark results ---------------

UNFIXED
Benchmark                Mode  Cnt    Score   Error  Units
PatternBench.sliceIFind  avgt   16  190.612 ± 0.336  ns/op

FIXED
Benchmark                Mode  Cnt    Score   Error  Units
PatternBench.sliceIFind  avgt   16  182.954 ± 0.493  ns/op
-------------------------------------------

