Hi Ivan,
Though the new code has a good effect, the asymmetry and duplication
seem unnecessary.
Can it be structured to have a single copy of the loop that compares
only the available range, and still get the desired performance
improvement?
Something like:
boolean match(Matcher matcher, int i, CharSequence seq) {
    int[] buf = buffer;
    int len = buf.length;
    // Compare only the chars actually available in [i, matcher.to).
    int limit = Math.min(len, matcher.to - i);
    for (int j = 0; j < limit; j++) {
        if (buf[j] != seq.charAt(i + j))
            return false;
    }
    // The slice extends past the end of the input.
    if (len > matcher.to - i) {
        matcher.hitEnd = true;
        return false;
    }
    return next.match(matcher, i + len, seq);
}
Regards, Roger
On 10/28/19 9:03 PM, Ivan Gerasimov wrote:
Hello!
When building a Pattern object, the regex parser recognizes "slices":
contiguous char subsequences whose chars all have to be matched
either case-sensitively or case-insensitively. Matching against such a
slice is implemented as a simple loop over a portion of the input.
In the current implementation, each iteration of the loop checks
whether we have hit the end of the input (which is an uncommon case).
This check can be done just once, before the loop, which makes the
loop lighter.
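To illustrate the idea outside of Pattern, here is a minimal standalone sketch. The class SliceSketch, the Result enum and both method names are hypothetical and are not part of java.util.regex; `to` stands in for matcher.to, and the sketch assumes a non-empty slice and i <= to. It only compares a per-iteration end check with a check hoisted out of the loop, and is not the code from the webrev.

```java
// Hypothetical sketch: names below are illustrative, not JDK internals.
final class SliceSketch {
    enum Result { MATCH, MISMATCH, HIT_END }

    // End-of-input check performed on every iteration.
    static Result checkedEachIteration(int[] buf, CharSequence seq, int i, int to) {
        for (int j = 0; j < buf.length; j++) {
            if (i + j >= to)
                return Result.HIT_END;
            if (buf[j] != seq.charAt(i + j))
                return Result.MISMATCH;
        }
        return Result.MATCH;
    }

    // End-of-input check hoisted out of the loop: the loop runs only over
    // the chars that are available, and the end is tested once afterwards.
    static Result checkedOnce(int[] buf, CharSequence seq, int i, int to) {
        int avail = to - i;
        int limit = Math.min(buf.length, avail);
        for (int j = 0; j < limit; j++) {
            if (buf[j] != seq.charAt(i + j))
                return Result.MISMATCH;
        }
        return buf.length > avail ? Result.HIT_END : Result.MATCH;
    }

    public static void main(String[] args) {
        int[] slice = "abc".codePoints().toArray();
        String[] inputs = { "xxabc", "xxab", "xxaxc", "xxabd" };
        for (String in : inputs) {
            System.out.println(in + ": "
                + checkedEachIteration(slice, in, 2, in.length()) + " / "
                + checkedOnce(slice, in, 2, in.length()));
        }
    }
}
```

Both variants report a mismatch before reporting the end of the input, so a slice that differs from the input within the available range never sets the hit-end state.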
The benchmark shows an improvement of up to 4% in throughput for
case-insensitive matching.
Would you please help review the enhancement?
BUGURL: https://bugs.openjdk.java.net/browse/JDK-8225466
WEBREV: http://cr.openjdk.java.net/~igerasim/8225466/00/webrev/
----------- benchmark results ---------------
UNFIXED
Benchmark                Mode  Cnt    Score   Error  Units
PatternBench.sliceIFind  avgt   16  190.612 ± 0.336  ns/op
FIXED
Benchmark                Mode  Cnt    Score   Error  Units
PatternBench.sliceIFind  avgt   16  182.954 ± 0.493  ns/op
-------------------------------------------