Hi Roger.
Thank you for taking a look!
The variant with a single loop was the first thing I tried (should have
mentioned that in the review request).
Unfortunately, that showed performance decrease.
I suspect that hitting the end of the input should be a less common
thing, that's why it it beneficial to separate it as a slow path.
With kind regards,
Ivan
On 12/18/19 7:48 AM, Roger Riggs wrote:
Hi Ivan,
Though the new code has a good effect, the asymmetry and duplication
seems unnecessary.
Can it be structured to have a single copy of the loop comparing the
available range
and still get the desired performance improvement.
Like:
boolean match(Matcher matcher,int i, CharSequence seq) {
int[] buf =buffer;
int len = buf.length;
for (int j =0; j < Math.min(len, matcher.to); j++) {
if (buf[j] != seq.charAt(i+j))
return false;
}
if (len >= matcher.to) {
matcher.hitEnd =true;
return false;
}
return next.match(matcher, i+len, seq);
}
Regards, Roger
On 10/28/19 9:03 PM, Ivan Gerasimov wrote:
Hello!
When building a Pattern object, the regex parser recognizes "slices"
- continuous char subsequences, which all have to be matched
case-sensitively/case-insensitively. Matching with such a slice is
implemented as a simple loop over a portion of the input.
In the current implementation, on each iteration of the loop it is
checked if we have hit the end of the input (which is an uncommon case).
This check can be done only once, before the loop, which will make
the loop lighter.
Benchmark shows up to +4% to the throughput for the case-insensitive
matching.
Would you please help review the enhancement?
BUGURL: https://bugs.openjdk.java.net/browse/JDK-8225466
WEBREV: http://cr.openjdk.java.net/~igerasim/8225466/00/webrev/
----------- benchmark results ---------------
UNFIXED
Benchmark Mode Cnt Score Error Units
PatternBench.sliceIFind avgt 16 190.612 ? 0.336 ns/op
FIXED
Benchmark Mode Cnt Score Error Units
PatternBench.sliceIFind avgt 16 182.954 ? 0.493 ns/op
-------------------------------------------
--
With kind regards,
Ivan Gerasimov