Thank you, Roger, for the review!

I've adjusted the test as you suggested and pushed the fix.

With kind regards,
Ivan

On 2/10/20 1:11 PM, Roger Riggs wrote:
Hi Ivan,

This looks fine.

In the test RegExTest: 5074, I would output the failed cases to System.err.
That way they get properly interleaved with the test progress output.

No need for another review.

Thanks, Roger



On 2/5/20 8:22 PM, Ivan Gerasimov wrote:
Hello!

j.u.regex.Pattern supports a special char class \R, which is specified to be equivalent to \u000D\u000A|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029].

In particular, this means that the input "\r\n" must match both patterns \R and \R\R.

(In the latter case, the first \R matches \r and the second \R matches \n.)

A pattern \R{2} is expected to be equivalent to \R\R.

However, with the current implementation this does not hold: Pattern.matches("\\R{2}", "\r\n") == false, while Pattern.matches("\\R\\R", "\r\n") == true.

The root cause of this bug is that the special char class \R is handled via the dedicated class LineEnding, which is not able to handle backtracking correctly in the presence of quantifiers.

A simple solution is to treat \R with quantifiers as an anonymous group, which makes it comply with the specification.

Without quantifiers, \R is still handled via the more efficient LineEnding implementation.
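For illustration, here is a minimal sketch of the equivalence described above. It is not the code from the webrev: the class and method names (LineBreakQuantifierDemo, matchesExpanded, matchesShorthand) are made up for this example. It spells out the specified expansion of \R as a non-capturing group, applies a quantifier to it, and prints the result next to the \R{n} shorthand. The expanded form can backtrack from \r\n to a lone \r, so it matches "\r\n" with {2}. What the shorthand prints depends on whether the JDK in use contains the fix.

```java
import java.util.regex.Pattern;

public class LineBreakQuantifierDemo {
    // Expansion of \R as given in the Pattern specification.
    static final String LINEBREAK =
        "\\u000D\\u000A|[\\u000A\\u000B\\u000C\\u000D\\u0085\\u2028\\u2029]";

    // Hypothetical helper: the expansion wrapped in a non-capturing group
    // and repeated exactly n times.
    static boolean matchesExpanded(String input, int n) {
        return Pattern.matches("(?:" + LINEBREAK + "){" + n + "}", input);
    }

    // Hypothetical helper: the \R shorthand with the same quantifier.
    static boolean matchesShorthand(String input, int n) {
        return Pattern.matches("\\R{" + n + "}", input);
    }

    public static void main(String[] args) {
        String crlf = "\r\n";
        for (int n = 1; n <= 3; n++) {
            // The two columns should agree on a JDK that follows the spec.
            System.out.println("n=" + n
                + " expanded=" + matchesExpanded(crlf, n)
                + " shorthand=" + matchesShorthand(crlf, n));
        }
    }
}
```

If the two columns differ for n=2, that would be the behavior reported in this bug.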

Would you please help review the fix?

Some minor cleanup was done along the way in the affected code.

BUGURL: https://bugs.openjdk.java.net/browse/JDK-8235812
WEBREV: http://cr.openjdk.java.net/~igerasim/8235812/00/webrev/

Control build and testing (tiers1-4) are all green.


--
With kind regards,
Ivan Gerasimov
