Hi Ivan,
This looks fine.
In the test RegExTest:5074, I would output the failed cases to System.err.
That way they get properly interleaved with the test progress output.
No need for another review.
Thanks, Roger
On 2/5/20 8:22 PM, Ivan Gerasimov wrote:
Hello!
j.u.regex.Pattern supports a special char class \R, which is specified
to be equivalent to \u000D\u000A|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029].
In particular, this means that the input "\r\n" must match both
patterns \R and \R\R.
(In the latter case, the first \R matches \r and the second \R matches \n.)
A pattern \R{2} is expected to be equivalent to \R\R.
However, with the current implementation this does not hold:
Pattern.matches("\\R{2}", "\r\n") == false, while
Pattern.matches("\\R\\R", "\r\n") == true.
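The discrepancy can be reproduced with a small standalone program. The
following is only a sketch: the class name, the helper method and the
EXPANDED_R constant are made up for illustration and are not part of the
webrev. EXPANDED_R spells out the specified expansion of \R as an ordinary
non-capturing group, so it shows the behaviour the specification asks for.
The result printed for \R{2} depends on whether the JDK in use contains the
fix, so no expected value is given for it.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class LineBreakQuantifierDemo {
    // The expansion of the linebreak matcher quoted from the specification,
    // written as an ordinary non-capturing group.
    static final String EXPANDED_R =
        "(?:\\u000D\\u000A|[\\u000A\\u000B\\u000C\\u000D\\u0085\\u2028\\u2029])";

    // Thin wrapper so the checks below read uniformly.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String crlf = "\r\n";

        // Built-in linebreak matcher, with and without a quantifier.
        System.out.println("R        : " + matches("\\R", crlf));
        System.out.println("R R      : " + matches("\\R\\R", crlf));
        // Depends on the JDK: false before the fix, true once it is applied.
        System.out.println("R{2}     : " + matches("\\R{2}", crlf));

        // Spelled-out expansion: backtracking lets the first iteration give
        // up the two-character alternative and take only the carriage return.
        System.out.println("group x2 : " + matches(EXPANDED_R + EXPANDED_R, crlf));
        System.out.println("group{2} : " + matches(EXPANDED_R + "{2}", crlf));
    }
}
```

The two "group" lines both print true, which is the behaviour \R{2} is
expected to share.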
The root cause of this bug is that the special char class \R is
handled via a dedicated class, LineEnding, which is not able to correctly
handle backtracking in the presence of quantifiers.
A simple solution is to treat \R with quantifiers as an anonymous
group, which will make it comply with the specification.
Without quantifiers, \R is still handled via the more efficient
LineEnding implementation.
Would you please help review the fix?
Some minor cleanup was done along the way in the affected code.
BUGURL: https://bugs.openjdk.java.net/browse/JDK-8235812
WEBREV: http://cr.openjdk.java.net/~igerasim/8235812/00/webrev/
Control build and testing (tiers1-4) are all green.