Hi,

This is an update from Xerces for file impl/xpath/regex/RangeToken.java. For details, please refer to: https://bugs.openjdk.java.net/browse/JDK-8035577.

Webrevs: http://cr.openjdk.java.net/~joehw/jdk9/8035577/webrev/

Existing tests: JAXP SQE and unit tests passed.

Test cases have been added for the typo fix in RangeToken.intersectRanges. The code has also been updated to fix a bug where regular expression intersection returns an incorrect value when the first range ends later than the second range (example below). Additional test cases cover the scenarios affected by the code changes.

new RegularExpression("(?[b-d]&[a-r])"); -> returns [b-d] (Correct)
new RegularExpression("(?[a-r]&[b-d])"); -> returns [b-de-r] (Incorrect)

Thanks,
David

P.S. Notes on bug fixes.
1) Line 404 removal of while loop.
This fixes a newly found bug where incorrect results are given when the first range ends later than the second range. In the old code we got
(?[a-r]&[b-d]) -> returns [b-de-r]
By removing the while loop, we get [b-d].
This while loop looks like a copy-paste error from subtractRanges. In subtractRanges we need to keep the leftover portion from the first range, but this does not apply to intersection.

2) Line 388, addition of src2 += 2;
This code change affects anything of the form (?[a-r]&[b-eg-j]). The code execution is diagrammed below.
o------------o  (src1)
  o--o o--o     (src2)
For the first match we get
o------------o  (src1)
  o--o          (src2)
Next we want to run src2+=2 to move to the second pair of endpoints, since the first pair has already been used. Note that src1begin has been updated via this.ranges[src1] = src2end+1, which is taken directly from the code.
      o------o  (src1)
       o--o     (src2)
The src2+=2 statement was left out of the old code, and is added in this webrev. If we leave out the src2+=2 at line 388, the next iteration of the large while loop reaches the case "} else if (src2end < src1begin) {", which also executes "src2+=2". This means the correct final result is still generated, but one iteration later. We want to add the new code because it is better to have all associated variables updated in the same loop iteration. In addition, all the other conditions have similar src1 or src2 updates.
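
For illustration only, here is a minimal sketch of the intersection semantics described in both notes. It is not the RangeToken code from the webrev: the class name RangeIntersectionSketch and the method intersect are hypothetical, and it assumes each input is a flat array of sorted, non-overlapping, inclusive pairs (begin0, end0, begin1, end1, ...). It shows the two points above: nothing left over from the first range is kept, and the index of whichever range is used up advances by 2 in the same iteration.

```java
import java.util.Arrays;

// Hypothetical illustration; not the Xerces RangeToken implementation.
final class RangeIntersectionSketch {

    // Assumption: a and b hold sorted, non-overlapping, inclusive pairs
    // laid out as {begin0, end0, begin1, end1, ...}.
    static int[] intersect(int[] a, int[] b) {
        // The number of overlaps never exceeds the total number of input pairs.
        int[] out = new int[a.length + b.length];
        int n = 0;
        int src1 = 0;
        int src2 = 0;
        while (src1 < a.length && src2 < b.length) {
            int begin = Math.max(a[src1], b[src2]);
            int end = Math.min(a[src1 + 1], b[src2 + 1]);
            if (begin <= end) {
                // Only the overlapping part is emitted; unlike subtraction,
                // no leftover portion of the first range is kept.
                out[n++] = begin;
                out[n++] = end;
            }
            // Advance past whichever pair ends first, in this same iteration.
            if (a[src1 + 1] < b[src2 + 1]) {
                src1 += 2;
            } else {
                src2 += 2;
            }
        }
        return Arrays.copyOf(out, n);
    }
}
```

With this sketch, intersecting {'a','r'} with {'b','d'} gives {'b','d'} regardless of argument order, and intersecting {'a','r'} with {'b','e','g','j'} gives {'b','e','g','j'}.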
