Integrated: 8273430: Suspicious duplicate condition in java.util.regex.Grapheme#isExcludedSpacingMark

2021-09-13 Thread Ian Graves
On Wed, 8 Sep 2021 20:24:31 GMT, Ian Graves  wrote:

> The duplicate condition in this chain of expressions needs to be shrunk to 
> drop a couple of character that are not excluded spacing marks.

This pull request has now been integrated.

Changeset: 3d9dc8f8
Author:Ian Graves 
URL:   
https://git.openjdk.java.net/jdk/commit/3d9dc8f824abf597d9b28f456cfeb5af927221b8
Stats: 539 lines in 4 files changed: 147 ins; 388 del; 4 mod

8273430: Suspicious duplicate condition in 
java.util.regex.Grapheme#isExcludedSpacingMark

Reviewed-by: naoto

-

PR: https://git.openjdk.java.net/jdk/pull/5425


Re: RFR: 8273430: Suspicious duplicate condition in java.util.regex.Grapheme#isExcludedSpacingMark [v2]

2021-09-10 Thread Naoto Sato
On Fri, 10 Sep 2021 20:57:34 GMT, Ian Graves  wrote:

>> The duplicate condition in this chain of expressions needs to be shrunk to 
>> drop a couple of character that are not excluded spacing marks.
>
> Ian Graves has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Refactoring test to whitebox

Looks good. Nice cleanup!

-

Marked as reviewed by naoto (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/5425


Re: RFR: 8273430: Suspicious duplicate condition in java.util.regex.Grapheme#isExcludedSpacingMark [v2]

2021-09-10 Thread Ian Graves
On Fri, 10 Sep 2021 20:57:34 GMT, Ian Graves  wrote:

>> The duplicate condition in this chain of expressions needs to be shrunk to 
>> drop a couple of character that are not excluded spacing marks.
>
> Ian Graves has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Refactoring test to whitebox

Updating the tests here. Existing ones included code copy/pasted directly from 
the unit under test. Refactored to a whitebox-style testing that injects 
accessors into `java.util.regex` so we don't have to modify code in two places 
to keep it in sync. Also relaxed two private static methods in 
`java.util.regex.Grapheme` to package-private so the injected accessor class 
can reach them and expose them for the test.

-

PR: https://git.openjdk.java.net/jdk/pull/5425


Re: RFR: 8273430: Suspicious duplicate condition in java.util.regex.Grapheme#isExcludedSpacingMark [v2]

2021-09-10 Thread Ian Graves
> The duplicate condition in this chain of expressions needs to be shrunk to 
> drop a couple of character that are not excluded spacing marks.

Ian Graves has updated the pull request incrementally with one additional 
commit since the last revision:

  Refactoring test to whitebox

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/5425/files
  - new: https://git.openjdk.java.net/jdk/pull/5425/files/ed2e4d2a..c60523af

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk=5425=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk=5425=00-01

  Stats: 545 lines in 4 files changed: 147 ins; 395 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/5425.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/5425/head:pull/5425

PR: https://git.openjdk.java.net/jdk/pull/5425


Re: RFR: 8273430: Suspicious duplicate condition in java.util.regex.Grapheme#isExcludedSpacingMark

2021-09-08 Thread Naoto Sato
On Wed, 8 Sep 2021 20:24:31 GMT, Ian Graves  wrote:

> The duplicate condition in this chain of expressions needs to be shrunk to 
> drop a couple of character that are not excluded spacing marks.

The copyright year in Grapheme.java should be 2021, otherwise looks good.

-

Marked as reviewed by naoto (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/5425


RFR: 8273430: Suspicious duplicate condition in java.util.regex.Grapheme#isExcludedSpacingMark

2021-09-08 Thread Ian Graves
The duplicate condition in this chain of expressions needs to be shrunk to drop 
a couple of character that are not excluded spacing marks.

-

Commit messages:
 - 8273430: Suspicious duplicate condition in 
java.util.regex.Grapheme#isExcludedSpacingMark

Changes: https://git.openjdk.java.net/jdk/pull/5425/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk=5425=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8273430
  Stats: 10 lines in 2 files changed: 7 ins; 0 del; 3 mod
  Patch: https://git.openjdk.java.net/jdk/pull/5425.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/5425/head:pull/5425

PR: https://git.openjdk.java.net/jdk/pull/5425


Re: Suspicious duplicate condition in java.util.regex.Grapheme#isExcludedSpacingMark

2021-09-07 Thread Naoto Sato

It does look incorrect. I will take a look.

Naoto

On 9/6/21 12:16 AM, Andrey Turbanov wrote:

Hello.
I found suspicious condition in the method
java.util.regex.Grapheme#isExcludedSpacingMark
It's detected by IntelliJ IDEA inspection 'Condition is covered by
further condition'
https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/regex/Grapheme.java#L157

```
private static boolean isExcludedSpacingMark(int cp) {
return  cp == 0x102B || cp == 0x102C || cp == 0x1038 ||
cp >= 0x1062 && cp <= 0x1064 ||
cp >= 0x1062 && cp <= 0x106D ||  // <== here is the warning
cp == 0x1083 ||
cp >= 0x1087 && cp <= 0x108C ||
cp == 0x108F ||
cp >= 0x109A && cp <= 0x109C ||
cp == 0x1A61 || cp == 0x1A63 || cp == 0x1A64 ||
cp == 0xAA7B || cp == 0xAA7D;
}
```
There are 2 sub-conditions in this complex condition:
cp >= 0x1062 && cp <= 0x1064 ||
cp >= 0x1062 && cp <= 0x106D ||

The second condition is _wider_ than the first one.
I believe it's a bug. The second condition (according to
https://www.compart.com/en/unicode/category/Mc) should look like this:

cp >= 0x1067 && cp <= 0x106D ||

0x1065, 0x1066 are not from the Spacing_Mark category.


Andrey Turbanov



Re: Suspicious duplicate condition in java.util.regex.Grapheme#isExcludedSpacingMark

2021-09-07 Thread Jim Laskey
Bug submitted on your behalf.

https://bugs.openjdk.java.net/browse/JDK-8273430

> On Sep 6, 2021, at 4:16 AM, Andrey Turbanov  wrote:
> 
> Hello.
> I found suspicious condition in the method
> java.util.regex.Grapheme#isExcludedSpacingMark
> It's detected by IntelliJ IDEA inspection 'Condition is covered by
> further condition'
> https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/regex/Grapheme.java#L157
> 
> ```
> private static boolean isExcludedSpacingMark(int cp) {
>   return  cp == 0x102B || cp == 0x102C || cp == 0x1038 ||
>   cp >= 0x1062 && cp <= 0x1064 ||
>   cp >= 0x1062 && cp <= 0x106D ||  // <== here is the warning
>   cp == 0x1083 ||
>   cp >= 0x1087 && cp <= 0x108C ||
>   cp == 0x108F ||
>   cp >= 0x109A && cp <= 0x109C ||
>   cp == 0x1A61 || cp == 0x1A63 || cp == 0x1A64 ||
>   cp == 0xAA7B || cp == 0xAA7D;
> }
> ```
> There are 2 sub-conditions in this complex condition:
> cp >= 0x1062 && cp <= 0x1064 ||
> cp >= 0x1062 && cp <= 0x106D ||
> 
> The second condition is _wider_ than the first one.
> I believe it's a bug. The second condition (according to
> https://www.compart.com/en/unicode/category/Mc) should look like this:
> 
> cp >= 0x1067 && cp <= 0x106D ||
> 
> 0x1065, 0x1066 are not from the Spacing_Mark category.
> 
> 
> Andrey Turbanov



Suspicious duplicate condition in java.util.regex.Grapheme#isExcludedSpacingMark

2021-09-06 Thread Andrey Turbanov
Hello.
I found suspicious condition in the method
java.util.regex.Grapheme#isExcludedSpacingMark
It's detected by IntelliJ IDEA inspection 'Condition is covered by
further condition'
https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/regex/Grapheme.java#L157

```
private static boolean isExcludedSpacingMark(int cp) {
   return  cp == 0x102B || cp == 0x102C || cp == 0x1038 ||
   cp >= 0x1062 && cp <= 0x1064 ||
   cp >= 0x1062 && cp <= 0x106D ||  // <== here is the warning
   cp == 0x1083 ||
   cp >= 0x1087 && cp <= 0x108C ||
   cp == 0x108F ||
   cp >= 0x109A && cp <= 0x109C ||
   cp == 0x1A61 || cp == 0x1A63 || cp == 0x1A64 ||
   cp == 0xAA7B || cp == 0xAA7D;
}
```
There are 2 sub-conditions in this complex condition:
cp >= 0x1062 && cp <= 0x1064 ||
cp >= 0x1062 && cp <= 0x106D ||

The second condition is _wider_ than the first one.
I believe it's a bug. The second condition (according to
https://www.compart.com/en/unicode/category/Mc) should look like this:

cp >= 0x1067 && cp <= 0x106D ||

0x1065, 0x1066 are not from the Spacing_Mark category.


Andrey Turbanov