Re: RFR: 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not thread safe

2023-02-20 Thread Sergey Tsypanov
On Sun, 19 Feb 2023 18:41:18 GMT, liach  wrote:

> 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not 
> thread safe

Marked as reviewed by stsypanov (Author).

-

PR: https://git.openjdk.org/jdk/pull/12643


Re: RFR: 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not thread safe

2023-02-20 Thread Claes Redestad
On Sun, 19 Feb 2023 18:41:18 GMT, liach  wrote:

> 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not 
> thread safe

Marked as reviewed by redestad (Reviewer).

-

PR: https://git.openjdk.org/jdk/pull/12643


Re: RFR: 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not thread safe

2023-02-20 Thread Jaikiran Pai
On Sun, 19 Feb 2023 18:41:18 GMT, liach  wrote:

> 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not 
> thread safe

Hello @liach, I don't follow what this change is achieving. I think I might be 
missing something though. I read through the linked JIRA which states:

> In the getGenericInfo() methods of Method, Field, Constructor, and 
> RecordComponent, the genericInfo field is read twice, and the second read 
> returned may be null under race conditions.

Considering the `Constructor` class as an example, which looks like this:


@Override
ConstructorRepository getGenericInfo() {
// lazily initialize repository if necessary
if (genericInfo == null) {
// create and cache generic info repository
genericInfo =
ConstructorRepository.make(getSignature(),
   getFactory());
}
return genericInfo; //return cached repository
}

I can understand that the `ConstructorRepository.make(getSignature(), 
getFactory());` might end up getting called more than once in case of race (or 
if ConstructorRepository.make(getSignature(), getFactory()) really returns 
null), but those should be harmless races. Plus, the getSignature() isn't 
expensive, since it returns an already assigned `final` field.
Is there some other race condition here?

The JBS issue also states:

> Class::getGenericInfo() originally had the same issue, but was fixed in 
> 8016236.

I had a look at the RFR 
https://mail.openjdk.org/pipermail/core-libs-dev/2013-June/017798.html. That's 
a slightly different issue, from what I understand. In that case, the call to 
getGenericInfo() was being preceded by a call to some other expensive method. 
The change there proposed to first call getGenericInfo() and let it be 
initialized and only then decide whether to call the other expensive methods.
In that change, I can see that the `getGenericInfo()` method on the `Class` 
class was changed too and that change is almost similar to what's being 
proposed in this current PR. However, `Class.genericInfo` field is `volatile` 
and I think that's why Doug changed that method to first write it to a local 
field and then use that local field for the rest of the work. In the current PR 
however, which touches `Field`, `Method`, `Constructor` and `RecordComponent`, 
the `genericInfo` isn't a `volatile` field in any of those classes, so I don't 
see why this local assignment is needed or would help. Am I missing something?

-

PR: https://git.openjdk.org/jdk/pull/12643


Re: RFR: 8299852: Modernize ConcurrentHashMap [v3]

2023-02-20 Thread Per Minborg
> `java.util.concurrent.ConcurrentHashMap` is relatively old and has not been 
> updated to reflect the current state of Java and can be modernized: 
> 
> * Add `@Serial` annotations 
> * Seal classes and restrict subclassing for internal classes 
> * Use pattern matching for instance 
> * Remove redundant modifiers (such as some final declarations) 
> * Use Objects.requireNonNull for invariant checking 
> * Use diamond operators instead of explicit types 
> * Simplify expressions using Math::max

Per Minborg has updated the pull request with a new target base due to a merge 
or a rebase. The pull request now contains four commits:

 - Merge master
 - Add @Override annotations
 - Untrack file
 - Modernize ConcurrentHashMap

-

Changes: https://git.openjdk.org/jdk/pull/11924/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=11924&range=02
  Stats: 534 lines in 1 file changed: 164 ins; 21 del; 349 mod
  Patch: https://git.openjdk.org/jdk/pull/11924.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/11924/head:pull/11924

PR: https://git.openjdk.org/jdk/pull/11924


Re: RFR: 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not thread safe

2023-02-20 Thread Claes Redestad
On Sun, 19 Feb 2023 18:41:18 GMT, liach  wrote:

> 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not 
> thread safe

I think of this pattern of reading a to-be-lazily-initialized value into a 
local as simple hygiene, `volatile` or not. The code might seem solid without 
it - but stranger things than eliding a field load has happened. Storing into 
the local variable removes some doubt about how this code will be executed.

-

PR: https://git.openjdk.org/jdk/pull/12643


Re: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2]

2023-02-20 Thread Volker Simonis
On Fri, 17 Feb 2023 21:21:56 GMT, Jorn Vernee  wrote:

> > (because they are all not referenced from the program any more).
> 
> So, it sounds like this is testing a case where 
> `LambdaMetafactory.metafactory` is being called directly?
> 
> I'd like to point out that, while I buy that people are doing this (I see 
> enough of this on StackOverflow), `LambdaMetafactory` is a runtime API 
> specifically designed to support a language feature. Calling it directly is 
> an unintended use case.

Sorry, but that's a really weird argument. You can't introduce a new public API 
and then blame people for using it! If the API is really only intended to 
support the internal implementation of a language feature than make it private 
and don't export it.

-

PR: https://git.openjdk.org/jdk/pull/12493


Re: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2]

2023-02-20 Thread Volker Simonis
On Sat, 18 Feb 2023 01:55:52 GMT, Mandy Chung  wrote:

> > Finally, the PR fixes a regression compared to the JDK 11 behavior.
> 
> This is an intended change that improves the C heap memory usage. In 
> addition, the spec of `LambdaMetafactory` has no guarantee of the spinned 
> classes be unloaded independent of it defining loader. This is indeed a 
> behavioral change if `LambdaMetafactory` API is called directly instead of 
> via `invokedynamic`. I can create a CSR to retrofit this behavioral change.

Yes, please. I think that would be helpful.

-

PR: https://git.openjdk.org/jdk/pull/12493


Re: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2]

2023-02-20 Thread David Holmes
On Mon, 20 Feb 2023 08:49:38 GMT, Volker Simonis  wrote:

> Sorry, but that's a really weird argument. You can't introduce a new public 
> API and then blame people for using it! If the API is really only intended to 
> support the internal implementation of a language feature than make it 
> private and don't export it.

The spec states:

> API Note:
>   These linkage methods are designed to support the evaluation of lambda 
> expressions and method references in the Java Language. 

The API has to be public as the call gets injected into application code.

-

PR: https://git.openjdk.org/jdk/pull/12493


Re: RFR: 8302154: Hidden classes created by LambdaMetaFactory can't be unloaded [v2]

2023-02-20 Thread Alan Bateman
On Mon, 20 Feb 2023 09:34:17 GMT, David Holmes  wrote:

> You can't introduce a new public API and then blame people for using it! If 
> the API is really only intended to support the internal implementation of a 
> language feature than make it private and don't export it.

There are a number of APIs that exist to support the Java Language. 
LambdaMetafactory makes it clear in the first paragraph of its API docs that it 
supports the lambda expression and method reference expression features of the 
Java Language. If a Java compiler compiles these language constructs to 
invokedynamic instructions then the bootstrap methods have to be standard APIs, 
otherwise the class files won't work on all VM implementations. The APIs adding 
to j.l.runtime are similar. Maybe some advance tooling/framework might use them 
directly but it should be rare.

-

PR: https://git.openjdk.org/jdk/pull/12493


Re: RFR: 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not thread safe

2023-02-20 Thread David Holmes
On Sun, 19 Feb 2023 18:41:18 GMT, liach  wrote:

> 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not 
> thread safe

The field needs to be volatile for these construction races to be thread-safe, 
otherwise no guarantee that seeing a non-null genericInfo will mean you see any 
writes done by the factory methods.

-

PR: https://git.openjdk.org/jdk/pull/12643


Re: RFR: 8301552: Use AtomicReferenceArray for caching instead of CHM in ZoneOffset [v8]

2023-02-20 Thread Per Minborg
On Thu, 9 Feb 2023 13:46:14 GMT, Per Minborg  wrote:

>> `ZoneOffset` instances are cached by the `ZoneOffset` class itself for 
>> values in the range [-18h, 18h] for each second that is on an even quarter 
>> of an hour (i.e. at most 2*18*4+1 = 145 values). 
>> 
>> Instead of using a `ConcurrentHashMap` for caching instanced, we could 
>> instead use an `AtomicReferenceArray` with direct slot value access for said 
>> even seconds. This will improve performance and reduce the number of object 
>> even though the backing array will go from an initial 32 in the CHM to an 
>> initial/final 145 in the ARA. The CHM will contain much more objects and 
>> array slots for typical numbers of entries in the cache and will compute 
>> hash/bucket/collision on the hot code path for each cache access.
>
> Per Minborg has updated the pull request incrementally with three additional 
> commits since the last revision:
> 
>  - Remove unused setup method
>  - Rename method in test
>  - Add copyright header

Other potential candidates for using `LazyReferenceArray`:

* java.nio.charset.CoderResult
* ZoneRule
* Locale
* sun.font.FileFontStrike, PhysicalStrike
* sun.security.Token
* sun.security.provider.ParameterCache
* sun.security.ssl.SSLSessionContextImpl
* sun.util.locale.provider.LocaleProviderAdapter

NOTE: The one mentioned above are candidates and further investigation is 
needed to see if they are actually a fit.

-

PR: https://git.openjdk.org/jdk/pull/12346


Re: RFR: JDK-8302800: Augment NaN handling tests of FDLIBM methods

2023-02-20 Thread Raffaello Giulietti
On Sun, 19 Feb 2023 21:30:19 GMT, Joe Darcy  wrote:

> Augment NaN-handling tests to include exotic NaN bit patterns. Assuming this 
> changeset goes back first, I'll update 
> https://github.com/openjdk/jdk/pull/12608 to follow the same convention 
> afterward.

Otherwise LGTM

test/jdk/java/lang/Math/Tests.java line 41:

> 39: private Tests(){}; // do not instantiate
> 40: 
> 41: static volatile double zero = 0.0;

Not directly related to this PR and just out of curiosity: is there a reason 
for this to be `volatile` and non-`final`?

-

PR: https://git.openjdk.org/jdk/pull/12644


Re: RFR: JDK-8302666: Replace CHM with VarHandle in ForeachOrderedTask [v2]

2023-02-20 Thread Viktor Klang
On Sat, 18 Feb 2023 02:40:01 GMT, ExE Boss  wrote:

>> Viktor Klang has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Write the initial value of the next reference without using the VarHandle
>
> src/java.base/share/classes/java/util/stream/ForEachOps.java line 515:
> 
>> 513: //
>> 514: @SuppressWarnings("unchecked")
>> 515: var leftDescendant = (ForEachOrderedTask> T>)NEXT.getAndSet(this, null);
> 
> I don’t think that this needs a `@SuppressWarnings("unchecked")`, as casts of 
> signature polymorphic return values don’t emit unchecked warnings (unless 
> that changed recently):
> Suggestion:
> 
> var leftDescendant = (ForEachOrderedTask) 
> NEXT.getAndSet(this, (ForEachOrderedTask) null);

Thanks for having a look at the PR, you're right, the suppression isn't needed.

-

PR: https://git.openjdk.org/jdk/pull/12320


Re: RFR: JDK-8302666: Replace CHM with VarHandle in ForeachOrderedTask [v3]

2023-02-20 Thread Viktor Klang
> I noticed when looking at the code that there was no real need to use a CHM 
> to perform the tracking of activation in an ordered fashion on 
> ForEachOrderedTask, but instead a VarHandle can be used, reducing allocations 
> and indirection.

Viktor Klang has updated the pull request incrementally with one additional 
commit since the last revision:

  Updating copyright header of ForEachOps.java and removing unnecessary 
suppression of an unchecked cast.

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/12320/files
  - new: https://git.openjdk.org/jdk/pull/12320/files/3b38f942..866d5a39

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12320&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12320&range=01-02

  Stats: 4 lines in 1 file changed: 0 ins; 2 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/12320.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12320/head:pull/12320

PR: https://git.openjdk.org/jdk/pull/12320


Re: RFR: JDK-8302666: Replace CHM with VarHandle in ForeachOrderedTask [v2]

2023-02-20 Thread Viktor Klang
On Fri, 17 Feb 2023 16:48:27 GMT, Paul Sandoz  wrote:

>> Viktor Klang has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Write the initial value of the next reference without using the VarHandle
>
> That's a nice find, looks good. (Update the year in the copyright header.)

@PaulSandoz Thanks for the review, Paul—I've eliminated an unneeded 
SuppressWarnings, updated the copyright year, and pushed the update here. 👍

-

PR: https://git.openjdk.org/jdk/pull/12320


Re: RFR: 8301627: System.exit and Runtime.exit debug logging [v6]

2023-02-20 Thread Chris Hegarty
On Fri, 17 Feb 2023 17:27:50 GMT, Roger Riggs  wrote:

>> It can be difficult to find the cause of calls to 
>> `java.lang.System.exit(status)` and `Runtime.exit(status)` because the Java 
>> runtime exits.
>> The status value and stack trace are logged using the System Logger named 
>> `java.lang.Runtime` with message level `System.Logger.Level.DEBUG`.
>
> Roger Riggs has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains eight additional 
> commits since the last revision:
> 
>  - Move test to the java/lang/RuntimeTests directory.
>Added implNote to System.exit.
>A few javadoc updates for review comments.
>  - Merge branch 'master' into 8301627-log-system-exit
>  - Improve implNote for Runtime.exit() with review suggestions.
>  - Correct System.getLogger link
>  - Add an @implNote to Runtime.exit to describe the java.lang.Runtime logging.
>  - Added try/catch around lookup of logger so exceptions do not prevent 
> System.exit.
>Added test case with console logger (when java.util.logging) not present.
>Removed @implNote tag its not appropriate in implementation javadoc.
>Still looking into when and where the log configuration should be 
> described.
>  - Locate the System logger before taking the shutdown lock
>  - Add logging of calls to Runtime.exit to the system logger 
> "java.lang.Runtime".

I left one trivial comment relating to a typo, but otherwise LGTM.

src/java.base/share/classes/java/lang/Runtime.java line 160:

> 158:  *
> 159:  * @implNote
> 160:  * If the {@linkplain System#getLogger(String) the system logger} 
> for {@code java.lang.Runtime}

Trivially, remove the double "the the" here.

-

Marked as reviewed by chegar (Reviewer).

PR: https://git.openjdk.org/jdk/pull/12517


Re: String encodeASCII

2023-02-20 Thread Claes Redestad
Unsafe might be similarly tricky dependency-wise, but perhaps less so. Either 
solution might work fine if we extract the code for encoding to a utility class 
that isn’t initialized eagerly with String.class itself.

I’ll file an RFE and get the ”trivial” fix to use StringCoding.countPositives 
in to establish a better baseline. I’m sure there are ways to beat this, for 
example with a fused loop. An unrolled and/or VH-based solution like you’ve 
proposed could at the very least be used to speedup the case of Strings for 
which we need to replace with ’?’.

/Claes

19 feb. 2023 kl. 02:54 skrev Brett Okken 
mailto:brett.okken...@gmail.com>>:

Thanks for taking a look at this.
That is a pretty good improvement with a much smaller change.
I recognize the intricacies of bootstrapping, but have no expertise. Would 
using Unsafe directly be easier? Particularly for shorter strings, doing the 
copying and checking together rather than as separate loops seems to speed 
things up considerably, even for happy-path of no failures.

Brett

On Sat, Feb 18, 2023 at 5:49 PM Claes Redestad 
mailto:claes.redes...@oracle.com>> wrote:
Hi,

general comment: String might be one of the trickier places to add a VarHandle 
dependency, since String is used very early in the bootstrap and depended upon 
by everything else, so I’d expect such a solution would have to navigate 
potential circularity issues carefully. It’d be good to experiment with changes 
to java.lang.String proper to see that the solution that works nice externally 
is or can be made feasible within String.

Specifically on the performance opportunity then while US-ASCII encoding is 
probably on the way out we shouldn’t neglect it.

One way to go about this without pulling VarHandles into String might be to use 
what other encode methods in String does and leverage 
StringCoding.countPositives:

https://github.com/openjdk/jdk/pull/12640

Testing this on the existing StringEncode microbenchmark, shows a promising 
speed-up when the input is ASCII-encodable:

Baseline
Benchmark  (charsetName)  Mode  Cnt  Score Error  
Units
StringEncode.encodeAsciiLongUS-ASCII  avgt5  26626,025 ± 448,307  
ns/op
StringEncode.encodeAsciiShort   US-ASCII  avgt5 33,336 ±   2,032  
ns/op

Patch:
Benchmark  (charsetName)  Mode  Cnt ScoreError  
Units
StringEncode.encodeAsciiLongUS-ASCII  avgt5  5492,985 ± 40,066  
ns/op
StringEncode.encodeAsciiShort   US-ASCII  avgt528,545 ±  4,883  
ns/op

(You might see that this will go back into a scalar loop on encoding failures. 
That loop could still benefit from unrolling or byteArrayViewVarHandle, but I 
think you have a bigger problem in an app than raw performance if you have a 
lot of encoding failures...)

WDYT?

/Claes

18 feb. 2023 kl. 19:36 skrev Brett Okken 
mailto:brett.okken...@gmail.com>>:

https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/String.java#L976-L981

For String.encodeASCII, with the LATIN1 coder is there any interest in
exploring the performance impacts of utilizing a
byteArrayViewVarHandle to read/write as longs and utilize a bitmask to
identify if negative values are present?

A simple jmh benchmark covering either 0 or 1 non-ascii (negative)
values shows times cut in half (or more) for most scenarios with
strings ranging in length from 3 - ~2000.
VM version: JDK 17.0.6, OpenJDK 64-Bit Server VM, 17.0.6+10
Windows 10 Intel(R) Core(TM) i7-9850H

Hand unrolling the loops shows noted improvement, but does make for
less aesthetically pleasing code.


Benchmark  (nonascii)  (size)  Mode
CntScore Error  Units
AsciiEncodeBenchmark.jdk none   3  avgt
4   15.531 ±   1.122  ns/op
AsciiEncodeBenchmark.jdk none  10  avgt
4   17.350 ±   0.473  ns/op
AsciiEncodeBenchmark.jdk none  16  avgt
4   18.277 ±   0.421  ns/op
AsciiEncodeBenchmark.jdk none  23  avgt
4   20.139 ±   0.350  ns/op
AsciiEncodeBenchmark.jdk none  33  avgt
4   22.008 ±   0.656  ns/op
AsciiEncodeBenchmark.jdk none  42  avgt
4   24.393 ±   1.155  ns/op
AsciiEncodeBenchmark.jdk none 201  avgt
4   55.884 ±   0.645  ns/op
AsciiEncodeBenchmark.jdk none 511  avgt
4  120.817 ±   2.917  ns/op
AsciiEncodeBenchmark.jdk none2087  avgt
4 

Re: RFR: 8302863: Speed up String::encodeASCII using countPositives

2023-02-20 Thread David Schlosnagle
On Sat, 18 Feb 2023 23:26:08 GMT, Claes Redestad  wrote:

> When encoding Strings to US-ASCII we can speed up the happy path 
> significantly by using `StringCoding.countPositives` as a speculative check 
> for whether there are any chars that needs to be replaced by `'?'`. Once a 
> non-ASCII char is encountered we fall back to the slow loop and replace as 
> needed.
> 
> An alternative could be unrolling or using a byte array VarHandle, as 
> show-cased by Brett Okken here: 
> https://mail.openjdk.org/pipermail/core-libs-dev/2023-February/100573.html 
> Having to replace chars with `?` is essentially an encoding error so it might 
> be safe to assume this case is exceptional in practice.

src/java.base/share/classes/java/lang/String.java line 976:

> 974: private static byte[] encodeASCII(byte coder, byte[] val) {
> 975: if (coder == LATIN1) {
> 976: byte[] dst = Arrays.copyOf(val, val.length);

Given the tweaks in https://git.openjdk.org/jdk/pull/12613 should this use 
`val.clone()` (would skip the length check)

Suggestion:

byte[] dst = val.clone();

src/java.base/share/classes/java/lang/String.java line 982:

> 980: if (dst[i] < 0) {
> 981: dst[i] = '?';
> 982: }

I'm curious if using countPositives (and vectorization) to scan forward would 
be valuable for long (mostly ASCII) strings or if the method call 
overhead/non-constant stride is not a win for shorter strings or heavily 
non-ascii inputs. 

Suggestion:

for (int i = positives; i < dst.length; i = 
StringCoding.countPositives(dst, i + 1, dst.length - i);) {
if (dst[i] < 0) {
dst[i] = '?';
}

-

PR: https://git.openjdk.org/jdk/pull/12640


Re: RFR: 8302863: Speed up String::encodeASCII using countPositives

2023-02-20 Thread Claes Redestad
On Sun, 19 Feb 2023 07:24:30 GMT, David Schlosnagle  wrote:

>> When encoding Strings to US-ASCII we can speed up the happy path 
>> significantly by using `StringCoding.countPositives` as a speculative check 
>> for whether there are any chars that needs to be replaced by `'?'`. Once a 
>> non-ASCII char is encountered we fall back to the slow loop and replace as 
>> needed.
>> 
>> An alternative could be unrolling or using a byte array VarHandle, as 
>> show-cased by Brett Okken here: 
>> https://mail.openjdk.org/pipermail/core-libs-dev/2023-February/100573.html 
>> Having to replace chars with `?` is essentially an encoding error so it 
>> might be safe to assume this case is exceptional in practice.
>
> src/java.base/share/classes/java/lang/String.java line 976:
> 
>> 974: private static byte[] encodeASCII(byte coder, byte[] val) {
>> 975: if (coder == LATIN1) {
>> 976: byte[] dst = Arrays.copyOf(val, val.length);
> 
> Given the tweaks in https://git.openjdk.org/jdk/pull/12613 should this use 
> `val.clone()` (would skip the length check)
> 
> Suggestion:
> 
> byte[] dst = val.clone();

Yeah, probably makes sense. On that note I found that `val.clone()` 
underperform `Arrays.copyOf(val, val.length)` in C1 compiled code: 
https://bugs.openjdk.org/browse/JDK-8302850 - while this shouldn't affect peak 
performance it might be cause for a warmup regression.

> src/java.base/share/classes/java/lang/String.java line 982:
> 
>> 980: if (dst[i] < 0) {
>> 981: dst[i] = '?';
>> 982: }
> 
> I'm curious if using countPositives (and vectorization) to scan forward would 
> be valuable for long (mostly ASCII) strings or if the method call 
> overhead/non-constant stride is not a win for shorter strings or heavily 
> non-ascii inputs. 
> 
> Suggestion:
> 
> for (int i = positives; i < dst.length; i = 
> StringCoding.countPositives(dst, i + 1, dst.length - i);) {
> if (dst[i] < 0) {
> dst[i] = '?';
> }

There's some overhead doing countPositives calls, and doing it in a loop might 
provoke poor worst case performance. Im sure you can find inputs for which this 
is a win, though.

-

PR: https://git.openjdk.org/jdk/pull/12640


RFR: 8302863: Speed up String::encodeASCII using countPositives

2023-02-20 Thread Claes Redestad
When encoding Strings to US-ASCII we can speed up the happy path significantly 
by using `StringCoding.countPositives` as a speculative check for whether there 
are any chars that needs to be replaced by `'?'`. Once a non-ASCII char is 
encountered we fall back to the slow loop and replace as needed.

An alternative could be unrolling or using a byte array VarHandle, as 
show-cased by Brett Okken here: 
https://mail.openjdk.org/pipermail/core-libs-dev/2023-February/100573.html 
Having to replace chars with `?` is essentially an encoding error so it might 
be safe to assume this case is exceptional in practice.

-

Commit messages:
 - Cleanup, clone
 - Merge branch 'master' into encodeASCII
 - Improve encodeASCII by leveraging countPositives

Changes: https://git.openjdk.org/jdk/pull/12640/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12640&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8302863
  Stats: 24 lines in 1 file changed: 11 ins; 2 del; 11 mod
  Patch: https://git.openjdk.org/jdk/pull/12640.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12640/head:pull/12640

PR: https://git.openjdk.org/jdk/pull/12640


Re: RFR: 8302863: Speed up String::encodeASCII using countPositives

2023-02-20 Thread Claes Redestad
On Sat, 18 Feb 2023 23:26:08 GMT, Claes Redestad  wrote:

> When encoding Strings to US-ASCII we can speed up the happy path 
> significantly by using `StringCoding.countPositives` as a speculative check 
> for whether there are any chars that needs to be replaced by `'?'`. Once a 
> non-ASCII char is encountered we fall back to the slow loop and replace as 
> needed.
> 
> An alternative could be unrolling or using a byte array VarHandle, as 
> show-cased by Brett Okken here: 
> https://mail.openjdk.org/pipermail/core-libs-dev/2023-February/100573.html 
> Having to replace chars with `?` is essentially an encoding error so it might 
> be safe to assume this case is exceptional in practice.

Baseline:

Benchmark  (charsetName)  Mode  Cnt  Score Error  
Units
StringEncode.encodeAsciiLongUS-ASCII  avgt5  26626,025 ± 448,307  
ns/op
StringEncode.encodeAsciiShort   US-ASCII  avgt5 33,336 ±   2,032  
ns/op


Patch:

Benchmark  (charsetName)  Mode  Cnt ScoreError  
Units
StringEncode.encodeAsciiLongUS-ASCII  avgt5  5492,985 ± 40,066  
ns/op
StringEncode.encodeAsciiShort   US-ASCII  avgt528,545 ±  4,883  
ns/op

-

PR: https://git.openjdk.org/jdk/pull/12640


RFR: 8302818: Optimize wrapper sets and immutable sets of Enums

2023-02-20 Thread Tingjun Yuan
Currently, the two subclasses of `java.util.EnumSet` optimize bulk operations 
when the argument is also a `EnumSet`, but there is no such optimization for 
wrapper sets (returned by `Collections.unmodifiableSet`, 
`Collections.synchronizedSet`, etc.) and immutable sets (returned by `Set.of` 
methods) of `Enum`s.

This PR introduces optimization classes for these situations.  No public APIs 
are changed.

-

Commit messages:
 - add `import` in SetFactories.java
 - Fix typo
 - Update SetFactories.java
 - make configure not executable
 - Update Collections.java
 - Revert "Merge remote-tracking branch 'origin/optimize-enum-coll' into 
optimize-enum-coll"
 - Merge remote-tracking branch 'origin/optimize-enum-coll' into 
optimize-enum-coll
 - Check for unmodifiable sets in `unmodifiableSet`
 - fix error
 - add tests for Set.of factories
 - ... and 12 more: https://git.openjdk.org/jdk/compare/e4d1cff6...7f704c35

Changes: https://git.openjdk.org/jdk/pull/12498/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12498&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8302818
  Stats: 723 lines in 8 files changed: 674 ins; 0 del; 49 mod
  Patch: https://git.openjdk.org/jdk/pull/12498.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12498/head:pull/12498

PR: https://git.openjdk.org/jdk/pull/12498


Re: RFR: 8302818: Optimize wrapper sets and immutable sets of Enums

2023-02-20 Thread liach
On Thu, 9 Feb 2023 16:20:31 GMT, Tingjun Yuan  wrote:

> Currently, the two subclasses of `java.util.EnumSet` optimize bulk operations 
> when the argument is also a `EnumSet`, but there is no such optimization for 
> wrapper sets (returned by `Collections.unmodifiableSet`, 
> `Collections.synchronizedSet`, etc.) and immutable sets (returned by `Set.of` 
> methods) of `Enum`s.
> 
> This PR introduces optimization classes for these situations.  No public APIs 
> are changed.

Just curious, how does this enum set change affect the performance of 
construction of regular sets via Set.of calls?

On the JBS there's https://bugs.openjdk.org/browse/JDK-8145048 which appears to 
fit your changes.

Just took a peek again. Besides the critical error of not throwing IAE for 
duplicate inputs (which you should probably add new test cases for), I think 
the 2 implementations share the same `toArray` `hashCode`, and the `contains` 
logic can be pulled to call an abstract `contains(int ordinal)` instead.

Have you submitted an issue at https://bugreport.java.com/bugreport/ ? It will 
show up a few days later on the JBS. If not, I can create an issue on the JBS 
for you too as long as you provide a prompt.

Created as https://bugs.openjdk.org/browse/JDK-8302818

I strongly recommend you to add tests of pure-enum and mixed-enum immutable set 
creation to ensure your code is logically correct. You seems to be developing 
the JDK in some weird way as you upload changes directly to GitHub Web UI.
I recommend developing according to the docs: see 
[IDE](https://github.com/openjdk/jdk/blob/master/doc/ide.md) and 
[building](https://github.com/openjdk/jdk/blob/master/doc/building.md) docs.

Also, try set your console's code page to 437 (US-ASCII), as some non-English 
consoles have caused build failures for me before. If you need help, you can 
leave this PR as a work in progress (so we don't spam the openjdk mailing 
lists) and I can answer your questions about the setup.

src/java.base/share/classes/java/util/Collections.java line 1153:

> 1151: return (Set) s;
> 1152: }
> 1153: if (s instanceof RegularEnumSetCompatible) {

Why not use `if (s instanceof RegularEnumSetCompatible es)` instead?

src/java.base/share/classes/java/util/Collections.java line 1194:

> 1192: private static final long serialVersionUID = 
> -1110577510253015312L;
> 1193: 
> 1194: final RegularEnumSetCompatible es;

Can't you just use `` as the type argument instead, for all occurrences? Can 
avoid a cast in `elementType()` too, and it's compatible argument to superclass 
constructor.

src/java.base/share/classes/java/util/Collections.java line 2409:

> 2407: public long[] elements() {
> 2408: synchronized (mutex) {
> 2409: return es.elements();

Is this really safe? `elements()` in `JumboEnumSet` returns the backing array, 
so this returned array is mutable. Should perform `es.elements().clone()` for 
more safety.

src/java.base/share/classes/java/util/ImmutableCollections.java line 904:

> 902: @SuppressWarnings({"varargs", "unchecked", "rawtypes"})
> 903: static  Set setFromArray(E... input) {
> 904: if (input.length == 0) {

This special case can be removed, as no caller calls with an array with length 
<= 2

src/java.base/share/classes/java/util/ImmutableCollections.java line 913:

> 911: try {
> 912: // Sorry, I have to use a raw type here.
> 913: es = EnumSet.copyOf((Collection)Arrays.asList(input));  // 
> rejects null

This fails to throw `IllegalArgumentException` on duplicated elements.

src/java.base/share/classes/java/util/ImmutableCollections.java line 1131:

> 1129: long lastReturned = 0;
> 1130: final Enum[] universe
> 1131: = 
> SharedSecrets.getJavaLangAccess().getEnumConstantsShared(elementType);

I think the convention is to put `SharedSecrets.getJavaLangAccess()` instance 
in a static final field like `JLA` than calling it every time

src/java.base/share/classes/java/util/ImmutableCollections.java line 1178:

> 1176: return (es.elements() & ~elements) == 0;
> 1177: } else {
> 1178: for (Object o : c) {

Can just call super here

src/java.base/share/classes/java/util/JumboEnumSetCompatible.java line 38:

> 36: ImmutableCollections.ImmutableJumboEnumSet {
> 37: Class elementType();
> 38: long[] elements();

Since long arrays are mutable (unlike the `long elements()` in regular set), 
might add a note about this, as enum sets never throws CMEs

-

PR: https://git.openjdk.org/jdk/pull/12498


Re: RFR: 8302818: Optimize wrapper sets and immutable sets of Enums

2023-02-20 Thread Tingjun Yuan
On Thu, 9 Feb 2023 19:37:03 GMT, liach  wrote:

> Just curious, how does this enum set change affect the performance of 
> construction of regular sets via Set.of calls?

See `ImmutableCollections.setFromArray`, it relies on the fact that 
`EnumSet.copyOf` will throw a `ClassCastException` if the elements are not enum 
constants of a same enum class.  Trying to catch an exception is a bit 
expensive (so maybe I can optimize that).

> On the JBS there's https://bugs.openjdk.org/browse/JDK-8145048 which appears 
> to fit your changes.

I only enhanced `EnumSet`, but not `EnumMap` (yet) so I don't think it fits my 
changes.

> Just took a peek again. Besides the critical error of not throwing IAE for 
> duplicate inputs (which you should probably add new test cases for), I think 
> the 2 implementations share the same `toArray` `hashCode`, and the `contains` 
> logic can be pulled to call an abstract `contains(int ordinal)` instead.

@liach Thanks for reviewing. I've fixed the issues you've mentioned.

> Have you submitted an issue at https://bugreport.java.com/bugreport/ ? It 
> will show up a few days later on the JBS. If not, I can create an issue on 
> the JBS for you too as long as you provide a prompt.

No I haven't. Submit one for me please. Thanks!

-

PR: https://git.openjdk.org/jdk/pull/12498


Re: RFR: 8302818: Optimize wrapper sets and immutable sets of Enums

2023-02-20 Thread Tingjun Yuan
On Thu, 9 Feb 2023 16:20:31 GMT, Tingjun Yuan  wrote:

> Currently, the two subclasses of `java.util.EnumSet` optimize bulk operations 
> when the argument is also a `EnumSet`, but there is no such optimization for 
> wrapper sets (returned by `Collections.unmodifiableSet`, 
> `Collections.synchronizedSet`, etc.) and immutable sets (returned by `Set.of` 
> methods) of `Enum`s.
> 
> This PR introduces optimization classes for these situations.  No public APIs 
> are changed.

Oops, my bad. Please ignore the merge warning.

I've just add several tests for `Set.of` factories.

By the way, I've just built the JDK successfully on my Mac, and the tests run 
successfully too. Let me know if it fails on other operating systems.

-

PR: https://git.openjdk.org/jdk/pull/12498


Re: RFR: 8302818: Optimize wrapper sets and immutable sets of Enums

2023-02-20 Thread ExE Boss
On Thu, 9 Feb 2023 16:20:31 GMT, Tingjun Yuan  wrote:

> Currently, the two subclasses of `java.util.EnumSet` optimize bulk operations 
> when the argument is also a `EnumSet`, but there is no such optimization for 
> wrapper sets (returned by `Collections.unmodifiableSet`, 
> `Collections.synchronizedSet`, etc.) and immutable sets (returned by `Set.of` 
> methods) of `Enum`s.
> 
> This PR introduces optimization classes for these situations.  No public APIs 
> are changed.

src/java.base/share/classes/java/util/Collections.java line 1152:

> 1150: if (s.getClass() == UnmodifiableSet.class) {
> 1151: return (Set) s;
> 1152: }

To avoid re‑wrapping already unmodifiable `EnumSet`s, this should also check 
for those:
Suggestion:

if (s.getClass() == UnmodifiableSet.class
|| s.getClass() == UnmodifiableRegularEnumSet.class
|| s.getClass() == UnmodifiableJumboEnumSet.class) {
return (Set) s;
}

src/java.base/share/classes/java/util/Collections.java line 1158:

> 1156: if (s instanceof JumboEnumSetCompatible) {
> 1157: return (Set)new 
> UnmodifiableJumboEnumSet<>((JumboEnumSetCompatible)s);
> 1158: }

Suggestion:

if (s.getClass() == UnmodifiableSet.class
|| s.getClass() == UnmodifiableRegularEnumSet.class
|| s.getClass() == UnmodifiableJumboEnumSet.class) {
return (Set) s;
}
if (s instanceof RegularEnumSetCompatible es) {
return (Set) new UnmodifiableRegularEnumSet<>(es);
}
if (s instanceof JumboEnumSetCompatible es) {
return (Set) new UnmodifiableJumboEnumSet<>(es);
}

-

PR: https://git.openjdk.org/jdk/pull/12498


Re: Speed up StringLatin1.regionMatchesCI_UTF16

2023-02-20 Thread Claes Redestad
RFE filed: https://bugs.openjdk.org/browse/JDK-8302872

/Claes

18 feb. 2023 kl. 19:58 skrev Eirik Bjørsnøs 
mailto:eir...@gmail.com>>:

Hi,

This PR continues the effort to speed up case-insensitive string comparisons, 
this time tackling comparison of latin1-coded strings with utf16-coded strings:

https://github.com/openjdk/jdk/pull/12637

This builds on top of #12632, it makes sense to review that one first.

Thanks,
Eirik.



Re: Speed up StringLatin1.regionMatchesCI

2023-02-20 Thread Claes Redestad
RFE filed: https://bugs.openjdk.org/browse/JDK-8302871

/Claes

18 feb. 2023 kl. 10:28 skrev Eirik Bjørsnøs 
mailto:eir...@gmail.com>>:

Hi,

The following PR suggests we can speed up StringLatin1.regionMatchesCI by 
applying 'the oldest ASCII trick in the book':

https://github.com/openjdk/jdk/pull/12632

Thanks,
Eirik.



Re: RFR: 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not thread safe

2023-02-20 Thread liach
On Sun, 19 Feb 2023 18:41:18 GMT, liach  wrote:

> 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not 
> thread safe

We don't fear calling the factory twice for benign races, as the distinct 
constructor factory instances are behaviorally the same.

The true issue lies in the double getfield operations: Java memory model 
doesn't require the second read to happen-after a write reflected in the first 
read, so return this.genericInfo may return null while this.genericInfo == null 
evaluates to false, in case genericInfo is initialized lazily by another 
thread. See https://bugs.openjdk.org/browse/JDK-8261404

-

PR: https://git.openjdk.org/jdk/pull/12643


Re: RFR: 8302818: Optimize wrapper sets and immutable sets of Enums [v2]

2023-02-20 Thread Tingjun Yuan
> Currently, the two subclasses of `java.util.EnumSet` optimize bulk operations 
> when the argument is also a `EnumSet`, but there is no such optimization for 
> wrapper sets (returned by `Collections.unmodifiableSet`, 
> `Collections.synchronizedSet`, etc.) and immutable sets (returned by `Set.of` 
> methods) of `Enum`s.
> 
> This PR introduces optimization classes for these situations.  No public APIs 
> are changed.

Tingjun Yuan has updated the pull request incrementally with one additional 
commit since the last revision:

  Optimize `EnumSet.copyOf` and `Set.copyOf`

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/12498/files
  - new: https://git.openjdk.org/jdk/pull/12498/files/7f704c35..41f4a909

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12498&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12498&range=00-01

  Stats: 27 lines in 2 files changed: 22 ins; 2 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/12498.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12498/head:pull/12498

PR: https://git.openjdk.org/jdk/pull/12498


RFR: 8302871: Speed up StringLatin1.regionMatchesCI

2023-02-20 Thread Eirik Bjorsnos
This PR suggests we can speed up `StringLatin1.regionMatchesCI` by applying 
'the oldest ASCII trick in the book'.

The new static method `CharacterDataLatin1.equalsIgnoreCase` compares two 
latin1 bytes for equality ignoring case. `StringLatin1.regionMatchesCI` is 
updated to use `equalsIgnoreCase`

To verify the correctness of `equalsIgnoreCase`, a new test is added  to 
`EqualsIgnoreCase` with an exhaustive verification that all 256x256 latin1 code 
point pairs have an `equalsIgnoreCase` consistent with Character.toUpperCase, 
Character.toLowerCase.

Performance is tested for matching and mismatching cases of code point pairs 
picked from the ASCII letter, ASCII number and latin1 letter ranges. Results in 
the first comment below.

-

Commit messages:
 - Exhaustive verification needs to cover the case b1 == b2
 - Move multiplication exclusion to the lat1 range branch
 - Speed up StringLatin1.regionMatchesCI by applying the 'oldest ASCII trick in 
the book'

Changes: https://git.openjdk.org/jdk/pull/12632/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12632&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8302871
  Stats: 158 lines in 4 files changed: 148 ins; 5 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/12632.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12632/head:pull/12632

PR: https://git.openjdk.org/jdk/pull/12632


Re: RFR: 8302871: Speed up StringLatin1.regionMatchesCI

2023-02-20 Thread Eirik Bjorsnos
On Sat, 18 Feb 2023 09:21:25 GMT, Eirik Bjorsnos  wrote:

> This PR suggests we can speed up `StringLatin1.regionMatchesCI` by applying 
> 'the oldest ASCII trick in the book'.
> 
> The new static method `CharacterDataLatin1.equalsIgnoreCase` compares two 
> latin1 bytes for equality ignoring case. `StringLatin1.regionMatchesCI` is 
> updated to use `equalsIgnoreCase`
> 
> To verify the correctness of `equalsIgnoreCase`, a new test is added  to 
> `EqualsIgnoreCase` with an exhaustive verification that all 256x256 latin1 
> code point pairs have an `equalsIgnoreCase` consistent with 
> Character.toUpperCase, Character.toLowerCase.
> 
> Performance is tested for matching and mismatching cases of code point pairs 
> picked from the ASCII letter, ASCII number and latin1 letter ranges. Results 
> in the first comment below.

Benchmark results:

Baseline:


Benchmark  (codePoints)  (size)  Mode  Cnt 
ScoreError  Units
RegionMatchesIC.Latin1.regionMatchesIC  ascii-match1024  avgt   15  
2216.525 ± 79.626  ns/op
RegionMatchesIC.Latin1.regionMatchesIC   ascii-mismatch1024  avgt   15 
5.049 ±  0.044  ns/op
RegionMatchesIC.Latin1.regionMatchesIC number-match1024  avgt   15   
708.977 ± 19.381  ns/op
RegionMatchesIC.Latin1.regionMatchesIC  number-mismatch1024  avgt   15 
3.726 ±  0.036  ns/op
RegionMatchesIC.Latin1.regionMatchesIC   lat1-match1024  avgt   15  
2134.499 ± 23.064  ns/op
RegionMatchesIC.Latin1.regionMatchesIClat1-mismatch1024  avgt   15 
4.227 ±  0.070  ns/op


Patch:


Benchmark  (codePoints)  (size)  Mode  Cnt 
ScoreError  Units
RegionMatchesIC.Latin1.regionMatchesIC  ascii-match1024  avgt   15   
809.729 ± 40.257  ns/op
RegionMatchesIC.Latin1.regionMatchesIC   ascii-mismatch1024  avgt   15 
4.334 ±  0.031  ns/op
RegionMatchesIC.Latin1.regionMatchesIC number-match1024  avgt   15   
370.814 ± 39.790  ns/op
RegionMatchesIC.Latin1.regionMatchesIC  number-mismatch1024  avgt   15 
3.766 ±  0.072  ns/op
RegionMatchesIC.Latin1.regionMatchesIC   lat1-match1024  avgt   15  
1247.979 ±  7.826  ns/op
RegionMatchesIC.Latin1.regionMatchesIClat1-mismatch1024  avgt   15 
4.819 ±  0.026  ns/op

-

PR: https://git.openjdk.org/jdk/pull/12632


Re: RFR: 8302818: Optimize wrapper sets and immutable sets of Enums [v2]

2023-02-20 Thread liach
On Mon, 20 Feb 2023 13:15:03 GMT, Tingjun Yuan  wrote:

>> Currently, the two subclasses of `java.util.EnumSet` optimize bulk 
>> operations when the argument is also a `EnumSet`, but there is no such 
>> optimization for wrapper sets (returned by `Collections.unmodifiableSet`, 
>> `Collections.synchronizedSet`, etc.) and immutable sets (returned by 
>> `Set.of` methods) of `Enum`s.
>> 
>> This PR introduces optimization classes for these situations.  No public 
>> APIs are changed.
>
> Tingjun Yuan has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Optimize `EnumSet.copyOf` and `Set.copyOf`

src/java.base/share/classes/java/util/EnumSet.java line 186:

> 184: } else {
> 185: while (i.hasNext())
> 186: result.add(i.next());

Can't we just use addAll in all cases?

src/java.base/share/classes/java/util/Set.java line 742:

> 740: E e1 = it.next();
> 741: if (!it.hasNext()) {
> 742: return Set.of(e0, e1);

Bad change, doesn't handle e0.equals(e1), and this is getting beyond the 
original issue

-

PR: https://git.openjdk.org/jdk/pull/12498


Re: RFR: 8302871: Speed up StringLatin1.regionMatchesCI

2023-02-20 Thread David Schlosnagle
On Sat, 18 Feb 2023 09:21:25 GMT, Eirik Bjorsnos  wrote:

> This PR suggests we can speed up `StringLatin1.regionMatchesCI` by applying 
> 'the oldest ASCII trick in the book'.
> 
> The new static method `CharacterDataLatin1.equalsIgnoreCase` compares two 
> latin1 bytes for equality ignoring case. `StringLatin1.regionMatchesCI` is 
> updated to use `equalsIgnoreCase`
> 
> To verify the correctness of `equalsIgnoreCase`, a new test is added  to 
> `EqualsIgnoreCase` with an exhaustive verification that all 256x256 latin1 
> code point pairs have an `equalsIgnoreCase` consistent with 
> Character.toUpperCase, Character.toLowerCase.
> 
> Performance is tested for matching and mismatching cases of code point pairs 
> picked from the ASCII letter, ASCII number and latin1 letter ranges. Results 
> in the first comment below.

src/java.base/share/classes/java/lang/CharacterDataLatin1.java.template line 
181:

> 179:  return ( U <= 'Z' // In range A-Z
> 180:  || (U >= 0xC0 && U <= 0XDE && U != 0xD7)) // ..or 
> A-grave-Thorn, excl. multiplication
> 181:  && U == (b2 & 0xDF); // b2 has same uppercase

I'm curious if the order of comparisons could alter performance to a small 
degree. For example, it might be interesting to compare various permutations 
like below to short circuit reject unequal uppercased b2

Suggestion:

 // uppercase b1 using 'the oldest ASCII trick in the book'
 int U = b1 & 0xDF;
 return (U == (b2 & 0xDF))
 && ((U >= 'A' && U <= 'Z') // In range A-Z
 || (U >= 0xC0 && U <= 0XDE && U != 0xD7)) // ..or 
A-grave-Thorn, excl. multiplication

-

PR: https://git.openjdk.org/jdk/pull/12632


Re: RFR: 8302871: Speed up StringLatin1.regionMatchesCI

2023-02-20 Thread Eirik Bjorsnos
On Sat, 18 Feb 2023 19:04:24 GMT, David Schlosnagle  wrote:

>> This PR suggests we can speed up `StringLatin1.regionMatchesCI` by applying 
>> 'the oldest ASCII trick in the book'.
>> 
>> The new static method `CharacterDataLatin1.equalsIgnoreCase` compares two 
>> latin1 bytes for equality ignoring case. `StringLatin1.regionMatchesCI` is 
>> updated to use `equalsIgnoreCase`
>> 
>> To verify the correctness of `equalsIgnoreCase`, a new test is added  to 
>> `EqualsIgnoreCase` with an exhaustive verification that all 256x256 latin1 
>> code point pairs have an `equalsIgnoreCase` consistent with 
>> Character.toUpperCase, Character.toLowerCase.
>> 
>> Performance is tested for matching and mismatching cases of code point pairs 
>> picked from the ASCII letter, ASCII number and latin1 letter ranges. Results 
>> in the first comment below.
>
> src/java.base/share/classes/java/lang/CharacterDataLatin1.java.template line 
> 181:
> 
>> 179:  return ( U <= 'Z' // In range A-Z
>> 180:  || (U >= 0xC0 && U <= 0XDE && U != 0xD7)) // ..or 
>> A-grave-Thorn, excl. multiplication
>> 181:  && U == (b2 & 0xDF); // b2 has same uppercase
> 
> I'm curious if the order of comparisons could alter performance to a small 
> degree. For example, it might be interesting to compare various permutations 
> like below to short circuit reject unequal uppercased b2
> 
> Suggestion:
> 
>  // uppercase b1 using 'the oldest ASCII trick in the book'
>  int U = b1 & 0xDF;
>  return (U == (b2 & 0xDF))
>  && ((U >= 'A' && U <= 'Z') // In range A-Z
>  || (U >= 0xC0 && U <= 0XDE && U != 0xD7)) // ..or 
> A-grave-Thorn, excl. multiplication

Yeah, as you noticed this code is tricky and sensitive to the order of 
operations. I did some quite extensive exploration before ending on the current 
structure. This particular one seems to improve rejection somewhat at the cost 
of matches.

Since rejection is relatively speaking already very fast, I think we should 
favour fast matching here.

Results:


enchmark  (codePoints)  (size)  Mode  Cnt 
ScoreError  Units
RegionMatchesIC.Latin1.regionMatchesIC  ascii-match1024  avgt   15   
917.796 ± 20.285  ns/op
RegionMatchesIC.Latin1.regionMatchesIC   ascii-mismatch1024  avgt   15 
4.367 ±  0.348  ns/op
RegionMatchesIC.Latin1.regionMatchesIC number-match1024  avgt   15   
399.656 ± 10.703  ns/op
RegionMatchesIC.Latin1.regionMatchesIC  number-mismatch1024  avgt   15 
4.361 ±  0.664  ns/op
RegionMatchesIC.Latin1.regionMatchesIC   lat1-match1024  avgt   15  
1384.443 ± 22.199  ns/op
RegionMatchesIC.Latin1.regionMatchesIClat1-mismatch1024  avgt   15 
4.119 ±  0.451  ns/op

-

PR: https://git.openjdk.org/jdk/pull/12632


Re: RFR: 8302871: Speed up StringLatin1.regionMatchesCI

2023-02-20 Thread David Schlosnagle
On Sat, 18 Feb 2023 19:45:34 GMT, Eirik Bjorsnos  wrote:

>> src/java.base/share/classes/java/lang/CharacterDataLatin1.java.template line 
>> 181:
>> 
>>> 179:  return ( U <= 'Z' // In range A-Z
>>> 180:  || (U >= 0xC0 && U <= 0XDE && U != 0xD7)) // ..or 
>>> A-grave-Thorn, excl. multiplication
>>> 181:  && U == (b2 & 0xDF); // b2 has same uppercase
>> 
>> I'm curious if the order of comparisons could alter performance to a small 
>> degree. For example, it might be interesting to compare various permutations 
>> like below to short circuit reject unequal uppercased b2
>> 
>> Suggestion:
>> 
>>  // uppercase b1 using 'the oldest ASCII trick in the book'
>>  int U = b1 & 0xDF;
>>  return (U == (b2 & 0xDF))
>>  && ((U >= 'A' && U <= 'Z') // In range A-Z
>>  || (U >= 0xC0 && U <= 0XDE && U != 0xD7)) // ..or 
>> A-grave-Thorn, excl. multiplication
>
> Yeah, as you noticed this code is tricky and sensitive to the order of 
> operations. I did some quite extensive exploration before ending on the 
> current structure. This particular one seems to improve rejection somewhat at 
> the cost of matches.
> 
> Since rejection is relatively speaking already very fast, I think we should 
> favour fast matching here.
> 
> Results:
> 
> 
> enchmark  (codePoints)  (size)  Mode  Cnt 
> ScoreError  Units
> RegionMatchesIC.Latin1.regionMatchesIC  ascii-match1024  avgt   15   
> 917.796 ± 20.285  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC   ascii-mismatch1024  avgt   15
>  4.367 ±  0.348  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC number-match1024  avgt   15   
> 399.656 ± 10.703  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC  number-mismatch1024  avgt   15
>  4.361 ±  0.664  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC   lat1-match1024  avgt   15  
> 1384.443 ± 22.199  ns/op
> RegionMatchesIC.Latin1.regionMatchesIClat1-mismatch1024  avgt   15
>  4.119 ±  0.451  ns/op

Thanks for confirming

-

PR: https://git.openjdk.org/jdk/pull/12632


Re: RFR: 8302818: Optimize wrapper sets and immutable sets of Enums [v2]

2023-02-20 Thread Tingjun Yuan
On Mon, 20 Feb 2023 13:15:20 GMT, liach  wrote:

> Bad change, doesn't handle e0.equals(e1), and this is getting beyond the 
> original issue

`Set.of(E, E)` handles duplication. If `coll` is 
`{Jumbo,Regular}EnumSetCompatible`, then we don't have to create the `HashSet`.

-

PR: https://git.openjdk.org/jdk/pull/12498


Re: RFR: 8302818: Optimize wrapper sets and immutable sets of Enums [v2]

2023-02-20 Thread Tingjun Yuan
On Mon, 20 Feb 2023 13:16:09 GMT, liach  wrote:

>> Tingjun Yuan has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Optimize `EnumSet.copyOf` and `Set.copyOf`
>
> src/java.base/share/classes/java/util/EnumSet.java line 186:
> 
>> 184: } else {
>> 185: while (i.hasNext())
>> 186: result.add(i.next());
> 
> Can't we just use addAll in all cases?

I wonder why the original code didn't use `addAll`, so I didn't change it. 😂

-

PR: https://git.openjdk.org/jdk/pull/12498


Re: RFR: 8302871: Speed up StringLatin1.regionMatchesCI [v2]

2023-02-20 Thread Eirik Bjorsnos
> This PR suggests we can speed up `StringLatin1.regionMatchesCI` by applying 
> 'the oldest ASCII trick in the book'.
> 
> The new static method `CharacterDataLatin1.equalsIgnoreCase` compares two 
> latin1 bytes for equality ignoring case. `StringLatin1.regionMatchesCI` is 
> updated to use `equalsIgnoreCase`
> 
> To verify the correctness of `equalsIgnoreCase`, a new test is added  to 
> `EqualsIgnoreCase` with an exhaustive verification that all 256x256 latin1 
> code point pairs have an `equalsIgnoreCase` consistent with 
> Character.toUpperCase, Character.toLowerCase.
> 
> Performance is tested for matching and mismatching cases of code point pairs 
> picked from the ASCII letter, ASCII number and latin1 letter ranges. Results 
> in the first comment below.

Eirik Bjorsnos has updated the pull request with a new target base due to a 
merge or a rebase. The incremental webrev excludes the unrelated changes 
brought in by the merge/rebase. The pull request contains five additional 
commits since the last revision:

 - Align whitespace to make example strings easier to read
 - Merge branch 'master' into regionmatches-latin1-speedup
 - Exhaustive verification needs to cover the case b1 == b2
 - Move multiplication exclusion to the lat1 range branch
 - Speed up StringLatin1.regionMatchesCI by applying the 'oldest ASCII trick in 
the book'

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/12632/files
  - new: https://git.openjdk.org/jdk/pull/12632/files/a583bcd6..59c42298

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12632&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12632&range=00-01

  Stats: 57495 lines in 1316 files changed: 24512 ins; 15682 del; 17301 mod
  Patch: https://git.openjdk.org/jdk/pull/12632.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12632/head:pull/12632

PR: https://git.openjdk.org/jdk/pull/12632


Re: RFR: 8302818: Optimize wrapper sets and immutable sets of Enums [v2]

2023-02-20 Thread Tingjun Yuan
On Mon, 20 Feb 2023 13:29:52 GMT, Tingjun Yuan  wrote:

> I wonder why the original code didn't use `addAll`, so I didn't change it. 😂

It seems to be okay to use `addAll` in all cases. Do we need extra test cases 
for this?

-

PR: https://git.openjdk.org/jdk/pull/12498


Re: RFR: 8302818: Optimize wrapper sets and immutable sets of Enums [v3]

2023-02-20 Thread Tingjun Yuan
> Currently, the two subclasses of `java.util.EnumSet` optimize bulk operations 
> when the argument is also a `EnumSet`, but there is no such optimization for 
> wrapper sets (returned by `Collections.unmodifiableSet`, 
> `Collections.synchronizedSet`, etc.) and immutable sets (returned by `Set.of` 
> methods) of `Enum`s.
> 
> This PR introduces optimization classes for these situations.  No public APIs 
> are changed.

Tingjun Yuan has updated the pull request incrementally with one additional 
commit since the last revision:

  Use `addAll` in `EnumSet.copyOf`
  
  It seems to be okay to use `addAll` in all cases.

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/12498/files
  - new: https://git.openjdk.org/jdk/pull/12498/files/41f4a909..098b3fff

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12498&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12498&range=01-02

  Stats: 6 lines in 1 file changed: 0 ins; 5 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/12498.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12498/head:pull/12498

PR: https://git.openjdk.org/jdk/pull/12498


Re: RFR: 8302871: Speed up StringLatin1.regionMatchesCI [v3]

2023-02-20 Thread Eirik Bjorsnos
> This PR suggests we can speed up `StringLatin1.regionMatchesCI` by applying 
> 'the oldest ASCII trick in the book'.
> 
> The new static method `CharacterDataLatin1.equalsIgnoreCase` compares two 
> latin1 bytes for equality ignoring case. `StringLatin1.regionMatchesCI` is 
> updated to use `equalsIgnoreCase`
> 
> To verify the correctness of `equalsIgnoreCase`, a new test is added  to 
> `EqualsIgnoreCase` with an exhaustive verification that all 256x256 latin1 
> code point pairs have an `equalsIgnoreCase` consistent with 
> Character.toUpperCase, Character.toLowerCase.
> 
> Performance is tested for matching and mismatching cases of code point pairs 
> picked from the ASCII letter, ASCII number and latin1 letter ranges. Results 
> in the first comment below.

Eirik Bjorsnos has updated the pull request incrementally with one additional 
commit since the last revision:

  Add clarifying comments and use more descriptive variable names in the latin1 
verification EqualsIgnoreCase test

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/12632/files
  - new: https://git.openjdk.org/jdk/pull/12632/files/59c42298..84517102

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12632&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12632&range=01-02

  Stats: 16 lines in 1 file changed: 5 ins; 2 del; 9 mod
  Patch: https://git.openjdk.org/jdk/pull/12632.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12632/head:pull/12632

PR: https://git.openjdk.org/jdk/pull/12632


Re: RFR: 8302818: Optimize wrapper sets and immutable sets of Enums [v2]

2023-02-20 Thread liach
On Mon, 20 Feb 2023 13:24:27 GMT, Tingjun Yuan  wrote:

>> src/java.base/share/classes/java/util/Set.java line 742:
>> 
>>> 740: E e1 = it.next();
>>> 741: if (!it.hasNext()) {
>>> 742: return Set.of(e0, e1);
>> 
>> Bad change, doesn't handle e0.equals(e1), and this is getting beyond the 
>> original issue
>
>> Bad change, doesn't handle e0.equals(e1), and this is getting beyond the 
>> original issue
> 
> `Set.of(E, E)` handles duplication. If `coll` is 
> `{Jumbo,Regular}EnumSetCompatible`, then we don't have to create the 
> `HashSet`.

no, Set.copyOf explicitly allows duplications, unlike Set.of. You need to read 
the specs so you know what you are doing.

-

PR: https://git.openjdk.org/jdk/pull/12498


Re: Speed up latin1 case folding

2023-02-20 Thread Claes Redestad
RFE filed: https://bugs.openjdk.org/browse/JDK-8302877

/Claes

17 feb. 2023 kl. 18:38 skrev Eirik Bjørsnøs 
mailto:eir...@gmail.com>>:

Hi,

The following PR suggests we can speed up Character.toUpperCase and 
Character.toLowerCase for latin1 code points by applying 'the oldest ASCII 
trick in the book':

https://github.com/openjdk/jdk/pull/12623

Thanks,
Eirik.



Re: RFR: 8302818: Optimize wrapper sets and immutable sets of Enums [v2]

2023-02-20 Thread Tingjun Yuan
On Mon, 20 Feb 2023 13:58:01 GMT, liach  wrote:

> no, Set.copyOf explicitly allows duplications, unlike Set.of. You need to 
> read the specs so you know what you are doing.

Sorry, my bad. I've fixed that.

-

PR: https://git.openjdk.org/jdk/pull/12498


Re: RFR: 8302818: Optimize wrapper sets and immutable sets of Enums [v4]

2023-02-20 Thread Tingjun Yuan
> Currently, the two subclasses of `java.util.EnumSet` optimize bulk operations 
> when the argument is also a `EnumSet`, but there is no such optimization for 
> wrapper sets (returned by `Collections.unmodifiableSet`, 
> `Collections.synchronizedSet`, etc.) and immutable sets (returned by `Set.of` 
> methods) of `Enum`s.
> 
> This PR introduces optimization classes for these situations.  No public APIs 
> are changed.

Tingjun Yuan has updated the pull request incrementally with one additional 
commit since the last revision:

  Fix `Set.copyOf`

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/12498/files
  - new: https://git.openjdk.org/jdk/pull/12498/files/098b3fff..b9699bfe

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12498&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12498&range=02-03

  Stats: 12 lines in 1 file changed: 0 ins; 9 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/12498.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12498/head:pull/12498

PR: https://git.openjdk.org/jdk/pull/12498


Re: RFR: 8302871: Speed up StringLatin1.regionMatchesCI [v4]

2023-02-20 Thread Eirik Bjorsnos
> This PR suggests we can speed up `StringLatin1.regionMatchesCI` by applying 
> 'the oldest ASCII trick in the book'.
> 
> The new static method `CharacterDataLatin1.equalsIgnoreCase` compares two 
> latin1 bytes for equality ignoring case. `StringLatin1.regionMatchesCI` is 
> updated to use `equalsIgnoreCase`
> 
> To verify the correctness of `equalsIgnoreCase`, a new test is added  to 
> `EqualsIgnoreCase` with an exhaustive verification that all 256x256 latin1 
> code point pairs have an `equalsIgnoreCase` consistent with 
> Character.toUpperCase, Character.toLowerCase.
> 
> Performance is tested for matching and mismatching cases of code point pairs 
> picked from the ASCII letter, ASCII number and latin1 letter ranges. Results 
> in the first comment below.

Eirik Bjorsnos has updated the pull request incrementally with two additional 
commits since the last revision:

 - Add @bug tag to EqualsIgnoreCase test for correct issue JDK-8302871
 - Add @bug tag to EqualsIgnoreCase test for JDK-8302877

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/12632/files
  - new: https://git.openjdk.org/jdk/pull/12632/files/84517102..03d3e2cb

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12632&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12632&range=02-03

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/12632.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12632/head:pull/12632

PR: https://git.openjdk.org/jdk/pull/12632


RFR: 8302877: Speed up latin1 case conversions

2023-02-20 Thread Eirik Bjorsnos
This PR suggests we speed up Character.toUpperCase and Character.toLowerCase 
for latin1 code points by applying the 'oldest ASCII trick in the book'.

This takes advantage of the fact that latin1 uppercase code points are always 
0x20 lower than their lowercase (with the exception of two code points which 
uppercase out of latin1).

To verify the correctness of the new implementation, the test 
`Latin1CaseConversion` is added with an exhaustive verification of 
toUpperCase/toLowerCase for all latin1 code points.

The implementation needs to balance the performance of the various ranges in 
latin1. An effort has been made to favour operations on ASCII code points, 
without causing excessive regression for higher code points.

Performance is benchmarked for 7 chosen sample code points, each representing a 
range or a special-case.  Results in the first comment.

-

Commit messages:
 - Add @bug tag to test
 - Improved whitespace alignment for Param and switch values
 - Correct spelling for "exhaustive"
 - Prefer the term "case conversion" over ""case folding". Refer to 0xB5 as 
'Micro Sign' ("Mu" is the Unicode code point it uppercases to)
 - Improve comments for the two special-cased uppercase code points 'Micro 
Sign' and 'y with Diaeresis'
 - Adjust whitespace
 - Speed up Character.toUpperCase and Character.toLowerCase by applying the 
'oldest ASCII trick in the book'

Changes: https://git.openjdk.org/jdk/pull/12623/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12623&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8302877
  Stats: 164 lines in 3 files changed: 143 ins; 0 del; 21 mod
  Patch: https://git.openjdk.org/jdk/pull/12623.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12623/head:pull/12623

PR: https://git.openjdk.org/jdk/pull/12623


Re: RFR: 8302877: Speed up latin1 case conversions

2023-02-20 Thread Naoto Sato
On Fri, 17 Feb 2023 17:31:09 GMT, Eirik Bjorsnos  wrote:

> This PR suggests we speed up Character.toUpperCase and Character.toLowerCase 
> for latin1 code points by applying the 'oldest ASCII trick in the book'.
> 
> This takes advantage of the fact that latin1 uppercase code points are always 
> 0x20 lower than their lowercase (with the exception of two code points which 
> uppercase out of latin1).
> 
> To verify the correctness of the new implementation, the test 
> `Latin1CaseConversion` is added with an exhaustive verification of 
> toUpperCase/toLowerCase for all latin1 code points.
> 
> The implementation needs to balance the performance of the various ranges in 
> latin1. An effort has been made to favour operations on ASCII code points, 
> without causing excessive regression for higher code points.
> 
> Performance is benchmarked for 7 chosen sample code points, each representing 
> a range or a special-case.  Results in the first comment.

Looks good to me. I'd rather not use "case folding", as to me it implies 
"normalizing" but this is simply lowercasing/uppercasing.

test/jdk/java/lang/Character/Latin1CaseFolding.java line 31:

> 29: /**
> 30:  * @test
> 31:  * @summary Provides exchaustive verification of Character.toUpperCase 
> and Character.toLowerCase

typo: "exhaustive"?

-

PR: https://git.openjdk.org/jdk/pull/12623


Re: RFR: 8302877: Speed up latin1 case conversions

2023-02-20 Thread Eirik Bjorsnos
On Fri, 17 Feb 2023 17:31:09 GMT, Eirik Bjorsnos  wrote:

> This PR suggests we speed up Character.toUpperCase and Character.toLowerCase 
> for latin1 code points by applying the 'oldest ASCII trick in the book'.
> 
> This takes advantage of the fact that latin1 uppercase code points are always 
> 0x20 lower than their lowercase (with the exception of two code points which 
> uppercase out of latin1).
> 
> To verify the correctness of the new implementation, the test 
> `Latin1CaseConversion` is added with an exhaustive verification of 
> toUpperCase/toLowerCase for all latin1 code points.
> 
> The implementation needs to balance the performance of the various ranges in 
> latin1. An effort has been made to favour operations on ASCII code points, 
> without causing excessive regression for higher code points.
> 
> Performance is benchmarked for 7 chosen sample code points, each representing 
> a range or a special-case.  Results in the first comment.

Benchmark results:

Baseline:


Benchmark (codePoint)  Mode  Cnt  Score   Error 
 Units
Characters.Latin1CaseConversion.toLowerCase  low  avgt   15  1.267 ± 
0.013  ns/op
Characters.Latin1CaseConversion.toLowerCaseA  avgt   15  1.657 ± 
0.011  ns/op
Characters.Latin1CaseConversion.toLowerCasea  avgt   15  1.258 ± 
0.005  ns/op
Characters.Latin1CaseConversion.toLowerCase  A-grave  avgt   15  1.656 ± 
0.011  ns/op
Characters.Latin1CaseConversion.toLowerCase  a-grave  avgt   15  1.270 ± 
0.023  ns/op
Characters.Latin1CaseConversion.toLowerCase   mu  avgt   15  1.261 ± 
0.006  ns/op
Characters.Latin1CaseConversion.toLowerCase   yD  avgt   15  1.260 ± 
0.005  ns/op
Characters.Latin1CaseConversion.toUpperCase  low  avgt   15  1.284 ± 
0.043  ns/op
Characters.Latin1CaseConversion.toUpperCaseA  avgt   15  1.264 ± 
0.008  ns/op
Characters.Latin1CaseConversion.toUpperCasea  avgt   15  1.818 ± 
0.016  ns/op
Characters.Latin1CaseConversion.toUpperCase  A-grave  avgt   15  1.261 ± 
0.015  ns/op
Characters.Latin1CaseConversion.toUpperCase  a-grave  avgt   15  1.822 ± 
0.013  ns/op
Characters.Latin1CaseConversion.toUpperCase   mu  avgt   15  1.823 ± 
0.006  ns/op
Characters.Latin1CaseConversion.toUpperCase   yD  avgt   15  1.822 ± 
0.008  ns/op


PR:


Benchmark (codePoint)  Mode  Cnt  Score   Error 
 Units
Characters.Latin1CaseConversion.toLowerCase  low  avgt   15  0.878 ± 
0.005  ns/op
Characters.Latin1CaseConversion.toLowerCaseA  avgt   15  1.038 ± 
0.009  ns/op
Characters.Latin1CaseConversion.toLowerCasea  avgt   15  1.036 ± 
0.007  ns/op
Characters.Latin1CaseConversion.toLowerCase  A-grave  avgt   15  1.357 ± 
0.015  ns/op
Characters.Latin1CaseConversion.toLowerCase  a-grave  avgt   15  1.352 ± 
0.003  ns/op
Characters.Latin1CaseConversion.toLowerCase   mu  avgt   15  1.273 ± 
0.002  ns/op
Characters.Latin1CaseConversion.toLowerCase   yD  avgt   15  1.352 ± 
0.004  ns/op
Characters.Latin1CaseConversion.toUpperCase  low  avgt   15  0.880 ± 
0.013  ns/op
Characters.Latin1CaseConversion.toUpperCaseA  avgt   15  0.920 ± 
0.071  ns/op
Characters.Latin1CaseConversion.toUpperCasea  avgt   15  1.055 ± 
0.013  ns/op
Characters.Latin1CaseConversion.toUpperCase  A-grave  avgt   15  1.394 ± 
0.010  ns/op
Characters.Latin1CaseConversion.toUpperCase  a-grave  avgt   15  1.391 ± 
0.009  ns/op
Characters.Latin1CaseConversion.toUpperCase   mu  avgt   15  1.597 ± 
0.021  ns/op
Characters.Latin1CaseConversion.toUpperCase   yD  avgt   15  1.354 ± 
0.003  ns/op

-

PR: https://git.openjdk.org/jdk/pull/12623


Re: RFR: 8302877: Speed up latin1 case conversions

2023-02-20 Thread Eirik Bjorsnos
On Sat, 18 Feb 2023 00:57:14 GMT, Naoto Sato  wrote:

>  I'd rather not use "case folding", as to me it implies "normalizing" but 
> this is simply lowercasing/uppercasing.

I guess I was looking for a generic term for uppercase/lowercase. I picked 
"case conversion" instead.

> test/jdk/java/lang/Character/Latin1CaseFolding.java line 31:
> 
>> 29: /**
>> 30:  * @test
>> 31:  * @summary Provides exchaustive verification of Character.toUpperCase 
>> and Character.toLowerCase
> 
> typo: "exhaustive"?

I did an 'exchaustive' search for 'exchaustive' across the code base and found  
two comments in `LocaleData` and `LocaleData.cldr` in 
`jdk/test/jdk/sun/text/resources`.

Would you like me to update these as well while we're here, or should we avoid 
getting out scope for this PR?

-

PR: https://git.openjdk.org/jdk/pull/12623


Re: RFR: 8302871: Speed up StringLatin1.regionMatchesCI [v4]

2023-02-20 Thread Claes Redestad
On Mon, 20 Feb 2023 14:45:09 GMT, Eirik Bjorsnos  wrote:

>> This PR suggests we can speed up `StringLatin1.regionMatchesCI` by applying 
>> 'the oldest ASCII trick in the book'.
>> 
>> The new static method `CharacterDataLatin1.equalsIgnoreCase` compares two 
>> latin1 bytes for equality ignoring case. `StringLatin1.regionMatchesCI` is 
>> updated to use `equalsIgnoreCase`
>> 
>> To verify the correctness of `equalsIgnoreCase`, a new test is added  to 
>> `EqualsIgnoreCase` with an exhaustive verification that all 256x256 latin1 
>> code point pairs have an `equalsIgnoreCase` consistent with 
>> Character.toUpperCase, Character.toLowerCase.
>> 
>> Performance is tested for matching and mismatching cases of code point pairs 
>> picked from the ASCII letter, ASCII number and latin1 letter ranges. Results 
>> in the first comment below.
>
> Eirik Bjorsnos has updated the pull request incrementally with two additional 
> commits since the last revision:
> 
>  - Add @bug tag to EqualsIgnoreCase test for correct issue JDK-8302871
>  - Add @bug tag to EqualsIgnoreCase test for JDK-8302877

src/java.base/share/classes/java/lang/CharacterDataLatin1.java.template line 
170:

> 168:  * @return true if the two bytes are considered equals ignoring case 
> in latin1
> 169:  */
> 170:  static boolean equalsIgnoreCase(byte b1, byte b2) {

Perhaps put this in `CharacterDataLatin1`, keeping it close to 
toLowerCase/toUpperCase that you're changing to use similar logic with #12623 

If you apply #12623 first - how much difference does this make on the micro 
you're adding with this PR?

-

PR: https://git.openjdk.org/jdk/pull/12632


Re: RFR: 8302871: Speed up StringLatin1.regionMatchesCI [v4]

2023-02-20 Thread Eirik Bjorsnos
On Mon, 20 Feb 2023 15:40:09 GMT, Claes Redestad  wrote:

>> Eirik Bjorsnos has updated the pull request incrementally with two 
>> additional commits since the last revision:
>> 
>>  - Add @bug tag to EqualsIgnoreCase test for correct issue JDK-8302871
>>  - Add @bug tag to EqualsIgnoreCase test for JDK-8302877
>
> src/java.base/share/classes/java/lang/CharacterDataLatin1.java.template line 
> 170:
> 
>> 168:  * @return true if the two bytes are considered equals ignoring 
>> case in latin1
>> 169:  */
>> 170:  static boolean equalsIgnoreCase(byte b1, byte b2) {
> 
> Perhaps put this in `CharacterDataLatin1`, keeping it close to 
> toLowerCase/toUpperCase that you're changing to use similar logic with #12623 
> 
> If you apply #12623 first - how much difference does this make on the micro 
> you're adding with this PR?

Is it not already in CharacterDataLatin1?

Here is a comparison of relying on improvements in 
`CharacterDataLatin1.toUpperCase/toLowerCase` only vs. using 
`CharacterDataLatin1.equalsIgnoreCase`:

Character.toUpperCase/toLowerCase only:


Benchmark  (codePoints)  (size)  Mode  Cnt 
ScoreError  Units
RegionMatchesIC.Latin1.regionMatchesIC  ascii-match1024  avgt   15  
1310.582 ± 84.777  ns/op
RegionMatchesIC.Latin1.regionMatchesIC   ascii-mismatch1024  avgt   15 
4.547 ±  0.545  ns/op
RegionMatchesIC.Latin1.regionMatchesIC number-match1024  avgt   15   
686.947 ± 11.850  ns/op
RegionMatchesIC.Latin1.regionMatchesIC  number-mismatch1024  avgt   15 
3.836 ±  0.634  ns/op
RegionMatchesIC.Latin1.regionMatchesIC   lat1-match1024  avgt   15  
2107.219 ± 17.662  ns/op
RegionMatchesIC.Latin1.regionMatchesIClat1-mismatch1024  avgt   15 
4.924 ±  0.829  ns/op


CharacterDataLatin1.equalsIgnoreCase:


Benchmark  (codePoints)  (size)  Mode  Cnt 
ScoreError  Units
RegionMatchesIC.Latin1.regionMatchesIC  ascii-match1024  avgt   15   
742.467 ± 34.490  ns/op
RegionMatchesIC.Latin1.regionMatchesIC   ascii-mismatch1024  avgt   15 
3.960 ±  0.046  ns/op
RegionMatchesIC.Latin1.regionMatchesIC number-match1024  avgt   15   
361.158 ± 37.096  ns/op
RegionMatchesIC.Latin1.regionMatchesIC  number-mismatch1024  avgt   15 
4.039 ±  0.521  ns/op
RegionMatchesIC.Latin1.regionMatchesIC   lat1-match1024  avgt   15  
1158.091 ± 41.617  ns/op
RegionMatchesIC.Latin1.regionMatchesIClat1-mismatch1024  avgt   15 
4.358 ±  0.123  ns/op

-

PR: https://git.openjdk.org/jdk/pull/12632


Re: RFR: 8302871: Speed up StringLatin1.regionMatchesCI [v4]

2023-02-20 Thread Claes Redestad
On Mon, 20 Feb 2023 16:16:45 GMT, Eirik Bjorsnos  wrote:

>> src/java.base/share/classes/java/lang/CharacterDataLatin1.java.template line 
>> 170:
>> 
>>> 168:  * @return true if the two bytes are considered equals ignoring 
>>> case in latin1
>>> 169:  */
>>> 170:  static boolean equalsIgnoreCase(byte b1, byte b2) {
>> 
>> Perhaps put this in `CharacterDataLatin1`, keeping it close to 
>> toLowerCase/toUpperCase that you're changing to use similar logic with 
>> #12623 
>> 
>> If you apply #12623 first - how much difference does this make on the micro 
>> you're adding with this PR?
>
> Is it not already in CharacterDataLatin1?
> 
> Here is a comparison of relying on improvements in 
> `CharacterDataLatin1.toUpperCase/toLowerCase` only vs. using 
> `CharacterDataLatin1.equalsIgnoreCase`:
> 
> Character.toUpperCase/toLowerCase only:
> 
> 
> Benchmark  (codePoints)  (size)  Mode  Cnt
>  ScoreError  Units
> RegionMatchesIC.Latin1.regionMatchesIC  ascii-match1024  avgt   15  
> 1310.582 ± 84.777  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC   ascii-mismatch1024  avgt   15
>  4.547 ±  0.545  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC number-match1024  avgt   15   
> 686.947 ± 11.850  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC  number-mismatch1024  avgt   15
>  3.836 ±  0.634  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC   lat1-match1024  avgt   15  
> 2107.219 ± 17.662  ns/op
> RegionMatchesIC.Latin1.regionMatchesIClat1-mismatch1024  avgt   15
>  4.924 ±  0.829  ns/op
> 
> 
> CharacterDataLatin1.equalsIgnoreCase:
> 
> 
> Benchmark  (codePoints)  (size)  Mode  Cnt
>  ScoreError  Units
> RegionMatchesIC.Latin1.regionMatchesIC  ascii-match1024  avgt   15   
> 742.467 ± 34.490  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC   ascii-mismatch1024  avgt   15
>  3.960 ±  0.046  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC number-match1024  avgt   15   
> 361.158 ± 37.096  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC  number-mismatch1024  avgt   15
>  4.039 ±  0.521  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC   lat1-match1024  avgt   15  
> 1158.091 ± 41.617  ns/op
> RegionMatchesIC.Latin1.regionMatchesIClat1-mismatch1024  avgt   15
>  4.358 ±  0.123  ns/op

Oops, I lost context and thought this was in `StringLatin1`.

Thanks for running the numbers with #12623. Looks like you're getting big 
enough of an improvement on top.

-

PR: https://git.openjdk.org/jdk/pull/12632


Re: RFR: 8302871: Speed up StringLatin1.regionMatchesCI [v4]

2023-02-20 Thread Eirik Bjorsnos
On Mon, 20 Feb 2023 16:23:32 GMT, Claes Redestad  wrote:

>> Is it not already in CharacterDataLatin1?
>> 
>> Here is a comparison of relying on improvements in 
>> `CharacterDataLatin1.toUpperCase/toLowerCase` only vs. using 
>> `CharacterDataLatin1.equalsIgnoreCase`:
>> 
>> Character.toUpperCase/toLowerCase only:
>> 
>> 
>> Benchmark  (codePoints)  (size)  Mode  Cnt   
>>   ScoreError  Units
>> RegionMatchesIC.Latin1.regionMatchesIC  ascii-match1024  avgt   15  
>> 1310.582 ± 84.777  ns/op
>> RegionMatchesIC.Latin1.regionMatchesIC   ascii-mismatch1024  avgt   15   
>>   4.547 ±  0.545  ns/op
>> RegionMatchesIC.Latin1.regionMatchesIC number-match1024  avgt   15   
>> 686.947 ± 11.850  ns/op
>> RegionMatchesIC.Latin1.regionMatchesIC  number-mismatch1024  avgt   15   
>>   3.836 ±  0.634  ns/op
>> RegionMatchesIC.Latin1.regionMatchesIC   lat1-match1024  avgt   15  
>> 2107.219 ± 17.662  ns/op
>> RegionMatchesIC.Latin1.regionMatchesIClat1-mismatch1024  avgt   15   
>>   4.924 ±  0.829  ns/op
>> 
>> 
>> CharacterDataLatin1.equalsIgnoreCase:
>> 
>> 
>> Benchmark  (codePoints)  (size)  Mode  Cnt   
>>   ScoreError  Units
>> RegionMatchesIC.Latin1.regionMatchesIC  ascii-match1024  avgt   15   
>> 742.467 ± 34.490  ns/op
>> RegionMatchesIC.Latin1.regionMatchesIC   ascii-mismatch1024  avgt   15   
>>   3.960 ±  0.046  ns/op
>> RegionMatchesIC.Latin1.regionMatchesIC number-match1024  avgt   15   
>> 361.158 ± 37.096  ns/op
>> RegionMatchesIC.Latin1.regionMatchesIC  number-mismatch1024  avgt   15   
>>   4.039 ±  0.521  ns/op
>> RegionMatchesIC.Latin1.regionMatchesIC   lat1-match1024  avgt   15  
>> 1158.091 ± 41.617  ns/op
>> RegionMatchesIC.Latin1.regionMatchesIClat1-mismatch1024  avgt   15   
>>   4.358 ±  0.123  ns/op
>
> Oops, I lost context and thought this was in `StringLatin1`.
> 
> Thanks for running the numbers with #12623. Looks like you're getting big 
> enough of an improvement on top.

Yes, seems `equalsIgnoreCase` carries its weight.

-

PR: https://git.openjdk.org/jdk/pull/12632


Re: RFR: 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not thread safe

2023-02-20 Thread Peter Levart
On Mon, 20 Feb 2023 13:09:50 GMT, liach  wrote:

>> 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not 
>> thread safe
>
> We don't fear calling the factory twice for benign races, as the distinct 
> constructor factory instances are behaviorally the same.
> 
> The true issue lies in the double getfield operations: Java memory model 
> doesn't require the second read to happen-after a write reflected in the 
> first read, so return this.genericInfo may return null while this.genericInfo 
> == null evaluates to false, in case genericInfo is initialized lazily by 
> another thread. See https://bugs.openjdk.org/browse/JDK-8261404

Hi @liach,

I think @dholmes-ora is worried about the fields in the object being returned 
by the getGenericInfo() method and similar. In above case this means fields in 
class ConstructorRepository.
I checked it and the entire hierarchy based on 
sun.reflect.generics.repository.AbstractRepository with subclasses including 
ConstructorRepository is modeled such that all fields are either:
- volatile and lazily initialized; or
- final and initialized in constructor

Such objects may be published via data race and still be seen consistent on the 
accepting side.

-

PR: https://git.openjdk.org/jdk/pull/12643


Re: RFR: 8302863: Speed up String::encodeASCII using countPositives

2023-02-20 Thread Alan Bateman
On Sat, 18 Feb 2023 23:26:08 GMT, Claes Redestad  wrote:

> When encoding Strings to US-ASCII we can speed up the happy path 
> significantly by using `StringCoding.countPositives` as a speculative check 
> for whether there are any chars that needs to be replaced by `'?'`. Once a 
> non-ASCII char is encountered we fall back to the slow loop and replace as 
> needed.
> 
> An alternative could be unrolling or using a byte array VarHandle, as 
> show-cased by Brett Okken here: 
> https://mail.openjdk.org/pipermail/core-libs-dev/2023-February/100573.html 
> Having to replace chars with `?` is essentially an encoding error so it might 
> be safe to assume this case is exceptional in practice.

Marked as reviewed by alanb (Reviewer).

-

PR: https://git.openjdk.org/jdk/pull/12640


Re: RFR: 8299852: Modernize ConcurrentHashMap [v4]

2023-02-20 Thread Per Minborg
> `java.util.concurrent.ConcurrentHashMap` is relatively old and has not been 
> updated to reflect the current state of Java and can be modernized: 
> 
> * Add `@Serial` annotations 
> * Seal classes and restrict subclassing for internal classes 
> * Use pattern matching for instance 
> * Remove redundant modifiers (such as some final declarations) 
> * Use Objects.requireNonNull for invariant checking 
> * Use diamond operators instead of explicit types 
> * Simplify expressions using Math::max

Per Minborg has updated the pull request incrementally with one additional 
commit since the last revision:

  Remove redundant line

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/11924/files
  - new: https://git.openjdk.org/jdk/pull/11924/files/b2dbdd8f..cd4b1365

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=11924&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11924&range=02-03

  Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/11924.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/11924/head:pull/11924

PR: https://git.openjdk.org/jdk/pull/11924


Re: RFR: 8302815 Use new Math.clamp method in core libraries [v2]

2023-02-20 Thread Alan Bateman
On Sat, 18 Feb 2023 21:40:08 GMT, Tagir F. Valeev  wrote:

>> For cleanup and dogfooding the new method, it would be nice to use 
>> Math.clamp where possible in java.base. See PR #12428.
>> 
>> As Math.clamp performs an additional check that min is not greater than max, 
>> I conservatively replaced only those occurrences where I can see that this 
>> invariant is always held. There are more occurrences, where clamp can be 
>> potentially used but it's unclear whether min <= max is always true.
>
> Tagir F. Valeev has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Revert changes in JrtPath, as it seems to be compiled with bootstrap JDK

I skimmed through the usages and they look okay. I didn't spot anywhere that it 
differs to the existing clamping. This is the first update in 2023 for some of 
these files so I assume you'll bump the copyright year before integrating.

-

Marked as reviewed by alanb (Reviewer).

PR: https://git.openjdk.org/jdk/pull/12633


Re: RFR: 8301627: System.exit and Runtime.exit debug logging [v7]

2023-02-20 Thread Roger Riggs
> It can be difficult to find the cause of calls to 
> `java.lang.System.exit(status)` and `Runtime.exit(status)` because the Java 
> runtime exits.
> The status value and stack trace are logged using the System Logger named 
> `java.lang.Runtime` with message level `System.Logger.Level.DEBUG`.

Roger Riggs has updated the pull request incrementally with one additional 
commit since the last revision:

  fix double the

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/12517/files
  - new: https://git.openjdk.org/jdk/pull/12517/files/3dc13135..acce58f1

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12517&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12517&range=05-06

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/12517.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12517/head:pull/12517

PR: https://git.openjdk.org/jdk/pull/12517


Re: RFR: JDK-8302800: Augment NaN handling tests of FDLIBM methods

2023-02-20 Thread Joe Darcy
On Mon, 20 Feb 2023 10:22:48 GMT, Raffaello Giulietti  
wrote:

>> Augment NaN-handling tests to include exotic NaN bit patterns. Assuming this 
>> changeset goes back first, I'll update 
>> https://github.com/openjdk/jdk/pull/12608 to follow the same convention 
>> afterward.
>
> test/jdk/java/lang/Math/Tests.java line 41:
> 
>> 39: private Tests(){}; // do not instantiate
>> 40: 
>> 41: static volatile double zero = 0.0;
> 
> Not directly related to this PR and just out of curiosity: is there a reason 
> for this to be `volatile` and non-`final`?

The comment on the PLATFORM_NAN field was mean to address that -- an attempt to 
make sure the expression to create the platform NaN gets evaluated at runtime 
rather than being constant-folded at compile time.

-

PR: https://git.openjdk.org/jdk/pull/12644


RFD: Consumer.empty()

2023-02-20 Thread Eirik Bjørsnøs
Hi,

So I found myself writing some code which sets up a j.u.s.Stream processing
pipeline where the stream can optionally be traced/logged via a peeking
Consumer.

The default should be not to trace, in which case I initially set the
Consumer to null. But then I need to check that trace != null when setting
up the stream. This felt awkward..

Since related code was using Function.identity() as a "no mapping
configured" default, I went looking for Consumer.empty(), only to learn
that no such thing exists.

I can of course just do () -> {}, but I think Consumer.empty() conveys the
purpose a bit more clearly..

Is there some thought behind this not existing? Would it be worth adding?


/**
 * Returns a consumer which performs no operation on its input argument
 *
 * @param  the type of the input argument
 * @return a Consumer which performs no operation on its input argument
 */
static  Consumer empty() {
return t -> {};
}


Re: RFD: Consumer.empty()

2023-02-20 Thread Eirik Bjørsnøs
Found this answer from smarks:

https://stackoverflow.com/questions/26549659/built-in-java-8-predicate-that-always-returns-true/26553481#26553481

Certainly there is some value in something that has a name vs. a
> written-out lambda, but this value is quite small. We expect that people
> will learn how to read and write x -> true and () -> { } and that their
> usage will become idiomatic. Even the value of Function.identity() over x
> -> x is questionable.


Makes sense. And where would we stop..?

EIrik.

On Mon, Feb 20, 2023 at 7:24 PM Eirik Bjørsnøs  wrote:

> Hi,
>
> So I found myself writing some code which sets up a j.u.s.Stream
> processing pipeline where the stream can optionally be traced/logged via a
> peeking Consumer.
>
> The default should be not to trace, in which case I initially set the
> Consumer to null. But then I need to check that trace != null when setting
> up the stream. This felt awkward..
>
> Since related code was using Function.identity() as a "no mapping
> configured" default, I went looking for Consumer.empty(), only to learn
> that no such thing exists.
>
> I can of course just do () -> {}, but I think Consumer.empty() conveys the
> purpose a bit more clearly..
>
> Is there some thought behind this not existing? Would it be worth adding?
>
>
> /**
>  * Returns a consumer which performs no operation on its input argument
>  *
>  * @param  the type of the input argument
>  * @return a Consumer which performs no operation on its input argument
>  */
> static  Consumer empty() {
> return t -> {};
> }
>


Re: RFR: 8302863: Speed up String::encodeASCII using countPositives

2023-02-20 Thread Brett Okken
On Sat, 18 Feb 2023 23:26:08 GMT, Claes Redestad  wrote:

> When encoding Strings to US-ASCII we can speed up the happy path 
> significantly by using `StringCoding.countPositives` as a speculative check 
> for whether there are any chars that needs to be replaced by `'?'`. Once a 
> non-ASCII char is encountered we fall back to the slow loop and replace as 
> needed.
> 
> An alternative could be unrolling or using a byte array VarHandle, as 
> show-cased by Brett Okken here: 
> https://mail.openjdk.org/pipermail/core-libs-dev/2023-February/100573.html 
> Having to replace chars with `?` is essentially an encoding error so it might 
> be safe to assume this case is exceptional in practice.

For the happy path of no encoding failures, this "trivial" change is a great 
improvement.

-

Marked as reviewed by bok...@github.com (no known OpenJDK username).

PR: https://git.openjdk.org/jdk/pull/12640


Re: RFR: 8302822: Method/Field/Constructor/RecordComponent::getGenericInfo() is not thread safe

2023-02-20 Thread David Holmes
On Mon, 20 Feb 2023 17:03:33 GMT, Peter Levart  wrote:

>> We don't fear calling the factory twice for benign races, as the distinct 
>> constructor factory instances are behaviorally the same.
>> 
>> The true issue lies in the double getfield operations: Java memory model 
>> doesn't require the second read to happen-after a write reflected in the 
>> first read, so return this.genericInfo may return null while 
>> this.genericInfo == null evaluates to false, in case genericInfo is 
>> initialized lazily by another thread. See 
>> https://bugs.openjdk.org/browse/JDK-8261404
>
> Hi @liach,
> 
> I think @dholmes-ora is worried about the fields in the object being returned 
> by the getGenericInfo() method and similar. In above case this means fields 
> in class ConstructorRepository.
> I checked it and the entire hierarchy based on 
> sun.reflect.generics.repository.AbstractRepository with subclasses including 
> ConstructorRepository is modeled such that all fields are either:
> - volatile and lazily initialized; or
> - final and initialized in constructor
> 
> Such objects may be published via data race and still be seen consistent on 
> the accepting side.

Thanks @plevart  that was exactly my concern but I didn't have time to check 
whether the returned object could be safely published regardless of any race 
condition. Is it specified that way, or just a fortuitous occurrence?

I would also be concerned about the guarantee of idempotency from the factory 
method - I hope its requirements in that area are clearly documented.

-

PR: https://git.openjdk.org/jdk/pull/12643


Re: RFR: 8026369: javac potentially ambiguous overload warning needs an improved scheme

2023-02-20 Thread SWinxy
On Sun, 19 Feb 2023 23:52:52 GMT, Archie L. Cobbs  wrote:

> This bug relates to the "potentially ambiguous overload" warning which is 
> enabled by `-Xlint:overloads`.
> 
> The warning detects certain ambiguities that can cause problems for lambdas. 
> For example, consider the interface `Spliterator.OfInt`, which declares these 
> two methods:
> 
> void forEachRemaining(Consumer action);
> void forEachRemaining(IntConsumer action);
> 
> Both methods have the same name, same number of parameters, and take a lambda 
> with the same "shape" in the same argument position. This causes an ambiguity 
> in any code that wants to do this:
> 
> spliterator.forEachRemaining(x -> { ... });
> 
> That code won't compile; instead, you'll get this error:
> 
> Ambiguity.java:4: error: reference to forEachRemaining is ambiguous
> spliterator.forEachRemaining(x -> { });
>^
>   both method forEachRemaining(IntConsumer) in OfInt and method 
> forEachRemaining(Consumer) in OfInt match
> 
> 
> The problem reported by the bug is that the warning fails to detect 
> ambiguities which are created purely by inheritance, for example:
> 
> interface ConsumerOfInteger {
> void foo(Consumer c);
> }
> 
> interface IntegerConsumer {
> void foo(IntConsumer c);
> }
> 
> // We should get a warning here...
> interface Test extends ConsumerOfInteger, IntegerConsumer {
> }
> 
> 
> The cause of the bug is that ambiguities are detected on a per-method basis, 
> by checking whether a method is part of an ambiguity pair when we visit that 
> method. So if the methods in an ambiguity pair are inherited from two 
> distinct supertypes, we'll miss the ambiguity.
> 
> To fix the problem, we need to look for ambiguities on a per-class level, 
> checking all pairs of methods. However, it's not that simple - we only want 
> to "blame" a class when that class itself, and not some supertype, is 
> responsible for creating the ambiguity. For example, any interface extending 
> `Spliterator.OfInt` will automatically inherit the two ambiguities mentioned 
> above, but these are not the interface's fault so to speak so no warning 
> should be generated. Making things more complicated is the fact that methods 
> can be overridden and declared in generic classes so they only conflict in 
> some subtypes, etc.
> 
> So we generate the warning when there are two methods m1 and m2 in a class C 
> such that:
> 
> * m1 and m2 consitiute a "potentially ambiguous overload" (using the same 
> definition as before)
> * There is no direct supertype T of C such that m1 and m2, or some methods 
> they override, both exist in T and constitute a "potentially ambiguous 
> overload" as members of T
> * We haven't already generated a warning for either m1 or m2 in class C
> 
> If either method is declared in C, we locate the warning there, but when both 
> methods are inherited, there's no method declaration to point at so the 
> warning is instead located at the class declaration.
> 
> I noticed a couple of other minor bugs; these are also being fixed here:
> 
> (1) For inherited methods, the method signatures were being reported as they 
> are declared, rather than in the context of the class being visited. As a 
> result, when a methods is inherited from a generic supertype, the ambiguity 
> is less clear. Here's an example:
> 
> interface Upper {
> void foo(T c);
> }
> 
> interface Lower extends Upper {
> void foo(Consumer c);
> }
> 
> Currently, the error is reported as:
> 
> warning: [overloads] foo(Consumer) in Lower is potentially ambiguous 
> with foo(T) in Upper
> 
> Reporting the method signatures in the context of the class being visited 
> makes the ambiguity clearer:
> 
> warning: [overloads] foo(Consumer) in Lower is potentially ambiguous 
> with foo(IntConsumer) in Upper
> 
> 
> (2) When a method is identified as part of an ambiguous pair, we were setting 
> a `POTENTIALLY_AMBIGUOUS` flag on it. This caused it to be forever excluded 
> from future warnings. For methods that are declared in the class we're 
> visiting, this makes sense, but it doesn't make sense for inherited methods, 
> because it disqualifies them from participating in the analysis of any other 
> class that also inherits them.
> 
> As a result, for a class like the one below, the compiler was only generating 
> one warning instead of three:
> 
> public interface SuperIface {
> void foo(Consumer c);
> }
> 
> public interface I1 extends SuperIface { 
> void foo(IntConsumer c);// warning was generated here
> }   
> 
> public interface I2 extends SuperIface { 
> void foo(IntConsumer c);// no warning was generated here
> }   
> 
> public interface I3 extends SuperIface { 
> void foo(IntConsumer c);// no warning was generated here
> }   
> 
> 
> With this patch the `POTENTIALLY_AMBIGUOUS` flag is no longer needed. I 
> wasn't sure whether to renumber all the subsequent flags, or just leave an 
> empty placeholder, 

Re: RFR: 8302877: Speed up latin1 case conversions

2023-02-20 Thread Naoto Sato
On Sat, 18 Feb 2023 06:43:27 GMT, Eirik Bjorsnos  wrote:

>> test/jdk/java/lang/Character/Latin1CaseFolding.java line 31:
>> 
>>> 29: /**
>>> 30:  * @test
>>> 31:  * @summary Provides exchaustive verification of Character.toUpperCase 
>>> and Character.toLowerCase
>> 
>> typo: "exhaustive"?
>
> I did an 'exchaustive' search for 'exchaustive' across the code base and 
> found  two comments in `LocaleData` and `LocaleData.cldr` in 
> `jdk/test/jdk/sun/text/resources`.
> 
> Would you like me to update these as well while we're here, or should we 
> avoid getting out scope for this PR?

I'd appreciate it. I don't mind fixing it with this PR.

-

PR: https://git.openjdk.org/jdk/pull/12623


Re: RFR: 8026369: javac potentially ambiguous overload warning needs an improved scheme

2023-02-20 Thread Archie L . Cobbs
On Mon, 20 Feb 2023 23:53:05 GMT, SWinxy  wrote:

> In the `AWTEventMulticaster` class(es), which interfaces are causing the 
> ambiguity?

Ah - my apologies. Those got added during development but were not needed once 
all the bugs were worked out.

I've removed them and that will also eliminate `client` and `i18n` from review. 
Thanks!

-

PR: https://git.openjdk.org/jdk/pull/12645


Re: RFR: 8026369: javac potentially ambiguous overload warning needs an improved scheme [v2]

2023-02-20 Thread Archie L . Cobbs
> This bug relates to the "potentially ambiguous overload" warning which is 
> enabled by `-Xlint:overloads`.
> 
> The warning detects certain ambiguities that can cause problems for lambdas. 
> For example, consider the interface `Spliterator.OfInt`, which declares these 
> two methods:
> 
> void forEachRemaining(Consumer action);
> void forEachRemaining(IntConsumer action);
> 
> Both methods have the same name, same number of parameters, and take a lambda 
> with the same "shape" in the same argument position. This causes an ambiguity 
> in any code that wants to do this:
> 
> spliterator.forEachRemaining(x -> { ... });
> 
> That code won't compile; instead, you'll get this error:
> 
> Ambiguity.java:4: error: reference to forEachRemaining is ambiguous
> spliterator.forEachRemaining(x -> { });
>^
>   both method forEachRemaining(IntConsumer) in OfInt and method 
> forEachRemaining(Consumer) in OfInt match
> 
> 
> The problem reported by the bug is that the warning fails to detect 
> ambiguities which are created purely by inheritance, for example:
> 
> interface ConsumerOfInteger {
> void foo(Consumer c);
> }
> 
> interface IntegerConsumer {
> void foo(IntConsumer c);
> }
> 
> // We should get a warning here...
> interface Test extends ConsumerOfInteger, IntegerConsumer {
> }
> 
> 
> The cause of the bug is that ambiguities are detected on a per-method basis, 
> by checking whether a method is part of an ambiguity pair when we visit that 
> method. So if the methods in an ambiguity pair are inherited from two 
> distinct supertypes, we'll miss the ambiguity.
> 
> To fix the problem, we need to look for ambiguities on a per-class level, 
> checking all pairs of methods. However, it's not that simple - we only want 
> to "blame" a class when that class itself, and not some supertype, is 
> responsible for creating the ambiguity. For example, any interface extending 
> `Spliterator.OfInt` will automatically inherit the two ambiguities mentioned 
> above, but these are not the interface's fault so to speak so no warning 
> should be generated. Making things more complicated is the fact that methods 
> can be overridden and declared in generic classes so they only conflict in 
> some subtypes, etc.
> 
> So we generate the warning when there are two methods m1 and m2 in a class C 
> such that:
> 
> * m1 and m2 consitiute a "potentially ambiguous overload" (using the same 
> definition as before)
> * There is no direct supertype T of C such that m1 and m2, or some methods 
> they override, both exist in T and constitute a "potentially ambiguous 
> overload" as members of T
> * We haven't already generated a warning for either m1 or m2 in class C
> 
> If either method is declared in C, we locate the warning there, but when both 
> methods are inherited, there's no method declaration to point at so the 
> warning is instead located at the class declaration.
> 
> I noticed a couple of other minor bugs; these are also being fixed here:
> 
> (1) For inherited methods, the method signatures were being reported as they 
> are declared, rather than in the context of the class being visited. As a 
> result, when a methods is inherited from a generic supertype, the ambiguity 
> is less clear. Here's an example:
> 
> interface Upper {
> void foo(T c);
> }
> 
> interface Lower extends Upper {
> void foo(Consumer c);
> }
> 
> Currently, the error is reported as:
> 
> warning: [overloads] foo(Consumer) in Lower is potentially ambiguous 
> with foo(T) in Upper
> 
> Reporting the method signatures in the context of the class being visited 
> makes the ambiguity clearer:
> 
> warning: [overloads] foo(Consumer) in Lower is potentially ambiguous 
> with foo(IntConsumer) in Upper
> 
> 
> (2) When a method is identified as part of an ambiguous pair, we were setting 
> a `POTENTIALLY_AMBIGUOUS` flag on it. This caused it to be forever excluded 
> from future warnings. For methods that are declared in the class we're 
> visiting, this makes sense, but it doesn't make sense for inherited methods, 
> because it disqualifies them from participating in the analysis of any other 
> class that also inherits them.
> 
> As a result, for a class like the one below, the compiler was only generating 
> one warning instead of three:
> 
> public interface SuperIface {
> void foo(Consumer c);
> }
> 
> public interface I1 extends SuperIface { 
> void foo(IntConsumer c);// warning was generated here
> }   
> 
> public interface I2 extends SuperIface { 
> void foo(IntConsumer c);// no warning was generated here
> }   
> 
> public interface I3 extends SuperIface { 
> void foo(IntConsumer c);// no warning was generated here
> }   
> 
> 
> With this patch the `POTENTIALLY_AMBIGUOUS` flag is no longer needed. I 
> wasn't sure whether to renumber all the subsequent flags, or just leave an 
> empty placeholder, so I chose the latter.
> 
> Finally, this fix uncovers new 

RFR: 8143900: OptimizeStringConcat has an opaque dependency on Integer.sizeTable field

2023-02-20 Thread Yi Yang
Hi, can I have a review for this patch? I noticed a strange field 
Integer.sizeTable, after digging into its history, I think it could be replaced 
by an in-place array allocation and initialization.

Thanks.

-

Commit messages:
 - 8143900 OptimizeStringConcat has an opaque dependency on Integer.sizeTable 
field

Changes: https://git.openjdk.org/jdk/pull/12680/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12680&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8143900
  Stats: 105 lines in 3 files changed: 43 ins; 53 del; 9 mod
  Patch: https://git.openjdk.org/jdk/pull/12680.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12680/head:pull/12680

PR: https://git.openjdk.org/jdk/pull/12680


Re: RFR: 8302818: Optimize wrapper sets and immutable sets of Enums [v5]

2023-02-20 Thread Tingjun Yuan
> Currently, the two subclasses of `java.util.EnumSet` optimize bulk operations 
> when the argument is also a `EnumSet`, but there is no such optimization for 
> wrapper sets (returned by `Collections.unmodifiableSet`, 
> `Collections.synchronizedSet`, etc.) and immutable sets (returned by `Set.of` 
> methods) of `Enum`s.
> 
> This PR introduces optimization classes for these situations.  No public APIs 
> are changed.

Tingjun Yuan has updated the pull request incrementally with one additional 
commit since the last revision:

  Set.copyOf: need defensive copy

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/12498/files
  - new: https://git.openjdk.org/jdk/pull/12498/files/b9699bfe..0e0b2bf1

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12498&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12498&range=03-04

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/12498.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12498/head:pull/12498

PR: https://git.openjdk.org/jdk/pull/12498


Re: RFR: 8302871: Speed up StringLatin1.regionMatchesCI [v5]

2023-02-20 Thread Eirik Bjorsnos
> This PR suggests we can speed up `StringLatin1.regionMatchesCI` by applying 
> 'the oldest ASCII trick in the book'.
> 
> The new static method `CharacterDataLatin1.equalsIgnoreCase` compares two 
> latin1 bytes for equality ignoring case. `StringLatin1.regionMatchesCI` is 
> updated to use `equalsIgnoreCase`
> 
> To verify the correctness of `equalsIgnoreCase`, a new test is added  to 
> `EqualsIgnoreCase` with an exhaustive verification that all 256x256 latin1 
> code point pairs have an `equalsIgnoreCase` consistent with 
> Character.toUpperCase, Character.toLowerCase.
> 
> Performance is tested for matching and mismatching cases of code point pairs 
> picked from the ASCII letter, ASCII number and latin1 letter ranges. Results 
> in the first comment below.

Eirik Bjorsnos has updated the pull request incrementally with one additional 
commit since the last revision:

  Spell fix for 'exhaustive' in comments in sun/text/resources

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/12632/files
  - new: https://git.openjdk.org/jdk/pull/12632/files/03d3e2cb..5e9927a4

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12632&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12632&range=03-04

  Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/12632.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12632/head:pull/12632

PR: https://git.openjdk.org/jdk/pull/12632


Re: RFR: 8302871: Speed up StringLatin1.regionMatchesCI [v6]

2023-02-20 Thread Eirik Bjorsnos
> This PR suggests we can speed up `StringLatin1.regionMatchesCI` by applying 
> 'the oldest ASCII trick in the book'.
> 
> The new static method `CharacterDataLatin1.equalsIgnoreCase` compares two 
> latin1 bytes for equality ignoring case. `StringLatin1.regionMatchesCI` is 
> updated to use `equalsIgnoreCase`
> 
> To verify the correctness of `equalsIgnoreCase`, a new test is added  to 
> `EqualsIgnoreCase` with an exhaustive verification that all 256x256 latin1 
> code point pairs have an `equalsIgnoreCase` consistent with 
> Character.toUpperCase, Character.toLowerCase.
> 
> Performance is tested for matching and mismatching cases of code point pairs 
> picked from the ASCII letter, ASCII number and latin1 letter ranges. Results 
> in the first comment below.

Eirik Bjorsnos has updated the pull request incrementally with one additional 
commit since the last revision:

  Revert "Spell fix for 'exhaustive' in comments in sun/text/resources"
  
  This reverts commit 5e9927a4b35e157fd3fa72fd2663c8bfbecf32bb.

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/12632/files
  - new: https://git.openjdk.org/jdk/pull/12632/files/5e9927a4..b8139961

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12632&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12632&range=04-05

  Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/12632.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12632/head:pull/12632

PR: https://git.openjdk.org/jdk/pull/12632


Re: RFR: 8302877: Speed up latin1 case conversions [v2]

2023-02-20 Thread Eirik Bjorsnos
> This PR suggests we speed up Character.toUpperCase and Character.toLowerCase 
> for latin1 code points by applying the 'oldest ASCII trick in the book'.
> 
> This takes advantage of the fact that latin1 uppercase code points are always 
> 0x20 lower than their lowercase (with the exception of two code points which 
> uppercase out of latin1).
> 
> To verify the correctness of the new implementation, the test 
> `Latin1CaseConversion` is added with an exhaustive verification of 
> toUpperCase/toLowerCase for all latin1 code points.
> 
> The implementation needs to balance the performance of the various ranges in 
> latin1. An effort has been made to favour operations on ASCII code points, 
> without causing excessive regression for higher code points.
> 
> Performance is benchmarked for 7 chosen sample code points, each representing 
> a range or a special-case.  Results in the first comment.

Eirik Bjorsnos has updated the pull request incrementally with one additional 
commit since the last revision:

  Spell fix for 'exhaustive' in comments in sun/text/resources

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/12623/files
  - new: https://git.openjdk.org/jdk/pull/12623/files/57a27d39..70c624d7

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12623&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12623&range=00-01

  Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/12623.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12623/head:pull/12623

PR: https://git.openjdk.org/jdk/pull/12623