Re: RFR: 8283726: x86 intrinsics for compare method in Integer and Long

2022-03-27 Thread Quan Anh Mai
On Sun, 27 Mar 2022 06:15:34 GMT, Vamsi Parasa wrote:

> Implements x86 intrinsics for compare() method in java.lang.Integer and 
> java.lang.Long.

This is both complicated and inefficient. I would suggest building the 
intrinsic in the IR graph so that the compiler can simplify 
`Integer.compareUnsigned(x, y) < 0` into `x u< y`. Thanks.
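
For readers unfamiliar with the `u<` notation, the following is a minimal sketch of the equivalence the suggested simplification relies on. It is plain Java, not compiler IR; the class name `UnsignedCompareSketch` and the helper `unsignedLessThan` are hypothetical names introduced only for this illustration.

```java
public class UnsignedCompareSketch {
    // Hypothetical helper: unsigned "x < y" on ints.
    // Flipping the sign bit maps unsigned order onto signed order.
    static boolean unsignedLessThan(int x, int y) {
        return (x ^ Integer.MIN_VALUE) < (y ^ Integer.MIN_VALUE);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 42, -42, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int x : samples) {
            for (int y : samples) {
                // The three-way compare followed by "< 0" carries no more
                // information than a single unsigned less-than test.
                boolean viaCompare = Integer.compareUnsigned(x, y) < 0;
                if (viaCompare != unsignedLessThan(x, y)) {
                    throw new AssertionError(x + " vs " + y);
                }
            }
        }
        System.out.println("all sample pairs agree");
    }
}
```

A three-way result that is immediately tested against zero can therefore, in principle, collapse to one comparison, which is the kind of folding an IR-level intrinsic would allow.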

-

PR: https://git.openjdk.java.net/jdk/pull/7975


RFR: 8283726: x86 intrinsics for compare method in Integer and Long

2022-03-27 Thread Vamsi Parasa
Implements x86 intrinsics for compare() method in java.lang.Integer and 
java.lang.Long.

-

Commit messages:
 - add JMH benchmarks
 - 8283726: x86 intrinsics for compare method in Integer and Long

Changes: https://git.openjdk.java.net/jdk/pull/7975/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7975&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8283726
  Stats: 430 lines in 13 files changed: 428 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7975.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7975/head:pull/7975

PR: https://git.openjdk.java.net/jdk/pull/7975


Re: RFR: 8282819: Deprecate Locale class constructors [v3]

2022-03-27 Thread ExE Boss
On Fri, 25 Mar 2022 22:51:23 GMT, Naoto Sato wrote:

>> Proposing to deprecate the constructors in the `java.util.Locale` class. 
>> There is already a factory method and a builder to return singletons, so 
>> there is no need to have constructors anymore unless one purposefully wants 
>> to create ill-formed Locale objects, which is discouraged. We cannot 
>> terminally deprecate those constructors, for compatibility with serialized 
>> objects created with older JDKs. Please see the draft CSR for more detail.
>
> Naoto Sato has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Fixed a build failure

src/java.base/share/classes/java/util/Locale.java line 819:

> 817:  * @since 19
> 818:  */
> 819: public static Locale of(String... fields) {

Arguably, there should be `Locale.of` overloads taking 0 to 4 arguments, so 
that it’s not necessary to box the fields in a `String` array.

src/java.base/share/classes/java/util/Locale.java line 825:

> 823: case 2 -> getInstance(fields[0], "", fields[1], "", null);
> 824: case 3 -> getInstance(fields[0], "", fields[1], fields[2], 
> null);
> 825: default -> getInstance(fields[0], fields[3], fields[1], 
> fields[2], null);

This should probably throw `IllegalArgumentException` when more than 4 fields 
are passed:
Suggestion:

case 4 -> getInstance(fields[0], fields[3], fields[1], fields[2], null);
default -> throw new IllegalArgumentException(/* TODO: message */);
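
Both review comments can be illustrated together. The sketch below is not the `java.util.Locale` code: `LocaleFieldsSketch` and its `getInstance` are hypothetical stand-ins that return a joined string instead of a `Locale`, the handling of 0 and 1 fields is assumed (those cases are not shown in the quoted diff), and the exception message is made up. It requires a JDK with switch expressions (Java 14 or later).

```java
public class LocaleFieldsSketch {
    // Hypothetical stand-in for getInstance(language, script, region, variant, extensions).
    static String getInstance(String language, String script, String region, String variant) {
        return String.join("|", language, script, region, variant);
    }

    // Fixed-arity overloads: the call site does not allocate a String[].
    static String of(String language) {
        return getInstance(language, "", "", "");
    }

    static String of(String language, String country) {
        return getInstance(language, "", country, "");
    }

    // Varargs fallback that rejects more than 4 fields instead of silently
    // ignoring the extras. Field order follows the quoted diff:
    // language, country, variant, script.
    static String of(String... fields) {
        return switch (fields.length) {
            case 0 -> getInstance("", "", "", "");          // assumed behaviour
            case 1 -> getInstance(fields[0], "", "", "");   // assumed behaviour
            case 2 -> getInstance(fields[0], "", fields[1], "");
            case 3 -> getInstance(fields[0], "", fields[1], fields[2]);
            case 4 -> getInstance(fields[0], fields[3], fields[1], fields[2]);
            default -> throw new IllegalArgumentException(
                    "expected at most 4 fields, got " + fields.length);
        };
    }

    public static void main(String[] args) {
        System.out.println(of("en", "US"));                  // resolves to the 2-arg overload
        System.out.println(of("sr", "RS", "", "Latn"));      // resolves to the varargs method
    }
}
```

Java overload resolution prefers a fixed-arity match over a varargs one, so existing callers that pass one or two strings would pick up the allocation-free overloads without source changes.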

-

PR: https://git.openjdk.java.net/jdk/pull/7947


Re: RFR: 8186958: Need method to create pre-sized HashMap [v7]

2022-03-27 Thread XenoAmess
On Thu, 24 Mar 2022 17:43:31 GMT, XenoAmess wrote:

>> 8186958: Need method to create pre-sized HashMap
>
> XenoAmess has updated the pull request incrementally with two additional 
> commits since the last revision:
> 
>  - update jmh
>  - refine javadoc; refine implement when expectedSize < 0

I think the functions look good now.
The first step, adding these functions, seems complete.
Should we move on to the next step, changing some existing usages in 
java.base to call these functions?
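
To make concrete what such a usage change would replace, here is a minimal sketch of the sizing arithmetic. `PresizedMapSketch`, `capacityFor` and `newPresizedHashMap` are hypothetical names for illustration, not the methods added by the PR; the 0.75 figure is `HashMap`'s documented default load factor, and rejecting negative sizes with `IllegalArgumentException` is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class PresizedMapSketch {
    // Initial capacity needed so that expectedSize mappings fit without a
    // resize under the default load factor of 0.75.
    static int capacityFor(int expectedSize) {
        if (expectedSize < 0) {
            // Assumed handling of negative input.
            throw new IllegalArgumentException("Negative number of mappings: " + expectedSize);
        }
        return (int) Math.ceil(expectedSize / 0.75);
    }

    // Hypothetical factory in the spirit of the proposed API.
    static <K, V> Map<K, V> newPresizedHashMap(int expectedSize) {
        return new HashMap<>(capacityFor(expectedSize));
    }

    public static void main(String[] args) {
        // new HashMap<>(12) would resize once the 10th entry is added;
        // sizing by expected mappings requests a capacity of 16 instead.
        Map<String, Integer> map = newPresizedHashMap(12);
        for (int i = 0; i < 12; i++) {
            map.put("key" + i, i);
        }
        System.out.println(capacityFor(12) + " " + map.size());
    }
}
```

Call sites that currently write `new HashMap<>(n)` while meaning "room for n entries" are the natural candidates for the follow-up step.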

-

PR: https://git.openjdk.java.net/jdk/pull/7928


Re: RFR: 8282664: Unroll by hand StringUTF16 and StringLatin1 polynomial hash loops [v2]

2022-03-27 Thread ExE Boss
On Fri, 4 Mar 2022 17:44:44 GMT, Ludovic Henry wrote:

>> Despite the hash value being cached for Strings, computing the hash still 
>> represents a significant CPU usage for applications handling lots of text.
>> 
>> Even though it would be generally better to do it through an enhancement to 
>> the autovectorizer, the complexity of doing it by hand is trivial and the 
>> gain is sizable (2x speedup) even without the Vector API. The algorithm has 
>> been proposed by Richard Startin and Paul Sandoz [1].
>> 
>> Speedups are as follows on an `Intel(R) Xeon(R) E-2276G CPU @ 3.80GHz`:
>> 
>> 
>> Benchmark                                        (size)  Mode  Cnt     Score    Error  Units
>> StringHashCode.Algorithm.scalarLatin1                 0  avgt   25     2.111 ±  0.210  ns/op
>> StringHashCode.Algorithm.scalarLatin1                 1  avgt   25     3.500 ±  0.127  ns/op
>> StringHashCode.Algorithm.scalarLatin1                10  avgt   25     7.001 ±  0.099  ns/op
>> StringHashCode.Algorithm.scalarLatin1               100  avgt   25    61.285 ±  0.444  ns/op
>> StringHashCode.Algorithm.scalarLatin1              1000  avgt   25   628.995 ±  0.846  ns/op
>> StringHashCode.Algorithm.scalarLatin1             10000  avgt   25  6307.990 ±  4.071  ns/op
>> StringHashCode.Algorithm.scalarLatin1Unrolled16       0  avgt   25     2.358 ±  0.092  ns/op
>> StringHashCode.Algorithm.scalarLatin1Unrolled16       1  avgt   25     3.631 ±  0.159  ns/op
>> StringHashCode.Algorithm.scalarLatin1Unrolled16      10  avgt   25     7.049 ±  0.019  ns/op
>> StringHashCode.Algorithm.scalarLatin1Unrolled16     100  avgt   25    33.626 ±  1.218  ns/op
>> StringHashCode.Algorithm.scalarLatin1Unrolled16    1000  avgt   25   317.811 ±  1.225  ns/op
>> StringHashCode.Algorithm.scalarLatin1Unrolled16   10000  avgt   25  3212.333 ± 14.621  ns/op
>> StringHashCode.Algorithm.scalarLatin1Unrolled8        0  avgt   25     2.356 ±  0.097  ns/op
>> StringHashCode.Algorithm.scalarLatin1Unrolled8        1  avgt   25     3.630 ±  0.158  ns/op
>> StringHashCode.Algorithm.scalarLatin1Unrolled8       10  avgt   25     8.724 ±  0.065  ns/op
>> StringHashCode.Algorithm.scalarLatin1Unrolled8      100  avgt   25    32.402 ±  0.019  ns/op
>> StringHashCode.Algorithm.scalarLatin1Unrolled8     1000  avgt   25   321.949 ±  0.251  ns/op
>> StringHashCode.Algorithm.scalarLatin1Unrolled8    10000  avgt   25  3202.083 ±  1.667  ns/op
>> StringHashCode.Algorithm.scalarUTF16                  0  avgt   25     2.135 ±  0.191  ns/op
>> StringHashCode.Algorithm.scalarUTF16                  1  avgt   25     5.202 ±  0.362  ns/op
>> StringHashCode.Algorithm.scalarUTF16                 10  avgt   25    11.105 ±  0.112  ns/op
>> StringHashCode.Algorithm.scalarUTF16                100  avgt   25    75.974 ±  0.702  ns/op
>> StringHashCode.Algorithm.scalarUTF16               1000  avgt   25   716.429 ±  3.290  ns/op
>> StringHashCode.Algorithm.scalarUTF16              10000  avgt   25  7095.459 ± 43.847  ns/op
>> StringHashCode.Algorithm.scalarUTF16Unrolled16        0  avgt   25     2.381 ±  0.038  ns/op
>> StringHashCode.Algorithm.scalarUTF16Unrolled16        1  avgt   25     5.268 ±  0.422  ns/op
>> StringHashCode.Algorithm.scalarUTF16Unrolled16       10  avgt   25    11.248 ±  0.178  ns/op
>> StringHashCode.Algorithm.scalarUTF16Unrolled16      100  avgt   25    52.966 ±  0.089  ns/op
>> StringHashCode.Algorithm.scalarUTF16Unrolled16     1000  avgt   25   450.912 ±  1.834  ns/op
>> StringHashCode.Algorithm.scalarUTF16Unrolled16    10000  avgt   25  4403.988 ±  2.927  ns/op
>> StringHashCode.Algorithm.scalarUTF16Unrolled8         0  avgt   25     2.401 ±  0.032  ns/op
>> StringHashCode.Algorithm.scalarUTF16Unrolled8         1  avgt   25     5.091 ±  0.396  ns/op
>> StringHashCode.Algorithm.scalarUTF16Unrolled8        10  avgt   25    12.801 ±  0.189  ns/op
>> StringHashCode.Algorithm.scalarUTF16Unrolled8       100  avgt   25    52.068 ±  0.032  ns/op
>> StringHashCode.Algorithm.scalarUTF16Unrolled8      1000  avgt   25   453.270 ±  0.340  ns/op
>> StringHashCode.Algorithm.scalarUTF16Unrolled8     10000  avgt   25  4433.112 ±  2.699  ns/op
>> 
>> 
>> At Datadog, we handle a large amount of text (through log management, for 
>> example), and hashing Strings represents a large part of our CPU usage. It's 
>> very unlikely that we are the only ones, as String.hashCode is such a core 
>> feature of JVM-based languages, with its use in HashMap for example. 
>> Even a 2x speedup alone would allow us to save thousands of CPU cores 
>> per month and correspondingly reduce the energy/carbon impact.
>> 
>> [1] 
>> https://static.rainfocus.com/oracle/oow18/sess/1525822677955001tLqU/PF/codeone18-vector-API-DEV5081_1540354883936001Q3Sv.pdf
>
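
As a reading aid for the quoted description, the following is a minimal sketch of what unrolling the polynomial hash by hand can look like for Latin-1 bytes. It is not the code from the pull request: the class `UnrolledHashSketch`, its method names, and the choice of an 8-way unroll with a scalar tail loop are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

public class UnrolledHashSketch {
    // Powers of 31; int overflow wraps, which matches hashCode arithmetic.
    static final int P2 = 31 * 31;
    static final int P3 = P2 * 31;
    static final int P4 = P3 * 31;
    static final int P5 = P4 * 31;
    static final int P6 = P5 * 31;
    static final int P7 = P6 * 31;
    static final int P8 = P7 * 31;

    // Reference loop: h = 31 * h + c for every byte.
    static int scalarLatin1(byte[] value) {
        int h = 0;
        for (byte b : value) {
            h = 31 * h + (b & 0xff);
        }
        return h;
    }

    // Eight Horner steps expanded into one expression, so the eight
    // multiplications no longer form a serial dependency chain on h.
    static int unrolled8Latin1(byte[] value) {
        int h = 0;
        int i = 0;
        int len = value.length;
        for (; i + 8 <= len; i += 8) {
            h = P8 * h
                + P7 * (value[i] & 0xff)
                + P6 * (value[i + 1] & 0xff)
                + P5 * (value[i + 2] & 0xff)
                + P4 * (value[i + 3] & 0xff)
                + P3 * (value[i + 4] & 0xff)
                + P2 * (value[i + 5] & 0xff)
                + 31 * (value[i + 6] & 0xff)
                + (value[i + 7] & 0xff);
        }
        // Scalar tail for the remaining 0 to 7 bytes.
        for (; i < len; i++) {
            h = 31 * h + (value[i] & 0xff);
        }
        return h;
    }

    public static void main(String[] args) {
        String s = "The quick brown fox jumps over the lazy dog";
        byte[] latin1 = s.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(unrolled8Latin1(latin1) == s.hashCode());
    }
}
```

Because both loops compute the same polynomial modulo 2^32, the unrolled result is bit-identical to the scalar one; any speedup would come only from shortening the dependency chain.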
> Ludovic Henry has updated the pull request incrementally with one additional 
> commit since