RFR: 8281168: Micro-optimize VarForm.getMemberName for interpreter

2022-02-02 Thread Aleksey Shipilev
I was looking for easy things to do to improve `java.lang.invoke` cold 
performance. One of them is inlining `VarForm.getMemberName` a bit, so 
that the interpreter does not have to call through `getMemberNameOrNull`.

There is a direct VarHandle benchmark in our corpus:


$ CONF=linux-x86_64-server-release make run-test 
TEST=micro:java.lang.invoke.VarHandleExact 
MICRO="TIME=200ms;WARMUP_TIME=200ms;VM_OPTIONS=-Xint"

Benchmark                                 Mode  Cnt     Score    Error  Units

# -Xint
# Baseline
VarHandleExact.exact_exactInvocation      avgt   30   714.041 ±  5.882  ns/op
VarHandleExact.generic_exactInvocation    avgt   30   641.570 ± 11.681  ns/op
VarHandleExact.generic_genericInvocation  avgt   30  1336.571 ± 11.873  ns/op

# -Xint
# Patched
VarHandleExact.exact_exactInvocation      avgt   30   678.495 ± 10.752  ns/op  ; +5%
VarHandleExact.generic_exactInvocation    avgt   30   573.320 ±  5.100  ns/op  ; +11%
VarHandleExact.generic_genericInvocation  avgt   30  1338.593 ± 14.275  ns/op

# (server, default)
# Baseline
VarHandleExact.exact_exactInvocation      avgt   30   0.620 ± 0.079  ns/op
VarHandleExact.generic_exactInvocation    avgt   30   0.602 ± 0.063  ns/op
VarHandleExact.generic_genericInvocation  avgt   30  10.521 ± 0.065  ns/op

# (server, default)
# Patched
VarHandleExact.exact_exactInvocation      avgt   30   0.621 ± 0.070  ns/op
VarHandleExact.generic_exactInvocation    avgt   30   0.601 ± 0.061  ns/op
VarHandleExact.generic_genericInvocation  avgt   30  10.499 ± 0.070  ns/op


Additional testing:
 - [x] Linux x86_64 fastdebug `tier1`
 - [x] Linux x86_64 fastdebug `tier2`
 - [x] Linux x86_64 fastdebug `tier3`
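For readers without the patch in front of them, the shape of the optimization can be sketched as follows. This is an illustrative stand-in with hypothetical names, not the actual `java.lang.invoke` sources: the point is that the pre-patch code pays an extra interpreted call through a helper on every invocation, while the patched code does the table lookup inline.

```java
// Illustrative sketch only: VarFormSketch, getMemberNameBefore and
// getMemberNameAfter are hypothetical stand-ins for the real VarForm code.
final class VarFormSketch {
    private final String[] table; // stand-in for the per-mode MemberName table

    VarFormSketch(String[] table) { this.table = table; }

    // Before: every interpreted invocation pays a call into the helper.
    String getMemberNameBefore(int mode) {
        String mn = getMemberNameOrNull(mode);
        if (mn == null) throw new UnsupportedOperationException();
        return mn;
    }

    String getMemberNameOrNull(int mode) {
        return table[mode];
    }

    // After: the lookup is inlined, so the interpreter runs a single method.
    String getMemberNameAfter(int mode) {
        String mn = table[mode];
        if (mn == null) throw new UnsupportedOperationException();
        return mn;
    }
}
```

The JIT inlines the helper anyway, which is consistent with the unchanged server-mode numbers above; only the interpreter benefits.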

-

Commit messages:
 - Fix

Changes: https://git.openjdk.java.net/jdk/pull/7333/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7333&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8281168
  Stats: 5 lines in 1 file changed: 3 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7333.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7333/head:pull/7333

PR: https://git.openjdk.java.net/jdk/pull/7333


Re: RFR: JDK-8277175 : Add a parallel multiply method to BigInteger [v10]

2022-02-02 Thread Joe Darcy
On Thu, 3 Feb 2022 05:29:51 GMT, kabutz  wrote:

>> BigInteger currently uses three different algorithms for multiply. The 
>> simple quadratic algorithm, then the slightly better Karatsuba if we exceed 
>> a bit count and then Toom Cook 3 once we go into the several thousands of 
>> bits. Since Toom Cook 3 is a recursive algorithm, it is trivial to 
>> parallelize it. I have demonstrated this several times in conference talks. 
>> In order to be consistent with other classes such as Arrays and Collection, 
>> I have added a parallelMultiply() method. Internally we have added a 
>> parameter to the private multiply method to indicate whether the calculation 
>> should be done in parallel.
>> 
>> The performance improvements are as should be expected. Fibonacci of 100 
>> million (using a single-threaded Dijkstra's sum of squares version) 
>> completes in 9.2 seconds with the parallelMultiply() vs 25.3 seconds with 
>> the sequential multiply() method. This is on my 1-8-2 laptop. The final 
>> multiplications are with very large numbers, which then benefit from the 
>> parallelization of Toom-Cook 3. Fibonacci 100 million is a 347084 bit number.
>> 
>> We have also parallelized the private square() method. Internally, the 
>> square() method defaults to be sequential.
>> 
>> Some benchmark results, run on my 1-6-2 server:
>> 
>> 
>> Benchmark                                     (n)  Mode  Cnt      Score       Error  Units
>> BigIntegerParallelMultiply.multiply           100    ss    4     51.707 ±    11.194  ms/op
>> BigIntegerParallelMultiply.multiply          1000    ss    4    988.302 ±   235.977  ms/op
>> BigIntegerParallelMultiply.multiply             1    ss    4  24662.063 ±  1123.329  ms/op
>> BigIntegerParallelMultiply.parallelMultiply   100    ss    4     49.337 ±    26.611  ms/op
>> BigIntegerParallelMultiply.parallelMultiply  1000    ss    4    527.560 ±   268.903  ms/op
>> BigIntegerParallelMultiply.parallelMultiply     1    ss    4   9076.551 ±  1899.444  ms/op
>> 
>> 
>> We can see that for larger calculations (fib 100m), the execution is 2.7x 
>> faster in parallel. For medium size (fib 10m) it is 1.873x faster. And for 
>> small (fib 1m) it is roughly the same. Considering that the Fibonacci 
>> algorithm that we used was in itself sequential, and that the last 3 
>> calculations would dominate, 2.7x faster should probably be considered quite 
>> good on a 1-6-2 machine.
>
> kabutz has updated the pull request incrementally with one additional commit 
> since the last revision:
> 
>   Updated comment to include information about performance

src/java.base/share/classes/java/math/BigInteger.java line 1603:

> 1601:  * parallel multiplication algorithm will use more CPU resources
> 1602:  * to compute the result faster, with no increase in memory
> 1603:  * consumption.

The implNote should cover a space of possible parallel multiply implementations 
so it doesn't have to be updated as often as the implementation is tuned or 
adjusted. So I'd prefer to have a statement like "may use more memory" even if 
the current implementation doesn't actually use more memory. If there are any 
"contraindications" on when to use the method, they could be listed here too.

-

PR: https://git.openjdk.java.net/jdk/pull/6409


Re: RFR: JDK-8277175 : Add a parallel multiply method to BigInteger [v10]

2022-02-02 Thread kabutz
On Thu, 3 Feb 2022 05:29:51 GMT, kabutz  wrote:

>> BigInteger currently uses three different algorithms for multiply. The 
>> simple quadratic algorithm, then the slightly better Karatsuba if we exceed 
>> a bit count and then Toom Cook 3 once we go into the several thousands of 
>> bits. Since Toom Cook 3 is a recursive algorithm, it is trivial to 
>> parallelize it. I have demonstrated this several times in conference talks. 
>> In order to be consistent with other classes such as Arrays and Collection, 
>> I have added a parallelMultiply() method. Internally we have added a 
>> parameter to the private multiply method to indicate whether the calculation 
>> should be done in parallel.
>> 
>> The performance improvements are as should be expected. Fibonacci of 100 
>> million (using a single-threaded Dijkstra's sum of squares version) 
>> completes in 9.2 seconds with the parallelMultiply() vs 25.3 seconds with 
>> the sequential multiply() method. This is on my 1-8-2 laptop. The final 
>> multiplications are with very large numbers, which then benefit from the 
>> parallelization of Toom-Cook 3. Fibonacci 100 million is a 347084 bit number.
>> 
>> We have also parallelized the private square() method. Internally, the 
>> square() method defaults to be sequential.
>> 
>> Some benchmark results, run on my 1-6-2 server:
>> 
>> 
>> Benchmark                                     (n)  Mode  Cnt      Score       Error  Units
>> BigIntegerParallelMultiply.multiply           100    ss    4     51.707 ±    11.194  ms/op
>> BigIntegerParallelMultiply.multiply          1000    ss    4    988.302 ±   235.977  ms/op
>> BigIntegerParallelMultiply.multiply             1    ss    4  24662.063 ±  1123.329  ms/op
>> BigIntegerParallelMultiply.parallelMultiply   100    ss    4     49.337 ±    26.611  ms/op
>> BigIntegerParallelMultiply.parallelMultiply  1000    ss    4    527.560 ±   268.903  ms/op
>> BigIntegerParallelMultiply.parallelMultiply     1    ss    4   9076.551 ±  1899.444  ms/op
>> 
>> 
>> We can see that for larger calculations (fib 100m), the execution is 2.7x 
>> faster in parallel. For medium size (fib 10m) it is 1.873x faster. And for 
>> small (fib 1m) it is roughly the same. Considering that the Fibonacci 
>> algorithm that we used was in itself sequential, and that the last 3 
>> calculations would dominate, 2.7x faster should probably be considered quite 
>> good on a 1-6-2 machine.
>
> kabutz has updated the pull request incrementally with one additional commit 
> since the last revision:
> 
>   Updated comment to include information about performance

multiply() and parallelMultiply() now use exactly the same amount of memory.
However, they both use a little more than the previous multiply() method when
the numbers are very large. We tried various approaches to keep the memory
usage the same for the non-parallel multiply(), but the solutions were not
elegant. Since the small memory increase only occurs when the object
allocation is already huge, the extra memory did not make a difference. For
small numbers, multiply() and parallelMultiply() behave exactly like the old
multiply(); multiply() thus has the same latency and CPU consumption as before.

A question about the wording of the @implNote. In multiply() it says "An
implementation may offer better algorithmic ...", but we changed this to "This
implementation may offer better algorithmic ...". I've kept it as "This
implementation may ...", but what is the better way of writing such
implementation notes?

-

PR: https://git.openjdk.java.net/jdk/pull/6409


Re: RFR: JDK-8277175 : Add a parallel multiply method to BigInteger [v10]

2022-02-02 Thread kabutz
> BigInteger currently uses three different algorithms for multiply. The simple 
> quadratic algorithm, then the slightly better Karatsuba if we exceed a bit 
> count and then Toom Cook 3 once we go into the several thousands of bits. 
> Since Toom Cook 3 is a recursive algorithm, it is trivial to parallelize it. 
> I have demonstrated this several times in conference talks. In order to be 
> consistent with other classes such as Arrays and Collection, I have added a 
> parallelMultiply() method. Internally we have added a parameter to the 
> private multiply method to indicate whether the calculation should be done in 
> parallel.
> 
> The performance improvements are as should be expected. Fibonacci of 100 
> million (using a single-threaded Dijkstra's sum of squares version) completes 
> in 9.2 seconds with the parallelMultiply() vs 25.3 seconds with the 
> sequential multiply() method. This is on my 1-8-2 laptop. The final 
> multiplications are with very large numbers, which then benefit from the 
> parallelization of Toom-Cook 3. Fibonacci 100 million is a 347084 bit number.
> 
> We have also parallelized the private square() method. Internally, the 
> square() method defaults to be sequential.
> 
> Some benchmark results, run on my 1-6-2 server:
> 
> 
> Benchmark                                     (n)  Mode  Cnt      Score       Error  Units
> BigIntegerParallelMultiply.multiply           100    ss    4     51.707 ±    11.194  ms/op
> BigIntegerParallelMultiply.multiply          1000    ss    4    988.302 ±   235.977  ms/op
> BigIntegerParallelMultiply.multiply             1    ss    4  24662.063 ±  1123.329  ms/op
> BigIntegerParallelMultiply.parallelMultiply   100    ss    4     49.337 ±    26.611  ms/op
> BigIntegerParallelMultiply.parallelMultiply  1000    ss    4    527.560 ±   268.903  ms/op
> BigIntegerParallelMultiply.parallelMultiply     1    ss    4   9076.551 ±  1899.444  ms/op
> 
> 
> We can see that for larger calculations (fib 100m), the execution is 2.7x 
> faster in parallel. For medium size (fib 10m) it is 1.873x faster. And for 
> small (fib 1m) it is roughly the same. Considering that the Fibonacci 
> algorithm that we used was in itself sequential, and that the last 3 
> calculations would dominate, 2.7x faster should probably be considered quite 
> good on a 1-6-2 machine.

kabutz has updated the pull request incrementally with one additional commit 
since the last revision:

  Updated comment to include information about performance
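The property that makes Toom-Cook 3 "trivial to parallelize" is its recursive structure: the sub-multiplications at each level are independent, so some can be forked while the current thread computes the rest. Below is a rough, hypothetical sketch of that fork/compute/join shape. It splits one operand in half rather than performing the JDK's actual three-way Toom-Cook split, and the threshold is illustrative:

```java
import java.math.BigInteger;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative sketch, not the JDK implementation: recursively split one
// operand, fork the high-half product, compute the low half in this thread,
// then join and recombine using a*b == (aHi*b << half) + aLo*b.
class ParallelMultiplySketch extends RecursiveTask<BigInteger> {
    private static final int THRESHOLD_BITS = 4096; // hypothetical cut-off

    private final BigInteger a, b;

    ParallelMultiplySketch(BigInteger a, BigInteger b) { this.a = a; this.b = b; }

    @Override protected BigInteger compute() {
        if (a.bitLength() <= THRESHOLD_BITS) return a.multiply(b);
        int half = a.bitLength() / 2;
        BigInteger aHi = a.shiftRight(half);
        BigInteger aLo = a.subtract(aHi.shiftLeft(half));
        ParallelMultiplySketch hiTask = new ParallelMultiplySketch(aHi, b);
        hiTask.fork();                                  // high half runs in parallel
        BigInteger lo = new ParallelMultiplySketch(aLo, b).compute();
        return hiTask.join().shiftLeft(half).add(lo);
    }

    static BigInteger multiply(BigInteger a, BigInteger b) {
        return ForkJoinPool.commonPool().invoke(new ParallelMultiplySketch(a, b));
    }
}
```

The same recombination identity holds at every level, so the parallel result is bit-for-bit identical to the sequential one, which is the invariant the real parallelMultiply() also guarantees.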

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/6409/files
  - new: https://git.openjdk.java.net/jdk/pull/6409/files/fc7b844a..ef74878e

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=6409&range=09
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=6409&range=08-09

  Stats: 9 lines in 1 file changed: 8 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6409.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6409/head:pull/6409

PR: https://git.openjdk.java.net/jdk/pull/6409


Re: RFR: 8278753: Runtime crashes with access violation during JNI_CreateJavaVM call

2022-02-02 Thread Yumin Qi
On Tue, 25 Jan 2022 00:20:19 GMT, Yumin Qi  wrote:

> Please review,
>   When jlink is run with --compress=2, zip is used to compress the files 
> while copying them. The user's case failed to load zip.dll, since zip.dll is 
> not on the PATH. The failure occurs after we get NULL from 
> GetModuleHandle("zip.dll"); a subsequent LoadLibrary("zip.dll") has the same 
> result.
>   The fix is to call ClassLoader's load_zip_library first --- if the zip 
> library is already loaded, just return the cached handle for subsequent use; 
> if not, load the zip library and cache the handle.
> 
>   Tests: tier1,4,7 in test
>   Manually tested the user's case, and checked the output of jimage list for 
> jlinked files using --compress=2.
> 
> Thanks
> Yumin

Since there has been no further update, I will integrate tomorrow.
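The caching scheme described in the quoted fix ("if the zip library is already loaded, return the cached handle; otherwise load it and cache the handle") follows the usual load-once pattern. A hedged Java-level sketch with hypothetical names (the real change is in HotSpot's native ClassLoader code, not Java):

```java
// Hypothetical sketch of "load once, cache the handle, reuse it". The names
// and the Object stand-in for a native library handle are illustrative.
class ZipLibrarySketch {
    private static volatile Object zipHandle;

    static Object loadZipLibrary() {
        Object h = zipHandle;
        if (h == null) {
            synchronized (ZipLibrarySketch.class) {
                h = zipHandle;                       // re-check under the lock
                if (h == null) {
                    h = openNativeLibrary("zip");    // stand-in for the OS loader call
                    zipHandle = h;
                }
            }
        }
        return h;
    }

    private static Object openNativeLibrary(String name) {
        return new Object(); // placeholder; real code would dlopen/LoadLibrary
    }
}
```

Every caller after the first gets the cached handle, so the library is located and loaded exactly once regardless of PATH lookups later in the process lifetime.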

-

PR: https://git.openjdk.java.net/jdk/pull/7206


Integrated: 8266974: duplicate property key in java.sql.rowset resource bundle

2022-02-02 Thread Masanori Yano
On Tue, 25 Jan 2022 10:47:41 GMT, Masanori Yano  wrote:

> I have removed the duplicate property keys.
> Could you please review the fix?

This pull request has now been integrated.

Changeset: e3d5c9e7
Author:Masanori Yano 
Committer: Lance Andersen 
URL:   
https://git.openjdk.java.net/jdk/commit/e3d5c9e7c4ab210ae7a4417a47632603910744a1
Stats: 22 lines in 11 files changed: 0 ins; 11 del; 11 mod

8266974: duplicate property key in java.sql.rowset resource bundle

Reviewed-by: lancea

-

PR: https://git.openjdk.java.net/jdk/pull/7212


Re: Constant methods in Java

2022-02-02 Thread Aaron Scott-Boddendijk
There's this, and the fact that it's effectively unenforceable:

String t = ((Supplier<String>) () -> {
    final String s = myMethod();
    return s;
}).get();

So you keep the compiler happy, but the desire to force only final retention
of the reference (which, like Raffaello, I don't understand the use case for)
can be trivially circumvented.

--
Aaron Scott-Boddendijk


On Thu, Feb 3, 2022 at 9:09 AM Raffaello Giulietti <
raffaello.giulie...@gmail.com> wrote:

> Hello,
>
> I don't get why the author of myMethod() would/should be interested in
> forcing the caller of the method to declare the variable b to be final.
> Stated otherwise, what is the problem addressed by this suggestion?
> Have you some specific case in mind?
>
>
> Greetings
> Raffaello
>
>
> On 2022-02-02 20:27, Alberto Otero Rodríguez wrote:
> > I have a suggestion. I think it would be interesting to create constant
> > methods in Java.
> >
> > I mean methods declared like this:
> >
> > public const String myMethod() {
> >String a = "a";
> >a = a + "b";
> >return a;
> > }
> >
> > So that the result of the method is forced to be assigned to a final
> > variable.
> >
> > This would be ok:
> > final String b = myMethod();
> >
> > But this would throw a compilation error:
> > String c = myMethod();
> >
> > What do you think? It's just an idea.
>


Re: Constant methods in Java

2022-02-02 Thread Raffaello Giulietti

Hello,

I don't get why the author of myMethod() would/should be interested in 
forcing the caller of the method to declare the variable b to be final.

Stated otherwise, what is the problem addressed by this suggestion?
Have you some specific case in mind?


Greetings
Raffaello


On 2022-02-02 20:27, Alberto Otero Rodríguez wrote:

I have a suggestion. I think it would be interesting to create constant methods 
in Java.

I mean methods declared like this:

public const String myMethod() {
    String a = "a";
    a = a + "b";
    return a;
}

So that the result of the method is forced to be assigned to a final variable.

This would be ok:
final String b = myMethod();

But this would throw a compilation error:
String c = myMethod();

What do you think? It's just an idea.


Constant methods in Java

2022-02-02 Thread Alberto Otero Rodríguez
I have a suggestion. I think it would be interesting to create constant methods 
in Java.

I mean methods declared like this:

public const String myMethod() {
    String a = "a";
    a = a + "b";
    return a;
}

So that the result of the method is forced to be assigned to a final variable.

This would be ok:
final String b = myMethod();

But this would throw a compilation error:
String c = myMethod();

What do you think? It's just an idea.


Re: RFR: JDK-8277175 : Add a parallel multiply method to BigInteger [v9]

2022-02-02 Thread Joe Darcy
On Fri, 28 Jan 2022 19:03:39 GMT, kabutz  wrote:

>> kabutz has updated the pull request incrementally with one additional commit 
>> since the last revision:
>> 
>>   Benchmark for testing the effectiveness of BigInteger.parallelMultiply()
>
> I have added a benchmark for checking performance difference between 
> sequential and parallel multiply of very large Mersenne primes using 
> BigInteger. We want to measure real time, user time, system time and the 
> amount of memory allocated. To calculate this, we create our own thread 
> factory for the common ForkJoinPool and then use that to measure user time, 
> cpu time and bytes allocated.
> 
> We use reflection to discover all methods that match "*ultiply", and use them 
> to multiply two very large Mersenne primes together.
> 
> ### Results on a 1-6-2 machine running Ubuntu linux
> 
> Memory allocation increased from 83.9GB to 84GB, for both the sequential and 
> parallel versions. This is an increase of just 0.1%. On this machine, the 
> parallel version was 3.8x faster in latency (real time), but it used 2.7x 
> more CPU resources.
> 
> Testing multiplying Mersenne primes of 2^57885161-1 and 2^82589933-1
> 
>  openjdk version "18-internal" 2022-03-15
> 
> BigInteger.parallelMultiply()
> real  0m6.288s
> user  1m3.010s
> sys   0m0.027s
> mem   84.0GB
> BigInteger.multiply()
> real  0m23.682s
> user  0m23.530s
> sys   0m0.004s
> mem   84.0GB
> 
> 
>  openjdk version "1.8.0_302"
> 
> BigInteger.multiply()
> real  0m25.657s
> user  0m25.390s
> sys   0m0.001s
> mem   83.9GB
> 
> 
>  openjdk version "9.0.7.1"
> 
> BigInteger.multiply()
> real  0m24.907s
> user  0m24.700s
> sys   0m0.001s
> mem   83.9GB
> 
> 
>  openjdk version "10.0.2" 2018-07-17
> 
> BigInteger.multiply()
> real  0m24.632s
> user  0m24.380s
> sys   0m0.004s
> mem   83.9GB
> 
> 
>  openjdk version "11.0.12" 2021-07-20 LTS
> 
> BigInteger.multiply()
> real  0m22.114s
> user  0m21.930s
> sys   0m0.001s
> mem   83.9GB
> 
> 
>  openjdk version "12.0.2" 2019-07-16
> 
> BigInteger.multiply()
> real  0m23.015s
> user  0m22.830s
> sys   0m0.000s
> mem   83.9GB
> 
> 
>  openjdk version "13.0.9" 2021-10-19
> 
> BigInteger.multiply()
> real  0m23.548s
> user  0m23.350s
> sys   0m0.005s
> mem   83.9GB
> 
> 
>  openjdk version "14.0.2" 2020-07-14
> 
> BigInteger.multiply()
> real  0m22.918s
> user  0m22.530s
> sys   0m0.131s
> mem   83.9GB
> 
> 
> 
>  openjdk version "15.0.5" 2021-10-19
> 
> BigInteger.multiply()
> real  0m22.038s
> user  0m21.750s
> sys   0m0.003s
> mem   83.9GB
> 
> 
>  openjdk version "16.0.2" 2021-07-20
> 
> BigInteger.multiply()
> real  0m23.049s
> user  0m22.760s
> sys   0m0.006s
> mem   83.9GB
> 
> 
>  openjdk version "17" 2021-09-14
> 
> BigInteger.multiply()
> real  0m22.580s
> user  0m22.310s
> sys   0m0.001s
> mem   83.9GB

> @kabutz thanks for the additional testing, kind of what we intuitively 
> expected.
> 
> Can you please update the specification in response to Joe's 
> [comment](https://bugs.openjdk.java.net/browse/JDK-8278886?focusedCommentId=14470153&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14470153)?
> 
> Generally for parallel constructs we try to say as little as possible with 
> regards to latency, CPU time, and memory. The first two are sort of obvious, 
> the latter less so for the developer.
> 
> From your results I think we can say a little more. Here is a suggested 
> update addressing Joe's comments:
> 
> ```java
> /**
>  * Returns a BigInteger whose value is {@code (this * val)}.
>  * When both {@code this} and {@code val} are large, typically
>  * in the thousands of bits, parallel multiply might be used. 
>  * This method returns the exact same mathematical result as {@link 
> #multiply}. 
>  *
>  * @implNote This implementation may offer better algorithmic
>  * performance when {@code val == this}.
>  * 
>  * @implNote Compared to {@link #multiply} this implementation's parallel
>  * multiplication algorithm will use more CPU resources to compute the
>  * result faster, with a relatively small increase in memory consumption.
>  *
>  * @param  val value to be multiplied by this BigInteger.
>  * @return {@code this * val}
>  * @see #multiply
>  */
> ```

Yes, my intention is that the parallelMultiply spec should give some guidance 
to the user on when to use it, and a warning about the consequences of doing 
so (same answer, should take less time, but more compute and possibly a bit 
more memory).
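The measurement technique mentioned in the quoted benchmark (a custom thread factory for the common ForkJoinPool so that per-thread user/CPU time and allocated bytes can be read) rests on the thread MXBean. A minimal sketch follows; the cast to com.sun.management.ThreadMXBean is a HotSpot-specific assumption, and the class name is illustrative:

```java
import java.lang.management.ManagementFactory;

// Sketch: per-thread CPU time and allocated bytes, the raw inputs behind the
// user/cpu/mem numbers in the quoted results. getThreadAllocatedBytes is a
// com.sun.management extension, not part of java.lang.management.
class ThreadCostSketch {
    private static final com.sun.management.ThreadMXBean TMB =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();

    static long cpuTimeNanos(long threadId) {
        return TMB.getThreadCpuTime(threadId);
    }

    static long allocatedBytes(long threadId) {
        return TMB.getThreadAllocatedBytes(threadId);
    }
}
```

A benchmark's thread factory can record each worker's id at creation and sum these counters over all workers when the run finishes.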

-

PR: https://git.openjdk.java.net/jdk/pull/6409


Re: RFR: JDK-8277175 : Add a parallel multiply method to BigInteger [v9]

2022-02-02 Thread Paul Sandoz
On Fri, 28 Jan 2022 19:03:39 GMT, kabutz  wrote:

>> kabutz has updated the pull request incrementally with one additional commit 
>> since the last revision:
>> 
>>   Benchmark for testing the effectiveness of BigInteger.parallelMultiply()
>
> I have added a benchmark for checking performance difference between 
> sequential and parallel multiply of very large Mersenne primes using 
> BigInteger. We want to measure real time, user time, system time and the 
> amount of memory allocated. To calculate this, we create our own thread 
> factory for the common ForkJoinPool and then use that to measure user time, 
> cpu time and bytes allocated.
> 
> We use reflection to discover all methods that match "*ultiply", and use them 
> to multiply two very large Mersenne primes together.
> 
> ### Results on a 1-6-2 machine running Ubuntu linux
> 
> Memory allocation increased from 83.9GB to 84GB, for both the sequential and 
> parallel versions. This is an increase of just 0.1%. On this machine, the 
> parallel version was 3.8x faster in latency (real time), but it used 2.7x 
> more CPU resources.
> 
> Testing multiplying Mersenne primes of 2^57885161-1 and 2^82589933-1
> 
>  openjdk version "18-internal" 2022-03-15
> 
> BigInteger.parallelMultiply()
> real  0m6.288s
> user  1m3.010s
> sys   0m0.027s
> mem   84.0GB
> BigInteger.multiply()
> real  0m23.682s
> user  0m23.530s
> sys   0m0.004s
> mem   84.0GB
> 
> 
>  openjdk version "1.8.0_302"
> 
> BigInteger.multiply()
> real  0m25.657s
> user  0m25.390s
> sys   0m0.001s
> mem   83.9GB
> 
> 
>  openjdk version "9.0.7.1"
> 
> BigInteger.multiply()
> real  0m24.907s
> user  0m24.700s
> sys   0m0.001s
> mem   83.9GB
> 
> 
>  openjdk version "10.0.2" 2018-07-17
> 
> BigInteger.multiply()
> real  0m24.632s
> user  0m24.380s
> sys   0m0.004s
> mem   83.9GB
> 
> 
>  openjdk version "11.0.12" 2021-07-20 LTS
> 
> BigInteger.multiply()
> real  0m22.114s
> user  0m21.930s
> sys   0m0.001s
> mem   83.9GB
> 
> 
>  openjdk version "12.0.2" 2019-07-16
> 
> BigInteger.multiply()
> real  0m23.015s
> user  0m22.830s
> sys   0m0.000s
> mem   83.9GB
> 
> 
>  openjdk version "13.0.9" 2021-10-19
> 
> BigInteger.multiply()
> real  0m23.548s
> user  0m23.350s
> sys   0m0.005s
> mem   83.9GB
> 
> 
>  openjdk version "14.0.2" 2020-07-14
> 
> BigInteger.multiply()
> real  0m22.918s
> user  0m22.530s
> sys   0m0.131s
> mem   83.9GB
> 
> 
> 
>  openjdk version "15.0.5" 2021-10-19
> 
> BigInteger.multiply()
> real  0m22.038s
> user  0m21.750s
> sys   0m0.003s
> mem   83.9GB
> 
> 
>  openjdk version "16.0.2" 2021-07-20
> 
> BigInteger.multiply()
> real  0m23.049s
> user  0m22.760s
> sys   0m0.006s
> mem   83.9GB
> 
> 
>  openjdk version "17" 2021-09-14
> 
> BigInteger.multiply()
> real  0m22.580s
> user  0m22.310s
> sys   0m0.001s
> mem   83.9GB

@kabutz thanks for the additional testing, kind of what we intuitively expected.

Can you please update the specification in response to Joe's 
[comment](https://bugs.openjdk.java.net/browse/JDK-8278886?focusedCommentId=14470153&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14470153)?

Generally for parallel constructs we try to say as little as possible with 
regards to latency, CPU time, and memory. The first two are sort of obvious, 
the latter less so for the developer.

From your results I think we can say a little more. Here is a suggested update 
addressing Joe's comments:


/**
 * Returns a BigInteger whose value is {@code (this * val)}.
 * When both {@code this} and {@code val} are large, typically
 * in the thousands of bits, parallel multiply might be used. 
 * This method returns the exact same mathematical result as {@link #multiply}. 
 *
 * @implNote This implementation may offer better algorithmic
 * performance when {@code val == this}.
 * 
 * @implNote Compared to {@link #multiply} this implementation's parallel
 * multiplication algorithm will use more CPU resources to compute the result
 * faster, with a relatively small increase in memory consumption.
 *
 * @param  val value to be multiplied by this BigInteger.
 * @return {@code this * val}
 * @see #multiply
 */

-

PR: https://git.openjdk.java.net/jdk/pull/6409


Re: RFR: 8279917: Refactor subclassAudits in Thread to use ClassValue [v2]

2022-02-02 Thread Roman Kennke
On Thu, 13 Jan 2022 12:19:03 GMT, Roman Kennke  wrote:

>> Thread.java would benefit from a refactoring similar to JDK-8278065 to use 
>> ClassValue instead of the somewhat problematic WeakClassKey mechanism.
>> 
>> Testing:
>>  - [x] tier1
>>  - [x] tier2
>>  - [x] tier3
>
> Roman Kennke has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Remove obsolete comment

Thank you all!

-

PR: https://git.openjdk.java.net/jdk/pull/7054


Integrated: 8279917: Refactor subclassAudits in Thread to use ClassValue

2022-02-02 Thread Roman Kennke
On Wed, 12 Jan 2022 19:39:29 GMT, Roman Kennke  wrote:

> Thread.java would benefit from a refactoring similar to JDK-8278065 to use 
> ClassValue instead of the somewhat problematic WeakClassKey mechanism.
> 
> Testing:
>  - [x] tier1
>  - [x] tier2
>  - [x] tier3

This pull request has now been integrated.

Changeset: ce71e8b2
Author:Roman Kennke 
URL:   
https://git.openjdk.java.net/jdk/commit/ce71e8b281176d39cc879ae4ecf95f3d643ebd29
Stats: 86 lines in 1 file changed: 1 ins; 78 del; 7 mod

8279917: Refactor subclassAudits in Thread to use ClassValue

Reviewed-by: alanb, rriggs

-

PR: https://git.openjdk.java.net/jdk/pull/7054


Re: [crac] RFR: Ensure empty Reference Handler and Cleaners queues

2022-02-02 Thread Alan Bateman

On 01/02/2022 09:11, Anton Kozlov wrote:
> Cross-posting RFR from the CRaC Project. The change touches the Reference
> class, so I would be glad to receive any feedback from core-libs-dev.
>
> In the CRaC project, Java code participates in the preparation of the
> platform state that can be safely stored to the image. The image can be
> attempted at any time, so the image may capture unprocessed References.
> Recently I found cases where objects became unreachable during preparation
> for the checkpoint, and their associated clean-up actions close external
> resources (which we don't allow to be open when the image is stored). So it
> has become necessary to ensure as many References as possible are processed
> before the image is created. As a nice additional feature, restored Java
> instances won't start with the same Reference processing.
>
> With the change, the image is not created until the VM's queue of pending
> j.l.References is drained, and then, as an example, each j.l.ref.Cleaner
> queue is drained; only then is the VM called to prepare the image. More
> Reference handling threads will be changed like Cleaner's ones. I'm looking
> for possible problems or general comments about this approach.


At a high level it should be okay to provide a JDK-internal way to await 
quiescence. You've added it as a public API, which might be okay for the 
current exploration, but I don't think it would be exposed in its current 
form. Once the method returns, there is no guarantee that the number of 
waiters hasn't changed, but I think you know that.


-Alan.
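The "drain before checkpoint" step described above can be illustrated generically. This is only the shape, not the actual CRaC code, which works on the VM's pending-reference list and on java.lang.ref.Cleaner's internal queues:

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Generic illustration: poll a ReferenceQueue until it is empty, running the
// clean-up tied to each reference, so nothing pending is captured in the image.
class DrainSketch {
    static int drain(ReferenceQueue<Object> queue) {
        int drained = 0;
        Reference<? extends Object> ref;
        while ((ref = queue.poll()) != null) {
            // a real implementation would run the clean-up action tied to ref
            drained++;
        }
        return drained;
    }
}
```

The hard part Alan points at is not the polling itself but the quiescence guarantee: between the drain completing and the image being created, new references may become enqueueable, so the real mechanism has to coordinate with the VM rather than just loop like this.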