Re: RFR: 8344223: Remove calls to SecurityManager and doPrivileged in java.net.URLClassLoader after JEP 486 integration [v4]

2024-11-20 Thread Yudi Zheng
On Wed, 20 Nov 2024 06:52:11 GMT, Jaikiran Pai  wrote:

>> Can I please get a review of this change which proposes to remove 
>> SecurityManager related API usages from `URLClassLoader` and its related 
>> `URLClassPath`? This addresses https://bugs.openjdk.org/browse/JDK-8344223.
>> 
>> The `URLClassLoader.getPermissions()` method will need additional changes 
>> but that will be done as part of https://bugs.openjdk.org/browse/JDK-8343150.
>> 
>> I'll be opening a CSR for this change because we are removing support for 
>> the `jdk.net.URLClassPath.disableRestrictedPermissions` system property.
>> 
>> No new tests have been added and existing tier1, tier2 and tier3 tests 
>> continue to pass.
>
> Jaikiran Pai has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains four additional 
> commits since the last revision:
> 
>  - merge latest from master branch
>  - Alan's review - fix code comment
>  - merge latest from master branch
>  - 8344223: Remove calls to SecurityManager and doPrivileged in 
> java.net.URLClassLoader after JEP 486 integration

Marked as reviewed by yzheng (Committer).

-

PR Review: https://git.openjdk.org/jdk/pull/22233#pullrequestreview-2448510144


Re: RFR: 8343982: Remove usage of security manager from ClassLoader and related classes [v2]

2024-11-14 Thread Yudi Zheng
On Thu, 14 Nov 2024 11:25:08 GMT, Alan Bateman  wrote:

>> Remove the code for the now defunct SecurityManager execution mode from 
>> ClassLoader and the built-in class loader implementations. The override of 
>> getPermissions is removed from the application class loader, and from the 
>> class loaders used to support module layers.
>> 
>> URLCassLoader (and its implementation URLClassPath) will be handled 
>> separately as there are spec changes required, and code in URLClassPath that 
>> requires special attention.
>
> Alan Bateman has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains three additional 
> commits since the last revision:
> 
>  - Change 2-arg findResource to use findResourceOnClassPath when finding 
> resources on class path
>  - Merge branch 'master' into JDK-8343982
>  - Initial commit

LGTM

-

Marked as reviewed by yzheng (Committer).

PR Review: https://git.openjdk.org/jdk/pull/22097#pullrequestreview-2436292670


Re: RFR: 8344011: Remove usage of security manager from Class and reflective APIs [v2]

2024-11-13 Thread Yudi Zheng
On Wed, 13 Nov 2024 19:05:21 GMT, Alan Bateman  wrote:

>> Remove code required for the now defunct SecurityManager execution mode from 
>> java.lang.Class, friends, and reflection APIs. Careful review is required so 
>> I've set Reviewer to 2. I've tried to keep the changes as easy to review as 
>> possible and not go over board with cleanup.
>> 
>> sun.reflect.misc.ReflectUtil are been hollowed out. A future pass will 
>> remove empty methods and qualified exports once the changes in "far away" 
>> code and modules is done.
>> 
>> In Lookup's class description, the removal of the sentence "avoid package 
>> access checks for classes accessible to the lookup class"  and the link to 
>> the removed "Security manager interactions" section is in 
>> discussion/non-normative text, just missed in the JEP 486 update that remove 
>> the linked section.
>> 
>> runtime/cds/appcds/StaticArchiveWithLambda.java is updated as creating the 
>> archive no longer skips a generated class.
>> 
>> Testing: tier1-5
>
> Alan Bateman has updated the pull request with a new target base due to a 
> merge or a rebase. The incremental webrev excludes the unrelated changes 
> brought in by the merge/rebase. The pull request contains three additional 
> commits since the last revision:
> 
>  - StaticArchiveWithLambda no longer skips dynamically generated class when 
> creating archive
>  - Merge branch 'master' into JDK-8344011
>  - Initial commit

LGTM. Graal changes are ready

-

Marked as reviewed by yzheng (Committer).

PR Review: https://git.openjdk.org/jdk/pull/22063#pullrequestreview-2435251121


Re: RFR: 8343981: Remove usage of security manager from Thread and related classes [v4]

2024-11-13 Thread Yudi Zheng
On Tue, 12 Nov 2024 18:47:46 GMT, Alan Bateman  wrote:

>> Removes the SecurityManager usage from Thread + friends.
>> 
>> In Thread, the getContextClassLoader method is no longer caller-sensitive 
>> method.
>> 
>> JavaLangAccess.newThreadWithAcc is removed and jdk.internal.access is no 
>> longer exported to java.naming. The usage of newThreadWithAcc is removed 
>> from com.sun.jndi.ldap.VersionHelper. There will be further work on the 
>> java.naming module to remove usage of SM, the change here is specific to the 
>> usage of ewThreadWithAcc.
>
> Alan Bateman has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 227 commits:
> 
>  - Merge branch 'master' into JDK-8343981
>  - Merge commit 'db85090553ab14a84c3ed0a2604dd56c5b6e6982' into JDK-8343981
>  - Initial commit
>  - Merge remote-tracking branch 'jdk-sandbox/jep486' into JDK-8338411
>  - Merge branch 'master' into jep486
>  - Move remaining JEP 486 failing tests into correct groups.
>  - Move JEP 486 failing tests into hotspot_runtime group.
>  - test/jdk/java/rmi/server/RMIClassLoader/spi/DefaultProperty.java failing
>  - Merge branch 'master' into jep486
>  - Merge branch 'master' into jep486
>  - ... and 217 more: https://git.openjdk.org/jdk/compare/63eb4853...636ebd19

Marked as reviewed by yzheng (Committer).

-

PR Review: https://git.openjdk.org/jdk/pull/22035#pullrequestreview-2433814049


Re: RFR: 8343981: Remove usage of security manager from Thread and related classes [v4]

2024-11-13 Thread Yudi Zheng
On Tue, 12 Nov 2024 18:47:46 GMT, Alan Bateman  wrote:

>> Removes the SecurityManager usage from Thread + friends.
>> 
>> In Thread, the getContextClassLoader method is no longer caller-sensitive 
>> method.
>> 
>> JavaLangAccess.newThreadWithAcc is removed and jdk.internal.access is no 
>> longer exported to java.naming. The usage of newThreadWithAcc is removed 
>> from com.sun.jndi.ldap.VersionHelper. There will be further work on the 
>> java.naming module to remove usage of SM, the change here is specific to the 
>> usage of ewThreadWithAcc.
>
> Alan Bateman has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 227 commits:
> 
>  - Merge branch 'master' into JDK-8343981
>  - Merge commit 'db85090553ab14a84c3ed0a2604dd56c5b6e6982' into JDK-8343981
>  - Initial commit
>  - Merge remote-tracking branch 'jdk-sandbox/jep486' into JDK-8338411
>  - Merge branch 'master' into jep486
>  - Move remaining JEP 486 failing tests into correct groups.
>  - Move JEP 486 failing tests into hotspot_runtime group.
>  - test/jdk/java/rmi/server/RMIClassLoader/spi/DefaultProperty.java failing
>  - Merge branch 'master' into jep486
>  - Merge branch 'master' into jep486
>  - ... and 217 more: https://git.openjdk.org/jdk/compare/63eb4853...636ebd19

Please hold until I have the Graal changes ready

-

PR Comment: https://git.openjdk.org/jdk/pull/22035#issuecomment-2474106454


Re: RFR: 8338383: Implement JEP 491: Synchronize Virtual Threads without Pinning

2024-11-06 Thread Yudi Zheng
On Thu, 17 Oct 2024 14:28:30 GMT, Patricio Chilano Mateo 
 wrote:

> This is the implementation of JEP 491: Synchronize Virtual Threads without 
> Pinning. See [JEP 491](https://bugs.openjdk.org/browse/JDK-8337395) for 
> further details.
> 
> In order to make the code review easier the changes have been split into the 
> following initial 4 commits:
> 
> - Changes to allow unmounting a virtual thread that is currently holding 
> monitors.
> - Changes to allow unmounting a virtual thread blocked on synchronized trying 
> to acquire the monitor.
> - Changes to allow unmounting a virtual thread blocked in `Object.wait()` and 
> its timed-wait variants.
> - Changes to tests, JFR pinned event, and other changes in the JDK libraries.
> 
> The changes fix pinning issues for all 4 ports that currently implement 
> continuations: x64, aarch64, riscv and ppc. Note: ppc changes were added 
> recently and stand in its own commit after the initial ones.
> 
> The changes fix pinning issues when using `LM_LIGHTWEIGHT`, i.e. the default 
> locking mode, (and `LM_MONITOR` which comes for free), but not when using 
> `LM_LEGACY` mode. Note that the `LockingMode` flag has already been 
> deprecated ([JDK-8334299](https://bugs.openjdk.org/browse/JDK-8334299)), with 
> the intention to remove `LM_LEGACY` code in future releases.
> 
> 
> ## Summary of changes
> 
> ### Unmount virtual thread while holding monitors
> 
> As stated in the JEP, currently when a virtual thread enters a synchronized 
> method or block, the JVM records the virtual thread's carrier platform thread 
> as holding the monitor, not the virtual thread itself. This prevents the 
> virtual thread from being unmounted from its carrier, as ownership 
> information would otherwise go wrong. In order to fix this limitation we will 
> do two things:
> 
> - We copy the oops stored in the LockStack of the carrier to the stackChunk 
> when freezing (and clear the LockStack). We copy the oops back to the 
> LockStack of the next carrier when thawing for the first time (and clear them 
> from the stackChunk). Note that we currently assume carriers don't hold 
> monitors while mounting virtual threads.
> 
> - For inflated monitors we now record the `java.lang.Thread.tid` of the owner 
> in the ObjectMonitor's `_owner` field instead of a JavaThread*. This allows 
> us to tie the owner of the monitor to a `java.lang.Thread` instance, rather 
> than to a JavaThread which is only created per platform thread. The tid is 
> already a 64 bit field so we can ignore issues of the counter wrapping around.
> 
>  General notes about this part:
> 
> - Since virtual threads don't need to worry about holding monitors anymo...

src/hotspot/share/jvmci/vmStructs_jvmci.cpp line 329:

> 327:   nonstatic_field(ObjArrayKlass,   _element_klass,   
>  Klass*)\
> 328:  
> \
> 329:   unchecked_nonstatic_field(ObjectMonitor, _owner,   
>  int64_t)   \

to make the type assert more precise:

diff --git a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp 
b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
index 20b9609cdbf..f2b8a69c03f 100644
--- a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
+++ b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
@@ -326,7 +326,7 @@

  \
   nonstatic_field(ObjArrayKlass,   _element_klass, 
   Klass*)\

  \
-  unchecked_nonstatic_field(ObjectMonitor, _owner, 
   int64_t)   \
+  volatile_nonstatic_field(ObjectMonitor,  _owner, 
   int64_t)   \
   volatile_nonstatic_field(ObjectMonitor,  _recursions,
   intptr_t)  \
   volatile_nonstatic_field(ObjectMonitor,  _cxq,   
   ObjectWaiter*) \
   volatile_nonstatic_field(ObjectMonitor,  _EntryList, 
   ObjectWaiter*) \
diff --git a/src/hotspot/share/runtime/vmStructs.cpp 
b/src/hotspot/share/runtime/vmStructs.cpp
index 86d7277f88b..0492f28e15b 100644
--- a/src/hotspot/share/runtime/vmStructs.cpp
+++ b/src/hotspot/share/runtime/vmStructs.cpp
@@ -786,8 +786,8 @@

  \
   volatile_nonstatic_field

Re: RFR: 8338383: Implement JEP 491: Synchronize Virtual Threads without Pinning [v12]

2024-10-28 Thread Yudi Zheng
On Fri, 25 Oct 2024 21:33:24 GMT, Patricio Chilano Mateo 
 wrote:

>> This is the implementation of JEP 491: Synchronize Virtual Threads without 
>> Pinning. See [JEP 491](https://bugs.openjdk.org/browse/JDK-8337395) for 
>> further details.
>> 
>> In order to make the code review easier the changes have been split into the 
>> following initial 4 commits:
>> 
>> - Changes to allow unmounting a virtual thread that is currently holding 
>> monitors.
>> - Changes to allow unmounting a virtual thread blocked on synchronized 
>> trying to acquire the monitor.
>> - Changes to allow unmounting a virtual thread blocked in `Object.wait()` 
>> and its timed-wait variants.
>> - Changes to tests, JFR pinned event, and other changes in the JDK libraries.
>> 
>> The changes fix pinning issues for all 4 ports that currently implement 
>> continuations: x64, aarch64, riscv and ppc. Note: ppc changes were added 
>> recently and stand in its own commit after the initial ones.
>> 
>> The changes fix pinning issues when using `LM_LIGHTWEIGHT`, i.e. the default 
>> locking mode, (and `LM_MONITOR` which comes for free), but not when using 
>> `LM_LEGACY` mode. Note that the `LockingMode` flag has already been 
>> deprecated ([JDK-8334299](https://bugs.openjdk.org/browse/JDK-8334299)), 
>> with the intention to remove `LM_LEGACY` code in future releases.
>> 
>> 
>> ## Summary of changes
>> 
>> ### Unmount virtual thread while holding monitors
>> 
>> As stated in the JEP, currently when a virtual thread enters a synchronized 
>> method or block, the JVM records the virtual thread's carrier platform 
>> thread as holding the monitor, not the virtual thread itself. This prevents 
>> the virtual thread from being unmounted from its carrier, as ownership 
>> information would otherwise go wrong. In order to fix this limitation we 
>> will do two things:
>> 
>> - We copy the oops stored in the LockStack of the carrier to the stackChunk 
>> when freezing (and clear the LockStack). We copy the oops back to the 
>> LockStack of the next carrier when thawing for the first time (and clear 
>> them from the stackChunk). Note that we currently assume carriers don't hold 
>> monitors while mounting virtual threads.
>> 
>> - For inflated monitors we now record the `java.lang.Thread.tid` of the 
>> owner in the ObjectMonitor's `_owner` field instead of a JavaThread*. This 
>> allows us to tie the owner of the monitor to a `java.lang.Thread` instance, 
>> rather than to a JavaThread which is only created per platform thread. The 
>> tid is already a 64 bit field so we can ignore issues of the counter 
>> wrapping around.
>> 
>>  General notes about this part:
>> 
>> - Since virtual th...
>
> Patricio Chilano Mateo has updated the pull request incrementally with two 
> additional commits since the last revision:
> 
>  - Restore use of atPointA in test StopThreadTest.java
>  - remove interruptible check from conditional in Object::wait

src/hotspot/share/jvmci/vmStructs_jvmci.cpp line 329:

> 327:   nonstatic_field(ObjArrayKlass,   _element_klass,   
>  Klass*)\
> 328:  
> \
> 329:   unchecked_nonstatic_field(ObjectMonitor, _owner,   
>  int64_t)   \

to make the type assert more precise:

diff --git a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp 
b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
index 20b9609cdbf..f2b8a69c03f 100644
--- a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
+++ b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
@@ -326,7 +326,7 @@

  \
   nonstatic_field(ObjArrayKlass,   _element_klass, 
   Klass*)\

  \
-  unchecked_nonstatic_field(ObjectMonitor, _owner, 
   int64_t)   \
+  volatile_nonstatic_field(ObjectMonitor,  _owner, 
   int64_t)   \
   volatile_nonstatic_field(ObjectMonitor,  _recursions,
   intptr_t)  \
   volatile_nonstatic_field(ObjectMonitor,  _cxq,   
   ObjectWaiter*) \
   volatile_nonstatic_field(ObjectMonitor,  _EntryList, 
   ObjectWaiter*) \
diff --git a/src/hotspot/share/runtime/vmStructs.cpp 
b/src/hotspot/share/runtime/vmStructs.cpp
index 86d7277f88b..0492f28e15b 100644
--- a/src/hotspot/share

Re: RFR: 8338694: x86_64 intrinsic for tanh using libm [v13]

2024-10-16 Thread Yudi Zheng
On Mon, 23 Sep 2024 19:24:51 GMT, Srinivas Vamsi Parasa  
wrote:

>> The goal of this PR is to implement an x86_64 intrinsic for 
>> java.lang.Math.tanh() using libm
>> 
>> Benchmark (ops/ms) | Stock JDK | Tanh intrinsic | Speedup
>> -- | -- | -- | --
>> MathBench.tanhDouble | 70900 | 95618 | 1.35x
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   change ifdef from x86 to AMD64

src/hotspot/cpu/x86/stubGenerator_x86_64_tanh.cpp line 74:

> 72: // Special cases:
> 73: //  tanh(NaN) = quiet NaN, and raise invalid exception
> 74: //  tanh(INF) = that INF

This should be

tanh(POSITIVE_INFINITY) = +1.0
tanh(NEGATIVE_INFINITY) = -1.0

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20657#discussion_r1803316120


Re: RFR: 8329597: C2: Intrinsify Reference.clear [v9]

2024-10-15 Thread Yudi Zheng
On Wed, 9 Oct 2024 08:44:34 GMT, Aleksey Shipilev  wrote:

>> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native 
>> method for `Reference.clear`. The original patch skipped intrinsification of 
>> this method, because we thought `Reference.clear` is not on a performance 
>> sensitive path. However, it shows up prominently on simple benchmarks that 
>> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with 
>> `RRWL` benchmarks.
>> 
>> We need to know the actual oop strongness/weakness before we call into C2 
>> Access API, this work models this after existing code for `refersTo0` 
>> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. 
>> 
>> Additional testing:
>>  - [x] Linux x86_64 server fastdebug, `all`
>>  - [x] Linux AArch64 server fastdebug, `all`
>
> Aleksey Shipilev has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 18 commits:
> 
>  - Simplify: just do keep alive checks
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - More precise bit-unmasks
>  - Reconcile with late barrier expansion in G1
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - Review comments
>  - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear
>  - Also dispatch to slow-path on other arches
>  - Fix other arches
>  - Tighten up comments in Reference javadoc
>  - ... and 8 more: https://git.openjdk.org/jdk/compare/580eb62d...9f7ad7ab

src/hotspot/share/gc/z/zBarrierSetRuntime.hpp line 43:

> 41:   static void store_barrier_on_oop_field_with_healing(oop* p);
> 42:   static void store_barrier_on_oop_field_without_healing(oop* p);
> 43:   static void 
> no_keepalive_store_barrier_on_oop_field_without_healing(oop* p);

Could you please export this to JVMCI? I.e.,

diff --git a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp 
b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
index 5452cca96b8..46aeb996c56 100644
--- a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
+++ b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
@@ -847,6 +847,7 @@
   ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
ZBarrierSetRuntime::load_barrier_on_oop_field_preloaded_store_good))   \
   ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
ZBarrierSetRuntime::no_keepalive_load_barrier_on_weak_oop_field_preloaded))\
   ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
ZBarrierSetRuntime::no_keepalive_load_barrier_on_phantom_oop_field_preloaded)) \
+  ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
ZBarrierSetRuntime::no_keepalive_store_barrier_on_oop_field_without_healing)   \
   ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
ZBarrierSetRuntime::store_barrier_on_native_oop_field_without_healing))\
   ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
ZBarrierSetRuntime::store_barrier_on_oop_field_with_healing))  \
   ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, 
ZBarrierSetRuntime::store_barrier_on_oop_field_without_healing))   \

Thanks!

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20139#discussion_r1800848130


Re: RFR: 8338694: x86_64 intrinsic for tanh using libm

2024-08-28 Thread Yudi Zheng
On Wed, 21 Aug 2024 00:25:03 GMT, Srinivas Vamsi Parasa  
wrote:

> The goal of this PR is to implement an x86_64 intrinsic for 
> java.lang.Math.tanh() using libm
> 
> Benchmark (ops/ms) | Stock JDK | Tanh intrinsic | Speedup
> -- | -- | -- | --
> MathBench.tanhDouble | 70900 | 95618 | 1.35x

src/hotspot/share/jvmci/jvmciCompilerToVM.hpp line 114:

> 112: static address dcos;
> 113: static address dtan;
> 114: static address dtanh;

Could you please add the following initializing code?

diff --git a/src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp 
b/src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp
index 9752d7edf99..1db9be70db0 100644
--- a/src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp
+++ b/src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp
@@ -259,6 +259,17 @@ void CompilerToVM::Data::initialize(JVMCI_TRAPS) {
   SET_TRIGFUNC(dpow);
 
 #undef SET_TRIGFUNC
+
+#define SET_TRIGFUNC_OR_NULL(name)  \
+  if (StubRoutines::name() != nullptr) {\
+name = StubRoutines::name();\
+  } else {  \
+name = nullptr; \
+  }
+
+  SET_TRIGFUNC_OR_NULL(dtanh);
+
+#undef SET_TRIGFUNC_OR_NULL
 }
 
 static jboolean is_c1_supported(vmIntrinsics::ID id){
diff --git a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp 
b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
index fea308503cf..189c1465589 100644
--- a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
+++ b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
@@ -126,6 +126,7 @@
   static_field(CompilerToVM::Data, dsin,   
address)  \
   static_field(CompilerToVM::Data, dcos,   
address)  \
   static_field(CompilerToVM::Data, dtan,   
address)  \
+  static_field(CompilerToVM::Data, dtanh,  
address)  \
   static_field(CompilerToVM::Data, dexp,   
address)  \
   static_field(CompilerToVM::Data, dlog,   
address)  \
   static_field(CompilerToVM::Data, dlog10, 
address)  \

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20657#discussion_r1734655447


Re: RFR: 8338745: Intrinsify Continuation.pin() and Continuation.unpin() [v2]

2024-08-22 Thread Yudi Zheng
On Thu, 22 Aug 2024 11:13:17 GMT, Markus Grönlund  wrote:

>> Greetings,
>> 
>> Please help review this change set that implements C2 intrinsics for 
>> jdk.internal.vm.Continuation.pin() and jdk.internal.vm.Continuation.unpin().
>> 
>> This work is a consequence of 
>> [JDK-8338417](https://bugs.openjdk.org/browse/JDK-8338417), which required 
>> us to introduce explicit pin constructs for Virtual threads in a relatively 
>> performance-sensitive path.
>> 
>> Testing: jdk_jfr, loom testing
>> 
>> Comment: I changed the type of the ContinuationEntry::_pin_count field from 
>> uint to uin32_t to make the size explicit and to access it uniformly from 
>> the intrinsic code using T_INT.
>> 
>> Thanks
>> Markus
>
> Markus Grönlund has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Deoptimization::Action_none for no deopt

src/hotspot/share/opto/library_call.cpp line 3733:

> 3731:   // TLS
> 3732:   Node* tls_ptr = _gvn.transform(new ThreadLocalNode());
> 3733:   Node* last_continuation_offset = basic_plus_adr(top(), tls_ptr, 
> in_bytes(JavaThread::cont_entry_offset()));

Could you please export `JavaThread::_cont_entry` and 
`ContinuationEntry::_pin_count` to JVMCI? Thanks!

diff --git a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp 
b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
index 688691fb976..a25ecd2bab5 100644
--- a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
+++ b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
@@ -35,6 +35,7 @@
 #include "oops/methodCounters.hpp"
 #include "oops/objArrayKlass.hpp"
 #include "prims/jvmtiThreadState.hpp"
+#include "runtime/continuationEntry.hpp"
 #include "runtime/deoptimization.hpp"
 #include "runtime/flags/jvmFlag.hpp"
 #include "runtime/osThread.hpp"
@@ -244,10 +245,13 @@
   nonstatic_field(JavaThread,  _held_monitor_count,
   intx)  \
   nonstatic_field(JavaThread,  _lock_stack,
   LockStack) \
   nonstatic_field(JavaThread,  _om_cache,  
   OMCache)   \
+  nonstatic_field(JavaThread,  _cont_entry,
   ContinuationEntry*)\
   JVMTI_ONLY(nonstatic_field(JavaThread,   _is_in_VTMS_transition, 
   bool)) \
   JVMTI_ONLY(nonstatic_field(JavaThread,   _is_in_tmp_VTMS_transition, 
   bool)) \
   JVMTI_ONLY(nonstatic_field(JavaThread,   _is_disable_suspend,
   bool)) \

  \
+  nonstatic_field(ContinuationEntry,   _pin_count, 
   uint32_t)  \
+   
  \
   nonstatic_field(LockStack,   _top,   
   uint32_t)  \

  \
   JVMTI_ONLY(static_field(JvmtiVTMSTransitionDisabler, 
_VTMS_notify_jvmti_events, bool)) \
diff --git a/src/hotspot/share/runtime/continuationEntry.hpp 
b/src/hotspot/share/runtime/continuationEntry.hpp
index 459321f444c..ac76cd6f088 100644
--- a/src/hotspot/share/runtime/continuationEntry.hpp
+++ b/src/hotspot/share/runtime/continuationEntry.hpp
@@ -39,6 +39,7 @@ class RegisterMap;
 
 // Metadata stored in the continuation entry frame
 class ContinuationEntry {
+  friend class JVMCIVMStructs;
   ContinuationEntryPD _pd;
 #ifdef ASSERT
 private:

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20664#discussion_r1726876549


Re: RFR: 8315884: New Object to ObjectMonitor mapping [v20]

2024-08-15 Thread Yudi Zheng
On Thu, 15 Aug 2024 06:12:22 GMT, Axel Boldt-Christmas  
wrote:

>> When inflating a monitor the `ObjectMonitor*` is written directly over the 
>> `markWord` and any overwritten data is displaced into a displaced 
>> `markWord`. This is problematic for concurrent GCs which needs extra care or 
>> looser semantics to use this displaced data. In Lilliput this data also 
>> contains the klass forcing this to be something that the GC has to take into 
>> account everywhere.
>> 
>> This patch introduces an alternative solution where locking only uses the 
>> lock bits of the `markWord` and inflation does not override and displace the 
>> `markWord`. This is done by keeping associations between objects and 
>> `ObjectMonitor*` in an external hash table. Different caching techniques are 
>> used to speedup lookups from compiled code.
>> 
>> A diagnostic VM option is introduced called `UseObjectMonitorTable`. It is 
>> only supported in combination with the LM_LIGHTWEIGHT locking mode (the 
>> default). 
>> 
>> This patch has been evaluated to be performance neutral when 
>> `UseObjectMonitorTable` is turned off (the default). 
>> 
>> Below is a more detailed explanation of this change and how `LM_LIGHTWEIGHT` 
>> and `UseObjectMonitorTable` works.
>> 
>> # Cleanups
>> 
>> Cleaned up displaced header usage for:
>>   * BasicLock
>> * Contains some Zero changes
>> * Renames one exported JVMCI field
>>   * ObjectMonitor
>> * Updates comments and tests consistencies
>> 
>> # Refactoring
>> 
>> `ObjectMonitor::enter` has been refactored an a 
>> `ObjectMonitorContentionMark` witness object has been introduced to the 
>> signatures. Which signals that the contentions reference counter is being 
>> held. More details are given below in the section about deflation.
>> 
>> The initial purpose of this was to allow `UseObjectMonitorTable` to interact 
>> more seamlessly with the `ObjectMonitor::enter` code. 
>> 
>> _There is even more `ObjectMonitor` refactoring which can be done here to 
>> create a more understandable and enforceable API. There are a handful of 
>> invariants / assumptions which are not always explicitly asserted which 
>> could be trivially abstracted and verified by the type system by using 
>> similar witness objects._
>> 
>> # LightweightSynchronizer
>> 
>> Working on adapting and incorporating the following section as a comment in 
>> the source code
>> 
>> ## Fast Locking
>> 
>>   CAS on locking bits in markWord. 
>>   0b00 (Fast Locked) <--> 0b01 (Unlocked)
>> 
>>   When locking and 0b00 (Fast Locked) is observed, it may be beneficial to 
>> avoid inflating by spinning a bit.
>> 
>>   If 0b10 (Inflated) is observed or there is to...
>
> Axel Boldt-Christmas has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Remove newline

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 677:

> 675: 
> 676:   // Check for match.
> 677:   cmpptr(obj, Address(t));

`Address(t)` can be cached (in rax?) and reused in the subsequent `comptr`

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1718426158


Re: RFR: 8315884: New Object to ObjectMonitor mapping [v5]

2024-07-11 Thread Yudi Zheng
On Wed, 10 Jul 2024 20:10:07 GMT, Axel Boldt-Christmas  
wrote:

>> When inflating a monitor the `ObjectMonitor*` is written directly over the 
>> `markWord` and any overwritten data is displaced into a displaced 
>> `markWord`. This is problematic for concurrent GCs which needs extra care or 
>> looser semantics to use this displaced data. In Lilliput this data also 
>> contains the klass forcing this to be something that the GC has to take into 
>> account everywhere.
>> 
>> This patch introduces an alternative solution where locking only uses the 
>> lock bits of the `markWord` and inflation does not override and displace the 
>> `markWord`. This is done by keeping associations between objects and 
>> `ObjectMonitor*` in an external hash table. Different caching techniques are 
>> used to speedup lookups from compiled code.
>> 
>> A diagnostic VM option is introduced called `UseObjectMonitorTable`. It is 
>> only supported in combination with the LM_LIGHTWEIGHT locking mode (the 
>> default). 
>> 
>> This patch has been evaluated to be performance neutral when 
>> `UseObjectMonitorTable` is turned off (the default). 
>> 
>> Below is a more detailed explanation of this change and how `LM_LIGHTWEIGHT` 
>> and `UseObjectMonitorTable` works.
>> 
>> # Cleanups
>> 
>> Cleaned up displaced header usage for:
>>   * BasicLock
>> * Contains some Zero changes
>> * Renames one exported JVMCI field
>>   * ObjectMonitor
>> * Updates comments and tests consistencies
>> 
>> # Refactoring
>> 
>> `ObjectMonitor::enter` has been refactored an a 
>> `ObjectMonitorContentionMark` witness object has been introduced to the 
>> signatures. Which signals that the contentions reference counter is being 
>> held. More details are given below in the section about deflation.
>> 
>> The initial purpose of this was to allow `UseObjectMonitorTable` to interact 
>> more seamlessly with the `ObjectMonitor::enter` code. 
>> 
>> _There is even more `ObjectMonitor` refactoring which can be done here to 
>> create a more understandable and enforceable API. There are a handful of 
>> invariants / assumptions which are not always explicitly asserted which 
>> could be trivially abstracted and verified by the type system by using 
>> similar witness objects._
>> 
>> # LightweightSynchronizer
>> 
>> Working on adapting and incorporating the following section as a comment in 
>> the source code
>> 
>> ## Fast Locking
>> 
>>   CAS on locking bits in markWord. 
>>   0b00 (Fast Locked) <--> 0b01 (Unlocked)
>> 
>>   When locking and 0b00 (Fast Locked) is observed, it may be beneficial to 
>> avoid inflating by spinning a bit.
>> 
>>   If 0b10 (Inflated) is observed or there is to...
>
> Axel Boldt-Christmas has updated the pull request incrementally with four 
> additional commits since the last revision:
> 
>  - Add extra comments in LightweightSynchronizer::inflate_fast_locked_object
>  - Fix typos
>  - Remove unused variable
>  - Add missing inline qualifiers

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 843:

> 841:   movptr(monitor, Address(box, 
> BasicLock::object_monitor_cache_offset_in_bytes()));
> 842:   // null check with ZF == 0, no valid pointer below 
> alignof(ObjectMonitor*)
> 843:   cmpptr(monitor, alignof(ObjectMonitor*));

Is this only for keeping `ZF == 0` and can be replaced by `test; je` if we are 
not using `jne` to jump to the slow path? Or is there any performance concern? 
Btw, I think `ZF` is always rewritten before entering into the slow path 
https://github.com/openjdk/jdk/blob/b32e4a68bca588d908bd81a398eb3171a6876dc5/src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp#L98-L102

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1673704482


Re: RFR: 8335553: [Graal] Compiler thread calls into jdk.internal.vm.VMSupport.decodeAndThrowThrowable and crashes in OOM situation [v2]

2024-07-09 Thread Yudi Zheng
On Tue, 9 Jul 2024 14:42:42 GMT, Doug Simon  wrote:

>> test/jdk/jdk/internal/vm/TestTranslatedException.java line 167:
>> 
>>> 165: private static void assertThrowableEquals(Throwable originalIn, 
>>> Throwable decodedIn) {
>>> 166: Throwable original = originalIn;
>>> 167: Throwable decoded = decodedIn;
>> 
>> What is the purpose of this renaming?
>
> So that the printing down the bottom of this message shows the complete 
> throwable, not just the cause on which the comparison failed.

Thanks! I missed the reassign in the folded unchanged code.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20083#discussion_r1670683400


Re: RFR: 8335553: [Graal] Compiler thread calls into jdk.internal.vm.VMSupport.decodeAndThrowThrowable and crashes in OOM situation [v2]

2024-07-09 Thread Yudi Zheng
On Tue, 9 Jul 2024 13:46:46 GMT, Doug Simon  wrote:

>> This PR addresses intermittent failures in jtreg GC stress tests. The 
>> failures occur under these conditions:
>> 1. Using a libgraal build with assertions enabled as the top tier JIT 
>> compiler. Such a libgraal build will cause a VM exit if an assertion or 
>> GraalError occurs in a compiler thread (as this catches more errors in 
>> testing).
>> 2. A libgraal compiler thread makes a call into the VM (via `CompilerToVM`) 
>> to a routine that performs a HotSpot heap allocation that fails.
>> 3. The resulting OOME is wrapped in a GraalError, causing the VM to exit as 
>> described in 1.
>> 
>> An OOME thrown in these specific conditions should not exit the VM as it not 
>> related to an OOME in the app or test. Instead, the failure should be 
>> treated as a bailout and the libgraal compiler should continue.
>> 
>> To accomplish this, libgraal needs to be able to distinguish a GraalError 
>> caused by an OOME. This PR modifies the exception translation code to make 
>> this possible.
>
> Doug Simon has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   fixed TestTranslatedException

LGTM

-

Marked as reviewed by yzheng (Committer).

PR Review: https://git.openjdk.org/jdk/pull/20083#pullrequestreview-2166581323


Re: RFR: 8335553: [Graal] Compiler thread calls into jdk.internal.vm.VMSupport.decodeAndThrowThrowable and crashes in OOM situation [v2]

2024-07-09 Thread Yudi Zheng
On Tue, 9 Jul 2024 13:46:46 GMT, Doug Simon  wrote:

>> This PR addresses intermittent failures in jtreg GC stress tests. The 
>> failures occur under these conditions:
>> 1. Using a libgraal build with assertions enabled as the top tier JIT 
>> compiler. Such a libgraal build will cause a VM exit if an assertion or 
>> GraalError occurs in a compiler thread (as this catches more errors in 
>> testing).
>> 2. A libgraal compiler thread makes a call into the VM (via `CompilerToVM`) 
>> to a routine that performs a HotSpot heap allocation that fails.
>> 3. The resulting OOME is wrapped in a GraalError, causing the VM to exit as 
>> described in 1.
>> 
>> An OOME thrown in these specific conditions should not exit the VM as it not 
>> related to an OOME in the app or test. Instead, the failure should be 
>> treated as a bailout and the libgraal compiler should continue.
>> 
>> To accomplish this, libgraal needs to be able to distinguish a GraalError 
>> caused by an OOME. This PR modifies the exception translation code to make 
>> this possible.
>
> Doug Simon has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   fixed TestTranslatedException

src/hotspot/share/jvmci/jvmciCompilerToVM.cpp line 782:

> 780:   while (true) {
> 781: // Trigger an OutOfMemoryError
> 782: objArrayOop next = oopFactory::new_objectArray(0x7FFF, 
> CHECK_NULL);

Shall we check for pending exception and break here?

test/jdk/jdk/internal/vm/TestTranslatedException.java line 167:

> 165: private static void assertThrowableEquals(Throwable originalIn, 
> Throwable decodedIn) {
> 166: Throwable original = originalIn;
> 167: Throwable decoded = decodedIn;

What is the purpose of this renaming?

-

PR Review Comment: https://git.openjdk.org/jdk/pull/20083#discussion_r1670646934
PR Review Comment: https://git.openjdk.org/jdk/pull/20083#discussion_r1670607742


Re: RFR: 8315884: New Object to ObjectMonitor mapping [v2]

2024-07-08 Thread Yudi Zheng
On Mon, 8 Jul 2024 12:13:07 GMT, Axel Boldt-Christmas  
wrote:

>> When inflating a monitor the `ObjectMonitor*` is written directly over the 
>> `markWord` and any overwritten data is displaced into a displaced 
>> `markWord`. This is problematic for concurrent GCs which needs extra care or 
>> looser semantics to use this displaced data. In Lilliput this data also 
>> contains the klass forcing this to be something that the GC has to take into 
>> account everywhere.
>> 
>> This patch introduces an alternative solution where locking only uses the 
>> lock bits of the `markWord` and inflation does not override and displace the 
>> `markWord`. This is done by keeping associations between objects and 
>> `ObjectMonitor*` in an external hash table. Different caching techniques are 
>> used to speedup lookups from compiled code.
>> 
>> A diagnostic VM option is introduced called `UseObjectMonitorTable`. It is 
>> only supported in combination with the LM_LIGHTWEIGHT locking mode (the 
>> default). 
>> 
>> This patch has been evaluated to be performance neutral when 
>> `UseObjectMonitorTable` is turned off (the default). 
>> 
>> Below is a more detailed explanation of this change and how `LM_LIGHTWEIGHT` 
>> and `UseObjectMonitorTable` works.
>> 
>> # Cleanups
>> 
>> Cleaned up displaced header usage for:
>>   * BasicLock
>> * Contains some Zero changes
>> * Renames one exported JVMCI field
>>   * ObjectMonitor
>> * Updates comments and tests consistencies
>> 
>> # Refactoring
>> 
>> `ObjectMonitor::enter` has been refactored an a 
>> `ObjectMonitorContentionMark` witness object has been introduced to the 
>> signatures. Which signals that the contentions reference counter is being 
>> held. More details are given below in the section about deflation.
>> 
>> The initial purpose of this was to allow `UseObjectMonitorTable` to interact 
>> more seamlessly with the `ObjectMonitor::enter` code. 
>> 
>> _There is even more `ObjectMonitor` refactoring which can be done here to 
>> create a more understandable and enforceable API. There are a handful of 
>> invariants / assumptions which are not always explicitly asserted which 
>> could be trivially abstracted and verified by the type system by using 
>> similar witness objects._
>> 
>> # LightweightSynchronizer
>> 
>> Working on adapting and incorporating the following section as a comment in 
>> the source code
>> 
>> ## Fast Locking
>> 
>>   CAS on locking bits in markWord. 
>>   0b00 (Fast Locked) <--> 0b01 (Unlocked)
>> 
>>   When locking and 0b00 (Fast Locked) is observed, it may be beneficial to 
>> avoid inflating by spinning a bit.
>> 
>>   If 0b10 (Inflated) is observed or there is to...
>
> Axel Boldt-Christmas has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   More graceful JVMCI VM option interaction

Could you please revert 2814350 and export the following symbols to JVMCI?

diff --git a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp 
b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
index faf2cb24616..7be31aa0f5f 100644
--- a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
+++ b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp
@@ -241,6 +241,7 @@
   nonstatic_field(JavaThread,  
_stack_overflow_state._reserved_stack_activation, address)  
  \
   nonstatic_field(JavaThread,  _held_monitor_count,
   intx)  \
   nonstatic_field(JavaThread,  _lock_stack,
   LockStack) \
+  nonstatic_field(JavaThread,  _om_cache,  
   OMCache)   \
   JVMTI_ONLY(nonstatic_field(JavaThread,   _is_in_VTMS_transition, 
   bool)) \
   JVMTI_ONLY(nonstatic_field(JavaThread,   _is_in_tmp_VTMS_transition, 
   bool)) \
   JVMTI_ONLY(nonstatic_field(JavaThread,   _is_disable_suspend,
   bool)) \
@@ -531,6 +532,8 @@
   \
   declare_constant_with_value("CardTable::dirty_card", 
CardTable::dirty_card_val()) \
   declare_constant_with_value("LockStack::_end_offset", 
LockStack::end_offset()) \
+  declare_constant_with_value("OMCache::oop_to_oop_difference", 
OMCache::oop_to_oop_difference()) \
+  declare_constant_with_value("OMCache::oop_to_monitor_difference", 
OMCache::oop_to_monitor_difference()) \
   \
   declare_constant(CodeInstaller::VERIFIED_ENTRY) \
   declare_constant(CodeInstaller::UNVERIFIED_ENTRY)   \

-

PR Comment: https://git.openjdk.org/jdk/pull/20067#issuecomment-2214322632


Integrated: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic

2024-05-27 Thread Yudi Zheng
On Tue, 12 Mar 2024 10:44:54 GMT, Yudi Zheng  wrote:

> Moving array construction within BigInteger.implMultiplyToLen intrinsic 
> candidate to its caller simplifies the intrinsic implementation in JIT 
> compiler.

This pull request has now been integrated.

Changeset: ed81a478
Author:Yudi Zheng 
Committer: Martin Doerr 
URL:   
https://git.openjdk.org/jdk/commit/ed81a478e175631f1de69eb4b43f927629fefd74
Stats: 146 lines in 17 files changed: 11 ins; 82 del; 53 mod

8327964: Simplify BigInteger.implMultiplyToLen intrinsic

Reviewed-by: mdoerr, amitkumar, kvn, fyang

-

PR: https://git.openjdk.org/jdk/pull/18226


Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v7]

2024-05-27 Thread Yudi Zheng
On Fri, 24 May 2024 15:12:28 GMT, Yudi Zheng  wrote:

>> Moving array construction within BigInteger.implMultiplyToLen intrinsic 
>> candidate to its caller simplifies the intrinsic implementation in JIT 
>> compiler.
>
> Yudi Zheng has updated the pull request with a new target base due to a merge 
> or a rebase. The incremental webrev excludes the unrelated changes brought in 
> by the merge/rebase. The pull request contains eight additional commits since 
> the last revision:
> 
>  - Merge remote-tracking branch 'upstream/master' into JDK-8327964
>  - address comments.
>  - address comments.
>  - address comment.
>  - address comment.
>  - address comment.
>  - address comment.
>  - Simplify BigInteger.implMultiplyToLen intrinsic

Thanks for the reviews! Mach5 testing looks good except for a couple known 
timeouts unrelated to this PR. GHA test failure is due to 
[JDK-8332923](https://bugs.openjdk.org/browse/JDK-8332923).

-

PR Comment: https://git.openjdk.org/jdk/pull/18226#issuecomment-2133444527


Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v7]

2024-05-24 Thread Yudi Zheng
> Moving array construction within BigInteger.implMultiplyToLen intrinsic 
> candidate to its caller simplifies the intrinsic implementation in JIT 
> compiler.

Yudi Zheng has updated the pull request with a new target base due to a merge 
or a rebase. The incremental webrev excludes the unrelated changes brought in 
by the merge/rebase. The pull request contains eight additional commits since 
the last revision:

 - Merge remote-tracking branch 'upstream/master' into JDK-8327964
 - address comments.
 - address comments.
 - address comment.
 - address comment.
 - address comment.
 - address comment.
 - Simplify BigInteger.implMultiplyToLen intrinsic

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/18226/files
  - new: https://git.openjdk.org/jdk/pull/18226/files/7c6023f8..c719e0a9

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=05-06

  Stats: 560567 lines in 6784 files changed: 132593 ins; 81763 del; 346211 mod
  Patch: https://git.openjdk.org/jdk/pull/18226.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18226/head:pull/18226

PR: https://git.openjdk.org/jdk/pull/18226


Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v6]

2024-05-24 Thread Yudi Zheng
On Thu, 23 May 2024 10:12:17 GMT, Bhavana Kilambi  wrote:

>> Yudi Zheng has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   address comments.
>
> src/hotspot/share/opto/library_call.cpp line 5925:
> 
>> 5923:   // Set the original stack and the reexecute bit for the interpreter 
>> to reexecute
>> 5924:   // the bytecode that invokes BigInteger.multiplyToLen() if 
>> deoptimization happens
>> 5925:   // on the return from z array allocation in runtime.
> 
> Since we are not allocating z array during runtime anymore, do we still need 
> these comments?

Thanks for pointing it out! Removed.

> src/java.base/share/classes/java/math/BigInteger.java line 1836:
> 
>> 1834: 
>> 1835: if (z == null || z.length < (xlen + ylen))
>> 1836:  z = new int[xlen + ylen];
> 
> Style: only 4 spaces indentation

Done

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18226#discussion_r1613608426
PR Review Comment: https://git.openjdk.org/jdk/pull/18226#discussion_r1613608653


Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v6]

2024-05-22 Thread Yudi Zheng
On Wed, 22 May 2024 14:47:43 GMT, Yudi Zheng  wrote:

>> Moving array construction within BigInteger.implMultiplyToLen intrinsic 
>> candidate to its caller simplifies the intrinsic implementation in JIT 
>> compiler.
>
> Yudi Zheng has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   address comments.

@theRealAph @TheRealMDoerr @RealFYang @offamitkumar could you please help 
review the aarch64/ppc/riscv/s390 changes?

-

PR Comment: https://git.openjdk.org/jdk/pull/18226#issuecomment-2125073511


Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v5]

2024-05-22 Thread Yudi Zheng
On Wed, 17 Apr 2024 20:04:44 GMT, Dean Long  wrote:

>> Yudi Zheng has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   address comment.
>
> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 6662:
> 
>> 6660:   push(tmp5);
>> 6661: 
>> 6662:   push(xlen);
> 
> There may be an opportunity here (separate RFE?) to get rid of the 
> save/restore for these.  I don't think it's necessary if this is called as 
> part of a C2 stub.

In the Graal port we did get rid of these save/restore.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18226#discussion_r1610126799


Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v6]

2024-05-22 Thread Yudi Zheng
> Moving array construction within BigInteger.implMultiplyToLen intrinsic 
> candidate to its caller simplifies the intrinsic implementation in JIT 
> compiler.

Yudi Zheng has updated the pull request incrementally with one additional 
commit since the last revision:

  address comments.

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/18226/files
  - new: https://git.openjdk.org/jdk/pull/18226/files/72ba58ce..7c6023f8

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=04-05

  Stats: 24 lines in 2 files changed: 0 ins; 0 del; 24 mod
  Patch: https://git.openjdk.org/jdk/pull/18226.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18226/head:pull/18226

PR: https://git.openjdk.org/jdk/pull/18226


Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v3]

2024-05-22 Thread Yudi Zheng
On Mon, 20 May 2024 10:41:36 GMT, Bhavana Kilambi  wrote:

>> @dafedafe @dean-long please take a look and let me know if there are further 
>> issues, thanks!
>
> Hi @mur47x111, do you happen to have any performance results with this patch?

@Bhavana-Kilambi the performance result for x86 is at 
https://github.com/openjdk/jdk/pull/18226#issuecomment-2007922439 . It is 
expected to be negligible.

-

PR Comment: https://git.openjdk.org/jdk/pull/18226#issuecomment-2124984579


Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v5]

2024-05-22 Thread Yudi Zheng
On Wed, 17 Apr 2024 19:33:01 GMT, Dean Long  wrote:

>> Yudi Zheng has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   address comment.
>
> src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 4670:
> 
>> 4668: const Register tmp5  = r15;
>> 4669: const Register tmp6  = r16;
>> 4670: const Register tmp7  = r17;
> 
> Why not minimize changes and continue to use r5 for tmp0?  I see no need for 
> r17 or to reassign all the other tmp registers.

Was attempting to align the suffixes. Will revert.

> src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 4693:
> 
>> 4691: const Register xlen  = r1;
>> 4692: const Register z = r2;
>> 4693: const Register zlen  = r3;
> 
> LibraryCallKit::inline_squareToLen() is still computing zlen and passing it 
> as the 4th arg, even though the value is unused.

ppc x86 are not using `multiply_to_len` for `generate_squareToLen`. I think we 
still need to pass zlen for these platforms.

-

PR Review Comment: https://git.openjdk.org/jdk/pull/18226#discussion_r1610083476
PR Review Comment: https://git.openjdk.org/jdk/pull/18226#discussion_r1610088021


Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v3]

2024-04-17 Thread Yudi Zheng
On Tue, 26 Mar 2024 15:59:33 GMT, Damon Fenacci  wrote:

>> Yudi Zheng has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   address comment.
>
> `multiply_to_len` seems to be used by `generate_squareToLen` as well for 
> aarch64 and riscv but `zlen` is still passed in a register.
> 
> https://github.com/openjdk/jdk/blob/870a6127cf54264c691f7322d775b202705c3bfa/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp#L4710
> https://github.com/openjdk/jdk/blob/870a6127cf54264c691f7322d775b202705c3bfa/src/hotspot/cpu/riscv/stubGenerator_riscv.cpp#L2881
> 
> I think it might work anyway but it might be better to adapt them if only for 
> completeness.

@dafedafe @dean-long please take a look and let me know if there are further 
issues, thanks!

-

PR Comment: https://git.openjdk.org/jdk/pull/18226#issuecomment-2061162283


Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v5]

2024-04-17 Thread Yudi Zheng
> Moving array construction within BigInteger.implMultiplyToLen intrinsic 
> candidate to its caller simplifies the intrinsic implementation in JIT 
> compiler.

Yudi Zheng has updated the pull request incrementally with one additional 
commit since the last revision:

  address comment.

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/18226/files
  - new: https://git.openjdk.org/jdk/pull/18226/files/c5d521dc..72ba58ce

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=03-04

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/18226.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18226/head:pull/18226

PR: https://git.openjdk.org/jdk/pull/18226


Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v4]

2024-04-17 Thread Yudi Zheng
> Moving array construction within BigInteger.implMultiplyToLen intrinsic 
> candidate to its caller simplifies the intrinsic implementation in JIT 
> compiler.

Yudi Zheng has updated the pull request incrementally with one additional 
commit since the last revision:

  address comment.

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/18226/files
  - new: https://git.openjdk.org/jdk/pull/18226/files/870a6127..c5d521dc

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=02-03

  Stats: 53 lines in 10 files changed: 3 ins; 6 del; 44 mod
  Patch: https://git.openjdk.org/jdk/pull/18226.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18226/head:pull/18226

PR: https://git.openjdk.org/jdk/pull/18226


Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v3]

2024-03-19 Thread Yudi Zheng
> Moving array construction within BigInteger.implMultiplyToLen intrinsic 
> candidate to its caller simplifies the intrinsic implementation in JIT 
> compiler.

Yudi Zheng has updated the pull request incrementally with one additional 
commit since the last revision:

  address comment.

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/18226/files
  - new: https://git.openjdk.org/jdk/pull/18226/files/bfc323b7..870a6127

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=01-02

  Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/18226.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18226/head:pull/18226

PR: https://git.openjdk.org/jdk/pull/18226


Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v2]

2024-03-19 Thread Yudi Zheng
> Moving array construction within BigInteger.implMultiplyToLen intrinsic 
> candidate to its caller simplifies the intrinsic implementation in JIT 
> compiler.

Yudi Zheng has updated the pull request incrementally with one additional 
commit since the last revision:

  address comment.

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/18226/files
  - new: https://git.openjdk.org/jdk/pull/18226/files/37232a5f..bfc323b7

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=00-01

  Stats: 70 lines in 11 files changed: 8 ins; 28 del; 34 mod
  Patch: https://git.openjdk.org/jdk/pull/18226.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18226/head:pull/18226

PR: https://git.openjdk.org/jdk/pull/18226


Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v2]

2024-03-19 Thread Yudi Zheng
On Mon, 18 Mar 2024 16:55:28 GMT, Damon Fenacci  wrote:

> Quite a simplification! Have you checked if there are any performance 
> differences?

Ran 
https://github.com/oracle/graal/blob/master/compiler/src/org.graalvm.micro.benchmarks/src/micro/benchmarks/BigIntegerBenchmark.java
The results are

$ before change
Benchmark Mode  CntScore   Error  Units
BigIntegerBenchmark.bigIntMontgomeryMul  thrpt5  122.488 ± 0.130  ops/s
BigIntegerBenchmark.bigIntMontgomerySqr  thrpt5   76.023 ± 0.106  ops/s
BigIntegerBenchmark.bigIntMulthrpt5  330.130 ± 0.349  ops/s
BigIntegerBenchmark.bigIntMulAdd thrpt5  455.590 ± 0.663  ops/s

$ after change
Benchmark Mode  CntScore   Error  Units
BigIntegerBenchmark.bigIntMontgomeryMul  thrpt5  124.407 ± 0.045  ops/s
BigIntegerBenchmark.bigIntMontgomerySqr  thrpt5   76.036 ± 0.232  ops/s
BigIntegerBenchmark.bigIntMulthrpt5  329.836 ± 0.953  ops/s
BigIntegerBenchmark.bigIntMulAdd thrpt5  456.485 ± 0.766  ops/s

-

PR Comment: https://git.openjdk.org/jdk/pull/18226#issuecomment-2007922439


RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic

2024-03-12 Thread Yudi Zheng
Moving array construction within BigInteger.implMultiplyToLen intrinsic 
candidate to its caller simplifies the intrinsic implementation in JIT compiler.

-

Commit messages:
 - Simplify BigInteger.implMultiplyToLen intrinsic

Changes: https://git.openjdk.org/jdk/pull/18226/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8327964
  Stats: 53 lines in 2 files changed: 4 ins; 49 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/18226.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18226/head:pull/18226

PR: https://git.openjdk.org/jdk/pull/18226


Re: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v11]

2023-08-02 Thread Yudi Zheng
On Wed, 2 Aug 2023 12:33:43 GMT, Ferenc Rakoczi  wrote:

>> src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 3473:
>> 
>>> 3471: __ bcax(v24, __ T16B, v24, v8,  v31);
>>> 3472: 
>>> 3473: __ ld1r(v31, __ T2D, __ post(rscratch1, 8));
>> 
>> is it intentional to load 16 bytes and post-increment by 8?
>
> Actually, with the ld1r instruction the post increment should be the same as 
> the size of the memory accessed. So T2D requires 8 as it reads 8 bytes(and 
> duplicates it into both halves of the SIMD register).

Thanks for the clarification!

-

PR Review Comment: https://git.openjdk.org/jdk/pull/207#discussion_r1281961206


Re: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v11]

2023-08-02 Thread Yudi Zheng
On Wed, 21 Oct 2020 23:42:33 GMT, Fei Yang  wrote:

>> Contributed-by: ard.biesheu...@linaro.org, dong...@huawei.com
>> 
>> This added an intrinsic for SHA3 using aarch64 v8.2 SHA3 Crypto Extensions.
>> Reference implementation for core SHA-3 transform using ARMv8.2 Crypto 
>> Extensions:
>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha3-ce-core.S?h=v5.4.52
>> 
>> Trivial adaptation in SHA3. implCompress is needed for the purpose of adding 
>> the intrinsic.
>> For SHA3, we need to pass one extra parameter "digestLength" to the stub for 
>> the calculation of block size.
>> "digestLength" is also used in for the EOR loop before keccak to 
>> differentiate different SHA3 variants.
>> 
>> We added jtreg tests for SHA3 and used QEMU system emulator which supports 
>> SHA3 instructions to test the functionality.
>> Patch passed jtreg tier1-3 tests with QEMU system emulator.
>> Also verified with jtreg tier1-3 tests without SHA3 instructions on 
>> aarch64-linux-gnu and x86_64-linux-gnu, to make sure that there's no 
>> regression.
>> 
>> We used one existing JMH test for performance test: 
>> test/micro/org/openjdk/bench/java/security/MessageDigests.java
>> We measured the performance benefit with an aarch64 cycle-accurate simulator.
>> Patch delivers 20% - 40% performance improvement depending on specific SHA3 
>> digest length and size of the message.
>> 
>> For now, this feature will not be enabled automatically for aarch64. We can 
>> auto-enable this when it is fully tested on real hardware.  But for the 
>> above testing purposes, this is auto-enabled when the corresponding hardware 
>> feature is detected.
>
> Fei Yang has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Add if (isJDK16OrHigher()) check for SHA3 in CheckGraalIntrinsics.java

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 3473:

> 3471: __ bcax(v24, __ T16B, v24, v8,  v31);
> 3472: 
> 3473: __ ld1r(v31, __ T2D, __ post(rscratch1, 8));

is it intentional to load 16 bytes and post-increment by 8?

-

PR Review Comment: https://git.openjdk.org/jdk/pull/207#discussion_r1281749663