Re: RFR: 8344223: Remove calls to SecurityManager and doPrivileged in java.net.URLClassLoader after JEP 486 integration [v4]
On Wed, 20 Nov 2024 06:52:11 GMT, Jaikiran Pai wrote: >> Can I please get a review of this change which proposes to remove >> SecurityManager related API usages from `URLClassLoader` and its related >> `URLClassPath`? This addresses https://bugs.openjdk.org/browse/JDK-8344223. >> >> The `URLClassLoader.getPermissions()` method will need additional changes >> but that will be done as part of https://bugs.openjdk.org/browse/JDK-8343150. >> >> I'll be opening a CSR for this change because we are removing support for >> the `jdk.net.URLClassPath.disableRestrictedPermissions` system property. >> >> No new tests have been added and existing tier1, tier2 and tier3 tests >> continue to pass. > > Jaikiran Pai has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains four additional > commits since the last revision: > > - merge latest from master branch > - Alan's review - fix code comment > - merge latest from master branch > - 8344223: Remove calls to SecurityManager and doPrivileged in > java.net.URLClassLoader after JEP 486 integration Marked as reviewed by yzheng (Committer). - PR Review: https://git.openjdk.org/jdk/pull/22233#pullrequestreview-2448510144
Re: RFR: 8343982: Remove usage of security manager from ClassLoader and related classes [v2]
On Thu, 14 Nov 2024 11:25:08 GMT, Alan Bateman wrote: >> Remove the code for the now defunct SecurityManager execution mode from >> ClassLoader and the built-in class loader implementations. The override of >> getPermissions is removed from the application class loader, and from the >> class loaders used to support module layers. >> >> URLCassLoader (and its implementation URLClassPath) will be handled >> separately as there are spec changes required, and code in URLClassPath that >> requires special attention. > > Alan Bateman has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains three additional > commits since the last revision: > > - Change 2-arg findResource to use findResourceOnClassPath when finding > resources on class path > - Merge branch 'master' into JDK-8343982 > - Initial commit LGTM - Marked as reviewed by yzheng (Committer). PR Review: https://git.openjdk.org/jdk/pull/22097#pullrequestreview-2436292670
Re: RFR: 8344011: Remove usage of security manager from Class and reflective APIs [v2]
On Wed, 13 Nov 2024 19:05:21 GMT, Alan Bateman wrote: >> Remove code required for the now defunct SecurityManager execution mode from >> java.lang.Class, friends, and reflection APIs. Careful review is required so >> I've set Reviewer to 2. I've tried to keep the changes as easy to review as >> possible and not go over board with cleanup. >> >> sun.reflect.misc.ReflectUtil are been hollowed out. A future pass will >> remove empty methods and qualified exports once the changes in "far away" >> code and modules is done. >> >> In Lookup's class description, the removal of the sentence "avoid package >> access checks for classes accessible to the lookup class" and the link to >> the removed "Security manager interactions" section is in >> discussion/non-normative text, just missed in the JEP 486 update that remove >> the linked section. >> >> runtime/cds/appcds/StaticArchiveWithLambda.java is updated as creating the >> archive no longer skips a generated class. >> >> Testing: tier1-5 > > Alan Bateman has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains three additional > commits since the last revision: > > - StaticArchiveWithLambda no longer skips dynamically generated class when > creating archive > - Merge branch 'master' into JDK-8344011 > - Initial commit LGTM. Graal changes are ready - Marked as reviewed by yzheng (Committer). PR Review: https://git.openjdk.org/jdk/pull/22063#pullrequestreview-2435251121
Re: RFR: 8343981: Remove usage of security manager from Thread and related classes [v4]
On Tue, 12 Nov 2024 18:47:46 GMT, Alan Bateman wrote: >> Removes the SecurityManager usage from Thread + friends. >> >> In Thread, the getContextClassLoader method is no longer caller-sensitive >> method. >> >> JavaLangAccess.newThreadWithAcc is removed and jdk.internal.access is no >> longer exported to java.naming. The usage of newThreadWithAcc is removed >> from com.sun.jndi.ldap.VersionHelper. There will be further work on the >> java.naming module to remove usage of SM, the change here is specific to the >> usage of ewThreadWithAcc. > > Alan Bateman has updated the pull request with a new target base due to a > merge or a rebase. The pull request now contains 227 commits: > > - Merge branch 'master' into JDK-8343981 > - Merge commit 'db85090553ab14a84c3ed0a2604dd56c5b6e6982' into JDK-8343981 > - Initial commit > - Merge remote-tracking branch 'jdk-sandbox/jep486' into JDK-8338411 > - Merge branch 'master' into jep486 > - Move remaining JEP 486 failing tests into correct groups. > - Move JEP 486 failing tests into hotspot_runtime group. > - test/jdk/java/rmi/server/RMIClassLoader/spi/DefaultProperty.java failing > - Merge branch 'master' into jep486 > - Merge branch 'master' into jep486 > - ... and 217 more: https://git.openjdk.org/jdk/compare/63eb4853...636ebd19 Marked as reviewed by yzheng (Committer). - PR Review: https://git.openjdk.org/jdk/pull/22035#pullrequestreview-2433814049
Re: RFR: 8343981: Remove usage of security manager from Thread and related classes [v4]
On Tue, 12 Nov 2024 18:47:46 GMT, Alan Bateman wrote: >> Removes the SecurityManager usage from Thread + friends. >> >> In Thread, the getContextClassLoader method is no longer caller-sensitive >> method. >> >> JavaLangAccess.newThreadWithAcc is removed and jdk.internal.access is no >> longer exported to java.naming. The usage of newThreadWithAcc is removed >> from com.sun.jndi.ldap.VersionHelper. There will be further work on the >> java.naming module to remove usage of SM, the change here is specific to the >> usage of ewThreadWithAcc. > > Alan Bateman has updated the pull request with a new target base due to a > merge or a rebase. The pull request now contains 227 commits: > > - Merge branch 'master' into JDK-8343981 > - Merge commit 'db85090553ab14a84c3ed0a2604dd56c5b6e6982' into JDK-8343981 > - Initial commit > - Merge remote-tracking branch 'jdk-sandbox/jep486' into JDK-8338411 > - Merge branch 'master' into jep486 > - Move remaining JEP 486 failing tests into correct groups. > - Move JEP 486 failing tests into hotspot_runtime group. > - test/jdk/java/rmi/server/RMIClassLoader/spi/DefaultProperty.java failing > - Merge branch 'master' into jep486 > - Merge branch 'master' into jep486 > - ... and 217 more: https://git.openjdk.org/jdk/compare/63eb4853...636ebd19 Please hold until I have the Graal changes ready - PR Comment: https://git.openjdk.org/jdk/pull/22035#issuecomment-2474106454
Re: RFR: 8338383: Implement JEP 491: Synchronize Virtual Threads without Pinning
On Thu, 17 Oct 2024 14:28:30 GMT, Patricio Chilano Mateo wrote: > This is the implementation of JEP 491: Synchronize Virtual Threads without > Pinning. See [JEP 491](https://bugs.openjdk.org/browse/JDK-8337395) for > further details. > > In order to make the code review easier the changes have been split into the > following initial 4 commits: > > - Changes to allow unmounting a virtual thread that is currently holding > monitors. > - Changes to allow unmounting a virtual thread blocked on synchronized trying > to acquire the monitor. > - Changes to allow unmounting a virtual thread blocked in `Object.wait()` and > its timed-wait variants. > - Changes to tests, JFR pinned event, and other changes in the JDK libraries. > > The changes fix pinning issues for all 4 ports that currently implement > continuations: x64, aarch64, riscv and ppc. Note: ppc changes were added > recently and stand in its own commit after the initial ones. > > The changes fix pinning issues when using `LM_LIGHTWEIGHT`, i.e. the default > locking mode, (and `LM_MONITOR` which comes for free), but not when using > `LM_LEGACY` mode. Note that the `LockingMode` flag has already been > deprecated ([JDK-8334299](https://bugs.openjdk.org/browse/JDK-8334299)), with > the intention to remove `LM_LEGACY` code in future releases. > > > ## Summary of changes > > ### Unmount virtual thread while holding monitors > > As stated in the JEP, currently when a virtual thread enters a synchronized > method or block, the JVM records the virtual thread's carrier platform thread > as holding the monitor, not the virtual thread itself. This prevents the > virtual thread from being unmounted from its carrier, as ownership > information would otherwise go wrong. In order to fix this limitation we will > do two things: > > - We copy the oops stored in the LockStack of the carrier to the stackChunk > when freezing (and clear the LockStack). We copy the oops back to the > LockStack of the next carrier when thawing for the first time (and clear them > from the stackChunk). Note that we currently assume carriers don't hold > monitors while mounting virtual threads. > > - For inflated monitors we now record the `java.lang.Thread.tid` of the owner > in the ObjectMonitor's `_owner` field instead of a JavaThread*. This allows > us to tie the owner of the monitor to a `java.lang.Thread` instance, rather > than to a JavaThread which is only created per platform thread. The tid is > already a 64 bit field so we can ignore issues of the counter wrapping around. > > General notes about this part: > > - Since virtual threads don't need to worry about holding monitors anymo... src/hotspot/share/jvmci/vmStructs_jvmci.cpp line 329: > 327: nonstatic_field(ObjArrayKlass, _element_klass, > Klass*)\ > 328: > \ > 329: unchecked_nonstatic_field(ObjectMonitor, _owner, > int64_t) \ to make the type assert more precise: diff --git a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp index 20b9609cdbf..f2b8a69c03f 100644 --- a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp +++ b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp @@ -326,7 +326,7 @@ \ nonstatic_field(ObjArrayKlass, _element_klass, Klass*)\ \ - unchecked_nonstatic_field(ObjectMonitor, _owner, int64_t) \ + volatile_nonstatic_field(ObjectMonitor, _owner, int64_t) \ volatile_nonstatic_field(ObjectMonitor, _recursions, intptr_t) \ volatile_nonstatic_field(ObjectMonitor, _cxq, ObjectWaiter*) \ volatile_nonstatic_field(ObjectMonitor, _EntryList, ObjectWaiter*) \ diff --git a/src/hotspot/share/runtime/vmStructs.cpp b/src/hotspot/share/runtime/vmStructs.cpp index 86d7277f88b..0492f28e15b 100644 --- a/src/hotspot/share/runtime/vmStructs.cpp +++ b/src/hotspot/share/runtime/vmStructs.cpp @@ -786,8 +786,8 @@ \ volatile_nonstatic_field
Re: RFR: 8338383: Implement JEP 491: Synchronize Virtual Threads without Pinning [v12]
On Fri, 25 Oct 2024 21:33:24 GMT, Patricio Chilano Mateo wrote: >> This is the implementation of JEP 491: Synchronize Virtual Threads without >> Pinning. See [JEP 491](https://bugs.openjdk.org/browse/JDK-8337395) for >> further details. >> >> In order to make the code review easier the changes have been split into the >> following initial 4 commits: >> >> - Changes to allow unmounting a virtual thread that is currently holding >> monitors. >> - Changes to allow unmounting a virtual thread blocked on synchronized >> trying to acquire the monitor. >> - Changes to allow unmounting a virtual thread blocked in `Object.wait()` >> and its timed-wait variants. >> - Changes to tests, JFR pinned event, and other changes in the JDK libraries. >> >> The changes fix pinning issues for all 4 ports that currently implement >> continuations: x64, aarch64, riscv and ppc. Note: ppc changes were added >> recently and stand in its own commit after the initial ones. >> >> The changes fix pinning issues when using `LM_LIGHTWEIGHT`, i.e. the default >> locking mode, (and `LM_MONITOR` which comes for free), but not when using >> `LM_LEGACY` mode. Note that the `LockingMode` flag has already been >> deprecated ([JDK-8334299](https://bugs.openjdk.org/browse/JDK-8334299)), >> with the intention to remove `LM_LEGACY` code in future releases. >> >> >> ## Summary of changes >> >> ### Unmount virtual thread while holding monitors >> >> As stated in the JEP, currently when a virtual thread enters a synchronized >> method or block, the JVM records the virtual thread's carrier platform >> thread as holding the monitor, not the virtual thread itself. This prevents >> the virtual thread from being unmounted from its carrier, as ownership >> information would otherwise go wrong. In order to fix this limitation we >> will do two things: >> >> - We copy the oops stored in the LockStack of the carrier to the stackChunk >> when freezing (and clear the LockStack). We copy the oops back to the >> LockStack of the next carrier when thawing for the first time (and clear >> them from the stackChunk). Note that we currently assume carriers don't hold >> monitors while mounting virtual threads. >> >> - For inflated monitors we now record the `java.lang.Thread.tid` of the >> owner in the ObjectMonitor's `_owner` field instead of a JavaThread*. This >> allows us to tie the owner of the monitor to a `java.lang.Thread` instance, >> rather than to a JavaThread which is only created per platform thread. The >> tid is already a 64 bit field so we can ignore issues of the counter >> wrapping around. >> >> General notes about this part: >> >> - Since virtual th... > > Patricio Chilano Mateo has updated the pull request incrementally with two > additional commits since the last revision: > > - Restore use of atPointA in test StopThreadTest.java > - remove interruptible check from conditional in Object::wait src/hotspot/share/jvmci/vmStructs_jvmci.cpp line 329: > 327: nonstatic_field(ObjArrayKlass, _element_klass, > Klass*)\ > 328: > \ > 329: unchecked_nonstatic_field(ObjectMonitor, _owner, > int64_t) \ to make the type assert more precise: diff --git a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp index 20b9609cdbf..f2b8a69c03f 100644 --- a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp +++ b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp @@ -326,7 +326,7 @@ \ nonstatic_field(ObjArrayKlass, _element_klass, Klass*)\ \ - unchecked_nonstatic_field(ObjectMonitor, _owner, int64_t) \ + volatile_nonstatic_field(ObjectMonitor, _owner, int64_t) \ volatile_nonstatic_field(ObjectMonitor, _recursions, intptr_t) \ volatile_nonstatic_field(ObjectMonitor, _cxq, ObjectWaiter*) \ volatile_nonstatic_field(ObjectMonitor, _EntryList, ObjectWaiter*) \ diff --git a/src/hotspot/share/runtime/vmStructs.cpp b/src/hotspot/share/runtime/vmStructs.cpp index 86d7277f88b..0492f28e15b 100644 --- a/src/hotspot/share
Re: RFR: 8338694: x86_64 intrinsic for tanh using libm [v13]
On Mon, 23 Sep 2024 19:24:51 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to implement an x86_64 intrinsic for >> java.lang.Math.tanh() using libm >> >> Benchmark (ops/ms) | Stock JDK | Tanh intrinsic | Speedup >> -- | -- | -- | -- >> MathBench.tanhDouble | 70900 | 95618 | 1.35x > > Srinivas Vamsi Parasa has updated the pull request incrementally with one > additional commit since the last revision: > > change ifdef from x86 to AMD64 src/hotspot/cpu/x86/stubGenerator_x86_64_tanh.cpp line 74: > 72: // Special cases: > 73: // tanh(NaN) = quiet NaN, and raise invalid exception > 74: // tanh(INF) = that INF This should be tanh(POSITIVE_INFINITY) = +1.0 tanh(NEGATIVE_INFINITY) = -1.0 - PR Review Comment: https://git.openjdk.org/jdk/pull/20657#discussion_r1803316120
Re: RFR: 8329597: C2: Intrinsify Reference.clear [v9]
On Wed, 9 Oct 2024 08:44:34 GMT, Aleksey Shipilev wrote: >> [JDK-8240696](https://bugs.openjdk.org/browse/JDK-8240696) added the native >> method for `Reference.clear`. The original patch skipped intrinsification of >> this method, because we thought `Reference.clear` is not on a performance >> sensitive path. However, it shows up prominently on simple benchmarks that >> touch e.g. `ThreadLocal` cleanups. See the bug for an example profile with >> `RRWL` benchmarks. >> >> We need to know the actual oop strongness/weakness before we call into C2 >> Access API, this work models this after existing code for `refersTo0` >> intrinsics. C2 Access also need a support for `AS_NO_KEEPALIVE` for stores. >> >> Additional testing: >> - [x] Linux x86_64 server fastdebug, `all` >> - [x] Linux AArch64 server fastdebug, `all` > > Aleksey Shipilev has updated the pull request with a new target base due to a > merge or a rebase. The pull request now contains 18 commits: > > - Simplify: just do keep alive checks > - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear > - More precise bit-unmasks > - Reconcile with late barrier expansion in G1 > - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear > - Review comments > - Merge branch 'master' into JDK-8329597-intrinsify-reference-clear > - Also dispatch to slow-path on other arches > - Fix other arches > - Tighten up comments in Reference javadoc > - ... and 8 more: https://git.openjdk.org/jdk/compare/580eb62d...9f7ad7ab src/hotspot/share/gc/z/zBarrierSetRuntime.hpp line 43: > 41: static void store_barrier_on_oop_field_with_healing(oop* p); > 42: static void store_barrier_on_oop_field_without_healing(oop* p); > 43: static void > no_keepalive_store_barrier_on_oop_field_without_healing(oop* p); Could you please export this to JVMCI? I.e., diff --git a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp index 5452cca96b8..46aeb996c56 100644 --- a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp +++ b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp @@ -847,6 +847,7 @@ ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, ZBarrierSetRuntime::load_barrier_on_oop_field_preloaded_store_good)) \ ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, ZBarrierSetRuntime::no_keepalive_load_barrier_on_weak_oop_field_preloaded))\ ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, ZBarrierSetRuntime::no_keepalive_load_barrier_on_phantom_oop_field_preloaded)) \ + ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, ZBarrierSetRuntime::no_keepalive_store_barrier_on_oop_field_without_healing) \ ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, ZBarrierSetRuntime::store_barrier_on_native_oop_field_without_healing))\ ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, ZBarrierSetRuntime::store_barrier_on_oop_field_with_healing)) \ ZGC_ONLY(DECLARE_FUNCTION_FROM_ADDR(declare_function_with_value, ZBarrierSetRuntime::store_barrier_on_oop_field_without_healing)) \ Thanks! - PR Review Comment: https://git.openjdk.org/jdk/pull/20139#discussion_r1800848130
Re: RFR: 8338694: x86_64 intrinsic for tanh using libm
On Wed, 21 Aug 2024 00:25:03 GMT, Srinivas Vamsi Parasa wrote: > The goal of this PR is to implement an x86_64 intrinsic for > java.lang.Math.tanh() using libm > > Benchmark (ops/ms) | Stock JDK | Tanh intrinsic | Speedup > -- | -- | -- | -- > MathBench.tanhDouble | 70900 | 95618 | 1.35x src/hotspot/share/jvmci/jvmciCompilerToVM.hpp line 114: > 112: static address dcos; > 113: static address dtan; > 114: static address dtanh; Could you please add the following initializing code? diff --git a/src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp b/src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp index 9752d7edf99..1db9be70db0 100644 --- a/src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp +++ b/src/hotspot/share/jvmci/jvmciCompilerToVMInit.cpp @@ -259,6 +259,17 @@ void CompilerToVM::Data::initialize(JVMCI_TRAPS) { SET_TRIGFUNC(dpow); #undef SET_TRIGFUNC + +#define SET_TRIGFUNC_OR_NULL(name) \ + if (StubRoutines::name() != nullptr) {\ +name = StubRoutines::name();\ + } else { \ +name = nullptr; \ + } + + SET_TRIGFUNC_OR_NULL(dtanh); + +#undef SET_TRIGFUNC_OR_NULL } static jboolean is_c1_supported(vmIntrinsics::ID id){ diff --git a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp index fea308503cf..189c1465589 100644 --- a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp +++ b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp @@ -126,6 +126,7 @@ static_field(CompilerToVM::Data, dsin, address) \ static_field(CompilerToVM::Data, dcos, address) \ static_field(CompilerToVM::Data, dtan, address) \ + static_field(CompilerToVM::Data, dtanh, address) \ static_field(CompilerToVM::Data, dexp, address) \ static_field(CompilerToVM::Data, dlog, address) \ static_field(CompilerToVM::Data, dlog10, address) \ - PR Review Comment: https://git.openjdk.org/jdk/pull/20657#discussion_r1734655447
Re: RFR: 8338745: Intrinsify Continuation.pin() and Continuation.unpin() [v2]
On Thu, 22 Aug 2024 11:13:17 GMT, Markus Grönlund wrote: >> Greetings, >> >> Please help review this change set that implements C2 intrinsics for >> jdk.internal.vm.Continuation.pin() and jdk.internal.vm.Continuation.unpin(). >> >> This work is a consequence of >> [JDK-8338417](https://bugs.openjdk.org/browse/JDK-8338417), which required >> us to introduce explicit pin constructs for Virtual threads in a relatively >> performance-sensitive path. >> >> Testing: jdk_jfr, loom testing >> >> Comment: I changed the type of the ContinuationEntry::_pin_count field from >> uint to uin32_t to make the size explicit and to access it uniformly from >> the intrinsic code using T_INT. >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with one > additional commit since the last revision: > > Deoptimization::Action_none for no deopt src/hotspot/share/opto/library_call.cpp line 3733: > 3731: // TLS > 3732: Node* tls_ptr = _gvn.transform(new ThreadLocalNode()); > 3733: Node* last_continuation_offset = basic_plus_adr(top(), tls_ptr, > in_bytes(JavaThread::cont_entry_offset())); Could you please export `JavaThread::_cont_entry` and `ContinuationEntry::_pin_count` to JVMCI? Thanks! diff --git a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp index 688691fb976..a25ecd2bab5 100644 --- a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp +++ b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp @@ -35,6 +35,7 @@ #include "oops/methodCounters.hpp" #include "oops/objArrayKlass.hpp" #include "prims/jvmtiThreadState.hpp" +#include "runtime/continuationEntry.hpp" #include "runtime/deoptimization.hpp" #include "runtime/flags/jvmFlag.hpp" #include "runtime/osThread.hpp" @@ -244,10 +245,13 @@ nonstatic_field(JavaThread, _held_monitor_count, intx) \ nonstatic_field(JavaThread, _lock_stack, LockStack) \ nonstatic_field(JavaThread, _om_cache, OMCache) \ + nonstatic_field(JavaThread, _cont_entry, ContinuationEntry*)\ JVMTI_ONLY(nonstatic_field(JavaThread, _is_in_VTMS_transition, bool)) \ JVMTI_ONLY(nonstatic_field(JavaThread, _is_in_tmp_VTMS_transition, bool)) \ JVMTI_ONLY(nonstatic_field(JavaThread, _is_disable_suspend, bool)) \ \ + nonstatic_field(ContinuationEntry, _pin_count, uint32_t) \ + \ nonstatic_field(LockStack, _top, uint32_t) \ \ JVMTI_ONLY(static_field(JvmtiVTMSTransitionDisabler, _VTMS_notify_jvmti_events, bool)) \ diff --git a/src/hotspot/share/runtime/continuationEntry.hpp b/src/hotspot/share/runtime/continuationEntry.hpp index 459321f444c..ac76cd6f088 100644 --- a/src/hotspot/share/runtime/continuationEntry.hpp +++ b/src/hotspot/share/runtime/continuationEntry.hpp @@ -39,6 +39,7 @@ class RegisterMap; // Metadata stored in the continuation entry frame class ContinuationEntry { + friend class JVMCIVMStructs; ContinuationEntryPD _pd; #ifdef ASSERT private: - PR Review Comment: https://git.openjdk.org/jdk/pull/20664#discussion_r1726876549
Re: RFR: 8315884: New Object to ObjectMonitor mapping [v20]
On Thu, 15 Aug 2024 06:12:22 GMT, Axel Boldt-Christmas wrote: >> When inflating a monitor the `ObjectMonitor*` is written directly over the >> `markWord` and any overwritten data is displaced into a displaced >> `markWord`. This is problematic for concurrent GCs which needs extra care or >> looser semantics to use this displaced data. In Lilliput this data also >> contains the klass forcing this to be something that the GC has to take into >> account everywhere. >> >> This patch introduces an alternative solution where locking only uses the >> lock bits of the `markWord` and inflation does not override and displace the >> `markWord`. This is done by keeping associations between objects and >> `ObjectMonitor*` in an external hash table. Different caching techniques are >> used to speedup lookups from compiled code. >> >> A diagnostic VM option is introduced called `UseObjectMonitorTable`. It is >> only supported in combination with the LM_LIGHTWEIGHT locking mode (the >> default). >> >> This patch has been evaluated to be performance neutral when >> `UseObjectMonitorTable` is turned off (the default). >> >> Below is a more detailed explanation of this change and how `LM_LIGHTWEIGHT` >> and `UseObjectMonitorTable` works. >> >> # Cleanups >> >> Cleaned up displaced header usage for: >> * BasicLock >> * Contains some Zero changes >> * Renames one exported JVMCI field >> * ObjectMonitor >> * Updates comments and tests consistencies >> >> # Refactoring >> >> `ObjectMonitor::enter` has been refactored an a >> `ObjectMonitorContentionMark` witness object has been introduced to the >> signatures. Which signals that the contentions reference counter is being >> held. More details are given below in the section about deflation. >> >> The initial purpose of this was to allow `UseObjectMonitorTable` to interact >> more seamlessly with the `ObjectMonitor::enter` code. >> >> _There is even more `ObjectMonitor` refactoring which can be done here to >> create a more understandable and enforceable API. There are a handful of >> invariants / assumptions which are not always explicitly asserted which >> could be trivially abstracted and verified by the type system by using >> similar witness objects._ >> >> # LightweightSynchronizer >> >> Working on adapting and incorporating the following section as a comment in >> the source code >> >> ## Fast Locking >> >> CAS on locking bits in markWord. >> 0b00 (Fast Locked) <--> 0b01 (Unlocked) >> >> When locking and 0b00 (Fast Locked) is observed, it may be beneficial to >> avoid inflating by spinning a bit. >> >> If 0b10 (Inflated) is observed or there is to... > > Axel Boldt-Christmas has updated the pull request incrementally with one > additional commit since the last revision: > > Remove newline src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 677: > 675: > 676: // Check for match. > 677: cmpptr(obj, Address(t)); `Address(t)` can be cached (in rax?) and reused in the subsequent `comptr` - PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1718426158
Re: RFR: 8315884: New Object to ObjectMonitor mapping [v5]
On Wed, 10 Jul 2024 20:10:07 GMT, Axel Boldt-Christmas wrote: >> When inflating a monitor the `ObjectMonitor*` is written directly over the >> `markWord` and any overwritten data is displaced into a displaced >> `markWord`. This is problematic for concurrent GCs which needs extra care or >> looser semantics to use this displaced data. In Lilliput this data also >> contains the klass forcing this to be something that the GC has to take into >> account everywhere. >> >> This patch introduces an alternative solution where locking only uses the >> lock bits of the `markWord` and inflation does not override and displace the >> `markWord`. This is done by keeping associations between objects and >> `ObjectMonitor*` in an external hash table. Different caching techniques are >> used to speedup lookups from compiled code. >> >> A diagnostic VM option is introduced called `UseObjectMonitorTable`. It is >> only supported in combination with the LM_LIGHTWEIGHT locking mode (the >> default). >> >> This patch has been evaluated to be performance neutral when >> `UseObjectMonitorTable` is turned off (the default). >> >> Below is a more detailed explanation of this change and how `LM_LIGHTWEIGHT` >> and `UseObjectMonitorTable` works. >> >> # Cleanups >> >> Cleaned up displaced header usage for: >> * BasicLock >> * Contains some Zero changes >> * Renames one exported JVMCI field >> * ObjectMonitor >> * Updates comments and tests consistencies >> >> # Refactoring >> >> `ObjectMonitor::enter` has been refactored an a >> `ObjectMonitorContentionMark` witness object has been introduced to the >> signatures. Which signals that the contentions reference counter is being >> held. More details are given below in the section about deflation. >> >> The initial purpose of this was to allow `UseObjectMonitorTable` to interact >> more seamlessly with the `ObjectMonitor::enter` code. >> >> _There is even more `ObjectMonitor` refactoring which can be done here to >> create a more understandable and enforceable API. There are a handful of >> invariants / assumptions which are not always explicitly asserted which >> could be trivially abstracted and verified by the type system by using >> similar witness objects._ >> >> # LightweightSynchronizer >> >> Working on adapting and incorporating the following section as a comment in >> the source code >> >> ## Fast Locking >> >> CAS on locking bits in markWord. >> 0b00 (Fast Locked) <--> 0b01 (Unlocked) >> >> When locking and 0b00 (Fast Locked) is observed, it may be beneficial to >> avoid inflating by spinning a bit. >> >> If 0b10 (Inflated) is observed or there is to... > > Axel Boldt-Christmas has updated the pull request incrementally with four > additional commits since the last revision: > > - Add extra comments in LightweightSynchronizer::inflate_fast_locked_object > - Fix typos > - Remove unused variable > - Add missing inline qualifiers src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 843: > 841: movptr(monitor, Address(box, > BasicLock::object_monitor_cache_offset_in_bytes())); > 842: // null check with ZF == 0, no valid pointer below > alignof(ObjectMonitor*) > 843: cmpptr(monitor, alignof(ObjectMonitor*)); Is this only for keeping `ZF == 0` and can be replaced by `test; je` if we are not using `jne` to jump to the slow path? Or is there any performance concern? Btw, I think `ZF` is always rewritten before entering into the slow path https://github.com/openjdk/jdk/blob/b32e4a68bca588d908bd81a398eb3171a6876dc5/src/hotspot/cpu/x86/c2_CodeStubs_x86.cpp#L98-L102 - PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1673704482
Re: RFR: 8335553: [Graal] Compiler thread calls into jdk.internal.vm.VMSupport.decodeAndThrowThrowable and crashes in OOM situation [v2]
On Tue, 9 Jul 2024 14:42:42 GMT, Doug Simon wrote: >> test/jdk/jdk/internal/vm/TestTranslatedException.java line 167: >> >>> 165: private static void assertThrowableEquals(Throwable originalIn, >>> Throwable decodedIn) { >>> 166: Throwable original = originalIn; >>> 167: Throwable decoded = decodedIn; >> >> What is the purpose of this renaming? > > So that the printing down the bottom of this message shows the complete > throwable, not just the cause on which the comparison failed. Thanks! I missed the reassign in the folded unchanged code. - PR Review Comment: https://git.openjdk.org/jdk/pull/20083#discussion_r1670683400
Re: RFR: 8335553: [Graal] Compiler thread calls into jdk.internal.vm.VMSupport.decodeAndThrowThrowable and crashes in OOM situation [v2]
On Tue, 9 Jul 2024 13:46:46 GMT, Doug Simon wrote: >> This PR addresses intermittent failures in jtreg GC stress tests. The >> failures occur under these conditions: >> 1. Using a libgraal build with assertions enabled as the top tier JIT >> compiler. Such a libgraal build will cause a VM exit if an assertion or >> GraalError occurs in a compiler thread (as this catches more errors in >> testing). >> 2. A libgraal compiler thread makes a call into the VM (via `CompilerToVM`) >> to a routine that performs a HotSpot heap allocation that fails. >> 3. The resulting OOME is wrapped in a GraalError, causing the VM to exit as >> described in 1. >> >> An OOME thrown in these specific conditions should not exit the VM as it not >> related to an OOME in the app or test. Instead, the failure should be >> treated as a bailout and the libgraal compiler should continue. >> >> To accomplish this, libgraal needs to be able to distinguish a GraalError >> caused by an OOME. This PR modifies the exception translation code to make >> this possible. > > Doug Simon has updated the pull request incrementally with one additional > commit since the last revision: > > fixed TestTranslatedException LGTM - Marked as reviewed by yzheng (Committer). PR Review: https://git.openjdk.org/jdk/pull/20083#pullrequestreview-2166581323
Re: RFR: 8335553: [Graal] Compiler thread calls into jdk.internal.vm.VMSupport.decodeAndThrowThrowable and crashes in OOM situation [v2]
On Tue, 9 Jul 2024 13:46:46 GMT, Doug Simon wrote: >> This PR addresses intermittent failures in jtreg GC stress tests. The >> failures occur under these conditions: >> 1. Using a libgraal build with assertions enabled as the top tier JIT >> compiler. Such a libgraal build will cause a VM exit if an assertion or >> GraalError occurs in a compiler thread (as this catches more errors in >> testing). >> 2. A libgraal compiler thread makes a call into the VM (via `CompilerToVM`) >> to a routine that performs a HotSpot heap allocation that fails. >> 3. The resulting OOME is wrapped in a GraalError, causing the VM to exit as >> described in 1. >> >> An OOME thrown in these specific conditions should not exit the VM as it not >> related to an OOME in the app or test. Instead, the failure should be >> treated as a bailout and the libgraal compiler should continue. >> >> To accomplish this, libgraal needs to be able to distinguish a GraalError >> caused by an OOME. This PR modifies the exception translation code to make >> this possible. > > Doug Simon has updated the pull request incrementally with one additional > commit since the last revision: > > fixed TestTranslatedException src/hotspot/share/jvmci/jvmciCompilerToVM.cpp line 782: > 780: while (true) { > 781: // Trigger an OutOfMemoryError > 782: objArrayOop next = oopFactory::new_objectArray(0x7FFF, > CHECK_NULL); Shall we check for pending exception and break here? test/jdk/jdk/internal/vm/TestTranslatedException.java line 167: > 165: private static void assertThrowableEquals(Throwable originalIn, > Throwable decodedIn) { > 166: Throwable original = originalIn; > 167: Throwable decoded = decodedIn; What is the purpose of this renaming? - PR Review Comment: https://git.openjdk.org/jdk/pull/20083#discussion_r1670646934 PR Review Comment: https://git.openjdk.org/jdk/pull/20083#discussion_r1670607742
Re: RFR: 8315884: New Object to ObjectMonitor mapping [v2]
On Mon, 8 Jul 2024 12:13:07 GMT, Axel Boldt-Christmas wrote: >> When inflating a monitor the `ObjectMonitor*` is written directly over the >> `markWord` and any overwritten data is displaced into a displaced >> `markWord`. This is problematic for concurrent GCs which needs extra care or >> looser semantics to use this displaced data. In Lilliput this data also >> contains the klass forcing this to be something that the GC has to take into >> account everywhere. >> >> This patch introduces an alternative solution where locking only uses the >> lock bits of the `markWord` and inflation does not override and displace the >> `markWord`. This is done by keeping associations between objects and >> `ObjectMonitor*` in an external hash table. Different caching techniques are >> used to speedup lookups from compiled code. >> >> A diagnostic VM option is introduced called `UseObjectMonitorTable`. It is >> only supported in combination with the LM_LIGHTWEIGHT locking mode (the >> default). >> >> This patch has been evaluated to be performance neutral when >> `UseObjectMonitorTable` is turned off (the default). >> >> Below is a more detailed explanation of this change and how `LM_LIGHTWEIGHT` >> and `UseObjectMonitorTable` works. >> >> # Cleanups >> >> Cleaned up displaced header usage for: >> * BasicLock >> * Contains some Zero changes >> * Renames one exported JVMCI field >> * ObjectMonitor >> * Updates comments and tests consistencies >> >> # Refactoring >> >> `ObjectMonitor::enter` has been refactored an a >> `ObjectMonitorContentionMark` witness object has been introduced to the >> signatures. Which signals that the contentions reference counter is being >> held. More details are given below in the section about deflation. >> >> The initial purpose of this was to allow `UseObjectMonitorTable` to interact >> more seamlessly with the `ObjectMonitor::enter` code. >> >> _There is even more `ObjectMonitor` refactoring which can be done here to >> create a more understandable and enforceable API. There are a handful of >> invariants / assumptions which are not always explicitly asserted which >> could be trivially abstracted and verified by the type system by using >> similar witness objects._ >> >> # LightweightSynchronizer >> >> Working on adapting and incorporating the following section as a comment in >> the source code >> >> ## Fast Locking >> >> CAS on locking bits in markWord. >> 0b00 (Fast Locked) <--> 0b01 (Unlocked) >> >> When locking and 0b00 (Fast Locked) is observed, it may be beneficial to >> avoid inflating by spinning a bit. >> >> If 0b10 (Inflated) is observed or there is to... > > Axel Boldt-Christmas has updated the pull request incrementally with one > additional commit since the last revision: > > More graceful JVMCI VM option interaction Could you please revert 2814350 and export the following symbols to JVMCI? diff --git a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp index faf2cb24616..7be31aa0f5f 100644 --- a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp +++ b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp @@ -241,6 +241,7 @@ nonstatic_field(JavaThread, _stack_overflow_state._reserved_stack_activation, address) \ nonstatic_field(JavaThread, _held_monitor_count, intx) \ nonstatic_field(JavaThread, _lock_stack, LockStack) \ + nonstatic_field(JavaThread, _om_cache, OMCache) \ JVMTI_ONLY(nonstatic_field(JavaThread, _is_in_VTMS_transition, bool)) \ JVMTI_ONLY(nonstatic_field(JavaThread, _is_in_tmp_VTMS_transition, bool)) \ JVMTI_ONLY(nonstatic_field(JavaThread, _is_disable_suspend, bool)) \ @@ -531,6 +532,8 @@ \ declare_constant_with_value("CardTable::dirty_card", CardTable::dirty_card_val()) \ declare_constant_with_value("LockStack::_end_offset", LockStack::end_offset()) \ + declare_constant_with_value("OMCache::oop_to_oop_difference", OMCache::oop_to_oop_difference()) \ + declare_constant_with_value("OMCache::oop_to_monitor_difference", OMCache::oop_to_monitor_difference()) \ \ declare_constant(CodeInstaller::VERIFIED_ENTRY) \ declare_constant(CodeInstaller::UNVERIFIED_ENTRY) \ - PR Comment: https://git.openjdk.org/jdk/pull/20067#issuecomment-2214322632
Integrated: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic
On Tue, 12 Mar 2024 10:44:54 GMT, Yudi Zheng wrote: > Moving array construction within BigInteger.implMultiplyToLen intrinsic > candidate to its caller simplifies the intrinsic implementation in JIT > compiler. This pull request has now been integrated. Changeset: ed81a478 Author:Yudi Zheng Committer: Martin Doerr URL: https://git.openjdk.org/jdk/commit/ed81a478e175631f1de69eb4b43f927629fefd74 Stats: 146 lines in 17 files changed: 11 ins; 82 del; 53 mod 8327964: Simplify BigInteger.implMultiplyToLen intrinsic Reviewed-by: mdoerr, amitkumar, kvn, fyang - PR: https://git.openjdk.org/jdk/pull/18226
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v7]
On Fri, 24 May 2024 15:12:28 GMT, Yudi Zheng wrote: >> Moving array construction within BigInteger.implMultiplyToLen intrinsic >> candidate to its caller simplifies the intrinsic implementation in JIT >> compiler. > > Yudi Zheng has updated the pull request with a new target base due to a merge > or a rebase. The incremental webrev excludes the unrelated changes brought in > by the merge/rebase. The pull request contains eight additional commits since > the last revision: > > - Merge remote-tracking branch 'upstream/master' into JDK-8327964 > - address comments. > - address comments. > - address comment. > - address comment. > - address comment. > - address comment. > - Simplify BigInteger.implMultiplyToLen intrinsic Thanks for the reviews! Mach5 testing looks good except for a couple known timeouts unrelated to this PR. GHA test failure is due to [JDK-8332923](https://bugs.openjdk.org/browse/JDK-8332923). - PR Comment: https://git.openjdk.org/jdk/pull/18226#issuecomment-2133444527
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v7]
> Moving array construction within BigInteger.implMultiplyToLen intrinsic > candidate to its caller simplifies the intrinsic implementation in JIT > compiler. Yudi Zheng has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: - Merge remote-tracking branch 'upstream/master' into JDK-8327964 - address comments. - address comments. - address comment. - address comment. - address comment. - address comment. - Simplify BigInteger.implMultiplyToLen intrinsic - Changes: - all: https://git.openjdk.org/jdk/pull/18226/files - new: https://git.openjdk.org/jdk/pull/18226/files/7c6023f8..c719e0a9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=05-06 Stats: 560567 lines in 6784 files changed: 132593 ins; 81763 del; 346211 mod Patch: https://git.openjdk.org/jdk/pull/18226.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18226/head:pull/18226 PR: https://git.openjdk.org/jdk/pull/18226
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v6]
On Thu, 23 May 2024 10:12:17 GMT, Bhavana Kilambi wrote: >> Yudi Zheng has updated the pull request incrementally with one additional >> commit since the last revision: >> >> address comments. > > src/hotspot/share/opto/library_call.cpp line 5925: > >> 5923: // Set the original stack and the reexecute bit for the interpreter >> to reexecute >> 5924: // the bytecode that invokes BigInteger.multiplyToLen() if >> deoptimization happens >> 5925: // on the return from z array allocation in runtime. > > Since we are not allocating z array during runtime anymore, do we still need > these comments? Thanks for pointing it out! Removed. > src/java.base/share/classes/java/math/BigInteger.java line 1836: > >> 1834: >> 1835: if (z == null || z.length < (xlen + ylen)) >> 1836: z = new int[xlen + ylen]; > > Style: only 4 spaces indentation Done - PR Review Comment: https://git.openjdk.org/jdk/pull/18226#discussion_r1613608426 PR Review Comment: https://git.openjdk.org/jdk/pull/18226#discussion_r1613608653
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v6]
On Wed, 22 May 2024 14:47:43 GMT, Yudi Zheng wrote: >> Moving array construction within BigInteger.implMultiplyToLen intrinsic >> candidate to its caller simplifies the intrinsic implementation in JIT >> compiler. > > Yudi Zheng has updated the pull request incrementally with one additional > commit since the last revision: > > address comments. @theRealAph @TheRealMDoerr @RealFYang @offamitkumar could you please help review the aarch64/ppc/riscv/s390 changes? - PR Comment: https://git.openjdk.org/jdk/pull/18226#issuecomment-2125073511
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v5]
On Wed, 17 Apr 2024 20:04:44 GMT, Dean Long wrote: >> Yudi Zheng has updated the pull request incrementally with one additional >> commit since the last revision: >> >> address comment. > > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 6662: > >> 6660: push(tmp5); >> 6661: >> 6662: push(xlen); > > There may be an opportunity here (separate RFE?) to get rid of the > save/restore for these. I don't think it's necessary if this is called as > part of a C2 stub. In the Graal port we did get rid of these save/restore. - PR Review Comment: https://git.openjdk.org/jdk/pull/18226#discussion_r1610126799
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v6]
> Moving array construction within BigInteger.implMultiplyToLen intrinsic > candidate to its caller simplifies the intrinsic implementation in JIT > compiler. Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision: address comments. - Changes: - all: https://git.openjdk.org/jdk/pull/18226/files - new: https://git.openjdk.org/jdk/pull/18226/files/72ba58ce..7c6023f8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=04-05 Stats: 24 lines in 2 files changed: 0 ins; 0 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/18226.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18226/head:pull/18226 PR: https://git.openjdk.org/jdk/pull/18226
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v3]
On Mon, 20 May 2024 10:41:36 GMT, Bhavana Kilambi wrote: >> @dafedafe @dean-long please take a look and let me know if there are further >> issues, thanks! > > Hi @mur47x111, do you happen to have any performance results with this patch? @Bhavana-Kilambi the performance result for x86 is at https://github.com/openjdk/jdk/pull/18226#issuecomment-2007922439 . It is expected to be negligible. - PR Comment: https://git.openjdk.org/jdk/pull/18226#issuecomment-2124984579
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v5]
On Wed, 17 Apr 2024 19:33:01 GMT, Dean Long wrote: >> Yudi Zheng has updated the pull request incrementally with one additional >> commit since the last revision: >> >> address comment. > > src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 4670: > >> 4668: const Register tmp5 = r15; >> 4669: const Register tmp6 = r16; >> 4670: const Register tmp7 = r17; > > Why not minimize changes and continue to use r5 for tmp0? I see no need for > r17 or to reassign all the other tmp registers. Was attempting to align the suffixes. Will revert. > src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 4693: > >> 4691: const Register xlen = r1; >> 4692: const Register z = r2; >> 4693: const Register zlen = r3; > > LibraryCallKit::inline_squareToLen() is still computing zlen and passing it > as the 4th arg, even though the value is unused. ppc x86 are not using `multiply_to_len` for `generate_squareToLen`. I think we still need to pass zlen for these platforms. - PR Review Comment: https://git.openjdk.org/jdk/pull/18226#discussion_r1610083476 PR Review Comment: https://git.openjdk.org/jdk/pull/18226#discussion_r1610088021
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v3]
On Tue, 26 Mar 2024 15:59:33 GMT, Damon Fenacci wrote: >> Yudi Zheng has updated the pull request incrementally with one additional >> commit since the last revision: >> >> address comment. > > `multiply_to_len` seems to be used by `generate_squareToLen` as well for > aarch64 and riscv but `zlen` is still passed in a register. > > https://github.com/openjdk/jdk/blob/870a6127cf54264c691f7322d775b202705c3bfa/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp#L4710 > https://github.com/openjdk/jdk/blob/870a6127cf54264c691f7322d775b202705c3bfa/src/hotspot/cpu/riscv/stubGenerator_riscv.cpp#L2881 > > I think it might work anyway but it might be better to adapt them if only for > completeness. @dafedafe @dean-long please take a look and let me know if there are further issues, thanks! - PR Comment: https://git.openjdk.org/jdk/pull/18226#issuecomment-2061162283
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v5]
> Moving array construction within BigInteger.implMultiplyToLen intrinsic > candidate to its caller simplifies the intrinsic implementation in JIT > compiler. Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision: address comment. - Changes: - all: https://git.openjdk.org/jdk/pull/18226/files - new: https://git.openjdk.org/jdk/pull/18226/files/c5d521dc..72ba58ce Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=03-04 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18226.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18226/head:pull/18226 PR: https://git.openjdk.org/jdk/pull/18226
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v4]
> Moving array construction within BigInteger.implMultiplyToLen intrinsic > candidate to its caller simplifies the intrinsic implementation in JIT > compiler. Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision: address comment. - Changes: - all: https://git.openjdk.org/jdk/pull/18226/files - new: https://git.openjdk.org/jdk/pull/18226/files/870a6127..c5d521dc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=02-03 Stats: 53 lines in 10 files changed: 3 ins; 6 del; 44 mod Patch: https://git.openjdk.org/jdk/pull/18226.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18226/head:pull/18226 PR: https://git.openjdk.org/jdk/pull/18226
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v3]
> Moving array construction within BigInteger.implMultiplyToLen intrinsic > candidate to its caller simplifies the intrinsic implementation in JIT > compiler. Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision: address comment. - Changes: - all: https://git.openjdk.org/jdk/pull/18226/files - new: https://git.openjdk.org/jdk/pull/18226/files/bfc323b7..870a6127 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=01-02 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18226.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18226/head:pull/18226 PR: https://git.openjdk.org/jdk/pull/18226
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v2]
> Moving array construction within BigInteger.implMultiplyToLen intrinsic > candidate to its caller simplifies the intrinsic implementation in JIT > compiler. Yudi Zheng has updated the pull request incrementally with one additional commit since the last revision: address comment. - Changes: - all: https://git.openjdk.org/jdk/pull/18226/files - new: https://git.openjdk.org/jdk/pull/18226/files/37232a5f..bfc323b7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=00-01 Stats: 70 lines in 11 files changed: 8 ins; 28 del; 34 mod Patch: https://git.openjdk.org/jdk/pull/18226.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18226/head:pull/18226 PR: https://git.openjdk.org/jdk/pull/18226
Re: RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic [v2]
On Mon, 18 Mar 2024 16:55:28 GMT, Damon Fenacci wrote: > Quite a simplification! Have you checked if there are any performance > differences? Ran https://github.com/oracle/graal/blob/master/compiler/src/org.graalvm.micro.benchmarks/src/micro/benchmarks/BigIntegerBenchmark.java The results are $ before change Benchmark Mode CntScore Error Units BigIntegerBenchmark.bigIntMontgomeryMul thrpt5 122.488 ± 0.130 ops/s BigIntegerBenchmark.bigIntMontgomerySqr thrpt5 76.023 ± 0.106 ops/s BigIntegerBenchmark.bigIntMulthrpt5 330.130 ± 0.349 ops/s BigIntegerBenchmark.bigIntMulAdd thrpt5 455.590 ± 0.663 ops/s $ after change Benchmark Mode CntScore Error Units BigIntegerBenchmark.bigIntMontgomeryMul thrpt5 124.407 ± 0.045 ops/s BigIntegerBenchmark.bigIntMontgomerySqr thrpt5 76.036 ± 0.232 ops/s BigIntegerBenchmark.bigIntMulthrpt5 329.836 ± 0.953 ops/s BigIntegerBenchmark.bigIntMulAdd thrpt5 456.485 ± 0.766 ops/s - PR Comment: https://git.openjdk.org/jdk/pull/18226#issuecomment-2007922439
RFR: 8327964: Simplify BigInteger.implMultiplyToLen intrinsic
Moving array construction within BigInteger.implMultiplyToLen intrinsic candidate to its caller simplifies the intrinsic implementation in JIT compiler. - Commit messages: - Simplify BigInteger.implMultiplyToLen intrinsic Changes: https://git.openjdk.org/jdk/pull/18226/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18226&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8327964 Stats: 53 lines in 2 files changed: 4 ins; 49 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/18226.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18226/head:pull/18226 PR: https://git.openjdk.org/jdk/pull/18226
Re: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v11]
On Wed, 2 Aug 2023 12:33:43 GMT, Ferenc Rakoczi wrote: >> src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 3473: >> >>> 3471: __ bcax(v24, __ T16B, v24, v8, v31); >>> 3472: >>> 3473: __ ld1r(v31, __ T2D, __ post(rscratch1, 8)); >> >> is it intentional to load 16 bytes and post-increment by 8? > > Actually, with the ld1r instruction the post increment should be the same as > the size of the memory accessed. So T2D requires 8 as it reads 8 bytes(and > duplicates it into both halves of the SIMD register). Thanks for the clarification! - PR Review Comment: https://git.openjdk.org/jdk/pull/207#discussion_r1281961206
Re: RFR: 8252204: AArch64: Implement SHA3 accelerator/intrinsic [v11]
On Wed, 21 Oct 2020 23:42:33 GMT, Fei Yang wrote: >> Contributed-by: ard.biesheu...@linaro.org, dong...@huawei.com >> >> This added an intrinsic for SHA3 using aarch64 v8.2 SHA3 Crypto Extensions. >> Reference implementation for core SHA-3 transform using ARMv8.2 Crypto >> Extensions: >> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/arch/arm64/crypto/sha3-ce-core.S?h=v5.4.52 >> >> Trivial adaptation in SHA3. implCompress is needed for the purpose of adding >> the intrinsic. >> For SHA3, we need to pass one extra parameter "digestLength" to the stub for >> the calculation of block size. >> "digestLength" is also used in for the EOR loop before keccak to >> differentiate different SHA3 variants. >> >> We added jtreg tests for SHA3 and used QEMU system emulator which supports >> SHA3 instructions to test the functionality. >> Patch passed jtreg tier1-3 tests with QEMU system emulator. >> Also verified with jtreg tier1-3 tests without SHA3 instructions on >> aarch64-linux-gnu and x86_64-linux-gnu, to make sure that there's no >> regression. >> >> We used one existing JMH test for performance test: >> test/micro/org/openjdk/bench/java/security/MessageDigests.java >> We measured the performance benefit with an aarch64 cycle-accurate simulator. >> Patch delivers 20% - 40% performance improvement depending on specific SHA3 >> digest length and size of the message. >> >> For now, this feature will not be enabled automatically for aarch64. We can >> auto-enable this when it is fully tested on real hardware. But for the >> above testing purposes, this is auto-enabled when the corresponding hardware >> feature is detected. > > Fei Yang has updated the pull request incrementally with one additional > commit since the last revision: > > Add if (isJDK16OrHigher()) check for SHA3 in CheckGraalIntrinsics.java src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 3473: > 3471: __ bcax(v24, __ T16B, v24, v8, v31); > 3472: > 3473: __ ld1r(v31, __ T2D, __ post(rscratch1, 8)); is it intentional to load 16 bytes and post-increment by 8? - PR Review Comment: https://git.openjdk.org/jdk/pull/207#discussion_r1281749663