On Mon, 4 May 2026 19:20:45 GMT, Patricio Chilano Mateo <[email protected]> wrote:
>> David Holmes has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Fix typo > > src/hotspot/share/runtime/handshake.cpp line 718: > >> 716: >> 717: HandshakeOperation* op = get_op(); >> 718: assert(op != nullptr, "Must have an op"); > > There is a subtle race now where we can hit this assert. The sequence of > events go as follows: > 1 - `Thread1` adds handshake operation to `TargetThread's` queue. Sees > `TargetThread's` as handshake safe and proceeds to `claim_handshake()`. > 2 - Before `Thread1` calls `claim_handshake`, `TargetThread’s` processes the > operation and removes it from the queue. > 3 - `TargetThread’s` calls `GetPrimitiveArrayCritical`. Before incrementing > `_jni_deferred_suspension_count`, `Thread2` adds suspend handshake to > `TargetThread’s` queue. > 4 - `Thread1` calls `claim_handshake` and sees that there is a pending > operation that is enabled. > 5 - `TargetThread’s` increments `_jni_deferred_suspension_count` and > continues execution in critical native code. > 6 - `Thread1` calls `can_process_handshake()` and sees `TargetThread` in > native. Calls `get_op` but the only operation in the queue is not enabled. > > I created a test with a couple of artificial delays in the VM which triggers > this assert: > https://github.com/openjdk/jdk/compare/pr/30936...pchilano:jdk:8373839-jni-crit-suspend-stresstest > > One way of fixing this could be to not allow others threads besides the > requester/target to process the suspend handshake. That means > `HandshakeClosure::is_enabled` would need to also take `_requester`. > I’m also thinking if we could move `can_process_handshake` to > `claim_handshake` after grabbing the monitor and before checking the queue, > i.e. we first check if the thread is in a handshake safe state. Thanks for looking at this Patricio. I did not think what you described was possible, but it seems adding to the queue is effectively asynchronous and doesn't care about the target state etc. That breaks things with multiple parties involved. Restricting to the original handshaker does sound like it would address this, but I worry that we are contorting the handshake mechanism too far by trying to impose all these restrictions. I will take another look. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/30936#discussion_r3186732739
