On Thu, 20 Nov 2025 23:10:48 GMT, Patricio Chilano Mateo
<[email protected]> wrote:
>> When `ThreadSnapshotFactory::get_thread_snapshot()` captures a snapshot of a
>> virtual thread, it uses `JvmtiVTMSTransitionDisabler` class to disable
>> mount/unmount transitions. However, this only works if a JVMTI agent has
>> attached to the VM, otherwise virtual threads don’t honor the disable
>> request. Since this snapshot mechanism is used by `jcmd Thread.dump_to_file`
>> and `HotSpotDiagnosticMXBean` which don’t require a JVMTI agent to be
>> present, getting the snapshot of a virtual thread in such cases can lead to
>> crashes.
>>
>> This patch moves the transition-disabling mechanism out of JVMTI and into a
>> new class, `MountUnmountDisabler`. The code has been updated so that
>> transitions can be disabled independently of JVMTI, making JVMTI just one
>> user of the API rather than the owner of the mechanism. Here is a summary of
>> the key changes:
>>
>> - Currently when a virtual thread starts a mount/unmount transition we only
>> need to check if `_VTMS_notify_jvmti_events` is set to decide if we need to
>> go to the slow path. With these changes, JVMTI is now only one user of the
>> API, so we instead check the actual transition disabling counters, i.e the
>> per-vthread counter and the global counter. Since these can be set at any
>> time (unlike `_VTMS_notify_jvmti_events` which is only set at startup or
>> during a safepoint in case of late binding agents), we follow the classic
>> Dekker pattern for the required synchronization. That is, the virtual thread
>> sets the “in transition” bits for the carrier and vthread *before* reading
>> the “transition disabled” counters. The thread requesting to disable
>> transitions increments the “transition disabled” counter *before* reading
>> the “in transition” bits.
>> An alternative that avoids the extra fence in `startTransition` would be to
>> place extra overhead on the thread requesting to disable transitions (e.g.
>> using safepoint, handshake-all, or UseSystemMemoryBarrier). Performance
>> analysis show no difference with current mainline so for now I kept this
>> simpler version.
>>
>> - Ending the transition doesn’t need to check if transitions are disabled
>> (equivalent to not need to poll when transitioning from unsafe to safe
>> safepoint state). But we still require to go through the slow path if there
>> is a JVMTI agent present, since we need to check for event posting and JVMTI
>> state rebinding. As such, the end transition follows the same pattern that
>> we have today of only needing to check `_VTMS_notify_jvmti_events`.
>>
>> - The code was previously structured in t...
>
> Patricio Chilano Mateo has updated the pull request incrementally with one
> additional commit since the last revision:
>
> Rename VM methods for endFirstTransition/startFinalTransition
Hi Patricio, this is another significant piece of work. I have taken an initial
pass through trying to digest the main parts - can't comment on the C2 code or
the Java side. I have made a few minor comments/suggestions.
Thanks
src/hotspot/share/prims/jvm.cpp line 3668:
> 3666: if (!DoJVMTIVirtualThreadTransitions) {
> 3667: assert(!JvmtiExport::can_support_virtual_threads(), "sanity check");
> 3668: return;
Does this not still need checking somewhere?
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 162:
> 160: // be executed once we go back to Java. If this is an unmount, the
> handshake that the
> 161: // disabler executed against this carrier thread already provided the
> needed synchronization.
> 162: // This matches the release fence in
> xx_enable_for_one()/xx_enable_for_all().
Subtle. Do we have comments where the fences are to ensure people realize the
fence is serving this purpose?
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 277:
> 275:
> 276: // Start of the critical region. Prevent future memory
> 277: // operations to be ordered before we read the transition flag.
Does this refer to `java_lang_Thread::is_in_VTMS_transition(_vthread())`? If so
perhaps that should internally perform the `load_acquire`?
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 278:
> 276: // Start of the critical region. Prevent future memory
> 277: // operations to be ordered before we read the transition flag.
> 278: // This matches the release fence in end_transition().
Suggestion:
// This pairs with the release fence in end_transition().
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 307:
> 305: // Block while some mount/unmount transitions are in progress.
> 306: // Debug version fails and prints diagnostic information.
> 307: for (JavaThreadIteratorWithHandle jtiwh; JavaThread *jt =
> jtiwh.next(); ) {
This looks very odd, having an assignment in the loop condition check and no
actual loop-update expression.
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 316:
> 314: // operations to be ordered before we read the transition flags.
> 315: // This matches the release fence in end_transition().
> 316: OrderAccess::acquire();
Surely the use of the iterator already provides the necessary ordering
guarantee here as well. ?
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 327:
> 325: // End of the critical section. Prevent previous memory operations to
> 326: // be ordered after we clear the clear the disable transition flag.
> 327: // This matches the equivalent acquire fence in start_transition().
Suggestion:
// This pairs with the acquire in start_transition().
I just realized you are using "fence" to describe release and acquire memory
barrier semantics. Given we have an operation `fence` I find this confusing for
the reader - especially when we also have a `release_store_fence` operation
which might be confused with "release fence".
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 370:
> 368: assert(VTMSTransition_lock->owned_by_self() ||
> SafepointSynchronize::is_at_safepoint(), "Must be locked");
> 369: assert(_global_start_transition_disable_count >= 0, "");
> 370: AtomicAccess::store(&_global_start_transition_disable_count,
> _global_start_transition_disable_count + 1);
Suggestion:
AtomicAccess::inc(&_global_start_transition_disable_count);
src/hotspot/share/runtime/mountUnmountDisabler.cpp line 376:
> 374: assert(VTMSTransition_lock->owned_by_self() ||
> SafepointSynchronize::is_at_safepoint(), "Must be locked");
> 375: assert(_global_start_transition_disable_count > 0, "");
> 376: AtomicAccess::store(&_global_start_transition_disable_count,
> _global_start_transition_disable_count - 1);
Suggestion:
AtomicAccess::dec(&_global_start_transition_disable_count);
src/hotspot/share/runtime/mountUnmountDisabler.hpp line 52:
> 50: // parameter is_SR: suspender or resumer
> 51: MountUnmountDisabler(bool exlusive = false);
> 52: MountUnmountDisabler(oop thread_oop);
What does the comment mean here?
-------------
PR Review: https://git.openjdk.org/jdk/pull/28361#pullrequestreview-3490207826
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2547887801
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548145054
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548157390
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548150552
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548160373
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548161340
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548168223
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548169846
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548170787
PR Review Comment: https://git.openjdk.org/jdk/pull/28361#discussion_r2548174392