Re: RFR: 8330852: All callers of JvmtiEnvBase::get_threadOop_and_JavaThread should pass current thread explicitly [v4]
> Some cleanup related to JvmtiEnvBase::get_threadOop_and_JavaThread method > > Testing: tier1-6 Alex Menkov has updated the pull request incrementally with three additional commits since the last revision: - update - Revert "renamed current_thread to current" This reverts commit d5d614bcf0861466acd695296e974d2253f84c9f. - Revert "renamed current_thread tp current" This reverts commit 4602632221044aa754a1bc8d11e7a3e9a0092590. - Changes: - all: https://git.openjdk.org/jdk/pull/18986/files - new: https://git.openjdk.org/jdk/pull/18986/files/46026322..9be24a4a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=18986=03 - incr: https://webrevs.openjdk.org/?repo=jdk=18986=02-03 Stats: 122 lines in 2 files changed: 0 ins; 0 del; 122 mod Patch: https://git.openjdk.org/jdk/pull/18986.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18986/head:pull/18986 PR: https://git.openjdk.org/jdk/pull/18986
Re: RFR: 8330852: All callers of JvmtiEnvBase::get_threadOop_and_JavaThread should pass current thread explicitly [v3]
On Tue, 30 Apr 2024 23:48:02 GMT, Alex Menkov wrote: >> Some cleanup related to JvmtiEnvBase::get_threadOop_and_JavaThread method >> >> Testing: tier1-6 > > Alex Menkov has updated the pull request incrementally with one additional > commit since the last revision: > > renamed current_thread tp current ok, I'll revert last update and rename current_thread to current only in few places where new variable is introduced - PR Comment: https://git.openjdk.org/jdk/pull/18986#issuecomment-2091927214
Re: RFR: 8330852: All callers of JvmtiEnvBase::get_threadOop_and_JavaThread should pass current thread explicitly [v3]
On Tue, 30 Apr 2024 23:48:02 GMT, Alex Menkov wrote: >> Some cleanup related to JvmtiEnvBase::get_threadOop_and_JavaThread method >> >> Testing: tier1-6 > > Alex Menkov has updated the pull request incrementally with one additional > commit since the last revision: > > renamed current_thread tp current Given the number of `current` renames (which distracts from the core change in this PR), and given that not all occurrences were renamed (only those that were touched), I think it would be best to leave the rename for a separate PR. - PR Comment: https://git.openjdk.org/jdk/pull/18986#issuecomment-2091868298
Re: RFR: 8330146: assert(!_thread->is_in_any_VTMS_transition()) failed
On Thu, 2 May 2024 10:07:35 GMT, Serguei Spitsyn wrote: > Any event posting code except CFLH, ClassPrepare and ClassLoad events has a > conditional return in case if the event is posted during a VTMS transition. > The CFLH, ClassPrepare and ClassLoad event posting code has just an assert > instead. The ClassPrepare and ClassLoad events also have a conditional return > in a case of temporary VTMS transition. > This update is to align the CFLH, ClassPrepare and ClassLoad events with all > other events in this area. > > Testing: > - TBD: submit mach5 tiers 1-6 I looked at other places where the following is already in place: `return; // no events should be posted if thread is in any VTMS transition` I can understand the rationale for not sending events in those cases (like breakpoint, singlestep, and methodentry). However, loss ClassPrepare and ClassLoad events seems a bit more significant for profilers that might be trying to accurately track all class loading. It seems maybe we should instead be trying to avoid these events by preloading the classes as was suggested as an option in the CR. - PR Review: https://git.openjdk.org/jdk/pull/19054#pullrequestreview-2037080382
Re: RFR: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container)
diff --git a/src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java b/src/jdk.attach/linux/classes/sun/tools/attach/V irtualMachineImpl.java index 81d4fd259ed..74bd60c791d 100644 --- a/src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java +++ b/src/jdk.attach/linux/classes/sun/tools/attach/VirtualMachineImpl.java @@ -34,6 +34,7 @@ import java.nio.file.Path; import java.nio.file.Paths; import java.nio.file.Files; +import java.util.Optional; import static java.nio.charset.StandardCharsets.UTF_8; @@ -47,7 +48,21 @@ public class VirtualMachineImpl extends HotSpotVirtualMachine { // will not be able to find all Hotspot processes. // Any changes to this needs to be synchronized with HotSpot. private static final String tmpdir = "/tmp"; + + private static final Optional MOUNT_NS; + + static { + Path mountns = null; + try { + mountns = Files.readSymbolicLink(Path.of("/proc/self/ns/mnt")); + } catch (IOException ioe) { + } finally { + MOUNT_NS = Optional.ofNullable(mountns); + } + } + String socket_path; + /** * Attaches to the target VM */ @@ -236,7 +251,18 @@ private File createAttachFile(int pid, int ns_pid) throws IOException { private String findTargetProcessTmpDirectory(int pid, int ns_pid) throws IOException { String root; - if (pid != ns_pid) { + + Optional tgtMountNS = Optional.empty(); + + try { + tgtMountNS = Optional.ofNullable(Files.readSymbolicLink(Path.of("/proc", Integer.toString(pid), "ns", "mnt"))); + } catch (IOException _) { + // do nothing... + } + + final boolean sameMountNS = MOUNT_NS.isPresent() && tgtMountNS.isPresent() && MOUNT_NS.equals(tgtMountNS); + + if (!sameMountNS || pid != ns_pid) { // A process may not exist in the same mount namespace as the caller, e.g. // if we are trying to attach to a JVM process inside a container. // Instead, attach relative to the target root filesystem as exposed by @@ -248,11 +274,11 @@ private String findTargetProcessTmpDirectory(int pid, int ns_pid) throws IOExcep "of target process %d", procRootDirectory, pid)); } - root = procRootDirectory + "/" + tmpdir; - } else { - root = tmpdir; - } - return root; + return procRootDirectory + "/" + tmpdir; + } else if (sameMountNS) { + return tmpdir; + } else + throw new IOException(String.format("target process:%d and this do not share common mount namespace for: %s attach faile d", pid, tmpdir)); } /*
Re: RFR: 8308033: The jcmd thread dump related tests should test virtual threads [v2]
On Thu, 2 May 2024 03:57:42 GMT, Jaikiran Pai wrote: >> test/hotspot/jtreg/serviceability/tmtools/jstack/DaemonThreadTest.java line >> 77: >> >>> 75: t.setDaemon(false); >>> 76: } >>> 77: testThread(t, ""); >> >> I think the following is a bit clearer: >> >> >> // Virtual threads are always daemon threads. Therefore if the current >> thread is a >> // virtual thread, then NormalThread will be a daemon thread and we need to >> explicitly >> // make it not a daemon thread. >> >> >> You don't actually need the isVirtual() check, especially with the comment >> explaining things. However, you could also forgo all this logic and the >> comment by just sticking setDaemon(false) in the NormalThread constructor. > > Done - I've updated the PR to just call `setDaemon(false)` from the > constructor of `NormalThread` instead of this conditional logic. > > When I initially added this logic, I was trying to avoid doing any changes > that could impact the testing when running under platform threads. But > thinking about this test's context, it merely wants to verify that the jstack > output correctly presents daemon/non-daemon status, so explicitly setting a > "normal" thread to non-daemon shouldn't impact what this test was originally > verifying. Yes, I also briefly wondered if somehow this would be subverting the testing of whether the thread was not daemon by default, but that's not what is being tested here. >> test/jdk/sun/tools/jstack/BasicJStackTest.java line 53: >> >>> 51: // print the stacktraces of virtual threads. >>> 52: throw new SkippedException("skipping test since current >>> thread is a virtual thread"); >>> 53: } >> >> I wonder if we can rely on the virtual thread always be mounted. If so, we >> could revisit this test after >> [JDK-8330846](https://bugs.openjdk.org/browse/JDK-8330846) is implemented. > > Hello Chris, if I understand that JBS issue correctly, then by default jstack > (and other tools) will start including stacktraces of virtual threads that > currently are mounted on a carrier thread (and thus haven't been parked) in > the output. > > If that's the case, then I think when that JBS issue is implemented we can > remove this conditional check and the skipping of the test since the virtual > thread would end up appearing in the thread dump because that's the thread > which would be currently launching the `jstack` process and waiting (through > a `ReentrantLock`'s `Condition` object) for the jstack process to complete. > > To revisit this test when JDK-8330846 is implemented, do you want me to file > an issue? Or did you mean we should wait doing the changes in this PR until > JDK-8330846 is implemented? I think just noting this in [JDK-8330846](https://bugs.openjdk.org/browse/JDK-8330846) should be good enough. - PR Review Comment: https://git.openjdk.org/jdk/pull/19016#discussion_r1588463768 PR Review Comment: https://git.openjdk.org/jdk/pull/19016#discussion_r1588462528
Re: RFR: 8308033: The jcmd thread dump related tests should test virtual threads [v3]
On Thu, 2 May 2024 03:57:25 GMT, Jaikiran Pai wrote: >> Can I please get a review of these test-only changes which proposes to >> remove the `test/jdk/sun/tools/jcmd/JcmdOutputEncodingTest.java` and >> `test/jdk/sun/tools/jstack/BasicJStackTest.java` tests from >> `ProblemList-Virtual.txt`? >> >> When jtreg was enhanced to allow running the tests from within a virtual >> (main) thread, some tests were problem listed since they either were failing >> at that time or the test code would require additional work to make it work >> when the current thread is a virtual thread. The >> `test/jdk/sun/tools/jcmd/JcmdOutputEncodingTest.java` and >> `test/jdk/sun/tools/jstack/BasicJStackTest.java` are 2 such threads which >> require special handling when the current thread is a virtual thread. >> >> `test/jdk/sun/tools/jcmd/JcmdOutputEncodingTest.java` has been updated to >> use a different command to dump stacktraces when the test runs in a virtual >> thread. I went back and looked at the original issue for which this test was >> introduced and based on that, using a different command to dump the >> stacktraces shouldn't impact what the test was originally intended to test. >> >> `test/jdk/sun/tools/jstack/BasicJStackTest.java` has been updated to be >> skipped when the current thread is the virtual thread. This is because the >> test exercises the `jstack` tool which doesn't print stacktraces of virtual >> threads and thus the test isn't applicable for virtual threads. >> >> I've run these tests with this change, both with platform threads and >> virtual threads in our CI and the tests complete without failures. > > Jaikiran Pai has updated the pull request incrementally with one additional > commit since the last revision: > > Chris' review - DaemonThreadTest.java, no need for checking if thread is > virtual Marked as reviewed by cjplummer (Reviewer). - PR Review: https://git.openjdk.org/jdk/pull/19016#pullrequestreview-2037020145
Re: RFR: 8328083: degrade virtual thread support for GetObjectMonitorUsage [v2]
On Thu, 2 May 2024 21:36:27 GMT, Chris Plummer wrote: >> Thank you for the comment. In fact, I don't know how to fix it. >> Replacing of NUMBER_OF_ENTERING_THREADS/NUMBER_OF_WAITING_THREADS in >> comments with `expEnteringCount/expWaitingCount` does not make sense to me. >> The comments are about the tested pattern, not about the real values. >> Please, let me know if you have any suggestion on fixing. > > expEnteringCount/expWaitingCount contain the tested patterns. I don't see why > they can't just replace NUMBER_OF_ENTERING_THREADS/NUMBER_OF_WAITING_THREADS > in the comments also. In fact it is confusing if you don't because code right > below the comments references expEnteringCount/expWaitingCount, not > NUMBER_OF_ENTERING_THREADS/NUMBER_OF_WAITING_THREADS. ...and there are also comments above with this issue. - PR Review Comment: https://git.openjdk.org/jdk/pull/19030#discussion_r1588456637
Re: RFR: 8328083: degrade virtual thread support for GetObjectMonitorUsage [v2]
On Thu, 2 May 2024 07:33:09 GMT, Serguei Spitsyn wrote: >> The fix is to degrade virtual threads support in the JVM TI >> `GetObjectMonitorUsage` function so that it is specified to only return an >> owner when the owner is a platform thread. Also, virtual threads are not >> listed in the both `waiters` and `notify_waiters` lists returned in the >> `jvmtiMonitorUsage` structure. Java 19 re-specified a number of JVMTI >> functions and events for virtual threads, we missed this one. >> >> The main motivation for degrading it now is that the object monitor >> implementation is being updated to allow virtual threads unmount while >> owning monitors. It would add overhead to record monitor usage when >> freezing/unmount, overhead that couldn't be tied to a JVMTI capability as >> the capability can be enabled at any time. >> >> `GetObjectMonitorUsage` was broken for 20+ years >> ([8247972](https://bugs.openjdk.org/browse/JDK-8247972)) without bug reports >> so it seems unlikely that the function is widely used. Degrading it to only >> return an owner when the owner is a platform thread has no compatibility >> impact for tooling that uses it in conjunction with `HotSpot` thread dumps >> or `ThreadMXBean`. >> >> One other point about `GetObjectMonitorUsage` is that it pre-dates >> j.u.concurrent in Java 5 so it can't be used to get a full picture of the >> lock usage in a program. >> >> The specs of the impacted `JDWP ObjectReference.MonitorInfo` command and the >> JDI `ObjectReference` `ownerThread()`, `waitingThreads()` and `entryCount()` >> methods are updated to match the JVM TI spec. >> >> Also, please, review the related CSR and Release Note: >> - CSR: [8331422](https://bugs.openjdk.org/browse/JDK-8331422): degrade >> virtual thread support for GetObjectMonitorUsage >> - RN: [8331465](https://bugs.openjdk.org/browse/JDK-8331465): Release Note: >> degrade virtual thread support for GetObjectMonitorUsage >> >> Testing: >> - tested impacted and updated tests locally >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one > additional commit since the last revision: > > review: Corrections in: 1) JVMTI/JDWP spec; 2) test vthread checks; 3) test > comments src/hotspot/share/prims/jvmti.xml line 8280: > 8278: > 8279: The number of platform threads waiting to own this monitor, > or 0 > 8280: if only virtual threads are waiting or no threads are > waiting This is now exactly the same as `waiter_count` above. I don't think this is what you intended. - PR Review Comment: https://git.openjdk.org/jdk/pull/19030#discussion_r1588453160
Re: RFR: 8328083: degrade virtual thread support for GetObjectMonitorUsage [v2]
On Wed, 1 May 2024 20:42:16 GMT, Chris Plummer wrote: >> Serguei Spitsyn has updated the pull request incrementally with one >> additional commit since the last revision: >> >> review: Corrections in: 1) JVMTI/JDWP spec; 2) test vthread checks; 3) >> test comments > > src/java.se/share/data/jdwp/jdwp.spec line 1622: > >> 1620: (threadObject owner "The platform thread owning this >> monitor, or nullptr " >> 1621: "if owned` by a virtual thread or not >> owned.") >> 1622: (int entryCount "The number of times the owning platform >> thread has entered the monitor.") > > See the comment I left for the JVMTI spec. We should be more complete in the > explanation here, explaining how it is 0 for virtual threads. I don't think this has been resolved. - PR Review Comment: https://git.openjdk.org/jdk/pull/19030#discussion_r1588459462
Re: RFR: 8328083: degrade virtual thread support for GetObjectMonitorUsage [v2]
On Thu, 2 May 2024 06:45:39 GMT, Serguei Spitsyn wrote: >> test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java >> line 257: >> >>> 255: // Correct the expected values for the virtual thread case. >>> 256: int expEnteringCount = isVirtual ? 0 : >>> NUMBER_OF_ENTERING_THREADS; >>> 257: int expWaitingCount = isVirtual ? 0 : >>> NUMBER_OF_WAITING_THREADS; >> >> There are comments below that still reference NUMBER_OF_ENTERING_THREADS >> and NUMBER_OF_WAITING_THREADS. > > Thank you for the comment. In fact, I don't know how to fix it. > Replacing of NUMBER_OF_ENTERING_THREADS/NUMBER_OF_WAITING_THREADS in comments > with `expEnteringCount/expWaitingCount` does not make sense to me. The > comments are about the tested pattern, not about the real values. Please, let > me know if you have any suggestion on fixing. expEnteringCount/expWaitingCount contain the tested patterns. I don't see why they can't just replace NUMBER_OF_ENTERING_THREADS/NUMBER_OF_WAITING_THREADS in the comments also. In fact it is confusing if you don't because code right below the comments references expEnteringCount/expWaitingCount, not NUMBER_OF_ENTERING_THREADS/NUMBER_OF_WAITING_THREADS. - PR Review Comment: https://git.openjdk.org/jdk/pull/19030#discussion_r1588440431
Re: RFR: 8330694: Rename 'HeapRegion' to 'G1HeapRegion' [v6]
On Sat, 20 Apr 2024 03:04:23 GMT, Chris Plummer wrote: >> Lei Zaakjyu has updated the pull request incrementally with one additional >> commit since the last revision: >> >> review > > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/PointerLocation.java > line 131: > >> 129: } >> 130: >> 131: public G1HeapRegion getHeapRegion() { > > Do we want to rename to getG1HeapRegion? It seems you agreed with this suggestion but the change was never made. > test/hotspot/jtreg/serviceability/sa/TestG1HeapRegion.java line 62: > >> 60: agent.attach(Integer.parseInt(pid)); >> 61: G1CollectedHeap heap = >> (G1CollectedHeap)VM.getVM().getUniverse().heap(); >> 62: G1HeapRegion hr = heap.hrm().heapRegionIterator().next(); > > "g1HeapRegionIterator"? And here also it seems you agreed with this suggestion but the change was never made. - PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1588433200 PR Review Comment: https://git.openjdk.org/jdk/pull/18871#discussion_r1588434351
Re: RFR: 8328866: Add raw monitor rank support to the debug agent.
On Thu, 2 May 2024 02:40:36 GMT, Alex Menkov wrote: >> This PR adds ranked monitor support to the debug agent. The debug agent has >> a large number of monitors, and it's really hard to know which order to grab >> them in, and for that matter which monitors might already be held at any >> given moment. By imposing a rank on each monitor, we can check to make sure >> they are always grabbed in the order of their rank. Having this in place >> when I was working on >> [JDK-8324868](https://bugs.openjdk.org/browse/JDK-8324868) would have made >> it much easier to detect a deadlock that was occuring, and the reason for >> it. That's what motivated me to do this work >> >> There were 2 or 3 minor rank issues discovered as a result of these changes. >> I also learned a lot about some of the more ugly details of the locking >> support in the process. >> >> Tested with the following on all supported platforms and with virtual >> threads: >> >> com/sun/jdi >> vmTestbase/nsk/jdi >> vmTestbase/nsk/jdb >> vmTestbase/nsk/jdwp >> >> Still need to run tier2 and tier5. >> >> Details of the changes follow in the first comment. > > src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c line 656: > >> 654: commonRef_lock(); >> 655: if (gdata->rankedMonitors) { >> 656: dbgRawMonitor_lock(); > > What is this call for? I think dbgRawMonitor_lock/dbgRawMonitor_unlock should > not be used outside of util.c (I'd remove them from util.h) After calling getLocks(), there may still be attempts to enter locks. The locks should already be held. I filed [JDK-8330193](https://bugs.openjdk.org/browse/JDK-8330193) to assert that. However, even when entering an already entered lock, we still need to grab dbgRawMonitor. I found that out the hard way when there were deadlocks on dbgRawMonitor because it was not entered it here. > src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1204: > >> 1202: // Need to lock during initialization so verifyMonitorRank() can >> be guaranteed that >> 1203: // if the monitor field is set, then the monitor if fully >> initialized. More specifically, >> 1204: // that the rank field has been set. > > rank for all monitors can be set in `util_initialize()`: > > for (i = 0; i < NUM_DEBUG_RAW_MONITORS; i++) { > dbg_monitors[i]->rank = i; > } Ok. > src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1231: > >> 1229: } >> 1230: >> 1231: dbg_monitor->monitor = NULL; > > I think it would be better to protect this with dbgRawMonitor I don't see how that helps. Access to the field is not protected. > src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1243: > >> 1241: error = JVMTI_FUNC_PTR(gdata->jvmti,GetThreadInfo) >> 1242: (gdata->jvmti, thread, ); >> 1243: return info.name; > > Need to delete JNI reference info.thread_group and info.jthreadGroup (if not > NULL) Did you mean info.context_class_loader, not info.jthreadGroup? It looks like elsewhere when we call GetThreadInfo it is bracketed with the WITH_LOCAL_REFS macro to automatically push/pop a local refs frame, although it is only setting the size to 1, which seems like a bug. I'll do the same here. > src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1262: > >> 1260: for (i = 0; i < NUM_DEBUG_RAW_MONITORS; i++) { >> 1261: DebugRawMonitor* dbg_monitor = _monitors[i]; >> 1262: if (dbg_monitor == NULL) { > > Should be `dbg_monitor->monitor` ok > src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1279: > >> 1277: return; // This assert is not reliable if the VM is exiting >> 1278: } >> 1279: jthread current_thread = threadControl_currentThread(); > > Looks like all callers of the `assertIsCurrentThread()` have reference to the > current thread. Maybe pass it as argument? Ok. There were a few changes w.r.t. when this API is called, as was when the callers have access to the current thread, so I that's why I ended up not passing it in. > src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1281: > >> 1279: jthread current_thread = threadControl_currentThread(); >> 1280: if (!isSameObject(env, thread, current_thread)) { >> 1281: tty_message("ERROR: Threads not the same: %p %p\n", thread, >> current_thread); > > Not sure the addresses provide useful information. Maybe print thread names? ok > src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1289: > >> 1287: >> 1288: static void >> 1289: verifyMonitorRank(JNIEnv *env, DebugRawMonitorRank rank, jthread >> thread) > > please add a comment that the function should be called under dbgRawMonitor > lock (also for other functions which assume this) ok. I was also thinking of setting a static that contains the jthread of the thread that holds the lock, and we can assert it is held. > src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1294: > >> 1292: // has a higher rank than the monitor we are about to enter. >> 1293:
Re: RFR: 8322043: HeapDumper should use parallel dump by default [v6]
On Tue, 30 Apr 2024 18:54:09 GMT, Alex Menkov wrote: >> The fix makes VM heap dumping parallel by default. >> `jcmd GC.heap_dump` and `jmap -dump` had parallel dumping by default, the >> fix affects `HotSpotDiagnosticMXBean.dumpHeap()`, >> `-XX:+HeapDumpBeforeFullGC`, `-XX:+HeapDumpAfterFullGC` and >> `-XX:+HeapDumpOnOutOfMemoryError`. >> >> Testing: >> - manually tested different heap dump scenarios with `-Xlog:heapdump`; >> - tier1,tier2,hs-tier5-svc; >> - all reg.tests that use heap dump. > > Alex Menkov has updated the pull request incrementally with one additional > commit since the last revision: > > removed space-only change in diagnosticCommand.cpp Currently heap dump is always performed in 2 phases (even if single threaded heap dump is requested or SerialGC is used). This is required to correctly handle unmounted virtual threads. This was implemented in jdk22, so your testing shows regression comparing with jdk21u (which does not dump unmounted vthreads and references from them). Note also that you use `-XX:+HeapDumpAfterFullGC` in your testing and look at total heap dump time. Main advantage of the 2 phase dumping is decreasing STW time (merge phase is performed on the current thread outside of safepoint). I.e. the idea is not to decrease total heap dump time, but to minimize JVM freeze during dumping. But this does not work in case of -XX:+HeapDumpBeforeFullGC and -XX:+HeapDumpAfterFullGC because heap dumping is requested inside safepoint, so merge stage is also performed in safepoint too (I think it's possible to fix it so merge is performed on some other thread, but I'm not sure it worth it). - PR Comment: https://git.openjdk.org/jdk/pull/18748#issuecomment-2091683903
Re: RFR: 8328866: Add raw monitor rank support to the debug agent.
On Wed, 1 May 2024 21:32:46 GMT, Chris Plummer wrote: > This PR adds ranked monitor support to the debug agent. The debug agent has a > large number of monitors, and it's really hard to know which order to grab > them in, and for that matter which monitors might already be held at any > given moment. By imposing a rank on each monitor, we can check to make sure > they are always grabbed in the order of their rank. Having this in place when > I was working on [JDK-8324868](https://bugs.openjdk.org/browse/JDK-8324868) > would have made it much easier to detect a deadlock that was occuring, and > the reason for it. That's what motivated me to do this work > > There were 2 or 3 minor rank issues discovered as a result of these changes. > I also learned a lot about some of the more ugly details of the locking > support in the process. > > Tested with the following on all supported platforms and with virtual threads: > > com/sun/jdi > vmTestbase/nsk/jdi > vmTestbase/nsk/jdb > vmTestbase/nsk/jdwp > > Still need to run tier2 and tier5. > > Details of the changes follow in the first comment. src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c line 656: > 654: commonRef_lock(); > 655: if (gdata->rankedMonitors) { > 656: dbgRawMonitor_lock(); What is this call for? I think dbgRawMonitor_lock/dbgRawMonitor_unlock should not be used outside of util.c (I'd remove them from util.h) src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1204: > 1202: // Need to lock during initialization so verifyMonitorRank() can be > guaranteed that > 1203: // if the monitor field is set, then the monitor if fully > initialized. More specifically, > 1204: // that the rank field has been set. rank for all monitors can be set in `util_initialize()`: for (i = 0; i < NUM_DEBUG_RAW_MONITORS; i++) { dbg_monitors[i]->rank = i; } src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1231: > 1229: } > 1230: > 1231: dbg_monitor->monitor = NULL; I think it would be better to protect this with dbgRawMonitor src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1243: > 1241: error = JVMTI_FUNC_PTR(gdata->jvmti,GetThreadInfo) > 1242: (gdata->jvmti, thread, ); > 1243: return info.name; Need to delete JNI reference info.thread_group and info.jthreadGroup (if not NULL) src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1262: > 1260: for (i = 0; i < NUM_DEBUG_RAW_MONITORS; i++) { > 1261: DebugRawMonitor* dbg_monitor = _monitors[i]; > 1262: if (dbg_monitor == NULL) { Should be `dbg_monitor->monitor` src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1279: > 1277: return; // This assert is not reliable if the VM is exiting > 1278: } > 1279: jthread current_thread = threadControl_currentThread(); Looks like all callers of the `assertIsCurrentThread()` have reference to the current thread. Maybe pass it as argument? src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1281: > 1279: jthread current_thread = threadControl_currentThread(); > 1280: if (!isSameObject(env, thread, current_thread)) { > 1281: tty_message("ERROR: Threads not the same: %p %p\n", thread, > current_thread); Not sure the addresses provide useful information. Maybe print thread names? src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1289: > 1287: > 1288: static void > 1289: verifyMonitorRank(JNIEnv *env, DebugRawMonitorRank rank, jthread thread) please add a comment that the function should be called under dbgRawMonitor lock (also for other functions which assume this) src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1294: > 1292: // has a higher rank than the monitor we are about to enter. > 1293: DebugRawMonitorRank i; > 1294: for (i = 0; i < NUM_DEBUG_RAW_MONITORS; i++) { No need to check monitors from the beginning. Should be enough to check starting from `min(rank, FIRST_LEAF_DEBUG_RAW_MONITOR)` src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1470: > 1468: > 1469: // Assert that the current thread owns this monitor. > 1470: assertIsCurrentThread(getEnv(), dbg_monitor->ownerThread); reading ownerThread (and other) fields without dbgRawMonitor is not MT-safe (it's not atomic) - PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r158613 PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1588259933 PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1586968454 PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1586970362 PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1586973059 PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1586983415 PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1586984623 PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1588206582 PR Review Comment:
Re: RFR: 8328083: degrade virtual thread support for GetObjectMonitorUsage [v2]
On Thu, 2 May 2024 07:33:09 GMT, Serguei Spitsyn wrote: >> The fix is to degrade virtual threads support in the JVM TI >> `GetObjectMonitorUsage` function so that it is specified to only return an >> owner when the owner is a platform thread. Also, virtual threads are not >> listed in the both `waiters` and `notify_waiters` lists returned in the >> `jvmtiMonitorUsage` structure. Java 19 re-specified a number of JVMTI >> functions and events for virtual threads, we missed this one. >> >> The main motivation for degrading it now is that the object monitor >> implementation is being updated to allow virtual threads unmount while >> owning monitors. It would add overhead to record monitor usage when >> freezing/unmount, overhead that couldn't be tied to a JVMTI capability as >> the capability can be enabled at any time. >> >> `GetObjectMonitorUsage` was broken for 20+ years >> ([8247972](https://bugs.openjdk.org/browse/JDK-8247972)) without bug reports >> so it seems unlikely that the function is widely used. Degrading it to only >> return an owner when the owner is a platform thread has no compatibility >> impact for tooling that uses it in conjunction with `HotSpot` thread dumps >> or `ThreadMXBean`. >> >> One other point about `GetObjectMonitorUsage` is that it pre-dates >> j.u.concurrent in Java 5 so it can't be used to get a full picture of the >> lock usage in a program. >> >> The specs of the impacted `JDWP ObjectReference.MonitorInfo` command and the >> JDI `ObjectReference` `ownerThread()`, `waitingThreads()` and `entryCount()` >> methods are updated to match the JVM TI spec. >> >> Also, please, review the related CSR and Release Note: >> - CSR: [8331422](https://bugs.openjdk.org/browse/JDK-8331422): degrade >> virtual thread support for GetObjectMonitorUsage >> - RN: [8331465](https://bugs.openjdk.org/browse/JDK-8331465): Release Note: >> degrade virtual thread support for GetObjectMonitorUsage >> >> Testing: >> - tested impacted and updated tests locally >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one > additional commit since the last revision: > > review: Corrections in: 1) JVMTI/JDWP spec; 2) test vthread checks; 3) test > comments Thank you, Alan. I've updated the CSR with the recent diffs. Also, added a statement about separate deprecation plans. - PR Comment: https://git.openjdk.org/jdk/pull/19030#issuecomment-2091156003
Re: RFR: 8331573: Rename CollectedHeap::is_gc_active to be explicitly about STW GCs
On Thu, 2 May 2024 14:40:35 GMT, Aleksey Shipilev wrote: > `CollectedHeap::is_gc_active()` is confusing, since its name implies _any_ GC > phase is running, while it actually only covers the STW GCs. It would be good > to rename it for clarity. The freed-up name, `is_gc_active` could then be > repurposed to track any (concurrent or STW) GC phase running. That would be > useful to resolve [JDK-8331572](https://bugs.openjdk.org/browse/JDK-8331572). > > Doing this rename separately guarantees we have caught and renamed all > current uses. > > Additional testing: > - [ ] Linux AArch64 server fastdebug, `all` LGTM - Marked as reviewed by zgu (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/19064#pullrequestreview-2036411170
Re: RFR: 8331573: Rename CollectedHeap::is_gc_active to be explicitly about STW GCs
On Thu, 2 May 2024 17:23:21 GMT, Aleksey Shipilev wrote: >> src/hotspot/share/gc/parallel/psParallelCompact.cpp line 1270: >> >>> 1268: >>> 1269: ParallelScavengeHeap* heap = ParallelScavengeHeap::heap(); >>> 1270: assert(!heap->is_stw_gc_active(), "not reentrant"); >> >> While reading this I see that all these "not reentrant" asserts seems >> redundant given that we already do these checks in `IsSTWGCActiveMark`. >> Brownies points if you get rid of them. ;) > > Ah, hm. Indeed! Separate PR? There is some light cleanup in G1 that can be > associated with it. This PR would keep with just a mechanical rename. Sounds like a good idea. >> src/hotspot/share/gc/parallel/psParallelCompact.cpp line 1493: >> >>> 1491: PCAddThreadRootsMarkingTaskClosure(uint worker_id) : >>> _worker_id(worker_id) { } >>> 1492: void do_thread(Thread* thread) { >>> 1493: assert(ParallelScavengeHeap::heap()->is_stw_gc_active(), "called >>> outside gc"); >> >> Should this be updated to "called outside gc pause" as you did in >> `G1CollectedHeap::pin_object`? The same comment goes for the other >> occurrences below. > > I deliberately stopped myself from doing this for Parallel GC code, where > every GC is STW GC :) I can change to "GC pause" if you want. Ah, I see. I wouldn't mind if it were changed to include "pause", but I'm also OK with you leaving it as is. - PR Review Comment: https://git.openjdk.org/jdk/pull/19064#discussion_r1588019866 PR Review Comment: https://git.openjdk.org/jdk/pull/19064#discussion_r1588018382
Re: RFR: 8331573: Rename CollectedHeap::is_gc_active to be explicitly about STW GCs
On Thu, 2 May 2024 17:04:44 GMT, Stefan Karlsson wrote: >> `CollectedHeap::is_gc_active()` is confusing, since its name implies _any_ >> GC phase is running, while it actually only covers the STW GCs. It would be >> good to rename it for clarity. The freed-up name, `is_gc_active` could then >> be repurposed to track any (concurrent or STW) GC phase running. That would >> be useful to resolve >> [JDK-8331572](https://bugs.openjdk.org/browse/JDK-8331572). >> >> Doing this rename separately guarantees we have caught and renamed all >> current uses. >> >> Additional testing: >> - [ ] Linux AArch64 server fastdebug, `all` > > src/hotspot/share/gc/parallel/psParallelCompact.cpp line 1270: > >> 1268: >> 1269: ParallelScavengeHeap* heap = ParallelScavengeHeap::heap(); >> 1270: assert(!heap->is_stw_gc_active(), "not reentrant"); > > While reading this I see that all these "not reentrant" asserts seems > redundant given that we already do these checks in `IsSTWGCActiveMark`. > Brownies points if you get rid of them. ;) Ah, hm. Indeed! Separate PR? There is some light cleanup in G1 that can be associated with it. This PR would keep with just a mechanical rename. - PR Review Comment: https://git.openjdk.org/jdk/pull/19064#discussion_r1588015870
Re: RFR: 8331573: Rename CollectedHeap::is_gc_active to be explicitly about STW GCs
On Thu, 2 May 2024 14:40:35 GMT, Aleksey Shipilev wrote: > `CollectedHeap::is_gc_active()` is confusing, since its name implies _any_ GC > phase is running, while it actually only covers the STW GCs. It would be good > to rename it for clarity. The freed-up name, `is_gc_active` could then be > repurposed to track any (concurrent or STW) GC phase running. That would be > useful to resolve [JDK-8331572](https://bugs.openjdk.org/browse/JDK-8331572). > > Doing this rename separately guarantees we have caught and renamed all > current uses. > > Additional testing: > - [ ] Linux AArch64 server fastdebug, `all` Marked as reviewed by stefank (Reviewer). - PR Review: https://git.openjdk.org/jdk/pull/19064#pullrequestreview-2036349526
Re: RFR: 8331573: Rename CollectedHeap::is_gc_active to be explicitly about STW GCs
On Thu, 2 May 2024 16:56:11 GMT, Stefan Karlsson wrote: >> `CollectedHeap::is_gc_active()` is confusing, since its name implies _any_ >> GC phase is running, while it actually only covers the STW GCs. It would be >> good to rename it for clarity. The freed-up name, `is_gc_active` could then >> be repurposed to track any (concurrent or STW) GC phase running. That would >> be useful to resolve >> [JDK-8331572](https://bugs.openjdk.org/browse/JDK-8331572). >> >> Doing this rename separately guarantees we have caught and renamed all >> current uses. >> >> Additional testing: >> - [ ] Linux AArch64 server fastdebug, `all` > > src/hotspot/share/gc/parallel/psParallelCompact.cpp line 1493: > >> 1491: PCAddThreadRootsMarkingTaskClosure(uint worker_id) : >> _worker_id(worker_id) { } >> 1492: void do_thread(Thread* thread) { >> 1493: assert(ParallelScavengeHeap::heap()->is_stw_gc_active(), "called >> outside gc"); > > Should this be updated to "called outside gc pause" as you did in > `G1CollectedHeap::pin_object`? The same comment goes for the other > occurrences below. I deliberately stopped myself from doing this for Parallel GC code, where every GC is STW GC :) I can change to "GC pause" if you want. - PR Review Comment: https://git.openjdk.org/jdk/pull/19064#discussion_r1587999105
Re: RFR: 8331573: Rename CollectedHeap::is_gc_active to be explicitly about STW GCs
On Thu, 2 May 2024 14:40:35 GMT, Aleksey Shipilev wrote: > `CollectedHeap::is_gc_active()` is confusing, since its name implies _any_ GC > phase is running, while it actually only covers the STW GCs. It would be good > to rename it for clarity. The freed-up name, `is_gc_active` could then be > repurposed to track any (concurrent or STW) GC phase running. That would be > useful to resolve [JDK-8331572](https://bugs.openjdk.org/browse/JDK-8331572). > > Doing this rename separately guarantees we have caught and renamed all > current uses. > > Additional testing: > - [ ] Linux AArch64 server fastdebug, `all` Seems like a good change. src/hotspot/share/gc/parallel/psParallelCompact.cpp line 1270: > 1268: > 1269: ParallelScavengeHeap* heap = ParallelScavengeHeap::heap(); > 1270: assert(!heap->is_stw_gc_active(), "not reentrant"); While reading this I see that all these "not reentrant" asserts seems redundant given that we already do these checks in `IsSTWGCActiveMark`. Brownies points if you get rid of them. ;) src/hotspot/share/gc/parallel/psParallelCompact.cpp line 1493: > 1491: PCAddThreadRootsMarkingTaskClosure(uint worker_id) : > _worker_id(worker_id) { } > 1492: void do_thread(Thread* thread) { > 1493: assert(ParallelScavengeHeap::heap()->is_stw_gc_active(), "called > outside gc"); Should this be updated to "called outside gc pause" as you did in `G1CollectedHeap::pin_object`? The same comment goes for the other occurrences below. - Changes requested by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/19064#pullrequestreview-2036315901 PR Review Comment: https://git.openjdk.org/jdk/pull/19064#discussion_r1587988542 PR Review Comment: https://git.openjdk.org/jdk/pull/19064#discussion_r1587974562
RFR: 8331573: Rename CollectedHeap::is_gc_active to be explicitly about STW GCs
`CollectedHeap::is_gc_active()` is confusing, since its name implies _any_ GC phase is running, while it actually only covers the STW GCs. It would be good to rename it for clarity. The freed-up name, `is_gc_active` could then be repurposed to track any (concurrent or STW) GC phase running. That would be useful to resolve [JDK-8331572](https://bugs.openjdk.org/browse/JDK-8331572). Doing this rename separately guarantees we have caught and renamed all current uses. Additional testing: - [ ] Linux AArch64 server fastdebug, `all` - Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/19064/files Webrev: https://webrevs.openjdk.org/?repo=jdk=19064=00 Issue: https://bugs.openjdk.org/browse/JDK-8331573 Stats: 64 lines in 27 files changed: 0 ins; 2 del; 62 mod Patch: https://git.openjdk.org/jdk/pull/19064.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19064/head:pull/19064 PR: https://git.openjdk.org/jdk/pull/19064
Re: RFR: 8330694: Rename 'HeapRegion' to 'G1HeapRegion' [v6]
> follow up 8267941 Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: review - Changes: - all: https://git.openjdk.org/jdk/pull/18871/files - new: https://git.openjdk.org/jdk/pull/18871/files/f4e066c4..92d0df4d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=18871=05 - incr: https://webrevs.openjdk.org/?repo=jdk=18871=04-05 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/18871.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18871/head:pull/18871 PR: https://git.openjdk.org/jdk/pull/18871
Re: RFR: 8330694: Rename 'HeapRegion' to 'G1HeapRegion' [v5]
> follow up 8267941 Lei Zaakjyu has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: - Merge branch 'master' into JDK-8330694 - fix indentation - also tidy up - tidy up - rename - Changes: https://git.openjdk.org/jdk/pull/18871/files Webrev: https://webrevs.openjdk.org/?repo=jdk=18871=04 Stats: 995 lines in 124 files changed: 0 ins; 4 del; 991 mod Patch: https://git.openjdk.org/jdk/pull/18871.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/18871/head:pull/18871 PR: https://git.openjdk.org/jdk/pull/18871
Re: RFR: 8328083: degrade virtual thread support for GetObjectMonitorUsage [v2]
On Thu, 2 May 2024 07:33:09 GMT, Serguei Spitsyn wrote: >> The fix is to degrade virtual threads support in the JVM TI >> `GetObjectMonitorUsage` function so that it is specified to only return an >> owner when the owner is a platform thread. Also, virtual threads are not >> listed in the both `waiters` and `notify_waiters` lists returned in the >> `jvmtiMonitorUsage` structure. Java 19 re-specified a number of JVMTI >> functions and events for virtual threads, we missed this one. >> >> The main motivation for degrading it now is that the object monitor >> implementation is being updated to allow virtual threads unmount while >> owning monitors. It would add overhead to record monitor usage when >> freezing/unmount, overhead that couldn't be tied to a JVMTI capability as >> the capability can be enabled at any time. >> >> `GetObjectMonitorUsage` was broken for 20+ years >> ([8247972](https://bugs.openjdk.org/browse/JDK-8247972)) without bug reports >> so it seems unlikely that the function is widely used. Degrading it to only >> return an owner when the owner is a platform thread has no compatibility >> impact for tooling that uses it in conjunction with `HotSpot` thread dumps >> or `ThreadMXBean`. >> >> One other point about `GetObjectMonitorUsage` is that it pre-dates >> j.u.concurrent in Java 5 so it can't be used to get a full picture of the >> lock usage in a program. >> >> The specs of the impacted `JDWP ObjectReference.MonitorInfo` command and the >> JDI `ObjectReference` `ownerThread()`, `waitingThreads()` and `entryCount()` >> methods are updated to match the JVM TI spec. >> >> Also, please, review the related CSR and Release Note: >> - CSR: [8331422](https://bugs.openjdk.org/browse/JDK-8331422): degrade >> virtual thread support for GetObjectMonitorUsage >> - RN: [8331465](https://bugs.openjdk.org/browse/JDK-8331465): Release Note: >> degrade virtual thread support for GetObjectMonitorUsage >> >> Testing: >> - tested impacted and updated tests locally >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one > additional commit since the last revision: > > review: Corrections in: 1) JVMTI/JDWP spec; 2) test vthread checks; 3) test > comments The update to the API specs looks okay. I think it was just a cut & paste error that the wrong text was copied into the description of the waiter_count and notify_waiter_count fields. I assume you'll update the CSR so it has the updated text. - PR Comment: https://git.openjdk.org/jdk/pull/19030#issuecomment-2090636854
RE: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container)
Thank you, Larry - that /proc//ns/mnt test makes a lot of sense to me, and I learned something new today! Jim -Original Message- From: serviceability-dev On Behalf Of Laurence Cable Sent: Wednesday, May 1, 2024 8:44 PM To: serviceability-dev@openjdk.org Subject: Re: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container) **Warning! This email originated from outside the organization. Do not open attachments unless you recognize the sender. If you suspect this email is malicious, use the "Report Email" button. Replies to this email will go to serviceability-dev-r...@openjdk.org .** just to demonstrate: $ docker run -it --name=js1 openjdk:17.0.1-jdk /bin/jshell ... $ docker run -it --name js2 --pid=container:js1 openjdk:17.0.1 /bin/jshell $ docker exec -it js1 bash bash-4.4# ls /tmp/hsperfdata_root 1 26 bash-4.4# readlink /proc/26/ns/pid pid:[4026532751] bash-4.4# readlink /proc/26/ns/mnt mnt:[4026532747] bash-4.4# exit [lpgc@arran ~]$ docker exec -it js2 bash bash-4.4# ls /tmp/hsperfdata_root 107 82 bash-4.4# readlink /proc/107/ns/pid pid:[4026532751] bash-4.4# readlink /proc/107/ns/mnt mnt:[4026532941] you will note that the JVM pid: 26 and 107 occupy the same pid namespace (4026532751) but occupy different mnt namespaces (4026532747, 4026532941) therefore attempting to attach via '/tmp' will fail, /proc//root/tmp must be used to rendezvous - Larry On 5/1/24 2:03 PM, Doyle, James, K wrote: > Hi Sebastian, > >> I think I can confirm that there is a regression. > Thanks for reproducing the regression, your test makes sense to me, and I > think it is similar to the scenario we have with Kubernetes debug containers > (separate filesystems, but same PID namespace). > > I noticed some of the other recent Pull Request comments on > https://github.com/openjdk/jdk/pull/17628: > >> should it not be comparing pid namespace ids and not pids? > and wanted to give a little feedback. I think more refined approaches to > figuring out whether the target JVM is in the same PID namespace make sense > and could be an improvement, but it's still different from figuring out > whether the target JVM has the same filesystem (specifically, I guess, the > filesystem containing /tmp or java.io.tmpdir). That seems like the crucial > thing for deciding what socket file path to read, and whether /tmp is > sufficient or /proc//root/tmp is needed. I can think of a couple > different approaches to the filesystem issue: > > 1. There is some Linux kernel information that can be obtained about > the jcmd process and the target JVM process to figure out unequivocally what > their root filesystems are from the host's point of view, and whether they're > the same. (I don't know what this might be, though!) 2. jcmd treats it as a > heuristic and attempts each way during the socket file read - first > /proc//root/tmp and then /tmp. > 3. jcmd has some option or environment variable where the user can tell it > the socket file path. > > Do you agree that these are the types of choices available? > > Thanks, > Jim >
RFR: 8331557: Serial: Refactor SerialHeap::do_collection
It's probably easier to read the new code directly. The two classes in `serialVMOperations` serve as entrance points to invoke young/full GCs. Some previously hidden decisions are made more obvious, e.g. if a young-gc fails (or will probablly fail), fallback to full-gc. Additionally, `StatRecord` is removed, because this kind of info-aggregation should be done outsite VM (by third-party tool). Test: tier1-6 - Commit messages: - s1-do-collect Changes: https://git.openjdk.org/jdk/pull/19056/files Webrev: https://webrevs.openjdk.org/?repo=jdk=19056=00 Issue: https://bugs.openjdk.org/browse/JDK-8331557 Stats: 555 lines in 15 files changed: 123 ins; 347 del; 85 mod Patch: https://git.openjdk.org/jdk/pull/19056.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19056/head:pull/19056 PR: https://git.openjdk.org/jdk/pull/19056
Re: RFR: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container)
On Thu, 2 May 2024 10:13:51 GMT, Sebastian Lövdahl wrote: > 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid > (Kubernetes debug container) Ran the following tests locally: $ make test TEST="jtreg:test/hotspot/jtreg/containers" ... == Test summary == TEST TOTAL PASS FAIL ERROR jtreg:test/hotspot/jtreg/containers 1414 0 0 == TEST SUCCESS $ make test TEST="jtreg:test/hotspot/jtreg/serviceability" ... == Test summary == TEST TOTAL PASS FAIL ERROR jtreg:test/hotspot/jtreg/serviceability 374 374 0 0 == TEST SUCCESS And also manually tested it under various conditions. Basic environment information: slovdahl@ubuntu2204:~/reproducer$ systemd --version systemd 249 (249.11-0ubuntu3.12) +PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY -P11KIT -QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified slovdahl@ubuntu2204:~/reproducer$ sudo apt-get install openjdk-17-jdk-headless slovdahl@ubuntu2204:~/reproducer$ /usr/lib/jvm/java-17-openjdk-amd64/bin/java -version openjdk version "17.0.10" 2024-01-16 OpenJDK Runtime Environment (build 17.0.10+7-Ubuntu-122.04.1) OpenJDK 64-Bit Server VM (build 17.0.10+7-Ubuntu-122.04.1, mixed mode, sharing) slovdahl@ubuntu2204:~/reproducer$ /home/slovdahl/dev/external/jdk/build/linux-x86_64-server-release/images/jdk/bin/java -version openjdk version "23-internal" 2024-09-17 OpenJDK Runtime Environment (build 23-internal-adhoc.slovdahl.jdk) OpenJDK 64-Bit Server VM (build 23-internal-adhoc.slovdahl.jdk, mixed mode, sharing) Reproducer.java used for the target process: import java.io.IOException; import java.net.InetSocketAddress; import java.net.ServerSocket; public class Reproducer { public static void main(String[] args) throws InterruptedException, IOException { int port; if (args.length > 0) { port = Integer.parseInt(args[0]); } else { port = 81; } System.out.println("Hello, World!"); try (var server = new ServerSocket()) { server.bind(new InetSocketAddress("localhost", port)); System.out.println("Bound to port " + port); while (true) { Thread.sleep(1_000L); } } } } systemd unit used for testing the fix for https://bugs.openjdk.org/browse/JDK-8226919: slovdahl@ubuntu2204:~/reproducer$ cat reproducer.service [Service] Type=simple ExecStart=/usr/lib/jvm/java-17-openjdk-amd64/bin/java /home/slovdahl/reproducer/Reproducer.java User=slovdahl Group=slovdahl ReadWritePaths=/tmp AmbientCapabilities=CAP_NET_BIND_SERVICE slovdahl@ubuntu2204:~/reproducer$ sudo cp -a reproducer.service /etc/systemd/system/ slovdahl@ubuntu2204:~/reproducer$ sudo systemctl daemon-reload slovdahl@ubuntu2204:~/reproducer$ sudo systemctl start reproducer.service slovdahl@ubuntu2204:~/reproducer$ sudo systemctl status reproducer.service ● reproducer.service Loaded: loaded (/etc/systemd/system/reproducer.service; static) Active: active (running) since Wed 2024-05-01 20:22:01 EEST; 4s ago Main PID: 1576835 (java) Tasks: 26 (limit: 76968) Memory: 69.2M CPU: 832ms CGroup: /system.slice/reproducer.service └─1576835 /usr/lib/jvm/java-17-openjdk-amd64/bin/java /home/slovdahl/reproducer/Reproducer.java maj 01 20:22:01 ubuntu2204 systemd[1]: Started reproducer.service. maj 01 20:22:01 ubuntu2204 java[1576835]: Hello, World! maj 01 20:22:01 ubuntu2204 java[1576835]: Bound to port 81 slovdahl@ubuntu2204:~/reproducer$ ls -lh /proc/1576835/root ls: cannot read symbolic link '/proc/1576835/root': Permission denied lrwxrwxrwx 1 slovdahl slovdahl 0 maj 1 20:22 /proc/1576835/root slovdahl@ubuntu2204:~/reproducer$ sudo ls -lh /proc/1576835/root lrwxrwxrwx 1 slovdahl slovdahl 0 maj 1 20:22 /proc/1576835/root -> / Fails with vanilla OpenJDK 17: slovdahl@ubuntu2204:~/reproducer$ /usr/lib/jvm/java-17-openjdk-amd64/bin/jcmd 1576835 VM.version 1576835: com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file /proc/1576835/root/tmp/.java_pid1576835: target process 1576835 doesn't respond within 10500ms or HotSpot VM not loaded at jdk.attach/sun.tools.attach.VirtualMachineImpl.(VirtualMachineImpl.java:104) at jdk.attach/sun.tools.attach.AttachProviderImpl.attachVirtualMachine(AttachProviderImpl.java:58) at jdk.attach/com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:207) at jdk.jcmd/sun.tools.jcmd.JCmd.executeCommandForPid(JCmd.java:113) at
Re: RFR: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container)
using pid to namespace comparison is IMO inappropriate/misleading what is being tested is the sharing of a common mount namespace, therefore the test should be comparing the "mnt" namespace ids. Rgds - Larry On 5/2/24 3:22 AM, Sebastian Lövdahl wrote: On Thu, 2 May 2024 10:13:51 GMT, Sebastian Lövdahl wrote: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container) This is a first stab at fixing the regression introduced in #17628. There has been a bit of discussion in https://mail.openjdk.org/pipermail/serviceability-dev/2024-April/055317.html and https://mail.openjdk.org/pipermail/serviceability-dev/2024-May/055364.html about exactly how to solve it, but this first attempt requires quite few changes at least. - PR Comment: https://git.openjdk.org/jdk/pull/19055#issuecomment-2090105068
Re: RFR: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container)
On Thu, 2 May 2024 10:13:51 GMT, Sebastian Lövdahl wrote: > 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid > (Kubernetes debug container) This is a first stab at fixing the regression introduced in #17628. There has been a bit of discussion in https://mail.openjdk.org/pipermail/serviceability-dev/2024-April/055317.html and https://mail.openjdk.org/pipermail/serviceability-dev/2024-May/055364.html about exactly how to solve it, but this first attempt requires quite few changes at least. - PR Comment: https://git.openjdk.org/jdk/pull/19055#issuecomment-2090105068
Re: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container)
On 5/2/24 3:09 AM, Sebastian Lövdahl wrote: Interesting, TIL about /proc//ns. I tried to look for something like that but couldn't find anything relevant in /proc//status. ok So, a pixel perfect solution could compare these IDs to know whether /tmp or /proc//root/tmp should be used. > 2. jcmd treats it as a heuristic and attempts each way during the socket file read - first /proc//root/tmp and then /tmp. This is what I had in mind as well, and what I actually implemented and tested already. I think I'll just go ahead and open a PR with those changes, and we can take it from there. yes, IMO the attachee should always use /proc/self/root/tmp and the attacher should compare /proc//ns/mnt with its /proc/self/ns/mnt with Path::toString comparison, if they match then it is "safe" to use /tmp, if they do not then it must resort to using /proc//ns/mnt if the attacher does not have sufficient privilege to access that due to attachee capabilities etc then the attach will fail Rgds - Larry /Sebastian On 2024-05-02 03:43, Laurence Cable wrote: just to demonstrate: $ docker run -it --name=js1 openjdk:17.0.1-jdk /bin/jshell ... $ docker run -it --name js2 --pid=container:js1 openjdk:17.0.1 /bin/jshell $ docker exec -it js1 bash bash-4.4# ls /tmp/hsperfdata_root 1 26 bash-4.4# readlink /proc/26/ns/pid pid:[4026532751] bash-4.4# readlink /proc/26/ns/mnt mnt:[4026532747] bash-4.4# exit [lpgc@arran ~]$ docker exec -it js2 bash bash-4.4# ls /tmp/hsperfdata_root 107 82 bash-4.4# readlink /proc/107/ns/pid pid:[4026532751] bash-4.4# readlink /proc/107/ns/mnt mnt:[4026532941] you will note that the JVM pid: 26 and 107 occupy the same pid namespace (4026532751) but occupy different mnt namespaces (4026532747, 4026532941) therefore attempting to attach via '/tmp' will fail, /proc//root/tmp must be used to rendezvous - Larry On 5/1/24 2:03 PM, Doyle, James, K wrote: Hi Sebastian, I think I can confirm that there is a regression. Thanks for reproducing the regression, your test makes sense to me, and I think it is similar to the scenario we have with Kubernetes debug containers (separate filesystems, but same PID namespace). I noticed some of the other recent Pull Request comments on https://github.com/openjdk/jdk/pull/17628: should it not be comparing pid namespace ids and not pids? and wanted to give a little feedback. I think more refined approaches to figuring out whether the target JVM is in the same PID namespace make sense and could be an improvement, but it's still different from figuring out whether the target JVM has the same filesystem (specifically, I guess, the filesystem containing /tmp or java.io.tmpdir). That seems like the crucial thing for deciding what socket file path to read, and whether /tmp is sufficient or /proc//root/tmp is needed. I can think of a couple different approaches to the filesystem issue: 1. There is some Linux kernel information that can be obtained about the jcmd process and the target JVM process to figure out unequivocally what their root filesystems are from the host's point of view, and whether they're the same. (I don't know what this might be, though!) 2. jcmd treats it as a heuristic and attempts each way during the socket file read - first /proc//root/tmp and then /tmp. 3. jcmd has some option or environment variable where the user can tell it the socket file path. Do you agree that these are the types of choices available? Thanks, Jim
RFR: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container)
8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container) - Commit messages: - 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container) Changes: https://git.openjdk.org/jdk/pull/19055/files Webrev: https://webrevs.openjdk.org/?repo=jdk=19055=00 Issue: https://bugs.openjdk.org/browse/JDK-8327114 Stats: 69 lines in 2 files changed: 34 ins; 2 del; 33 mod Patch: https://git.openjdk.org/jdk/pull/19055.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19055/head:pull/19055 PR: https://git.openjdk.org/jdk/pull/19055
RFR: 8330146: assert(!_thread->is_in_any_VTMS_transition()) failed
Any event posting code except CFLH, ClassPrepare and ClassLoad events has a conditional return in case if the event is posted during a VTMS transition. The CFLH, ClassPrepare and ClassLoad event posting code has just an assert instead. The ClassPrepare and ClassLoad events also have a conditional return in a case of temporary VTMS transition. This update is to align the CFLH, ClassPrepare and ClassLoad events with all other events in this area. Testing: - TBD: submit mach5 tiers 1-6 - Commit messages: - 8330146: assert(!_thread->is_in_any_VTMS_transition()) failed Changes: https://git.openjdk.org/jdk/pull/19054/files Webrev: https://webrevs.openjdk.org/?repo=jdk=19054=00 Issue: https://bugs.openjdk.org/browse/JDK-8330146 Stats: 9 lines in 1 file changed: 2 ins; 2 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/19054.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19054/head:pull/19054 PR: https://git.openjdk.org/jdk/pull/19054
Re: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container)
Interesting, TIL about /proc//ns. I tried to look for something like that but couldn't find anything relevant in /proc//status. So, a pixel perfect solution could compare these IDs to know whether /tmp or /proc//root/tmp should be used. > 2. jcmd treats it as a heuristic and attempts each way during the socket file read - first /proc//root/tmp and then /tmp. This is what I had in mind as well, and what I actually implemented and tested already. I think I'll just go ahead and open a PR with those changes, and we can take it from there. /Sebastian On 2024-05-02 03:43, Laurence Cable wrote: just to demonstrate: $ docker run -it --name=js1 openjdk:17.0.1-jdk /bin/jshell ... $ docker run -it --name js2 --pid=container:js1 openjdk:17.0.1 /bin/jshell $ docker exec -it js1 bash bash-4.4# ls /tmp/hsperfdata_root 1 26 bash-4.4# readlink /proc/26/ns/pid pid:[4026532751] bash-4.4# readlink /proc/26/ns/mnt mnt:[4026532747] bash-4.4# exit [lpgc@arran ~]$ docker exec -it js2 bash bash-4.4# ls /tmp/hsperfdata_root 107 82 bash-4.4# readlink /proc/107/ns/pid pid:[4026532751] bash-4.4# readlink /proc/107/ns/mnt mnt:[4026532941] you will note that the JVM pid: 26 and 107 occupy the same pid namespace (4026532751) but occupy different mnt namespaces (4026532747, 4026532941) therefore attempting to attach via '/tmp' will fail, /proc//root/tmp must be used to rendezvous - Larry On 5/1/24 2:03 PM, Doyle, James, K wrote: Hi Sebastian, I think I can confirm that there is a regression. Thanks for reproducing the regression, your test makes sense to me, and I think it is similar to the scenario we have with Kubernetes debug containers (separate filesystems, but same PID namespace). I noticed some of the other recent Pull Request comments on https://github.com/openjdk/jdk/pull/17628: should it not be comparing pid namespace ids and not pids? and wanted to give a little feedback. I think more refined approaches to figuring out whether the target JVM is in the same PID namespace make sense and could be an improvement, but it's still different from figuring out whether the target JVM has the same filesystem (specifically, I guess, the filesystem containing /tmp or java.io.tmpdir). That seems like the crucial thing for deciding what socket file path to read, and whether /tmp is sufficient or /proc//root/tmp is needed. I can think of a couple different approaches to the filesystem issue: 1. There is some Linux kernel information that can be obtained about the jcmd process and the target JVM process to figure out unequivocally what their root filesystems are from the host's point of view, and whether they're the same. (I don't know what this might be, though!) 2. jcmd treats it as a heuristic and attempts each way during the socket file read - first /proc//root/tmp and then /tmp. 3. jcmd has some option or environment variable where the user can tell it the socket file path. Do you agree that these are the types of choices available? Thanks, Jim -- Sebastian Lövdahl Software Architect, Hibox Systems - https://www.hibox.tv sebastian.lovd...@hibox.tv
Re: RFR: 8322043: HeapDumper should use parallel dump by default [v6]
On Tue, 30 Apr 2024 18:54:09 GMT, Alex Menkov wrote: >> The fix makes VM heap dumping parallel by default. >> `jcmd GC.heap_dump` and `jmap -dump` had parallel dumping by default, the >> fix affects `HotSpotDiagnosticMXBean.dumpHeap()`, >> `-XX:+HeapDumpBeforeFullGC`, `-XX:+HeapDumpAfterFullGC` and >> `-XX:+HeapDumpOnOutOfMemoryError`. >> >> Testing: >> - manually tested different heap dump scenarios with `-Xlog:heapdump`; >> - tier1,tier2,hs-tier5-svc; >> - all reg.tests that use heap dump. > > Alex Menkov has updated the pull request incrementally with one additional > commit since the last revision: > > removed space-only change in diagnosticCommand.cpp Late to the party here, sorry. Are there motivational performance improvements that we get from enabling parallel heap dumps by default? Asking because [JDK-8319650](https://bugs.openjdk.org/browse/JDK-8319650) and [JDK-8320924](https://bugs.openjdk.org/browse/JDK-8320924) improved single-threaded heap dump performance drastically, and the I/O looked to be a bottleneck going forward. Would parallel heap dump take _more_ I/O for writing chunks first, and then combining them into large file? Do we know that parallel heap dump still wins a lot? I don't think it does all that much. Here is a simple run using the workload from [JDK-8319650](https://bugs.openjdk.org/browse/JDK-8319650) on 10-core machine: $ for I in `seq 1 5`; do java -XX:+UseParallelGC -XX:+HeapDumpAfterFullGC -Xms8g -Xmx8g HeapDump.java 2>&1 | grep created; rm *.hprof; done # jdk mainline with this fix (parallel) Heap dump file created [1897668110 bytes in 1.401 secs] Heap dump file created [1897667841 bytes in 1.354 secs] Heap dump file created [1897668050 bytes in 1.440 secs] Heap dump file created [1897668101 bytes in 1.366 secs] Heap dump file created [1897668101 bytes in 1.345 secs] # jdk mainline without this fix (sequential) Heap dump file created [1897668092 bytes in 2.314 secs] Heap dump file created [1897668092 bytes in 2.384 secs] Heap dump file created [1897668092 bytes in 2.269 secs] Heap dump file created [1897668092 bytes in 2.274 secs] Heap dump file created [1897667816 bytes in 2.282 secs] This is less than 2x improvement, even though we took 3 threads to heap dump: [1.645s][info][heapdump] Requested dump threads 3, active dump threads 9, actual dump threads 3, parallelism true [1.649s][info][heapdump] Dump non-objects, 0.0046221 secs [1.651s][info][heapdump] Dump non-objects (part 2), 0.0019995 secs [2.230s][info][heapdump] Dump heap objects in parallel, 0.5850964 secs [2.230s][info][heapdump] Dump heap objects in parallel, 0.5852571 secs [2.230s][info][heapdump] Dump heap objects in parallel, 0.5790543 secs [2.558s][info][heapdump] Merge segmented heap file, 0.3268282 secs [2.863s][info][heapdump] Merge segmented heap file, 0.3047630 secs [3.307s][info][heapdump] Merge segmented heap file, 0.4436261 secs [3.308s][info][heapdump] Merge heap files complete, 1.0766620 secs Heap dump file created [1897667959 bytes in 1.664 secs] And this is on a fast SSD, where I/O is abundant, and there is plenty of space. The sequential heap dump also seems to be regressing against jdk21u-dev, which does: Heap dump file created [1897840374 bytes in 1.071 secs] Heap dump file created [1897840481 bytes in 1.070 secs] Heap dump file created [1897840490 bytes in 1.069 secs] Heap dump file created [1897840481 bytes in 1.073 secs] Heap dump file created [1897840481 bytes in 1.134 secs] I believe that is because the 2-phase heap dump makes excess work for a single-threaded heap dump. Note that the _parallel_ heap dump in current mainline is not even able to catch up with *what we already had* with _sequential_ heap dump. I propose we revert this switch to parallel, fix the sequential heap dump performance, and then reconsider -- with benchmarks -- if we want to switch to parallel. - PR Comment: https://git.openjdk.org/jdk/pull/18748#issuecomment-2089957067
Re: RFR: 8328083: degrade virtual thread support for GetObjectMonitorUsage [v2]
> The fix is to degrade virtual threads support in the JVM TI > `GetObjectMonitorUsage` function so that it is specified to only return an > owner when the owner is a platform thread. Also, virtual threads are not > listed in the both `waiters` and `notify_waiters` lists returned in the > `jvmtiMonitorUsage` structure. Java 19 re-specified a number of JVMTI > functions and events for virtual threads, we missed this one. > > The main motivation for degrading it now is that the object monitor > implementation is being updated to allow virtual threads unmount while owning > monitors. It would add overhead to record monitor usage when > freezing/unmount, overhead that couldn't be tied to a JVMTI capability as the > capability can be enabled at any time. > > `GetObjectMonitorUsage` was broken for 20+ years > ([8247972](https://bugs.openjdk.org/browse/JDK-8247972)) without bug reports > so it seems unlikely that the function is widely used. Degrading it to only > return an owner when the owner is a platform thread has no compatibility > impact for tooling that uses it in conjunction with `HotSpot` thread dumps or > `ThreadMXBean`. > > One other point about `GetObjectMonitorUsage` is that it pre-dates > j.u.concurrent in Java 5 so it can't be used to get a full picture of the > lock usage in a program. > > The specs of the impacted `JDWP ObjectReference.MonitorInfo` command and the > JDI `ObjectReference` `ownerThread()`, `waitingThreads()` and `entryCount()` > methods are updated to match the JVM TI spec. > > Also, please, review the related CSR and Release Note: > - CSR: [8331422](https://bugs.openjdk.org/browse/JDK-8331422): degrade > virtual thread support for GetObjectMonitorUsage > - RN: [8331465](https://bugs.openjdk.org/browse/JDK-8331465): Release Note: > degrade virtual thread support for GetObjectMonitorUsage > > Testing: > - tested impacted and updated tests locally > - tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: Corrections in: 1) JVMTI/JDWP spec; 2) test vthread checks; 3) test comments - Changes: - all: https://git.openjdk.org/jdk/pull/19030/files - new: https://git.openjdk.org/jdk/pull/19030/files/9a3b8192..7465f064 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk=19030=01 - incr: https://webrevs.openjdk.org/?repo=jdk=19030=00-01 Stats: 23 lines in 6 files changed: 6 ins; 2 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/19030.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/19030/head:pull/19030 PR: https://git.openjdk.org/jdk/pull/19030
Re: RFR: 8328083: degrade virtual thread support for GetObjectMonitorUsage
On Wed, 1 May 2024 21:01:16 GMT, Chris Plummer wrote: >> The fix is to degrade virtual threads support in the JVM TI >> `GetObjectMonitorUsage` function so that it is specified to only return an >> owner when the owner is a platform thread. Also, virtual threads are not >> listed in the both `waiters` and `notify_waiters` lists returned in the >> `jvmtiMonitorUsage` structure. Java 19 re-specified a number of JVMTI >> functions and events for virtual threads, we missed this one. >> >> The main motivation for degrading it now is that the object monitor >> implementation is being updated to allow virtual threads unmount while >> owning monitors. It would add overhead to record monitor usage when >> freezing/unmount, overhead that couldn't be tied to a JVMTI capability as >> the capability can be enabled at any time. >> >> `GetObjectMonitorUsage` was broken for 20+ years >> ([8247972](https://bugs.openjdk.org/browse/JDK-8247972)) without bug reports >> so it seems unlikely that the function is widely used. Degrading it to only >> return an owner when the owner is a platform thread has no compatibility >> impact for tooling that uses it in conjunction with `HotSpot` thread dumps >> or `ThreadMXBean`. >> >> One other point about `GetObjectMonitorUsage` is that it pre-dates >> j.u.concurrent in Java 5 so it can't be used to get a full picture of the >> lock usage in a program. >> >> The specs of the impacted `JDWP ObjectReference.MonitorInfo` command and the >> JDI `ObjectReference` `ownerThread()`, `waitingThreads()` and `entryCount()` >> methods are updated to match the JVM TI spec. >> >> Also, please, review the related CSR and Release Note: >> - CSR: [8331422](https://bugs.openjdk.org/browse/JDK-8331422): degrade >> virtual thread support for GetObjectMonitorUsage >> - RN: [8331465](https://bugs.openjdk.org/browse/JDK-8331465): Release Note: >> degrade virtual thread support for GetObjectMonitorUsage >> >> Testing: >> - tested impacted and updated tests locally >> - tested with mach5 tiers 1-6 > > test/hotspot/jtreg/vmTestbase/nsk/jdi/ObjectReference/waitingThreads/waitingthreads002.java > line 167: > >> 165: try { >> 166: List waitingThreads = objRef.waitingThreads(); >> 167: if (waitingThreads.size() != expWaitingCount) { > > I don't see the need for the expWaitingCount bookkeeping. Can't we just > verify that size() is zero if we are using virtual threads? I guess maybe the > reason you took this approach is because you don't know if the threads are > going to be virtual or not until you check them. There is a way to find out, > but it's not that pretty either: > > static final boolean vthreadMode = > "Virtual".equals(System.getProperty("test.thread.factory")); Thank you for the suggestion. Updated with it. - PR Review Comment: https://git.openjdk.org/jdk/pull/19030#discussion_r1587177852
Re: RFR: 8328083: degrade virtual thread support for GetObjectMonitorUsage
On Wed, 1 May 2024 21:11:36 GMT, Chris Plummer wrote: >> The fix is to degrade virtual threads support in the JVM TI >> `GetObjectMonitorUsage` function so that it is specified to only return an >> owner when the owner is a platform thread. Also, virtual threads are not >> listed in the both `waiters` and `notify_waiters` lists returned in the >> `jvmtiMonitorUsage` structure. Java 19 re-specified a number of JVMTI >> functions and events for virtual threads, we missed this one. >> >> The main motivation for degrading it now is that the object monitor >> implementation is being updated to allow virtual threads unmount while >> owning monitors. It would add overhead to record monitor usage when >> freezing/unmount, overhead that couldn't be tied to a JVMTI capability as >> the capability can be enabled at any time. >> >> `GetObjectMonitorUsage` was broken for 20+ years >> ([8247972](https://bugs.openjdk.org/browse/JDK-8247972)) without bug reports >> so it seems unlikely that the function is widely used. Degrading it to only >> return an owner when the owner is a platform thread has no compatibility >> impact for tooling that uses it in conjunction with `HotSpot` thread dumps >> or `ThreadMXBean`. >> >> One other point about `GetObjectMonitorUsage` is that it pre-dates >> j.u.concurrent in Java 5 so it can't be used to get a full picture of the >> lock usage in a program. >> >> The specs of the impacted `JDWP ObjectReference.MonitorInfo` command and the >> JDI `ObjectReference` `ownerThread()`, `waitingThreads()` and `entryCount()` >> methods are updated to match the JVM TI spec. >> >> Also, please, review the related CSR and Release Note: >> - CSR: [8331422](https://bugs.openjdk.org/browse/JDK-8331422): degrade >> virtual thread support for GetObjectMonitorUsage >> - RN: [8331465](https://bugs.openjdk.org/browse/JDK-8331465): Release Note: >> degrade virtual thread support for GetObjectMonitorUsage >> >> Testing: >> - tested impacted and updated tests locally >> - tested with mach5 tiers 1-6 > > test/hotspot/jtreg/serviceability/jvmti/ObjectMonitorUsage/ObjectMonitorUsage.java > line 257: > >> 255: // Correct the expected values for the virtual thread case. >> 256: int expEnteringCount = isVirtual ? 0 : >> NUMBER_OF_ENTERING_THREADS; >> 257: int expWaitingCount = isVirtual ? 0 : >> NUMBER_OF_WAITING_THREADS; > > There are comments below that still reference NUMBER_OF_ENTERING_THREADS and > NUMBER_OF_WAITING_THREADS. Thank you for the comment. In fact, I don't know how to fix it. Replacing with `expEnteringCount/expWaitingCount` does not make sense to me. The comments are about the tested pattern, not about the real values. Please, let me know if you have any suggestion on fixing. - PR Review Comment: https://git.openjdk.org/jdk/pull/19030#discussion_r1587142283
Re: RFR: 8328083: degrade virtual thread support for GetObjectMonitorUsage
On Thu, 2 May 2024 02:49:35 GMT, David Holmes wrote: >> You can drop "the" from "with the JTREG_TEST_THREAD_FACTORY=Virtual" > > And drop "the" from "the GetObjectMonitorUsage". Thank you. Updated. - PR Review Comment: https://git.openjdk.org/jdk/pull/19030#discussion_r1587137513
Re: RFR: 8328083: degrade virtual thread support for GetObjectMonitorUsage
On Thu, 2 May 2024 02:47:19 GMT, David Holmes wrote: >> Okay, thanks. > > Second suggestion is better. "waited by" is not grammatically correct in this > context. Thank you, David. So, the latest version is: The number of platform threads waiting to own this monitor, or 0 if only virtual threads are waiting or no threads are waiting - PR Review Comment: https://git.openjdk.org/jdk/pull/19030#discussion_r1587131816