On Fri, 21 Nov 2025 11:38:49 GMT, Serguei Spitsyn <[email protected]> wrote:
>> Thanks @sspitsyn for diving into the issue.
>>
>> With that definition of the deadlock and suspension logic I agree that it
>> might not be a problem at all. With this being said, is the existing test
>> `SuspendWithObjectMonitorWait` demonstrating a real-world scenario?
>> @dcubed-ojdk, what do you think?
>
>> is the existing test SuspendWithObjectMonitorWait demonstrating a real-world
>> scenario?
>
> It does not look as such. There could be some motivation to write it however,
> e.g. checking some invariants. At least, it seems this test does not enforce
> any strict rules on the OM implementation and JVMTI events + suspend/resume.
> :)
> New tests do not allow for OM implementation to put `MonitorWaited` event
> notification at a right spot. Otherwise, I would not object against them.

> @sspitsyn so your position is that it is okay for suspension to cause
> something to break as long as resuming the suspended thread then fixes
> things? Does it matter how much time passes?

Suspend/Resume is a debugging feature, normally used in a debugging session, and is expected to cause a slowdown. It is also known to be somewhat risky to use. So my answer is yes. How long the slowdown lasts does not matter much, since it depends on how long the thread stays suspended.

> We have had a lot of discussion about this outside the PR and some of us at
> least feel there is a distinction between suspending a thread that clearly
> holds an application level resource (like a monitor) which then prevents
> other threads from continuing, versus suspending a thread holding a VM
> internal "resource" that prevents other threads from continuing.

Agreed.

> The design of JVM TI thread suspension actively tries to minimise the ability
> to hold any internal VM resource whilst suspended, and the current problem
> seems a variant of that. If we suspend a thread that has not yet acquired a
> monitor, and inspection of the monitor would show it is not owned, then it
> seems a bug if other threads trying to acquire that monitor can not make
> progress.

You are most likely right that the current problem is a variant of that. But I disagree with qualifying this issue as a deadlock. The thread was picked as a successor and then suspended. It feels like this should be treated the same as a thread that owns the monitor and is then suspended. The issue is that the real OM state and the JVMTI state bits do not reflect this.

I feel that changing the order of the JVMTI OM events and giving up the symmetry between `timed-wait` and `notified-wait` is risky and may cause more problems than this bug fix is trying to solve. I'm wondering whether there is a way the thread could give up being the OM successor.

> Agreed the tests are completely artificial but there is no way to test this
> other than to do that.

Agree with this. But the deadlock is in the new tests, not in the OM implementation. And yes, constructing a deadlock was needed to demonstrate the problem. It is confusing. :)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27040#issuecomment-3569971399
