On Fri, 21 Nov 2025 11:38:49 GMT, Serguei Spitsyn <[email protected]> wrote:

>> Thanks @sspitsyn for diving into the issue. 
>> 
>> With that definition of the deadlock and suspension logic I agree that it 
>> might not be a problem at all. With this being said, is the existing test 
>> `SuspendWithObjectMonitorWait` demonstrating a real-world scenario? 
>> @dcubed-ojdk, what do you think?
>
>> is the existing test SuspendWithObjectMonitorWait demonstrating a real-world 
>> scenario?
> 
> It does not look like one. There could be some motivation to write it, 
> however, e.g. checking some invariants. At least, it seems this test does not 
> enforce any strict rules on the OM implementation and JVMTI events + 
> suspend/resume. :)
> The new tests do not allow the OM implementation to put the `MonitorWaited` 
> event notification at the right spot. Otherwise, I would not object to them.

> @sspitsyn so your position is that it is okay for suspension to cause 
> something to break as long as resuming the suspended thread then fixes 
> things? Does it matter how much time passes?

Suspend/Resume is a debugging feature, normally used in a debugging session, 
and is expected to cause a slowdown. 
It is also known to be somewhat risky to use. So my answer is yes. The length 
of the slowdown does not matter much, as it depends on how long the thread 
stays suspended.

> We have had a lot of discussion about this outside the PR and some of us at 
> least feel there is a distinction between suspending a thread that clearly 
> holds an application level resource (like a monitor) which then prevents 
> other threads from continuing, versus suspending a thread holding a VM 
> internal "resource" that prevents other threads from continuing.

Agreed.

> The design of JVM TI thread suspension actively tries to minimise the ability 
> to hold any internal VM resource whilst suspended, and the current problem 
> seems a variant of that. If we suspend a thread that has not yet acquired a 
> monitor, and inspection of the monitor would show it is not owned, then it 
> seems a bug if other threads trying to acquire that monitor can not make 
> progress.

You are most likely right that the current problem is a variant of that. But I 
disagree with qualifying this issue as a deadlock.
The thread was picked as a successor and then suspended. It feels like it 
should be treated the same as a thread that owns the monitor and is suspended. 
The issue is that the real OM state and the JVMTI state bits do not reflect 
this.
I feel that changing the order of JVMTI OM events and giving up the symmetry 
between `timed-wait` and `notified-wait` is risky and may cause more problems 
than this bug fix is trying to solve. I'm wondering whether there is a way for 
the thread to give up being the OM successor.

> Agreed the tests are completely artificial but there is no way to test this 
> other than to do that.

I agree with this. But the deadlock is in the new tests, not in the OM 
implementation. And yes, constructing a deadlock was needed to demonstrate the 
problem. It is confusing. :)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27040#issuecomment-3569971399
