On 10/3/19 9:55 PM, David Holmes wrote:
<trimming>
On 4/10/2019 10:01 am, Daniel D. Daugherty wrote:
On 10/3/19 7:35 PM, [email protected] wrote:
If I remember correctly it is a scenario where this issue also
comes to the picture:
https://bugs.openjdk.java.net/browse/JDK-8033399
I do not really understand how shared ParkEvent can be
used/consumed by both ObjectMonitor and RawMonitor on the same
thread.
But we observed and investigated this problem several years ago
and Dan finally filed this enhancement.
I still don't see how this is possible as you are not actually
enqueued on the ObjectMonitor when the call out to the event
callback is made. but that is a topic for another email thread. :)
Correct that you cannot be enqueued on the ObjectMonitor when you
make the callback. However, I don't think that was the point of
8033399 when we filed so very long ago...
Quoting a comment from David:
David Holmes added a comment - 2014-01-27 18:34
Could there be multiple places in event handling code that could in
theory consume unparks and so require the re-issue of an unpark()
from different locations in the code?
Seems to me that perhaps raw monitors - given they can be entered
from within the normal monitor code - should have their own _event
object per thread, so that this accidental consumption of unparks
can not occur.
And since then I've decided this isn't actually a problem.
Sounds good to me.
The scenario that comes to mind:
- T1 is contending on an ObjectMonitor and has set waitingToLockMonitor.
- T1 calls the jvmtiEventMonitorContendedEnter event handler that
contends on a JvmtiRawMonitor and has set waitingToLockRawMonitor.
- T1 blocks on the JvmtiRawMonitor and parks.
- T2 is exiting the ObjectMonitor and has picked T1 as the successor
so it unparks T1.
Nope - T1 is not yet enqueued on the ObjectMonitor so it can't be
picked as successor.
Also sounds good.
Only T1 is parked for the JvmtiRawMonitor and
not for the ObjectMonitor. T2 hasn't quite finished exiting the
ObjectMonitor yet... (not sure if this lag is possible)
- T1 has unparked early for the JvmtiRawMonitor and at the same
time T3 is exiting the JvmtiRawMonitor.
- T1 quickly enters the JvmtiRawMonitor and T3 doesn't have to
pick a successor so it doesn't do an unpark.
- T1 finishes the work of the jvmtiEventMonitorContendedEnter and
returns to the caller which is the code that's about to block on
the ObjectMonitor...
- T1 blocks on the ObjectMonitor and parks.
For this to actually be a problem requires that the call out to the
raw monitor code happens inbetween the check for whether T1 needs to
park and the park call. That also is not the case.
Also sounds good.
- T2 finishes exiting the ObjectMonitor... Does T1 get unparked?
I can't remember when T2 does the unpark() relative to dropping
ownership of the ObjectMonitor. If the unpark() is first or if
the _owner field setting to NULL lingers... it's possible for T1
to block and park... with nothing to unpark T1...
Pure, crazy theory here...
However, with David's work on this bug (8231289), this theoretical
problem goes away... That's the only reason for trying to close
this 8033399 sub-thread here...
My work here makes no difference to 8033399 perceived problem. The
same ParkEvent continues to be used by both JvmtiRawMonitor and
ObjectMonitor.
I confused myself into thinking that you either created a
new ParkEvent or didn't use the existing ParkEvent anymore.
My mistake.
Dan
Cheers,
David
Dan
Agreed.
Just wanted to point out it can be related.
Dan filed this RFE and can have more knowledge.
Meanwhile what to do about broken deadlock detection ... :(
It is a good catch from Dan.
Thanks,
Serguei
Thanks,
David
Thanks,
Serguei
This also probably means that you can have a pending raw monitor
at the same time as you have a "Blocker" as I'm pretty sure there
are various JVM TI event handlers that may execute between the
Blocker being set and the actual park. So that would be an
additional breakage in the existing code.
Back to my code and I have two problems. The second, which is
easy to address, is the deadlock printing code. I'll hoist the
waitingToLockRawMonitor chunk to the top so it is executed
independent of the waitingToLockMonitor value (which remains in
an if/else relationship with the waitingToLockBlocker). But now
that we might print two "records" at a time I have to make
additional changes to get meaningful output for the current
thread (which is handled as a common code after the if/else block
to finish whichever record was being printed). Also I can no
longer use "continue" as the 3 outcomes are not mutually
exclusive - so this could get a bit messy. :(
So definitely a v2 webrev on the way.
But before that I need to solve my first problem - and I don't
know how. Now that it is apparent that a thread can be blocked on
both a raw monitor and an ObjectMonitor at the same time, I have
no idea how to actually account for this in the deadlock
detection code. That code has a while loop that expects to at
most find either a locked ObjectMonitor or j.u.c Blocker, and it
adds the owner thread to the cycle detection, then moves on. But
now I can have two different owner threads in the same loop
iteration. I don't know how to account for that.
Given that it seems to me that the current code is already broken
if we encounter these conditions, then perhaps all I can do is
handle the other cases, where the blocking reasons are mutually
exclusive, and not try to fix things? i.e. leave lines #434 to
#440 as they are in webrev v1 - which implies no change to line
#398 except the comment ... ??
test/hotspot/jtreg/vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/rawmnwait005.cpp
No comments.
Thumbs up! The only non-nit I have is the setting of
waitingToLockRawMonitor
on L400 and the corresponding comment on L399. Everything else
is a nit.
I don't need to see a new webrev.
If only that were true :(
Thanks,
David
Thanks for tackling this disentangle issue!
Dan
The earlier attempt to rewrite JvmtiRawMonitor as a simple
wrapper around PlatformMonitor proved not so simple and
ultimately had too many issues due to the need to support
Thread.interrupt.
I'd previously stated in the bug report:
"In the worst-case I suppose we could just copy ObjectMonitor
to a new class and have JvmtiRawMonitor continue to extend that
(with some additional minor adjustments) - or even just inline
it all as needed."
but hadn't looked at it in detail. Richard Reingruber did look
at it and pointed out that it is actually quite simple - we
barely use any actual code from ObjectMonitor, mainly just the
state. So thanks Richard! :)
So this change basically copies or moves anything needed by
JvmtiRawMonitor from ObjectMonitor, breaking the connection
between the two. We also copy and simplify ObjectWaiter,
turning it into a QNode internal class. There is then a lot of
cleanup that was applied (and a lot more that could still be
done):
- Removed the never implemented/used PROPER_TRANSITIONS ifdefs
- Fixed the disconnect between the types of non-JavaThreads
expected by the upper layer code and lower layer code
- cleaned up and simplified return codes
- consolidated code that is identical for JavaThreads and
non-JavaThreads (e.g. notify/notifyAll).
- removed used of TRAPS/THREAD where not appropriate and
replaced with "Thread * Self" in the style of the rest of the code
- changed recursions to be int rather than intptr_t (a "fixme"
in the ObjectMonitor code)
I have not changed the many style flaws with this code:
- Capitalized names
- extra spaces before ;
- ...
but could do so if needed. I wanted to try and keep it more
obvious that the fundamental functional code is actually
unmodified.
There is one aspect that requires further explanation: the
notion of current pending monitor. The "current pending
monitor" is stored in the Thread and used by a number of
introspection APIs for things like finding monitors, doing
deadlock detection, etc. The JvmtiRawMonitor code would also
set/clear itself as "current pending monitor". Most uses of the
current pending monitor actually, explicitly or implicitly,
ignore the case when the monitor is a JvmtiRawMonitor (observed
by the fact the mon->object() query returns NULL). The
exception to that is deadlock detection where raw monitors are
at least partially accounted for. To preserve that I added the
notion of "current pending raw monitor" and updated the
deadlock detection code to use that.
The test:
test/hotspot/jtreg/vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/rawmnwait005.cpp
was updated because I'd noticed previously that it was the only
test that used interrupt with raw monitors, but was in fact
broken: the test thread is a daemon thread so the main thread
could terminate the VM immediately after the interrupt() call,
thus you would never know if the interruption actually worked
as expected.
Testing:
- tiers 1 - 3
- vmTestbase/nsk/monitoring/ (for deadlock detection**)
- vmTestbase/nsk/jdwp
- vmTestbase/nsk/jdb/
- vmTestbase/nsk/jdi/
- vmTestbase/nsk/jvmti/
- serviceability/jvmti/
- serviceability/jdwp
- JDK: java/lang/management
** There are no existing deadlock related tests involving
JvmtiRawMonitor. It would be interesting/useful to add them to
the existing nsk/monitoring tests that cover synchronized and
JNI locking. But it's a non-trivial enhancement that I don't
really have time to do.
Thanks,
David
-----