Re: RFR: 8231289: Disentangle JvmtiRawMonitor from ObjectMonitor and clean it up

David Holmes Thu, 03 Oct 2019 06:31:12 -0700



On 3/10/2019 11:11 pm, Daniel D. Daugherty wrote:

On 10/3/19 6:13 AM, David Holmes wrote:
Hi Dan,

On 3/10/2019 3:20 am, Daniel D. Daugherty wrote:
Sorry for the delay in reviewing this one... I've been playing whack-a-mole
with Robbin's MoCrazy test and my AsyncMonitorDeflation bits...
No problem - your contribution made the wait worthwhile :)
On 9/24/19 1:09 AM, David Holmes wrote:
Bug: https://bugs.openjdk.java.net/browse/JDK-8231289
webrev: http://cr.openjdk.java.net/~dholmes/8231289/webrev/
src/hotspot/share/prims/jvmtiEnv.cpp
     Thanks for removing the PROPER_TRANSITIONS stuff. That was old
     and crufty stuff.

src/hotspot/share/prims/jvmtiEnvBase.cpp
     No comments.

src/hotspot/share/prims/jvmtiRawMonitor.cpp
     L39: new (ResourceObj::C_HEAP, mtInternal) GrowableArray<JvmtiRawMonitor*>(1,true);
         nit - need a space between ',' and 'true'.

         Update: leave for your follow-up bug.
Fixed now so I don't forget later. :)
src/hotspot/share/prims/jvmtiRawMonitor.hpp
     No comments.

src/hotspot/share/runtime/objectMonitor.hpp
     Glad I added those 'protected for JvmtiRawMonitor' in one
     of my recent cleanup bugs. Obviously I'll have to merge
     with Async Monitor Deflation. :-)

src/hotspot/share/runtime/thread.cpp
     No comments.

src/hotspot/share/runtime/thread.hpp
     No comments.

src/hotspot/share/services/threadService.cpp
     L397:     waitingToLockMonitor = jt->current_pending_monitor();
     L398:     if (waitingToLockMonitor == NULL) {
     L399:       // we can only be blocked on a raw monitor if not blocked on an ObjectMonitor
     L400:       waitingToLockRawMonitor = jt->current_pending_raw_monitor();
     L401:     }

         JVM/TI has this event handler:

           typedef void (JNICALL *jvmtiEventMonitorContendedEnter)
               (jvmtiEnv *jvmti_env,
                JNIEnv* jni_env,
                jthread thread,
                jobject object);
         This event handler is called after set_current_pending_monitor()
         and if the event handler uses a RawMonitor, then it is possible
         for the thread to show up as blocked on both a Java monitor and
         a JVM/TI RawMonitor.
Oh that is interesting - good catch! So that means the current code is broken because the raw monitor will replace the ObjectMonitor as the pending monitor and then set it back to NULL, thus losing the fact that the thread is actually pending on the ObjectMonitor. And of course while the pending monitor is the raw monitor that totally messes up the deadlock detection as the ObjectMonitor is missing from consideration. :(
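A minimal sketch of the clobbering described above. The types here (Monitor, SketchThread) and the helper functions are hypothetical stand-ins, not the real HotSpot classes; only the set/get accessor names mirror the ones quoted in this thread.

```cpp
#include <cassert>

// Hypothetical stand-in for either kind of monitor; not a HotSpot type.
struct Monitor { const char* name; };

// Hypothetical thread with a single shared "current pending monitor" slot.
struct SketchThread {
  Monitor* pending = nullptr;

  void set_current_pending_monitor(Monitor* m) { pending = m; }
  Monitor* current_pending_monitor() const { return pending; }
};

// Models a raw monitor enter performed inside the contended-enter event
// callback: the slot is set on entry and reset to null once acquired.
inline void raw_monitor_enter(SketchThread& t, Monitor& raw) {
  t.set_current_pending_monitor(&raw);     // overwrites whatever was there
  // ... the thread would block here until the raw monitor is acquired ...
  t.set_current_pending_monitor(nullptr);  // reset to null, not to the old value
}

// Returns what an observer sees once the event callback has returned.
inline Monitor* pending_after_contended_enter_event() {
  static Monitor object_monitor{"ObjectMonitor"};
  static Monitor raw_monitor{"JvmtiRawMonitor"};
  SketchThread t;
  t.set_current_pending_monitor(&object_monitor);  // about to block on a Java monitor
  raw_monitor_enter(t, raw_monitor);               // event handler uses a raw monitor
  return t.current_pending_monitor();              // the ObjectMonitor has been lost
}
```

In this model the observer gets a null pending monitor even though the thread is still logically waiting on the ObjectMonitor.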
This also probably means that you can have a pending raw monitor at the same time as you have a "Blocker" as I'm pretty sure there are various JVM TI event handlers that may execute between the Blocker being set and the actual park. So that would be an additional breakage in the existing code.
Back to my code and I have two problems. The second, which is easy to address, is the deadlock printing code. I'll hoist the waitingToLockRawMonitor chunk to the top so it is executed independent of the waitingToLockMonitor value (which remains in an if/else relationship with the waitingToLockBlocker). But now that we might print two "records" at a time I have to make additional changes to get meaningful output for the current thread (which is handled as common code after the if/else block to finish whichever record was being printed). Also I can no longer use "continue" as the 3 outcomes are not mutually exclusive - so this could get a bit messy. :(
So definitely a v2 webrev on the way.
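The printing restructure described above could look roughly like the following. This is only a sketch under assumptions: BlockedOn, describe and record_count are hypothetical names, the booleans stand in for the three waitingToLock* pointers being non-NULL, and the message strings are illustrative rather than the actual output.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of what one thread is blocked on; not a HotSpot type.
struct BlockedOn {
  bool raw_monitor = false;     // stands in for waitingToLockRawMonitor != NULL
  bool object_monitor = false;  // stands in for waitingToLockMonitor != NULL
  bool blocker = false;         // stands in for waitingToLockBlocker != NULL
};

// Builds the "records" for one thread. The message strings are illustrative.
inline std::string describe(const BlockedOn& b) {
  std::string out;
  // Hoisted: evaluated regardless of the other two conditions.
  if (b.raw_monitor) {
    out += "waiting to lock a raw monitor\n";
  }
  // These two stay mutually exclusive, as in the existing if/else.
  if (b.object_monitor) {
    out += "waiting to lock an object monitor\n";
  } else if (b.blocker) {
    out += "waiting on a j.u.c blocker\n";
  }
  return out;
}

// Counts how many records would be printed for a thread.
inline int record_count(const BlockedOn& b) {
  int n = 0;
  for (char c : describe(b)) {
    if (c == '\n') ++n;
  }
  return n;
}
```

Because the raw monitor check no longer sits in the if/else chain, one thread can produce two records, which is why the common "finish the record" code after the block needs to change as well.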
But before that I need to solve my first problem - and I don't know how. Now that it is apparent that a thread can be blocked on both a raw monitor and an ObjectMonitor at the same time, I have no idea how to actually account for this in the deadlock detection code. That code has a while loop that expects to at most find either a locked ObjectMonitor or j.u.c Blocker, and it adds the owner thread to the cycle detection, then moves on. But now I can have two different owner threads in the same loop iteration. I don't know how to account for that.
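To illustrate why a second owner per iteration does not fit, here is a deliberately simplified sketch of a chain-walking cycle check. It is not the threadService.cpp code: SketchThread, SketchLock and chain_has_cycle are hypothetical, and the visited set is an assumption standing in for whatever bookkeeping the real loop uses.

```cpp
#include <cassert>
#include <unordered_set>

struct SketchThread;

// Hypothetical lock with at most one owner.
struct SketchLock {
  SketchThread* owner = nullptr;
};

// Hypothetical thread blocked on at most ONE lock at a time.
struct SketchThread {
  SketchLock* waiting_on = nullptr;
};

// Follows thread -> lock -> owner -> lock -> ... and reports a cycle if the
// chain comes back to a thread that was already visited.
inline bool chain_has_cycle(const SketchThread* start) {
  std::unordered_set<const SketchThread*> seen;
  const SketchThread* t = start;
  while (t != nullptr) {
    if (!seen.insert(t).second) return true;   // revisited: cycle
    const SketchLock* lock = t->waiting_on;    // exactly one successor edge
    if (lock == nullptr) return false;         // not blocked: chain ends
    t = lock->owner;                           // move on to the single owner
  }
  return false;                                // lock had no owner
}

// Two threads each waiting on a lock owned by the other.
inline bool demo_two_thread_deadlock() {
  SketchThread a, b;
  SketchLock x, y;
  x.owner = &a;
  y.owner = &b;
  a.waiting_on = &y;
  b.waiting_on = &x;
  return chain_has_cycle(&a);
}

// One thread waiting on a lock whose owner is not blocked.
inline bool demo_no_deadlock() {
  SketchThread a, b;
  SketchLock y;
  y.owner = &b;
  a.waiting_on = &y;
  return chain_has_cycle(&a);
}
```

The loop has exactly one "next thread" per step. A thread blocked on two locks with two different owners would need two successors, which turns the walk into a general graph search rather than a chain.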
Given that it seems to me that the current code is already broken if we encounter these conditions, then perhaps all I can do is handle the other cases, where the blocking reasons are mutually exclusive, and not try to fix things? i.e. leave lines #434 to #440 as they are in webrev v1 - which implies no change to line #398 except the comment... ??
Ouch. Sorry I didn't mean to throw such a large monkey wrench into the
mix... I skimmed the JVM/TI spec again looking for anything that might
help ease the situation, but had no luck.

Perhaps approach it from a slightly different perspective...

     If both waitingToLockMonitor and waitingToLockRawMonitor are set
     on the same thread, then waitingToLockRawMonitor should take
     precedence since we are not yet truly blocked on the
     waitingToLockMonitor condition. After all, we did continue to execute
     and that got us into code that set waitingToLockRawMonitor.

I don't see that. The raw monitor usage is incidental to the event callback. There could be a real deadlock with the ObjectMonitor, but we're just transiently blocked on the raw monitor.

     Will that help from a deadlock detection point of view? Maybe.
     If our target thread is holding some other lock before it logically
     blocked on waitingToLockMonitor, it is still useful to report that
     it is now blocked on waitingToLockRawMonitor. After all the block
     on waitingToLockRawMonitor is also contributing to the deadlock.

It isn't the reporting that is the issue, it is the actual deadlock detection logic. That code as written can't accommodate being blocked on two different "locks" with potentially two different owners, at the same time. To me it just breaks the whole approach that has been taken to detect cycles in the locking.


Thanks,
David
-----

     Once the developer solves the RawMonitor deadlock cause, there may
     still be another deadlock related to waitingToLockMonitor, but we've
     reported one layer of the onion.

Food for thought...

Dan
test/hotspot/jtreg/vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/rawmnwait005.cpp
     No comments.
Thumbs up! The only non-nit I have is the setting of waitingToLockRawMonitor
on L400 and the corresponding comment on L399. Everything else is a nit.

I don't need to see a new webrev.
If only that were true :(

Thanks,
David
Thanks for tackling this disentangle issue!

Dan
The earlier attempt to rewrite JvmtiRawMonitor as a simple wrapper around PlatformMonitor proved not so simple and ultimately had too many issues due to the need to support Thread.interrupt.
I'd previously stated in the bug report:
"In the worst-case I suppose we could just copy ObjectMonitor to a new class and have JvmtiRawMonitor continue to extend that (with some additional minor adjustments) - or even just inline it all as needed."
but hadn't looked at it in detail. Richard Reingruber did look at it and pointed out that it is actually quite simple - we barely use any actual code from ObjectMonitor, mainly just the state. So thanks Richard! :)
So this change basically copies or moves anything needed by JvmtiRawMonitor from ObjectMonitor, breaking the connection between the two. We also copy and simplify ObjectWaiter, turning it into a QNode internal class. There is then a lot of cleanup that was applied (and a lot more that could still be done):
- Removed the never implemented/used PROPER_TRANSITIONS ifdefs
- Fixed the disconnect between the types of non-JavaThreads expected by the upper layer code and lower layer code
- cleaned up and simplified return codes
- consolidated code that is identical for JavaThreads and non-JavaThreads (e.g. notify/notifyAll).
- removed use of TRAPS/THREAD where not appropriate and replaced with "Thread * Self" in the style of the rest of the code
- changed recursions to be int rather than intptr_t (a "fixme" in the ObjectMonitor code)
I have not changed the many style flaws with this code:
- Capitalized names
- extra spaces before ;
- ...
but could do so if needed. I wanted to try and keep it more obvious that the fundamental functional code is actually unmodified.
There is one aspect that requires further explanation: the notion of current pending monitor. The "current pending monitor" is stored in the Thread and used by a number of introspection APIs for things like finding monitors, doing deadlock detection, etc. The JvmtiRawMonitor code would also set/clear itself as "current pending monitor". Most uses of the current pending monitor actually, explicitly or implicitly, ignore the case when the monitor is a JvmtiRawMonitor (observed by the fact the mon->object() query returns NULL). The exception to that is deadlock detection where raw monitors are at least partially accounted for. To preserve that I added the notion of "current pending raw monitor" and updated the deadlock detection code to use that.
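As a rough sketch of the split described above: the types below are hypothetical stand-ins, not the HotSpot classes. The getter names follow the ones quoted in the review; set_current_pending_raw_monitor is an assumed name for illustration.

```cpp
#include <cassert>

// Hypothetical stand-ins; not the HotSpot classes.
struct SketchObjectMonitor {};
struct SketchRawMonitor {};

// Thread with one slot per monitor kind, so neither can clobber the other.
struct SketchThread {
  SketchObjectMonitor* pending_monitor = nullptr;
  SketchRawMonitor* pending_raw_monitor = nullptr;

  void set_current_pending_monitor(SketchObjectMonitor* m) { pending_monitor = m; }
  SketchObjectMonitor* current_pending_monitor() const { return pending_monitor; }

  // Assumed setter name, for illustration only.
  void set_current_pending_raw_monitor(SketchRawMonitor* m) { pending_raw_monitor = m; }
  SketchRawMonitor* current_pending_raw_monitor() const { return pending_raw_monitor; }
};

// Sets the object monitor slot, then sets and clears the raw monitor slot,
// and checks that the object monitor slot is untouched throughout.
inline bool object_monitor_survives_raw_enter() {
  SketchObjectMonitor om;
  SketchRawMonitor rm;
  SketchThread t;

  t.set_current_pending_monitor(&om);
  t.set_current_pending_raw_monitor(&rm);
  const bool both_visible = t.current_pending_monitor() == &om &&
                            t.current_pending_raw_monitor() == &rm;

  t.set_current_pending_raw_monitor(nullptr);  // raw monitor acquired
  return both_visible &&
         t.current_pending_monitor() == &om &&
         t.current_pending_raw_monitor() == nullptr;
}
```

With separate slots, introspection code that only cares about ObjectMonitors never sees a raw monitor, and deadlock detection can query the raw slot explicitly.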
The test:
test/hotspot/jtreg/vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/rawmnwait005.cpp
was updated because I'd noticed previously that it was the only test that used interrupt with raw monitors, but was in fact broken: the test thread is a daemon thread so the main thread could terminate the VM immediately after the interrupt() call, thus you would never know if the interruption actually worked as expected.
Testing:
 - tiers 1 - 3
 - vmTestbase/nsk/monitoring/  (for deadlock detection**)
 - vmTestbase/nsk/jdwp
 - vmTestbase/nsk/jdb/
 - vmTestbase/nsk/jdi/
 - vmTestbase/nsk/jvmti/
 - serviceability/jvmti/
 - serviceability/jdwp
 - JDK: java/lang/management
** There are no existing deadlock related tests involving JvmtiRawMonitor. It would be interesting/useful to add them to the existing nsk/monitoring tests that cover synchronized and JNI locking. But it's a non-trivial enhancement that I don't really have time to do.
Thanks,
David
-----
