Re: RFR: 8231289: Disentangle JvmtiRawMonitor from ObjectMonitor and clean it up

Daniel D. Daugherty Fri, 04 Oct 2019 06:46:54 -0700

On 10/3/19 9:55 PM, David Holmes wrote:

<trimming>
On 4/10/2019 10:01 am, Daniel D. Daugherty wrote:
On 10/3/19 7:35 PM, [email protected] wrote:
If I remember correctly it is a scenario where this issue alsocomes to the picture:
https://bugs.openjdk.java.net/browse/JDK-8033399
I do not really understand how shared ParkEvent can beused/consumed by both ObjectMonitor and RawMonitor on the samethread.But we observed and investigated this problem several years agoand Dan finally filed this enhancement.
I still don't see how this is possible as you are not actuallyenqueued on the ObjectMonitor when the call out to the eventcallback is made. but that is a topic for another email thread. :)
Correct that you cannot be enqueued on the ObjectMonitor when you
make the callback. However, I don't think that was the point of
8033399 when we filed so very long ago...

Quoting a comment from David:
David Holmes added a comment - 2014-01-27 18:34
Could there be multiple places in event handling code that could intheory consume unparks and so require the re-issue of an unpark()from different locations in the code?
Seems to me that perhaps raw monitors - given they can be enteredfrom within the normal monitor code - should have their own _eventobject per thread, so that this accidental consumption of unparkscan not occur.
And since then I've decided this isn't actually a problem.


Sounds good to me.


The scenario that comes to mind:

- T1 is contending on an ObjectMonitor and has set waitingToLockMonitor.
- T1 calls the jvmtiEventMonitorContendedEnter event handler that
   contends on a JvmtiRawMonitor and has set waitingToLockRawMonitor.
- T1 blocks on the JvmtiRawMonitor and parks.
- T2 is exiting the ObjectMonitor and has picked T1 as the successor

so it unparks T1.

Nope - T1 is not yet enqueued on the ObjectMonitor so it can't bepicked as successor.


Also sounds good.

Only T1 is parked for the JvmtiRawMonitor and
   not for the ObjectMonitor. T2 hasn't quite finished exiting the
   ObjectMonitor yet... (not sure if this lag is possible)
- T1 has unparked early for the JvmtiRawMonitor and at the same
   time T3 is exiting the JvmtiRawMonitor.
- T1 quickly enters the JvmtiRawMonitor and T3 doesn't have to
   pick a successor so it doesn't do an unpark.
- T1 finishes the work of the jvmtiEventMonitorContendedEnter and
   returns to the caller which is the code that's about to block on
   the ObjectMonitor...
- T1 blocks on the ObjectMonitor and parks.

For this to actually be a problem requires that the call out to theraw monitor code happens inbetween the check for whether T1 needs topark and the park call. That also is not the case.


Also sounds good.

- T2 finishes exiting the ObjectMonitor... Does T1 get unparked?

I can't remember when T2 does the unpark() relative to dropping
ownership of the ObjectMonitor. If the unpark() is first or if
the _owner field setting to NULL lingers... it's possible for T1
to block and park... with nothing to unpark T1...

Pure, crazy theory here...

However, with David's work on this bug (8231289), this theoretical
problem goes away... That's the only reason for trying to close
this 8033399 sub-thread here...

My work here makes no difference to 8033399 perceived problem. Thesame ParkEvent continues to be used by both JvmtiRawMonitor andObjectMonitor.


I confused myself into thinking that you either created a
new ParkEvent or didn't use the existing ParkEvent anymore.
My mistake.

Dan

Cheers,
David
Dan
Agreed.
Just wanted to point out it can be related.
Dan filed this RFE and can have more knowledge.
Meanwhile what to do about broken deadlock detection ... :(
It is a good catch from Dan.

Thanks,
Serguei
Thanks,
David
Thanks,
Serguei
This also probably means that you can have a pending raw monitorat the same time as you have a "Blocker" as I'm pretty sure thereare various JVM TI event handlers that may execute between theBlocker being set and the actual park. So that would be anadditional breakage in the existing code.
Back to my code and I have two problems. The second, which iseasy to address, is the deadlock printing code. I'll hoist thewaitingToLockRawMonitor chunk to the top so it is executedindependent of the waitingToLockMonitor value (which remains inan if/else relationship with the waitingToLockBlocker). But nowthat we might print two "records" at a time I have to makeadditional changes to get meaningful output for the currentthread (which is handled as a common code after the if/else blockto finish whichever record was being printed). Also I can nolonger use "continue" as the 3 outcomes are not mutuallyexclusive - so this could get a bit messy. :(
So definitely a v2 webrev on the way.
But before that I need to solve my first problem - and I don'tknow how. Now that it is apparent that a thread can be blocked onboth a raw monitor and an ObjectMonitor at the same time, I haveno idea how to actually account for this in the deadlockdetection code. That code has a while loop that expects to atmost find either a locked ObjectMonitor or j.u.c Blocker, and itadds the owner thread to the cycle detection, then moves on. Butnow I can have two different owner threads in the same loopiteration. I don't know how to account for that.
Given that it seems to me that the current code is already brokenif we encounter these conditions, then perhaps all I can do ishandle the other cases, where the blocking reasons are mutuallyexclusive, and not try to fix things? i.e. leave lines #434 to#440 as they are in webrev v1 - which implies no change to line#398 except the comment ... ??
test/hotspot/jtreg/vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/rawmnwait005.cpp
     No comments.
Thumbs up! The only non-nit I have is the setting ofwaitingToLockRawMonitoron L400 and the corresponding comment on L399. Everything elseis a nit.
I don't need to see a new webrev.
If only that were true :(

Thanks,
David
Thanks for tackling this disentangle issue!

Dan
The earlier attempt to rewrite JvmtiRawMonitor as a simplewrapper around PlatformMonitor proved not so simple andultimately had too many issues due to the need to supportThread.interrupt.
I'd previously stated in the bug report:
"In the worst-case I suppose we could just copy ObjectMonitorto a new class and have JvmtiRawMonitor continue to extend that(with some additional minor adjustments) - or even just inlineit all as needed."
but hadn't looked at it in detail. Richard Reingruber did lookat it and pointed out that it is actually quite simple - webarely use any actual code from ObjectMonitor, mainly just thestate. So thanks Richard! :)
So this change basically copies or moves anything needed byJvmtiRawMonitor from ObjectMonitor, breaking the connectionbetween the two. We also copy and simplify ObjectWaiter,turning it into a QNode internal class. There is then a lot ofcleanup that was applied (and a lot more that could still bedone):
- Removed the never implemented/used PROPER_TRANSITIONS ifdefs
- Fixed the disconnect between the types of non-JavaThreadsexpected by the upper layer code and lower layer code
- cleaned up and simplified return codes
- consolidated code that is identical for JavaThreads andnon-JavaThreads (e.g. notify/notifyAll).- removed used of TRAPS/THREAD where not appropriate andreplaced with "Thread * Self" in the style of the rest of the code- changed recursions to be int rather than intptr_t (a "fixme"in the ObjectMonitor code)
I have not changed the many style flaws with this code:
- Capitalized names
- extra spaces before ;
- ...
but could do so if needed. I wanted to try and keep it moreobvious that the fundamental functional code is actuallyunmodified.
There is one aspect that requires further explanation: thenotion of current pending monitor. The "current pendingmonitor" is stored in the Thread and used by a number ofintrospection APIs for things like finding monitors, doingdeadlock detection, etc. The JvmtiRawMonitor code would alsoset/clear itself as "current pending monitor". Most uses of thecurrent pending monitor actually, explicitly or implicitly,ignore the case when the monitor is a JvmtiRawMonitor (observedby the fact the mon->object() query returns NULL). Theexception to that is deadlock detection where raw monitors areat least partially accounted for. To preserve that I added thenotion of "current pending raw monitor" and updated thedeadlock detection code to use that.
The test:
test/hotspot/jtreg/vmTestbase/nsk/jvmti/RawMonitorWait/rawmnwait005/rawmnwait005.cpp
was updated because I'd noticed previously that it was the onlytest that used interrupt with raw monitors, but was in factbroken: the test thread is a daemon thread so the main threadcould terminate the VM immediately after the interrupt() call,thus you would never know if the interruption actually workedas expected.
Testing:
 - tiers 1 - 3
 - vmTestbase/nsk/monitoring/  (for deadlock detection**)
 - vmTestbase/nsk/jdwp
 - vmTestbase/nsk/jdb/
 - vmTestbase/nsk/jdi/
 - vmTestbase/nsk/jvmti/
 - serviceability/jvmti/
 - serviceability/jdwp
 - JDK: java/lang/management
** There are no existing deadlock related tests involvingJvmtiRawMonitor. It would be interesting/useful to add them tothe existing nsk/monitoring tests that cover synchronized andJNI locking. But it's a non-trivial enhancement that I don'treally have time to do.
Thanks,
David
-----

Re: RFR: 8231289: Disentangle JvmtiRawMonitor from ObjectMonitor and clean it up

Reply via email to