Hi Serguei,
Thanks for taking a look.
On 15/11/2019 4:04 am, [email protected] wrote:
Hi David,
It looks good to me.
A couple of nits below.
http://cr.openjdk.java.net/~dholmes/8233549/webrev/src/hotspot/share/prims/jvmtiRawMonitor.cpp.frames.html
236 if (self->is_Java_thread()) {
237 JavaThread* jt = (JavaThread*) self;
238 // Transition to VM so we can check interrupt state
239 ThreadInVMfromNative tivm(jt);
240 if (jt->is_interrupted(true)) {
241 ret = M_INTERRUPTED;
242 } else {
243 ThreadBlockInVM tbivm(jt);
244 jt->set_suspend_equivalent();
245 if (millis <= 0) {
246 self->_ParkEvent->park();
247 } else {
248 self->_ParkEvent->park(millis);
249 }
250 }
251 // Return to VM before post-check of interrupt state
252 if (jt->is_interrupted(true)) {
253 ret = M_INTERRUPTED;
254 }
255 } else {
It seems, the fragment at lines 251-254 needs to bebefore the line 250.
It will add more clarity to this code.
No, it has to be after line 250 as that is when we will hit the TBIVM
destructor and so return to _thread_in_vm which is the state needed to
read the interrupted field. Dan commented on the above and I changed it
slightly by moving the comment:
> 250 // Return to VM before post-check of interrupt state
> 251 }
> 252 if (jt->is_interrupted(true)) {
> 253 ret = M_INTERRUPTED;
> 254 }
412 if (self->is_Java_thread()) {
413 JavaThread* jt = (JavaThread*)self;
414 jt->set_suspend_equivalent();
415 for (;;) {
416 if (!jt->handle_special_suspend_equivalent_condition()) {
417 break;
418 } else {
419 // We've been suspended whilst waiting and so we have to
420 // relinquish the raw monitor until we are resumed. Of course
421 // after reacquiring we have to re-check for suspension again.
422 // Suspension requires we are _thread_blocked, and we also have to
423 // recheck for being interrupted.
424 simple_exit(jt);
425 {
426 ThreadInVMfromNative tivm(jt);
427 {
428 ThreadBlockInVM tbivm(jt);
429 jt->java_suspend_self();
430 }
431 if (jt->is_interrupted(true)) {
432 ret = M_INTERRUPTED;
433 }
434 }
435 simple_enter(jt);
436 jt->set_suspend_equivalent();
437 }
...
This code can be simplified a little bit.
The line:
414 jt->set_suspend_equivalent();
can be placed before line 416.
Then this line can be removed:
436 jt->set_suspend_equivalent();
Yes you're right. I was trying to preserve the original loop structure,
but then had to add the additional set_suspend_equivalent for the first
iteration. But I can instead just move the existing one to the top of
the loop.
Webrev updated in place.
Thanks,
David
-----
Thanks,
Serguei
On 11/11/19 20:52, David Holmes wrote:
webrev: http://cr.openjdk.java.net/~dholmes/8233549/webrev/
bug: https://bugs.openjdk.java.net/browse/JDK-8233549
In JDK-8229516 I moved the interrupted state of a thread from the
osThread in the VM to the java.lang.Thread instance. In doing that I
overlooked a critical aspect, which is that to access the field of a
Java object the JavaThread must not be in a safepoint-safe state** -
otherwise the oop, and anything referenced there from could be
relocated by the GC whilst the JavaThread is accessing it. This
manifested in a number of tests using JVM TI Agent threads and JVM TI
RawMonitors because the JavaThread's were marked _thread_blocked and
hence safepoint-safe, and we read a non-zero value for the interrupted
field even though we had never been interrupted.
This problem existed in all the code that checks for interruption when
"waiting":
- Parker::park (the code underpinning
java.util.concurrent.LockSupport.park())
To fix this code I simply deleted a late check of the interrupted
field. The check was not needed because if an interrupt has occurred
then we will find the ParkEvent in a signalled state.
- ObjectMonitor::wait
Here the late check of the interrupted state is essential as we reset
the ParkEvent after an earlier check of the interrupted state. But the
fix was simply achieved by moving the check slightly earlier before we
use ThreadBlockInVm to become _thread_blocked.
- RawMonitor::wait
This fix was much more involved. The RawMonitor code directly
transitions the JavaThread from _thread_in_Native to _thread_blocked.
This is safe from a safepoint perspective because they are equivalent
safepoint-safe states. To allow access to the interrupted field I have
to transition from native to _thread_in_vm, and that has to be done by
proper thread-state transitions to ensure correct access to the oop
and its fields. Having done that I can then use ThreadBlockInVM for
the transitions to blocked. However, as the old code noted it can't
use proper thread-state transitions as this will lead to deadlocks
with the VMThread that can also use RawMonitors when executing various
event callbacks. To deal with that we have to note that the real
constraint is that the JavaThread cannot block at a safepoint whilst
it holds the RawMonitor. Hence the fix was push all the interrupt
checking code and the thread-state transitions to the lowest level of
RawMonitorWait, around the final park() call, after we have enqueued
the waiter and released the monitor. That avoids any deadlock
possibility.
I also added checks to is_interrupted/interrupted to ensure they are
only called by a thread in a suitable state. This should only be the
VMThread (as a consequence of the Thread.stop implementation occurring
at a safepoint and issuing a JavaThread::interrupt() call to unblock
the target); or a JavaThread that is not _thread_in_native or
_thread_blocked.
Testing: (still finalizing)
- tiers 1 - 6 (Oracle platforms)
- Local Linux testing
- vmTestbase/nsk/monitoring/
- vmTestbase/nsk/jdwp
- vmTestbase/nsk/jdb/
- vmTestbase/nsk/jdi/
- vmTestbase/nsk/jvmti/
- serviceability/jvmti/
- serviceability/jdwp
- JDK: java/lang/management
com/sun/management
** Note that this applies to all accesses we make via code in
javaClasses.*. For this particular code I thought about adding a guard
in JavaThread::threadObj() but it turns out when we generate a crash
report we access the Thread's name() field and that can happen when in
any state, so we'd always trigger a secondary assertion failure during
error reporting if we did that. Note that accessing name() can still
easily lead to secondary assertions failures as I discovered when
trying to debug this and print the thread name out - I would see an
is_instance assertion fail checking that the Thread name() is an
instance of java.lang.String!
Thanks,
David
-----