Re: RFR: 8233549: Thread interrupted state must only be accessed when not in a safepoint-safe state

Daniel D. Daugherty Wed, 13 Nov 2019 06:21:38 -0800

On 11/12/19 5:50 PM, David Holmes wrote:

Hi Dan,


Thanks for taking a look so quickly!


You're welcome! I figured you would prefer to get this one out of
the way quickly.


On 13/11/2019 3:18 am, Daniel D. Daugherty wrote:

On 11/11/19 11:52 PM, David Holmes wrote:
webrev: http://cr.openjdk.java.net/~dholmes/8233549/webrev/
src/hotspot/os/posix/os_posix.cpp
     L2078: // Can't access interrupt state now we are _thread_blocked. If we've been
     L2079: // interrupted since we checked above then _counter will be > 0.
         nit - grammar. Please consider:
             // Can't access interrupt state now that we are _thread_blocked. If we've
             // been interrupted since we checked above then _counter will be > 0.
src/hotspot/os/solaris/os_solaris.cpp
     L4924: // Can't access interrupt state now we are _thread_blocked. If we've been
     L4925: // interrupted since we checked above then _counter will be > 0.
         nit - grammar. Please consider:
             // Can't access interrupt state now that we are _thread_blocked. If we've
             // been interrupted since we checked above then _counter will be > 0.


Will fix grammatical nits.

src/hotspot/share/classfile/javaClasses.cpp
     No comments.

src/hotspot/share/prims/jvmtiEnv.cpp

     Hmmm... did the "non-JavaThread can't be interrupted" check also get
     pushed down?
     Update: Similar check is now in JvmtiRawMonitor::raw_wait().

src/hotspot/share/prims/jvmtiRawMonitor.cpp
     L239:     ThreadInVMfromNative tivm(jt);
     L240:     if (jt->is_interrupted(true)) {
     L241:         ret = M_INTERRUPTED;
     L242:     } else {
     L243:       ThreadBlockInVM tbivm(jt);
     L244:       jt->set_suspend_equivalent();
     L245:       if (millis <= 0) {
     L246:         self->_ParkEvent->park();
     L247:       } else {
     L248:         self->_ParkEvent->park(millis);
     L249:       }
     L250:     }
     L251:     // Return to VM before post-check of interrupt state
     L252:     if (jt->is_interrupted(true)) {
         The comment on L251 is better between L249 and L250 since that
         is where 'tbivm' gets destroyed and you transition back.

         You could have this comment before L252:

                // Must be in VM to safely access interrupt state:

         if you think you really need a comment there.
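
The point about where 'tbivm' is destroyed can be illustrated with a minimal scope-guard sketch. Everything here (ScopeState, ScopeThread, ScopeBlockInVM, state_inside_scope) is a hypothetical stand-in, not HotSpot's real ThreadBlockInVM, which also does safepoint processing; the sketch only shows that the transition back happens at the closing brace of the scope.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not HotSpot's real types.
enum class ScopeState { in_native, in_vm, blocked };

struct ScopeThread {
  ScopeState state = ScopeState::in_vm;
};

// Scope guard in the style of ThreadBlockInVM: the thread is "blocked"
// for the lifetime of the guard and back "in VM" once it is destroyed.
class ScopeBlockInVM {
 public:
  explicit ScopeBlockInVM(ScopeThread& t) : _t(t) {
    assert(_t.state == ScopeState::in_vm);
    _t.state = ScopeState::blocked;
  }
  ~ScopeBlockInVM() { _t.state = ScopeState::in_vm; }
  ScopeBlockInVM(const ScopeBlockInVM&) = delete;
  ScopeBlockInVM& operator=(const ScopeBlockInVM&) = delete;

 private:
  ScopeThread& _t;
};

// Returns the state observed inside the guarded scope.
ScopeState state_inside_scope(ScopeThread& t) {
  ScopeState seen = t.state;
  {
    ScopeBlockInVM guard(t);
    seen = t.state;  // blocked: a park() call would go here
  }                  // guard destroyed: the transition back happens here
  return seen;       // any interrupt check after this point runs "in VM"
}
```

A comment describing the return to the VM therefore reads most naturally at the closing brace of the else block rather than before the next statement.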


Will move comment up as suggested.

src/hotspot/share/prims/jvmtiRawMonitor.hpp
     No comments.

src/hotspot/share/runtime/objectMonitor.cpp
     You've moved the is_interrupted() check from after ThreadBlockInVM
     to before it. ThreadBlockInVM can block for a safepoint which widens
     the window for an interrupt to come in after the check on L1272 and
     before the thread parks on L1286 or L1288.

     Can this result in an unexpected park() where before we would have
     taken the "Intentionally empty" code path on L1283?

     What I'm worried about is whether we've opened a window where we
     do Object.wait(0) and that wait() is supposed to be interrupted.
     However, we lose that interrupt because it arrives in the now wider
     window between L1272 and L1286 and we never return from the wait(0).

     It is possible that I'm not remembering something about how interrupt()
     interacts with park().

The interrupt() not only sets the field but also issues an unpark() to the ParkEvent. So if we are interrupted whilst processing through the TBIVM, the call to park() will return immediately as the ParkEvent will be in the signalled state.
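
A minimal sketch of why the interrupt is not lost, assuming a park event behaves like a one-shot permit. ToyParkEvent, ToyWaiter, toy_interrupt and toy_wait are hypothetical names built on the C++ standard library; they model the behaviour described above and are not the HotSpot implementation.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical model of a park event: a one-shot permit.
class ToyParkEvent {
 public:
  void unpark() {
    {
      std::lock_guard<std::mutex> g(_m);
      _signalled = true;
    }
    _cv.notify_one();
  }

  // Blocks until signalled or until the timeout expires, then consumes
  // the signal. Returns true if the event was signalled.
  bool park(std::chrono::milliseconds timeout) {
    assert(timeout.count() >= 0);
    std::unique_lock<std::mutex> g(_m);
    bool ok = _cv.wait_for(g, timeout, [this] { return _signalled; });
    _signalled = false;
    return ok;
  }

 private:
  std::mutex _m;
  std::condition_variable _cv;
  bool _signalled = false;
};

struct ToyWaiter {
  std::atomic<bool> interrupted{false};
  ToyParkEvent event;
};

// Models interrupt(): set the flag AND signal the event.
void toy_interrupt(ToyWaiter& w) {
  w.interrupted.store(true);
  w.event.unpark();
}

// Early check, then a window, then park. in_window stands in for whatever
// happens between the check and the park (e.g. blocking for a safepoint).
template <typename F>
bool toy_wait(ToyWaiter& w, std::chrono::milliseconds timeout, F in_window) {
  if (w.interrupted.load()) return true;  // early check
  in_window();                            // an interrupt may arrive here
  w.event.park(timeout);                  // returns at once if signalled
  return w.interrupted.load();
}
```

If toy_interrupt() runs inside the window, the later park() finds the event already signalled and returns without waiting for the timeout, however wide the window is.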


That was the piece I wasn't remembering. Thanks for filling in the detail.

test/hotspot/jtreg/ProblemList.txt
     Thanks for remembering to update the ProblemList.

The only part I'm worried about is ObjectMonitor::wait(). If my worry is
baseless, then thumbs up.


Worry is baseless :)


Agreed!

I have a couple of nits above. If you choose to fix those, then I don't
need to see another webrev.


Thanks again!


You're welcome.

Dan

David
-----
Dan
bug: https://bugs.openjdk.java.net/browse/JDK-8233549
In JDK-8229516 I moved the interrupted state of a thread from the osThread in the VM to the java.lang.Thread instance. In doing that I overlooked a critical aspect, which is that to access the field of a Java object the JavaThread must not be in a safepoint-safe state** - otherwise the oop, and anything referenced therefrom, could be relocated by the GC whilst the JavaThread is accessing it. This manifested in a number of tests using JVM TI Agent threads and JVM TI RawMonitors because the JavaThreads were marked _thread_blocked and hence safepoint-safe, and we read a non-zero value for the interrupted field even though we had never been interrupted.
This problem existed in all the code that checks for interruption when "waiting":
- Parker::park (the code underpinning java.util.concurrent.LockSupport.park())
To fix this code I simply deleted a late check of the interrupted field. The check was not needed because if an interrupt has occurred then we will find the ParkEvent in a signalled state.
- ObjectMonitor::wait
Here the late check of the interrupted state is essential as we reset the ParkEvent after an earlier check of the interrupted state. But the fix was simply achieved by moving the check slightly earlier, before we use ThreadBlockInVM to become _thread_blocked.
- RawMonitor::wait
This fix was much more involved. The RawMonitor code directly transitions the JavaThread from _thread_in_native to _thread_blocked. This is safe from a safepoint perspective because they are equivalent safepoint-safe states. To allow access to the interrupted field I have to transition from native to _thread_in_vm, and that has to be done by proper thread-state transitions to ensure correct access to the oop and its fields. Having done that I can then use ThreadBlockInVM for the transitions to blocked. However, as the old code noted it can't use proper thread-state transitions as this will lead to deadlocks with the VMThread that can also use RawMonitors when executing various event callbacks. To deal with that we have to note that the real constraint is that the JavaThread cannot block at a safepoint whilst it holds the RawMonitor. Hence the fix was to push all the interrupt checking code and the thread-state transitions to the lowest level of RawMonitorWait, around the final park() call, after we have enqueued the waiter and released the monitor. That avoids any deadlock possibility.
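
The ordering described for RawMonitor::wait can be sketched as follows. This is an illustration under stated assumptions, not the actual jvmtiRawMonitor.cpp code: RmState, RmThread, RmMonitor, rm_transition and rm_wait are hypothetical names, the park is elided, and re-acquiring the monitor and clearing the interrupted flag are not modelled. The assertion in rm_transition encodes the stated constraint that a thread must not block at a safepoint whilst it holds the monitor.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are HotSpot types.
enum class RmState { in_native, in_vm, blocked };

struct RmThread {
  RmState state = RmState::in_native;
  bool interrupted = false;
};

struct RmMonitor {
  bool held = false;
  int waiters = 0;
};

// A "proper" transition may block at a safepoint, so in this model it is
// only legal while the monitor is not held.
void rm_transition(RmThread& t, RmState to, const RmMonitor& m,
                   std::vector<std::string>& log, const char* name) {
  assert(!m.held && "blocking here could deadlock with the VMThread");
  t.state = to;
  log.push_back(name);
}

// Returns true if the wait ended because of an interrupt.
bool rm_wait(RmMonitor& m, RmThread& t, std::vector<std::string>& log) {
  assert(m.held);
  ++m.waiters;
  log.push_back("enqueue");
  m.held = false;
  log.push_back("release");

  // Only now, with the monitor released, do the state transitions.
  rm_transition(t, RmState::in_vm, m, log, "in_vm");
  bool interrupted = t.interrupted;  // pre-check, performed in VM
  if (!interrupted) {
    rm_transition(t, RmState::blocked, m, log, "blocked");
    // park() would happen here
    rm_transition(t, RmState::in_vm, m, log, "in_vm");
    interrupted = t.interrupted;     // post-check, again in VM
  }
  rm_transition(t, RmState::in_native, m, log, "in_native");
  return interrupted;
}
```

In this model both reads of the interrupted flag happen in the in_vm state, and every transition happens after the "release" step.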
I also added checks to is_interrupted/interrupted to ensure they are only called by a thread in a suitable state. This should only be the VMThread (as a consequence of the Thread.stop implementation occurring at a safepoint and issuing a JavaThread::interrupt() call to unblock the target); or a JavaThread that is not _thread_in_native or _thread_blocked.
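
A minimal sketch of the kind of caller-state check described above. GuardKind, GuardState, GuardThread, may_access_interrupt_state and guarded_is_interrupted are hypothetical names, and the set of states is simplified; the actual assertions in the webrev may be expressed differently.

```cpp
#include <cassert>

// Hypothetical, simplified thread kinds and states for illustration.
enum class GuardKind { vm_thread, java_thread, other };
enum class GuardState { in_native, in_vm, in_java, blocked };

struct GuardThread {
  GuardKind kind = GuardKind::java_thread;
  GuardState state = GuardState::in_vm;
  bool interrupted = false;
};

// True if the caller may safely read a field of a Java object: either the
// VMThread, or a JavaThread that is not in a safepoint-safe state.
bool may_access_interrupt_state(const GuardThread& caller) {
  if (caller.kind == GuardKind::vm_thread) return true;
  if (caller.kind != GuardKind::java_thread) return false;
  return caller.state != GuardState::in_native &&
         caller.state != GuardState::blocked;
}

// Reads the target's interrupted flag, asserting the caller's state first.
bool guarded_is_interrupted(const GuardThread& caller,
                            const GuardThread& target) {
  assert(may_access_interrupt_state(caller));
  return target.interrupted;
}
```

With a guard like this, a read from a thread that is blocked or in native fails fast in debug builds instead of returning a value read through a possibly stale oop.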
Testing: (still finalizing)
 - tiers 1 - 6 (Oracle platforms)
 - Local Linux testing
  - vmTestbase/nsk/monitoring/
  - vmTestbase/nsk/jdwp
  - vmTestbase/nsk/jdb/
  - vmTestbase/nsk/jdi/
  - vmTestbase/nsk/jvmti/
  - serviceability/jvmti/
  - serviceability/jdwp
  - JDK: java/lang/management
         com/sun/management
** Note that this applies to all accesses we make via code in javaClasses.*. For this particular code I thought about adding a guard in JavaThread::threadObj() but it turns out when we generate a crash report we access the Thread's name() field and that can happen when in any state, so we'd always trigger a secondary assertion failure during error reporting if we did that. Note that accessing name() can still easily lead to secondary assertion failures as I discovered when trying to debug this and print the thread name out - I would see an is_instance assertion fail checking that the Thread name() is an instance of java.lang.String!
Thanks,
David
-----
