webrev: http://cr.openjdk.java.net/~dholmes/8159461/webrev/

bug: https://bugs.openjdk.java.net/browse/JDK-8159461

The suspend/resume signal (SR_signum) is never sent to a thread once it has started to terminate. On one platform (SuSE 12) we have seen what appears to be a "stuck" signal, which is only delivered when the terminating thread restores its original signal mask (as if pthread_sigmask makes the system realize there is a pending signal - we already check the signal was not blocked). At this point in the thread termination we have freed the osthread, so the the SR_handler would access deallocated memory. In debug builds we first hit an assertion that the current thread is a JavaThread or the VMThread - that assertion fails, even though it is a JavaThread, because we have already executed the ~JavaThread destructor and inside the ~Thread destructor we are a plain Thread not a JavaThread.

The fix was to make a small adjustment to the thread termination process so that we delete the SR_lock before calling os::free_thread(). In the SR_handler() we can then use a NULL check of SR_lock() to indicate the thread has terminated and we return.

While only seen on Linux I took the opportunity to apply the fix on all platforms and also cleaned up the code where we were using Thread::current() unsafely in a signal-handling context.

Testing: regular tier 1 (JPRT)
         Kitchensink (in progress)

As we can't readily reproduce the problem I tested this by having a terminating thread raise SR_signum directly from within the ~Thread destructor.

Thanks,
David

Reply via email to