On 10/15/18 1:12 AM, David Holmes wrote:
Bug: https://bugs.openjdk.java.net/browse/JDK-8211909
webrev: http://cr.openjdk.java.net/~dholmes/8211909/webrev/
src/hotspot/share/prims/jvmtiEnv.cpp
No comments.
Thumbs up on the code changes.
The crash occurs when trying to access a thread that was returned as
part of JVM TI GetThreadGroupChildren. The problem is that the JVM TI
code tries to use the Threads_lock to ensure atomic access to the
ThreadGroup's threads and child groups:
{ // Cannot allow thread or group counts to change.
MutexLocker mu(Threads_lock);
but the Threads_lock does not control concurrency in the Java code
that can cause threads to be added and removed, so we do not get a
stable snapshot of the thread array and its length, and contents. To
get a stable snapshot we have to use the same synchronization
mechanism as used by the Java code: lock the monitor of the
ThreadGroup instance.
Agreed that the wrong locking mechanism was used.
Side note: The '// Cannot allow thread or group counts to change.'
comment was added sometime after we switched from TeamWare to
Mercurial.
Looks like all three uses of Threads_lock that you removed
date back to the original JVM/TI delta:
$ sgv src/share/vm/prims/jvmtiEnv.cpp | grep '{ MutexLocker
mu(Threads_lock);'
1.168
1.1 { MutexLocker mu(Threads_lock);
1.1 { MutexLocker mu(Threads_lock);
1.1 { MutexLocker mu(Threads_lock);
3358 lines
No id keywords (cm7)
$ sp -r1.1 src/share/vm/prims/jvmtiEnv.cpp
src/share/vm/prims/SCCS/s.jvmtiEnv.cpp:
D 1.1 03/01/28 20:52:06 rfield 1 0 03130/00000/00000
MRs:
COMMENTS:
date and time created 03/01/28 20:52:06 by rfield
Two other pointless acquisitions of the Threads_lock, in GetThreadInfo
and GetThreadGroupInfo, were also removed. These functions could
report an inconsistent snapshot of the Thread or ThreadGroup state, if
the mutable state is mutated concurrently. Again we could acquire the
object's monitor to prevent this but it is unclear that there is any
real benefit in doing so - after all the thread/group information
could change immediately after it has been read anyway. So here I just
delete the Threads_lock usage.
Also agreed. I ran into all three of these in the Thread-SMR project
and I couldn't convince myself to remove the use of Threads_lock for
these two. I missed that the wrong lock was used for JVM/TI
GetThreadGroupChildren().
Very nice catch!
Dan
Testing: mach5 tier 1-3
JVM TI tests (serviceability/jvmti vmTestbase/nsk/jvmti/)
A regression test may be possible if we call GetThreadGroupChildren in
a loop, whilst threads add and remove themselves from the group
concurrently. But it is awkward to write.
Thanks,
David
Thanks,
David