On 10/15/18 1:12 AM, David Holmes wrote:
Bug: https://bugs.openjdk.java.net/browse/JDK-8211909
webrev: http://cr.openjdk.java.net/~dholmes/8211909/webrev/

src/hotspot/share/prims/jvmtiEnv.cpp
    No comments.

Thumbs up on the code changes.


The crash occurs when trying to access a thread that was returned as part of JVM TI GetThreadGroupChildren. The problem is that the JVM TI code tries to use the Threads_lock to ensure atomic access to the ThreadGroup's threads and child groups:

    { // Cannot allow thread or group counts to change.
      MutexLocker mu(Threads_lock);

but the Threads_lock does not control concurrency in the Java code that can cause threads to be added and removed, so we do not get a stable snapshot of the thread array and its length, and contents. To get a stable snapshot we have to use the same synchronization mechanism as used by the Java code: lock the monitor of the ThreadGroup instance.

Agreed that the wrong locking mechanism was used.

Side note: The '// Cannot allow thread or group counts to change.'
comment was added sometime after we switched from TeamWare to
Mercurial.

Looks like all three uses of Threads_lock that you removed
date back to the original JVM/TI delta:

$ sgv src/share/vm/prims/jvmtiEnv.cpp | grep '{ MutexLocker mu(Threads_lock);'
1.168
1.1 { MutexLocker mu(Threads_lock);
1.1 { MutexLocker mu(Threads_lock);
1.1 { MutexLocker mu(Threads_lock);
3358 lines
No id keywords (cm7)

$ sp -r1.1 src/share/vm/prims/jvmtiEnv.cpp
src/share/vm/prims/SCCS/s.jvmtiEnv.cpp:

D 1.1 03/01/28 20:52:06 rfield 1 0 03130/00000/00000
MRs:
COMMENTS:
date and time created 03/01/28 20:52:06 by rfield


Two other pointless acquisitions of the Threads_lock, in GetThreadInfo and GetThreadGroupInfo, were also removed. These functions could report an inconsistent snapshot of the Thread or ThreadGroup state, if the mutable state is mutated concurrently. Again we could acquire the object's monitor to prevent this but it is unclear that there is any real benefit in doing so - after all the thread/group information could change immediately after it has been read anyway. So here I just delete the Threads_lock usage.

Also agreed. I ran into all three of these in the Thread-SMR project
and I couldn't convince myself to remove the use of Threads_lock for
these two. I missed that the wrong lock was used for JVM/TI
GetThreadGroupChildren().

Very nice catch!

Dan



Testing: mach5 tier 1-3
         JVM TI tests (serviceability/jvmti vmTestbase/nsk/jvmti/)

A regression test may be possible if we call GetThreadGroupChildren in a loop, whilst threads add and remove themselves from the group concurrently. But it is awkward to write.

Thanks,
David

Thanks,
David

Reply via email to