On Tue, 15 Feb 2022 17:48:43 GMT, Zhengyu Gu <[email protected]> wrote:
>> There are scenarios that JDWP agent can deadlock on `classTrackLock`
>> monitor. Following is the scenario in bug report.
>>
>> **Java Thread**
>> - loads a class and post `JVMTI_EVENT_CLASS_PREPARE` event
>> - JDWP event callback handler calls `classTrack_processUnloads()` to handle
>> the event.
>> - `classTrack_processUnloads()` takes `classTrackLock` lock, then tries to
>> allocate a new bag under the lock.
>> - bag allocation code calls` jvmtiAllocate()`, which may be blocked by
>> ongoing safepoint due to state transition.
>>
>> If the safepoint is GC safepoint (prior to JDK16) or
>> `VM_JvmtiPostObjectFree` safepoint (JDK16 or later)
>>
>> **VM Thread**
>> - post `JVMTI_EVENT_OBJECT_FREE`
>> - JDWP event callback handler calls `cbTrackingObjectFree()` to handle the
>> event
>> - `cbTrackingObjectFree()` tries to acquire `classTrackLock` lock, leads to
>> deadlock
>>
>> From my research, there are three events that may be posted at safepoints,
>> `JVMTI_EVENT_GARBAGE_COLLECTION_START`,
>> `JVMTI_EVENT_GARBAGE_COLLECTION_FINISH` and `JVMTI_EVENT_OBJECT_FREE`, but
>> only `JVMTI_EVENT_OBJECT_FREE` is relevant to JDWP agent.
>>
>> The solution I purpose here, is simply move allocation/deallocation code
>> outside of `classTrackLock` lock.
>>
>>
>> Test:
>> - [x] tier1
>> - [x] vmTestbase_nsk_jdi
>> - [x] vmTestbase_nsk_jdwp
>> - [x] vmTestbase_nsk_jvmti
>
> Zhengyu Gu has updated the pull request incrementally with one additional
> commit since the last revision:
>
> David and Chris' comments
src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c line 88:
> 86: jboolean retry = JNI_FALSE;
> 87: do {
> 88: // Avoid unnecessary allocations when class track has yet been
> activated.
I don't think this comment is accurate. It's not that we want to avoid
unnecessary allocations. We want to avoid unnecessary tracking of loaded
classes.
src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c line 89:
> 87: do {
> 88: // Avoid unnecessary allocations when class track has yet been
> activated.
> 89: if (deletedSignatures != NULL) {
It's now kind of hard to read what used to be the `deletedSignatures == NULL`
case. I think you can first just check for NULL outside the lock and then
return. It will make it clear that it is very low overhead in this case.
src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c line 111:
> 109: }
> 110: debugMonitorExit(classTrackLock);
> 111: } while (retry == JNI_TRUE);
I don't think the retry is necessary. Nor is the check for `deletedSignatures
!= NULL`. I assume you are trying to cover the case where initially
deletedSignatures was NULL (so no bag was allocated), but is not NULL by the
time you grab the lock. If you return when NULL like I suggested above, you
won't have this issue. Note that if not NULL, there is no way another thread
can get into this same code in clear it. This is because all callers first grab
the handlerLock. Grabbing of the classTrackLock here is only done to
synchronize with cbTrackingObjectFree(), not with other callers of
classTrack_processUnloads().
src/jdk.jdwp.agent/share/native/libjdwp/classTrack.c line 112:
> 110: debugMonitorExit(classTrackLock);
> 111: } while (retry == JNI_TRUE);
> 112: bagDestroyBag(new_bag);
One other comment about `classTrack_processUnloads()` in general that also
applies to the original version. It seems pretty inefficient. Every time the
debug agent gets an event it is called. On almost ever call it is returning and
empty back and allocating a new one. It seems a check for
`bagSize(deletedSignatures) == 0` and returning NULL if true would help
performance. I also believe this can be done outside of the lock (would like
David's opinion on this).
-------------
PR: https://git.openjdk.java.net/jdk/pull/7461