On Thu, 28 Sep 2023 09:11:12 GMT, Roman Kennke wrote:
>> The SA can run concurrently with Java threads, SA code that inspects locking
>> state should be able to deal with that. On the other hand, the particular
>> code is only used in printing routine and is not expected to be precise.
>>
On Thu, 28 Sep 2023 09:11:12 GMT, Roman Kennke wrote:
>> The SA can run concurrently with Java threads, SA code that inspects locking
>> state should be able to deal with that. On the other hand, the particular
>> code is only used in printing routine and is not expected to be precise.
>>
> The SA can run concurrently with Java threads, SA code that inspects locking
> state should be able to deal with that. On the other hand, the particular
> code is only used in printing routine and is not expected to be precise. When
> resolving an anonymous owner, we may not find one, because
On Wed, 27 Sep 2023 17:54:51 GMT, Roman Kennke wrote:
>> The SA can run concurrently with Java threads, SA code that inspects locking
>> state should be able to deal with that. On the other hand, the particular
>> code is only used in printing routine and is not expected to be precise.
>>
On Wed, 27 Sep 2023 17:54:51 GMT, Roman Kennke wrote:
>> The SA can run concurrently with Java threads, SA code that inspects locking
>> state should be able to deal with that. On the other hand, the particular
>> code is only used in printing routine and is not expected to be precise.
>>
> The SA can run concurrently with Java threads, SA code that inspects locking
> state should be able to deal with that. On the other hand, the particular
> code is only used in printing routine and is not expected to be precise. When
> resolving an anonymous owner, we may not find one, because
On Wed, 27 Sep 2023 16:59:17 GMT, Roman Kennke wrote:
>> I think given this, issuing a warning is the appropriate thing to do. The
>> warning might want to mention that the failure is likely because we are in
>> the middle of a GC. That way the user has some context as to why SA info is
>>
On Wed, 27 Sep 2023 15:42:51 GMT, Chris Plummer wrote:
>> I think you are right @plummercj - I'm adding details to the JBS issue.
>
> I think given this, issuing a warning is the appropriate thing to do. The
> warning might want to mention that the failure is likely because we are in
> the
On Wed, 27 Sep 2023 04:17:57 GMT, David Holmes wrote:
>> I wonder if we are in the middle of a GC, the object has been moved, and the
>> lock stack object reference has been updated but the monitor reference has
>> not (or vice versa).
>
> I think you are right @plummercj - I'm adding details
On Wed, 27 Sep 2023 00:44:30 GMT, Chris Plummer wrote:
>> I made a pass over the deadlock detection code involved. Nothing stuck out.
>> There's not much in the way of SA-isms in this code. I think anyone with an
>> understanding of how hotspot locking works should be able to navigate it
>>
On Wed, 27 Sep 2023 00:44:30 GMT, Chris Plummer wrote:
>> I made a pass over the deadlock detection code involved. Nothing stuck out.
>> There's not much in the way of SA-isms in this code. I think anyone with an
>> understanding of how hotspot locking works should be able to navigate it
>>
On Tue, 26 Sep 2023 22:55:27 GMT, Chris Plummer wrote:
>> Yeah, I also haven't seen a good explanation for why this is happening yet.
>> I'll also look into SA, but I don't understand the hotspot code it is trying
>> to emulate that well so not sure how useful I'll be.
>
> I made a pass over
On Tue, 26 Sep 2023 22:10:15 GMT, Chris Plummer wrote:
>> Switching the order did not fix the problem.
>>
>> Even if we end up resolving that this is just something that can happen when
>> SA inspects things, I'd prefer to understand how this can arise. I guess I
>> need to look into how the
On Tue, 26 Sep 2023 20:41:33 GMT, David Holmes wrote:
>> I'm not seeing that this code has much impact on the failure rate. I've
>> tried changing the order and also have tried adding a nanosleep(). Always
>> fails at about the same rate.
>
> Switching the order did not fix the problem.
>
>
On Tue, 26 Sep 2023 20:20:29 GMT, Chris Plummer wrote:
>> I managed to get it to fail 4/100 times in our CI so armed with that I will
>> try swapping the order. Though I have to admit I tend to agree with Chris
>> that the code in question seems unlikely to be executed such that this
>>
On Tue, 26 Sep 2023 12:02:52 GMT, David Holmes wrote:
>> Ok!
>> So we would have to swap the popping and state-update at least. Unless we
>> insert members between them, the updates would still be observable in any
>> order, I suppose, but inserting membars just to make SA happy seems a bit
On Tue, 26 Sep 2023 05:04:57 GMT, Roman Kennke wrote:
>> Atomicity means guaranteeing that a 2 or more operations are executed as if
>> they are single operation from the viewpoint of other threads. There's no
>> such concept with SA. The state is frozen, so SA can look at anything
>> without
On Tue, 26 Sep 2023 05:03:13 GMT, Chris Plummer wrote:
>> @rkennke the "snapshot" is atomic - the target VM is suspended.
>
> Atomicity means guaranteeing that a 2 or more operations are executed as if
> they are single operation from the viewpoint of other threads. There's no
> such concept
On Tue, 26 Sep 2023 04:52:00 GMT, David Holmes wrote:
>> Ok, so we are not (usually) at a safepoint, but no threads are moving. But
>> the snapshot can not be taken atomically. Which means that the
>> anonymous-state in the ObjectMonitor and the state of the lock-stack are not
>> necessarily
On Tue, 26 Sep 2023 04:25:40 GMT, Roman Kennke wrote:
>> ...although that is a pretty small window and we are seeing this bug quite a
>> lot. Seems if the code in question was the issue, it would take 1000s of
>> iterations to reproduce.
>
> Ok, so we are not (usually) at a safepoint, but no
On Tue, 26 Sep 2023 03:07:14 GMT, Chris Plummer wrote:
>> And I shouldn't have said "cache". I was confusing this PR with another
>> dealing with the Monitor Cache.
>
> ...although that is a pretty small window and we are seeing this bug quite a
> lot. Seems if the code in question was the
On Tue, 26 Sep 2023 02:48:17 GMT, Chris Plummer wrote:
>> Looks like you answered my question while I was asking it.
>
> And I shouldn't have said "cache". I was confusing this PR with another
> dealing with the Monitor Cache.
...although that is a pretty small window and we are seeing this
On Tue, 26 Sep 2023 02:44:59 GMT, Chris Plummer wrote:
>> Maybe SA is seeing a monitor in the monitor cache that it is only seeing
>> because the monitor cache is currently inconsistent (due to SA not safe
>> pointing). So the question is whether the monitor cache can be in a state
>> where
On Tue, 26 Sep 2023 02:42:56 GMT, Chris Plummer wrote:
>> Correction, there is one code path where we pop the lockstack first and then
>> update the owner:
>>
>> ObjectMonitor* monitor = inflate(current, object, inflate_cause_vm_internal);
>> if (LockingMode == LM_LIGHTWEIGHT &&
On Tue, 26 Sep 2023 02:34:08 GMT, David Holmes wrote:
>> AFAICS we update the owner to the real thread before we remove the object
>> from the lock stack. So if we see the object monitor is anonymously owned
>> then we should find the monitor object in a thread's lockstack.
>
> Correction,
On Tue, 26 Sep 2023 02:30:25 GMT, David Holmes wrote:
>>> Ah! I guess we get used to talking about "at a safepoint" when we really
>>> mean "at a fixed point in time". So the VM is not necessarily at a
>>> safepoint, but everything is fixed. So invariants may not hold, but the
>>> state
On Tue, 26 Sep 2023 02:27:39 GMT, Chris Plummer wrote:
>> Ah! I guess we get used to talking about "at a safepoint" when we really
>> mean "at a fixed point in time". So the VM is not necessarily at a
>> safepoint, but everything is fixed. So invariants may not hold, but the
>> state cannot
On Tue, 26 Sep 2023 02:14:40 GMT, David Holmes wrote:
> Ah! I guess we get used to talking about "at a safepoint" when we really mean
> "at a fixed point in time". So the VM is not necessarily at a safepoint, but
> everything is fixed. So invariants may not hold, but the state cannot change.
On Tue, 26 Sep 2023 01:41:38 GMT, Chris Plummer wrote:
>> Correction to above:
>>
>> threads = VM.getVM().getThreads();
>> heap = VM.getVM().getObjectHeap();
>> createThreadTable(); // calls getThreads() again
>>
>> The VM caches the set of threads ie the snapshot, so three sets are not
>>
On Tue, 26 Sep 2023 01:32:48 GMT, David Holmes wrote:
>> If the SA is working from a snapshot then it has to create that snapshot
>> atomically. It can't snapshot the threads, then snapshot the heap.
>
> Correction to above:
>
> threads = VM.getVM().getThreads();
> heap =
On Tue, 26 Sep 2023 01:25:59 GMT, David Holmes wrote:
>> To expand if deadlock detection does not run at a safepoint then this logic
>> is non-atomic and completely broken:
>>
>> threads = VM.getVM().getThreads();
>> heap = VM.getVM().getObjectHeap();
>> createThreadTable(); // calls
On Tue, 26 Sep 2023 01:16:11 GMT, David Holmes wrote:
>> Edit: I missed the intermediate comments above where Roman explained.
>>
>> @plummercj the SA code sees T2 is pending on the monitor for object O, which
>> is locked anonymously but actually by T1. The SA code then goes hunting for
>>
On Tue, 26 Sep 2023 01:09:47 GMT, David Holmes wrote:
>> Surely jstack thread dump and deadlock check _has_ to run at a safepoint?
>
> Also isn't "anonymous locking" an intermediate step in monitor inflation? The
> inflated monitor becomes anonymously owned until the real owner sees it has
>
On Tue, 26 Sep 2023 01:07:32 GMT, David Holmes wrote:
>>> The specific race here is that SA sees an anonymously locked ObjectMonitor
>>> and tries to find the owning thread, and fails, presumably because that
>>> thread has moved on and unlocked the object in the meantime.
>>
>> But you said
On Mon, 25 Sep 2023 20:52:28 GMT, Chris Plummer wrote:
>> With the new lightweight locking, when a thread T1 holds a lightweight lock
>> on object O, and another thread T2 comes in and also wants to acquire that
>> lock, it inflates the lock to a monitor. And since it cannot determine which
On Mon, 25 Sep 2023 19:59:32 GMT, Roman Kennke wrote:
> The specific race here is that SA sees an anonymously locked ObjectMonitor
> and tries to find the owning thread, and fails, presumably because that
> thread has moved on and unlocked the object in the meantime.
But you said that when T1
On Mon, 25 Sep 2023 19:18:38 GMT, Chris Plummer wrote:
>> Yes. But unless we run this at a safepoint, there is no consistent state
>> when we race with locking. I don't think we want to spam users with warnings
>> whenever we run into anonymous owners (which is not too infrequent).
>>
On Mon, 25 Sep 2023 18:32:19 GMT, Roman Kennke wrote:
>> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Threads.java
>> line 247:
>>
>>> 245: // Java code and locking state can change at any time.
>>> This code is not
>>> 246: // expected to be
On Mon, 25 Sep 2023 17:44:56 GMT, Chris Plummer wrote:
>> The SA can run concurrently with Java threads, SA code that inspects locking
>> state should be able to deal with that. On the other hand, the particular
>> code is only used in printing routine and is not expected to be precise.
>>
On Mon, 25 Sep 2023 16:16:51 GMT, Roman Kennke wrote:
> The SA can run concurrently with Java threads, SA code that inspects locking
> state should be able to deal with that. On the other hand, the particular
> code is only used in printing routine and is not expected to be precise. When
>
The SA can run concurrently with Java threads, SA code that inspects locking
state should be able to deal with that. On the other hand, the particular code
is only used in printing routine and is not expected to be precise. When
resolving an anonymous owner, we may not find one, because Java
41 matches
Mail list logo