Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock" [v3]
On Thu, 28 Sep 2023 09:11:12 GMT, Roman Kennke wrote: >> The SA can run concurrently with Java threads, SA code that inspects locking >> state should be able to deal with that. On the other hand, the particular >> code is only used in printing routine and is not expected to be precise. >> When resolving an anonymous owner, we may not find one, because Java threads >> may already have moved on. Instead of crashing with a stacktrace, we should >> gracefully return null here. >> >> Testing: >> - [x] sun/tools/jhsdb/JStackStressTest.java >> - [x] sun/tools/jhsdb >> - [x] serviceability/sa > > Roman Kennke has updated the pull request incrementally with one additional > commit since the last revision: > > Fix comment Thanks! - PR Comment: https://git.openjdk.org/jdk/pull/15907#issuecomment-1740721744
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock" [v3]
On Thu, 28 Sep 2023 09:11:12 GMT, Roman Kennke wrote: >> The SA can run concurrently with Java threads, SA code that inspects locking >> state should be able to deal with that. On the other hand, the particular >> code is only used in printing routine and is not expected to be precise. >> When resolving an anonymous owner, we may not find one, because Java threads >> may already have moved on. Instead of crashing with a stacktrace, we should >> gracefully return null here. >> >> Testing: >> - [x] sun/tools/jhsdb/JStackStressTest.java >> - [x] sun/tools/jhsdb >> - [x] serviceability/sa > > Roman Kennke has updated the pull request incrementally with one additional > commit since the last revision: > > Fix comment Looks good. Thanks for you patience while we worked through this one. - Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/15907#pullrequestreview-1650058692
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock" [v3]
> The SA can run concurrently with Java threads, SA code that inspects locking > state should be able to deal with that. On the other hand, the particular > code is only used in printing routine and is not expected to be precise. When > resolving an anonymous owner, we may not find one, because Java threads may > already have moved on. Instead of crashing with a stacktrace, we should > gracefully return null here. > > Testing: > - [x] sun/tools/jhsdb/JStackStressTest.java > - [x] sun/tools/jhsdb > - [x] serviceability/sa Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Fix comment - Changes: - all: https://git.openjdk.org/jdk/pull/15907/files - new: https://git.openjdk.org/jdk/pull/15907/files/4ddaf577..bff3b3a3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15907&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15907&range=01-02 Stats: 3 lines in 1 file changed: 0 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/15907.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15907/head:pull/15907 PR: https://git.openjdk.org/jdk/pull/15907
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock" [v2]
On Wed, 27 Sep 2023 17:54:51 GMT, Roman Kennke wrote: >> The SA can run concurrently with Java threads, SA code that inspects locking >> state should be able to deal with that. On the other hand, the particular >> code is only used in printing routine and is not expected to be precise. >> When resolving an anonymous owner, we may not find one, because Java threads >> may already have moved on. Instead of crashing with a stacktrace, we should >> gracefully return null here. >> >> Testing: >> - [x] sun/tools/jhsdb/JStackStressTest.java >> - [x] sun/tools/jhsdb >> - [x] serviceability/sa > > Roman Kennke has updated the pull request incrementally with one additional > commit since the last revision: > > Print warning when anonymous lock cannot be found Changes requested by dholmes (Reviewer). src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Threads.java line 245: > 243: } > 244: // We should have found the owner. However, the code can > run concurrently with > 245: // Java code and locking state can change at any time. > This code is not The second sentence is not accurate as things are not running concurrently and cannot change state at any time. Suggestion: // We should have found the owner, however, as the VM could be in any state, including the middle // of performing GC, it is not always possible to do so. Just return null if we can't locate it. - PR Review: https://git.openjdk.org/jdk/pull/15907#pullrequestreview-1647736396 PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1339374587
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock" [v2]
On Wed, 27 Sep 2023 17:54:51 GMT, Roman Kennke wrote: >> The SA can run concurrently with Java threads, SA code that inspects locking >> state should be able to deal with that. On the other hand, the particular >> code is only used in printing routine and is not expected to be precise. >> When resolving an anonymous owner, we may not find one, because Java threads >> may already have moved on. Instead of crashing with a stacktrace, we should >> gracefully return null here. >> >> Testing: >> - [x] sun/tools/jhsdb/JStackStressTest.java >> - [x] sun/tools/jhsdb >> - [x] serviceability/sa > > Roman Kennke has updated the pull request incrementally with one additional > commit since the last revision: > > Print warning when anonymous lock cannot be found Looks good. Thanks! - Marked as reviewed by cjplummer (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/15907#pullrequestreview-1647405613
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock" [v2]
> The SA can run concurrently with Java threads, SA code that inspects locking > state should be able to deal with that. On the other hand, the particular > code is only used in printing routine and is not expected to be precise. When > resolving an anonymous owner, we may not find one, because Java threads may > already have moved on. Instead of crashing with a stacktrace, we should > gracefully return null here. > > Testing: > - [x] sun/tools/jhsdb/JStackStressTest.java > - [x] sun/tools/jhsdb > - [x] serviceability/sa Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Print warning when anonymous lock cannot be found - Changes: - all: https://git.openjdk.org/jdk/pull/15907/files - new: https://git.openjdk.org/jdk/pull/15907/files/18e46db6..4ddaf577 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15907&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15907&range=00-01 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/15907.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15907/head:pull/15907 PR: https://git.openjdk.org/jdk/pull/15907
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Wed, 27 Sep 2023 16:59:17 GMT, Roman Kennke wrote: >> I think given this, issuing a warning is the appropriate thing to do. The >> warning might want to mention that the failure is likely because we are in >> the middle of a GC. That way the user has some context as to why SA info is >> not reliable. > > Could you advice how such a warning could look like? Does SA use a special > stream for that? Most warnings go to System.err, but I think System.out is better in most cases so the warning appears in line with the rest of the related output. You could do something like: Warning: We failed to find a thread that owns an anonymous lock. This is likely due to the JVM currently running a GC. Locking information may not be accurate. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1338959696
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Wed, 27 Sep 2023 15:42:51 GMT, Chris Plummer wrote: >> I think you are right @plummercj - I'm adding details to the JBS issue. > > I think given this, issuing a warning is the appropriate thing to do. The > warning might want to mention that the failure is likely because we are in > the middle of a GC. That way the user has some context as to why SA info is > not reliable. Could you advice how such a warning could look like? Does SA use a special stream for that? - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1338928770
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Wed, 27 Sep 2023 04:17:57 GMT, David Holmes wrote: >> I wonder if we are in the middle of a GC, the object has been moved, and the >> lock stack object reference has been updated but the monitor reference has >> not (or vice versa). > > I think you are right @plummercj - I'm adding details to the JBS issue. I think given this, issuing a warning is the appropriate thing to do. The warning might want to mention that the failure is likely because we are in the middle of a GC. That way the user has some context as to why SA info is not reliable. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1338812930
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Wed, 27 Sep 2023 00:44:30 GMT, Chris Plummer wrote: >> I made a pass over the deadlock detection code involved. Nothing stuck out. >> There's not much in the way of SA-isms in this code. I think anyone with an >> understanding of how hotspot locking works should be able to navigate it >> pretty easily and maybe conjure a hypothesis as to why this happens as >> frequently as it does. > > I wonder if we are in the middle of a GC, the object has been moved, and the > lock stack object reference has been updated but the monitor reference has > not (or vice versa). I think you are right @plummercj - I'm adding details to the JBS issue. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1338022386
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Wed, 27 Sep 2023 00:44:30 GMT, Chris Plummer wrote: >> I made a pass over the deadlock detection code involved. Nothing stuck out. >> There's not much in the way of SA-isms in this code. I think anyone with an >> understanding of how hotspot locking works should be able to navigate it >> pretty easily and maybe conjure a hypothesis as to why this happens as >> frequently as it does. > > I wonder if we are in the middle of a GC, the object has been moved, and the > lock stack object reference has been updated but the monitor reference has > not (or vice versa). Possible I guess. I added some instrumentation to try and see what threads we have and what their lockstacks show when the failure happens. Of course the test didn't fail after I did that, so having to re-run. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1337987283
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 22:55:27 GMT, Chris Plummer wrote: >> Yeah, I also haven't seen a good explanation for why this is happening yet. >> I'll also look into SA, but I don't understand the hotspot code it is trying >> to emulate that well so not sure how useful I'll be. > > I made a pass over the deadlock detection code involved. Nothing stuck out. > There's not much in the way of SA-isms in this code. I think anyone with an > understanding of how hotspot locking works should be able to navigate it > pretty easily and maybe conjure a hypothesis as to why this happens as > frequently as it does. I wonder if we are in the middle of a GC, the object has been moved, and the lock stack object reference has been updated but the monitor reference has not (or vice versa). - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1337908791
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 22:10:15 GMT, Chris Plummer wrote: >> Switching the order did not fix the problem. >> >> Even if we end up resolving that this is just something that can happen when >> SA inspects things, I'd prefer to understand how this can arise. I guess I >> need to look into how the SA tries to find the owner. > > Yeah, I also haven't seen a good explanation for why this is happening yet. > I'll also look into SA, but I don't understand the hotspot code it is trying > to emulate that well so not sure how useful I'll be. I made a pass over the deadlock detection code involved. Nothing stuck out. There's not much in the way of SA-isms in this code. I think anyone with an understanding of how hotspot locking works should be able to navigate it pretty easily and maybe conjure a hypothesis as to why this happens as frequently as it does. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1337857814
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 20:41:33 GMT, David Holmes wrote: >> I'm not seeing that this code has much impact on the failure rate. I've >> tried changing the order and also have tried adding a nanosleep(). Always >> fails at about the same rate. > > Switching the order did not fix the problem. > > Even if we end up resolving that this is just something that can happen when > SA inspects things, I'd prefer to understand how this can arise. I guess I > need to look into how the SA tries to find the owner. Yeah, I also haven't seen a good explanation for why this is happening yet. I'll also look into SA, but I don't understand the hotspot code it is trying to emulate that well so not sure how useful I'll be. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1337832556
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 20:20:29 GMT, Chris Plummer wrote: >> I managed to get it to fail 4/100 times in our CI so armed with that I will >> try swapping the order. Though I have to admit I tend to agree with Chris >> that the code in question seems unlikely to be executed such that this >> failure mode is so repeatable. > > I'm not seeing that this code has much impact on the failure rate. I've tried > changing the order and also have tried adding a nanosleep(). Always fails at > about the same rate. Switching the order did not fix the problem. Even if we end up resolving that this is just something that can happen when SA inspects things, I'd prefer to understand how this can arise. I guess I need to look into how the SA tries to find the owner. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1337758338
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 12:02:52 GMT, David Holmes wrote: >> Ok! >> So we would have to swap the popping and state-update at least. Unless we >> insert members between them, the updates would still be observable in any >> order, I suppose, but inserting membars just to make SA happy seems a bit >> much. >> Can one of you try swapping and see if that makes the problem appear less >> often? I could not yet reproduce the bug at all. > > I managed to get it to fail 4/100 times in our CI so armed with that I will > try swapping the order. Though I have to admit I tend to agree with Chris > that the code in question seems unlikely to be executed such that this > failure mode is so repeatable. I'm not seeing that this code has much impact on the failure rate. I've tried changing the order and also have tried adding a nanosleep(). Always fails at about the same rate. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1337736512
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 05:04:57 GMT, Roman Kennke wrote: >> Atomicity means guaranteeing that a 2 or more operations are executed as if >> they are single operation from the viewpoint of other threads. There's no >> such concept with SA. The state is frozen, so SA can look at anything >> without worrying about the state changing as it looks at it. When I say the >> JVM is not in a consistent state, one example of that is when the JVM is in >> the middle of an atomic operation when SA looks at it. Another example is >> when SA looks at data that is normally only safe to look at when the VM is >> safe pointing. The code that David pointed out is another example. It's >> usually protected from observation by other threads by safe pointing, but SA >> can see the data when it's in the middle of executing this code. Order may >> or may not matter. Clearly the current order is problematic for SA. >> Switching the order may or may not be a problem. It depends on whether SA >> cares that a monitor on the lock_stack of one thread is shown to be owned by >> another thread. If that does br eak SA, I think SA might be able to recognize the situation and work around it, or just print a warning. >> >>> However, even that problem would be pre-existing (monitors not agreeing >>> with each other or with threads), and do we currently print a warning about >>> it? >> >> Likely not because it is likely not detected and doesn't result in some sort >> of failure, and probably whoever wrote the original code hadn't considered >> the possibility. Most of this type of SA error handling code (that result in >> a warning) is the results of eventually someone noticing a failure of some >> sort. If no one ever notices a failure, then the error handling is never >> worked on. > > Ok! > So we would have to swap the popping and state-update at least. Unless we > insert members between them, the updates would still be observable in any > order, I suppose, but inserting membars just to make SA happy seems a bit > much. > Can one of you try swapping and see if that makes the problem appear less > often? I could not yet reproduce the bug at all. I managed to get it to fail 4/100 times in our CI so armed with that I will try swapping the order. Though I have to admit I tend to agree with Chris that the code in question seems unlikely to be executed such that this failure mode is so repeatable. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1337100891
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 05:03:13 GMT, Chris Plummer wrote: >> @rkennke the "snapshot" is atomic - the target VM is suspended. > > Atomicity means guaranteeing that a 2 or more operations are executed as if > they are single operation from the viewpoint of other threads. There's no > such concept with SA. The state is frozen, so SA can look at anything without > worrying about the state changing as it looks at it. When I say the JVM is > not in a consistent state, one example of that is when the JVM is in the > middle of an atomic operation when SA looks at it. Another example is when SA > looks at data that is normally only safe to look at when the VM is safe > pointing. The code that David pointed out is another example. It's usually > protected from observation by other threads by safe pointing, but SA can see > the data when it's in the middle of executing this code. Order may or may not > matter. Clearly the current order is problematic for SA. Switching the order > may or may not be a problem. It depends on whether SA cares that a monitor on > the lock_stack of one thread is shown to be owned by another thread. If that > does bre ak SA, I think SA might be able to recognize the situation and work around it, or just print a warning. > >> However, even that problem would be pre-existing (monitors not agreeing with >> each other or with threads), and do we currently print a warning about it? > > Likely not because it is likely not detected and doesn't result in some sort > of failure, and probably whoever wrote the original code hadn't considered > the possibility. Most of this type of SA error handling code (that result in > a warning) is the results of eventually someone noticing a failure of some > sort. If no one ever notices a failure, then the error handling is never > worked on. Ok! So we would have to swap the popping and state-update at least. Unless we insert members between them, the updates would still be observable in any order, I suppose, but inserting membars just to make SA happy seems a bit much. Can one of you try swapping and see if that makes the problem appear less often? I could not yet reproduce the bug at all. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336628312
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 04:52:00 GMT, David Holmes wrote: >> Ok, so we are not (usually) at a safepoint, but no threads are moving. But >> the snapshot can not be taken atomically. Which means that the >> anonymous-state in the ObjectMonitor and the state of the lock-stack are not >> necessarily consistent, not even if we swap the popping and owner-update, >> right? This is what you mean by inconsistent state, right? Or is there a way >> to get an atomic snapshot of threads *and* monitors? >> >> So yeah, we could print a warning when that happens. >> >> However, even that problem would be pre-existing (monitors not agreeing with >> each other or with threads), and do we currently print a warning about it? > > @rkennke the "snapshot" is atomic - the target VM is suspended. Atomicity means guaranteeing that a 2 or more operations are executed as if they are single operation from the viewpoint of other threads. There's no such concept with SA. The state is frozen, so SA can look at anything without worrying about the state changing as it looks at it. When I say the JVM is not in a consistent state, one example of that is when the JVM is in the middle of an atomic operation when SA looks at it. Another example is when SA looks at data that is normally only safe to look at when the VM is safe pointing. The code that David pointed out is another example. It's usually protected from observation by other threads by safe pointing, but SA can see the data when it's in the middle of executing this code. Order may or may not matter. Clearly the current order is problematic for SA. Switching the order may or may not be a problem. It depends on whether SA cares that a monitor on the lock_stack of one thread is shown to be owned by another thread. If that does break SA, I think SA might be able to recognize the situation and work around it, or just print a warning. > However, even that problem would be pre-existing (monitors not agreeing with > each other or with threads), and do we currently print a warning about it? Likely not because it is likely not detected and doesn't result in some sort of failure, and probably whoever wrote the original code hadn't considered the possibility. Most of this type of SA error handling code (that result in a warning) is the results of eventually someone noticing a failure of some sort. If no one ever notices a failure, then the error handling is never worked on. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336627387
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 04:25:40 GMT, Roman Kennke wrote: >> ...although that is a pretty small window and we are seeing this bug quite a >> lot. Seems if the code in question was the issue, it would take 1000s of >> iterations to reproduce. > > Ok, so we are not (usually) at a safepoint, but no threads are moving. But > the snapshot can not be taken atomically. Which means that the > anonymous-state in the ObjectMonitor and the state of the lock-stack are not > necessarily consistent, not even if we swap the popping and owner-update, > right? This is what you mean by inconsistent state, right? Or is there a way > to get an atomic snapshot of threads *and* monitors? > > So yeah, we could print a warning when that happens. > > However, even that problem would be pre-existing (monitors not agreeing with > each other or with threads), and do we currently print a warning about it? @rkennke the "snapshot" is atomic - the target VM is suspended. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336621849
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 03:07:14 GMT, Chris Plummer wrote: >> And I shouldn't have said "cache". I was confusing this PR with another >> dealing with the Monitor Cache. > > ...although that is a pretty small window and we are seeing this bug quite a > lot. Seems if the code in question was the issue, it would take 1000s of > iterations to reproduce. Ok, so we are not (usually) at a safepoint, but no threads are moving. But the snapshot can not be taken atomically. Which means that the anonymous-state in the ObjectMonitor and the state of the lock-stack are not necessarily consistent, not even if we swap the popping and owner-update, right? This is what you mean by inconsistent state, right? Or is there a way to get an atomic snapshot of threads *and* monitors? So yeah, we could print a warning when that happens. However, even that problem would be pre-existing (monitors not agreeing with each other or with threads), and do we currently print a warning about it? - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336609109
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 02:48:17 GMT, Chris Plummer wrote: >> Looks like you answered my question while I was asking it. > > And I shouldn't have said "cache". I was confusing this PR with another > dealing with the Monitor Cache. ...although that is a pretty small window and we are seeing this bug quite a lot. Seems if the code in question was the issue, it would take 1000s of iterations to reproduce. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336576620
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 02:44:59 GMT, Chris Plummer wrote: >> Maybe SA is seeing a monitor in the monitor cache that it is only seeing >> because the monitor cache is currently inconsistent (due to SA not safe >> pointing). So the question is whether the monitor cache can be in a state >> where it still contains a reference to an anonymous monitor, but no thread >> actually owns the monitor (and the anonymous monitor is about to be purged >> from the cache). > > Looks like you answered my question while I was asking it. And I shouldn't have said "cache". I was confusing this PR with another dealing with the Monitor Cache. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336568286
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 02:42:56 GMT, Chris Plummer wrote: >> Correction, there is one code path where we pop the lockstack first and then >> update the owner: >> >> ObjectMonitor* monitor = inflate(current, object, inflate_cause_vm_internal); >> if (LockingMode == LM_LIGHTWEIGHT && monitor->is_owner_anonymous()) { >> // It must be owned by us. Pop lock object from lock stack. >> LockStack& lock_stack = current->lock_stack(); >> oop popped = lock_stack.pop(); >> assert(popped == object, "must be owned by this thread"); >> monitor->set_owner_from_anonymous(current); >> } >> monitor->exit(current); >> >> we should probably do this the other way around. > > Maybe SA is seeing a monitor in the monitor cache that it is only seeing > because the monitor cache is currently inconsistent (due to SA not safe > pointing). So the question is whether the monitor cache can be in a state > where it still contains a reference to an anonymous monitor, but no thread > actually owns the monitor (and the anonymous monitor is about to be purged > from the cache). Looks like you answered my question while I was asking it. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336566708
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 02:34:08 GMT, David Holmes wrote: >> AFAICS we update the owner to the real thread before we remove the object >> from the lock stack. So if we see the object monitor is anonymously owned >> then we should find the monitor object in a thread's lockstack. > > Correction, there is one code path where we pop the lockstack first and then > update the owner: > > ObjectMonitor* monitor = inflate(current, object, inflate_cause_vm_internal); > if (LockingMode == LM_LIGHTWEIGHT && monitor->is_owner_anonymous()) { > // It must be owned by us. Pop lock object from lock stack. > LockStack& lock_stack = current->lock_stack(); > oop popped = lock_stack.pop(); > assert(popped == object, "must be owned by this thread"); > monitor->set_owner_from_anonymous(current); > } > monitor->exit(current); > > we should probably do this the other way around. Maybe SA is seeing a monitor in the monitor cache that it is only seeing because the monitor cache is currently inconsistent (due to SA not safe pointing). So the question is whether the monitor cache can be in a state where it still contains a reference to an anonymous monitor, but no thread actually owns the monitor (and the anonymous monitor is about to be purged from the cache). - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336565875
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 02:30:25 GMT, David Holmes wrote: >>> Ah! I guess we get used to talking about "at a safepoint" when we really >>> mean "at a fixed point in time". So the VM is not necessarily at a >>> safepoint, but everything is fixed. So invariants may not hold, but the >>> state cannot change. >> >> Correct! >> >>> So in the current context the anonymous owner should be found ... >> >> That's what I concluded based on Roman's description of how things work. >> >>> I guess the question to be answered is how the code tries to find an >>> anonymous owner? I'm not sure how you can find it. >> >> I'm not sure yet. I'll need to have a look. > > AFAICS we update the owner to the real thread before we remove the object > from the lock stack. So if we see the object monitor is anonymously owned > then we should find the monitor object in a thread's lockstack. Correction, there is one code path where we pop the lockstack first and then update the owner: ObjectMonitor* monitor = inflate(current, object, inflate_cause_vm_internal); if (LockingMode == LM_LIGHTWEIGHT && monitor->is_owner_anonymous()) { // It must be owned by us. Pop lock object from lock stack. LockStack& lock_stack = current->lock_stack(); oop popped = lock_stack.pop(); assert(popped == object, "must be owned by this thread"); monitor->set_owner_from_anonymous(current); } monitor->exit(current); we should probably do this the other way around. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336558790
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 02:27:39 GMT, Chris Plummer wrote: >> Ah! I guess we get used to talking about "at a safepoint" when we really >> mean "at a fixed point in time". So the VM is not necessarily at a >> safepoint, but everything is fixed. So invariants may not hold, but the >> state cannot change. So in the current context the anonymous owner should be >> found ... I guess the question to be answered is how the code tries to find >> an anonymous owner? I'm not sure how you can find it. > >> Ah! I guess we get used to talking about "at a safepoint" when we really >> mean "at a fixed point in time". So the VM is not necessarily at a >> safepoint, but everything is fixed. So invariants may not hold, but the >> state cannot change. > > Correct! > >> So in the current context the anonymous owner should be found ... > > That's what I concluded based on Roman's description of how things work. > >> I guess the question to be answered is how the code tries to find an >> anonymous owner? I'm not sure how you can find it. > > I'm not sure yet. I'll need to have a look. AFAICS we update the owner to the real thread before we remove the object from the lock stack. So if we see the object monitor is anonymously owned then we should find the monitor object in a thread's lockstack. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336557193
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 02:14:40 GMT, David Holmes wrote: > Ah! I guess we get used to talking about "at a safepoint" when we really mean > "at a fixed point in time". So the VM is not necessarily at a safepoint, but > everything is fixed. So invariants may not hold, but the state cannot change. Correct! > So in the current context the anonymous owner should be found ... That's what I concluded based on Roman's description of how things work. > I guess the question to be answered is how the code tries to find an > anonymous owner? I'm not sure how you can find it. I'm not sure yet. I'll need to have a look. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336555978
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 01:41:38 GMT, Chris Plummer wrote: >> Correction to above: >> >> threads = VM.getVM().getThreads(); >> heap = VM.getVM().getObjectHeap(); >> createThreadTable(); // calls getThreads() again >> >> The VM caches the set of threads ie the snapshot, so three sets are not >> possible. But AFAICS the thread snapshot and heap snapshot are not atomic, >> so the set of threads could have changed, and the state of threads also. > >> Surely jstack thread dump and deadlock check _has_ to run at a safepoint? > > The reality is that JVM is rarely at a safepoint (unless perhaps when all > threads are blocked), and therefore jstack rarely is done at a safepoint. > This is the world SA lives in. The understanding is that SA debugging > features may give inaccurate info, or possibly not work at all. > >> If the SA is working from a snapshot then it has to create that snapshot >> atomically. It can't snapshot the threads, then snapshot the heap. > > It doesn't snapshot. It does a debugger attach, which suspends the process. > It then starts to read in pages from the process to do things like jstack. > The process state does not change while it does this. It's not really any > different than gdb in this regard. gdb does not let the process state change > unless you use a command that allows execution such "step" or "continue". As > long as you avoid the commands that allow process execution, you can debug > without worrying about the process state changing. However, even GDB > debugging has the same issues with JVM safe pointing (or lack thereof). If > the JVM crashes and you start looking at certain data, it might be > inconsistent. There's no way to force a safepoint once there is a crash. > >> @plummercj the SA code sees T2 is pending on the monitor for object O, which >> is locked anonymously but actually by T1. The SA code then goes hunting for >> the owner. But the VM is not standing still... > > The VM is standing still. There is no process execution while all of this > happens. jstack and the deadlock detection are fully executed while the JVM > process is halted. There is no JVM state change during any of this. Ah! I guess we get used to talking about "at a safepoint" when we really mean "at a fixed point in time". So the VM is not necessarily at a safepoint, but everything is fixed. So invariants may not hold, but the state cannot change. So in the current context the anonymous owner should be found ... I guess the question to be answered is how the code tries to find an anonymous owner? I'm not sure how you can find it. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336549922
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 01:32:48 GMT, David Holmes wrote: >> If the SA is working from a snapshot then it has to create that snapshot >> atomically. It can't snapshot the threads, then snapshot the heap. > > Correction to above: > > threads = VM.getVM().getThreads(); > heap = VM.getVM().getObjectHeap(); > createThreadTable(); // calls getThreads() again > > The VM caches the set of threads ie the snapshot, so three sets are not > possible. But AFAICS the thread snapshot and heap snapshot are not atomic, so > the set of threads could have changed, and the state of threads also. > Surely jstack thread dump and deadlock check _has_ to run at a safepoint? The reality is that JVM is rarely at a safepoint (unless perhaps when all threads are blocked), and therefore jstack rarely is done at a safepoint. This is the world SA lives in. The understanding is that SA debugging features may give inaccurate info, or possibly not work at all. > If the SA is working from a snapshot then it has to create that snapshot > atomically. It can't snapshot the threads, then snapshot the heap. It doesn't snapshot. It does a debugger attach, which suspends the process. It then starts to read in pages from the process to do things like jstack. The process state does not change while it does this. It's not really any different than gdb in this regard. gdb does not let the process state change unless you use a command that allows execution such "step" or "continue". As long as you avoid the commands that allow process execution, you can debug without worrying about the process state changing. However, even GDB debugging has the same issues with JVM safe pointing (or lack thereof). If the JVM crashes and you start looking at certain data, it might be inconsistent. There's no way to force a safepoint once there is a crash. > @plummercj the SA code sees T2 is pending on the monitor for object O, which > is locked anonymously but actually by T1. The SA code then goes hunting for > the owner. But the VM is not standing still... The VM is standing still. There is no process execution while all of this happens. jstack and the deadlock detection are fully executed while the JVM process is halted. There is no JVM state change during any of this. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336534342
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 01:25:59 GMT, David Holmes wrote: >> To expand if deadlock detection does not run at a safepoint then this logic >> is non-atomic and completely broken: >> >> threads = VM.getVM().getThreads(); >> heap = VM.getVM().getObjectHeap(); >> createThreadTable(); // calls getThreads() again >> >> Without a stable VM you could have three almost completely different set of >> threads when each of these statements run! > > If the SA is working from a snapshot then it has to create that snapshot > atomically. It can't snapshot the threads, then snapshot the heap. Correction to above: threads = VM.getVM().getThreads(); heap = VM.getVM().getObjectHeap(); createThreadTable(); // calls getThreads() again The VM caches the set of threads ie the snapshot, so three sets are not possible. But AFAICS the thread snapshot and heap snapshot are not atomic, so the set of threads could have changed, and the state of threads also. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336530626
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 01:16:11 GMT, David Holmes wrote: >> Edit: I missed the intermediate comments above where Roman explained. >> >> @plummercj the SA code sees T2 is pending on the monitor for object O, which >> is locked anonymously but actually by T1. The SA code then goes hunting for >> the owner. But the VM is not standing still. T1 continued execution and went >> to release the lock, it sees it has been inflated and sets itself as the >> true owner, then releases the lock thus clearing the owner and potentially >> unblocking T2. T2 could now proceed to acquire the lock. But the SA sees >> none of this because it has a stale view of things. > > To expand if deadlock detection does not run at a safepoint then this logic > is non-atomic and completely broken: > > threads = VM.getVM().getThreads(); > heap = VM.getVM().getObjectHeap(); > createThreadTable(); // calls getThreads() again > > Without a stable VM you could have three almost completely different set of > threads when each of these statements run! If the SA is working from a snapshot then it has to create that snapshot atomically. It can't snapshot the threads, then snapshot the heap. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336527632
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 01:09:47 GMT, David Holmes wrote: >> Surely jstack thread dump and deadlock check _has_ to run at a safepoint? > > Also isn't "anonymous locking" an intermediate step in monitor inflation? The > inflated monitor becomes anonymously owned until the real owner sees it has > been inflated and updates the ownership accordingly. To expand if deadlock detection does not run at a safepoint then this logic is non-atomic and completely broken: threads = VM.getVM().getThreads(); heap = VM.getVM().getObjectHeap(); createThreadTable(); // calls getThreads() again Without a stable VM you could have three almost completely different set of threads when each of these statements run! - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336523076
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Tue, 26 Sep 2023 01:07:32 GMT, David Holmes wrote: >>> The specific race here is that SA sees an anonymously locked ObjectMonitor >>> and tries to find the owning thread, and fails, presumably because that >>> thread has moved on and unlocked the object in the meantime. >> >> But you said that when T1 starts the process of unlocking O, it sees the >> anonymous owner and it sets T1 as the owner before unlocking and handing off >> to T2. So I don't see how T1 could have "moved on". If T2 is blocked on O >> and O an anonymous owner, then T1 must still own it. >> >> SA always deals with a snapshot of the JVM state. Once SA attaches, no >> threads are running. So if O has an anonymous owner, you don't have to worry >> about the owner releasing the monitor while you are looking for the owner. >> >> The question then becomes is there a short window while releasing the >> monitor that O still shows up as having an anonymous owner, but T1 has >> already released it. From your description of the order of things, this >> doesn't seem possible. > > Surely jstack thread dump and deadlock check _has_ to run at a safepoint? Also isn't "anonymous locking" an intermediate step in monitor inflation? The inflated monitor becomes anonymously owned until the real owner sees it has been inflated and updates the ownership accordingly. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336520272
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Mon, 25 Sep 2023 20:52:28 GMT, Chris Plummer wrote: >> With the new lightweight locking, when a thread T1 holds a lightweight lock >> on object O, and another thread T2 comes in and also wants to acquire that >> lock, it inflates the lock to a monitor. And since it cannot determine which >> threads currently locks O, it installs a special owner - which we call >> anonymous owner - in the ObjectMonitor, it basically means 'we don't know'. >> As soon as T1 unlocks O, it observes that the lock is now inflated and set >> to anonymous owner, installs itself as the proper owner of the ObjectMonitor >> and proceeds to unlock the ObjectMonitor - thereby handing over ownership to >> T2. >> >> The specific race here is that SA sees an anonymously locked ObjectMonitor >> and tries to find the owning thread, and fails, presumably because that >> thread has moved on and unlocked the object in the meantime. >> >> However, the race is more general, and pre-(lightweight-locking-)existing: >> if SA observes a locked object, or a thread which holds a particular lock, >> by the time it is reporting, this information may (likely) be wrong because >> the thread has, in the meantime, moved on. This in itself may not be a big >> problem. What may be worse is that, if SA prints stacks of all threads and >> also prints locking information, the overall information may be >> *inconsistent*. For example, it may report that two threads hold the same >> object. >> Also, I am not sure how deep SA goes. Telling from the stack-trace in the >> original report, it looks like there is a DeadlockDetector involved. If >> threads don't hold still while we try to detect deadlocks, we may report >> false positives. (We would perhaps not report false negatives, because >> deadlocked thread by definition don't move ;-) ) >> >> I don't know enough about SA and how it deals with locking to judge how much >> of a problem all that is, but I think the pre-existing code (before LW >> locking) did not warn about locking raciness, and I don't really see that we >> should do this for the anonymous case. > >> The specific race here is that SA sees an anonymously locked ObjectMonitor >> and tries to find the owning thread, and fails, presumably because that >> thread has moved on and unlocked the object in the meantime. > > But you said that when T1 starts the process of unlocking O, it sees the > anonymous owner and it sets T1 as the owner before unlocking and handing off > to T2. So I don't see how T1 could have "moved on". If T2 is blocked on O and > O an anonymous owner, then T1 must still own it. > > SA always deals with a snapshot of the JVM state. Once SA attaches, no > threads are running. So if O has an anonymous owner, you don't have to worry > about the owner releasing the monitor while you are looking for the owner. > > The question then becomes is there a short window while releasing the monitor > that O still shows up as having an anonymous owner, but T1 has already > released it. From your description of the order of things, this doesn't seem > possible. Surely jstack thread dump and deadlock check _has_ to run at a safepoint? - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336519288
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Mon, 25 Sep 2023 19:59:32 GMT, Roman Kennke wrote: > The specific race here is that SA sees an anonymously locked ObjectMonitor > and tries to find the owning thread, and fails, presumably because that > thread has moved on and unlocked the object in the meantime. But you said that when T1 starts the process of unlocking O, it sees the anonymous owner and it sets T1 as the owner before unlocking and handing off to T2. So I don't see how T1 could have "moved on". If T2 is blocked on O and O an anonymous owner, then T1 must still own it. SA always deals with a snapshot of the JVM state. Once SA attaches, no threads are running. So if O has an anonymous owner, you don't have to worry about the owner releasing the monitor while you are looking for the owner. The question then becomes is there a short window while releasing the monitor that O still shows up as having an anonymous owner, but T1 has already released it. From your description of the order of things, this doesn't seem possible. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336377708
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Mon, 25 Sep 2023 19:18:38 GMT, Chris Plummer wrote: >> Yes. But unless we run this at a safepoint, there is no consistent state >> when we race with locking. I don't think we want to spam users with warnings >> whenever we run into anonymous owners (which is not too infrequent). >> Existing (pre-lightweight-) locking doesn't print warnings either, when a >> thread that's expected to lock on an object has moved on or otherwise >> encounters racy state. > > I need a better understanding of anonymous owners and the race you think is > going on here. 99+% of the time if we are not at a safepoint, pretty much > everything still works fine with SA. Safe points guarantee safe access, but > not being at a safepoint does not guarantee there will be an issue for SA. It > usually still works, and we try to report when it doesn't. With the new lightweight locking, when a thread T1 holds a lightweight lock on object O, and another thread T2 comes in and also wants to acquire that lock, it inflates the lock to a monitor. And since it cannot determine which threads currently locks O, it installs a special owner - which we call anonymous owner - in the ObjectMonitor, it basically means 'we don't know'. As soon as T1 unlocks O, it observes that the lock is now inflated and set to anonymous owner, installs itself as the proper owner of the ObjectMonitor and proceeds to unlock the ObjectMonitor - thereby handing over ownership to T2. The specific race here is that SA sees an anonymously locked ObjectMonitor and tries to find the owning thread, and fails, presumably because that thread has moved on and unlocked the object in the meantime. However, the race is more general, and pre-(lightweight-locking-)existing: if SA observes a locked object, or a thread which holds a particular lock, by the time it is reporting, this information may (likely) be wrong because the thread has, in the meantime, moved on. This in itself may not be a big problem. What may be worse is that, if SA prints stacks of all threads and also prints locking information, the overall information may be *inconsistent*. For example, it may report that two threads hold the same object. Also, I am not sure how deep SA goes. Telling from the stack-trace in the original report, it looks like there is a DeadlockDetector involved. If threads don't hold still while we try to detect deadlocks, we may report false positives. (We would perhaps not report false negatives, because deadlocked thread by definition don't move ;-) ) I don't know enough about SA and how it deals with locking to judge how much of a problem all that is, but I think the pre-existing code (before LW locking) did not warn about locking raciness, and I don't really see that we should do this for the anonymous case. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336332299
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Mon, 25 Sep 2023 18:32:19 GMT, Roman Kennke wrote: >> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Threads.java >> line 247: >> >>> 245: // Java code and locking state can change at any time. >>> This code is not >>> 246: // expected to be precise, so we return null here. >>> 247: return null; >> >> If the JVM is in a consistent state, do we expect that this should not >> happen? If so, then a warning message would be the usual SA approach. >> Basically let the user know we detected an issue rather than just quietly >> succeeding. > > Yes. But unless we run this at a safepoint, there is no consistent state when > we race with locking. I don't think we want to spam users with warnings > whenever we run into anonymous owners (which is not too infrequent). Existing > (pre-lightweight-) locking doesn't print warnings either, when a thread > that's expected to lock on an object has moved on or otherwise encounters > racy state. I need a better understanding of anonymous owners and the race you think is going on here. 99+% of the time if we are not at a safepoint, pretty much everything still works fine with SA. Safe points guarantee safe access, but not being at a safepoint does not guarantee there will be an issue for SA. It usually still works, and we try to report when it doesn't. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336297000
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Mon, 25 Sep 2023 17:44:56 GMT, Chris Plummer wrote: >> The SA can run concurrently with Java threads, SA code that inspects locking >> state should be able to deal with that. On the other hand, the particular >> code is only used in printing routine and is not expected to be precise. >> When resolving an anonymous owner, we may not find one, because Java threads >> may already have moved on. Instead of crashing with a stacktrace, we should >> gracefully return null here. >> >> Testing: >> - [x] sun/tools/jhsdb/JStackStressTest.java >> - [x] sun/tools/jhsdb >> - [x] serviceability/sa > > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Threads.java line > 247: > >> 245: // Java code and locking state can change at any time. >> This code is not >> 246: // expected to be precise, so we return null here. >> 247: return null; > > If the JVM is in a consistent state, do we expect that this should not > happen? If so, then a warning message would be the usual SA approach. > Basically let the user know we detected an issue rather than just quietly > succeeding. Yes. But unless we run this at a safepoint, there is no consistent state when we race with locking. I don't think we want to spam users with warnings whenever we run into anonymous owners (which is not too infrequent). Existing (pre-lightweight-) locking doesn't print warnings either, when a thread that's expected to lock on an object has moved on or otherwise encounters racy state. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336253796
Re: RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
On Mon, 25 Sep 2023 16:16:51 GMT, Roman Kennke wrote: > The SA can run concurrently with Java threads, SA code that inspects locking > state should be able to deal with that. On the other hand, the particular > code is only used in printing routine and is not expected to be precise. When > resolving an anonymous owner, we may not find one, because Java threads may > already have moved on. Instead of crashing with a stacktrace, we should > gracefully return null here. > > Testing: > - [x] sun/tools/jhsdb/JStackStressTest.java > - [x] sun/tools/jhsdb > - [x] serviceability/sa src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Threads.java line 247: > 245: // Java code and locking state can change at any time. > This code is not > 246: // expected to be precise, so we return null here. > 247: return null; If the JVM is in a consistent state, do we expect that this should not happen? If so, then a warning message would be the usual SA approach. Basically let the user know we detected an issue rather than just quietly succeeding. - PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1336208858
RFR: 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock"
The SA can run concurrently with Java threads, SA code that inspects locking state should be able to deal with that. On the other hand, the particular code is only used in printing routine and is not expected to be precise. When resolving an anonymous owner, we may not find one, because Java threads may already have moved on. Instead of crashing with a stacktrace, we should gracefully return null here. Testing: - [x] sun/tools/jhsdb/JStackStressTest.java - [x] sun/tools/jhsdb - [x] serviceability/sa - Commit messages: - 8316401: sun/tools/jhsdb/JStackStressTest.java failed with "InternalError: We should have found a thread that owns the anonymous lock" Changes: https://git.openjdk.org/jdk/pull/15907/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15907&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8316401 Stats: 4 lines in 1 file changed: 3 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/15907.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15907/head:pull/15907 PR: https://git.openjdk.org/jdk/pull/15907