On Tue, 26 Sep 2023 05:04:57 GMT, Roman Kennke <rken...@openjdk.org> wrote:
>> Atomicity means guaranteeing that a 2 or more operations are executed as if >> they are single operation from the viewpoint of other threads. There's no >> such concept with SA. The state is frozen, so SA can look at anything >> without worrying about the state changing as it looks at it. When I say the >> JVM is not in a consistent state, one example of that is when the JVM is in >> the middle of an atomic operation when SA looks at it. Another example is >> when SA looks at data that is normally only safe to look at when the VM is >> safe pointing. The code that David pointed out is another example. It's >> usually protected from observation by other threads by safe pointing, but SA >> can see the data when it's in the middle of executing this code. Order may >> or may not matter. Clearly the current order is problematic for SA. >> Switching the order may or may not be a problem. It depends on whether SA >> cares that a monitor on the lock_stack of one thread is shown to be owned by >> another thread. If that does br eak SA, I think SA might be able to recognize the situation and work around it, or just print a warning. >> >>> However, even that problem would be pre-existing (monitors not agreeing >>> with each other or with threads), and do we currently print a warning about >>> it? >> >> Likely not because it is likely not detected and doesn't result in some sort >> of failure, and probably whoever wrote the original code hadn't considered >> the possibility. Most of this type of SA error handling code (that result in >> a warning) is the results of eventually someone noticing a failure of some >> sort. If no one ever notices a failure, then the error handling is never >> worked on. > > Ok! > So we would have to swap the popping and state-update at least. Unless we > insert members between them, the updates would still be observable in any > order, I suppose, but inserting membars just to make SA happy seems a bit > much. > Can one of you try swapping and see if that makes the problem appear less > often? I could not yet reproduce the bug at all. I managed to get it to fail 4/100 times in our CI so armed with that I will try swapping the order. Though I have to admit I tend to agree with Chris that the code in question seems unlikely to be executed such that this failure mode is so repeatable. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15907#discussion_r1337100891