On Tue, 11 Apr 2023 19:58:19 GMT, Roman Kennke <rken...@openjdk.org> wrote:

>> The `_base` array is only initialized to nullptr in debug builds.  I don't 
>> see a release barrier in LockStack::push between the update to _base[] and 
>> the update to _top, nor a corresponding acquire barrier when reading.  
>> Doesn't this mean it is possible to racily read an uninitialized junk oop 
>> value from _base[], especially on weak memory models?
>
> Yes. The whole LockStack is not meant to be accessed cross-thread, pretty 
> much like any thread's stack is not meant to be accessed like that (including 
> current stack-locking). So what can go wrong?
> With the new locking, we could read junk and compare it to the oop that we're 
> testing against and get a wrong result. We're not going to crash though.
> With the current stack-locking, we would fetch the stack-pointer and check if 
> that address is within the foreign thread's stack. Again, because the other 
> thread is not holding still, we might get a wrong result, but we would not 
> crash.
> So I guess we need to answer the question whether or not jmm_GetThreadInfo() 
> is ok with returning wrong result and what could be the consequences of this. 
> For example, how important is it that the info about the thread(s) is correct 
> and consistent (e.g. what happens if we report two threads both holding the 
> same lock?), etc. But I don't consider this to be part of this PR.
> 
> So my proposal is: leave that code as it is, for now (being racy when 
> inspecting foreign threads, but don't crash). Open a new issue to investigate 
> and possibly fix the problem (maybe by safepointing, maybe by handshaking if 
> that is enough, or maybe we find out we don't need to do anything). Add 
> comments in relevant places to point out the problem like you and David 
> suggested earlier. Would that be ok?

That seems fine to me, as long as we don't crash.  But my understanding is that 
Generational ZGC will crash if it sees a stale oop.  Isn't it possible that the 
racing read sees junk that looks to Generational ZGC like a stale oop?  To 
avoid this, unused slots may need to be set to nullptr even in product builds.  
But I'm not a GC expert so maybe there's no problem.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1163306288

Reply via email to