On Fri, 17 Mar 2023 06:33:43 GMT, Roman Kennke <rken...@openjdk.org> wrote:

>>> In my last changes I made a stupid mistake and don't set the condition 
>>> flags correctly to force the slow-path, on aarch64. This is only relevant 
>>> when we exceed the lock-stack capacity, that is why it's failing so rarely. 
>>> I don't see a similar problem on x86_64 - have we observed any failures on 
>>> x86_64? I pushed a fix for aarch64.
>> 
>> I noticed this too for arm; I used cmp to clear EQ but using tst seems 
>> better. I also do it inside fast_lock, to give it a defined exit state wrt 
>> EQ|NE, since it saves me from having to think about this on every call site. 
>> But at least the fail case may be fiddly without conditional execution.
>
>> > In my last changes I made a stupid mistake and don't set the condition 
>> > flags correctly to force the slow-path, on aarch64. This is only relevant 
>> > when we exceed the lock-stack capacity, that is why it's failing so 
>> > rarely. I don't see a similar problem on x86_64 - have we observed any 
>> > failures on x86_64? I pushed a fix for aarch64.
>> 
>> 
>> 
>> I noticed this too for arm; I used cmp to clear EQ but using tst seems 
>> better. I also do it inside fast_lock, to give it a defined exit state wrt 
>> EQ|NE, since it saves me from having to think about this on every call site. 
>> But at least the fail case may be fiddly without conditional execution.
> 
> Cmp(r,r) would not clear EQ, but set it. Unless you do cmp(r,0) on a non-null 
> register. Tst is better at least on x86 because it encodes smaller. *shrugs*
> 
> You can do it in the shared fast_lock() but it's really only needed in C2, 
> that's why I'm doing it there. Maybe I'm too perfectionist when it comes to 
> assembly code?

@rkennke I was not able to directly use 
'JavaThread::lock_stack_offset_offset()' in cmp since it was not encodable as 
immediate. You did not hit the same problem on aarch64, right? IIUC that was 
more out of accident, since you should have similar or the same (not sure) 
restrictions for encoding immediates. But your Thread layout is probably 
different and the offset may just happened to be encodable. If so, that would 
make you vulnerable against changes in Thread that change the offset of the 
LockStack.

Anyway, for now I solved this by using the second scratch register as 
intermediate. One more instruction though. 

I am now experimenting with my original idea of placing the Lockstack slots at 
a known aligned offset and then testing the alignment of the current 
index/pointer. This should be possible with a simple TST. Lets see how this 
goes.

-------------

PR: https://git.openjdk.org/jdk/pull/10907

Reply via email to