On Tue, 24 Sep 2024 17:15:21 GMT, Aleksey Shipilev <sh...@openjdk.org> wrote:

>> When investigating an intermittent NPE with an Oracle internal test on 
>> AArch64, I noticed that the NPE is actually a SIGSEGV in the code emitted by 
>> `MethodHandles::jump_to_lambda_form` when trying to load the 
>> `MemberName::method` field because the `MemberName` we retrieved from 
>> `LambdaForm::vmentry` is null:
>> 
>> https://github.com/openjdk/jdk/blob/f0374a0bc181d0f2a8c0aa9aa032b07998ffaf60/src/hotspot/cpu/aarch64/methodHandles_aarch64.cpp#L141-L143
>> 
>> The JVM translates SIGSEGVs happening in method handle intrinsics to NPEs, 
>> assuming that they are implicit NPEs due to a null receiver:
>> https://github.com/openjdk/jdk/blob/f0374a0bc181d0f2a8c0aa9aa032b07998ffaf60/src/hotspot/share/runtime/sharedRuntime.cpp#L979-L982
>> 
>> After digging around in the MethodHandle implementation that I'm not too 
>> familiar with, I found this suspicious code in `MethodHandle::updateForm`:
>> https://github.com/openjdk/jdk/blob/36314a90c15e2ab2a9b32c2e471655c1b07d452c/src/java.base/share/classes/java/lang/invoke/MethodHandle.java#L1881-L1883
>> 
>> The `Unsafe.fullFence` was added by 
>> [JDK-8059877](https://bugs.openjdk.org/browse/JDK-8059877) and replaced a 
>> `// ISSUE: Should we have a memory fence here?` 
>> [comment](https://github.com/openjdk/jdk/commit/224c42ee4d4c3027d1f8f0d8b7ecf6646e9418c3#diff-5a4b2f817a4273eacf07f3ee24782c58c8ff474c6d790f72298e906837c5543aL1442).
>>  If the intention was to [prevent publishing a partially initialized 
>> object](https://wiki.sei.cmu.edu/confluence/display/java/TSM03-J.+Do+not+publish+partially+initialized+objects),
>>  I think it was added to the wrong place (paging @iwanowww).
>> 
>> In the failing scenario, we concurrently trigger LambdaForm customization 
>> for a VarHandle invoker:
>> 
>>      at 
>> java.base/java.lang.invoke.MethodHandle.updateForm(MethodHandle.java:1883)
>>      at 
>> java.base/java.lang.invoke.MethodHandle.customize(MethodHandle.java:1856)
>>      at 
>> java.base/java.lang.invoke.MethodHandle.maybeCustomize(MethodHandle.java:1846)
>>      at java.base/java.lang.invoke.Invokers.maybeCustomize(Invokers.java:632)
>>      at 
>> java.base/java.lang.invoke.Invokers.checkCustomized(Invokers.java:626)
>> 
>> The problem is that another thread can observe `newForm` which via the 
>> `MethodHandle::form` field before the store to `vmentry` in `prepare()` (see 
>> [here](https://github.com/openjdk/jdk/blob/36314a90c15e2ab2a9b32c2e471655c1b07d452c/src/java.base/share/classes/java/lang/invoke/LambdaForm.java#L821))
>>  is visible and therefore observe null. We need a StoreStore barrier to 
>> prevent store...
>
> src/java.base/share/classes/java/lang/invoke/MethodHandle.java line 1882:
> 
>> 1880:                     assert (newForm.customized == null || 
>> newForm.customized == this);
>> 1881:                     newForm.prepare(); // as in MethodHandle.<init>
>> 1882:                     UNSAFE.storeStoreFence(); // properly publish 
>> newForm
> 
> I agree `fullFence` was misplaced. Not sure if `storeStore` is sufficient 
> here, if we exposed the old instance somehow. It is safer to use release. And 
> I think an even cleaner way would be to drop the fence and call 
> `Unsafe.putReferenceRelease` below, which solves this as well: it publishes 
> new instance with release semantics.

Thanks Aleksey, makes sense. Let's use `putReferenceRelease` then. There are a 
few other, existing uses of `UNSAFE.storeStoreFence` in the libraries code, 
especially in similar places in `java/lang/invoke/`. Maybe these should be 
double checked.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21160#discussion_r1774575296

Reply via email to