When investigating an intermittent NPE with an Oracle internal test on AArch64, 
I noticed that the NPE is actually a SIGSEGV in the code emitted by 
`MethodHandles::jump_to_lambda_form` when trying to load the 
`MemberName::method` field because the `MemberName` we retrieved from 
`LambdaForm::vmentry` is null:

https://github.com/openjdk/jdk/blob/f0374a0bc181d0f2a8c0aa9aa032b07998ffaf60/src/hotspot/cpu/aarch64/methodHandles_aarch64.cpp#L141-L143

The JVM translates SIGSEGVs happening in method handle intrinsics to NPEs, 
assuming that they are implicit NPEs due to a null receiver:
https://github.com/openjdk/jdk/blob/f0374a0bc181d0f2a8c0aa9aa032b07998ffaf60/src/hotspot/share/runtime/sharedRuntime.cpp#L979-L982

After digging around in the MethodHandle implementation that I'm not too 
familiar with, I found this suspicious code in `MethodHandle::updateForm`:
https://github.com/openjdk/jdk/blob/36314a90c15e2ab2a9b32c2e471655c1b07d452c/src/java.base/share/classes/java/lang/invoke/MethodHandle.java#L1881-L1883

The `Unsafe.fullFence` was added by 
[JDK-8059877](https://bugs.openjdk.org/browse/JDK-8059877) and replaced a `// 
ISSUE: Should we have a memory fence here?` 
[comment](https://github.com/openjdk/jdk/commit/224c42ee4d4c3027d1f8f0d8b7ecf6646e9418c3#diff-5a4b2f817a4273eacf07f3ee24782c58c8ff474c6d790f72298e906837c5543aL1442).
 If the intention was to [prevent publishing a partially initialized 
object](https://wiki.sei.cmu.edu/confluence/display/java/TSM03-J.+Do+not+publish+partially+initialized+objects),
 I think it was added to the wrong place (paging @iwanowww).

In the failing scenario, we concurrently trigger LambdaForm customization for a 
VarHandle invoker:

        at 
java.base/java.lang.invoke.MethodHandle.updateForm(MethodHandle.java:1883)
        at 
java.base/java.lang.invoke.MethodHandle.customize(MethodHandle.java:1856)
        at 
java.base/java.lang.invoke.MethodHandle.maybeCustomize(MethodHandle.java:1846)
        at java.base/java.lang.invoke.Invokers.maybeCustomize(Invokers.java:632)
        at 
java.base/java.lang.invoke.Invokers.checkCustomized(Invokers.java:626)

The problem is that another thread can observe `newForm` which via the 
`MethodHandle::form` field before the store to `vmentry` in `prepare()` (see 
[here](https://github.com/openjdk/jdk/blob/36314a90c15e2ab2a9b32c2e471655c1b07d452c/src/java.base/share/classes/java/lang/invoke/LambdaForm.java#L821))
 is visible and therefore observe null. We need a StoreStore barrier to prevent 
stores that initialize the newForm from being reordered with the subsequent 
store to the `MethodHandle::form` field that would make it accessible by other 
threads.

I removed the `fullFence` which should not be needed. 

Unfortunately, my new tests seems to trigger another issue (even if I leave the 
`fullFence` in):

java.lang.AssertionError
        at 
java.base/java.lang.invoke.LambdaForm.arityCheck(LambdaForm.java:1006)
        at 
TestLambdaFormCustomization.lambda$test$0(TestLambdaFormCustomization.java:46)
        at java.base/java.lang.Thread.run(Thread.java:1576)


It's this assert:
https://github.com/openjdk/jdk/blob/36314a90c15e2ab2a9b32c2e471655c1b07d452c/src/java.base/share/classes/java/lang/invoke/LambdaForm.java#L1010

UPDATE: After discussing with @JornVernee, it seems that my test using 
`-Djava.lang.invoke.MethodHandle.COMPILE_THRESHOLD=1` was the issue since that 
flag triggers some untested code paths. I removed that flag from the test, 
verified that it still triggers the original issue and leave it to the method 
handle maintainers to address this issue separately.

Thanks to @eme64 for taking a first step at simplifying the original test.

Best regards,
Tobias

-------------

Commit messages:
 - Use putReferenceRelease
 - Change from driver to main
 - Updated copyright year
 - Don't use COMPILE_THRESHOLD=1 in the test. Verified that it still reproduces
 - Removed fullFence
 - 8340812: LambdaForm customization via MethodHandle::updateForm is not thread 
safe

Changes: https://git.openjdk.org/jdk/pull/21160/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=21160&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8340812
  Stats: 73 lines in 2 files changed: 70 ins; 1 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/21160.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/21160/head:pull/21160

PR: https://git.openjdk.org/jdk/pull/21160

Reply via email to