I'm helping a colleague debug a weird problem that occurs in
ClassValue on jdk11u (and presumably on upstream as well, though it's
presently impossible to verify).  As a disclaimer, the problem
manifests itself when building native images via GraalVM so it's
possible that something is simply broken there, but it seems at least
feasible that it could be a plain old Java bug so I thought I'd send
up a flare here to see if this makes sense to anyone else.

The bug itself manifests (on jdk11u) as an NPE with the following
exception trace:

java.lang.NullPointerException
        at java.base/java.lang.ClassValue$ClassValueMap.loadFromCache(ClassValue.java:535)
        at java.base/java.lang.ClassValue$ClassValueMap.probeHomeLocation(ClassValue.java:541)
        at java.base/java.lang.ClassValue.get(ClassValue.java:101)
        ...

Some basic analysis shows that this should be impossible under
normal/naive circumstances: the initializer of
java.lang.ClassValue.ClassValueMap sets the corresponding field to
non-null during construction.

However, I'm wondering if this isn't a side effect of what appears to
be an incorrect double-checked lock at lines 374-378 of
ClassValue.java [1].  In order for the write to the non-volatile
`cache` field of ClassValueMap to be visible, it is my understanding
that there must be some acquire/release edge from where the variable
is published to guarantee that all of the writes are visible to the
read site, and I'm not really sure that the exit-the-constructor edge
is going to match up with the synchronized block which protects a
different non-volatile field.  And even if my feeling that this is dodgy is
valid, I'm not sure whether this NPE is a possible outcome of that
problem!
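
To make the concern concrete, here is a minimal sketch of the pattern
as I understand it.  All names (DclSketch, ValueMap, Owner, getRacy,
getSafe) are hypothetical stand-ins, not the actual ClassValue source.
getRacy reads a plain field outside the lock, so as far as I can tell
the Java memory model would permit a second thread to see the new
object before it sees the write to its `cache` field.  getSafe shows
one possible repair, publishing through a volatile field; I'm not
claiming that is the right fix for the JDK.

```java
// Hypothetical sketch of the suspected pattern; not the JDK source.
public class DclSketch {
    /** Stands in for ClassValueMap: a non-final field set in the constructor. */
    static final class ValueMap {
        Object[] cache;                    // non-final, non-volatile
        ValueMap() { cache = new Object[32]; }
    }

    /** Stands in for the object that lazily holds the map. */
    static final class Owner {
        ValueMap racyMap;                  // plain field
        volatile ValueMap safeMap;         // volatile field
    }

    private static final Object LOCK = new Object();

    /** Double-checked lock on a plain field: the fast-path read is a data race. */
    static ValueMap getRacy(Owner o) {
        ValueMap m = o.racyMap;            // unsynchronized read
        if (m != null) return m;           // another thread could see m.cache == null here
        synchronized (LOCK) {
            if ((m = o.racyMap) == null) {
                o.racyMap = m = new ValueMap();
            }
        }
        return m;
    }

    /** Same shape, but the volatile write/read pair gives a happens-before edge. */
    static ValueMap getSafe(Owner o) {
        ValueMap m = o.safeMap;            // volatile read (acquire)
        if (m != null) return m;
        synchronized (LOCK) {
            if ((m = o.safeMap) == null) {
                o.safeMap = m = new ValueMap();   // volatile write (release)
            }
        }
        return m;
    }

    public static void main(String[] args) {
        Owner o = new Owner();
        // Single-threaded, both variants behave identically; the
        // difference only matters when threads race on first access.
        System.out.println(getRacy(o).cache != null);
        System.out.println(getSafe(o).cache != null);
    }
}
```

Run on a single thread this cannot reproduce the NPE; it only shows
where the unsynchronized read sits relative to the constructor's
write.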

Thoughts?

[1] https://github.com/openjdk/jdk11u-dev/blob/3789983e89c9de252ef546a1b98a732a7d066650/src/java.base/share/classes/java/lang/ClassValue.java#L374-L378
-- 
- DML • he/him
