For DCL to be correct Class.classValueMap should be declared volatile.

Final field guarantees are there to support unsafe publication in special cases. In this case we should be using safe publication.

Cheers,
David

On 8/08/2020 9:22 am, Paul Sandoz wrote:
Here’s some pertinent information from Doug’s excellent document of memory 
order modes [1]:

"As a delicate (but commonly exploited) special case of the above 
considerations, acquire-style reads of immutable fields of newly constructed objects 
do not require an explicit fence or moded access -- Plain mode reads suffice: If the 
consumer has not seen the new object before, it cannot have stale values that it 
must otherwise ignore or discard, and it cannot perform the read until it knows the 
address. On subsequent encounters, reusing old values is OK, because they cannot 
have changed. This reasoning rests on the only known defensible exception to the 
rule of never making assumptions about local precedence order: The reference 
(address) of a new object is assumed never to be known and impossible to speculate 
until creation. This assumption is also relied on by other Java security 
requirements.

The resulting techniques are used in Java final field implementations, and are the 
reason why specified guarantees for final fields are conditional upon constructors 
not leaking references to objects before constructor return. Classes with final 
fields are normally implemented by issuing a form of Release fence upon constructor 
return. Further, because nothing need be guaranteed about interactions with reads by 
the constructor, a StoreStoreFence suffices. Similar techniques may apply in other 
contexts, but can be unacceptably fragile. For example, code that works when the 
associated objects are always newly constructed may, without further safeguards, 
fail upon later changes to instead recycle the objects from pools."


I thought that by virtue of ClassValueMap extending WeakHashMap, which contains 
final fields, it would all work out. But, I was uncertain so I ran a jcstress 
test modifying the sample JMMSample_06_Finals [1].

See gist here:

https://gist.github.com/PaulSandoz/3ae6d3f3e4027653d185f237eedfab72 
<https://gist.github.com/PaulSandoz/3ae6d3f3e4027653d185f237eedfab72>

I ran in tough mode and observed this failure on my MacBook:

*** FAILED tests
   Strong asserts were violated. Correct implementations should have no assert 
failures here.

   1 matching test results.
   [FAILED] o.o.j.t.singletons.JMMSample_06_Finals.FinalInitExtendedNoFinal
     (JVM args: [-XX:-TieredCompilation, -XX:+UnlockDiagnosticVMOptions, 
-XX:+StressLCM, -XX:+StressGCM])
   Observed state     Occurrences   Expectation  Interpretation
               -1   2,420,284,404    ACCEPTABLE  Object is not seen yet.
                3             320     FORBIDDEN  Seeing partially constructed 
object.
                4              53     FORBIDDEN  Seeing partially constructed 
object.
                5           3,486     FORBIDDEN  Seeing partially constructed 
object.
                6           8,409     FORBIDDEN  Seeing partially constructed 
object.
                7          11,282     FORBIDDEN  Seeing partially constructed 
object.
                8     181,774,476    ACCEPTABLE  Seen the complete object.

So for HotSpot it seems to occur only for C2 under certain stress options, 
-XX:+StressLCM, -XX:+StressGCM, which might explain why an issue is not 
generally being observed. I don’t understand enough about these options to know 
why this is so.

However, this makes me nervous enough that it might be best to add a dummy 
final field to ClassMapValue, and to review other similar code.

David, perhaps you could add a dummy final field to ClassValueMap and see if 
that fixes the problem with Graal?

Paul.

[1] http://gee.cs.oswego.edu/dl/html/j9mm.html 
<http://gee.cs.oswego.edu/dl/html/j9mm.html>

[2] 
http://hg.openjdk.java.net/code-tools/jcstress/file/1dd2cca36fa9/jcstress-samples/src/main/java/org/openjdk/jcstress/samples/JMMSample_06_Finals.java
 
<http://hg.openjdk.java.net/code-tools/jcstress/file/1dd2cca36fa9/jcstress-samples/src/main/java/org/openjdk/jcstress/samples/JMMSample_06_Finals.java>

On Aug 7, 2020, at 2:35 PM, John Rose <john.r.r...@oracle.com> wrote:

On Aug 7, 2020, at 9:52 AM, Andrew Haley <a...@redhat.com> wrote:

On 8/7/20 4:39 PM, David Lloyd wrote:

However, I'm wondering if this isn't a side effect of what appears
to be an incorrect double-checked lock at lines 374-378 of
ClassValue.java [1].  In order for the write to the non-volatile
`cache` field of ClassValueMap, it is my understanding that there
must be some acquire/release edge from where the variable is
published to guarantee that all of the writes are visible to the
read site, and I'm not really sure that the exit-the-constructor
edge is going to match up with with the synchronized block which
protects a different non-volatile field.  And even if my feeling
that this is dodgy is valid, I'm not sure whether this NPE is a
possible outcome of that problem!

It certainly doesn't look right to me. AFAICS this is classic broken
double-checked locking. It'd need some sort of fence after
constructing the ClassValueMap instance and before publishing it.

Y’all are calling my baby ugly but he’s really cute if you look
at him from the right angle.

(First, a quick question:  This bug shows up on x86, right?
If so, we probably are not looking at a hardware reordering
problem but something stemming from Graal’s reordering
of operations compared to C1 and C2, perhaps after a different
set of inlining decisions uncharacteristic of C2.)

The bug indeed looks like a dropped fence at the end of the
constructor for ClassValueMap.  (I don’t see an easier way to
get a null at the NPE point.)  The fence might not actually be
dropped, but a write which the fence is supposed to fence
might have been reordered after the fence, and after an
unlock operation, allowing someone to see the initial
null after the unlock.

The double-check idiom breaks when the outer check
(not synchronized) gets half-baked data and acts on it.
This much-maligned idiom would not be broken in the
present case under the assumption that a data dependency
on type.classValueMap (set in initializeMap) will see
a fully initialized value of ClassValueMap.cacheArray.

Now we get into the fine print, and I agree there is a bug
here.  The easy fix is to repair the logic under which that
ugly everybody’s-hating-on-it double check idiom would
be in fact correct.

The fine print requires a couple different things that are not
quite fulfilled by the present code, and Graal has either
reordered the memory accesses to expose the problem, or
it (itself) has a complementary bug.  (Or it has an unrelated
bug, and you people are *staring* at my baby!)

First, as Paul noted, you need a final variable in your class
to get the benefit of a fence at the end of the constructor.
Oops #1:  ClassValueMap doesn’t have *any* finals.  That’s
awkward.  Two fixes for that:  (a) add a dummy final, and
(b) add a real fence in the constructor.  For better luck,
maybe put the fence at the end of sizeCache.

On machines (not x86) which need fences *before* final
reads, another explicit fence should be placed before the
unsynchronized read (or else inside the getCache method).

By the letter of the JMM, the helpful effects of the fence
at the end of the constructor are only guaranteed for
references which depend on final variables set by that
constructor, because only those final variables get a
“freeze” operation which stabilizes them (and values
dependently loaded via them).  Throwing in a dummy
final is therefore not guaranteed to work.  But throwing
in a fence will work.  One downside of the fence is that
the JMM is silent about fences, but the JMM is widely
recognized as a partial story, not the full or final story.

(Here’s a tidbit of JMM politics:  I once heard Doug Lea
considering whether maybe all fields should be treated
more like final fields.  I don’t know if this is still a live
idea, but it would make this bug go way, since nearly all
constructors would then get fences of some sort.)

Here’s a bit of explanatory (non-normative) text from the draft
POPL paper for JMM, where the parenthetic comment indicates
(I think) the sort of thing that is happening here:

Let us now assume that the acquire and release do happen. As long as
these actions take place after object has been constructed (and there
is no code motion around the end of the constructor), the diffs that
the second processor acquires are guaranteed to reflect the correctly
constructed object.

Basically, the synchronization statement is expected to do a
release *after* the non-null value is stored.  It looks like this
is failing due to a lost and/or reordered fence.

Second, David says:

I’m not really sure that the exit-the-constructor edge is going to match
up with with the synchronized block which protects a different
non-volatile field.

By the letter of the JMM (AFAIUI), this code is coloring way outside
the lines.  (Yes, I admit it.)  The problem is pretty fundamental.
I went and swapped in some JMM spec just now (don’t have it in
short term memory; go figure), and here’s what I re-learned.

In JMM terms, a “release” is the earlier end of any edge in relation
called “synchronizes-with”.  An “acquire” is the latter end of such
an edge.  In the JMM this relation applies only to lock, unlock,
and volatile reads and writes.  Your guess is as good as mine whether
other operations (CAS, fences, etc.) participate; the JMM is silent.
Crucially, acquires and release do not touch plain fields.  This is
counter-intuitive for those of us who like to reason with fences.

The JMM prevents regular writes from floating past what we like
to think of as “release points” (unlocks) by appealing to processor
order inside lock/unlock pairs, and also by appealing to global
ordering of multiple lock/unlock pairs (or stuff with volatiles
and other corner cases which we will neglect here).  In order
to work right, a normal write has to come before a release *and*
a matching normal read has to come after an acquire, and such
an acquire is nothing other than the lock of a later lock/unlock
pair, typically in another thread.

So the rules which reliably connect normal writes to normal
reads in other threads work differently from acquire fences and
release fences as we (or just I?) usually think of them.  Yes, there
are “acquires” and “releases” in the JMM.  No, they are not
primitives but rather descriptions of the way the happens-before
relation is constrained by (among other things) lock and unlock
actions.

On x86, you don’t have to worry about acquires; you just set up
the right release fences.  In the JMM, there’s no way to get either
an acquire or a release without a synchronized statement (or
other fancy stuff like volatiles).  That’s why double-check is
broken in the pure JMM, and why it can be repaired if you
add some (non-JMM) fences.

So why is this showing up suddenly with Graal?  Possibly it’s
got a really aggressive interpretation of the JMM, relative to
the standard HotSpot JITs.  Possibly it’s got a bug.  Graal might
be allowing the initialization of ClassValueMap.cacheArray
to float down past publication of the ClassValueMap on
type.classValueMap (in initializeMap).  Less likely, it is allowing
the initialization to totally escape the lock/unlock pair, something
the JMM might allow but the Cookbook advises against (in its
“Can Reorder” table).

Perhaps Graal is modeling the “freeze” operations on finals more
accurately (and the Cookbook gives instructions for this).  That
would allow the ClassValueMap to correctly initialize its frozen
finals, but not (unfortunately) the not-frozen cacheArray field.
It appears that the new JIT on the block is exposing my neglect
for my poor baby, in not putting the right fences around its playpen.

Bottom line:  Add a release fence before type.cVM is set, and add
a (no-op on x86) acquire fence in getCache.  Quick test:  Add a dummy
final to CVM, to see if that makes the bug let go.

— John

Reply via email to