It's actually fairly hard to precisely map the JMM to barriers.

The job for (some of us) runtime implementors is fairly easy, as there are 
clear rules we can follow that, while overly conservative, are **sufficient** 
for meeting the memory model requirements. You can find those in Doug Lea's 
excellent JMM cookbook (http://gee.cs.oswego.edu/dl/jmm/cookbook.html), 
where you'll see a matrix that explains which LoadLoad | LoadStore | 
StoreStore | StoreLoad combinations are sufficient to place between various 
operations for the memory model to be adhered to. However, it is critical 
to understand that these are *not* rules that *you* can rely on as a 
programmer. The cookbook is written for runtime and compiler implementors 
and provides an "if you follow this you are safe in your implementation" 
rule set, but runtimes are allowed to be more aggressive than the rules 
specified and still meet the JMM requirements. And they do. So programs 
that run on those runtimes are not guaranteed anything in the cookbook. 
E.g. lock coarsening is an optimization that is more aggressive than the 
cookbook rules, but is allowed by the JMM. And so is lock biasing. And so 
is the ability to eliminate monitor and/or volatile operations on 
provably-thread-local objects and fields, e.g. when escape analysis can 
show that no other thread will ever observe a given object.
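As a concrete sketch of that last point (class and method names here are my own, purely illustrative): the monitor operations below are on an object that provably never escapes the method, so a JIT with escape analysis is free to elide the locking entirely, even though `StringBuffer`'s methods are synchronized.

```java
// Illustrative only: sb never escapes sum(), so the JVM may prove it
// thread-local and elide the monitor enter/exit that each synchronized
// StringBuffer.append() call would otherwise perform.
class Elision {
    static int sum() {
        StringBuffer sb = new StringBuffer(); // thread-local, never published
        for (int i = 0; i < 10; i++) {
            sb.append(i); // synchronized method; lock elidable here
        }
        return sb.length();
    }
}
```

The result is identical either way; the optimization only removes ordering/locking work that no other thread could ever observe.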

A specific quality of the JMM that is "looser" than the notion of 
fences or barriers is that its rules apply to specific variables, not 
globally to all memory operations. While fences and barriers are usually 
described in a global sense (a rule that applies to all previous/subsequent 
stores or loads), the rules of the JMM only apply to operations in other 
threads that interact with the same volatile variable or monitor in 
question. E.g. with regard to other threads' operations on the same 
volatile variable, a volatile read will appear to have the equivalent of a 
LoadLoad | LoadStore between the volatile read operation and any subsequent 
loads or stores (seen in program order in the thread). But this promise 
does not apply against other threads that do not interact with the same 
volatile. So e.g. if the volatile can be proven to be thread-local, a 
volatile read has no ordering implications. The same is true for volatile 
stores: a volatile store will appear to have a LoadStore | StoreStore 
fence between any preceding loads and stores and the volatile store operation, 
when considered from the point of view of another thread operating on the 
same volatile field. But it cannot be relied on to create a global 
StoreStore or LoadStore fence, or the equivalent of an acquire or release, 
since it can be optimized away under certain conditions (e.g. if the 
field is a member of an object that was proven to be thread-local via 
escape analysis, and is therefore known to never have any other thread 
interact with it). The same caveats apply to monitor enter and monitor 
exit.
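To make the "same volatile" interaction concrete, here is the classic safe-publication idiom (names are my own for illustration): the reader gets to see `payload == 42` only because both threads touch the *same* volatile, `ready`.

```java
// Safe publication via a shared volatile: the happens-before edge exists
// only because writer and reader both interact with the SAME volatile.
class Publication {
    static int payload;            // plain (non-volatile) field
    static volatile boolean ready; // the volatile guard

    static int publish() throws InterruptedException {
        Thread writer = new Thread(() -> {
            payload = 42; // ordinary store
            ready = true; // volatile store: "releases" payload to readers of ready
        });
        writer.start();
        while (!ready) { /* spin on the volatile */ }
        // The volatile read of ready happens-before this load of payload,
        // so 42 is guaranteed here.
        int seen = payload;
        writer.join();
        return seen;
    }
}
```

If `ready` were thread-local (or provably so), the JMM would promise nothing here; the guarantee hangs entirely on the cross-thread interaction through that one variable.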

On Tuesday, July 5, 2016 at 2:53:43 PM UTC-7, Dain Ironfoot wrote:
>
> Gil thanks for the wonderfully simple yet clear explanation on the 
> barriers.
>
> May I ask you to please help me understand the related "memory visibility" 
> guarantees that these barriers provide in JMM.
> AFAIK, in java, we have the following language level instructions which 
> deals with memory visibility & ordering.
>
> volatile read (Atomicxxx get) 
> volatile write (Atomicxxx set)
>
> Atomicxxx lazySet (Unsafe.putOrdered*)
>

lazySet is not described or related to in any way in the current (Java SE 8 
or prior) JMM. In implementation, it is usually equivalent to a StoreStore 
fence preceding the set operation. This quality seems to be relied on by a 
lot of concurrent Java code (including code within the JDK), as it is a 
relatively cheap barrier. 
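A typical shape of that reliance, sketched with hypothetical names, is the single-writer "publish a sequence" pattern: the writer uses lazySet to make prior stores visible in order, without paying for the full volatile-write barrier.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the common single-writer lazySet idiom: publish a sequence
// number with (in practice) StoreStore ordering, avoiding the more
// expensive barrier a full volatile set() would imply.
class Cursor {
    private final AtomicLong sequence = new AtomicLong(-1);

    void publish(long seq) {
        // In practice: prior stores (the data this sequence guards)
        // cannot be reordered after this store.
        sequence.lazySet(seq);
    }

    long current() {
        return sequence.get(); // volatile read on the consumer side
    }
}
```

This is the kind of pattern ring-buffer style concurrent code tends to build on, with the caveat from above that the JMM itself says nothing about lazySet.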
 

>
> Unsafe.loadFence()  
> Unsafe.storeFence() 
> Unsafe.fullFence() 
>

Similarly not defined in the JMM. And (like everything else in Unsafe) not 
defined by the Java SE spec. But described in JEP 171 
<http://openjdk.java.net/jeps/171> in a way that would make them equivalent 
to:

Unsafe.loadFence() : LoadLoad | LoadStore
Unsafe.storeFence(): StoreLoad | StoreStore
Unsafe.fullFence(): LoadLoad | LoadStore | StoreLoad | StoreStore
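Since Unsafe.getUnsafe() rejects application code, code that uses these fences typically pulls the instance out via reflection. A minimal sketch (the method names match sun.misc.Unsafe as described by JEP 171; the class name here is my own):

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Sketch: obtaining Unsafe reflectively (the usual workaround for
// Unsafe.getUnsafe() being restricted) and issuing the JEP 171 fences.
class Fences {
    private static final Unsafe U;
    static {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            U = (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    static void demo() {
        U.loadFence();  // LoadLoad | LoadStore
        U.storeFence(); // StoreLoad | StoreStore
        U.fullFence();  // LoadLoad | LoadStore | StoreLoad | StoreStore
    }
}
```

Again: none of this is covered by the Java SE spec, so it is an "at your own risk" dependency on implementation behavior.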
 

>
> Lock acquire 
> Lock release
>

I assume you are referring to j.u.c locks here. According to the 
j.u.c.locks.Lock 
<https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/Lock.html#lock()> 
JavaDoc, those have the same semantics as monitor enter and monitor exit.
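So a plain field guarded by a j.u.c lock is safe in exactly the way a synchronized block would make it safe. A minimal sketch (class and counts are illustrative):

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch: ReentrantLock's lock()/unlock() provide the same acquire/release
// visibility as monitor enter/exit, so this plain long is safely shared.
class LockedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private long count; // plain field, always accessed under lock

    void increment() {
        lock.lock();       // acquire: like monitor enter
        try {
            count++;
        } finally {
            lock.unlock(); // release: like monitor exit
        }
    }

    long get() {
        lock.lock();
        try {
            return count;
        } finally {
            lock.unlock();
        }
    }

    static long run() throws InterruptedException {
        LockedCounter c = new LockedCounter();
        Thread a = new Thread(() -> { for (int i = 0; i < 100_000; i++) c.increment(); });
        Thread b = new Thread(() -> { for (int i = 0; i < 100_000; i++) c.increment(); });
        a.start(); b.start();
        a.join(); b.join();
        return c.get(); // both threads' increments are visible: 200000
    }
}
```

And the same loosening applies: if the lock is provably thread-local, the runtime may elide it.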
 

> final fields in the constructor 
>

This is defined in the JMM. With regards to other threads that may observe 
the contents of the final field, it can be thought of loosely as having a 
StoreStore fence between the final field initializer and any subsequent 
store operation that would make the object that contains the final field 
visible to other threads (e.g. storing its reference into a heap object). 
But keep in mind the same JMM loosening possibility: if the field can be 
proven to not be visible to other threads (e.g. via escape analysis) then 
there is no ordering required. So depending on the runtime, the fence may 
not exist.
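The shape of the guarantee, with illustrative names: even if the containing object is published racily through a plain field, any thread that sees the reference is guaranteed to see the final field fully initialized.

```java
// Sketch of the final-field guarantee: the conceptual StoreStore sits
// between the constructor's write to `port` and the racy publishing
// store to `shared`.
class Config {
    final int port; // final field, "frozen" before the reference escapes

    Config(int port) {
        this.port = port;
    }
}

class Publisher {
    static Config shared; // plain, non-volatile field: publication is racy

    static void publish() {
        // Readers that observe this reference (even via the race) must
        // see port == 8080. A non-final field would carry no such promise.
        shared = new Config(8080);
    }
}
```

Note the usual caveat: this only helps threads that actually receive the reference; it does not make the publication itself timely or non-racy.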
 

>
> Given your model of "LoadLoad", "LoadStore", "StoreStore", "StoreLoad"; is 
> it helpful to map them to these instructions above?
> Also, how do you think about the memory visibility that are provided by 
> these instructions?
>
> Many thanks
>
>
