On 6/12/2014 7:29 AM, Martin Buchholz wrote:
On Thu, Dec 4, 2014 at 5:36 PM, David Holmes <david.hol...@oracle.com> wrote:
Martin,

On 2/12/2014 6:46 AM, Martin Buchholz wrote:

Is this finalized then? You can only make one commit per CR.

Right.  I'd like to commit and then perhaps do another round of clarifications.

I still find this entire comment block to be misguided and misplaced:

!     // Fences, also known as memory barriers, or membars.
!     // See hotspot sources for more details:
!     // orderAccess.hpp memnode.hpp unsafe.cpp
!     //
!     // One way of implementing Java language-level volatile variables using
!     // fences (but there is often a better way without) is by:
!     // translating a volatile store into the sequence:
!     // - storeFence()
!     // - relaxed store
!     // - fullFence()
!     // and translating a volatile load into the sequence:
!     // - if (CPU_NOT_MULTIPLE_COPY_ATOMIC) fullFence()
!     // - relaxed load
!     // - loadFence()
!     // The full fence on volatile stores ensures the memory model guarantee of
!     // sequential consistency on most platforms.  On some platforms (ppc) we
!     // need an additional full fence between volatile loads as well (see
!     // hotspot's CPU_NOT_MULTIPLE_COPY_ATOMIC).
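
For readers following along, the two sequences in the comment above can be sketched in Java. This is only an approximation under stated assumptions: it uses the static fence methods on java.lang.invoke.VarHandle (Java 9 and later, so not the Unsafe methods under review), with releaseFence() standing in for storeFence() and acquireFence() standing in for loadFence(). The class and method names are hypothetical, a plain field access only approximates a relaxed access, and none of this describes what hotspot actually emits.

```java
import java.lang.invoke.VarHandle;

/** Hypothetical illustration of the quoted fence sequences; not hotspot's implementation. */
final class VolatileFenceSketch {
    // Deliberately a plain (non-volatile) field: all ordering comes from the explicit fences.
    private int value;

    /** Store sequence from the comment: storeFence, relaxed store, fullFence. */
    void volatileLikeStore(int newValue) {
        VarHandle.releaseFence(); // assumption: stands in for storeFence()
        value = newValue;         // plain store, approximating a relaxed store
        VarHandle.fullFence();    // fullFence()
    }

    /** Load sequence from the comment, without the leading full fence. */
    int volatileLikeLoad() {
        int observed = value;     // plain load, approximating a relaxed load
        VarHandle.acquireFence(); // assumption: stands in for loadFence()
        return observed;
    }

    /** Load sequence for the CPU_NOT_MULTIPLE_COPY_ATOMIC case: leading full fence added. */
    int volatileLikeLoadWithLeadingFence() {
        VarHandle.fullFence();
        int observed = value;
        VarHandle.acquireFence();
        return observed;
    }

    public static void main(String[] args) {
        VolatileFenceSketch cell = new VolatileFenceSketch();
        cell.volatileLikeStore(7);
        System.out.println(cell.volatileLikeLoad());
    }
}
```

The trailing fullFence() in the store path is the part the comment credits with sequential consistency; the leading fence in the second load variant is the extra cost on ppc.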

Even I think this comment is marginal - I will delete it.  But
consider this a plea for better documentation of the hotspot
internals.

Okay, but Unsafe.java is not the place to document anything about hotspot.

Why do you want this description here? It has no relevance to the API itself,
nor to how volatiles are implemented in the VM. And as I said in the bug
report CPU_NOT_MULTIPLE_COPY_ATOMIC exists only for platforms that want to
implement IRIW (none of our platforms are multiple-copy-atomic, but only PPC
sets this so that it employs IRIW).

I believe the comment _does_ reflect hotspot's current implementation
(entirely from exploring the sources).
I believe it's correct to say "all of the platforms are
multiple-copy-atomic except PPC".

Here is the definition of multi-copy atomicity from the ARM architecture manual:

"In a multiprocessing system, writes to a memory location are multi-copy atomic if the following conditions are both true: • All writes to the same location are serialized, meaning they are observed in the same order by all observers, although some observers might not observe all of the writes. • A read of a location does not return the value of a write until all observers observe that write."

The first condition is met by Total-Store-Order (TSO) systems like x86 and sparc, and not by relaxed-memory-order (RMO) systems like ARM and PPC. However, the second condition is not met simply by having TSO. If the local processor can see a write from the local store buffer prior to it being visible to other processors, then we do not have multi-copy atomicity, and I believe that is true for x86 and sparc. Hence none of our supported platforms are multi-copy-atomic as far as I can see.

I believe hotspot must implement IRIW correctly to fulfil the promise
of sequential consistency for standard Java, so on ppc volatile reads
get a full fence, which leads us back to the ppc pointer chasing
performance problem that started all of this.
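
For reference, the IRIW (independent reads of independent writes) shape under discussion can be sketched as a small Java litmus test. Two threads each write one volatile variable, and two reader threads read the pair in opposite orders. If volatile accesses are sequentially consistent, the two readers can never disagree about the order of the writes. The class name IriwLitmus and the method countForbidden are hypothetical, and the harness is an assumption of mine rather than anything from the hotspot sources.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical IRIW litmus test over two volatile fields. */
final class IriwLitmus {
    private volatile int x;
    private volatile int y;

    /** Runs the four-thread test repeatedly and counts outcomes where the readers disagree. */
    static int countForbidden(int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        int forbidden = 0;
        try {
            for (int i = 0; i < iterations; i++) {
                IriwLitmus s = new IriwLitmus();
                Callable<int[]> writerX = () -> { s.x = 1; return null; };
                Callable<int[]> writerY = () -> { s.y = 1; return null; };
                // Reader A reads x then y; reader B reads y then x.
                Callable<int[]> readerA = () -> { int a = s.x; int b = s.y; return new int[] {a, b}; };
                Callable<int[]> readerB = () -> { int a = s.y; int b = s.x; return new int[] {a, b}; };
                List<Future<int[]>> results =
                        pool.invokeAll(List.of(writerX, readerA, readerB, writerY));
                int[] ra = results.get(1).get();
                int[] rb = results.get(2).get();
                // Reader A saw x written but not y; reader B saw y written but not x.
                if (ra[0] == 1 && ra[1] == 0 && rb[0] == 1 && rb[1] == 0) {
                    forbidden++;
                }
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return forbidden;
    }

    public static void main(String[] args) {
        System.out.println("forbidden outcomes: " + countForbidden(2000));
    }
}
```

A harness this simple is unlikely to provoke the race on real hardware, so a count of zero is not evidence that a platform handles IRIW correctly. It only makes concrete which outcome the sequential consistency guarantee rules out.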

Note that nothing in the JSR-133 cookbook allows for IRIW, even on x86 and sparc. The key feature needed for IRIW is a load barrier that forces global memory synchronization to ensure that all processors see writes at the same time. I'm not even sure we can force that on x86 and sparc! Such a load barrier negates the need for some store barriers as defined in the cookbook.

My understanding, which could be wrong, is that the JMM implies linearizability of volatile accesses, which in turn provides the IRIW property. It is also my understanding that linearizability is a necessary property for current proof systems to be applicable. However, absence of proof is not proof of absence, and it doesn't follow that code that doesn't rely on IRIW is incorrect if IRIW is not ensured on a system. As has been stated many times now, no practical lock-free algorithm in the literature seems to rely on IRIW. So I still hope that IRIW can somehow be removed, because implementing it will impact everything related to the JMM in hotspot.

David
-----
