On Sun, Dec 7, 2014 at 2:58 PM, David Holmes <david.hol...@oracle.com> wrote:

>> I believe the comment _does_ reflect hotspot's current implementation
>> (entirely from exploring the sources).
>> I believe it's correct to say "all of the platforms are
>> multiple-copy-atomic except PPC".

... current hotspot sources don't contain ARM support.

> Here is the definition of multi-copy atomicity from the ARM architecture
> manual:
>
> "In a multiprocessing system, writes to a memory location are multi-copy
> atomic if the following conditions are both true:
> • All writes to the same location are serialized, meaning they are observed
> in the same order by all observers, although some observers might not
> observe all of the writes.
> • A read of a location does not return the value of a write until all
> observers observe that write."

The hotspot sources give
"""
// To assure the IRIW property on processors that are not multiple copy
// atomic, sync instructions must be issued between volatile reads to
// assure their ordering, instead of after volatile stores.
// (See "A Tutorial Introduction to the ARM and POWER Relaxed Memory Models"
// by Luc Maranget, Susmit Sarkar and Peter Sewell, INRIA/Cambridge)
#ifdef CPU_NOT_MULTIPLE_COPY_ATOMIC
const bool support_IRIW_for_not_multiple_copy_atomic_cpu = true;
"""
and the referenced paper gives
"""
on POWER and ARM, two threads can observe writes to different locations
in different orders, even in the absence of any thread-local reordering.
In other words, the architectures are not multiple-copy atomic [Col92].
"""
which strongly suggests that x86 and sparc are OK.

> The first condition is met by Total-Store-Order (TSO) systems like x86 and
> sparc; and not by relaxed-memory-order (RMO) systems like ARM and PPC.
> However the second condition is not met simply by having TSO. If the local
> processor can see a write from the local store buffer prior to it being
> visible to other processors, then we do not have multi-copy atomicity and I
> believe that is true for x86 and sparc. Hence none of our supported
> platforms are multi-copy-atomic as far as I can see.
>
>> I believe hotspot must implement IRIW correctly to fulfil the promise
>> of sequential consistency for standard Java, so on ppc volatile reads
>> get a full fence, which leads us back to the ppc pointer chasing
>> performance problem that started all of this.
>
>
> Note that nothing in the JSR-133 cookbook allows for IRIW, even on x86 and
> sparc. The key feature needed for IRIW is a load barrier that forces global
> memory synchronization to ensure that all processors see writes at the same
> time. I'm not even sure we can force that on x86 and sparc! Such a load
> barrier negates the need for some store barriers as defined in the cookbook.
>
> My understanding, which could be wrong, is that the JMM implies
> linearizability of volatile accesses, which in turn provides the IRIW
> property. It is also my understanding that linearizability is a necessary
> property for current proof systems to be applicable. However absence of
> proof is not proof of absence, and it doesn't follow that code that doesn't
> rely on IRIW is incorrect if IRIW is not ensured on a system. As has been
> stated many times now, in the literature no practical lock-free algorithm
> seems to rely on IRIW. So I still hope that IRIW can somehow be removed
> because implementing it will impact everything related to the JMM in
> hotspot.
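
For concreteness, the IRIW (independent reads of independent writes) shape
under discussion can be sketched in Java roughly as follows. This is only an
illustrative sketch, not code from hotspot or from the JSR-133 cookbook: the
class IriwLitmus and all of its method names are made up for this example.
Two threads each write one volatile variable, and two other threads read both
variables in opposite orders. If volatile accesses are sequentially
consistent, the two readers must never disagree about which write came first.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical IRIW litmus test; none of these names come from hotspot.
final class IriwLitmus {
    // Two independent locations. Both are volatile, so the JMM requires
    // all accesses to them to appear in one total order.
    static volatile int x;
    static volatile int y;

    // Runs one trial and reports whether the two readers saw the two
    // independent writes in opposite orders (the outcome IRIW forbids).
    static boolean readersDisagree() {
        x = 0;
        y = 0;
        final int[] r = new int[4];
        final CountDownLatch go = new CountDownLatch(1);
        Thread[] threads = {
            new Thread(() -> { awaitQuietly(go); x = 1; }),
            new Thread(() -> { awaitQuietly(go); y = 1; }),
            new Thread(() -> { awaitQuietly(go); r[0] = x; r[1] = y; }),
            new Thread(() -> { awaitQuietly(go); r[2] = y; r[3] = x; }),
        };
        for (Thread t : threads) {
            t.start();
        }
        go.countDown(); // release all four threads as close together as we can
        for (Thread t : threads) {
            joinQuietly(t); // join makes the r[] writes visible to this thread
        }
        // Reader 1 saw x written but not y; reader 2 saw y written but not x.
        return r[0] == 1 && r[1] == 0 && r[2] == 1 && r[3] == 0;
    }

    // Repeats the trial and counts how often the forbidden outcome appeared.
    static int countDisagreements(int trials) {
        int count = 0;
        for (int i = 0; i < trials; i++) {
            if (readersDisagree()) {
                count++;
            }
        }
        return count;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("forbidden outcomes: " + countDisagreements(1000));
    }
}
```

A harness this crude starts fresh threads for every trial and rarely lines
them up tightly, so a count of zero says little about whether a given
platform could produce the outcome. It only pins down which outcome the
discussion is about.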