The memory_barrier pattern expands to UNSPEC_MEMBAR on the SPARC, and the
latter is implemented differently for V8 and V9:

(define_insn "*stbar"
  [(set (match_operand:BLK 0 "" "")
        (unspec:BLK [(match_dup 0)] UNSPEC_MEMBAR))]
  "TARGET_V8"
  "stbar"
  [(set_attr "type" "multi")])

;; membar #StoreStore | #LoadStore | #StoreLoad | #LoadLoad
(define_insn "*membar"
  [(set (match_operand:BLK 0 "" "")
        (unspec:BLK [(match_dup 0)] UNSPEC_MEMBAR))]
  "TARGET_V9"
  "membar\t15"
  [(set_attr "type" "multi")])


This is surprising because, while membar 15 (0x0F, i.e. #StoreStore | 
#LoadStore | #StoreLoad | #LoadLoad) is a full memory barrier on V9, stbar 
isn't one on V8: stbar only orders stores, which makes it meaningful under 
PSO and a nop under TSO; and since TSO isn't Strong Consistency, something 
is missing.  Geert has devised a nice testcase (in Ada) based on Peterson's 
algorithm with 4 tasks (threads), and it fails on a 4-CPU Solaris machine 
with -mcpu=v8 (Solaris runs the processor in TSO mode).  Something like the 
attached patch is needed to make it pass.
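
For illustration, here is a minimal 2-thread C analogue of the failure mode 
(a hypothetical sketch, not Geert's 4-task Ada testcase).  Peterson's 
algorithm relies on the stores to flag[me] and turn becoming globally 
visible before the loads of flag[other] and turn, i.e. precisely on 
#StoreLoad ordering, which TSO doesn't provide by itself and which stbar 
alone doesn't add:

#include <pthread.h>
#include <stdio.h>

static volatile int flag[2], turn;
static volatile int in_cs;       /* number of threads inside the section */
static volatile int violations;

static void *
worker (void *arg)
{
  int me = (int) (long) arg, other = 1 - me, i;

  for (i = 0; i < 100000; i++)
    {
      flag[me] = 1;
      turn = other;
      /* Expands via the memory_barrier pattern.  Without #StoreLoad here,
         the two stores above may still sit in the store buffer when the
         loads below execute, and both threads can enter the critical
         section at the same time.  */
      __sync_synchronize ();
      while (flag[other] && turn == other)
        ;
      if (__sync_add_and_fetch (&in_cs, 1) != 1)
        __sync_fetch_and_add (&violations, 1);
      __sync_fetch_and_sub (&in_cs, 1);
      flag[me] = 0;
    }
  return NULL;
}

int
main (void)
{
  pthread_t t[2];
  long i;
  for (i = 0; i < 2; i++)
    pthread_create (&t[i], NULL, worker, (void *) i);
  for (i = 0; i < 2; i++)
    pthread_join (t[i], NULL);
  printf ("mutual exclusion violations: %d\n", violations);
  return violations != 0;
}

Built with -mcpu=v8 -pthread, the expectation is that the violation counter 
can be nonzero on TSO hardware while __sync_synchronize expands to a lone 
stbar, and stays zero once the barrier also contains an atomic instruction 
as in the patch.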

Now the GCC implementation seems to derive from that of the Linux kernel, 
which has:

/* XXX Change this if we ever use a PSO mode kernel. */
#define mb()    __asm__ __volatile__ ("" : : : "memory")

in include/asm-sparc/system.h and

#define mb()    \
        membar_safe("#LoadLoad | #LoadStore | #StoreStore | #StoreLoad")

in include/asm-sparc64/system.h.

So mb() isn't a full memory barrier for V8 either.
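
So, to get a full barrier on a V8/TSO system, the stbar has to be 
supplemented with an atomic instruction that drains the store buffer.  As a 
hedged sketch in kernel-macro style (mb_v8 is a made-up name; the asm 
string mirrors the sequence emitted by the patch below):

/* V8 loads are blocking, so #LoadLoad and #LoadStore come for free;
   stbar supplies #StoreStore under PSO; under TSO the atomic ldstub
   orders earlier stores wrt subsequent loads, i.e. #StoreLoad.  It
   clobbers a scratch byte just below the stack pointer.  */
#define mb_v8() \
        __asm__ __volatile__ ("stbar\n\tldstub [%%sp-1], %%g0" : : : "memory")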


        * config/sparc/sync.md (*stbar): Delete.
        (*membar_v8): New insn to implement UNSPEC_MEMBAR in SPARC-V8.


-- 
Eric Botcazou
Index: config/sparc/sync.md
===================================================================
--- config/sparc/sync.md	(revision 175408)
+++ config/sparc/sync.md	(working copy)
@@ -30,15 +30,20 @@ (define_expand "memory_barrier"
 {
   operands[0] = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
   MEM_VOLATILE_P (operands[0]) = 1;
-
 })
 
-(define_insn "*stbar"
+;; In V8, loads are blocking and ordered wrt earlier loads, i.e. every load
+;; is virtually followed by a load barrier (membar #LoadStore | #LoadLoad).
+;; In PSO, stbar orders the stores (membar #StoreStore).
+;; In TSO, ldstub orders the stores wrt subsequent loads (membar #StoreLoad).
+;; The combination of the three yields a full memory barrier in all cases.
+(define_insn "*membar_v8"
   [(set (match_operand:BLK 0 "" "")
 	(unspec:BLK [(match_dup 0)] UNSPEC_MEMBAR))]
   "TARGET_V8"
-  "stbar"
-  [(set_attr "type" "multi")])
+  "stbar\n\tldstub\t[%%sp-1], %%g0"
+  [(set_attr "type" "multi")
+   (set_attr "length" "2")])
 
 ;; membar #StoreStore | #LoadStore | #StoreLoad | #LoadLoad
 (define_insn "*membar"
