On 08/06/2015 01:41 AM, Nathan Sidwell wrote:
> I've committed this to fix the spinlock problem Cesar fell over.  While
> there I added more checking on the worker dimension.

I hit a couple more bugs with the spinlocks.  First, the address space
argument to membar wasn't being handled properly.  Second,
nvptx_spinunlock should probably be using atom.exch instead of atom.cas.
Finally, ptxas complains about the period prefix on the atom
instructions.  This patch addresses these problems.

Is there a better way to allocate a scratch register for
nvptx_spinunlock, or is my solution ok as-is for gomp-4_0-branch?

Thanks,
Cesar

2015-08-06  Cesar Philippidis  <ce...@codesourcery.com>

	gcc/
	* config/nvptx/nvptx.c (nvptx_expand_lock_unlock): Pass an additional
	scratch register to gen_nvptx_spinunlock.
	* config/nvptx/nvptx.md (nvptx_membar): Use %B for the address space
	operand.
	(nvptx_spinlock): Remove period prefix from atom.
	(nvptx_spinunlock): Take additional scratch register argument.  Use
	atom.exch to update the lock.

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 2013219..881aea4 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -3327,7 +3327,7 @@ nvptx_expand_lock_unlock (tree exp, bool lock)
 				   label);
     }
   else
-    pat = gen_nvptx_spinunlock (mem, space);
+    pat = gen_nvptx_spinunlock (mem, space, gen_reg_rtx (SImode));
   emit_insn (pat);
   if (lock)
     emit_insn (barrier);
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 8cd8300..fb88c72 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -1569,7 +1569,7 @@
   [(unspec_volatile [(match_operand:SI 0 "const_int_operand" "")]
 		    UNSPECV_MEMBAR)]
   ""
-  "membar%M0;")
+  "membar%B0;")
 
 ;; spinlock and unlock
 (define_insn "nvptx_spinlock"
@@ -1581,11 +1581,12 @@
    (match_operand:BI 3 "register_operand" "=R")
    (label_ref (match_operand 4 "" ""))])]
   ""
-  "%4:\\t.atom%R1.cas.b32 %2,%0,0,1;setp.ne.u32 %3,%2,0;@%3 bra.uni %4;")
+  "%4:\\tatom%R1.cas.b32 %2,%0,0,1;setp.ne.u32 %3,%2,0;@%3 bra.uni %4;")
 
 (define_insn "nvptx_spinunlock"
   [(unspec_volatile [(match_operand:SI 0 "memory_operand" "m")
 		     (match_operand:SI 1 "const_int_operand" "i")]
-		    UNSPECV_UNLOCK)]
+		    UNSPECV_UNLOCK)
+   (match_operand:SI 2 "register_operand" "=R")]
   ""
-  ".atom%R1.cas.b32 %0,1,0;")
+  "atom%R1.exch.b32 %2,%0,0;")
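
For reference, the lock/unlock protocol the two patterns implement can be
sketched on the host with C11 atomics.  This is only an illustration, not
part of the patch: spinlock_t, spin_lock and spin_unlock are hypothetical
names, and the sketch assumes the same encoding as the patterns
(0 = unlocked, 1 = locked).  Acquire spins on a compare-and-swap of 0 -> 1,
as atom.cas does in nvptx_spinlock; release is an unconditional exchange
with 0, as atom.exch does in nvptx_spinunlock.  The exchange hands back the
previous value, which mirrors why the PTX instruction needs a destination
(scratch) register even though the result is unused.

```c
#include <stdatomic.h>

/* Hypothetical host-side model of the lock word: 0 = unlocked, 1 = locked.  */
typedef atomic_uint spinlock_t;

/* Acquire: retry the compare-and-swap 0 -> 1 until it succeeds.  */
static void
spin_lock (spinlock_t *lock)
{
  unsigned expected;

  do
    /* A failed CAS overwrites EXPECTED with the current value, so reset it.  */
    expected = 0;
  while (!atomic_compare_exchange_strong (lock, &expected, 1));
}

/* Release: unconditionally store 0.  The previous value is returned only
   because an exchange always produces one; callers may ignore it.  */
static unsigned
spin_unlock (spinlock_t *lock)
{
  return atomic_exchange (lock, 0);
}
```

A caller would bracket its critical section with spin_lock (&lock) and
spin_unlock (&lock), discarding the value returned by the latter.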