Module: Mesa
Branch: master
Commit: 6dbb8d12a8b78769b9803884fad5f0d9923023bc
URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=6dbb8d12a8b78769b9803884fad5f0d9923023bc

Author: Samuel Pitoiset <samuel.pitoi...@gmail.com>
Date:   Mon Oct 24 21:41:11 2016 +0200

nv50/ir: do not perform global membar for shared memory

Shared memory is local to CTA, thus we should only wait for
prior memory writes which are visible to other threads in
the same CTA, and not at global level. This should speedup
compute shaders which use shared memory.

Signed-off-by: Samuel Pitoiset <samuel.pitoi...@gmail.com>
Reviewed-by: Ilia Mirkin <imir...@alum.mit.edu>

---

 src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp 
b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
index b47fc49..91cef81 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp
@@ -3561,12 +3561,15 @@ Converter::handleInstruction(const struct 
tgsi_full_instruction *insn)
       geni->subOp = tgsi::opcodeToSubOp(tgsi.getOpcode());
       break;
    case TGSI_OPCODE_MEMBAR:
+   {
+      uint32_t level = tgsi.getSrc(0).getValueU32(0, info);
       geni = mkOp(OP_MEMBAR, TYPE_NONE, NULL);
       geni->fixed = 1;
-      if (tgsi.getSrc(0).getValueU32(0, info) & TGSI_MEMBAR_THREAD_GROUP)
+      if (!(level & ~(TGSI_MEMBAR_THREAD_GROUP | TGSI_MEMBAR_SHARED)))
          geni->subOp = NV50_IR_SUBOP_MEMBAR(M, CTA);
       else
          geni->subOp = NV50_IR_SUBOP_MEMBAR(M, GL);
+   }
       break;
    case TGSI_OPCODE_ATOMUADD:
    case TGSI_OPCODE_ATOMXCHG:

_______________________________________________
mesa-commit mailing list
mesa-commit@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-commit

Reply via email to