Mesa (staging/20.2): intel/fs: Don't use NoDDClk/NoDDClr for split SHUFFLEs

GitLab Mirror Mon, 05 Oct 2020 11:30:18 -0700

Module: Mesa
Branch: staging/20.2
Commit: 6ea01cd07aa5914ad3b332bfb1fea8f7648d1f0d
URL:    http://cgit.freedesktop.org/mesa/mesa/commit/?id=6ea01cd07aa5914ad3b332bfb1fea8f7648d1f0d


Author: Jason Ekstrand <[email protected]>
Date:   Fri Oct  2 13:37:05 2020 -0500

intel/fs: Don't use NoDDClk/NoDDClr for split SHUFFLEs

When I copied and pasted the code from MOV_INDIRECT for handling the
dependency controls, I missed a subtle difference between MOV_INDIRECT
and SHUFFLE.  Specifically, MOV_INDIRECT gets lowered to a narrow
instruction on Gen7 by the SIMD width lowering, whereas SHUFFLE has to
be split in the generator.  Therefore, the safety check for whether or
not we can use dependency control has to be based on the lowered width
rather than the width of the original instruction.
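
As a rough illustration of the difference, the sketch below models the
check before and after this change.  ShuffleInfo and the helper names
are hypothetical and not part of Mesa, and the widths are assumed
values for a SIMD16 SHUFFLE that the generator emits as SIMD8 pieces.

```cpp
#include <cassert>

// Hypothetical stand-in for the values the generator consults; the
// field names mirror the patch but this struct does not exist in Mesa.
struct ShuffleInfo {
    bool predicated;          // models inst->predicate being set
    unsigned exec_size;       // width of the original instruction
    unsigned lower_width;     // width of each instruction actually emitted
    unsigned dispatch_width;  // width the shader was dispatched at
};

// Check as written before the fix: compares the original width.
constexpr bool dep_ctrl_before_fix(const ShuffleInfo &s) {
    return !s.predicated && s.exec_size == s.dispatch_width;
}

// Check as written after the fix: compares the lowered width.
constexpr bool dep_ctrl_after_fix(const ShuffleInfo &s) {
    return !s.predicated && s.lower_width == s.dispatch_width;
}

// Assumed example: a SIMD16 SHUFFLE split into SIMD8 instructions.
inline void check_split_shuffle() {
    constexpr ShuffleInfo split{false, 16, 8, 16};
    assert(dep_ctrl_before_fix(split));   // old check allowed dep control
    assert(!dep_ctrl_after_fix(split));   // new check rejects it
}
```

With these assumed widths, each emitted piece covers only part of the
dispatch width, so it may run with no channels enabled; only the
lowered-width comparison rules dependency control out.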

Fixes: a8ac61b0ee2fd "intel/fs: NoMask initialize the address..."
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3593
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6989>
(cherry picked from commit 8427e5606721019b0885af5b986a875e7d562643)

---

 .pick_status.json                       |  2 +-
 src/intel/compiler/brw_fs_generator.cpp | 18 +++++++++++++++---
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/.pick_status.json b/.pick_status.json
index 6cc5966ef96..7a99e9f46e2 100644
--- a/.pick_status.json
+++ b/.pick_status.json
@@ -391,7 +391,7 @@
         "description": "intel/fs: Don't use NoDDClk/NoDDClr for split SHUFFLEs",
         "nominated": true,
         "nomination_type": 1,
-        "resolution": 0,
+        "resolution": 1,
         "master_sha": null,
         "because_sha": "a8ac61b0ee2fdf4e8bc7b47aee9c24f96c40435c"
     },
diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index 1b62643027d..83409459563 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -652,11 +652,23 @@ fs_generator::generate_shuffle(fs_inst *inst,
 
          uint32_t src_start_offset = src.nr * REG_SIZE + src.subnr;
 
-         /* Whether we can use destination dependency control without running
-          * the risk of a hang if an instruction gets shot down.
+         /* From the Haswell PRM:
+          *
+          *    "When a sequence of NoDDChk and NoDDClr are used, the last
+          *    instruction that completes the scoreboard clear must have a
+          *    non-zero execution mask. This means, if any kind of predication
+          *    can change the execution mask or channel enable of the last
+          *    instruction, the optimization must be avoided.  This is to
+          *    avoid instructions being shot down the pipeline when no writes
+          *    are required."
+          *
+          * Whenever predication is enabled or the instructions being emitted
+          * aren't the full width, it's possible that it will be run with zero
+          * channels enabled so we can't use dependency control without
+          * running the risk of a hang if an instruction gets shot down.
           */
          const bool use_dep_ctrl = !inst->predicate &&
-                                   inst->exec_size == dispatch_width;
+                                   lower_width == dispatch_width;
          brw_inst *insn;
 
          /* Due to a hardware bug some platforms (particularly Gen11+) seem

_______________________________________________
mesa-commit mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/mesa-commit
