intel/compiler: fix ddy for half-float in gen8

Iago Toral Quiroga Tue, 15 Jan 2019 05:54:47 -0800

We use Align16 mode for this, since it is more convenient, but the PRM
for Broadwell states in Volume 3D Media GPGPU, Chapter 'Register region
restrictions', Section '1. Special Restrictions':


   "In Align16 mode, the channel selects and channel enables apply to a
    pair of half-floats, because these parameters are defined for DWord
    elements ONLY. This is applicable when both source and destination
    are half-floats."

This means that we cannot select individual HF elements using swizzles
like we do with 32-bit floats, so we can't implement the regioning
that ddy requires in Align16 mode.
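
To illustrate the restriction, here is a minimal sketch of one reading of
the PRM text quoted above. It is a toy model, not Mesa code: select_f32
and select_hf are hypothetical helpers, and the numbering of DWord
channels and elements is an assumption made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Illustrative model only (not Mesa code): one 128-bit Align16 group seen
// as four DWord channels x, y, z, w, numbered 0..3.

// 32-bit floats: one element per DWord, so each channel select names
// exactly one element.
std::array<int, 4> select_f32(const std::array<int, 4> &swizzle)
{
   for (int sel : swizzle)
      assert(sel >= 0 && sel < 4);
   return swizzle;
}

// Half-floats: each DWord holds two elements, so channel select c always
// brings in the pair (2c, 2c + 1).
std::array<int, 8> select_hf(const std::array<int, 4> &swizzle)
{
   std::array<int, 8> elems{};
   for (std::size_t ch = 0; ch < 4; ch++) {
      assert(swizzle[ch] >= 0 && swizzle[ch] < 4);
      elems[2 * ch]     = 2 * swizzle[ch];
      elems[2 * ch + 1] = 2 * swizzle[ch] + 1;
   }
   return elems;
}
```

Under this model, an XYXY-style select {0, 1, 0, 1} reads elements
0, 1, 0, 1 for 32-bit floats but 0, 1, 2, 3, 0, 1, 2, 3 for half-floats:
every select pulls in a whole pair, never a single HF element.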

Use the gen11 path for this instead, which uses Align1 mode.
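
The Align1 regioning used by that path can be sketched as below. This is
a minimal model, not the Mesa implementation: region_element and
ddy_fine_quad are hypothetical helpers, and both the 2x2 quad layout
(elements 0 and 1 on one row, 2 and 3 on the other) and the sign of the
difference are assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Element index read by channel `ch` of an Align1 region
// <vstride; width, hstride>, relative to the region's base element.
std::size_t region_element(std::size_t ch, std::size_t vstride,
                           std::size_t width, std::size_t hstride)
{
   assert(width > 0);
   return (ch / width) * vstride + (ch % width) * hstride;
}

// Fine ddy for one 2x2 quad: each channel subtracts the value in its
// column on one row from the value in the same column on the other row.
std::array<float, 4> ddy_fine_quad(const std::array<float, 4> &quad)
{
   std::array<float, 4> out{};
   for (std::size_t ch = 0; ch < 4; ch++) {
      // Region <0; 2, 1> repeats elements 0, 1 for every pair of channels.
      const std::size_t e = region_element(ch, 0, 2, 1);
      // Base offsets 0 and 2 mirror src_0 and src_2 in the patch.
      out[ch] = quad[2 + e] - quad[e];
   }
   return out;
}
```

For a quad holding {1, 2, 5, 9}, this yields {4, 7, 4, 7}. Because the
region is expressed in element units, it selects individual elements
whatever their size, which is what the half-float case needs.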

The restriction is not present in gen9 or gen10, where the Align16
implementation seems to work just fine.

Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>
---
 src/intel/compiler/brw_fs_generator.cpp | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
index d0cc4a6d231..4310f0b7fdc 100644
--- a/src/intel/compiler/brw_fs_generator.cpp
+++ b/src/intel/compiler/brw_fs_generator.cpp
@@ -1339,8 +1339,14 @@ fs_generator::generate_ddy(const fs_inst *inst,
    const uint32_t type_size = type_sz(src.type);
 
    if (inst->opcode == FS_OPCODE_DDY_FINE) {
-      /* produce accurate derivatives */
-      if (devinfo->gen >= 11) {
+      /* Produce accurate derivatives. We can do this easily in Align16,
+       * but Align16 is not supported in gen11+, and gen8 Align16 swizzles
+       * for half-float operands work in units of 32 bits and always
+       * select pairs of consecutive half-float elements, so we can't
+       * use it for this.
+       */
+      if (devinfo->gen >= 11 ||
+          (devinfo->gen == 8 && src.type == BRW_REGISTER_TYPE_HF)) {
          src = stride(src, 0, 2, 1);
          struct brw_reg src_0  = byte_offset(src,  0 * type_size);
          struct brw_reg src_2  = byte_offset(src,  2 * type_size);
-- 
2.17.1

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH v3 24/42] intel/compiler: fix ddy for half-float in gen8
