On Tue, Dec 4, 2018 at 1:19 AM Iago Toral Quiroga <ito...@igalia.com> wrote:

> We use Align16 mode for this, since it is more convenient, but the PRM
> for Broadwell states in Volume 3D Media GPGPU, Chapter 'Register region
> restrictions', Section '1. Special Restrictions':
>
>    "In Align16 mode, the channel selects and channel enables apply to a
>     pair of half-floats, because these parameters are defined for DWord
>     elements ONLY. This is applicable when both source and destination
>     are half-floats."
>
> This means that we cannot select individual HF elements using swizzles
> like we do with 32-bit floats, so we can't implement the required
> regioning for this.
>
> Use the gen11 path for this instead, which uses Align1 mode.
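
To make the restriction quoted above concrete, here is a toy model in plain C++. It is only a sketch of one reading of the PRM text, not Mesa code and not a description of the actual hardware datapath. The names `Swizzle`, `swizzle_per_element` and `swizzle_per_dword_pair` are hypothetical, and integer labels stand in for half-float values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Four component selects, 0 = X, 1 = Y, 2 = Z, 3 = W (illustrative encoding).
using Swizzle = std::array<int, 4>;

// Model of a swizzle on 32-bit elements: each select picks one element
// out of every group of four.
std::vector<int> swizzle_per_element(const std::vector<int> &elems,
                                     const Swizzle &swz)
{
   std::vector<int> out;
   for (std::size_t base = 0; base + 4 <= elems.size(); base += 4) {
      for (int sel : swz) {
         assert(sel >= 0 && sel < 4);
         out.push_back(elems[base + sel]);
      }
   }
   return out;
}

// Model of the same swizzle when the selects are defined for DWords only:
// each select drags along a pair of consecutive 16-bit elements.
std::vector<int> swizzle_per_dword_pair(const std::vector<int> &elems,
                                        const Swizzle &swz)
{
   std::vector<int> out;
   for (std::size_t base = 0; base + 8 <= elems.size(); base += 8) {
      for (int sel : swz) {
         assert(sel >= 0 && sel < 4);
         out.push_back(elems[base + 2 * sel]);
         out.push_back(elems[base + 2 * sel + 1]);
      }
   }
   return out;
}
```

With elements labeled 0..7 and an XYXY swizzle, the per-element model yields 0 1 0 1 4 5 4 5, while the DWord-pair model yields 0 1 2 3 0 1 2 3. In this model no choice of selects can produce the first pattern from half-float data, which is the problem the patch works around.
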
>
> The restriction is not present in gen9 of gen10, where the Align16
>

"or gen10"?

Reviewed-by: Jason Ekstrand <ja...@jlekstrand.net>


> implementation seems to work just fine.
> ---
>  src/intel/compiler/brw_fs_generator.cpp | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp
> index d8e4bae17e0..ba7ed07e692 100644
> --- a/src/intel/compiler/brw_fs_generator.cpp
> +++ b/src/intel/compiler/brw_fs_generator.cpp
> @@ -1281,8 +1281,14 @@ fs_generator::generate_ddy(const fs_inst *inst,
>     const uint32_t type_size = type_sz(src.type);
>
>     if (inst->opcode == FS_OPCODE_DDY_FINE) {
> -      /* produce accurate derivatives */
> -      if (devinfo->gen >= 11) {
> +      /* produce accurate derivatives. We can do this easily in Align16
> +       * but this is not supported in gen11+. Also, gen8 Align16 swizzles
> +       * for Half-Float operands work in units of 32 bits and always
> +       * select pairs of consecutive half-float elements, so we can't
> +       * use it for this.
> +       */
> +      if (devinfo->gen >= 11 ||
> +          (devinfo->gen == 8 && src.type == BRW_REGISTER_TYPE_HF)) {
>           src = stride(src, 0, 2, 1);
>           struct brw_reg src_0  = byte_offset(src,  0 * type_size);
>           struct brw_reg src_2  = byte_offset(src,  2 * type_size);
> --
> 2.17.1
>
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>