Re: [Mesa-dev] [PATCH 04/10] i965: Have NIR lower flrp on pre-GEN6 vec4 backend

2016-03-10 Thread Matt Turner
On Thu, Mar 10, 2016 at 10:25 AM, Ian Romanick wrote:
> From: Ian Romanick 
>
> Previously we were doing the lowering by hand in vec4_visitor::emit_lrp.
> By doing it in NIR, we have the opportunity for NIR to do additional
> optimization of the expanded code.
>
> This also enables optimizations added by the next commit.
>
> shader-db results:
>
> G4X / Ironlake
> total instructions in shared programs: 4024401 -> 4016538 (-0.20%)
> instructions in affected programs: 447686 -> 439823 (-1.76%)
> helped: 2623
> HURT: 0
>
> total cycles in shared programs: 84375846 -> 84328296 (-0.06%)
> cycles in affected programs: 16964960 -> 16917410 (-0.28%)
> helped: 2556
> HURT: 41
>
> Unsurprisingly, no changes on later platforms.
>
> Signed-off-by: Ian Romanick 
> ---
>  src/mesa/drivers/dri/i965/brw_compiler.c | 27 +--
>  1 file changed, 25 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.c b/src/mesa/drivers/dri/i965/brw_compiler.c
> index 2f05a26..6f67b5c 100644
> --- a/src/mesa/drivers/dri/i965/brw_compiler.c
> +++ b/src/mesa/drivers/dri/i965/brw_compiler.c
> @@ -107,6 +107,26 @@ static const struct nir_shader_compiler_options vector_nir_options = {
>  */
> .fdot_replicates = true,
>
> +   /* Prior to GEN6, there are no three source operations for SIMD4x2. */

Gen's not an acronym, so we don't write it in all-caps.

> +   .lower_flrp = true,
> +
> +   .lower_pack_snorm_2x16 = true,
> +   .lower_pack_unorm_2x16 = true,
> +   .lower_unpack_snorm_2x16 = true,
> +   .lower_unpack_unorm_2x16 = true,
> +   .lower_extract_byte = true,
> +   .lower_extract_word = true,
> +};
> +
> +static const struct nir_shader_compiler_options vector_nir_options_gen6 = {
> +   COMMON_OPTIONS,
> +
> +   /* In the vec4 backend, our dpN instruction replicates its result to all the
> +* components of a vec4.  We would like NIR to give us replicated fdot
> +* instructions because it can optimize better for us.
> +*/
> +   .fdot_replicates = true,
> +
> .lower_pack_snorm_2x16 = true,
> .lower_pack_unorm_2x16 = true,
> .lower_unpack_snorm_2x16 = true,
> @@ -159,8 +179,11 @@ brw_compiler_create(void *mem_ctx, const struct brw_device_info *devinfo)
>if (devinfo->gen < 7)
>   compiler->glsl_compiler_options[i].EmitNoIndirectSampler = true;
>
> -  compiler->glsl_compiler_options[i].NirOptions =
> - is_scalar ? &scalar_nir_options : &vector_nir_options;
> +  if (is_scalar)
> + compiler->glsl_compiler_options[i].NirOptions = &scalar_nir_options;
> +  else
> + compiler->glsl_compiler_options[i].NirOptions =
> +devinfo->gen < 6 ? &vector_nir_options : &vector_nir_options_gen6;

Braces since this statement is multiline (and braces around the if
since the else will have them).
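
Something along these lines is what I mean. This is only a sketch: the struct definitions are hypothetical stand-ins for the real Mesa types, and select_nir_options() is a made-up helper that exists just so the snippet compiles on its own. In the patch the assignment stays inline in brw_compiler_create().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real Mesa types; only what the sketch needs. */
struct nir_shader_compiler_options { const char *name; };
struct brw_device_info { int gen; };

static const struct nir_shader_compiler_options scalar_nir_options = { "scalar" };
static const struct nir_shader_compiler_options vector_nir_options = { "vec4, pre-Gen6" };
static const struct nir_shader_compiler_options vector_nir_options_gen6 = { "vec4, Gen6+" };

/* Hypothetical helper: the same selection as the hunk above, with braces on
 * the else branch because its statement spans two lines, and on the if
 * branch so that both arms match.
 */
static const struct nir_shader_compiler_options *
select_nir_options(const struct brw_device_info *devinfo, bool is_scalar)
{
   const struct nir_shader_compiler_options *options;

   assert(devinfo != NULL);

   if (is_scalar) {
      options = &scalar_nir_options;
   } else {
      options = devinfo->gen < 6 ? &vector_nir_options
                                 : &vector_nir_options_gen6;
   }

   return options;
}
```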

Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 04/10] i965: Have NIR lower flrp on pre-GEN6 vec4 backend

2016-03-10 Thread Ian Romanick
From: Ian Romanick 

Previously we were doing the lowering by hand in vec4_visitor::emit_lrp.
By doing it in NIR, we have the opportunity for NIR to do additional
optimization of the expanded code.
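
As a rough sketch of what lowering flrp means here, assuming flrp(x, y, a) follows the GLSL mix() definition x*(1-a) + y*a: the three-source operation can be rewritten using only two-source operations. The function names below are hypothetical, and the exact expression NIR emits is not shown in this patch; this is just one algebraically equivalent form.

```c
/* Reference definition, assuming flrp matches GLSL mix(x, y, a). */
static float
flrp_reference(float x, float y, float a)
{
   return x * (1.0f - a) + y * a;
}

/* One possible expansion that needs only two-source operations:
 * a subtract, a multiply and an add.
 */
static float
flrp_lowered(float x, float y, float a)
{
   float diff = y - x;        /* two-source subtract */
   float scaled = a * diff;   /* two-source multiply */
   return x + scaled;         /* two-source add */
}
```

Once the expression is expanded like this, each piece is an ordinary ALU operation that later passes can combine or simplify with neighboring code.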

This also enables optimizations added by the next commit.

shader-db results:

G4X / Ironlake
total instructions in shared programs: 4024401 -> 4016538 (-0.20%)
instructions in affected programs: 447686 -> 439823 (-1.76%)
helped: 2623
HURT: 0

total cycles in shared programs: 84375846 -> 84328296 (-0.06%)
cycles in affected programs: 16964960 -> 16917410 (-0.28%)
helped: 2556
HURT: 41

Unsurprisingly, no changes on later platforms.

Signed-off-by: Ian Romanick 
---
 src/mesa/drivers/dri/i965/brw_compiler.c | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_compiler.c b/src/mesa/drivers/dri/i965/brw_compiler.c
index 2f05a26..6f67b5c 100644
--- a/src/mesa/drivers/dri/i965/brw_compiler.c
+++ b/src/mesa/drivers/dri/i965/brw_compiler.c
@@ -107,6 +107,26 @@ static const struct nir_shader_compiler_options vector_nir_options = {
 */
.fdot_replicates = true,
 
+   /* Prior to GEN6, there are no three source operations for SIMD4x2. */
+   .lower_flrp = true,
+
+   .lower_pack_snorm_2x16 = true,
+   .lower_pack_unorm_2x16 = true,
+   .lower_unpack_snorm_2x16 = true,
+   .lower_unpack_unorm_2x16 = true,
+   .lower_extract_byte = true,
+   .lower_extract_word = true,
+};
+
+static const struct nir_shader_compiler_options vector_nir_options_gen6 = {
+   COMMON_OPTIONS,
+
+   /* In the vec4 backend, our dpN instruction replicates its result to all the
+* components of a vec4.  We would like NIR to give us replicated fdot
+* instructions because it can optimize better for us.
+*/
+   .fdot_replicates = true,
+
.lower_pack_snorm_2x16 = true,
.lower_pack_unorm_2x16 = true,
.lower_unpack_snorm_2x16 = true,
@@ -159,8 +179,11 @@ brw_compiler_create(void *mem_ctx, const struct brw_device_info *devinfo)
   if (devinfo->gen < 7)
  compiler->glsl_compiler_options[i].EmitNoIndirectSampler = true;
 
-  compiler->glsl_compiler_options[i].NirOptions =
- is_scalar ? &scalar_nir_options : &vector_nir_options;
+  if (is_scalar)
+ compiler->glsl_compiler_options[i].NirOptions = &scalar_nir_options;
+  else
+ compiler->glsl_compiler_options[i].NirOptions =
+devinfo->gen < 6 ? &vector_nir_options : &vector_nir_options_gen6;
 
   compiler->glsl_compiler_options[i].LowerBufferInterfaceBlocks = true;
}
-- 
2.5.0
