[Mesa-dev] [PATCH 1/4] nir: Do basic constant reassociation.

2016-04-07 Thread Kenneth Graunke
Many shaders contain expression trees of the form:

const_1 * (value * const_2)

Reorganizing these to

(const_1 * const_2) * value

will allow constant folding to combine the constants.  Sometimes, these
constants are 2 and 0.5, so we can remove a multiply altogether.  Other
times, it can create more immediate constants, which can actually hurt.

Finding a good balance here is tricky.  While much more could be done,
this simple patch seems to have a lot of positive benefit while having
a low downside.

shader-db results on Broadwell:

total instructions in shared programs: 8963768 -> 8961369 (-0.03%)
instructions in affected programs: 438318 -> 435919 (-0.55%)
helped: 1502
HURT: 245

total cycles in shared programs: 71527354 -> 71421516 (-0.15%)
cycles in affected programs: 11541788 -> 11435950 (-0.92%)
helped: 3445
HURT: 1224

Signed-off-by: Kenneth Graunke 
---
 src/compiler/nir/nir_opt_algebraic.py | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index e72b4a7..420d9d9 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -274,6 +274,14 @@ optimizations = [
(('fmul', ('fneg', a), b), ('fneg', ('fmul', a, b))),
(('imul', ('ineg', a), b), ('ineg', ('imul', a, b))),
 
+   # Reassociate constants in add/mul chains so they can be folded together.
+   # For now, we only handle cases where the constants are separated by
+   # a single non-constant.  We could do better eventually.
+   (('~fmul', '#a', ('fmul', b, '#c')), ('fmul', ('fmul', a, c), b)),
+   (('imul', '#a', ('imul', b, '#c')), ('imul', ('imul', a, c), b)),
+   (('~fadd', '#a', ('fadd', b, '#c')), ('fadd', ('fadd', a, c), b)),
+   (('iadd', '#a', ('iadd', b, '#c')), ('iadd', ('iadd', a, c), b)),
+
# Misc. lowering
(('fmod', a, b), ('fsub', a, ('fmul', b, ('ffloor', ('fdiv', a, b, 
'options->lower_fmod'),
(('uadd_carry', a, b), ('b2i', ('ult', ('iadd', a, b), a)), 
'options->lower_uadd_carry'),
-- 
2.8.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] nir: Do basic constant reassociation.

2016-04-08 Thread Eduardo Lima Mitev

On 04/08/2016 01:35 AM, Kenneth Graunke wrote:

Many shaders contain expression trees of the form:

 const_1 * (value * const_2)

Reorganizing these to

 (const_1 * const_2) * value

will allow constant folding to combine the constants.  Sometimes, these
constants are 2 and 0.5, so we can remove a multiply altogether.  Other
times, it can create more immediate constants, which can actually hurt.

Finding a good balance here is tricky.  While much more could be done,
this simple patch seems to have a lot of positive benefit while having
a low downside.



Makes sense.

Reviewed-by: Eduardo Lima Mitev 


shader-db results on Broadwell:

total instructions in shared programs: 8963768 -> 8961369 (-0.03%)
instructions in affected programs: 438318 -> 435919 (-0.55%)
helped: 1502
HURT: 245

total cycles in shared programs: 71527354 -> 71421516 (-0.15%)
cycles in affected programs: 11541788 -> 11435950 (-0.92%)
helped: 3445
HURT: 1224



FWIW, these are my results on HSW:

total instructions in shared programs: 6223995 -> 6222470 (-0.02%)
instructions in affected programs: 300945 -> 299420 (-0.51%)
helped: 1231
HURT: 240

total cycles in shared programs: 56809432 -> 56615552 (-0.34%)
cycles in affected programs: 7980160 -> 7786280 (-2.43%)
helped: 2087
HURT: 634

Eduardo


Signed-off-by: Kenneth Graunke 
---
  src/compiler/nir/nir_opt_algebraic.py | 8 
  1 file changed, 8 insertions(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py 
b/src/compiler/nir/nir_opt_algebraic.py
index e72b4a7..420d9d9 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -274,6 +274,14 @@ optimizations = [
 (('fmul', ('fneg', a), b), ('fneg', ('fmul', a, b))),
 (('imul', ('ineg', a), b), ('ineg', ('imul', a, b))),

+   # Reassociate constants in add/mul chains so they can be folded together.
+   # For now, we only handle cases where the constants are separated by
+   # a single non-constant.  We could do better eventually.
+   (('~fmul', '#a', ('fmul', b, '#c')), ('fmul', ('fmul', a, c), b)),
+   (('imul', '#a', ('imul', b, '#c')), ('imul', ('imul', a, c), b)),
+   (('~fadd', '#a', ('fadd', b, '#c')), ('fadd', ('fadd', a, c), b)),
+   (('iadd', '#a', ('iadd', b, '#c')), ('iadd', ('iadd', a, c), b)),
+
 # Misc. lowering
 (('fmod', a, b), ('fsub', a, ('fmul', b, ('ffloor', ('fdiv', a, b, 
'options->lower_fmod'),
 (('uadd_carry', a, b), ('b2i', ('ult', ('iadd', a, b), a)), 
'options->lower_uadd_carry'),



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] nir: Do basic constant reassociation.

2016-04-08 Thread Jason Ekstrand
On Fri, Apr 8, 2016 at 8:21 AM, Eduardo Lima Mitev  wrote:

> On 04/08/2016 01:35 AM, Kenneth Graunke wrote:
>
>> Many shaders contain expression trees of the form:
>>
>>  const_1 * (value * const_2)
>>
>> Reorganizing these to
>>
>>  (const_1 * const_2) * value
>>
>> will allow constant folding to combine the constants.  Sometimes, these
>> constants are 2 and 0.5, so we can remove a multiply altogether.  Other
>> times, it can create more immediate constants, which can actually hurt.
>>
>
Further justification would be that the original was 4 NIR instructions: 2
load_const and 2 multiplies.  Once constant folding takes over, we are
again left with 4 NIR instructions: 2 load_const and 2 multiplies.  The
difference is that we now have (value * const_2) and (value * (const_1 *
const_2)).  So we have no more instructions in the end and, hopefully, the
(value * const_2) goes away.  Obviously, the emperical evidence says this
actually hurts sometimes, but that's not surprising.


> Finding a good balance here is tricky.  While much more could be done,
>> this simple patch seems to have a lot of positive benefit while having
>> a low downside.
>>
>>
> Makes sense.
>
> Reviewed-by: Eduardo Lima Mitev 
>
> shader-db results on Broadwell:
>>
>> total instructions in shared programs: 8963768 -> 8961369 (-0.03%)
>> instructions in affected programs: 438318 -> 435919 (-0.55%)
>> helped: 1502
>> HURT: 245
>>
>> total cycles in shared programs: 71527354 -> 71421516 (-0.15%)
>> cycles in affected programs: 11541788 -> 11435950 (-0.92%)
>> helped: 3445
>> HURT: 1224
>>
>>
> FWIW, these are my results on HSW:
>
> total instructions in shared programs: 6223995 -> 6222470 (-0.02%)
> instructions in affected programs: 300945 -> 299420 (-0.51%)
> helped: 1231
> HURT: 240
>
> total cycles in shared programs: 56809432 -> 56615552 (-0.34%)
> cycles in affected programs: 7980160 -> 7786280 (-2.43%)
> helped: 2087
> HURT: 634
>
> Eduardo
>
> Signed-off-by: Kenneth Graunke 
>> ---
>>   src/compiler/nir/nir_opt_algebraic.py | 8 
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/src/compiler/nir/nir_opt_algebraic.py
>> b/src/compiler/nir/nir_opt_algebraic.py
>> index e72b4a7..420d9d9 100644
>> --- a/src/compiler/nir/nir_opt_algebraic.py
>> +++ b/src/compiler/nir/nir_opt_algebraic.py
>> @@ -274,6 +274,14 @@ optimizations = [
>>  (('fmul', ('fneg', a), b), ('fneg', ('fmul', a, b))),
>>  (('imul', ('ineg', a), b), ('ineg', ('imul', a, b))),
>>
>> +   # Reassociate constants in add/mul chains so they can be folded
>> together.
>> +   # For now, we only handle cases where the constants are separated by
>> +   # a single non-constant.  We could do better eventually.
>> +   (('~fmul', '#a', ('fmul', b, '#c')), ('fmul', ('fmul', a, c), b)),
>> +   (('imul', '#a', ('imul', b, '#c')), ('imul', ('imul', a, c), b)),
>> +   (('~fadd', '#a', ('fadd', b, '#c')), ('fadd', ('fadd', a, c), b)),
>> +   (('iadd', '#a', ('iadd', b, '#c')), ('iadd', ('iadd', a, c), b)),
>> +
>>  # Misc. lowering
>>  (('fmod', a, b), ('fsub', a, ('fmul', b, ('ffloor', ('fdiv', a,
>> b, 'options->lower_fmod'),
>>  (('uadd_carry', a, b), ('b2i', ('ult', ('iadd', a, b), a)),
>> 'options->lower_uadd_carry'),
>>
>>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] nir: Do basic constant reassociation.

2016-04-11 Thread Matt Turner
On Thu, Apr 7, 2016 at 4:35 PM, Kenneth Graunke  wrote:
> Many shaders contain expression trees of the form:
>
> const_1 * (value * const_2)
>
> Reorganizing these to
>
> (const_1 * const_2) * value
>
> will allow constant folding to combine the constants.  Sometimes, these
> constants are 2 and 0.5, so we can remove a multiply altogether.  Other
> times, it can create more immediate constants, which can actually hurt.
>
> Finding a good balance here is tricky.  While much more could be done,
> this simple patch seems to have a lot of positive benefit while having
> a low downside.
>
> shader-db results on Broadwell:
>
> total instructions in shared programs: 8963768 -> 8961369 (-0.03%)
> instructions in affected programs: 438318 -> 435919 (-0.55%)
> helped: 1502
> HURT: 245
>
> total cycles in shared programs: 71527354 -> 71421516 (-0.15%)
> cycles in affected programs: 11541788 -> 11435950 (-0.92%)
> helped: 3445
> HURT: 1224
>

The series is

Reviewed-by: Matt Turner 

The shaders most hurt from this patch are... funny. They do

s_texcoord_0 = texcoord + offset * vec4(-1.5,-1.5,-0.5,-1.5);
s_texcoord_1 = texcoord + offset * vec4( 0.5,-1.5, 1.5,-1.5);
s_texcoord_2 = texcoord + offset * vec4(-1.5,-0.5,-0.5,-0.5);
s_texcoord_3 = texcoord + offset * vec4( 0.5,-0.5, 1.5,-0.5);
s_texcoord_4 = texcoord + offset * vec4(-1.5, 0.5,-0.5, 0.5);
s_texcoord_5 = texcoord + offset * vec4( 0.5, 0.5, 1.5, 0.5);
s_texcoord_6 = texcoord + offset * vec4(-1.5, 1.5,-0.5, 1.5);
s_texcoord_7 = texcoord + offset * vec4( 0.5, 1.5, 1.5, 1.5);

Today, we generate 8 MOV instructions with VF immediates. We could
have just loaded 0.5, -0.5, 1.5, and -1.5 with a single VF immediate
and then swizzled that as needed, but how would we recognize that all
of these can be combined? NIR CSE? Part of the difficulty is that each
of the vec4s in the source language include only 3 of the immediate
floats -- none contain all four.

Anyway, the programs go from 28->38 instructions because when things
are multiplied in a different order the constants become 0.125,
-0.125, 0.375, and -0.375, and since ±0.125 isn't representable as a
VF we generate even more silly instructions!

If we could recognize all of these as swizzles of vec4(0.5, -0.5, 1.5,
-1.5) at the NIR level we could at cut those hurt programs down by a
lot.

Everything else just looks like noise to me.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev