[Mesa-dev] [PATCH 1/4] nir: Do basic constant reassociation.
Many shaders contain expression trees of the form: const_1 * (value * const_2) Reorganizing these to (const_1 * const_2) * value will allow constant folding to combine the constants. Sometimes, these constants are 2 and 0.5, so we can remove a multiply altogether. Other times, it can create more immediate constants, which can actually hurt. Finding a good balance here is tricky. While much more could be done, this simple patch seems to have a lot of positive benefit while having a low downside. shader-db results on Broadwell: total instructions in shared programs: 8963768 -> 8961369 (-0.03%) instructions in affected programs: 438318 -> 435919 (-0.55%) helped: 1502 HURT: 245 total cycles in shared programs: 71527354 -> 71421516 (-0.15%) cycles in affected programs: 11541788 -> 11435950 (-0.92%) helped: 3445 HURT: 1224 Signed-off-by: Kenneth Graunke --- src/compiler/nir/nir_opt_algebraic.py | 8 1 file changed, 8 insertions(+) diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py index e72b4a7..420d9d9 100644 --- a/src/compiler/nir/nir_opt_algebraic.py +++ b/src/compiler/nir/nir_opt_algebraic.py @@ -274,6 +274,14 @@ optimizations = [ (('fmul', ('fneg', a), b), ('fneg', ('fmul', a, b))), (('imul', ('ineg', a), b), ('ineg', ('imul', a, b))), + # Reassociate constants in add/mul chains so they can be folded together. + # For now, we only handle cases where the constants are separated by + # a single non-constant. We could do better eventually. + (('~fmul', '#a', ('fmul', b, '#c')), ('fmul', ('fmul', a, c), b)), + (('imul', '#a', ('imul', b, '#c')), ('imul', ('imul', a, c), b)), + (('~fadd', '#a', ('fadd', b, '#c')), ('fadd', ('fadd', a, c), b)), + (('iadd', '#a', ('iadd', b, '#c')), ('iadd', ('iadd', a, c), b)), + # Misc. lowering (('fmod', a, b), ('fsub', a, ('fmul', b, ('ffloor', ('fdiv', a, b, 'options->lower_fmod'), (('uadd_carry', a, b), ('b2i', ('ult', ('iadd', a, b), a)), 'options->lower_uadd_carry'), -- 2.8.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/4] nir: Do basic constant reassociation.
On 04/08/2016 01:35 AM, Kenneth Graunke wrote: Many shaders contain expression trees of the form: const_1 * (value * const_2) Reorganizing these to (const_1 * const_2) * value will allow constant folding to combine the constants. Sometimes, these constants are 2 and 0.5, so we can remove a multiply altogether. Other times, it can create more immediate constants, which can actually hurt. Finding a good balance here is tricky. While much more could be done, this simple patch seems to have a lot of positive benefit while having a low downside. Makes sense. Reviewed-by: Eduardo Lima Mitev shader-db results on Broadwell: total instructions in shared programs: 8963768 -> 8961369 (-0.03%) instructions in affected programs: 438318 -> 435919 (-0.55%) helped: 1502 HURT: 245 total cycles in shared programs: 71527354 -> 71421516 (-0.15%) cycles in affected programs: 11541788 -> 11435950 (-0.92%) helped: 3445 HURT: 1224 FWIW, these are my results on HSW: total instructions in shared programs: 6223995 -> 6222470 (-0.02%) instructions in affected programs: 300945 -> 299420 (-0.51%) helped: 1231 HURT: 240 total cycles in shared programs: 56809432 -> 56615552 (-0.34%) cycles in affected programs: 7980160 -> 7786280 (-2.43%) helped: 2087 HURT: 634 Eduardo Signed-off-by: Kenneth Graunke --- src/compiler/nir/nir_opt_algebraic.py | 8 1 file changed, 8 insertions(+) diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py index e72b4a7..420d9d9 100644 --- a/src/compiler/nir/nir_opt_algebraic.py +++ b/src/compiler/nir/nir_opt_algebraic.py @@ -274,6 +274,14 @@ optimizations = [ (('fmul', ('fneg', a), b), ('fneg', ('fmul', a, b))), (('imul', ('ineg', a), b), ('ineg', ('imul', a, b))), + # Reassociate constants in add/mul chains so they can be folded together. + # For now, we only handle cases where the constants are separated by + # a single non-constant. We could do better eventually. + (('~fmul', '#a', ('fmul', b, '#c')), ('fmul', ('fmul', a, c), b)), + (('imul', '#a', ('imul', b, '#c')), ('imul', ('imul', a, c), b)), + (('~fadd', '#a', ('fadd', b, '#c')), ('fadd', ('fadd', a, c), b)), + (('iadd', '#a', ('iadd', b, '#c')), ('iadd', ('iadd', a, c), b)), + # Misc. lowering (('fmod', a, b), ('fsub', a, ('fmul', b, ('ffloor', ('fdiv', a, b, 'options->lower_fmod'), (('uadd_carry', a, b), ('b2i', ('ult', ('iadd', a, b), a)), 'options->lower_uadd_carry'), ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/4] nir: Do basic constant reassociation.
On Fri, Apr 8, 2016 at 8:21 AM, Eduardo Lima Mitev wrote: > On 04/08/2016 01:35 AM, Kenneth Graunke wrote: > >> Many shaders contain expression trees of the form: >> >> const_1 * (value * const_2) >> >> Reorganizing these to >> >> (const_1 * const_2) * value >> >> will allow constant folding to combine the constants. Sometimes, these >> constants are 2 and 0.5, so we can remove a multiply altogether. Other >> times, it can create more immediate constants, which can actually hurt. >> > Further justification would be that the original was 4 NIR instructions: 2 load_const and 2 multiplies. Once constant folding takes over, we are again left with 4 NIR instructions: 2 load_const and 2 multiplies. The difference is that we now have (value * const_2) and (value * (const_1 * const_2)). So we have no more instructions in the end and, hopefully, the (value * const_2) goes away. Obviously, the emperical evidence says this actually hurts sometimes, but that's not surprising. > Finding a good balance here is tricky. While much more could be done, >> this simple patch seems to have a lot of positive benefit while having >> a low downside. >> >> > Makes sense. > > Reviewed-by: Eduardo Lima Mitev > > shader-db results on Broadwell: >> >> total instructions in shared programs: 8963768 -> 8961369 (-0.03%) >> instructions in affected programs: 438318 -> 435919 (-0.55%) >> helped: 1502 >> HURT: 245 >> >> total cycles in shared programs: 71527354 -> 71421516 (-0.15%) >> cycles in affected programs: 11541788 -> 11435950 (-0.92%) >> helped: 3445 >> HURT: 1224 >> >> > FWIW, these are my results on HSW: > > total instructions in shared programs: 6223995 -> 6222470 (-0.02%) > instructions in affected programs: 300945 -> 299420 (-0.51%) > helped: 1231 > HURT: 240 > > total cycles in shared programs: 56809432 -> 56615552 (-0.34%) > cycles in affected programs: 7980160 -> 7786280 (-2.43%) > helped: 2087 > HURT: 634 > > Eduardo > > Signed-off-by: Kenneth Graunke >> --- >> src/compiler/nir/nir_opt_algebraic.py | 8 >> 1 file changed, 8 insertions(+) >> >> diff --git a/src/compiler/nir/nir_opt_algebraic.py >> b/src/compiler/nir/nir_opt_algebraic.py >> index e72b4a7..420d9d9 100644 >> --- a/src/compiler/nir/nir_opt_algebraic.py >> +++ b/src/compiler/nir/nir_opt_algebraic.py >> @@ -274,6 +274,14 @@ optimizations = [ >> (('fmul', ('fneg', a), b), ('fneg', ('fmul', a, b))), >> (('imul', ('ineg', a), b), ('ineg', ('imul', a, b))), >> >> + # Reassociate constants in add/mul chains so they can be folded >> together. >> + # For now, we only handle cases where the constants are separated by >> + # a single non-constant. We could do better eventually. >> + (('~fmul', '#a', ('fmul', b, '#c')), ('fmul', ('fmul', a, c), b)), >> + (('imul', '#a', ('imul', b, '#c')), ('imul', ('imul', a, c), b)), >> + (('~fadd', '#a', ('fadd', b, '#c')), ('fadd', ('fadd', a, c), b)), >> + (('iadd', '#a', ('iadd', b, '#c')), ('iadd', ('iadd', a, c), b)), >> + >> # Misc. lowering >> (('fmod', a, b), ('fsub', a, ('fmul', b, ('ffloor', ('fdiv', a, >> b, 'options->lower_fmod'), >> (('uadd_carry', a, b), ('b2i', ('ult', ('iadd', a, b), a)), >> 'options->lower_uadd_carry'), >> >> > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/4] nir: Do basic constant reassociation.
On Thu, Apr 7, 2016 at 4:35 PM, Kenneth Graunke wrote: > Many shaders contain expression trees of the form: > > const_1 * (value * const_2) > > Reorganizing these to > > (const_1 * const_2) * value > > will allow constant folding to combine the constants. Sometimes, these > constants are 2 and 0.5, so we can remove a multiply altogether. Other > times, it can create more immediate constants, which can actually hurt. > > Finding a good balance here is tricky. While much more could be done, > this simple patch seems to have a lot of positive benefit while having > a low downside. > > shader-db results on Broadwell: > > total instructions in shared programs: 8963768 -> 8961369 (-0.03%) > instructions in affected programs: 438318 -> 435919 (-0.55%) > helped: 1502 > HURT: 245 > > total cycles in shared programs: 71527354 -> 71421516 (-0.15%) > cycles in affected programs: 11541788 -> 11435950 (-0.92%) > helped: 3445 > HURT: 1224 > The series is Reviewed-by: Matt Turner The shaders most hurt from this patch are... funny. They do s_texcoord_0 = texcoord + offset * vec4(-1.5,-1.5,-0.5,-1.5); s_texcoord_1 = texcoord + offset * vec4( 0.5,-1.5, 1.5,-1.5); s_texcoord_2 = texcoord + offset * vec4(-1.5,-0.5,-0.5,-0.5); s_texcoord_3 = texcoord + offset * vec4( 0.5,-0.5, 1.5,-0.5); s_texcoord_4 = texcoord + offset * vec4(-1.5, 0.5,-0.5, 0.5); s_texcoord_5 = texcoord + offset * vec4( 0.5, 0.5, 1.5, 0.5); s_texcoord_6 = texcoord + offset * vec4(-1.5, 1.5,-0.5, 1.5); s_texcoord_7 = texcoord + offset * vec4( 0.5, 1.5, 1.5, 1.5); Today, we generate 8 MOV instructions with VF immediates. We could have just loaded 0.5, -0.5, 1.5, and -1.5 with a single VF immediate and then swizzled that as needed, but how would we recognize that all of these can be combined? NIR CSE? Part of the difficulty is that each of the vec4s in the source language include only 3 of the immediate floats -- none contain all four. Anyway, the programs go from 28->38 instructions because when things are multiplied in a different order the constants become 0.125, -0.125, 0.375, and -0.375, and since ±0.125 isn't representable as a VF we generate even more silly instructions! If we could recognize all of these as swizzles of vec4(0.5, -0.5, 1.5, -1.5) at the NIR level we could at cut those hurt programs down by a lot. Everything else just looks like noise to me. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev