Re: [Mesa-dev] [PATCH] i965/fs_nir: Put the immediate in src1 for commutative ALU ops
On Fri, May 8, 2015 at 10:04 AM, Jason Ekstrand ja...@jlekstrand.net wrote: Shader-db results for fragment shaders on Broadwell: total instructions in shared programs: 4310987 - 4310663 (-0.01%) instructions in affected programs: 43242 - 42918 (-0.75%) helped:142 HURT: 0 Shader-db results for vertex shaders on Broadwell: total instructions in shared programs: 2889581 - 2844309 (-1.57%) instructions in affected programs: 1418720 - 1373448 (-3.19%) helped:6139 HURT: 0 --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 12 1 file changed, 12 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 555987d..161a262 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -21,6 +21,8 @@ * IN THE SOFTWARE. */ +#include algorithm + #include glsl/ir.h #include glsl/ir_optimization.h #include glsl/nir/glsl_to_nir.h @@ -662,6 +664,16 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr) op[i] = offset(op[i], instr-src[i].swizzle[channel]); } + /* Immediates can only be used as the second source of two-source +* instructions. We have code in opt_algebraic to flip them as needed +* for most instructions. However, it doesn't hurt anything to just do +* the right thing if we can detect it at the NIR level. +*/ + if ((nir_op_infos[instr-op].algebraic_properties NIR_OP_IS_COMMUTATIVE) + nir_src_as_const_value(instr-src[0].src)) { + std::swap(op[0], op[1]); + } + The real problem is that we haven't lifted the restriction about non-commutative integer multiply on Broadwell: brw_fs_copy_propagation.cpp: /* Fit this constant in by commuting the operands. * Exception: we can't do this for 32-bit integer MUL/MACH * because it's asymmetric. */ if ((inst-opcode == BRW_OPCODE_MUL || inst-opcode == BRW_OPCODE_MACH) (inst-src[1].type == BRW_REGISTER_TYPE_D || inst-src[1].type == BRW_REGISTER_TYPE_UD)) break; inst-src[0] = inst-src[1]; inst-src[1] = val; progress = true; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs_nir: Put the immediate in src1 for commutative ALU ops
On Fri, May 8, 2015 at 10:04 AM, Jason Ekstrand ja...@jlekstrand.net wrote: Shader-db results for fragment shaders on Broadwell: total instructions in shared programs: 4310987 - 4310663 (-0.01%) instructions in affected programs: 43242 - 42918 (-0.75%) helped:142 HURT: 0 Shader-db results for vertex shaders on Broadwell: total instructions in shared programs: 2889581 - 2844309 (-1.57%) instructions in affected programs: 1418720 - 1373448 (-3.19%) helped:6139 HURT: 0 --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 12 1 file changed, 12 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 555987d..161a262 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -21,6 +21,8 @@ * IN THE SOFTWARE. */ +#include algorithm + #include glsl/ir.h #include glsl/ir_optimization.h #include glsl/nir/glsl_to_nir.h @@ -662,6 +664,16 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr) op[i] = offset(op[i], instr-src[i].swizzle[channel]); } + /* Immediates can only be used as the second source of two-source +* instructions. We have code in opt_algebraic to flip them as needed +* for most instructions. However, it doesn't hurt anything to just do +* the right thing if we can detect it at the NIR level. +*/ + if ((nir_op_infos[instr-op].algebraic_properties NIR_OP_IS_COMMUTATIVE) + nir_src_as_const_value(instr-src[0].src)) { + std::swap(op[0], op[1]); We don't use STL. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965/fs_nir: Put the immediate in src1 for commutative ALU ops
Shader-db results for fragment shaders on Broadwell: total instructions in shared programs: 4310987 - 4310663 (-0.01%) instructions in affected programs: 43242 - 42918 (-0.75%) helped:142 HURT: 0 Shader-db results for vertex shaders on Broadwell: total instructions in shared programs: 2889581 - 2844309 (-1.57%) instructions in affected programs: 1418720 - 1373448 (-3.19%) helped:6139 HURT: 0 --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 12 1 file changed, 12 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 555987d..161a262 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -21,6 +21,8 @@ * IN THE SOFTWARE. */ +#include algorithm + #include glsl/ir.h #include glsl/ir_optimization.h #include glsl/nir/glsl_to_nir.h @@ -662,6 +664,16 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr) op[i] = offset(op[i], instr-src[i].swizzle[channel]); } + /* Immediates can only be used as the second source of two-source +* instructions. We have code in opt_algebraic to flip them as needed +* for most instructions. However, it doesn't hurt anything to just do +* the right thing if we can detect it at the NIR level. +*/ + if ((nir_op_infos[instr-op].algebraic_properties NIR_OP_IS_COMMUTATIVE) + nir_src_as_const_value(instr-src[0].src)) { + std::swap(op[0], op[1]); + } + switch (instr-op) { case nir_op_i2f: case nir_op_u2f: -- 2.4.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs_nir: Put the immediate in src1 for commutative ALU ops
On Fri, May 8, 2015 at 11:13 AM, Jason Ekstrand ja...@jlekstrand.net wrote: On Fri, May 8, 2015 at 11:05 AM, Kenneth Graunke kenn...@whitecape.org wrote: On Friday, May 08, 2015 10:18:44 AM Matt Turner wrote: On Fri, May 8, 2015 at 10:04 AM, Jason Ekstrand ja...@jlekstrand.net wrote: Shader-db results for fragment shaders on Broadwell: total instructions in shared programs: 4310987 - 4310663 (-0.01%) instructions in affected programs: 43242 - 42918 (-0.75%) helped:142 HURT: 0 Shader-db results for vertex shaders on Broadwell: total instructions in shared programs: 2889581 - 2844309 (-1.57%) instructions in affected programs: 1418720 - 1373448 (-3.19%) helped:6139 HURT: 0 --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 12 1 file changed, 12 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 555987d..161a262 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -21,6 +21,8 @@ * IN THE SOFTWARE. */ +#include algorithm + #include glsl/ir.h #include glsl/ir_optimization.h #include glsl/nir/glsl_to_nir.h @@ -662,6 +664,16 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr) op[i] = offset(op[i], instr-src[i].swizzle[channel]); } + /* Immediates can only be used as the second source of two-source +* instructions. We have code in opt_algebraic to flip them as needed +* for most instructions. However, it doesn't hurt anything to just do +* the right thing if we can detect it at the NIR level. +*/ + if ((nir_op_infos[instr-op].algebraic_properties NIR_OP_IS_COMMUTATIVE) + nir_src_as_const_value(instr-src[0].src)) { + std::swap(op[0], op[1]); + } + The real problem is that we haven't lifted the restriction about non-commutative integer multiply on Broadwell: brw_fs_copy_propagation.cpp: /* Fit this constant in by commuting the operands. * Exception: we can't do this for 32-bit integer MUL/MACH * because it's asymmetric. */ if ((inst-opcode == BRW_OPCODE_MUL || inst-opcode == BRW_OPCODE_MACH) (inst-src[1].type == BRW_REGISTER_TYPE_D || inst-src[1].type == BRW_REGISTER_TYPE_UD)) break; inst-src[0] = inst-src[1]; inst-src[1] = val; progress = true; Yeah. I also wrote a patch to do that, adding (brw-gen 8 || brw-is_cherryview) In that case, shouldn't Cherry View take the same path as hsw when emitting multiplies and look for 15-bit constants? Almost... that path needs to set one of the MUL's source types to W/UW and the stride to 2, like in commit 0c06d019. I'm planning to rip out all of the multiplication code from brw_fs_visitor.cpp/brw_fs_nir.cpp and move it to a common lowering pass, so I'll fix that at the same time. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs_nir: Put the immediate in src1 for commutative ALU ops
On Friday, May 08, 2015 10:18:44 AM Matt Turner wrote: On Fri, May 8, 2015 at 10:04 AM, Jason Ekstrand ja...@jlekstrand.net wrote: Shader-db results for fragment shaders on Broadwell: total instructions in shared programs: 4310987 - 4310663 (-0.01%) instructions in affected programs: 43242 - 42918 (-0.75%) helped:142 HURT: 0 Shader-db results for vertex shaders on Broadwell: total instructions in shared programs: 2889581 - 2844309 (-1.57%) instructions in affected programs: 1418720 - 1373448 (-3.19%) helped:6139 HURT: 0 --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 12 1 file changed, 12 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 555987d..161a262 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -21,6 +21,8 @@ * IN THE SOFTWARE. */ +#include algorithm + #include glsl/ir.h #include glsl/ir_optimization.h #include glsl/nir/glsl_to_nir.h @@ -662,6 +664,16 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr) op[i] = offset(op[i], instr-src[i].swizzle[channel]); } + /* Immediates can only be used as the second source of two-source +* instructions. We have code in opt_algebraic to flip them as needed +* for most instructions. However, it doesn't hurt anything to just do +* the right thing if we can detect it at the NIR level. +*/ + if ((nir_op_infos[instr-op].algebraic_properties NIR_OP_IS_COMMUTATIVE) + nir_src_as_const_value(instr-src[0].src)) { + std::swap(op[0], op[1]); + } + The real problem is that we haven't lifted the restriction about non-commutative integer multiply on Broadwell: brw_fs_copy_propagation.cpp: /* Fit this constant in by commuting the operands. * Exception: we can't do this for 32-bit integer MUL/MACH * because it's asymmetric. */ if ((inst-opcode == BRW_OPCODE_MUL || inst-opcode == BRW_OPCODE_MACH) (inst-src[1].type == BRW_REGISTER_TYPE_D || inst-src[1].type == BRW_REGISTER_TYPE_UD)) break; inst-src[0] = inst-src[1]; inst-src[1] = val; progress = true; Yeah. I also wrote a patch to do that, adding (brw-gen 8 || brw-is_cherryview) which also solves the problem. But it won't help on Cherryview, which I believe still has the asymmetrical multiply, while switching to shifts will. I suppose another alternative to NIR late optimizations is to have brw_fs_nir's handling of imul check for power of two sizes and emit a SHL. That would be easy. I do think we really need to make logical IMUL opcodes and lower them to MUL/MACH as needed later, so we don't need to worry about breaking MACH in cases like this. signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs_nir: Put the immediate in src1 for commutative ALU ops
On Fri, May 8, 2015 at 11:05 AM, Kenneth Graunke kenn...@whitecape.org wrote: On Friday, May 08, 2015 10:18:44 AM Matt Turner wrote: On Fri, May 8, 2015 at 10:04 AM, Jason Ekstrand ja...@jlekstrand.net wrote: Shader-db results for fragment shaders on Broadwell: total instructions in shared programs: 4310987 - 4310663 (-0.01%) instructions in affected programs: 43242 - 42918 (-0.75%) helped:142 HURT: 0 Shader-db results for vertex shaders on Broadwell: total instructions in shared programs: 2889581 - 2844309 (-1.57%) instructions in affected programs: 1418720 - 1373448 (-3.19%) helped:6139 HURT: 0 --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 12 1 file changed, 12 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 555987d..161a262 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -21,6 +21,8 @@ * IN THE SOFTWARE. */ +#include algorithm + #include glsl/ir.h #include glsl/ir_optimization.h #include glsl/nir/glsl_to_nir.h @@ -662,6 +664,16 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr) op[i] = offset(op[i], instr-src[i].swizzle[channel]); } + /* Immediates can only be used as the second source of two-source +* instructions. We have code in opt_algebraic to flip them as needed +* for most instructions. However, it doesn't hurt anything to just do +* the right thing if we can detect it at the NIR level. +*/ + if ((nir_op_infos[instr-op].algebraic_properties NIR_OP_IS_COMMUTATIVE) + nir_src_as_const_value(instr-src[0].src)) { + std::swap(op[0], op[1]); + } + The real problem is that we haven't lifted the restriction about non-commutative integer multiply on Broadwell: brw_fs_copy_propagation.cpp: /* Fit this constant in by commuting the operands. * Exception: we can't do this for 32-bit integer MUL/MACH * because it's asymmetric. */ if ((inst-opcode == BRW_OPCODE_MUL || inst-opcode == BRW_OPCODE_MACH) (inst-src[1].type == BRW_REGISTER_TYPE_D || inst-src[1].type == BRW_REGISTER_TYPE_UD)) break; inst-src[0] = inst-src[1]; inst-src[1] = val; progress = true; Yeah. I also wrote a patch to do that, adding (brw-gen 8 || brw-is_cherryview) In that case, shouldn't Cherry View take the same path as hsw when emitting multiplies and look for 15-bit constants? which also solves the problem. But it won't help on Cherryview, which I believe still has the asymmetrical multiply, while switching to shifts will. I suppose another alternative to NIR late optimizations is to have brw_fs_nir's handling of imul check for power of two sizes and emit a SHL. That would be easy. I really don't think SHL is the issue here. It's that we're being stupid about multiplies. SHL is a nice hack but unless it is actually faster, I think it's hacking around the problem. I do think we really need to make logical IMUL opcodes and lower them to MUL/MACH as needed later, so we don't need to worry about breaking MACH in cases like this. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs_nir: Put the immediate in src1 for commutative ALU ops
On Fri, May 8, 2015 at 11:05 AM, Kenneth Graunke kenn...@whitecape.org wrote: On Friday, May 08, 2015 10:18:44 AM Matt Turner wrote: On Fri, May 8, 2015 at 10:04 AM, Jason Ekstrand ja...@jlekstrand.net wrote: Shader-db results for fragment shaders on Broadwell: total instructions in shared programs: 4310987 - 4310663 (-0.01%) instructions in affected programs: 43242 - 42918 (-0.75%) helped:142 HURT: 0 Shader-db results for vertex shaders on Broadwell: total instructions in shared programs: 2889581 - 2844309 (-1.57%) instructions in affected programs: 1418720 - 1373448 (-3.19%) helped:6139 HURT: 0 --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 12 1 file changed, 12 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 555987d..161a262 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -21,6 +21,8 @@ * IN THE SOFTWARE. */ +#include algorithm + #include glsl/ir.h #include glsl/ir_optimization.h #include glsl/nir/glsl_to_nir.h @@ -662,6 +664,16 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr) op[i] = offset(op[i], instr-src[i].swizzle[channel]); } + /* Immediates can only be used as the second source of two-source +* instructions. We have code in opt_algebraic to flip them as needed +* for most instructions. However, it doesn't hurt anything to just do +* the right thing if we can detect it at the NIR level. +*/ + if ((nir_op_infos[instr-op].algebraic_properties NIR_OP_IS_COMMUTATIVE) + nir_src_as_const_value(instr-src[0].src)) { + std::swap(op[0], op[1]); + } + The real problem is that we haven't lifted the restriction about non-commutative integer multiply on Broadwell: brw_fs_copy_propagation.cpp: /* Fit this constant in by commuting the operands. * Exception: we can't do this for 32-bit integer MUL/MACH * because it's asymmetric. */ if ((inst-opcode == BRW_OPCODE_MUL || inst-opcode == BRW_OPCODE_MACH) (inst-src[1].type == BRW_REGISTER_TYPE_D || inst-src[1].type == BRW_REGISTER_TYPE_UD)) break; inst-src[0] = inst-src[1]; inst-src[1] = val; progress = true; Yeah. I also wrote a patch to do that, adding (brw-gen 8 || brw-is_cherryview) which also solves the problem. But it won't help on Cherryview, which I believe still has the asymmetrical multiply, while switching to shifts will. I suppose another alternative to NIR late optimizations is to have brw_fs_nir's handling of imul check for power of two sizes and emit a SHL. That would be easy. I do think we really need to make logical IMUL opcodes and lower them to MUL/MACH as needed later, so we don't need to worry about breaking MACH in cases like this. Indeed. I mentioned yesterday I've been planning to do it for a while, so I'll go ahead and take care of it. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs_nir: Put the immediate in src1 for commutative ALU ops
On Fri, May 8, 2015 at 11:15 AM, Matt Turner matts...@gmail.com wrote: On Fri, May 8, 2015 at 11:13 AM, Jason Ekstrand ja...@jlekstrand.net wrote: On Fri, May 8, 2015 at 11:05 AM, Kenneth Graunke kenn...@whitecape.org wrote: On Friday, May 08, 2015 10:18:44 AM Matt Turner wrote: On Fri, May 8, 2015 at 10:04 AM, Jason Ekstrand ja...@jlekstrand.net wrote: Shader-db results for fragment shaders on Broadwell: total instructions in shared programs: 4310987 - 4310663 (-0.01%) instructions in affected programs: 43242 - 42918 (-0.75%) helped:142 HURT: 0 Shader-db results for vertex shaders on Broadwell: total instructions in shared programs: 2889581 - 2844309 (-1.57%) instructions in affected programs: 1418720 - 1373448 (-3.19%) helped:6139 HURT: 0 --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 12 1 file changed, 12 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 555987d..161a262 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -21,6 +21,8 @@ * IN THE SOFTWARE. */ +#include algorithm + #include glsl/ir.h #include glsl/ir_optimization.h #include glsl/nir/glsl_to_nir.h @@ -662,6 +664,16 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr) op[i] = offset(op[i], instr-src[i].swizzle[channel]); } + /* Immediates can only be used as the second source of two-source +* instructions. We have code in opt_algebraic to flip them as needed +* for most instructions. However, it doesn't hurt anything to just do +* the right thing if we can detect it at the NIR level. +*/ + if ((nir_op_infos[instr-op].algebraic_properties NIR_OP_IS_COMMUTATIVE) + nir_src_as_const_value(instr-src[0].src)) { + std::swap(op[0], op[1]); + } + The real problem is that we haven't lifted the restriction about non-commutative integer multiply on Broadwell: brw_fs_copy_propagation.cpp: /* Fit this constant in by commuting the operands. * Exception: we can't do this for 32-bit integer MUL/MACH * because it's asymmetric. */ if ((inst-opcode == BRW_OPCODE_MUL || inst-opcode == BRW_OPCODE_MACH) (inst-src[1].type == BRW_REGISTER_TYPE_D || inst-src[1].type == BRW_REGISTER_TYPE_UD)) break; inst-src[0] = inst-src[1]; inst-src[1] = val; progress = true; Yeah. I also wrote a patch to do that, adding (brw-gen 8 || brw-is_cherryview) In that case, shouldn't Cherry View take the same path as hsw when emitting multiplies and look for 15-bit constants? Almost... that path needs to set one of the MUL's source types to W/UW and the stride to 2, like in commit 0c06d019. I'm planning to rip out all of the multiplication code from brw_fs_visitor.cpp/brw_fs_nir.cpp and move it to a common lowering pass, so I'll fix that at the same time. Awesome! Thanks for working on that! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev