On 06/15/2018 02:27 AM, Iago Toral wrote:
> On Thu, 2018-06-14 at 17:43 -0700, Ian Romanick wrote:
>> From: Ian Romanick <ian.d.roman...@intel.com>
>>
>> fs_visitor::set_gs_stream_control_data_bits generates some code like
>> "control_data_bits | stream_id << ((2 * (vertex_count - 1)) % 32)" as
>> part of EmitVertex. The first time this (dynamically) occurs in the
>> shader, control_data_bits is zero. Many times we can determine this
>> statically and various optimizations will collaborate to make one of
>> the OR operands literal zero.
>>
>> Converting the OR to a MOV usually allows it to be copy-propagated
>> away. However, this does not happen in at least some shaders (in the
>> assembly output of
>> shaders/closed/UnrealEngine4/EffectsCaveDemo/301.shader_test,
>> search for shl).
>>
>> All of the affected shaders are geometry shaders.
>>
>> Broadwell and Skylake had similar results. (Skylake shown)
>> total instructions in shared programs: 14375452 -> 14375413 (<.01%)
>> instructions in affected programs: 6422 -> 6383 (-0.61%)
>> helped: 39
>> HURT: 0
>> helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
>> helped stats (rel) min: 0.14% max: 2.56% x̄: 1.91% x̃: 2.56%
>> 95% mean confidence interval for instructions value: -1.00 -1.00
>> 95% mean confidence interval for instructions %-change: -2.26% -1.57%
>> Instructions are helped.
>>
>> total cycles in shared programs: 531981179 -> 531980555 (<.01%)
>> cycles in affected programs: 27493 -> 26869 (-2.27%)
>> helped: 39
>> HURT: 0
>> helped stats (abs) min: 16 max: 16 x̄: 16.00 x̃: 16
>> helped stats (rel) min: 0.60% max: 7.92% x̄: 5.94% x̃: 7.92%
>> 95% mean confidence interval for cycles value: -16.00 -16.00
>> 95% mean confidence interval for cycles %-change: -6.98% -4.90%
>> Cycles are helped.
>>
>> No changes on earlier platforms.
>>
>> Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
>> ---
>>  src/intel/compiler/brw_fs.cpp | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
>> index d67c0a41922..d836b268629 100644
>> --- a/src/intel/compiler/brw_fs.cpp
>> +++ b/src/intel/compiler/brw_fs.cpp
>> @@ -2448,7 +2448,8 @@ fs_visitor::opt_algebraic()
>>           }
>>           break;
>>        case BRW_OPCODE_OR:
>> -         if (inst->src[0].equals(inst->src[1])) {
>> +         if (inst->src[0].equals(inst->src[1]) ||
>> +             inst->src[1].is_zero()) {
>>              inst->opcode = BRW_OPCODE_MOV;
>>              inst->src[1] = reg_undef;
>>              progress = true;
>
> While we are at this, shouldn't we also handle this as a MOV (from
> src[1]) when src[0].is_zero() is true?

As far as I'm aware, immediate values can only appear on src1 for
two-source instructions. A similar algebraic optimization for x+0 in
the vec4 backend only checks src1.

> Iago
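
To make the point about src1 concrete, here is a minimal standalone sketch of the rewrite under discussion. It is not the Mesa code: ToyReg, ToyInst, ToyOpcode and toy_opt_or are hypothetical names invented for illustration, and the sketch simply assumes that a literal zero, if present, sits in src[1].

```cpp
#include <cstdint>

// Hypothetical toy types for illustration; NOT Mesa's fs_reg / fs_inst.
enum class ToyOpcode { OR, MOV };

struct ToyReg {
    enum class File { Undef, Grf, Imm };
    File file = File::Undef;
    uint32_t value = 0;   // register number for Grf, literal for Imm

    static ToyReg grf(uint32_t n) {
        ToyReg r;
        r.file = File::Grf;
        r.value = n;
        return r;
    }
    static ToyReg imm(uint32_t v) {
        ToyReg r;
        r.file = File::Imm;
        r.value = v;
        return r;
    }

    bool equals(const ToyReg &other) const {
        return file == other.file && value == other.value;
    }
    bool is_zero() const { return file == File::Imm && value == 0; }
};

struct ToyInst {
    ToyOpcode opcode;
    ToyReg src[2];
};

// Rewrites "x | x" and "x | 0" into "MOV x". Only src[1] is tested for
// zero, on the assumption that an immediate can only appear there.
// Returns true if the instruction was changed.
bool toy_opt_or(ToyInst &inst) {
    if (inst.opcode != ToyOpcode::OR)
        return false;
    if (inst.src[0].equals(inst.src[1]) || inst.src[1].is_zero()) {
        inst.opcode = ToyOpcode::MOV;
        inst.src[1] = ToyReg{};   // stand-in for reg_undef
        return true;
    }
    return false;
}
```

In this toy model an OR whose src[1] is immediate zero becomes a MOV of src[0], while an OR of two distinct registers is left alone. A src[0].is_zero() check would only matter if an immediate could land in src[0], which is the case the reply above argues does not arise.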