On Wed, Nov 25, 2015 at 5:36 AM, Juan A. Suarez Romero <jasua...@igalia.com> wrote: > If the shader asm code is something like: > > mov vgrf2767.0:F, [13F, 14F, 15F, 16F] > mov vgrf2768.0:F, [9F, 10F, 11F, 12F] > mov m8:F, [13F, 14F, 15F, 16F] > mov m7:F, [9F, 10F, 11F, 12F] > > And we apply Common Subexpresion Elimination optimization, we get the > following one: > > mov vgrf2771.0:F, [13F, 14F, 15F, 16F] > mov vgrf2767.0:F, vgrf2771.xyzw:F > mov vgrf2772.0:F, [9F, 10F, 11F, 12F] > mov vgrf2768.0:F, vgrf2772.xyzw:F > mov m8:F, vgrf2771.xyzw:F > mov m7:F, vgrf2772.xyzw:F > > The problem is that later we apply Copy Propagation optimization, which > reverts the change to the original one. If we run the optimizations in > a loop, there is always a progress, but we are in neverending loop. > > Usually, when we have a sentence of the form: > > X = exp > > We apply CSE if "exp" is actually an expression. But if it is constant > we do not apply it, as the point of CSE is saving running the same > computation more than once, that doesn't happen when we have a constant. > > So this commit ensures CSE is not applied to MOV immediate (as it > provides no gain, and it is reverted later by copy-propagation > optimization). > > Signed-off-by: Juan A. Suarez Romero <jasua...@igalia.com> > --- > src/mesa/drivers/dri/i965/brw_vec4_cse.cpp | 13 ++++++++++++- > 1 file changed, 12 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp > index 85cbf24..7ed7654 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp > @@ -93,6 +93,17 @@ is_expression(const vec4_instruction *const inst) > } > > static bool > +is_mov_constant(const vec4_instruction *const inst) > +{ > + if (inst->opcode == BRW_OPCODE_MOV && > + inst->src[0].file == BRW_IMMEDIATE_VALUE) { > + return true; > + } else { > + return false; > + } > +} > + > +static bool > operands_match(const vec4_instruction *a, const vec4_instruction *b) > { > const src_reg *xs = a->src; > @@ -142,7 +153,7 @@ vec4_visitor::opt_cse_local(bblock_t *block) > int ip = block->start_ip; > foreach_inst_in_block (vec4_instruction, inst, block) { > /* Skip some cases. */ > - if (is_expression(inst) && !inst->predicate && inst->mlen == 0 && > + if (is_expression(inst) && !inst->predicate && inst->mlen == 0 && > !is_mov_constant(inst) && > ((inst->dst.file != ARF && inst->dst.file != FIXED_GRF) || > inst->dst.is_null())) > { > --
I'm sending a number of comments on PATCH 2/2 which might make this patch unnecessary, but in case it is necessary, I think you can simplify it by adding return inst->src[0].file != BRW_IMMEDIATE_VALUE; immediately after case BRW_OPCODE_MOV: in is_expression(). That's at least what I did when I tried something similar before when adding opt_vector_float(). _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev