We want the size of a float per component, not the size of a whole vec4. NIR instructions on i965: total instructions in shared programs: 1261937 -> 1261929 (-0.00%) instructions in affected programs: 114 -> 106 (-7.02%)
Looking at one of these examples (tesseract), it's from vec4 load_consts for a MRT solid fill, which do get CSEed now that we don't memcmp off the end of the const value and into the SSA def. For the 1-component loads that are common in i965, we were only memcmping off into the rest of the usually zero-filled const_value. --- src/glsl/nir/nir_opt_cse.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/glsl/nir/nir_opt_cse.c b/src/glsl/nir/nir_opt_cse.c index 55dc083..e0f2e9a 100644 --- a/src/glsl/nir/nir_opt_cse.c +++ b/src/glsl/nir/nir_opt_cse.c @@ -123,7 +123,7 @@ nir_instrs_equal(nir_instr *instr1, nir_instr *instr2) return false; return memcmp(load1->value.f, load2->value.f, - load1->def.num_components * sizeof load2->value.f) == 0; + load1->def.num_components * sizeof(*load2->value.f)) == 0; } case nir_instr_type_phi: { nir_phi_instr *phi1 = nir_instr_as_phi(instr1); -- 2.1.4 _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev