From: Timothy Arceri <timothy.arc...@collabora.com>

V2: leave copy propagation to avoid interpolation regressions

IVB is running into some spilling issues with the loop removed, so we
leave it there for gen7 and below for now.

Run time for shader-db on my machine goes from ~795 seconds to ~665
seconds.

shader-db results BDW:

total instructions in shared programs: 12969459 -> 12968891 (-0.00%)
instructions in affected programs: 1463154 -> 1462586 (-0.04%)
helped: 3622
HURT: 3326

total cycles in shared programs: 246453572 -> 246504318 (0.02%)
cycles in affected programs: 208842622 -> 208893368 (0.02%)
helped: 24029
HURT: 35407

total loops in shared programs: 2931 -> 2931 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 14560 -> 14498 (-0.43%)
spills in affected programs: 2270 -> 2208 (-2.73%)
helped: 17
HURT: 2

total fills in shared programs: 19671 -> 19632 (-0.20%)
fills in affected programs: 2060 -> 2021 (-1.89%)
helped: 17
HURT: 2

LOST: 17
GAINED: 40

Most of the hurt shaders are hurt by 1-2 instructions, with what looks
like a max of 7. I've looked at the worst cycles regressions and, as far
as I can tell, it's just a scheduling difference.
---
 src/mesa/drivers/dri/i965/brw_link.cpp | 27 ++++++++++++---------------
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
index 7c10a40..bb7ab4f 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -81,21 +81,20 @@ brw_lower_packing_builtins(struct brw_context *brw,
    lower_packing_builtins(ir, LOWER_PACK_HALF_2x16
                               | LOWER_UNPACK_HALF_2x16);
 }
 
 static void
 process_glsl_ir(struct brw_context *brw,
                 struct gl_shader_program *shader_prog,
                 struct gl_linked_shader *shader)
 {
    struct gl_context *ctx = &brw->ctx;
-   const struct brw_compiler *compiler = brw->screen->compiler;
    const struct gl_shader_compiler_options *options =
       &ctx->Const.ShaderCompilerOptions[shader->Stage];
 
    /* Temporary memory context for any new IR. */
    void *mem_ctx = ralloc_context(NULL);
 
    ralloc_adopt(mem_ctx, shader->ir);
 
    lower_blend_equation_advanced(shader);
 
@@ -125,34 +124,32 @@ process_glsl_ir(struct brw_context *brw,
    if (brw->gen < 6)
       lower_if_to_cond_assign(shader->Stage, shader->ir, 16);
 
    do_lower_texture_projection(shader->ir);
    do_vec_index_to_cond_assign(shader->ir);
    lower_vector_insert(shader->ir, true);
    lower_offset_arrays(shader->ir);
    lower_noise(shader->ir);
    lower_quadop_vector(shader->ir, false);
 
-   bool progress;
-   do {
-      progress = false;
-
-      if (compiler->scalar_stage[shader->Stage]) {
-         if (shader->Stage == MESA_SHADER_VERTEX ||
-             shader->Stage == MESA_SHADER_FRAGMENT)
-            brw_do_channel_expressions(shader->ir);
-         brw_do_vector_splitting(shader->ir);
-      }
-
-      progress = do_common_optimization(shader->ir, true, true,
-                                        options, ctx->Const.NativeIntegers) || progress;
-   } while (progress);
+   /* TODO: IVB is failing to link with the GLSL IR opts removed for the
+    * piglit test:
+    *    piglit.spec.arb_gpu_shader_fp64.varying-packing.simple
+    *
+    * With the error "VS compile failed: no register to spill". If we can fix
+    * this we should be able to remove this optimisation loop.
+    */
+   if (brw->gen <= 7 && !brw->is_haswell) {
+      while (do_common_optimization(shader->ir, true, true, options,
+                                    ctx->Const.NativeIntegers))
+         ;
+   }
 
    validate_ir_tree(shader->ir);
 
    /* Now that we've finished altering the linked IR, reparent any live IR back
     * to the permanent memory context, and free the temporary one (discarding any
     * junk we optimized away).
     */
    reparent_ir(shader->ir, shader->ir);
    ralloc_free(mem_ctx);
 
-- 
2.9.3

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
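
For readers following along outside the Mesa tree, here is a rough standalone sketch of the two ideas in the new hunk: the hardware gate (gen <= 7 and not Haswell) and running an optimization pass until it stops reporting progress. Everything in it is a hypothetical stand-in, not Mesa code: HwInfo only mimics the gen and is_haswell fields of brw_context, and the callable passed to run_to_fixed_point() stands in for do_common_optimization(). Note that the gate as written excludes Haswell, so the loop is kept on IVB and older but not on every gen7 part.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the two brw_context fields the patch tests.
struct HwInfo {
    int gen;
    bool is_haswell;
};

// Mirrors the condition in the patch: the GLSL IR optimization loop is
// kept only for gen <= 7 parts that are not Haswell.
inline bool needs_glsl_ir_opt_loop(const HwInfo &hw)
{
    return hw.gen <= 7 && !hw.is_haswell;
}

// Calls `pass` repeatedly until it reports no further progress, which is
// what `while (do_common_optimization(...)) ;` does in the patch.
// Returns the number of times `pass` was invoked (an illustrative extra).
inline int run_to_fixed_point(const std::function<bool()> &pass)
{
    assert(pass);  // an empty std::function would throw when called
    int calls = 0;
    bool progress;
    do {
        progress = pass();
        ++calls;
    } while (progress);
    return calls;
}
```

A caller would check needs_glsl_ir_opt_loop() first and only then hand the pass to run_to_fixed_point(), so newer hardware skips the loop entirely.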