On Tue, May 3, 2016 at 4:54 PM, Francisco Jerez <curroje...@riseup.net> wrote:
> Jason Ekstrand <ja...@jlekstrand.net> writes: > > > On Tue, May 3, 2016 at 4:24 PM, Jason Ekstrand <ja...@jlekstrand.net> > wrote: > > > >> > >> > >> On Tue, May 3, 2016 at 4:18 PM, Kenneth Graunke <kenn...@whitecape.org> > >> wrote: > >> > >>> On Tuesday, May 3, 2016 3:00:26 PM PDT Jason Ekstrand wrote: > >>> > There are a few different fixups that we have to do for texture > >>> > destinations that re-arrange channels, fix hardware vs. API > mismatches, > >>> or > >>> > just shrink the result to fit in the NIR destination. These were all > >>> being > >>> > done in a somewhat haphazard manner. This commit replaces all of the > >>> > shuffling with a single LOAD_PAYLOAD operation at the end and makes > it > >>> much > >>> > easier to insert fixups between the texture instruction itself and > the > >>> > LOAD_PAYLOAD. > >>> > > >>> > Shader-db results on Haswell: > >>> > > >>> > total instructions in shared programs: 6227035 -> 6226669 (-0.01%) > >>> > instructions in affected programs: 19119 -> 18753 (-1.91%) > >>> > helped: 85 > >>> > HURT: 0 > >>> > > >>> > total cycles in shared programs: 56491626 -> 56476126 (-0.03%) > >>> > cycles in affected programs: 672420 -> 656920 (-2.31%) > >>> > helped: 92 > >>> > HURT: 42 > >>> > --- > >>> > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 32 ++++++++++ > >>> +--------------------- > >>> > 1 file changed, 11 insertions(+), 21 deletions(-) > >>> > > >>> > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > >>> b/src/mesa/drivers/ > >>> dri/i965/brw_fs_nir.cpp > >>> > index ebc54ad..e0a76b4 100644 > >>> > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > >>> > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > >>> > @@ -3311,43 +3311,33 @@ fs_visitor::nir_emit_texture(const fs_builder > >>> &bld, > >>> nir_tex_instr *instr) > >>> > emit_gen6_gather_wa(key_tex->gen6_gather_wa[texture], dst); > >>> > } > >>> > > >>> > - if (instr->op == nir_texop_query_levels) { > >>> > - /* # levels is in .w */ > >>> > - dst = offset(dst, bld, 3); > >>> > - } > >>> > + fs_reg *nir_dest = ralloc_array(mem_ctx, fs_reg, dest_size); > >>> > >>> Why not just stack allocate this using the pessimal size? > >>> > >>> fs_reg nir_dest[4]; > >>> > >> > >> Good point! Fixed locally. > >> > > > > Never mind. It's because bld.LOAD_PAYLOAD doesn't make a copy of the > > array, it just copies the pointer into the fs_inst. > > > fs_builder::LOAD_PAYLOAD just passes the pointer to the fs_inst > constructor which does the copy itself when you use the variable-source > variant -- I fixed that a while ago precisely in order to be able to > allocate source arrays in the stack. ;) > Yup. I made the change locally (after double and tripple-checking what curro just said) and verified with piglit. --Jason > > > >> > + for (unsigned i = 0; i < dest_size; i++) > >>> > + nir_dest[i] = offset(dst, bld, i); > >>> > > >>> > bool is_cube_array = instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE > && > >>> > instr->is_array; > >>> > > >>> > - /* fixup #layers for cube map arrays */ > >>> > - if (instr->op == nir_texop_txs && (devinfo->gen < 7 || > >>> is_cube_array)) { > >>> > + if (instr->op == nir_texop_query_levels) { > >>> > + /* # levels is in .w */ > >>> > + nir_dest[0] = offset(dst, bld, 3); > >>> > + } else if (instr->op == nir_texop_txs && dest_size >= 3 && > >>> > + (devinfo->gen < 7 || is_cube_array)) { > >>> > fs_reg depth = offset(dst, bld, 2); > >>> > fs_reg fixed_depth = vgrf(glsl_type::int_type); > >>> > > >>> > if (is_cube_array) { > >>> > + /* fixup #layers for cube map arrays */ > >>> > bld.emit(SHADER_OPCODE_INT_QUOTIENT, fixed_depth, depth, > >>> brw_imm_d(6)); > >>> > } else if (devinfo->gen < 7) { > >>> > /* Gen4-6 return 0 instead of 1 for single layer surfaces. > */ > >>> > bld.emit_minmax(fixed_depth, depth, brw_imm_d(1), > >>> BRW_CONDITIONAL_GE); > >>> > } > >>> > > >>> > - fs_reg *fixed_payload = ralloc_array(mem_ctx, fs_reg, inst- > >>> >regs_written); > >>> > - int components = inst->regs_written / (inst->exec_size / 8); > >>> > - for (int i = 0; i < components; i++) { > >>> > - if (i == 2) { > >>> > - fixed_payload[i] = fixed_depth; > >>> > - } else { > >>> > - fixed_payload[i] = offset(dst, bld, i); > >>> > - } > >>> > - } > >>> > - bld.LOAD_PAYLOAD(dst, fixed_payload, components, 0); > >>> > + nir_dest[2] = fixed_depth; > >>> > } > >>> > > >>> > - fs_reg nir_dest = get_nir_dest(instr->dest); > >>> > - nir_dest.type = dst.type; > >>> > - emit_percomp(bld, fs_inst(BRW_OPCODE_MOV, bld.dispatch_width(), > >>> > - nir_dest, dst), > >>> > - (1 << dest_size) - 1); > >>> > + bld.LOAD_PAYLOAD(get_nir_dest(instr->dest), nir_dest, dest_size, > 0); > >>> > } > >>> > > >>> > void > >>> > > >>> > >>> > >> > > _______________________________________________ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev