On Wed, Apr 23, 2014 at 11:19 PM, Kenneth Graunke <kenn...@whitecape.org> wrote: > On 04/18/2014 11:56 AM, Matt Turner wrote: >> --- >> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 135 >> +++++++++++++++------------ >> 1 file changed, 73 insertions(+), 62 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >> index 2aa3acd..91bbe0a 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >> @@ -1257,8 +1257,11 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg >> dst, fs_reg coordinate, >> int reg_width = dispatch_width / 8; >> bool header_present = false; >> >> - fs_reg payload = fs_reg(this, glsl_type::float_type); >> - fs_reg next = payload; >> + fs_reg *sources = ralloc_array(mem_ctx, fs_reg, >> MAX_SAMPLER_MESSAGE_SIZE); >> + for (int i = 0; i < MAX_SAMPLER_MESSAGE_SIZE; i++) { >> + sources[i] = fs_reg(this, glsl_type::float_type); >> + } >> + int length = 0; >> >> if (ir->op == ir_tg4 || (ir->offset && ir->op != ir_txf) || sampler >= >> 16) { >> /* For general texture offsets (no txf workaround), we need a header >> to >> @@ -1272,12 +1275,13 @@ fs_visitor::emit_texture_gen7(ir_texture *ir, fs_reg >> dst, fs_reg coordinate, >> * need to offset the Sampler State Pointer in the header. >> */ >> header_present = true; >> - next.reg_offset++; >> + sources[length] = reg_undef; >> + length++; >> } > > So...if you don't take the header_present = true path...then it looks > like the next thing (say, shadow_c) will land in sources[0], rather than > leaving sources[0] as BAD_FILE and putting the first part of the payload > in sources[1]. > > Is sources[0] special and reserved for the header or isn't it? > >> >> if (ir->shadow_comparitor) { >> - emit(MOV(next, shadow_c)); >> - next.reg_offset++; >> + emit(MOV(sources[length], shadow_c)); > > I'm confused by the fact that these MOVs are still here. I would have > expected: > > sources[length] = sample_c; > > When we visit expression trees, we generate results into various > registers. The point of these MOVs is to put them into a large-VGRF we > can SEND from, at the right ref_offset. > > Now, you have a new set of MAX_SAMPLER_MESSAGE_SIZE registers, and copy > from the original registers to these new temporaries, then LOAD_PAYLOAD > does a second copy from those into the new ones. It seems like we could > just use the original registers and do a single copy. > > I'm probably missing something here, but I can't think of what.
Not emitting the MOVs for the cases where we don't change types generates more instructions in shader-db, and we lose 8 SIMD16 programs. _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev