On Thu, 2016-09-08 at 18:32 -0700, Francisco Jerez wrote: > Iago Toral <ito...@igalia.com> writes: > > > > > On Wed, 2016-09-07 at 18:48 -0700, Francisco Jerez wrote: > > > > > > This is in preparation for dropping vec4_instruction::regs_read > > > and > > > ::regs_written in favor of more accurate alternatives expressed > > > in > > > byte units. The main reason these wrappers are useful is that a > > > number of optimization passes implement dataflow analysis with > > > register granularity, so these helpers will come in handy once > > > we've > > > switched register offsets and sizes to the byte > > > representation. The > > > wrapper functions will also make sure that GRF misalignment > > > (currently > > > neglected by most of the back-end) is taken into account > > > correctly in > > > the calculation of regs_read and regs_written. > > > > This does not seem to replace all uses of regs_written and inst- > > > > > > regs_read() with these helpers. I am not sure if this was by > > > design or > > by mistake but the consequence is that later patches still do a lot > > of > > things like: > > > > - scan_inst->dst.offset / REG_SIZE + scan_inst->regs_written > > > + scan_inst->dst.offset / REG_SIZE + DIV_ROUND_UP(scan_inst- > > > > > > size_written, REG_SIZE) > > (this hunk is from the next patch in fs_visitor::compute_to_mrf(), > > but > > there are plenty more like this in that same patch) > > > > which would have not been necessary if we just used the > > regs_written() > > helper here. > > > The reason for the apparent inconsistency you've noticed here is that > regs_written(inst) and DIV_ROUND_UP(inst.size_written, REG_SIZE), > even > though they look like synonyms at this point of the series, are > intended > to do different things (they don't yet, but they will once several > fixes > are applied to regs_written() after PATCH 16). From the doxygen > comment > of regs_written(): > > > > > Return the number of dataflow registers written by the instruction > > (either fully or partially) counted from 'floor(reg_offset(inst- > > >dst) > > / register_size)'. The somewhat arbitrary register size unit is > > 16B > > for the UNIFORM and IMM files and 32B for all other files. > IOW, regs_written() is expected to behave as if it partitioned the > register file of inst->dst into 32B chunks starting from > reg_offset(r) > == 0, and returned how many of those chunks overlap the destination > region of the instruction, which is not necessarily equivalent to the > amount of data written by the instruction in register units (if e.g. > the > instruction writes exactly REG_SIZE bytes but the destination region > starts mid-GRF, regs_written(inst) would be expected to return two, > but > DIV_ROUND_UP(inst.size_written, REG_SIZE) would return one). > > The same goes for regs_read() vs DIV_ROUND_UP(size_read(), REG_SIZE).
Ah right, that makes sense. > That said, you could argue that in the example you pasted above > regs_written() would have been the more correct thing to do of the > two. > That's definitely the case, but I didn't bother to change it because > I > removed the whole condition anyway during the clean-up part of this > series, since it was just a rather hairy open-coded version of > region_contained_in(). I see, that's fine by me. Thanks for the explanation! > > > > > > > > --- > > > src/mesa/drivers/dri/i965/brw_ir_vec4.h | 26 > > > ++++++++++++++++++++++ > > > .../drivers/dri/i965/brw_schedule_instructions.cpp | 8 +++---- > > > src/mesa/drivers/dri/i965/brw_vec4.cpp | 4 ++-- > > > src/mesa/drivers/dri/i965/brw_vec4_cse.cpp | 6 ++--- > > > .../dri/i965/brw_vec4_dead_code_eliminate.cpp | 6 ++--- > > > .../drivers/dri/i965/brw_vec4_live_variables.cpp | 8 +++---- > > > 6 files changed, 42 insertions(+), 16 deletions(-) > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h > > > b/src/mesa/drivers/dri/i965/brw_ir_vec4.h > > > index 4f49428..a1a201b 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h > > > +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h > > > @@ -254,6 +254,32 @@ set_saturate(bool saturate, vec4_instruction > > > *inst) > > > return inst; > > > } > > > > > > +/** > > > + * Return the number of dataflow registers written by the > > > instruction (either > > > + * fully or partially) counted from 'floor(reg_offset(inst->dst) > > > / > > > + * register_size)'. The somewhat arbitrary register size unit > > > is > > > 16B for the > > > + * UNIFORM and IMM files and 32B for all other files. > > > + */ > > > +inline unsigned > > > +regs_written(const vec4_instruction *inst) > > > +{ > > > + /* XXX - Take into account register-misaligned offsets > > > correctly. > > > */ > > > + return inst->regs_written; > > > +} > > > + > > > +/** > > > + * Return the number of dataflow registers read by the > > > instruction > > > (either > > > + * fully or partially) counted from 'floor(reg_offset(inst- > > > >src[i]) > > > / > > > + * register_size)'. The somewhat arbitrary register size unit > > > is > > > 16B for the > > > + * UNIFORM and IMM files and 32B for all other files. > > > + */ > > > +inline unsigned > > > +regs_read(const vec4_instruction *inst, unsigned i) > > > +{ > > > + /* XXX - Take into account register-misaligned offsets > > > correctly. > > > */ > > > + return inst->regs_read(i); > > > +} > > > + > > > } /* namespace brw */ > > > > > > #endif > > > diff --git > > > a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp > > > b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp > > > index 0d3a07c..c12bf09 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp > > > +++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp > > > @@ -1269,7 +1269,7 @@ > > > vec4_instruction_scheduler::calculate_deps() > > > /* read-after-write deps. */ > > > for (int i = 0; i < 3; i++) { > > > if (inst->src[i].file == VGRF) { > > > - for (unsigned j = 0; j < inst->regs_read(i); ++j) > > > + for (unsigned j = 0; j < regs_read(inst, i); ++j) > > > add_dep(last_grf_write[inst->src[i].nr + j], n); > > > } else if (inst->src[i].file == FIXED_GRF) { > > > add_dep(last_fixed_grf_write, n); > > > @@ -1303,7 +1303,7 @@ > > > vec4_instruction_scheduler::calculate_deps() > > > > > > /* write-after-write deps. */ > > > if (inst->dst.file == VGRF) { > > > - for (unsigned j = 0; j < inst->regs_written; ++j) { > > > + for (unsigned j = 0; j < regs_written(inst); ++j) { > > > add_dep(last_grf_write[inst->dst.nr + j], n); > > > last_grf_write[inst->dst.nr + j] = n; > > > } > > > @@ -1351,7 +1351,7 @@ > > > vec4_instruction_scheduler::calculate_deps() > > > /* write-after-read deps. */ > > > for (int i = 0; i < 3; i++) { > > > if (inst->src[i].file == VGRF) { > > > - for (unsigned j = 0; j < inst->regs_read(i); ++j) > > > + for (unsigned j = 0; j < regs_read(inst, i); ++j) > > > add_dep(n, last_grf_write[inst->src[i].nr + j]); > > > } else if (inst->src[i].file == FIXED_GRF) { > > > add_dep(n, last_fixed_grf_write); > > > @@ -1384,7 +1384,7 @@ > > > vec4_instruction_scheduler::calculate_deps() > > > * can mark this as WAR dependency. > > > */ > > > if (inst->dst.file == VGRF) { > > > - for (unsigned j = 0; j < inst->regs_written; ++j) > > > + for (unsigned j = 0; j < regs_written(inst); ++j) > > > last_grf_write[inst->dst.nr + j] = n; > > > } else if (inst->dst.file == MRF) { > > > last_mrf_write[inst->dst.nr] = n; > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > > > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > > > index dd058db..a521867 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > > > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > > > @@ -1335,11 +1335,11 @@ vec4_visitor::split_virtual_grfs() > > > * to split. > > > */ > > > foreach_block_and_inst(block, vec4_instruction, inst, cfg) { > > > - if (inst->dst.file == VGRF && inst->regs_written > 1) > > > + if (inst->dst.file == VGRF && regs_written(inst) > 1) > > > split_grf[inst->dst.nr] = false; > > > > > > for (int i = 0; i < 3; i++) { > > > - if (inst->src[i].file == VGRF && inst->regs_read(i) > > > > 1) > > > + if (inst->src[i].file == VGRF && regs_read(inst, i) > > > > 1) > > > split_grf[inst->src[i].nr] = false; > > > } > > > } > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp > > > b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp > > > index 10898a5..f0908b9 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp > > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_cse.cpp > > > @@ -178,10 +178,10 @@ vec4_visitor::opt_cse_local(bblock_t > > > *block) > > > bool no_existing_temp = entry->tmp.file == BAD_FILE; > > > if (no_existing_temp && !entry->generator- > > > > > > > > dst.is_null()) { > > > entry->tmp = retype(src_reg(VGRF, alloc.allocate( > > > - entry->generator- > > > > > > > > regs_written), > > > + regs_written(entry > > > - > > > > > > > > generator)), > > > NULL), inst- > > > >dst.type); > > > > > > - for (unsigned i = 0; i < entry->generator- > > > > > > > > regs_written; ++i) { > > > + for (unsigned i = 0; i < regs_written(entry- > > > > > > > > generator); ++i) { > > > vec4_instruction *copy = MOV(offset(entry- > > > > > > > > generator->dst, i), > > > offset(entry- > > > >tmp, > > > i)); > > > copy->force_writemask_all = > > > @@ -196,7 +196,7 @@ vec4_visitor::opt_cse_local(bblock_t *block) > > > if (!inst->dst.is_null()) { > > > assert(inst->dst.type == entry->tmp.type); > > > > > > - for (unsigned i = 0; i < inst->regs_written; ++i) > > > { > > > + for (unsigned i = 0; i < regs_written(inst); ++i) > > > { > > > vec4_instruction *copy = MOV(offset(inst->dst, > > > i), > > > offset(entry- > > > >tmp, > > > i)); > > > copy->force_writemask_all = inst- > > > > > > > > force_writemask_all; > > > diff --git > > > a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp > > > b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp > > > index c643212..50706a9 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp > > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_dead_code_eliminate.cpp > > > @@ -59,7 +59,7 @@ vec4_visitor::dead_code_eliminate() > > > bool result_live[4] = { false }; > > > > > > if (inst->dst.file == VGRF) { > > > - for (unsigned i = 0; i < inst->regs_written; i++) > > > { > > > + for (unsigned i = 0; i < regs_written(inst); i++) > > > { > > > for (int c = 0; c < 4; c++) > > > result_live[c] |= BITSET_TEST( > > > live, var_from_reg(alloc, offset(inst- > > > >dst, > > > i), c)); > > > @@ -110,7 +110,7 @@ vec4_visitor::dead_code_eliminate() > > > } > > > > > > if (inst->dst.file == VGRF && !inst->predicate) { > > > - for (unsigned i = 0; i < inst->regs_written; i++) { > > > + for (unsigned i = 0; i < regs_written(inst); i++) { > > > for (int c = 0; c < 4; c++) { > > > if (inst->dst.writemask & (1 << c)) { > > > BITSET_CLEAR(live, var_from_reg(alloc, > > > @@ -132,7 +132,7 @@ vec4_visitor::dead_code_eliminate() > > > > > > for (int i = 0; i < 3; i++) { > > > if (inst->src[i].file == VGRF) { > > > - for (unsigned j = 0; j < inst->regs_read(i); j++) > > > { > > > + for (unsigned j = 0; j < regs_read(inst, i); j++) > > > { > > > for (int c = 0; c < 4; c++) { > > > BITSET_SET(live, var_from_reg(alloc, > > > offset(inst- > > > > > > > > src[i], j), c)); > > > diff --git > > > a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp > > > b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp > > > index 57d5fbb..20344ed 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp > > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp > > > @@ -76,7 +76,7 @@ vec4_live_variables::setup_def_use() > > > /* Set use[] for this instruction */ > > > for (unsigned int i = 0; i < 3; i++) { > > > if (inst->src[i].file == VGRF) { > > > - for (unsigned j = 0; j < inst->regs_read(i); j++) > > > { > > > + for (unsigned j = 0; j < regs_read(inst, i); j++) > > > { > > > for (int c = 0; c < 4; c++) { > > > const unsigned v = > > > var_from_reg(alloc, offset(inst->src[i], > > > j), > > > c); > > > @@ -99,7 +99,7 @@ vec4_live_variables::setup_def_use() > > > */ > > > if (inst->dst.file == VGRF && > > > (!inst->predicate || inst->opcode == > > > BRW_OPCODE_SEL)) { > > > - for (unsigned i = 0; i < inst->regs_written; i++) { > > > + for (unsigned i = 0; i < regs_written(inst); i++) { > > > for (int c = 0; c < 4; c++) { > > > if (inst->dst.writemask & (1 << c)) { > > > const unsigned v = > > > @@ -257,7 +257,7 @@ vec4_visitor::calculate_live_intervals() > > > foreach_block_and_inst(block, vec4_instruction, inst, cfg) { > > > for (unsigned int i = 0; i < 3; i++) { > > > if (inst->src[i].file == VGRF) { > > > - for (unsigned j = 0; j < inst->regs_read(i); j++) { > > > + for (unsigned j = 0; j < regs_read(inst, i); j++) { > > > for (int c = 0; c < 4; c++) { > > > const unsigned v = > > > var_from_reg(alloc, offset(inst->src[i], > > > j), > > > c); > > > @@ -269,7 +269,7 @@ vec4_visitor::calculate_live_intervals() > > > } > > > > > > if (inst->dst.file == VGRF) { > > > - for (unsigned i = 0; i < inst->regs_written; i++) { > > > + for (unsigned i = 0; i < regs_written(inst); i++) { > > > for (int c = 0; c < 4; c++) { > > > if (inst->dst.writemask & (1 << c)) { > > > const unsigned v = _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev