Francisco Jerez <curroje...@riseup.net> writes: > Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes: > >> From: Iago Toral Quiroga <ito...@igalia.com> >> >> We were not accounting for reg_suboffset in the check for the start >> of the region. This meant that would allow copy-propagation even if >> the dst wrote to sub_regoffset 4 and our source read from >> sub_regoffset 0, which is not correct. This was observed in fp64 code, >> since there we use reg_suboffset to select the high 32-bit of a double. >> > I don't think this paragraph is accurate, copy instructions with > non-zero destination subreg offset are currently considered partial > writes and should never have been added to the ACP hash table in the > first place. > >> Also, fs_reg::regs_read() already takes the stride into account, so we >> should not multiply its result by the stride again. This was making >> copy-propagation fail to copy-propagate cases that would otherwise be >> safe to copy-propagate. Again, this was observed in fp64 code, since >> there we use stride > 1 often. >> >> Incidentally, these fixes open up more possibilities for copy propagation >> which uncovered new bugs in copy-propagation. The folowing patches address >> each of these new issues. > > Oh man, that sucks... > >> --- >> .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 21 >> +++++++++++++-------- >> 1 file changed, 13 insertions(+), 8 deletions(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp >> b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp >> index 5fae10f..23df877 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp >> @@ -329,6 +329,15 @@ can_take_stride(fs_inst *inst, unsigned arg, unsigned >> stride, >> return true; >> } >> >> +static inline bool >> +region_match(fs_reg src, unsigned regs_read, >> + fs_reg dst, unsigned regs_written) > Forgot to mention that there's no reason to pass the registers by value here, please use 'const fs_reg &' instead.
> How about 'region_contained_in(dst, regs_write, src, regs_read)'? (I Oops, in case it wasn't not clear from my sentence above, I didn't intend to suggest using different argument names for this function, I just typoed them -- regs_written sounds fine to me. > personally wouldn't mind 'region_match' but > 'write_region_contains_read_region' sounds a bit too long for my taste). > >> +{ >> + return src.reg_offset >= dst.reg_offset && >> + (src.reg_offset + regs_read) <= (dst.reg_offset + regs_written) && >> + src.subreg_offset >= dst.subreg_offset; > > This works under the assumption that src.subreg_offset is strictly less > than the reg_offset unit -- Which *should* be the case unless we've > messed up that restriction in some place (we have in the past :P). To > be on the safe side you could do something like following, if you like: > > | return (src.reg_offset * REG_SIZE + src.subreg_offset >= > | dst.reg_offset * REG_SIZE + dst.subreg_offset) && > | src.reg_offset + regs_read <= dst.reg_offset + regs_written; > > With the above taken into account: > > Reviewed-by: Francisco Jerez <curroje...@riseup.net> > >> +} >> + >> bool >> fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry) >> { >> @@ -351,10 +360,8 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, >> acp_entry *entry) >> /* Bail if inst is reading a range that isn't contained in the range >> * that entry is writing. >> */ >> - if (inst->src[arg].reg_offset < entry->dst.reg_offset || >> - (inst->src[arg].reg_offset * 32 + inst->src[arg].subreg_offset + >> - inst->regs_read(arg) * inst->src[arg].stride * 32) > >> - (entry->dst.reg_offset + entry->regs_written) * 32) >> + if (!region_match(inst->src[arg], inst->regs_read(arg), >> + entry->dst, entry->regs_written)) >> return false; >> >> /* we can't generally copy-propagate UD negations because we >> @@ -554,10 +561,8 @@ fs_visitor::try_constant_propagate(fs_inst *inst, >> acp_entry *entry) >> /* Bail if inst is reading a range that isn't contained in the range >> * that entry is writing. >> */ >> - if (inst->src[i].reg_offset < entry->dst.reg_offset || >> - (inst->src[i].reg_offset * 32 + inst->src[i].subreg_offset + >> - inst->regs_read(i) * inst->src[i].stride * 32) > >> - (entry->dst.reg_offset + entry->regs_written) * 32) >> + if (!region_match(inst->src[i], inst->regs_read(i), >> + entry->dst, entry->regs_written)) >> continue; >> >> fs_reg val = entry->src; >> -- >> 2.5.0 >> >> _______________________________________________ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
signature.asc
Description: PGP signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev