On Tue, 2016-05-03 at 17:37 -0700, Jason Ekstrand wrote: > > > On Tue, May 3, 2016 at 5:21 AM, Samuel Iglesias Gonsálvez > <sigles...@igalia.com> wrote: > From: Iago Toral Quiroga <ito...@igalia.com> > > We were not accounting for reg_suboffset in the check for the > start > of the region. This meant that would allow copy-propagation > even if > the dst wrote to sub_regoffset 4 and our source read from > sub_regoffset 0, which is not correct. This was observed in > fp64 code, > since there we use reg_suboffset to select the high 32-bit of > a double. > > > I'm not sure if copy-prop can handle the case where you write to > sub_regoffset != 0 anyway. Maybe we should just bail on that?
Right, we found a few issues and some of the patches in this series address the problems we found. Maybe there are still a few other cases that we missed but my thinking was that we probably wanted to fix the pass for subreg_offset != 0 anyway and if we actually missed something we could just fix it up when it shows up as a bug somewhere. That said, we can of course simplify things and just bail here, we would lose some optimizations but we might not need some of the other patches in the series... > Also, fs_reg::regs_read() already takes the stride into > account, so we > should not multiply its result by the stride again. This was > making > copy-propagation fail to copy-propagate cases that would > otherwise be > safe to copy-propagate. Again, this was observed in fp64 code, > since > there we use stride > 1 often. > > Incidentally, these fixes open up more possibilities for copy > propagation > which uncovered new bugs in copy-propagation. The folowing > patches address > each of these new issues. > --- > .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 21 > +++++++++++++-------- > 1 file changed, 13 insertions(+), 8 deletions(-) > > diff --git > a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > index 5fae10f..23df877 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp > @@ -329,6 +329,15 @@ can_take_stride(fs_inst *inst, unsigned > arg, unsigned stride, > return true; > } > > +static inline bool > +region_match(fs_reg src, unsigned regs_read, > + fs_reg dst, unsigned regs_written) > > > I don't like the name. It should say something about containment with > perhaps a comment to go along with it. Should we assert that they > have the same file and number? /** * Check that the register region read by src [src.reg_offset, * src.reg_offset + regs_read] is contained inside the register * region written by dst [dst.reg_offset, dst.reg_offset + regs_written] * Both src and dst must have the same register number and file. */ static inline bool write_region_contains_read_region(...) And yes, we should definitely assert on the file and register number. > +{ > + return src.reg_offset >= dst.reg_offset && > + (src.reg_offset + regs_read) <= (dst.reg_offset + > regs_written) && > + src.subreg_offset >= dst.subreg_offset; > +} > + > bool > fs_visitor::try_copy_propagate(fs_inst *inst, int arg, > acp_entry *entry) > { > @@ -351,10 +360,8 @@ fs_visitor::try_copy_propagate(fs_inst > *inst, int arg, acp_entry *entry) > /* Bail if inst is reading a range that isn't contained in > the range > * that entry is writing. > */ > - if (inst->src[arg].reg_offset < entry->dst.reg_offset || > - (inst->src[arg].reg_offset * 32 + > inst->src[arg].subreg_offset + > - inst->regs_read(arg) * inst->src[arg].stride * 32) > > - (entry->dst.reg_offset + entry->regs_written) * 32) > + if (!region_match(inst->src[arg], inst->regs_read(arg), > + entry->dst, entry->regs_written)) > return false; > > /* we can't generally copy-propagate UD negations because > we > @@ -554,10 +561,8 @@ > fs_visitor::try_constant_propagate(fs_inst *inst, acp_entry > *entry) > /* Bail if inst is reading a range that isn't contained > in the range > * that entry is writing. > */ > - if (inst->src[i].reg_offset < entry->dst.reg_offset || > - (inst->src[i].reg_offset * 32 + > inst->src[i].subreg_offset + > - inst->regs_read(i) * inst->src[i].stride * 32) > > - (entry->dst.reg_offset + entry->regs_written) * 32) > + if (!region_match(inst->src[i], inst->regs_read(i), > + entry->dst, entry->regs_written)) > continue; > > fs_reg val = entry->src; > -- > 2.5.0 > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > _______________________________________________ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev