Francisco Jerez <curroje...@riseup.net> writes:

> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:
>
>> From: Iago Toral Quiroga <ito...@igalia.com>
>>
>> We were not accounting for reg_suboffset in the check for the start
>> of the region. This meant that would allow copy-propagation even if
>> the dst wrote to sub_regoffset 4 and our source read from
>> sub_regoffset 0, which is not correct. This was observed in fp64 code,
>> since there we use reg_suboffset to select the high 32-bit of a double.
>>
> I don't think this paragraph is accurate, copy instructions with
> non-zero destination subreg offset are currently considered partial
> writes and should never have been added to the ACP hash table in the
> first place.
>
>> Also, fs_reg::regs_read() already takes the stride into account, so we
>> should not multiply its result by the stride again. This was making
>> copy-propagation fail to copy-propagate cases that would otherwise be
>> safe to copy-propagate. Again, this was observed in fp64 code, since
>> there we use stride > 1 often.
>>
>> Incidentally, these fixes open up more possibilities for copy propagation
>> which uncovered new bugs in copy-propagation. The folowing patches address
>> each of these new issues.
>
> Oh man, that sucks...
>
>> ---
>>  .../drivers/dri/i965/brw_fs_copy_propagation.cpp    | 21 
>> +++++++++++++--------
>>  1 file changed, 13 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
>> b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
>> index 5fae10f..23df877 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
>> @@ -329,6 +329,15 @@ can_take_stride(fs_inst *inst, unsigned arg, unsigned 
>> stride,
>>     return true;
>>  }
>>  
>> +static inline bool
>> +region_match(fs_reg src, unsigned regs_read,
>> +             fs_reg dst, unsigned regs_written)
>
Forgot to mention that there's no reason to pass the registers by value
here, please use 'const fs_reg &' instead.

> How about 'region_contained_in(dst, regs_write, src, regs_read)'? (I

Oops, in case it wasn't not clear from my sentence above, I didn't
intend to suggest using different argument names for this function, I
just typoed them -- regs_written sounds fine to me.

> personally wouldn't mind 'region_match' but
> 'write_region_contains_read_region' sounds a bit too long for my taste).
>
>> +{
>> +   return src.reg_offset >= dst.reg_offset &&
>> +          (src.reg_offset + regs_read) <= (dst.reg_offset + regs_written) &&
>> +          src.subreg_offset >= dst.subreg_offset;
>
> This works under the assumption that src.subreg_offset is strictly less
> than the reg_offset unit -- Which *should* be the case unless we've
> messed up that restriction in some place (we have in the past :P).  To
> be on the safe side you could do something like following, if you like:
>
> |   return (src.reg_offset * REG_SIZE + src.subreg_offset >=
> |           dst.reg_offset * REG_SIZE + dst.subreg_offset) &&
> |          src.reg_offset + regs_read <= dst.reg_offset + regs_written;
>
> With the above taken into account:
>
> Reviewed-by: Francisco Jerez <curroje...@riseup.net>
>
>> +}
>> +
>>  bool
>>  fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
>>  {
>> @@ -351,10 +360,8 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, 
>> acp_entry *entry)
>>     /* Bail if inst is reading a range that isn't contained in the range
>>      * that entry is writing.
>>      */
>> -   if (inst->src[arg].reg_offset < entry->dst.reg_offset ||
>> -       (inst->src[arg].reg_offset * 32 + inst->src[arg].subreg_offset +
>> -        inst->regs_read(arg) * inst->src[arg].stride * 32) >
>> -       (entry->dst.reg_offset + entry->regs_written) * 32)
>> +   if (!region_match(inst->src[arg], inst->regs_read(arg),
>> +                     entry->dst, entry->regs_written))
>>        return false;
>>  
>>     /* we can't generally copy-propagate UD negations because we
>> @@ -554,10 +561,8 @@ fs_visitor::try_constant_propagate(fs_inst *inst, 
>> acp_entry *entry)
>>        /* Bail if inst is reading a range that isn't contained in the range
>>         * that entry is writing.
>>         */
>> -      if (inst->src[i].reg_offset < entry->dst.reg_offset ||
>> -          (inst->src[i].reg_offset * 32 + inst->src[i].subreg_offset +
>> -           inst->regs_read(i) * inst->src[i].stride * 32) >
>> -          (entry->dst.reg_offset + entry->regs_written) * 32)
>> +      if (!region_match(inst->src[i], inst->regs_read(i),
>> +                        entry->dst, entry->regs_written))
>>           continue;
>>  
>>        fs_reg val = entry->src;
>> -- 
>> 2.5.0
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Attachment: signature.asc
Description: PGP signature

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to