On 07/03/17 22:39, Francisco Jerez wrote: > Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes: > >> On 07/03/17 14:10, Samuel Iglesias Gonsálvez wrote: >>> >>> >>> On 04/03/17 01:13, Francisco Jerez wrote: >>>> Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes: >>>> >>>>> From: "Juan A. Suarez Romero" <jasua...@igalia.com> >>>>> >>>>> Take into account offset values less than a full register (32 bytes) >>>>> when getting the var from register. >>>>> >>>>> This is required when dealing with an operation that writes half of the >>>>> register (like one d2x in IVB/BYT, which uses exec_size == 4). >>>>> >>>>> - v2: take in account this offset < 32 in liveness analysis too (Curro) >>>>> --- >>>>> src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp | 12 ++++++++---- >>>>> src/mesa/drivers/dri/i965/brw_vec4_live_variables.h | 6 ++++-- >>>>> 2 files changed, 12 insertions(+), 6 deletions(-) >>>>> >>>>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp >>>>> b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp >>>>> index 73f658cd8fa..dc1ad21038c 100644 >>>>> --- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp >>>>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.cpp >>>>> @@ -78,7 +78,8 @@ vec4_live_variables::setup_def_use() >>>>> if (inst->src[i].file == VGRF) { >>>>> for (unsigned j = 0; j < DIV_ROUND_UP(inst->size_read(i), >>>>> 16); j++) { >>>>> for (int c = 0; c < 4; c++) { >>>>> - const unsigned v = var_from_reg(alloc, >>>>> inst->src[i], c, j); >>>>> + const unsigned v = >>>>> + var_from_reg(alloc, inst->src[i], c, j); >>>> >>>> Neither this nor the four subsequent hunks seem to be doing anything, >>>> please drop them. >>>> >>> >>> Right >>> >>>>> if (!BITSET_TEST(bd->def, v)) >>>>> BITSET_SET(bd->use, v); >>>>> } >>>>> @@ -101,7 +102,8 @@ vec4_live_variables::setup_def_use() >>>>> for (unsigned i = 0; i < DIV_ROUND_UP(inst->size_written, >>>>> 16); i++) { >>>>> for (int c = 0; c < 4; c++) { >>>>> if (inst->dst.writemask & (1 << c)) { >>>>> - const unsigned v = var_from_reg(alloc, inst->dst, >>>>> c, i); >>>>> + const unsigned v = >>>>> + var_from_reg(alloc, inst->dst, c, i); >>>>> if (!BITSET_TEST(bd->use, v)) >>>>> BITSET_SET(bd->def, v); >>>>> } >>>>> @@ -257,7 +259,8 @@ vec4_visitor::calculate_live_intervals() >>>>> if (inst->src[i].file == VGRF) { >>>>> for (unsigned j = 0; j < DIV_ROUND_UP(inst->size_read(i), >>>>> 16); j++) { >>>>> for (int c = 0; c < 4; c++) { >>>>> - const unsigned v = var_from_reg(alloc, inst->src[i], >>>>> c, j); >>>>> + const unsigned v = >>>>> + var_from_reg(alloc, inst->src[i], c, j); >>>>> start[v] = MIN2(start[v], ip); >>>>> end[v] = ip; >>>>> } >>>>> @@ -269,7 +272,8 @@ vec4_visitor::calculate_live_intervals() >>>>> for (unsigned i = 0; i < DIV_ROUND_UP(inst->size_written, 16); >>>>> i++) { >>>>> for (int c = 0; c < 4; c++) { >>>>> if (inst->dst.writemask & (1 << c)) { >>>>> - const unsigned v = var_from_reg(alloc, inst->dst, c, >>>>> i); >>>>> + const unsigned v = >>>>> + var_from_reg(alloc, inst->dst, c, i); >>>>> start[v] = MIN2(start[v], ip); >>>>> end[v] = ip; >>>>> } >>>>> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h >>>>> b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h >>>>> index 8807c453743..b23df650c11 100644 >>>>> --- a/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h >>>>> +++ b/src/mesa/drivers/dri/i965/brw_vec4_live_variables.h >>>>> @@ -89,7 +89,8 @@ var_from_reg(const simple_allocator &alloc, const >>>>> src_reg ®, >>>>> const unsigned csize = DIV_ROUND_UP(type_sz(reg.type), 4); >>>>> unsigned result = >>>>> 8 * (alloc.offsets[reg.nr] + reg.offset / REG_SIZE) + >>>>> - (BRW_GET_SWZ(reg.swizzle, c) + k / csize * 4) * csize + k % csize; >>>>> + (BRW_GET_SWZ(reg.swizzle, c) + k / csize * 4) * csize + k % csize + >>>>> + (reg.offset % REG_SIZE) / type_sz(reg.type); >>>> >>>> Looks bogus to me, the result is expressed in dwords not in type_sz >>>> units (because the live analysis pass has dword granularity). Instead >>>> of adding new terms to the expression you could just take the >>>> 'reg.offset / REG_SIZE' term out of the first parentheses and replace it >>>> with 'reg.offset / 4'. >>>> >>> >> >> Actually I don't see the point of doing this: 8 / REG_SIZE = 1/4, so no >> matter if we keep it as is or take the 'reg.offset / REG_SIZE' term out >> of the first parentheses and replace it with 'reg.offset / 4'. >> >> At the end is the same, unless we expect REG_SIZE to change or to change >> the multiplier value. >> > > Nope, this is integer arithmetic so (x / b) * a is not necessarily the > same as x * (a / b). If you don't believe me just take some x which is > not a whole multiple of b, which is precisely the case this patch is > fixing. >
Right, I forgot that we are operating with integers here. Sam >> Sam >> >>> OK, thanks. >>> >>> Sam >>> >>>>> /* Do not exceed the limit for this register */ >>>>> assert(result < 8 * (alloc.offsets[reg.nr] + alloc.sizes[reg.nr])); >>>>> return result; >>>>> @@ -103,7 +104,8 @@ var_from_reg(const simple_allocator &alloc, const >>>>> dst_reg ®, >>>>> const unsigned csize = DIV_ROUND_UP(type_sz(reg.type), 4); >>>>> unsigned result = >>>>> 8 * (alloc.offsets[reg.nr] + reg.offset / REG_SIZE) + >>>>> - (c + k / csize * 4) * csize + k % csize; >>>>> + (c + k / csize * 4) * csize + k % csize + >>>>> + (reg.offset % REG_SIZE) / type_sz(reg.type); >>>> >>>> Same here. >>>> >>>>> /* Do not exceed the limit for this register */ >>>>> assert(result < 8 * (alloc.offsets[reg.nr] + alloc.sizes[reg.nr])); >>>>> return result; >>>>> -- >>>>> 2.11.0 >>>>> >>>>> _______________________________________________ >>>>> mesa-dev mailing list >>>>> mesa-dev@lists.freedesktop.org >>>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev >>> >>>
signature.asc
Description: OpenPGP digital signature
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev