On 29/07/18 at 19:47, Chema Casanova wrote:
> On 28/07/18 at 01:45, Francisco Jerez wrote:
>> Chema Casanova <jmcasan...@igalia.com> writes:
[...]
>>>>> If we have a partial write/read:
>>>>>
>>>>> I understood that my initial pattern proposal would only be OK for
>>>>> the first GRF of src[i]/dst (reg_offset == 0):
>>>>>
>>>>> periodic_mask(this->exec_size, /* count */
>>>>>               this->src[i].stride * type_sz(this->src[i].type), /* step */
>>>>>               type_sz(this->src[i].type), /* bits */
>>>>>               this->src[i].offset % REG_SIZE); /* offset */
>>>>>
>>>>> If we handle only reg_offset == 0 we get a huge improvement,
>>>>> reducing many of the register pressure problems we have now on all
>>>>> SIMD8 shaders with 8/16-bit test cases.
>>>>>
>>>>> I understood that you didn't agree, because for cases where
>>>>> src/destination use more than 1 GRF (reg_offset == 1) we cannot
>>>>> guarantee that we can apply the same internal offset
>>>>> (this->src[i].offset % REG_SIZE) as the base register to calculate
>>>>> a pattern. So it would be better to return ~0u on reads or 0u on
>>>>> writes.
>>>>>
>>>
>>>> Yes, but you could easily determine whether the mask is going to be
>>>> invariant with respect to reg_offset (where reg_offset is within bounds)
>>>> and in that case return the periodic_mask() expression above, otherwise
>>>> return 0/~0u depending on whether reg_offset is within bounds.
>>>
>>> OK, so we are within bounds, we don't have a predicated write, and we
>>> are not a send message. Then we have an ALU opcode and we return the
>>> periodic_mask.
>>>
>>
>> Those are all necessary but not sufficient conditions for the
>> periodic_mask() expression above to give you the correct answer for any
>> in-bounds reg_offset > 0; you should check that byte_offset < type_size
>> * stride in addition.
>
> That's true. Fixed in v5.
>
> If we don't satisfy the condition then we return 0 on writes and ~0u on
> reads.

Could you have a look at v5 to check whether I can count on your R-b?

https://patchwork.freedesktop.org/patch/241482/

I suppose you didn't have time to look at the other patch of the series:

"[v2,2/2] intel/fs: Improve liveness range calculation for partial writes"

https://patchwork.freedesktop.org/patch/239839/

Thanks in advance,

Chema
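
To make the condition discussed above concrete, here is a rough, standalone sketch of how I read it. It is not the code from the v5 patch. The names periodic_mask_sketch, RegionSketch and access_mask_sketch are hypothetical, REG_SIZE is assumed to be 32 bytes, the mask is assumed to carry one bit per byte of a GRF, and returning 0 for an out-of-bounds reg_offset is my own assumption. The predication and send-message checks mentioned in the thread are left out.

```cpp
#include <cstdint>

// Assumption: a GRF is 32 bytes, so a 32-bit mask has one bit per byte.
constexpr unsigned REG_SIZE = 32;

// Hypothetical stand-in for periodic_mask(): sets `count` runs of `bits`
// consecutive bits, one run every `step` positions, starting at `offset`.
// Positions that fall outside the GRF are dropped.
inline uint32_t periodic_mask_sketch(unsigned count, unsigned step,
                                     unsigned bits, unsigned offset)
{
    uint32_t mask = 0;
    for (unsigned i = 0; i < count; i++) {
        for (unsigned b = 0; b < bits; b++) {
            const unsigned pos = offset + i * step + b;
            if (pos < REG_SIZE)
                mask |= UINT32_C(1) << pos;
        }
    }
    return mask;
}

// Hypothetical description of a source or destination region.
struct RegionSketch {
    unsigned exec_size;    // number of channels
    unsigned stride;       // in elements
    unsigned type_size;    // in bytes
    unsigned offset;       // byte offset of the region start
    unsigned size_in_regs; // number of GRFs the region spans
};

// Byte mask accessed in GRF number `reg_offset` of the region.
inline uint32_t access_mask_sketch(const RegionSketch &r, unsigned reg_offset,
                                   bool is_write)
{
    // Assumption: an out-of-bounds GRF is not touched at all.
    if (reg_offset >= r.size_in_regs)
        return 0;

    const unsigned step = r.stride * r.type_size;
    const unsigned byte_offset = r.offset % REG_SIZE;

    // The extra check from the thread: only when byte_offset < type_size *
    // stride is the pattern treated as the same for every in-bounds GRF.
    if (byte_offset < step)
        return periodic_mask_sketch(r.exec_size, step, r.type_size,
                                    byte_offset);

    // Otherwise be conservative: nothing surely written, everything
    // possibly read.
    return is_write ? 0u : ~0u;
}
```

For example, under these assumptions a SIMD8 16-bit region with stride 2 and byte offset 2 gives the per-byte mask 0xCCCCCCCC, while a region with stride 1, type size 2 and byte offset 4 fails the check and falls back to 0 on writes and ~0u on reads.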