On Sun, Feb 8, 2015 at 4:52 AM, Francisco Jerez <curroje...@riseup.net> wrote: > Kenneth Graunke <kenn...@whitecape.org> writes: > >> On Saturday, February 07, 2015 02:10:19 AM Francisco Jerez wrote: >>> Matt Turner <matts...@gmail.com> writes: >>> >>> > On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez <curroje...@riseup.net> >>> > wrote: >>> >> Scalar registers are required to have zero stride, fix the >>> >> regs_written calculation not to assume that the instruction writes >>> >> zero registers in that case. >>> >> --- >>> >> src/mesa/drivers/dri/i965/brw_fs.cpp | 3 ++- >>> >> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >> >>> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp >>> >> b/src/mesa/drivers/dri/i965/brw_fs.cpp >>> >> index 0a82fb7..bafe438 100644 >>> >> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp >>> >> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp >>> >> @@ -126,7 +126,8 @@ fs_inst::init(enum opcode opcode, uint8_t exec_size, >>> >> const fs_reg &dst, >>> >> case HW_REG: >>> >> case MRF: >>> >> case ATTR: >>> >> - this->regs_written = (dst.width * dst.stride * type_sz(dst.type) >>> >> + 31) / 32; >>> >> + this->regs_written = >>> >> + CEILING(MAX2(dst.width * dst.stride, 1) * type_sz(dst.type), >>> >> 32); >>> > >>> > I sent basically the same patch ("[PATCH 06/10] i965/fs: Consider >>> > subregister writes to be writing a register.") >>> > >>> > I don't really care which gets committed, but I did have to think a >>> > lot more about yours (and look up what CEILING did) to be sure it did >>> > the same thing. Maybe you like mine better? >>> >>> I find the helper function easier to understand and less error-prone >>> than the open-coded version. I admit though that the "easier to >>> understand" part may not be the case for someone that is not familiar >>> with CEILING(), as its meaning is far from obvious from its name. Maybe >>> we could come up with a more descriptive name? >> >> Wow, I don't like the name of that macro at all. I would expect it to >> do the well known mathematical function, ceil(). It doesn't. >> > It does, well, sort of... I'm not sure who came up with that macro, but > it seems that what the author had in mind was indeed taking the ceiling > of the fraction given by p/q for CEILING(p, q). > >> In most places in the driver, we do: >> >> ALIGN(value, 32) / 32; >> >> which would be my preference. I definitely dislike the old open coded >> (val + 31) / 32 business. >> > That only works for powers of two, and it still forces you to duplicate > the denominator -- in this specific case it's okay but it starts to get > annoying when the denominator is longer than two keystrokes. How about > we rename CEILING() to DIV_ROUND_UP() as the kernel calls it?
I like that suggestion. _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev