On Wed, 2015-10-28 at 10:58 -0700, Kristian Høgsberg wrote: > On Wed, Oct 28, 2015 at 10:01:40AM +0100, Samuel Iglesias Gonsálvez wrote: > > There is no opinions about this issue or reviews of the proposed patch > > after one week. > > > > This is just a reminder in case you have missed it :-) > > Thanks for the reminder! How about something like this instead?
Yeah, that works too. I was a bit concerned that this same problem may be affecting other places and this would only address it for brw_send_indirect_message, but after a quick review we don't generally need to hold pointers to previous instructions and the places where we do, like in brw_ENDIF or brw_WHILE we are careful to create the instructions we need before we look for pointers to others (which we do using indices into the store anyway). Reviewed-by: Iago Toral Quiroga <ito...@igalia.com> I'll push this patch tomorrow if nobody else objects. Thanks Kristian! > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c > b/src/mesa/drivers/dri/i965/brw_eu_emit.c > index ebd811f..cd5c726 100644 > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c > @@ -2511,12 +2511,20 @@ brw_send_indirect_message(struct brw_codegen *p, > struct brw_reg desc) > { > const struct brw_device_info *devinfo = p->devinfo; > - struct brw_inst *send, *setup; > + struct brw_inst *send; > + int setup; > > assert(desc.type == BRW_REGISTER_TYPE_UD); > > + /* We hold on to the setup instruction (the SEND in the direct case, the > OR > + * in the indirect case) by its index in the instruction store. The > + * pointer returned by next_insn() may become invalid if emitting the SEND > + * in the indirect case reallocs the store. > + */ > + > if (desc.file == BRW_IMMEDIATE_VALUE) { > - setup = send = next_insn(p, BRW_OPCODE_SEND); > + setup = p->nr_insn; > + send = next_insn(p, BRW_OPCODE_SEND); > brw_set_src1(p, send, desc); > > } else { > @@ -2531,7 +2539,8 @@ brw_send_indirect_message(struct brw_codegen *p, > * caller can specify additional descriptor bits with the usual > * brw_set_*_message() helper functions. > */ > - setup = brw_OR(p, addr, desc, brw_imm_ud(0)); > + setup = p->nr_insn; > + brw_OR(p, addr, desc, brw_imm_ud(0)); > > brw_pop_insn_state(p); > > @@ -2543,7 +2552,7 @@ brw_send_indirect_message(struct brw_codegen *p, > brw_set_src0(p, send, retype(payload, BRW_REGISTER_TYPE_UD)); > brw_inst_set_sfid(devinfo, send, sfid); > > - return setup; > + return &p->store[setup]; > } > > static struct brw_inst * > > > > Sam > > > > On 21/10/15 12:23, Iago Toral wrote: > > > Hi, > > > > > > The problem is with code like this (see brw_send_indirect_message): > > > > > > setup = brw_OR(p, addr, desc, brw_imm_ud(0)); > > > send = next_insn(p, BRW_OPCODE_SEND); > > > ... > > > return setup; > > > > > > If next_insn triggers a realloc of the instruction store, then the setup > > > instruction pointer is no longer valid. Notice that this can happen > > > anywhere where we keep pointers to previous instructions before creating > > > new ones (!) > > > > > > The patch from Samuel fixes this by special-casing this for SEND > > > instructions only (since we know that the indirect versions can hit > > > this, maybe there are more situations though). It does so by trying to > > > make sure that we never realloc the store with a SEND instruction. For > > > this, we realloc before we reach the end of the current store (32 > > > instructions before the limit) as long as the instruction is not a SEND > > > (so that if it is a SEND we still have up to 32 opportunities to do the > > > realloc without a different instruction before running out of space in > > > the store). > > > > > > Iago > > > > > > On Wed, 2015-10-21 at 09:02 +0200, Samuel Iglesias Gonsalvez wrote: > > >> Hello, > > >> > > >> I have found several invalid memory accesses when running > > >> dEQP-GLES31.functional.ssbo.* tests on i965 driver (and gen7+). That > > >> invalid memory accesses were unluckily happening when generating the > > >> assembly instructions for SSBO stores for different compute shaders. > > >> > > >> However it looks like this problem could happen to other shaders and > > >> situations. Because of that, I am going to explain the problem here: > > >> > > >> When generating a untyped surface write/read, i965 driver will end up > > >> calling brw_send_indirect_message() through > > >> brw_send_indirect_surface_message(). At brw_send_indirect_message()'s > > >> 'else' branch, the code generates a load of the indirect descriptor to > > >> an address register using an OR instruction and it also generates a new > > >> SEND instruction; if this case happens, the OR instruction is returned. > > >> brw_send_indirect_surface_message() uses that OR instruction to set mlen > > >> and rlen's descriptor bits later. > > >> > > >> Just to give more context, when generating instructions in fs/vec4 > > >> generators, i965 driver uses pointers to elements in the 'store' table > > >> inside struct brw_codegen. That table has an initial size of 1024 but, > > >> when it's full, it is resized (doubling its size each time, > > >> see brw_next_insn()). This resize operation ends up calling > > >> realloc(). However the returned pointer by realloc() could be different > > >> and the old allocated memory would be free'd as part of the process. > > >> > > >> Back to the issue, if the p->store's resize happens when we get the > > >> pointer to the SEND instruction at brw_send_indirect_message()'s 'else' > > >> branch, we could have the following problem: > > >> > > >> The realloc() returns a new pointer and *free'd* the old allocation, so > > >> the pointer we previously saved for the OR instruction at > > >> brw_send_indirect_surface_message() becomes invalid (because it is a > > >> pointer of the old allocation). Then, we access to that invalid pointer > > >> when setting up rlen/mlen bits at the end of > > >> brw_send_indirect_surface_message() and we would have undefined results. > > >> > > >> This issue is quite unlikely to happen but it is reproducible on > > >> ~120 dEQP-GLES31.functional.ssbo.* tests, basically because they have > > >> the same shaders except the buffer variable's data type. Those tests > > >> were failing intermittently at different rates but valgrind helped to > > >> find what was happening. > > >> > > >> I would like to expose publicly the problem and analyse possible > > >> solutions for it along with the community. For the time being, a patch > > >> is proposed to mitigate this issue, which is sent as a reply to this > > >> one. > > >> > > >> What this work-around patch does is: book enough room for one or more > > >> consecutive SEND* instructions at the end of the p->store table in order > > >> to avoid reallocating p->store in the aforementioned case. The 32 value > > >> was chosen arbitrary because it has low impact in p->store > > >> (its initial size is 1024) and makes this issue much less likely to > > >> happen. We could tune this number to a less conservative value if > > >> needed. If you want to test it, that patch should be applied on top of > > >> this Curro's patch [0] as it fixes a lot of compute shader compilation > > >> errors (~700 dEQP-GLES31.functional.ssbo.* tests). I have setup a branch > > >> with both patches in [1]. > > >> > > >> Feel free to comment about other solutions, ideas, opinions, etc. > > >> > > >> Thanks, > > >> > > >> Sam > > >> > > >> [0] See attachment at > > >> http://lists.freedesktop.org/archives/mesa-dev/2015-October/097183.html > > >> [1] $ git clone -b dEQP-functional-ssbo-fixes-v1 \ > > >> https://github.com/Igalia/mesa.git > > >> > > >> Samuel Iglesias Gonsalvez (1): > > >> i965: book space at the end of p->store for SEND opcodes to avoid > > >> invalid memory access > > >> > > >> src/mesa/drivers/dri/i965/brw_eu_emit.c | 12 +++++++++++- > > >> 1 file changed, 11 insertions(+), 1 deletion(-) > > >> > > > > > > > > > > > _______________________________________________ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev