On Wed, Jun 27, 2018 at 2:25 AM, Iago Toral <ito...@igalia.com> wrote:
> On Tue, 2018-06-26 at 10:59 -0700, Jason Ekstrand wrote: > > On Tue, Jun 26, 2018 at 4:08 AM, Iago Toral Quiroga <ito...@igalia.com> > wrote: > > Storage images require to patch push constant stateto work, which happens > during > binding table emision. In the scenario where our pipeline and descriptors > are > not dirty, we don't re-emit the binding table, however, if our push > constant > state is dirty, we will re-emit the push constant state, trashing storage > image setup. > > While that scenario is probably not very likely to happen in practice, > there > are some CTS tests that trigger this by clearing storage images and buffers > and dispatching a compute shader in a loop. The clearing of the images > and buffers will trigger a blorp execution which will dirty our push > constant > state, however, because we don't alter the descriptors or the compute > dispatch > at all in the loop (we are basically execution the same program in a loop), > our pipeline and descriptor state is not dirty. If the shader uses a > storage > image, then any iteration after the first will re-emit push constant state > without re-emitting binding tables and the storage image will not be > properly > setup any more. > > > I don't see why that is a problem. The only thing flush_descriptor_sets > does is fill out the binding and sampler tables and fill in the push > constant data for storage images/buffers. The actual HW packets are filled > out by flush_push_constants and emit_descriptor_pointers. Yes, blorp > trashes our descriptor pointers but the descriptor sets should be fine. > For push constants, it does emit 3DSTATE_CONSTANT_* but it doesn't actually > modify anv_cmd_state::push_constants. > > Are secondary command buffers involved? I could see something funny going > on with those. > > > No, no secondaries are involved. I did some more investigation and I think > my explanation of the problem was not good, this is what is really > happening: > > First, I found the problem in the compute pipeline and I only extended the > fix to the graphics pipeline because it looked like the same rationale > would apply, so I'll explain what happens in compute and then we can > discuss whether the same problem applies to graphics. > > The test does something like this: > > for (...) { > clear ssbos / storage images > dispatch compute > } > > The first iteration of this loop will find that the compute pipeline and > descriptors are dirty and proceed to emit binding tables. We have storage > images, so during that process the push constant buffer is amended to > include storage images. Specifically, we call > anv_cmd_buffer_ensure_push_constants_size() > for the images field. This gives us a size of 624. > > We move on to the second iteration of the loop. When we clear images and > ssbos via blorp, we again mark the push constant buffer as dirty. Now we > execute the compute dispatch and the first thing we do there is > anv_cmd_buffer_push_base_group_id() which calls > anv_cmd_buffer_ensure_push_constants_size() for the base group id, which > gives as a size of 144. This is smaller than what we computed in the > previous iteration, because we haven't called the same function for the > images field yet. Unfortunately, we will never call that again, because we > only do that during binding table emission and we only do that if the > compute pipeline is dirty (it is not) or our descriptors are dirty (they > are not). So we don't re-emit binding table and we don't ensure push > constant space for the image data, but because we come from a blorp > execution our push constant dirty flag is true, so we re-emit push constant > data, only that this time we won't emit the push constant data we need for > the storage images, which leads to the problem. > The intention has always been that anv_cmd_buffer_ensure_push_constants_size would only ever grow the push constants and never shrink them. The most obvious bug is in anv_cmd_buffer_ensure_push_constants_size. > I thought that maybe making anv_cmd_buffer_ensure_push_constants_size() > only update the size if we alloc or realloc would fix this, but that can > cause GPU hangs in some cases when I run multiple tests in parallel, so I > guess it isn't that simple. > Ugh... that makes things more interesting. That does look like the right fix and now I'm wondering why it leads to a hang. In the compute case, flush_compute_descriptor_sets emits MEDIA_INTERFACE_DESCRIPTOR_LOAD. My feeling is that not emitting that packet is the real bug. In GL, we just re-emit all 4 compute packets all the time and don't try to track dirty bits. I had patches to do that in Vulkan somewhere. I rebased them and pushed them here: https://gitlab.freedesktop.org/jekstrand/mesa/tree/wip/anv-cs-packets Is that plus your patch to fix ensure_push_constants_size() enough to fix the bug? If so, then I think what's going on is that we have to re-emit all the compute packets after a switch from 3D to compute and we're not doing that. --Jason > I hope I am making more sense now. > > Iago > > --Jason > > > > Fixes multiple failures in some new CTS tests. > --- > src/intel/vulkan/genX_cmd_buffer.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/src/intel/vulkan/genX_cmd_buffer.c > b/src/intel/vulkan/genX_cmd_buffer.c > index 97b321ccaeb..6e48aaedb9b 100644 > --- a/src/intel/vulkan/genX_cmd_buffer.c > +++ b/src/intel/vulkan/genX_cmd_buffer.c > @@ -2554,7 +2554,8 @@ genX(cmd_buffer_flush_state)(struct anv_cmd_buffer > *cmd_buffer) > * 3DSTATE_BINDING_TABLE_POINTER_* for the push constants to take > effect. > */ > uint32_t dirty = 0; > - if (cmd_buffer->state.descriptors_dirty) > + if (cmd_buffer->state.descriptors_dirty || > + cmd_buffer->state.push_constants_dirty) > dirty = flush_descriptor_sets(cmd_buffer); > > if (dirty || cmd_buffer->state.push_constants_dirty) { > @@ -2988,7 +2989,13 @@ genX(cmd_buffer_flush_compute_state)(struct > anv_cmd_buffer *cmd_buffer) > anv_batch_emit_batch(&cmd_buffer->batch, &pipeline->batch); > } > > + /* Storage images require push constant data, which is setup during the > + * binding table emision. If we have dirty push constants, we need to > + * re-emit the binding table so we get the push constant storage image > setup > + * done, otherwise we trash it when we emit push constants below. > + */ > if ((cmd_buffer->state.descriptors_dirty & > VK_SHADER_STAGE_COMPUTE_BIT) || > + (cmd_buffer->state.push_constants_dirty & > VK_SHADER_STAGE_COMPUTE_BIT) || > cmd_buffer->state.compute.pipeline_dirty) { > /* FIXME: figure out descriptors for gen7 */ > result = flush_compute_descriptor_set(cmd_buffer); > -- > 2.14.1 > > >
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev