Re: [Mesa-dev] [PATCH] radv: force compute flush at end of command stream.

2017-07-27 Thread Dave Airlie
On 27 July 2017 at 07:37, Dave Airlie  wrote:
> On 27 July 2017 at 00:06, Nicolai Hähnle  wrote:
>> On 26.07.2017 05:42, Dave Airlie wrote:
>>>
>>> From: Dave Airlie 
>>>
>>> This seems like a workaround, but we don't see the bug on CIK/VI.
>>>
>>> On SI with the dEQP-VK.memory.pipeline_barrier.host_read_transfer_dst.*
>>> tests, when one test completes, the first flush at the start of the next
>>> test causes a VM fault as we've destroyed the VM, but we end up flushing
>>> the compute shader then, and it must still be in the process of doing
>>> something.
>>>
>>> Could also be a kernel difference between SI and CIK.
>>
>>
>> What do you mean by "destroyed the VM"? I thought the Vulkan CTS runs in a
>> single process?
>
> It can, but I run it inside piglit. But even just running one test
> twice in a row causes the problem.
>
>>
>> I guess it's fine as a temporary workaround, but I highly suspect we have
>> some SI-specific bug related to these flushes; I've seen issues with
>> radeonsi on amdgpu as well. It would be great to understand them properly.
>>
>> What do the VM faults look like? How reproducible is this?
>
> The faults are writes to an address that is no longer valid; the address
> was valid during the last compute shader execution in the previous
> process.
>
> Yes: just get an SI, build radv, and run
> ./deqp-vk --deqp-case=dEQP-VK.memory.pipeline_barrier.host_write_uniform_texel_buffer.1024
> then run it again; voilà, faults.

I should also mention that I've previously seen traces from the pro
driver always doing partial cs/ps flushes at the end of every command
buffer (on all GPUs). So maybe that's where this comes from.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radv: force compute flush at end of command stream.

2017-07-26 Thread Dave Airlie
On 27 July 2017 at 00:06, Nicolai Hähnle  wrote:
> On 26.07.2017 05:42, Dave Airlie wrote:
>>
>> From: Dave Airlie 
>>
>> This seems like a workaround, but we don't see the bug on CIK/VI.
>>
>> On SI with the dEQP-VK.memory.pipeline_barrier.host_read_transfer_dst.*
>> tests, when one test completes, the first flush at the start of the next
>> test causes a VM fault as we've destroyed the VM, but we end up flushing
>> the compute shader then, and it must still be in the process of doing
>> something.
>>
>> Could also be a kernel difference between SI and CIK.
>
>
> What do you mean by "destroyed the VM"? I thought the Vulkan CTS runs in a
> single process?

It can, but I run it inside piglit. But even just running one test
twice in a row causes the problem.

>
> I guess it's fine as a temporary workaround, but I highly suspect we have
> some SI-specific bug related to these flushes; I've seen issues with
> radeonsi on amdgpu as well. It would be great to understand them properly.
>
> What do the VM faults look like? How reproducible is this?

The faults are writes to an address that is no longer valid; the address
was valid during the last compute shader execution in the previous
process.

Yes: just get an SI, build radv, and run
./deqp-vk --deqp-case=dEQP-VK.memory.pipeline_barrier.host_write_uniform_texel_buffer.1024
then run it again; voilà, faults.

Dave.


Re: [Mesa-dev] [PATCH] radv: force compute flush at end of command stream.

2017-07-26 Thread Nicolai Hähnle

On 26.07.2017 05:42, Dave Airlie wrote:

From: Dave Airlie 

This seems like a workaround, but we don't see the bug on CIK/VI.

On SI with the dEQP-VK.memory.pipeline_barrier.host_read_transfer_dst.*
tests, when one test completes, the first flush at the start of the next
test causes a VM fault as we've destroyed the VM, but we end up flushing
the compute shader then, and it must still be in the process of doing
something.

Could also be a kernel difference between SI and CIK.


What do you mean by "destroyed the VM"? I thought the Vulkan CTS runs in 
a single process?


I guess it's fine as a temporary workaround, but I highly suspect we 
have some SI-specific bug related to these flushes; I've seen issues 
with radeonsi on amdgpu as well. It would be great to understand them 
properly.


What do the VM faults look like? How reproducible is this?

Cheers,
Nicolai




Signed-off-by: Dave Airlie 
---
  src/amd/vulkan/radv_cmd_buffer.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c
index 4415e36..d185c00 100644
--- a/src/amd/vulkan/radv_cmd_buffer.c
+++ b/src/amd/vulkan/radv_cmd_buffer.c
@@ -2233,8 +2233,10 @@ VkResult radv_EndCommandBuffer(
 {
 	RADV_FROM_HANDLE(radv_cmd_buffer, cmd_buffer, commandBuffer);
 
-	if (cmd_buffer->queue_family_index != RADV_QUEUE_TRANSFER)
+	if (cmd_buffer->queue_family_index != RADV_QUEUE_TRANSFER) {
+		cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_CS_PARTIAL_FLUSH;
 		si_emit_cache_flush(cmd_buffer);
+	}
 
 	if (!cmd_buffer->device->ws->cs_finalize(cmd_buffer->cs) ||
 	    cmd_buffer->record_fail)
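For readers without the radv tree handy, the guarded-flush logic the patch introduces can be sketched as a small standalone C model. The enum values, flag bit, and struct layouts below are hypothetical stand-ins mirroring the names in the diff, not the real radv definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the radv queue families and flush flag. */
enum { RADV_QUEUE_GENERAL, RADV_QUEUE_COMPUTE, RADV_QUEUE_TRANSFER };
#define RADV_CMD_FLAG_CS_PARTIAL_FLUSH (1u << 0)

struct cmd_state {
	uint32_t flush_bits; /* pending cache/pipeline flushes */
};

struct cmd_buffer {
	int queue_family_index;
	struct cmd_state state;
};

/* Models the patched radv_EndCommandBuffer(): on any non-transfer
 * queue, force a compute partial flush into flush_bits before the
 * command stream is finalized. si_emit_cache_flush() would then
 * consume flush_bits and emit the actual flush packets. */
static void end_command_buffer(struct cmd_buffer *cmd)
{
	if (cmd->queue_family_index != RADV_QUEUE_TRANSFER)
		cmd->state.flush_bits |= RADV_CMD_FLAG_CS_PARTIAL_FLUSH;
}
```

With this model, a graphics or compute command buffer always ends with CS_PARTIAL_FLUSH pending, while a transfer (SDMA) command buffer is left untouched, matching the intent of the workaround.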




--
Learn what the world is really like,
but never forget what it ought to be.