[Mesa-dev] [PATCH 1/2] radeonsi: set a per-buffer flag that disables inter-process sharing (v4)

2017-08-29 Thread Christian König
From: Marek Olšák 

For lower overhead in the CS ioctl.
Winsys allocators are not used with interprocess-sharable resources.

v2: It shouldn't crash anymore, but the kernel will reject the new flag.
v3 (christian): Rename the flag, avoid sending those buffers in the BO list.
v4 (christian): Remove setting the kernel flag for now
---
 src/gallium/drivers/radeon/r600_buffer_common.c |  7 ++
 src/gallium/drivers/radeon/radeon_winsys.h  | 20 +
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c   | 30 ++---
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c   | 27 +-
 4 files changed, 56 insertions(+), 28 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_buffer_common.c b/src/gallium/drivers/radeon/r600_buffer_common.c
index dd1c209..2747ac4 100644
--- a/src/gallium/drivers/radeon/r600_buffer_common.c
+++ b/src/gallium/drivers/radeon/r600_buffer_common.c
@@ -167,6 +167,13 @@ void r600_init_resource_fields(struct r600_common_screen *rscreen,
 			      RADEON_FLAG_GTT_WC;
 	}
 
+	/* Only displayable single-sample textures can be shared between
+	 * processes. */
+	if (res->b.b.target == PIPE_BUFFER ||
+	    res->b.b.nr_samples >= 2 ||
+	    rtex->surface.micro_tile_mode != RADEON_MICRO_MODE_DISPLAY)
+		res->flags |= RADEON_FLAG_NO_INTERPROCESS_SHARING;
+
 	/* If VRAM is just stolen system memory, allow both VRAM and
 	 * GTT, whichever has free space. If a buffer is evicted from
 	 * VRAM to GTT, it will stay there.
diff --git a/src/gallium/drivers/radeon/radeon_winsys.h b/src/gallium/drivers/radeon/radeon_winsys.h
index b00b144..f0a0a92 100644
--- a/src/gallium/drivers/radeon/radeon_winsys.h
+++ b/src/gallium/drivers/radeon/radeon_winsys.h
@@ -54,6 +54,7 @@ enum radeon_bo_flag { /* bitfield */
     RADEON_FLAG_NO_CPU_ACCESS = (1 << 1),
     RADEON_FLAG_NO_SUBALLOC =   (1 << 2),
     RADEON_FLAG_SPARSE =        (1 << 3),
+    RADEON_FLAG_NO_INTERPROCESS_SHARING = (1 << 4),
 };
 
 enum radeon_bo_usage { /* bitfield */
@@ -661,14 +662,19 @@ static inline unsigned radeon_flags_from_heap(enum radeon_heap heap)
 {
     switch (heap) {
     case RADEON_HEAP_VRAM_NO_CPU_ACCESS:
-        return RADEON_FLAG_GTT_WC | RADEON_FLAG_NO_CPU_ACCESS;
+        return RADEON_FLAG_GTT_WC |
+               RADEON_FLAG_NO_CPU_ACCESS |
+               RADEON_FLAG_NO_INTERPROCESS_SHARING;
+
     case RADEON_HEAP_VRAM:
     case RADEON_HEAP_VRAM_GTT:
     case RADEON_HEAP_GTT_WC:
-        return RADEON_FLAG_GTT_WC;
+        return RADEON_FLAG_GTT_WC |
+               RADEON_FLAG_NO_INTERPROCESS_SHARING;
+
     case RADEON_HEAP_GTT:
     default:
-        return 0;
+        return RADEON_FLAG_NO_INTERPROCESS_SHARING;
     }
 }
 
@@ -700,8 +706,14 @@ static inline int radeon_get_heap_index(enum radeon_bo_domain domain,
     /* NO_CPU_ACCESS implies VRAM only. */
     assert(!(flags & RADEON_FLAG_NO_CPU_ACCESS) || domain == RADEON_DOMAIN_VRAM);
 
+    /* Resources with interprocess sharing don't use any winsys allocators. */
+    if (!(flags & RADEON_FLAG_NO_INTERPROCESS_SHARING))
+        return -1;
+
     /* Unsupported flags: NO_SUBALLOC, SPARSE. */
-    if (flags & ~(RADEON_FLAG_GTT_WC | RADEON_FLAG_NO_CPU_ACCESS))
+    if (flags & ~(RADEON_FLAG_GTT_WC |
+                  RADEON_FLAG_NO_CPU_ACCESS |
+                  RADEON_FLAG_NO_INTERPROCESS_SHARING))
         return -1;
 
     switch (domain) {
diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
index 97bbe23..08348ed 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
@@ -1134,7 +1134,7 @@ amdgpu_bo_create(struct radeon_winsys *rws,
 {
    struct amdgpu_winsys *ws = amdgpu_winsys(rws);
    struct amdgpu_winsys_bo *bo;
-   unsigned usage = 0, pb_cache_bucket;
+   unsigned usage = 0, pb_cache_bucket = 0;
 
    /* VRAM implies WC. This is not optional. */
    assert(!(domain & RADEON_DOMAIN_VRAM) || flags & RADEON_FLAG_GTT_WC);
@@ -1189,19 +1189,23 @@ no_slab:
    size = align64(size, ws->info.gart_page_size);
    alignment = align(alignment, ws->info.gart_page_size);
 
-   int heap = radeon_get_heap_index(domain, flags);
-   assert(heap >= 0 && heap < RADEON_MAX_CACHED_HEAPS);
-   usage = 1 << heap; /* Only set one usage bit for each heap. */
+   bool use_reusable_pool = flags & RADEON_FLAG_NO_INTERPROCESS_SHARING;
 
-   pb_cache_bucket = radeon_get_pb_cache_bucket_index(heap);
-   assert(pb_cache_bucket < ARRAY_SIZE(ws->bo_cache.buckets));
+   if (use_reusable_pool) {
+      int heap = radeon_get_heap_index(domain, flags);
+      assert(heap >= 0 && heap < RADEON_MAX_CACHED_HEAPS);
+      usage = 1 << heap; /* Only set one usage bit for each heap. */
 
-   /* Get a buffer from the cache. */
-   bo = (struct amdgpu_winsys_bo*)
-        pb_cache_reclaim_buffer(&ws->bo_cache, size, alignment, usage,
-p

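For readers skimming the patch, the two flag-setting rules in the hunks above can be sketched in isolation. This is only an illustration under stated assumptions: `struct sketch_resource`, `sketch_sharing_flag` and `sketch_flags_from_heap` are hypothetical stand-ins, the value of RADEON_FLAG_GTT_WC and the numeric order of the heap enum are assumed because the hunks do not show them, and the real functions operate on driver structures that are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Bits (1 << 1)..(1 << 4) appear in the radeon_winsys.h hunk.
 * The value of RADEON_FLAG_GTT_WC is not shown there and is assumed. */
enum radeon_bo_flag {
    RADEON_FLAG_GTT_WC                  = (1 << 0), /* assumed */
    RADEON_FLAG_NO_CPU_ACCESS           = (1 << 1),
    RADEON_FLAG_NO_SUBALLOC             = (1 << 2),
    RADEON_FLAG_SPARSE                  = (1 << 3),
    RADEON_FLAG_NO_INTERPROCESS_SHARING = (1 << 4),
};

/* Heap names come from the patch; their numeric order is assumed. */
enum radeon_heap {
    RADEON_HEAP_VRAM_NO_CPU_ACCESS,
    RADEON_HEAP_VRAM,
    RADEON_HEAP_VRAM_GTT,
    RADEON_HEAP_GTT_WC,
    RADEON_HEAP_GTT,
};

/* Hypothetical stand-in for the fields the driver hunk inspects. */
struct sketch_resource {
    bool is_buffer;      /* target == PIPE_BUFFER */
    unsigned nr_samples; /* >= 2 means multisampled */
    bool displayable;    /* micro_tile_mode == RADEON_MICRO_MODE_DISPLAY */
};

/* Mirrors the condition added to r600_init_resource_fields():
 * only displayable single-sample textures stay sharable. */
unsigned sketch_sharing_flag(const struct sketch_resource *res)
{
    assert(res);
    if (res->is_buffer || res->nr_samples >= 2 || !res->displayable)
        return RADEON_FLAG_NO_INTERPROCESS_SHARING;
    return 0;
}

/* Mirrors the patched radeon_flags_from_heap(): every heap served by
 * the winsys allocators now carries the no-sharing flag. */
unsigned sketch_flags_from_heap(enum radeon_heap heap)
{
    switch (heap) {
    case RADEON_HEAP_VRAM_NO_CPU_ACCESS:
        return RADEON_FLAG_GTT_WC |
               RADEON_FLAG_NO_CPU_ACCESS |
               RADEON_FLAG_NO_INTERPROCESS_SHARING;
    case RADEON_HEAP_VRAM:
    case RADEON_HEAP_VRAM_GTT:
    case RADEON_HEAP_GTT_WC:
        return RADEON_FLAG_GTT_WC |
               RADEON_FLAG_NO_INTERPROCESS_SHARING;
    case RADEON_HEAP_GTT:
    default:
        return RADEON_FLAG_NO_INTERPROCESS_SHARING;
    }
}
```

Under these assumptions a single-sample displayable texture gets no extra flag, while a PIPE_BUFFER or a multisampled texture gets RADEON_FLAG_NO_INTERPROCESS_SHARING.
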
Re: [Mesa-dev] [PATCH 1/2] radeonsi: set a per-buffer flag that disables inter-process sharing (v4)

2017-09-01 Thread Michel Dänzer
On 29/08/17 11:47 PM, Christian König wrote:
> From: Marek Olšák 
> 
> For lower overhead in the CS ioctl.
> Winsys allocators are not used with interprocess-sharable resources.
> 
> v2: It shouldn't crash anymore, but the kernel will reject the new flag.
> v3 (christian): Rename the flag, avoid sending those buffers in the BO list.
> v4 (christian): Remove setting the kernel flag for now

This change seems to have caused a GPU hang when running piglit on my
Kaveri with the radeon kernel driver. Haven't been able to isolate it to
a specific test, seems to only happen when running multiple tests
concurrently. There's a GPUVM fault before the hang, I suspect it's related:

radeon :00:01.0: GPU fault detected: 146 0x0ae6760c
radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x01D7
radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0607600C
VM fault (0x0c, vmid 3) at page 471, read from 'CPF' (0x43504600) (118)


Any ideas?


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] radeonsi: set a per-buffer flag that disables inter-process sharing (v4)

2017-09-01 Thread Nicolai Hähnle

On 01.09.2017 11:58, Michel Dänzer wrote:
> On 29/08/17 11:47 PM, Christian König wrote:
>> From: Marek Olšák
>>
>> For lower overhead in the CS ioctl.
>> Winsys allocators are not used with interprocess-sharable resources.
>>
>> v2: It shouldn't crash anymore, but the kernel will reject the new flag.
>> v3 (christian): Rename the flag, avoid sending those buffers in the BO list.
>> v4 (christian): Remove setting the kernel flag for now
>
> This change seems to have caused a GPU hang when running piglit on my
> Kaveri with the radeon kernel driver. Haven't been able to isolate it to
> a specific test, seems to only happen when running multiple tests
> concurrently. There's a GPUVM fault before the hang, I suspect it's related:
>
> radeon :00:01.0: GPU fault detected: 146 0x0ae6760c
> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x01D7
> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0607600C
> VM fault (0x0c, vmid 3) at page 471, read from 'CPF' (0x43504600) (118)
>
>
> Any ideas?


Only that "read from CPF" means it can only be one of:

- command buffers
- indirect draw data
- predication data (conditional render)

(I hope I didn't miss anything)

Hmm, actually, I think CI has unavoidable VM faults related to 
ARB_sparse_buffers, so this may be benign. You could try to exclude the 
ARB_sparse_buffers tests.
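
For reference, the numbers in that fault report can be cross-checked by hand. The sketch below unpacks the STATUS word and the four-character client tag. The bit positions are an assumption, chosen because they reproduce the values printed in the log (0x0c, vmid 3, read, client 118); `struct vm_fault`, `decode_vm_fault` and `print_vm_fault` are illustrative names, not kernel API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed field layout of VM_CONTEXT1_PROTECTION_FAULT_STATUS.
 * It is consistent with the quoted log line, nothing more. */
#define FAULT_PROTECTIONS(s) ((s) & 0xffu)
#define FAULT_CLIENT_ID(s)   (((s) >> 12) & 0xffu)
#define FAULT_IS_WRITE(s)    (((s) >> 24) & 0x1u)
#define FAULT_VMID(s)        (((s) >> 25) & 0xfu)

struct vm_fault {
    uint32_t page;        /* faulting page number */
    unsigned protections; /* protection bits that triggered the fault */
    unsigned client_id;   /* numeric memory client id */
    unsigned vmid;        /* VM context that faulted */
    int is_write;         /* 0 = read, 1 = write */
    char block[5];        /* ASCII tag of the hardware block */
};

struct vm_fault decode_vm_fault(uint32_t addr, uint32_t status,
                                uint32_t mc_client)
{
    struct vm_fault f;

    f.page = addr;
    f.protections = FAULT_PROTECTIONS(status);
    f.client_id = FAULT_CLIENT_ID(status);
    f.vmid = FAULT_VMID(status);
    f.is_write = (int)FAULT_IS_WRITE(status);

    /* The client tag packs four ASCII bytes, most significant first. */
    f.block[0] = (char)(mc_client >> 24);
    f.block[1] = (char)(mc_client >> 16);
    f.block[2] = (char)(mc_client >> 8);
    f.block[3] = (char)mc_client;
    f.block[4] = '\0';
    return f;
}

void print_vm_fault(const struct vm_fault *f)
{
    assert(f);
    printf("VM fault (0x%02x, vmid %u) at page %u, %s '%s' (%u)\n",
           f->protections, f->vmid, (unsigned)f->page,
           f->is_write ? "write to" : "read from",
           f->block, f->client_id);
}
```

Feeding it the logged values (address 0x01D7, status 0x0607600C, client tag 0x43504600) yields page 471, protections 0x0c, vmid 3, a read access, and the tag "CPF" with client id 118, which matches the kernel message.
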


Cheers,
Nicolai
--
Learn how the world really is,
But never forget how it ought to be.


Re: [Mesa-dev] [PATCH 1/2] radeonsi: set a per-buffer flag that disables inter-process sharing (v4)

2017-09-01 Thread Michel Dänzer
On 01/09/17 07:23 PM, Nicolai Hähnle wrote:
> On 01.09.2017 11:58, Michel Dänzer wrote:
>> On 29/08/17 11:47 PM, Christian König wrote:
>>> From: Marek Olšák 
>>>
>>> For lower overhead in the CS ioctl.
>>> Winsys allocators are not used with interprocess-sharable resources.
>>>
>>> v2: It shouldn't crash anymore, but the kernel will reject the new flag.
>>> v3 (christian): Rename the flag, avoid sending those buffers in the
>>> BO list.
>>> v4 (christian): Remove setting the kernel flag for now
>>
>> This change seems to have caused a GPU hang when running piglit on my
>> Kaveri with the radeon kernel driver. Haven't been able to isolate it to
>> a specific test, seems to only happen when running multiple tests
>> concurrently. There's a GPUVM fault before the hang, I suspect it's
>> related:
>>
>> radeon :00:01.0: GPU fault detected: 146 0x0ae6760c
>> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x01D7
>> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0607600C
>> VM fault (0x0c, vmid 3) at page 471, read from 'CPF' (0x43504600) (118)
>>
>>
>> Any ideas?
> 
> Only that "read from CPF" means it can only be one of:
> 
> - command buffers
> - indirect draw data
> - predication data (conditional render)
> 
> (I hope I didn't miss anything)
> 
> Hmm, actually, I think CI has unavoidable VM faults related to
> ARB_sparse_buffers, so this may be benign. You could try to exclude the
> ARB_sparse_buffers tests.

GL_ARB_sparse_buffer isn't supported with the radeon kernel driver AFAICT.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH 1/2] radeonsi: set a per-buffer flag that disables inter-process sharing (v4)

2017-09-01 Thread Christian König

On 01.09.2017 at 12:28, Michel Dänzer wrote:
> On 01/09/17 07:23 PM, Nicolai Hähnle wrote:
>> On 01.09.2017 11:58, Michel Dänzer wrote:
>>> On 29/08/17 11:47 PM, Christian König wrote:
>>>> From: Marek Olšák
>>>>
>>>> For lower overhead in the CS ioctl.
>>>> Winsys allocators are not used with interprocess-sharable resources.
>>>>
>>>> v2: It shouldn't crash anymore, but the kernel will reject the new flag.
>>>> v3 (christian): Rename the flag, avoid sending those buffers in the
>>>> BO list.
>>>> v4 (christian): Remove setting the kernel flag for now
>>>
>>> This change seems to have caused a GPU hang when running piglit on my
>>> Kaveri with the radeon kernel driver. Haven't been able to isolate it to
>>> a specific test, seems to only happen when running multiple tests
>>> concurrently. There's a GPUVM fault before the hang, I suspect it's
>>> related:
>>>
>>> radeon :00:01.0: GPU fault detected: 146 0x0ae6760c
>>> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x01D7
>>> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0607600C
>>> VM fault (0x0c, vmid 3) at page 471, read from 'CPF' (0x43504600) (118)
>>>
>>>
>>> Any ideas?

Not the slightest, but I'm still investigating problems with that on amdgpu.

If we can't find the root cause till Monday it might be a good idea to
revert the patches for now.

Christian.

>> Only that "read from CPF" means it can only be one of:
>>
>> - command buffers
>> - indirect draw data
>> - predication data (conditional render)
>>
>> (I hope I didn't miss anything)
>>
>> Hmm, actually, I think CI has unavoidable VM faults related to
>> ARB_sparse_buffers, so this may be benign. You could try to exclude the
>> ARB_sparse_buffers tests.
>
> GL_ARB_sparse_buffer isn't supported with the radeon kernel driver AFAICT.






Re: [Mesa-dev] [PATCH 1/2] radeonsi: set a per-buffer flag that disables inter-process sharing (v4)

2017-09-01 Thread Marek Olšák
Well, this patch shouldn't change behavior. Are you sure it's not some
random thing? I know, for example, that SI isn't very stable with piglit on
radeon either.

Marek

On Sep 1, 2017 12:28 PM, "Michel Dänzer"  wrote:

> On 01/09/17 07:23 PM, Nicolai Hähnle wrote:
> > On 01.09.2017 11:58, Michel Dänzer wrote:
> >> On 29/08/17 11:47 PM, Christian König wrote:
> >>> From: Marek Olšák 
> >>>
> >>> For lower overhead in the CS ioctl.
> >>> Winsys allocators are not used with interprocess-sharable resources.
> >>>
> >>> v2: It shouldn't crash anymore, but the kernel will reject the new
> flag.
> >>> v3 (christian): Rename the flag, avoid sending those buffers in the
> >>> BO list.
> >>> v4 (christian): Remove setting the kernel flag for now
> >>
> >> This change seems to have caused a GPU hang when running piglit on my
> >> Kaveri with the radeon kernel driver. Haven't been able to isolate it to
> >> a specific test, seems to only happen when running multiple tests
> >> concurrently. There's a GPUVM fault before the hang, I suspect it's
> >> related:
> >>
> >> radeon :00:01.0: GPU fault detected: 146 0x0ae6760c
> >> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x01D7
> >> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0607600C
> >> VM fault (0x0c, vmid 3) at page 471, read from 'CPF' (0x43504600) (118)
> >>
> >>
> >> Any ideas?
> >
> > Only that "read from CPF" means it can only be one of:
> >
> > - command buffers
> > - indirect draw data
> > - predication data (conditional render)
> >
> > (I hope I didn't miss anything)
> >
> > Hmm, actually, I think CI has unavoidable VM faults related to
> > ARB_sparse_buffers, so this may be benign. You could try to exclude the
> > ARB_sparse_buffers tests.
>
> GL_ARB_sparse_buffer isn't supported with the radeon kernel driver AFAICT.
>
>
> --
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer
>


Re: [Mesa-dev] [PATCH 1/2] radeonsi: set a per-buffer flag that disables inter-process sharing (v4)

2017-09-02 Thread Michel Dänzer
On 01/09/17 09:18 PM, Marek Olšák wrote:
> Well, this patch shouldn't change behavior. Are you sure it's not some
> random thing?

Pretty sure. I never saw this GPUVM fault before this commit, but with
it or later commits it's reproducible reliably.


> I know, for example, that SI isn't very stable with piglit on radeon either.

My Kaveri has been very stable for years.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH 1/2] radeonsi: set a per-buffer flag that disables inter-process sharing (v4)

2017-09-07 Thread Michel Dänzer
On 01/09/17 07:40 PM, Christian König wrote:
> On 01.09.2017 at 12:28, Michel Dänzer wrote:
>> On 01/09/17 07:23 PM, Nicolai Hähnle wrote:
>>> On 01.09.2017 11:58, Michel Dänzer wrote:
>>>> On 29/08/17 11:47 PM, Christian König wrote:
>>>>> From: Marek Olšák
>>>>>
>>>>> For lower overhead in the CS ioctl.
>>>>> Winsys allocators are not used with interprocess-sharable resources.
>>>>>
>>>>> v2: It shouldn't crash anymore, but the kernel will reject the new
>>>>> flag.
>>>>> v3 (christian): Rename the flag, avoid sending those buffers in the
>>>>> BO list.
>>>>> v4 (christian): Remove setting the kernel flag for now
>>>> This change seems to have caused a GPU hang when running piglit on my
>>>> Kaveri with the radeon kernel driver.

I think we can remove "seems to have". I'm still reliably getting the
GPUVM fault and hang with current master, but not if I revert this
commit (and the one after it).

>>>> Haven't been able to isolate it to a specific test, seems to only
>>>> happen when running multiple tests concurrently.

I reproduced the problem with piglit process separation enabled as well,
and all four tests running when it hung were textureGather tests.
Before, reproducing the problem twice with piglit process separation
disabled, three textureGather tests were running when it hung both times
as well. I've been unable to reproduce the problem by manually running
the same textureGather tests in parallel though.


>>>> There's a GPUVM fault before the hang, I suspect it's related:
>>>>
>>>> radeon :00:01.0: GPU fault detected: 146 0x0ae6760c
>>>> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x01D7
>>>> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0607600C
>>>> VM fault (0x0c, vmid 3) at page 471, read from 'CPF' (0x43504600) (118)
>>>>
>>>>
>>>> Any ideas?
> 
> Not the slightest, but I'm still investigating problems with that on
> amdgpu.
> 
> If we can't find the root cause till Monday it might be a good idea to
> revert the patches for now.

What's the status on that?


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer


Re: [Mesa-dev] [PATCH 1/2] radeonsi: set a per-buffer flag that disables inter-process sharing (v4)

2017-09-07 Thread Christian König

On 07.09.2017 at 11:23, Michel Dänzer wrote:
> On 01/09/17 07:40 PM, Christian König wrote:
>> On 01.09.2017 at 12:28, Michel Dänzer wrote:
>>> On 01/09/17 07:23 PM, Nicolai Hähnle wrote:
>>>> On 01.09.2017 11:58, Michel Dänzer wrote:
>>>>> On 29/08/17 11:47 PM, Christian König wrote:
>>>>>> From: Marek Olšák
>>>>>>
>>>>>> For lower overhead in the CS ioctl.
>>>>>> Winsys allocators are not used with interprocess-sharable resources.
>>>>>>
>>>>>> v2: It shouldn't crash anymore, but the kernel will reject the new
>>>>>> flag.
>>>>>> v3 (christian): Rename the flag, avoid sending those buffers in the
>>>>>> BO list.
>>>>>> v4 (christian): Remove setting the kernel flag for now
>>>>>
>>>>> This change seems to have caused a GPU hang when running piglit on my
>>>>> Kaveri with the radeon kernel driver.
>
> I think we can remove "seems to have". I'm still reliably getting the
> GPUVM fault and hang with current master, but not if I revert this
> commit (and the one after it).
>
>>>>> Haven't been able to isolate it to a specific test, seems to only
>>>>> happen when running multiple tests concurrently.
>
> I reproduced the problem with piglit process separation enabled as well,
> and all four tests running when it hung were textureGather tests.
> Before, reproducing the problem twice with piglit process separation
> disabled, three textureGather tests were running when it hung both times
> as well. I've been unable to reproduce the problem by manually running
> the same textureGather tests in parallel though.
>
>>>>> There's a GPUVM fault before the hang, I suspect it's related:
>>>>>
>>>>> radeon :00:01.0: GPU fault detected: 146 0x0ae6760c
>>>>> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x01D7
>>>>> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0607600C
>>>>> VM fault (0x0c, vmid 3) at page 471, read from 'CPF' (0x43504600) (118)
>>>>>
>>>>>
>>>>> Any ideas?
>>
>> Not the slightest, but I'm still investigating problems with that on
>> amdgpu.
>>
>> If we can't find the root cause till Monday it might be a good idea to
>> revert the patches for now.
>
> What's the status on that?



I've found and fixed the remaining kernel bugs over the last 
weekend/beginning of this week.


Still need to commit the fix for UVD/VCE, but that one shouldn't affect 
GFX at all.


Christian.



Re: [Mesa-dev] [PATCH 1/2] radeonsi: set a per-buffer flag that disables inter-process sharing (v4)

2017-09-07 Thread Marek Olšák
On Sep 7, 2017 12:08 PM, "Christian König"  wrote:

> On 07.09.2017 at 11:23, Michel Dänzer wrote:
>> On 01/09/17 07:40 PM, Christian König wrote:
>>> On 01.09.2017 at 12:28, Michel Dänzer wrote:
>>>> On 01/09/17 07:23 PM, Nicolai Hähnle wrote:
>>>>> On 01.09.2017 11:58, Michel Dänzer wrote:
>>>>>> On 29/08/17 11:47 PM, Christian König wrote:
>>>>>>> From: Marek Olšák
>>>>>>>
>>>>>>> For lower overhead in the CS ioctl.
>>>>>>> Winsys allocators are not used with interprocess-sharable resources.
>>>>>>>
>>>>>>> v2: It shouldn't crash anymore, but the kernel will reject the new
>>>>>>> flag.
>>>>>>> v3 (christian): Rename the flag, avoid sending those buffers in the
>>>>>>> BO list.
>>>>>>> v4 (christian): Remove setting the kernel flag for now
>>>>>>
>>>>>> This change seems to have caused a GPU hang when running piglit on my
>>>>>> Kaveri with the radeon kernel driver.
>>
>> I think we can remove "seems to have". I'm still reliably getting the
>> GPUVM fault and hang with current master, but not if I revert this
>> commit (and the one after it).
>>
>>>>>> Haven't been able to isolate it to a specific test, seems to only
>>>>>> happen when running multiple tests concurrently.
>>
>> I reproduced the problem with piglit process separation enabled as well,
>> and all four tests running when it hung were textureGather tests.
>> Before, reproducing the problem twice with piglit process separation
>> disabled, three textureGather tests were running when it hung both times
>> as well. I've been unable to reproduce the problem by manually running
>> the same textureGather tests in parallel though.
>>
>>>>>> There's a GPUVM fault before the hang, I suspect it's related:
>>>>>>
>>>>>> radeon :00:01.0: GPU fault detected: 146 0x0ae6760c
>>>>>> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x01D7
>>>>>> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0607600C
>>>>>> VM fault (0x0c, vmid 3) at page 471, read from 'CPF' (0x43504600) (118)
>>>>>>
>>>>>>
>>>>>> Any ideas?
>>>
>>> Not the slightest, but I'm still investigating problems with that on
>>> amdgpu.
>>>
>>> If we can't find the root cause till Monday it might be a good idea to
>>> revert the patches for now.
>>
>> What's the status on that?
>
> I've found and fixed the remaining kernel bugs over the last
> weekend/beginning of this week.
>
> Still need to commit the fix for UVD/VCE, but that one shouldn't affect GFX
> at all.

Michel is seeing hangs on the radeon KMD, which should be unaffected by your
kernel work I think.

We could revert this to unbreak Michel's Kaveri, but I think it shouldn't
be so difficult to find the culprit in this patch if there is one.

Marek

> Christian.


Re: [Mesa-dev] [PATCH 1/2] radeonsi: set a per-buffer flag that disables inter-process sharing (v4)

2017-09-07 Thread Christian König

On 07.09.2017 at 12:14, Marek Olšák wrote:
> On Sep 7, 2017 12:08 PM, "Christian König" wrote:
>> On 07.09.2017 at 11:23, Michel Dänzer wrote:
>>> On 01/09/17 07:40 PM, Christian König wrote:
>>>> On 01.09.2017 at 12:28, Michel Dänzer wrote:
>>>>> On 01/09/17 07:23 PM, Nicolai Hähnle wrote:
>>>>>> On 01.09.2017 11:58, Michel Dänzer wrote:
>>>>>>> On 29/08/17 11:47 PM, Christian König wrote:
>>>>>>>> From: Marek Olšák
>>>>>>>>
>>>>>>>> For lower overhead in the CS ioctl.
>>>>>>>> Winsys allocators are not used with interprocess-sharable resources.
>>>>>>>>
>>>>>>>> v2: It shouldn't crash anymore, but the kernel will reject the new
>>>>>>>> flag.
>>>>>>>> v3 (christian): Rename the flag, avoid sending those buffers in the
>>>>>>>> BO list.
>>>>>>>> v4 (christian): Remove setting the kernel flag for now
>>>>>>>
>>>>>>> This change seems to have caused a GPU hang when running piglit on my
>>>>>>> Kaveri with the radeon kernel driver.
>>>
>>> I think we can remove "seems to have". I'm still reliably getting the
>>> GPUVM fault and hang with current master, but not if I revert this
>>> commit (and the one after it).
>>>
>>>>>>> Haven't been able to isolate it to a specific test, seems to only
>>>>>>> happen when running multiple tests concurrently.
>>>
>>> I reproduced the problem with piglit process separation enabled as well,
>>> and all four tests running when it hung were textureGather tests.
>>> Before, reproducing the problem twice with piglit process separation
>>> disabled, three textureGather tests were running when it hung both times
>>> as well. I've been unable to reproduce the problem by manually running
>>> the same textureGather tests in parallel though.
>>>
>>>>>>> There's a GPUVM fault before the hang, I suspect it's related:
>>>>>>>
>>>>>>> radeon :00:01.0: GPU fault detected: 146 0x0ae6760c
>>>>>>> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x01D7
>>>>>>> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0607600C
>>>>>>> VM fault (0x0c, vmid 3) at page 471, read from 'CPF' (0x43504600) (118)
>>>>>>>
>>>>>>>
>>>>>>> Any ideas?
>>>>
>>>> Not the slightest, but I'm still investigating problems with that on
>>>> amdgpu.
>>>>
>>>> If we can't find the root cause till Monday it might be a good idea to
>>>> revert the patches for now.
>>>
>>> What's the status on that?
>>
>> I've found and fixed the remaining kernel bugs over the last
>> weekend/beginning of this week.
>>
>> Still need to commit the fix for UVD/VCE, but that one shouldn't
>> affect GFX at all.
>
> Michel is seeing hangs on the radeon KMD, which should be unaffected
> by your kernel work I think.
>
> We could revert this to unbreak Michel's Kaveri, but I think it
> shouldn't be so difficult to find the culprit in this patch if there
> is one.

The only crux is that the userspace patch shouldn't affect radeon at
all. So the real question is what the heck is going on here?

Christian.

> Marek

>> Christian.






Re: [Mesa-dev] [PATCH 1/2] radeonsi: set a per-buffer flag that disables inter-process sharing (v4)

2017-09-08 Thread Michel Dänzer
On 07/09/17 07:24 PM, Christian König wrote:
> On 07.09.2017 at 12:14, Marek Olšák wrote:
>> On Sep 7, 2017 12:08 PM, "Christian König" wrote:
>>> On 07.09.2017 at 11:23, Michel Dänzer wrote:
>>>> On 01/09/17 07:40 PM, Christian König wrote:
>>>>> On 01.09.2017 at 12:28, Michel Dänzer wrote:
>>>>>> On 01/09/17 07:23 PM, Nicolai Hähnle wrote:
>>>>>>> On 01.09.2017 11:58, Michel Dänzer wrote:
>>>>>>>> On 29/08/17 11:47 PM, Christian König wrote:
>>>>>>>>> From: Marek Olšák
>>>>>>>>>
>>>>>>>>> For lower overhead in the CS ioctl.
>>>>>>>>> Winsys allocators are not used with interprocess-sharable resources.
>>>>>>>>>
>>>>>>>>> v2: It shouldn't crash anymore, but the kernel will reject the new
>>>>>>>>> flag.
>>>>>>>>> v3 (christian): Rename the flag, avoid sending those buffers in the
>>>>>>>>> BO list.
>>>>>>>>> v4 (christian): Remove setting the kernel flag for now
>>>>>>>>
>>>>>>>> This change seems to have caused a GPU hang when running piglit on my
>>>>>>>> Kaveri with the radeon kernel driver.
>>>>
>>>> I think we can remove "seems to have". I'm still reliably getting the
>>>> GPUVM fault and hang with current master, but not if I revert this
>>>> commit (and the one after it).
>>>>
>>>>>>>> Haven't been able to isolate it to a specific test, seems to only
>>>>>>>> happen when running multiple tests concurrently.
>>>>
>>>> I reproduced the problem with piglit process separation enabled as well,
>>>> and all four tests running when it hung were textureGather tests.
>>>> Before, reproducing the problem twice with piglit process separation
>>>> disabled, three textureGather tests were running when it hung both times
>>>> as well. I've been unable to reproduce the problem by manually running
>>>> the same textureGather tests in parallel though.
>>>>
>>>>>>>> There's a GPUVM fault before the hang, I suspect it's related:
>>>>>>>>
>>>>>>>> radeon :00:01.0: GPU fault detected: 146 0x0ae6760c
>>>>>>>> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x01D7
>>>>>>>> radeon :00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0607600C
>>>>>>>> VM fault (0x0c, vmid 3) at page 471, read from 'CPF' (0x43504600) (118)
>>>>>>>>
>>>>>>>>
>>>>>>>> Any ideas?
>>>>>
>>>>> Not the slightest, but I'm still investigating problems with that on
>>>>> amdgpu.
>>>>>
>>>>> If we can't find the root cause till Monday it might be a good idea to
>>>>> revert the patches for now.
>>>>
>>>> What's the status on that?
>>>
>>> I've found and fixed the remaining kernel bugs over the last
>>> weekend/beginning of this week.
>>>
>>> Still need to commit the fix for UVD/VCE, but that one shouldn't
>>> affect GFX at all.
>>
>> Michel is seeing hangs on the radeon KMD, which should be unaffected
>> by your kernel work I think.
>>
>> We could revert this to unbreak Michel's Kaveri,

FWIW, there's no need to do anything for my Kaveri development system in
particular; it's going out of service soon, and in the meantime I can
revert these changes locally.

My concern is that the underlying issue might cause other problems in
real world scenarios.


>> but I think it shouldn't be so difficult to find the culprit in this
>> patch if there is one.
> 
> The only crux is that the userspace patch shouldn't affect radeon at
> all. So the real question is what the heck is going on here?

Maybe some buffers that were previously allocated directly are now
sub-allocated or re-used from the BO cache, or vice versa, or something
like that?
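
One way to check this hypothesis against the diff is to compare the flag gate in radeon_get_heap_index() before and after the patch. The sketch below models only that gate, under stated assumptions: `gate_before` and `gate_after` are hypothetical names, the value of RADEON_FLAG_GTT_WC is assumed, and the per-domain switch that follows the gate in the real function is left out because the quoted hunk does not show it.

```c
#include <assert.h>
#include <stdbool.h>

/* Bits 1..4 are taken from the radeon_winsys.h hunk; bit 0 is assumed. */
enum radeon_bo_flag {
    RADEON_FLAG_GTT_WC                  = (1 << 0), /* assumed */
    RADEON_FLAG_NO_CPU_ACCESS           = (1 << 1),
    RADEON_FLAG_NO_SUBALLOC             = (1 << 2),
    RADEON_FLAG_SPARSE                  = (1 << 3),
    RADEON_FLAG_NO_INTERPROCESS_SHARING = (1 << 4),
};

/* Gate before the patch: any flag outside GTT_WC / NO_CPU_ACCESS made
 * radeon_get_heap_index() return -1, i.e. no cached heap. */
bool gate_before(unsigned flags)
{
    return !(flags & ~(RADEON_FLAG_GTT_WC | RADEON_FLAG_NO_CPU_ACCESS));
}

/* Gate after the patch: the no-sharing flag must be set, and it is
 * added to the list of flags the cached heaps accept. */
bool gate_after(unsigned flags)
{
    if (!(flags & RADEON_FLAG_NO_INTERPROCESS_SHARING))
        return false;

    return !(flags & ~(RADEON_FLAG_GTT_WC |
                       RADEON_FLAG_NO_CPU_ACCESS |
                       RADEON_FLAG_NO_INTERPROCESS_SHARING));
}
```

In this model a buffer whose flags are just RADEON_FLAG_GTT_WC passed the gate before the patch and fails it afterwards unless the caller also sets RADEON_FLAG_NO_INTERPROCESS_SHARING. So any allocation site that does not set the new flag would change allocation path. Whether such a site exists in the radeon winsys cannot be told from the portion of the diff quoted in this thread.
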


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer