Re: [PATCH] drm/amdgpu: fix sdma v4 ring is disabled accidently
Mhm, good catch.

And yes, using the paging queue when it is available sounds like a good idea to me as well.

So far I've only used it for VM updates to actually test if it works as expected.

Regards,
Christian.

On 19.10.18 at 21:53, Kuehling, Felix wrote:
> [+Christian]
>
> Should the buffer funcs also use the paging ring? I think that would be
> important for being able to clear page tables or migrate a BO while
> handling a page fault.
>
> Regards,
>   Felix
>
> On 2018-10-19 3:13 p.m., Yang, Philip wrote:
>> For sdma v4, there is a bug caused by
>> commit d4e869b6b5d6 ("drm/amdgpu: add ring test for page queue"):
>>
>> the local variable ring is reused and changed, so
>> amdgpu_ttm_set_buffer_funcs_status(adev, true)
>> is skipped accidentally. As a result, amdgpu_fill_buffer() will fail, kernel
>> message:
>>
>> [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>> [ 25.260444] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>> [ 25.260627] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>> [ 25.290119] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>> [ 25.290370] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>> [ 25.319971] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
>> [ 25.320486] amdgpu :19:00.0: [mmhub] VMC page fault (src_id:0 ring:154 vmid:8 pasid:32768, for process pid 0 thread pid 0)
>> [ 25.320533] amdgpu :19:00.0: in page starting at address 0x from 18
>> [ 25.320563] amdgpu :19:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00800134
>>
>> Change-Id: Idacdf8e60557edb0a4a499aa4051b75d87ce4091
>> Signed-off-by: Philip Yang
>> ---
>>  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 7 ++++---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>> index ede149a..cd368ac 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
>> @@ -1151,10 +1151,11 @@ static int sdma_v4_0_start(struct amdgpu_device *adev)
>>  		}
>>
>>  		if (adev->sdma.has_page_queue) {
>> -			ring = &adev->sdma.instance[i].page;
>> -			r = amdgpu_ring_test_ring(ring);
>> +			struct amdgpu_ring *page = &adev->sdma.instance[i].page;
>> +
>> +			r = amdgpu_ring_test_ring(page);
>>  			if (r) {
>> -				ring->ready = false;
>> +				page->ready = false;
>>  				return r;
>>  			}
>>  		}

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH] drm/amdgpu: fix sdma v4 ring is disabled accidently
Reviewed-by: Alex Deucher

From: amd-gfx on behalf of Yang, Philip
Sent: Friday, October 19, 2018 3:13:56 PM
To: amd-gfx@lists.freedesktop.org
Cc: Yang, Philip
Subject: [PATCH] drm/amdgpu: fix sdma v4 ring is disabled accidently

For sdma v4, there is a bug caused by
commit d4e869b6b5d6 ("drm/amdgpu: add ring test for page queue"):

the local variable ring is reused and changed, so
amdgpu_ttm_set_buffer_funcs_status(adev, true)
is skipped accidentally. As a result, amdgpu_fill_buffer() will fail, kernel
message:

[drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
[ 25.260444] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
[ 25.260627] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
[ 25.290119] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
[ 25.290370] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
[ 25.319971] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
[ 25.320486] amdgpu :19:00.0: [mmhub] VMC page fault (src_id:0 ring:154 vmid:8 pasid:32768, for process pid 0 thread pid 0)
[ 25.320533] amdgpu :19:00.0: in page starting at address 0x from 18
[ 25.320563] amdgpu :19:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00800134

Change-Id: Idacdf8e60557edb0a4a499aa4051b75d87ce4091
Signed-off-by: Philip Yang
---
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index ede149a..cd368ac 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -1151,10 +1151,11 @@ static int sdma_v4_0_start(struct amdgpu_device *adev)
 		}

 		if (adev->sdma.has_page_queue) {
-			ring = &adev->sdma.instance[i].page;
-			r = amdgpu_ring_test_ring(ring);
+			struct amdgpu_ring *page = &adev->sdma.instance[i].page;
+
+			r = amdgpu_ring_test_ring(page);
 			if (r) {
-				ring->ready = false;
+				page->ready = false;
 				return r;
 			}
 		}
--
2.7.4
Re: [PATCH] drm/amdgpu: fix sdma v4 ring is disabled accidently
[+Christian]

Should the buffer funcs also use the paging ring? I think that would be
important for being able to clear page tables or migrate a BO while
handling a page fault.

Regards,
  Felix

On 2018-10-19 3:13 p.m., Yang, Philip wrote:
> For sdma v4, there is a bug caused by
> commit d4e869b6b5d6 ("drm/amdgpu: add ring test for page queue"):
>
> the local variable ring is reused and changed, so
> amdgpu_ttm_set_buffer_funcs_status(adev, true)
> is skipped accidentally. As a result, amdgpu_fill_buffer() will fail, kernel
> message:
>
> [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
> [ 25.260444] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
> [ 25.260627] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
> [ 25.290119] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
> [ 25.290370] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
> [ 25.319971] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
> [ 25.320486] amdgpu :19:00.0: [mmhub] VMC page fault (src_id:0 ring:154 vmid:8 pasid:32768, for process pid 0 thread pid 0)
> [ 25.320533] amdgpu :19:00.0: in page starting at address 0x from 18
> [ 25.320563] amdgpu :19:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00800134
>
> Change-Id: Idacdf8e60557edb0a4a499aa4051b75d87ce4091
> Signed-off-by: Philip Yang
> ---
>  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> index ede149a..cd368ac 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> @@ -1151,10 +1151,11 @@ static int sdma_v4_0_start(struct amdgpu_device *adev)
>  		}
>
>  		if (adev->sdma.has_page_queue) {
> -			ring = &adev->sdma.instance[i].page;
> -			r = amdgpu_ring_test_ring(ring);
> +			struct amdgpu_ring *page = &adev->sdma.instance[i].page;
> +
> +			r = amdgpu_ring_test_ring(page);
>  			if (r) {
> -				ring->ready = false;
> +				page->ready = false;
>  				return r;
>  			}
>  		}
[PATCH] drm/amdgpu: fix sdma v4 ring is disabled accidently
For sdma v4, there is a bug caused by
commit d4e869b6b5d6 ("drm/amdgpu: add ring test for page queue"):

the local variable ring is reused and changed, so
amdgpu_ttm_set_buffer_funcs_status(adev, true)
is skipped accidentally. As a result, amdgpu_fill_buffer() will fail, kernel
message:

[drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
[ 25.260444] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
[ 25.260627] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
[ 25.290119] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
[ 25.290370] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
[ 25.319971] [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
[ 25.320486] amdgpu :19:00.0: [mmhub] VMC page fault (src_id:0 ring:154 vmid:8 pasid:32768, for process pid 0 thread pid 0)
[ 25.320533] amdgpu :19:00.0: in page starting at address 0x from 18
[ 25.320563] amdgpu :19:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00800134

Change-Id: Idacdf8e60557edb0a4a499aa4051b75d87ce4091
Signed-off-by: Philip Yang
---
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index ede149a..cd368ac 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -1151,10 +1151,11 @@ static int sdma_v4_0_start(struct amdgpu_device *adev)
 		}

 		if (adev->sdma.has_page_queue) {
-			ring = &adev->sdma.instance[i].page;
-			r = amdgpu_ring_test_ring(ring);
+			struct amdgpu_ring *page = &adev->sdma.instance[i].page;
+
+			r = amdgpu_ring_test_ring(page);
 			if (r) {
-				ring->ready = false;
+				page->ready = false;
 				return r;
 			}
 		}
--
2.7.4