Re: [PATCH 5/6] drm/amdkfd: enable subsequent retry fault
On 2021-04-20 9:22 p.m., Felix Kuehling wrote:
> On 2021-04-20 at 4:21 p.m., Philip Yang wrote:
>> After draining a stale retry fault, or after failing to validate the
>> range for recovery, remove the fault address from the fault filter ring
>> so that a subsequent retry interrupt on the same address can be handled.
>> Otherwise the retry fault will not be processed for recovery until the
>> timeout has passed.
>>
>> Signed-off-by: Philip Yang
>
> Patches 1-3 and patch 5 are
>
> Reviewed-by: Felix Kuehling
>
> I didn't see a patch 6. Was the email lost or intentionally not sent?

Patch 6/6 is Alex's patch to create unregistered ranges, which is still
under code review. I cherry-picked it on top of my patches for testing.

Thanks.
Philip

>> ---
>>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> index 45dd055118eb..d90e0cb6e573 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
>> @@ -2262,8 +2262,10 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>>
>>  	mutex_lock(&prange->migrate_mutex);
>>
>> -	if (svm_range_skip_recover(prange))
>> +	if (svm_range_skip_recover(prange)) {
>> +		amdgpu_gmc_filter_faults_remove(adev, addr, pasid);
>>  		goto out_unlock_range;
>> +	}
>>
>>  	timestamp = ktime_to_us(ktime_get()) - prange->validate_timestamp;
>>  	/* skip duplicate vm fault on different pages of same range */
>> @@ -2325,6 +2327,7 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>>
>>  	if (r == -EAGAIN) {
>>  		pr_debug("recover vm fault later\n");
>> +		amdgpu_gmc_filter_faults_remove(adev, addr, pasid);
>>  		r = 0;
>>  	}
>>  	return r;

___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
Re: [PATCH 5/6] drm/amdkfd: enable subsequent retry fault
On 2021-04-20 at 4:21 p.m., Philip Yang wrote:
> After draining a stale retry fault, or after failing to validate the
> range for recovery, remove the fault address from the fault filter ring
> so that a subsequent retry interrupt on the same address can be handled.
> Otherwise the retry fault will not be processed for recovery until the
> timeout has passed.
>
> Signed-off-by: Philip Yang

Patches 1-3 and patch 5 are

Reviewed-by: Felix Kuehling

I didn't see a patch 6. Was the email lost or intentionally not sent?

> ---
>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> index 45dd055118eb..d90e0cb6e573 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
> @@ -2262,8 +2262,10 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>
>  	mutex_lock(&prange->migrate_mutex);
>
> -	if (svm_range_skip_recover(prange))
> +	if (svm_range_skip_recover(prange)) {
> +		amdgpu_gmc_filter_faults_remove(adev, addr, pasid);
>  		goto out_unlock_range;
> +	}
>
>  	timestamp = ktime_to_us(ktime_get()) - prange->validate_timestamp;
>  	/* skip duplicate vm fault on different pages of same range */
> @@ -2325,6 +2327,7 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
>
>  	if (r == -EAGAIN) {
>  		pr_debug("recover vm fault later\n");
> +		amdgpu_gmc_filter_faults_remove(adev, addr, pasid);
>  		r = 0;
>  	}
>  	return r;
[PATCH 5/6] drm/amdkfd: enable subsequent retry fault
After draining a stale retry fault, or after failing to validate the
range for recovery, remove the fault address from the fault filter ring
so that a subsequent retry interrupt on the same address can be handled.
Otherwise the retry fault will not be processed for recovery until the
timeout has passed.

Signed-off-by: Philip Yang
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 45dd055118eb..d90e0cb6e573 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2262,8 +2262,10 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
 
 	mutex_lock(&prange->migrate_mutex);
 
-	if (svm_range_skip_recover(prange))
+	if (svm_range_skip_recover(prange)) {
+		amdgpu_gmc_filter_faults_remove(adev, addr, pasid);
 		goto out_unlock_range;
+	}
 
 	timestamp = ktime_to_us(ktime_get()) - prange->validate_timestamp;
 	/* skip duplicate vm fault on different pages of same range */
@@ -2325,6 +2327,7 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid,
 
 	if (r == -EAGAIN) {
 		pr_debug("recover vm fault later\n");
+		amdgpu_gmc_filter_faults_remove(adev, addr, pasid);
 		r = 0;
 	}
 	return r;
-- 
2.17.1
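
For readers unfamiliar with the fault filter, the behaviour the commit
message describes can be sketched with a small userspace model. This is
only an illustration under stated assumptions, not the amdgpu code: the
toy_* names, the ring size and the timeout value are all hypothetical,
and the real filter in the driver is organised differently. The sketch
shows why an entry left in the filter suppresses later faults on the same
(addr, pasid) pair until the timeout expires, and how removing the entry
lets the next retry fault through immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical values for illustration; not taken from the driver. */
#define TOY_RING_SIZE  8
#define TOY_TIMEOUT_US 1000

struct toy_fault {
	uint64_t addr;
	uint16_t pasid;
	uint64_t timestamp_us;
	bool valid;
};

struct toy_filter {
	struct toy_fault ring[TOY_RING_SIZE];
	unsigned int next;	/* slot that the next new fault overwrites */
};

static void toy_filter_init(struct toy_filter *f)
{
	assert(f);
	memset(f, 0, sizeof(*f));
}

/*
 * Return true if an unexpired entry for (addr, pasid) is already in the
 * ring, meaning the fault is treated as a duplicate and dropped.
 * Otherwise record the fault and return false so the caller handles it.
 */
static bool toy_filter_faults(struct toy_filter *f, uint64_t addr,
			      uint16_t pasid, uint64_t now_us)
{
	unsigned int i;

	assert(f);
	for (i = 0; i < TOY_RING_SIZE; i++) {
		const struct toy_fault *e = &f->ring[i];

		if (e->valid && e->addr == addr && e->pasid == pasid &&
		    now_us - e->timestamp_us < TOY_TIMEOUT_US)
			return true;
	}

	f->ring[f->next] = (struct toy_fault){
		.addr = addr,
		.pasid = pasid,
		.timestamp_us = now_us,
		.valid = true,
	};
	f->next = (f->next + 1) % TOY_RING_SIZE;
	return false;
}

/* Invalidate every entry for (addr, pasid) so the next fault is handled. */
static void toy_filter_faults_remove(struct toy_filter *f, uint64_t addr,
				     uint16_t pasid)
{
	unsigned int i;

	assert(f);
	for (i = 0; i < TOY_RING_SIZE; i++) {
		struct toy_fault *e = &f->ring[i];

		if (e->valid && e->addr == addr && e->pasid == pasid)
			e->valid = false;
	}
}
```

In this model, a first call to toy_filter_faults() for a given address
returns false (the fault is handled), and a second call within the timeout
returns true (the fault is dropped). Calling toy_filter_faults_remove() in
between, as the patch does on the skip-recover and -EAGAIN paths, makes
the second call return false again, so the retry is processed without
waiting for the timeout.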