On 2025-10-03 14:34, Chen, Xiaogang wrote:

On 10/3/2025 1:27 PM, Philip Yang wrote:

On 2025-10-03 14:22, Chen, Xiaogang wrote:
[AMD Official Use Only - AMD Internal Distribution Only]

MADV_FREE is handled by madvise_free_single_vma() (via madvise_dontneed_free()), called from madvise_vma_behavior() in mm/madvise.c.

It calls mmu_notifier_invalidate_range_start(&range) with MMU_NOTIFY_CLEAR to notify the registered VM intervals. I am not sure how the driver could fail to receive that notification. Is there something else causing the problem?

I guess we got that notification. That's why we're trying to restore the range later in the first place.



I cannot reproduce the vma-not-found issue with madvise MADV_FREE on the Ubuntu 6.16 kernel; this issue was reported on a customized older kernel version.

Regards,

One principle of HMM is that the GPU VM is a mirror of the CPU VM: whatever happens in the CPU VM needs to be reflected in the GPU VM. If something goes wrong in HMM and the driver does not get the notification from the MMU notifier, the driver needs to update the GPU VM and also the corresponding prange. So besides unmapping from the GPU VM, I think the driver also needs to split the prange to remove the corresponding VM range from it.

I would agree. It's strange that the vma disappeared without sending us an MMU_NOTIFY_UNMAP. If we add a workaround here that unmaps things for missing VMAs, we're potentially leaking the prange structure and any associated svm_bo, because I don't think we can expect that someone else will later send us the MMU_NOTIFY_UNMAP.

If we cannot reproduce this problem in an upstream kernel, we should not include a workaround for it. If we can reproduce it in an upstream kernel, we should discuss the proper solution with the HMM maintainers. It may not be in our driver.

The only place where we can put such a workaround for a bug that may be specific to an old or modified customer kernel would be our DKMS branch. And even there I would be hesitant with a workaround for a problem that's not fully understood. It makes the branch harder to maintain, and may be broken at any time if we can't test it.

Regards,
  Felix



Regards

Xiaogang


Philip


Regards

Xiaogang

-----Original Message-----
From: Yang, Philip <[email protected]>
Sent: Friday, October 3, 2025 1:15 PM
To: [email protected]
Cc: Kuehling, Felix <[email protected]>; Kasiviswanathan, Harish <[email protected]>; Chen, Xiaogang <[email protected]>; Yang, Philip <[email protected]>
Subject: [PATCH v2 3/3] drm/amdkfd: Don't get stuck in svm restore worker

If the vma is not found, the application has freed the memory using madvise MADV_FREE, but the driver did not receive the unmap from the CPU MMU notifier callback, so the memory is still mapped on the GPUs. The svm restore worker then reschedules itself to retry forever, the user queues are never resumed, and the application hangs waiting for the queues to finish.

The svm restore worker should instead unmap the memory range from the GPUs and then resume the queues. If a GPU page fault happens on the unmapped address afterwards, it is an application use-after-free bug.

Signed-off-by: Philip Yang <[email protected]>
---
  drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 75 ++++++++++++++--------------
  1 file changed, 38 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 0aadd20be56a..e87c9b3533b9 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -1708,50 +1708,51 @@ static int svm_range_validate_and_map(struct mm_struct *mm,
                 bool readonly;

                 vma = vma_lookup(mm, addr);
-               if (vma) {
-                       readonly = !(vma->vm_flags & VM_WRITE);
+               next = vma ? min(vma->vm_end, end) : end;
+               npages = (next - addr) >> PAGE_SHIFT;

-                       next = min(vma->vm_end, end);
-                       npages = (next - addr) >> PAGE_SHIFT;
+               if (!vma || !(vma->vm_flags & VM_READ)) {
                        /* HMM requires at least READ permissions. If provided with PROT_NONE,
                         * unmap the memory. If it's not already mapped, this is a no-op
                         * If PROT_WRITE is provided without READ, warn first then unmap
+                        * If vma is not found, addr is invalid, unmap from GPUs
                          */
-                       if (!(vma->vm_flags & VM_READ)) {
-                               unsigned long e, s;
-
-                               svm_range_lock(prange);
-                               if (vma->vm_flags & VM_WRITE)
-                                       pr_debug("VM_WRITE without VM_READ is not supported");
-                               s = max(addr >> PAGE_SHIFT, prange->start);
-                               e = s + npages - 1;
-                               r = svm_range_unmap_from_gpus(prange, s, e,
-                                                             KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU);
-                               svm_range_unlock(prange);
-                               /* If unmap returns non-zero, we'll bail on the next for loop
-                                * iteration, so just leave r and continue
-                                */
-                               addr = next;
-                               continue;
-                       }
+                       unsigned long e, s;
+
+                       svm_range_lock(prange);
+                       if (!vma)
+                               pr_debug("vma not found\n");
+                       else if (vma->vm_flags & VM_WRITE)
+                               pr_debug("VM_WRITE without VM_READ is not supported");
+
+                       s = max(addr >> PAGE_SHIFT, prange->start);
+                       e = s + npages - 1;
+                       r = svm_range_unmap_from_gpus(prange, s, e,
+                                                     KFD_SVM_UNMAP_TRIGGER_UNMAP_FROM_CPU);
+                       svm_range_unlock(prange);
+                       /* If unmap returns non-zero, we'll bail on the next for loop
+                        * iteration, so just leave r and continue
+                        */
+                       addr = next;
+                       continue;
+               }

-                       hmm_range = kzalloc(sizeof(*hmm_range), GFP_KERNEL);
-                       if (unlikely(!hmm_range)) {
-                               r = -ENOMEM;
-                       } else {
-                               WRITE_ONCE(p->svms.faulting_task, current);
-                               r = amdgpu_hmm_range_get_pages(&prange->notifier, addr, npages,
-                                                              readonly, owner,
-                                                              hmm_range);
-                               WRITE_ONCE(p->svms.faulting_task, NULL);
-                               if (r) {
-                                       kfree(hmm_range);
-                                       hmm_range = NULL;
-                                       pr_debug("failed %d to get svm range pages\n", r);
-                               }
-                       }
+               readonly = !(vma->vm_flags & VM_WRITE);
+
+               hmm_range = kzalloc(sizeof(*hmm_range), GFP_KERNEL);
+               if (unlikely(!hmm_range)) {
+                       r = -ENOMEM;
                 } else {
-                       r = -EFAULT;
+                       WRITE_ONCE(p->svms.faulting_task, current);
+                       r = amdgpu_hmm_range_get_pages(&prange->notifier, addr, npages,
+                                                      readonly, owner,
+                                                      hmm_range);
+                       WRITE_ONCE(p->svms.faulting_task, NULL);
+                       if (r) {
+                               kfree(hmm_range);
+                               hmm_range = NULL;
+                               pr_debug("failed %d to get svm range pages\n", r);
+                       }
                 }

                 if (!r) {
--
2.49.0
