Re: [PATCH v9 2/2] drm/amdgpu: sync page table freeing with tlb flush
Am 18.03.24 um 17:11 schrieb Shashank Sharma: The idea behind this patch is to delay the freeing of PT entry objects until the TLB flush is done. This patch: - Adds a tlb_flush_waitlist in amdgpu_vm_update_params which will keep the objects that need to be freed after tlb_flush. - Adds PT entries in this list in amdgpu_vm_ptes_update after finding the PT entry. - Changes functionality of amdgpu_vm_pt_free_dfs from (df_search + free) to simply freeing of the BOs, also renames it to amdgpu_vm_pt_free_list to reflect this same. - Exports function amdgpu_vm_pt_free_list to be called directly. - Calls amdgpu_vm_pt_free_list directly from amdgpu_vm_update_range. V2: rebase V4: Addressed review comments from Christian - add only locked PTEs entries in TLB flush waitlist. - do not create a separate function for list flush. - do not create a new lock for TLB flush. - there is no need to wait on tlb_flush_fence exclusively. V5: Addressed review comments from Christian - change the amdgpu_vm_pt_free_dfs's functionality to simple freeing of the objects and rename it. - add all the PTE objects in params->tlb_flush_waitlist - let amdgpu_vm_pt_free_root handle the freeing of BOs independently - call amdgpu_vm_pt_free_list directly V6: Rebase V7: Rebase V8: Added a NULL check to fix this backtrace issue: [ 415.351447] BUG: kernel NULL pointer dereference, address: 0008 [ 415.359245] #PF: supervisor write access in kernel mode [ 415.365081] #PF: error_code(0x0002) - not-present page [ 415.370817] PGD 101259067 P4D 101259067 PUD 10125a067 PMD 0 [ 415.377140] Oops: 0002 [#1] PREEMPT SMP NOPTI [ 415.382004] CPU: 0 PID: 25481 Comm: test_with_MPI.e Tainted: G OE 5.18.2-mi300-build-140423-ubuntu-22.04+ #24 [ 415.394437] Hardware name: AMD Corporation Sh51p/Sh51p, BIOS RMO1001AS 02/21/2024 [ 415.402797] RIP: 0010:amdgpu_vm_ptes_update+0x6fd/0xa10 [amdgpu] [ 415.409648] Code: 4c 89 ff 4d 8d 66 30 e8 f1 ed ff ff 48 85 db 74 42 48 39 5d a0 74 40 48 8b 53 20 48 8b 4b 18 48 8d 43 18 48 8d 75 b0 4c 89 ff <48 89 51 08 48 89 0a 49 8b 56 30 48 89 42 08 48 89 53 18 4c 89 63 [ 415.430621] RSP: 0018:c9000401f990 EFLAGS: 00010287 [ 415.436456] RAX: 888147bb82f0 RBX: 888147bb82d8 RCX: [ 415.26] RDX: RSI: c9000401fa30 RDI: 888161f8 [ 415.452397] RBP: c9000401fa80 R08: R09: c9000401fa00 [ 415.460368] R10: 0007f0cc R11: 0007f0c85000 R12: c9000401fb20 [ 415.468340] R13: 0007f0d0 R14: c9000401faf0 R15: 888161f8 [ 415.476312] FS: 7f132ff89840() GS:889f87c0() knlGS: [ 415.485350] CS: 0010 DS: ES: CR0: 80050033 [ 415.491767] CR2: 0008 CR3: 000161d46003 CR4: 00770ef0 [ 415.499738] PKRU: 5554 [ 415.502750] Call Trace: [ 415.505482] [ 415.507825] amdgpu_vm_update_range+0x32a/0x880 [amdgpu] [ 415.513869] amdgpu_vm_clear_freed+0x117/0x250 [amdgpu] [ 415.519814] amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu+0x18c/0x250 [amdgpu] [ 415.527729] kfd_ioctl_unmap_memory_from_gpu+0xed/0x340 [amdgpu] [ 415.534551] kfd_ioctl+0x3b6/0x510 [amdgpu] V9: Addressed review comments from Christian - No NULL check reqd for root PT freeing - Free PT list regardless of needs_flush - Move adding BOs in list in a separate function Cc: Christian König Cc: Alex Deucher Cc: Felix Kuehling Cc: Rajneesh Bhardwaj Acked-by: Felix Kuehling Acked-by: Rajneesh Bhardwaj Tested-by: Rajneesh Bhardwaj Signed-off-by: Shashank Sharma Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 3 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h| 7 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 66 +++ 3 files changed, 53 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 104bf600c85f..8fada1152664 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -986,6 +986,7 @@ int amdgpu_vm_update_range(struct amdgpu_device *adev, struct amdgpu_vm *vm, params.unlocked = unlocked; params.needs_flush = flush_tlb; params.allow_override = allow_override; + INIT_LIST_HEAD(_flush_waitlist); /* Implicitly sync to command submissions in the same VM before * unmapping. Sync to moving fences before mapping. @@ -1076,6 +1077,8 @@ int amdgpu_vm_update_range(struct amdgpu_device *adev, struct amdgpu_vm *vm, tlb_cb = NULL; } + amdgpu_vm_pt_free_list(adev, ); + error_free: kfree(tlb_cb); amdgpu_vm_eviction_unlock(vm); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h index b0a4fe683352..54d7da396de0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h +++
[PATCH v9 2/2] drm/amdgpu: sync page table freeing with tlb flush
The idea behind this patch is to delay the freeing of PT entry objects until the TLB flush is done. This patch: - Adds a tlb_flush_waitlist in amdgpu_vm_update_params which will keep the objects that need to be freed after tlb_flush. - Adds PT entries in this list in amdgpu_vm_ptes_update after finding the PT entry. - Changes functionality of amdgpu_vm_pt_free_dfs from (df_search + free) to simply freeing of the BOs, also renames it to amdgpu_vm_pt_free_list to reflect this same. - Exports function amdgpu_vm_pt_free_list to be called directly. - Calls amdgpu_vm_pt_free_list directly from amdgpu_vm_update_range. V2: rebase V4: Addressed review comments from Christian - add only locked PTEs entries in TLB flush waitlist. - do not create a separate function for list flush. - do not create a new lock for TLB flush. - there is no need to wait on tlb_flush_fence exclusively. V5: Addressed review comments from Christian - change the amdgpu_vm_pt_free_dfs's functionality to simple freeing of the objects and rename it. - add all the PTE objects in params->tlb_flush_waitlist - let amdgpu_vm_pt_free_root handle the freeing of BOs independently - call amdgpu_vm_pt_free_list directly V6: Rebase V7: Rebase V8: Added a NULL check to fix this backtrace issue: [ 415.351447] BUG: kernel NULL pointer dereference, address: 0008 [ 415.359245] #PF: supervisor write access in kernel mode [ 415.365081] #PF: error_code(0x0002) - not-present page [ 415.370817] PGD 101259067 P4D 101259067 PUD 10125a067 PMD 0 [ 415.377140] Oops: 0002 [#1] PREEMPT SMP NOPTI [ 415.382004] CPU: 0 PID: 25481 Comm: test_with_MPI.e Tainted: G OE 5.18.2-mi300-build-140423-ubuntu-22.04+ #24 [ 415.394437] Hardware name: AMD Corporation Sh51p/Sh51p, BIOS RMO1001AS 02/21/2024 [ 415.402797] RIP: 0010:amdgpu_vm_ptes_update+0x6fd/0xa10 [amdgpu] [ 415.409648] Code: 4c 89 ff 4d 8d 66 30 e8 f1 ed ff ff 48 85 db 74 42 48 39 5d a0 74 40 48 8b 53 20 48 8b 4b 18 48 8d 43 18 48 8d 75 b0 4c 89 ff <48 > 89 51 08 48 89 0a 49 8b 56 30 48 89 42 08 48 89 53 18 4c 89 63 [ 415.430621] RSP: 0018:c9000401f990 EFLAGS: 00010287 [ 415.436456] RAX: 888147bb82f0 RBX: 888147bb82d8 RCX: [ 415.26] RDX: RSI: c9000401fa30 RDI: 888161f8 [ 415.452397] RBP: c9000401fa80 R08: R09: c9000401fa00 [ 415.460368] R10: 0007f0cc R11: 0007f0c85000 R12: c9000401fb20 [ 415.468340] R13: 0007f0d0 R14: c9000401faf0 R15: 888161f8 [ 415.476312] FS: 7f132ff89840() GS:889f87c0() knlGS: [ 415.485350] CS: 0010 DS: ES: CR0: 80050033 [ 415.491767] CR2: 0008 CR3: 000161d46003 CR4: 00770ef0 [ 415.499738] PKRU: 5554 [ 415.502750] Call Trace: [ 415.505482] [ 415.507825] amdgpu_vm_update_range+0x32a/0x880 [amdgpu] [ 415.513869] amdgpu_vm_clear_freed+0x117/0x250 [amdgpu] [ 415.519814] amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu+0x18c/0x250 [amdgpu] [ 415.527729] kfd_ioctl_unmap_memory_from_gpu+0xed/0x340 [amdgpu] [ 415.534551] kfd_ioctl+0x3b6/0x510 [amdgpu] V9: Addressed review comments from Christian - No NULL check reqd for root PT freeing - Free PT list regardless of needs_flush - Move adding BOs in list in a separate function Cc: Christian König Cc: Alex Deucher Cc: Felix Kuehling Cc: Rajneesh Bhardwaj Acked-by: Felix Kuehling Acked-by: Rajneesh Bhardwaj Tested-by: Rajneesh Bhardwaj Signed-off-by: Shashank Sharma --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c| 3 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h| 7 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 66 +++ 3 files changed, 53 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 104bf600c85f..8fada1152664 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -986,6 +986,7 @@ int amdgpu_vm_update_range(struct amdgpu_device *adev, struct amdgpu_vm *vm, params.unlocked = unlocked; params.needs_flush = flush_tlb; params.allow_override = allow_override; + INIT_LIST_HEAD(_flush_waitlist); /* Implicitly sync to command submissions in the same VM before * unmapping. Sync to moving fences before mapping. @@ -1076,6 +1077,8 @@ int amdgpu_vm_update_range(struct amdgpu_device *adev, struct amdgpu_vm *vm, tlb_cb = NULL; } + amdgpu_vm_pt_free_list(adev, ); + error_free: kfree(tlb_cb); amdgpu_vm_eviction_unlock(vm); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h index b0a4fe683352..54d7da396de0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h @@ -266,6 +266,11 @@ struct amdgpu_vm_update_params {