Am 13.09.22 um 22:52 schrieb Felix Kuehling:
Am 2022-09-12 um 08:36 schrieb Christian König:Use DMA_RESV_USAGE_BOOKKEEP for VM page table updates and KFD preemption fence.v2: actually update all usages for KFD Signed-off-by: Christian König <christian.koe...@amd.com> --- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 26 ++++++++++++------- drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c | 3 ++- 2 files changed, 18 insertions(+), 11 deletions(-)diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.cindex f4c49537d837..978d3970b5cc 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c@@ -298,7 +298,7 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct amdgpu_bo *bo,*/ replacement = dma_fence_get_stub(); dma_resv_replace_fences(bo->tbo.base.resv, ef->base.context, - replacement, DMA_RESV_USAGE_READ); + replacement, DMA_RESV_USAGE_BOOKKEEP); dma_fence_put(replacement); return 0; }@@ -1391,8 +1391,9 @@ static int init_kfd_vm(struct amdgpu_vm *vm, void **process_info,ret = dma_resv_reserve_fences(vm->root.bo->tbo.base.resv, 1); if (ret) goto reserve_shared_fail; - amdgpu_bo_fence(vm->root.bo, - &vm->process_info->eviction_fence->base, true); + dma_resv_add_fence(vm->root.bo->tbo.base.resv, + &vm->process_info->eviction_fence->base, + DMA_RESV_USAGE_BOOKKEEP); amdgpu_bo_unreserve(vm->root.bo); /* Update process info */ @@ -1989,9 +1990,9 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu( }if (!amdgpu_ttm_tt_get_usermm(bo->tbo.ttm) && !bo->tbo.pin_count)- amdgpu_bo_fence(bo, - &avm->process_info->eviction_fence->base, - true); + dma_resv_add_fence(bo->tbo.base.resv, + &avm->process_info->eviction_fence->base, + DMA_RESV_USAGE_BOOKKEEP);This removes the implicit dma_resv_reserve_fences that amdgpu_bo_fence used to do. Do we need to add back an explicit dma_resv_reserve_fences somewhere here?
I was pondering the same thought. The extra reserve in amdgpu_bo_fence() is actually just a workaround. So when this here goes boom we really need to fix it in the KFD code.
ret = unreserve_bo_and_vms(&ctx, false, false); goto out;@@ -2760,15 +2761,18 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence **ef)if (mem->bo->tbo.pin_count) continue; - amdgpu_bo_fence(mem->bo, - &process_info->eviction_fence->base, true); + dma_resv_add_fence(mem->bo->tbo.base.resv, + &process_info->eviction_fence->base, + DMA_RESV_USAGE_BOOKKEEP);Same as above.} /* Attach eviction fence to PD / PT BOs */ list_for_each_entry(peer_vm, &process_info->vm_list_head, vm_list_node) { struct amdgpu_bo *bo = peer_vm->root.bo;- amdgpu_bo_fence(bo, &process_info->eviction_fence->base, true);+ dma_resv_add_fence(bo->tbo.base.resv, + &process_info->eviction_fence->base, + DMA_RESV_USAGE_BOOKKEEP);Same as above.} validate_map_fail:@@ -2822,7 +2826,9 @@ int amdgpu_amdkfd_add_gws_to_process(void *info, void *gws, struct kgd_mem **memret = dma_resv_reserve_fences(gws_bo->tbo.base.resv, 1); if (ret) goto reserve_shared_fail; - amdgpu_bo_fence(gws_bo, &process_info->eviction_fence->base, true); + dma_resv_add_fence(gws_bo->tbo.base.resv, + &process_info->eviction_fence->base, + DMA_RESV_USAGE_BOOKKEEP); amdgpu_bo_unreserve(gws_bo); mutex_unlock(&(*mem)->process_info->lock);diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.cindex 1fd3cbca20a2..03ec099d64e0 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_sdma.c@@ -112,7 +112,8 @@ static int amdgpu_vm_sdma_commit(struct amdgpu_vm_update_params *p,swap(p->vm->last_unlocked, tmp); dma_fence_put(tmp); } else { - amdgpu_bo_fence(p->vm->root.bo, f, true); + dma_resv_add_fence(p->vm->root.bo->tbo.base.resv, f, + DMA_RESV_USAGE_BOOKKEEP);Same as above.
This one here should certainly have an associated dma_fence_reserve_fence(). Regards, Christian.
Regards, Felix} if (fence && !p->immediate)