drm_sched_entity_flush() may kill the VM entities under certain
conditions. KFD then needs to run kfd_process_wq_release() to release
the associated resources, and this can cause subsequent job submissions
from the process to fail:

[ 3976.788183] [drm:amddrm_sched_entity_push_job [amd_sched]] *ERROR* Trying to push to a killed entity
Or
[  129.600916] [drm:amdgpu_job_submit [amdgpu]] *ERROR* Trying to push to a killed entity

Return early from amdgpu_flush() for compute contexts so that their VM
entities are not flushed, and therefore not killed, on this path.

Signed-off-by: Gang Ba <[email protected]>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 +++
 1 file changed, 3 insertions(+)
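
For illustration only, the effect of the early return can be sketched in a
small userspace model. Everything below is a hypothetical stand-in (the
mock_* names, the entities_killed flag, the timeout value) and not the real
amdgpu or DRM scheduler code; it assumes the worst case in which the entity
flush always kills the entities.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not the real amdgpu structures. */
struct mock_vm {
    bool is_compute_context;   /* mirrors fpriv->vm.is_compute_context */
};

struct mock_fpriv {
    struct mock_vm vm;
    bool entities_killed;      /* set once the mock flush kills the entities */
};

/* Assumed worst case: flushing always kills the entities. */
static long mock_entity_flush(struct mock_fpriv *fpriv, long timeout)
{
    fpriv->entities_killed = true;
    return timeout;
}

/* Models the patched flush path: compute contexts return before the flush. */
static int mock_flush(struct mock_fpriv *fpriv)
{
    long timeout = 1000;       /* arbitrary value for the sketch */

    if (fpriv->vm.is_compute_context)
        return 0;

    timeout = mock_entity_flush(fpriv, timeout);
    return timeout < 0 ? (int)timeout : 0;
}

/* Models a later submission: fails once the entities have been killed. */
static int mock_push_job(const struct mock_fpriv *fpriv)
{
    return fpriv->entities_killed ? -1 : 0;
}

/* Small self-check tying the pieces together. */
static void mock_selfcheck(void)
{
    struct mock_fpriv compute = { .vm = { .is_compute_context = true } };

    assert(mock_flush(&compute) == 0);
    assert(mock_push_job(&compute) == 0);
}
```

In this model a graphics context that goes through mock_flush() can no
longer push jobs, while a compute context skips the flush and keeps
submitting.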

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index bebf2ebc4f34..2361c09ddc77 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2997,6 +2997,9 @@ static int amdgpu_flush(struct file *f, fl_owner_t id)
        struct amdgpu_fpriv *fpriv = file_priv->driver_priv;
        long timeout = MAX_WAIT_SCHED_ENTITY_Q_EMPTY;
 
+       if (fpriv->vm.is_compute_context)
+               return 0;
+
        timeout = amdgpu_ctx_mgr_entity_flush(&fpriv->ctx_mgr, timeout);
        timeout = amdgpu_vm_wait_idle(&fpriv->vm, timeout);
 
-- 
2.34.1
