On GPUs with RAS enabled, the error message below is observed when
suspending or shutting down the device. The cause is that the memory
wipe flag is enabled by default for BOs on such GPUs, so these BOs
are wiped via amdgpu_fill_buffer on release. Because the ring is
already off at that point, the wipe fails and this error is thrown.
Add a suspend/shutdown check before wiping memory.

[drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off.
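
For illustration only, the release-time decision can be modelled in
userspace C as below. This is a sketch, not driver code: the model_*
types, the MODEL_* constants and model_should_wipe() are hypothetical
stand-ins for the amdgpu structures and flags, and the real check is
the one in the diff that follows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver types; not the real kernel structs. */
enum model_mem_type { MODEL_PL_SYSTEM, MODEL_PL_VRAM };

/* Hypothetical stand-in for AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE. */
#define MODEL_WIPE_ON_RELEASE (1u << 0)

struct model_device {
	bool in_suspend;	/* device is being suspended */
	bool shutdown;		/* device is being shut down */
};

struct model_bo {
	enum model_mem_type mem_type;
	unsigned int flags;
};

/*
 * Return true only when a wipe should be attempted: the BO lives in
 * VRAM, it carries the wipe-on-release flag, and the device is neither
 * suspending nor shutting down (in which case the ring is already off).
 */
bool model_should_wipe(const struct model_device *adev,
		       const struct model_bo *bo)
{
	assert(adev != NULL && bo != NULL);

	if (bo->mem_type != MODEL_PL_VRAM ||
	    !(bo->flags & MODEL_WIPE_ON_RELEASE) ||
	    adev->in_suspend || adev->shutdown)
		return false;

	return true;
}
```

With both device flags clear, a flagged VRAM BO is wiped as before;
setting either flag skips the wipe and so avoids the error above.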

v2: fix coding style issue

Fixes: e7e7c87a205d ("drm/amdgpu: Wipe all VRAM on free when RAS is enabled")
Signed-off-by: Guchun Chen <guchun.c...@amd.com>
Reviewed-by: Christian König <christian.koe...@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 23c9a60693ee..c712d7f5e8a8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1284,6 +1284,7 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo, uint64_t *vram_mem,
  */
 void amdgpu_bo_release_notify(struct ttm_buffer_object *bo)
 {
+       struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
        struct dma_fence *fence = NULL;
        struct amdgpu_bo *abo;
        int r;
@@ -1303,7 +1304,8 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object *bo)
                amdgpu_amdkfd_remove_fence_on_pt_pd_bos(abo);
 
        if (bo->resource->mem_type != TTM_PL_VRAM ||
-           !(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE))
+           !(abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE) ||
+           adev->in_suspend || adev->shutdown)
                return;
 
        if (WARN_ON_ONCE(!dma_resv_trylock(bo->base.resv)))
-- 
2.17.1
