Hello Stanley.Yang,

The patch b1338a8e71ac: "drm/amdgpu: Workaround to skip kiq ring test
during ras gpu recovery" from Oct 17, 2023 (linux-next), leads to the
following Smatch static checker warning:

	drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c:513 amdgpu_get_xgmi_hive()
	warn: sleeping in atomic context

drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
    500 struct amdgpu_hive_info *amdgpu_get_xgmi_hive(struct amdgpu_device *adev)
    501 {
    502         struct amdgpu_hive_info *hive = NULL;
    503         int ret;
    504
    505         if (!adev->gmc.xgmi.hive_id)
    506                 return NULL;
    507
    508         if (adev->hive) {
    509                 kobject_get(&adev->hive->kobj);
    510                 return adev->hive;
    511         }
    512
--> 513         mutex_lock(&xgmi_mutex);

The patch adds a new caller amdgpu_gfx_disable_kcq() which is holding
spin_lock(&kiq->ring_lock).  And we can't take a mutex if we're already
holding a spin_lock.  Turn on CONFIG_DEBUG_ATOMIC_SLEEP to see the
warning.

regards,
dan carpenter
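
For illustration only, the locking rule above can be modelled in a small
userspace C sketch.  Nothing here is kernel or amdgpu code: every toy_*
name and both counters are hypothetical stand-ins, and the check only
loosely mimics the kind of complaint CONFIG_DEBUG_ATOMIC_SLEEP produces.

```c
#include <assert.h>
#include <stdio.h>

/* Toy model, NOT the kernel API: all names below are hypothetical. */
static int atomic_depth;      /* > 0 while a toy "spinlock" is held */
static int sleep_violations;  /* number of sleeping calls made while atomic */

static void toy_spin_lock(void)
{
	atomic_depth++;           /* entering atomic context */
}

static void toy_spin_unlock(void)
{
	assert(atomic_depth > 0); /* unlock must pair with an earlier lock */
	atomic_depth--;
}

/* Stand-in for a "may sleep" check: complain if called while atomic. */
static void toy_might_sleep(const char *where)
{
	if (atomic_depth > 0) {
		sleep_violations++;
		fprintf(stderr, "toy BUG: sleeping call in atomic context at %s\n",
			where);
	}
}

/* A mutex may sleep while waiting, so taking one runs the check first. */
static void toy_mutex_lock(const char *where)
{
	toy_might_sleep(where);
}

static void toy_mutex_unlock(void)
{
}

/* Plays the role of the callee that takes a mutex. */
static void toy_get_hive(void)
{
	toy_mutex_lock("toy_get_hive");
	toy_mutex_unlock();
}

/* Plays the role of the new caller that holds a spinlock around the call. */
static void toy_disable_kcq(void)
{
	toy_spin_lock();
	toy_get_hive();           /* flagged: mutex taken under a spinlock */
	toy_spin_unlock();
}
```

Calling toy_get_hive() on its own records no violation; calling it via
toy_disable_kcq() records one, which is the same shape as the path Smatch
flagged.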