When trying to unload amdgpu on the SteamDeck (TTY mode), the following set of errors occurs and the system becomes unstable:

[..]
[drm] Initialized amdgpu 3.64.0 for 0000:04:00.0 on minor 0
amdgpu 0000:04:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on gfx_0.0.0 (-110).
amdgpu 0000:04:00.0: amdgpu: ib ring test failed (-110).
[..]
amdgpu 0000:04:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000
amdgpu 0000:04:00.0: amdgpu: Failed to disable gfxoff!
amdgpu 0000:04:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000
amdgpu 0000:04:00.0: amdgpu: Failed to disable gfxoff!
[..]

When the driver initializes the GPU, the PSP validates all the firmware
loaded, and after that, it is not possible to load any other firmware
unless the device is reset. What is happening in the load/unload
situation is that PSP halts the GC engine because it suspects that
something is amiss. To address this issue, this commit ensures that the
GPU is reset (mode 2 reset) in the load/unload sequence.

Suggested-by: Alex Deucher <[email protected]>
Signed-off-by: Rodrigo Siqueira <[email protected]>
---
 drivers/gpu/drm/amd/amdgpu/nv.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/nv.c b/drivers/gpu/drm/amd/amdgpu/nv.c
index 50e77d9b30af..1964aa37c499 100644
--- a/drivers/gpu/drm/amd/amdgpu/nv.c
+++ b/drivers/gpu/drm/amd/amdgpu/nv.c
@@ -543,6 +543,13 @@ static bool nv_need_reset_on_init(struct amdgpu_device *adev)
 {
 	u32 sol_reg;
 
+	/* GFX in the SteamDeck hangs when amdgpu module is reloaded, since the
+	 * firmware is already loaded. To avoid this issue, ensure that the
+	 * device is reset to put the PSP in a good state.
+	 */
+	if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(10, 3, 1))
+		return true;
+
 	if (adev->flags & AMD_IS_APU)
 		return false;
 
-- 
2.51.0
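
Note on check ordering: the new GC 10.3.1 test is placed before the existing AMD_IS_APU early return, so it still fires on an APU. The user-space sketch below only illustrates that ordering. Every `mock_`/`MOCK_` name is a hypothetical stand-in rather than a kernel definition, and the final `sol_reg` check is an assumption about the part of the function that this hunk does not show.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define MOCK_IP_VERSION(maj, min, rev) \
	(((uint32_t)(maj) << 16) | ((uint32_t)(min) << 8) | (uint32_t)(rev))
#define MOCK_IS_APU (1u << 0)

struct mock_device {
	uint32_t gc_version; /* stand-in for amdgpu_ip_version(adev, GC_HWIP, 0) */
	uint32_t flags;      /* stand-in for adev->flags */
	uint32_t sol_reg;    /* stand-in for the register read later in the function */
};

static bool mock_need_reset_on_init(const struct mock_device *dev)
{
	/* Check added by the patch: GC 10.3.1 always asks for a reset. */
	if (dev->gc_version == MOCK_IP_VERSION(10, 3, 1))
		return true;

	/* Pre-existing early return: other APUs never ask for a reset here. */
	if (dev->flags & MOCK_IS_APU)
		return false;

	/* Assumption: the remainder is modeled as "non-zero means reset". */
	return dev->sol_reg != 0;
}

/* Small self-check covering the three paths above. */
static void mock_self_check(void)
{
	struct mock_device gc_10_3_1_apu = { MOCK_IP_VERSION(10, 3, 1), MOCK_IS_APU, 0 };
	struct mock_device other_apu = { MOCK_IP_VERSION(10, 3, 3), MOCK_IS_APU, 1 };
	struct mock_device dgpu = { MOCK_IP_VERSION(10, 3, 0), 0, 1 };

	assert(mock_need_reset_on_init(&gc_10_3_1_apu));
	assert(!mock_need_reset_on_init(&other_apu));
	assert(mock_need_reset_on_init(&dgpu));
}
```

If the two checks were swapped, a GC 10.3.1 device with the APU flag set would take the early `return false` and the reset would never be requested.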
