Reviewed-by: Feifei Xu <[email protected]>

-----Original Message-----
From: SHANMUGAM, SRINIVASAN <[email protected]>
Sent: Tuesday, April 14, 2026 4:43 PM
To: Koenig, Christian <[email protected]>; Deucher, Alexander 
<[email protected]>
Cc: [email protected]; SHANMUGAM, SRINIVASAN 
<[email protected]>; Dan Carpenter <[email protected]>; Xu, Feifei 
<[email protected]>; Lazar, Lijo <[email protected]>; Zhang, Hawking 
<[email protected]>
Subject: [PATCH v2] drm/amd/pm: Fix mode2 reset ACK handling on aldebaran v2

aldebaran_mode2_reset() sends a mode2 reset message and waits for an 
acknowledgment from the SMU.

The current ACK handling is incorrect.

The wait loop runs only when ret is -ETIME. But after a successful async send, 
ret is 0. Because of this, the loop is skipped and the code does not wait for 
the reset acknowledgment.
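
The effect of the loop condition can be reproduced outside the driver. The
following is a minimal sketch, not the driver code: fake_wait_response() and
count_ack_polls() are hypothetical stand-ins, and the sleep between polls is
left out. It only counts how often the SMU would be polled for a given
initial value of ret.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the SMU: answers -ETIME for the first
 * `pending` polls, then 0. This is not smu_msg_wait_response(). */
struct fake_smu {
	int pending;
	int polls;
};

static int fake_wait_response(struct fake_smu *s)
{
	s->polls++;
	if (s->pending > 0) {
		s->pending--;
		return -ETIME;
	}
	return 0;
}

/* Run the ACK wait loop with ret pre-set to initial_ret and return
 * the number of polls that were issued. */
static int count_ack_polls(int initial_ret, int pending, int timeout)
{
	struct fake_smu s = { .pending = pending, .polls = 0 };
	int ret = initial_ret;

	assert(timeout >= 0);
	while (ret == -ETIME && timeout) {
		ret = fake_wait_response(&s);
		if (ret == -ETIME)
			--timeout;
	}
	return s.polls;
}
```

With initial_ret set to 0, as after a successful async send, the loop body
never runs and no poll is issued. With initial_ret set to -ETIME, the loop
polls until the stand-in reports success or the timeout budget runs out.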

Also, the code checks for ret != 1 after calling smu_msg_wait_response().
However, smu_msg_wait_response() returns 0 on success and a negative error
code on failure, so checking against 1 is wrong.

Return -EOPNOTSUPP when the firmware does not support this reset message.

Fix this by setting ret to -ETIME before entering the wait loop, checking for 
ret != 0 after getting the SMU response, and returning -EOPNOTSUPP when the 
firmware does not support the message.
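
For illustration, the resulting control flow might be sketched as below.
This is a simplified model under stated assumptions, not the patched
function: mode2_ack_sketch() and fake_wait_response() are hypothetical
names, the firmware version check is reduced to a flag, and locking,
logging and the delay between polls are omitted.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the SMU: answers -ETIME while *pending is
 * positive, then answers final_ret (0 for ACK, negative errno for an
 * error). This is not smu_msg_wait_response(). */
static int fake_wait_response(int *pending, int final_ret)
{
	if (*pending > 0) {
		(*pending)--;
		return -ETIME;
	}
	return final_ret;
}

/* Model of the fixed ACK handling. Returns 0 on ACK, -EOPNOTSUPP when
 * the firmware lacks the message, or a negative errno otherwise. */
static int mode2_ack_sketch(int fw_supported, int pending, int final_ret,
			    int timeout)
{
	int ret;

	assert(timeout >= 0);
	if (!fw_supported)
		return -EOPNOTSUPP;

	ret = -ETIME;	/* make sure the wait loop is entered */
	while (ret == -ETIME && timeout) {
		ret = fake_wait_response(&pending, final_ret);
		if (ret == -ETIME) {
			--timeout;	/* wait a bit more for the ACK */
			continue;
		}
		if (ret != 0)
			return ret;	/* 0 is the only success value */
	}
	return ret;
}
```

In this model an ACK after a few pending polls yields 0, an SMU error is
passed through unchanged, an exhausted timeout leaves -ETIME, and
unsupported firmware yields -EOPNOTSUPP.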

v2:
- Update ACK check to use ret != 0 instead of ret != 1, since
  smu_msg_wait_response() returns 0 on success (Feifei)
- Remove unnecessary handling for ret == 0

Fixes: e42569d02acb ("drm/amd/pm: Modify mode2 msg sequence on aldebaran")
Reported-by: Dan Carpenter <[email protected]>
Cc: Feifei Xu <[email protected]>
Cc: Lijo Lazar <[email protected]>
Cc: Hawking Zhang <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Christian König <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
---
 drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
index 259e5a13c1bd..cb7cbbccb875 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
@@ -1847,6 +1847,7 @@ static int aldebaran_mode2_reset(struct smu_context *smu)
                amdgpu_device_load_pci_state(adev->pdev);

                dev_dbg(adev->dev, "wait for reset ack\n");
+               ret = -ETIME;
                while (ret == -ETIME && timeout)  {
                        ret = smu_msg_wait_response(ctl, 0);
                        /* Wait a bit more time for getting ACK */
@@ -1856,7 +1857,7 @@ static int aldebaran_mode2_reset(struct smu_context *smu)
                                continue;
                        }

-                       if (ret != 1) {
+                       if (ret != 0) {
                                dev_err(adev->dev, "failed to send mode2 message \tparam: 0x%08x response %#x\n",
                                                SMU_RESET_MODE_2, ret);
                                goto out;
@@ -1866,10 +1867,9 @@ static int aldebaran_mode2_reset(struct smu_context *smu)
        } else {
                dev_err(adev->dev, "smu fw 0x%x does not support MSG_GfxDeviceDriverReset MSG\n",
                                smu->smc_fw_version);
+               ret = -EOPNOTSUPP;
        }

-       if (ret == 1)
-               ret = 0;
 out:
        mutex_unlock(&ctl->lock);

--
2.34.1
