The driver currently only checks that the MES packet submission fence
did not time out. It does not check whether the fence return status
matches the expected completion value it passed to MES prior to
submission.

For example, this can result in REMOVE_QUEUE requests returning success
to the driver when the queue actually failed to preempt.

Fix this by having the driver actually compare the completion status
value to the expected success value.
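
As a rough illustration only, the difference between the old and new
conditions can be modelled in standalone C. The helper names and the
sample values below are hypothetical and are not part of the driver.
The sketch assumes that MES can write a non-zero status that differs
from the expected completion value when a request fails.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the old condition: failure is reported only on
 * a wait timeout/error (r < 1) or when the status word is still zero.
 */
static bool old_check_failed(long r, uint64_t status)
{
	return r < 1 || !status;
}

/*
 * Hypothetical model of the new condition: the status word must equal
 * the completion value the driver asked MES to write on success.
 */
static bool new_check_failed(long r, uint64_t status, uint64_t expected)
{
	return r < 1 || status != expected;
}
```

With an assumed expected value of 1 and an assumed returned status of
2, old_check_failed(1, 2) is false (the failure goes unnoticed) while
new_check_failed(1, 2, 1) is true.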

Signed-off-by: Jonathan Kim <[email protected]>
---
 drivers/gpu/drm/amd/amdgpu/mes_v12_0.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
index aff06f06aeee..58f61170cf85 100644
--- a/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mes_v12_0.c
@@ -228,8 +228,7 @@ static int mes_v12_0_submit_pkt_and_poll_completion(struct amdgpu_mes *mes,
                        pipe, x_pkt->header.opcode);
 
        r = amdgpu_fence_wait_polling(ring, seq, timeout);
-       if (r < 1 || !*status_ptr) {
-
+       if (r < 1 || *status_ptr != api_status->api_completion_fence_value) {
                if (misc_op_str)
                        dev_err(adev->dev, "MES(%d) failed to respond to msg=%s (%s)\n",
                                pipe, op_str, misc_op_str);
-- 
2.34.1
