On 10/15/2025 7:43 AM, Mario Limonciello wrote:
[Why]
Newer VPE microcode has functionality that will decrease DPM level
only when a workload has run for 2 or more seconds.  If VPE is turned
off before this DPM decrease, the SOC can get stuck with a higher
DPM level.

This can happen from amdgpu's ring buffer test because it's a short
quick workload for VPE and VPE is turned off after 1s.

[How]
In idle handler besides checking fences are drained check that VPE DPM
level is really is at DPM0. If not, schedule delayed work again until
it is.

Cc: [email protected]
Reported-by: Sultan Alsawaf <[email protected]>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4615
Signed-off-by: Mario Limonciello <[email protected]>
---
v4:
  * only apply to Strix Halo (VPE 6.1.1)
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c | 31 ++++++++++++++++++++++---
  1 file changed, 28 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c
index 474bfe36c0c2..95d0badeb479 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c
@@ -322,6 +322,26 @@ static int vpe_early_init(struct amdgpu_ip_block *ip_block)
        return 0;
  }
+/*
+ * vpe_wait_dpm0: whether to reschedule idle work waiting for IP to reach DPM0
+ */
+static int vpe_wait_dpm0(struct amdgpu_device *adev)
+{

This is not really a wait, may be renamed as vpe_get_req_dpm_level()

+       struct amdgpu_vpe *vpe = &adev->vpe;
+
+       if (!adev->pm.dpm_enabled)
+               return 0;
+
+       switch (amdgpu_ip_version(adev, VPE_HWIP, 0)) {
+       case IP_VERSION(6, 1, 1):
+               break;
+       default:
+               return 0;
+       }

These checks could be in a separate function like - vpe_need_dpm0_at_power_down(adev). It could return false once pmfw carries the fix for IP version.
+
+       return RREG32(vpe_get_reg_offset(vpe, 0, vpe->regs.dpm_request_lv));
+}
+
  static void vpe_idle_work_handler(struct work_struct *work)
  {
        struct amdgpu_device *adev =
@@ -329,11 +349,16 @@ static void vpe_idle_work_handler(struct work_struct 
*work)
        unsigned int fences = 0;
fences += amdgpu_fence_count_emitted(&adev->vpe.ring);
+       if (fences)
+               goto reschedule;
- if (fences == 0)
+       if (!vpe_wait_dpm0(adev)) {

This may be,

        if (vpe_need_dpm0_at_power_down(adev) && vpe_get_req_dpm_level() != 0)
                goto rechedule
        else
                power_gate()

Thanks,
Lijo
                amdgpu_device_ip_set_powergating_state(adev, 
AMD_IP_BLOCK_TYPE_VPE, AMD_PG_STATE_GATE);
-       else
-               schedule_delayed_work(&adev->vpe.idle_work, VPE_IDLE_TIMEOUT);
+               return;
+       }
+
+reschedule:
+       schedule_delayed_work(&adev->vpe.idle_work, VPE_IDLE_TIMEOUT);
  }
static int vpe_common_init(struct amdgpu_vpe *vpe)

Reply via email to