Re: [PATCH v2] drm/amdgpu: reset asic after system-wide suspend aborted (v2)

Lazar, Lijo Wed, 24 Nov 2021 05:30:48 -0800



On 11/24/2021 6:13 PM, Prike Liang wrote:

Reset the ASIC when an Sx suspend is aborted after amdgpu has already
suspended, so that AMDGPU is left in a clean reset state and the device
is not re-initialized improperly. For now, always reset the ASIC in
amdgpu resume until the PM abort case is sorted out.

v2: Remove incomplete PM abort flag and add GPU hive case check for
GPU reset.

Signed-off-by: Prike Liang <prike.li...@amd.com>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++++++++
  1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 7d4115d..3fcd90d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3983,6 +3983,14 @@ int amdgpu_device_resume(struct drm_device *dev, bool fbcon)
        if (adev->in_s0ix)
                amdgpu_gfx_state_change_set(adev, sGpuChangeState_D0Entry);

+       /*TODO: In order to not let all-always asic reset affect resume latency
+        * need sort out the case which really need asic reset in the resume process.
+        * As to the known issue on the system suspend abort behind the AMDGPU suspend,
+        * may can sort this case by checking struct suspend_stats which need exported
+        * firstly.
+        */
+       if (adev->gmc.xgmi.num_physical_nodes <= 1)
+               amdgpu_asic_reset(adev);

Newer dGPUs depend on PMFW to do reset and that is not loaded at this point. For some, there is a mini FW available which could technically handle a reset and some of the older ones depend on PSP. Strongly suggest to check all such cases before doing a reset here.


Or, the safest at this point could be to do the reset only for APUs.
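
As a rough illustration of the APU-only option, here is a stand-alone sketch that compiles outside the kernel tree. It is not tested driver code. AMD_IS_APU mirrors the chip flag of that name in amdgpu, while struct mock_adev, resume_should_reset() and mock_resume() are hypothetical stand-ins introduced only for this example, and the counter increment takes the place of the amdgpu_asic_reset(adev) call in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the amdgpu chip flag of the same name; defined here only so the
 * sketch builds outside the kernel tree. */
#define AMD_IS_APU 0x00020000UL

/* Hypothetical, cut-down stand-in for struct amdgpu_device. */
struct mock_adev {
	unsigned long flags;                  /* chip flags, e.g. AMD_IS_APU */
	unsigned int xgmi_num_physical_nodes; /* stands in for adev->gmc.xgmi.num_physical_nodes */
	int reset_count;                      /* counts how often a reset would have been issued */
};

/* Hypothetical helper: reset on resume only for APUs that are not part of
 * an XGMI hive. dGPUs are skipped because their reset path may depend on
 * firmware (PMFW, mini FW or PSP) that is not available at this point. */
static bool resume_should_reset(const struct mock_adev *adev)
{
	assert(adev != NULL);

	if (!(adev->flags & AMD_IS_APU))
		return false;

	return adev->xgmi_num_physical_nodes <= 1;
}

/* Hypothetical resume fragment showing where the check would sit. */
static void mock_resume(struct mock_adev *adev)
{
	if (resume_should_reset(adev))
		adev->reset_count++; /* placeholder for amdgpu_asic_reset(adev) */
}
```

In the driver itself this would amount to adding an (adev->flags & AMD_IS_APU) test to the condition in the hunk above. Whether the hive check is still needed for APUs is left open here.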

Thanks,
Lijo

        /* post card */
        if (amdgpu_device_need_post(adev)) {
                r = amdgpu_device_asic_init(adev);
