On 2/2/26 8:35 AM, Christian König wrote:
On 2/2/26 15:25, Mario Limonciello wrote:
On 1/31/26 6:24 PM, Bert Karwatzki wrote:
This reverts commit 7294863a6f01248d72b61d38478978d638641bee.

This commit was erroneously applied again after commit 0ab5d711ec74
("drm/amd: Refactor `amdgpu_aspm` to be evaluated per device")
removed it, leading to very hard to debug crashes when used on a system with
two AMD GPUs of which only one supports ASPM.

Link: https://lore.kernel.org/linux-acpi/[email protected]/
Link: https://github.com/acpica/acpica/issues/1060
Fixes: 0ab5d711ec74 ("drm/amd: Refactor `amdgpu_aspm` to be evaluated per device")

Signed-off-by: Bert Karwatzki <[email protected]>
---

Amazing detective work, thanks so much.

This added the code initially:
cba07cce39ace drm/amd: Check if ASPM is enabled from PCIe subsystem

This effectively removed it:
0ab5d711ec74d drm/amd: Refactor `amdgpu_aspm` to be evaluated per device

This was the accidental re-apply:
7294863a6f012 drm/amd: Check if ASPM is enabled from PCIe subsystem

It looks like this was right on the edge between 5.17-rc6 and 5.18-rc1.
I think drm-fixes-2022-02-25 and amd-drm-next-5.18-2022-02-25 ended up with
different content.

Nonetheless, this is the correct change and I've applied it to
amd-staging-drm-next.

Reviewed-by: Mario Limonciello (AMD) <[email protected]>

Reviewed-by: Christian König <[email protected]>

There is just one major question left: Why is disabling ASPM causing problems?


My theory is that it's a mismatch between the PCIe core and amdgpu, i.e. if the
PCIe core thinks ASPM is enabled but amdgpu thinks it is disabled, we can hit
some corner cases.

I mean we had tons of problems with ASPM before, but only by accidentally
enabling it and never by accidentally disabling it.

IIRC we even suggested disabling ASPM as a possible workaround.

Thanks,
Christian.


   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ---
   1 file changed, 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index d6d0a6e34c6b..95d26f086d54 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -2405,9 +2405,6 @@ static int amdgpu_pci_probe(struct pci_dev *pdev,
 		return -ENODEV;
 	}
 
-	if (amdgpu_aspm == -1 && !pcie_aspm_enabled(pdev))
-		amdgpu_aspm = 0;
-
 	if (amdgpu_virtual_display ||
 	    amdgpu_device_asic_has_dc_support(pdev, flags & AMD_ASIC_MASK))
 		supports_atomic = true;
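
For readers following along, here is a minimal user-space sketch of how the removed lines can leak one device's ASPM state to the next device probed. It is not the driver code: struct fake_gpu, aspm_param, and all function names are hypothetical stand-ins, and the per-device check is only an approximation of what the refactor in 0ab5d711ec74 does. It shows the state leak only, not why that leak ends in a crash, which is still an open question in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev: one field is enough here. */
struct fake_gpu {
	bool pcie_aspm_on;	/* what the PCIe core would report for this device */
};

/* Models the module parameter: -1 = auto, 0 = disabled, 1 = enabled. */
static int aspm_param = -1;

/* Behaviour of the removed lines: a device without ASPM rewrites the global. */
static void probe_with_global_override(const struct fake_gpu *gpu)
{
	if (aspm_param == -1 && !gpu->pcie_aspm_on)
		aspm_param = 0;
}

/* Assumed per-device evaluation: the parameter is read, never written. */
static bool aspm_wanted(const struct fake_gpu *gpu)
{
	if (aspm_param == -1)
		return gpu->pcie_aspm_on;
	return aspm_param != 0;
}

/*
 * Probe two GPUs in order, the first without ASPM and the second with it,
 * and count how many end up wanting ASPM.
 */
static int demo_count(bool use_override)
{
	const struct fake_gpu gpus[] = {
		{ .pcie_aspm_on = false },
		{ .pcie_aspm_on = true },
	};
	int count = 0;

	aspm_param = -1;	/* start each run in "auto" */
	for (size_t i = 0; i < sizeof(gpus) / sizeof(gpus[0]); i++) {
		if (use_override)
			probe_with_global_override(&gpus[i]);
		if (aspm_wanted(&gpus[i]))
			count++;
	}
	return count;
}
```

With the override in place, the first GPU flips the shared parameter to 0 and the second GPU then sees ASPM as disabled even though the PCIe core has it enabled, so demo_count(true) yields 0 while demo_count(false) yields 1. That is the kind of PCIe core vs. amdgpu disagreement described in the theory above.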
