https://bugzilla.kernel.org/show_bug.cgi?id=214035

            Bug ID: 214035
           Summary: acpi_turn_off_unused_power_resources() may take down
                    necessary hardware
           Product: ACPI
           Version: 2.5
    Kernel Version: 5.13.9
          Hardware: Intel
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: high
          Priority: P1
         Component: Power-Other
          Assignee: acpi_power-ot...@kernel-bugs.osdl.org
          Reporter: cfswo...@gmail.com
        Regression: No

Commit 7e4fdea changes ACPI power resource initialization to turn off any
unused power resources, even ones in an unknown state. In effect, every unused
power resource ends up turned off, which is great in theory.

I have recently encountered an issue on 5.13.0+ where this behavior is taking
down the NVMe SSD before the driver can initialize, resulting in the following
in the kernel log:
acpi LNXPOWER:02: Turning OFF
...
pci 0000:02:00.0: CLS mismatch (64 != 1020), using 64 bytes
...
nvme 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)

(See bug 214025 for a discussion about making this error message clearer.)

The LNXPOWER:02 resource controls power to the PCIe port where the NVMe SSD is
attached, but no other ACPI object claims it in power_resources_D*, so it is
reported as unused:
$ cat /sys/bus/acpi/devices/LNXPOWER:02/resource_in_use
0
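
To check every power resource at once instead of one at a time, a small
script along these lines could be used. It is only a sketch: it assumes each
LNXPOWER:* directory exposes a resource_in_use attribute like the one shown
above, and the function name and the "root" parameter are made up for
illustration.

```python
#!/usr/bin/env python3
"""List ACPI power resources that no device claims (resource_in_use == 0)."""
import glob
import os


def unclaimed_power_resources(root="/sys/bus/acpi/devices"):
    """Return the names of LNXPOWER:* entries whose resource_in_use reads 0.

    `root` is a parameter only so the function can be pointed at a test
    directory; on a real system the default sysfs path is used.
    """
    unclaimed = []
    for path in sorted(glob.glob(os.path.join(root, "LNXPOWER:*"))):
        try:
            with open(os.path.join(path, "resource_in_use")) as f:
                in_use = f.read().strip()
        except OSError:
            # Attribute missing or unreadable: skip rather than guess.
            continue
        if in_use == "0":
            unclaimed.append(os.path.basename(path))
    return unclaimed


if __name__ == "__main__":
    for name in unclaimed_power_resources():
        print(name)
```

Any resource it prints is one that the unused-resource sweep would consider
eligible to be turned off.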

This causes acpi_turn_off_unused_power_resources() to treat the resource as
fair game and turn off the PCIe port between the time the PCIe device is
discovered and the time the driver gets a chance to probe it.
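
The failure mode can be illustrated with a toy model. This is not the kernel
code: the PowerResource class, its fields, and turn_off_unused() are
hypothetical names chosen for the sketch. It only assumes what is described
above, namely that a resource with no users is switched off regardless of
whether its state is known.

```python
from dataclasses import dataclass


@dataclass
class PowerResource:
    name: str
    ref_count: int = 0       # devices that have claimed this resource
    state: str = "unknown"   # "on", "off", or "unknown"


def turn_off_unused(resources):
    """Switch off every resource nobody references; return their names.

    Resources in an "unknown" state are treated like "on" ones, mirroring
    the behavior described in this report (a simplification).
    """
    turned_off = []
    for res in resources:
        if res.ref_count == 0 and res.state != "off":
            res.state = "off"
            turned_off.append(res.name)
    return turned_off


if __name__ == "__main__":
    resources = [
        PowerResource("LNXPOWER:00", ref_count=1, state="on"),
        # Powers the PCIe port, but nothing lists it in power_resources_D*.
        PowerResource("LNXPOWER:02"),
    ]
    print(turn_off_unused(resources))
```

In this model LNXPOWER:02 is turned off purely because its reference count is
zero, even though a device behind it still needs power.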

I'm currently working around this by bypassing
acpi_turn_off_unused_power_resources() entirely, but a proper fix will require
flagging the power resource as "in use." I don't know whether this is a problem
with the device's ACPI tables or whether Linux should be claiming all
LNXPOWER:* resources under each PCI bridge's firmware_node.

Happy to run any additional debugging steps.

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.