On 6/11/2025 7:14 AM, Nicolas Dichtel wrote:
On 06/05/2025 at 01:06, Bjorn Helgaas wrote:
On Wed, Apr 23, 2025 at 11:31:32PM -0500, Mario Limonciello wrote:
From: Mario Limonciello <mario.limoncie...@amd.com>
The AMD BIOS team has root-caused an issue in which NVMe storage failed to
come back from suspend because _REG was not called when the NVMe device was
probed.
Commit 112a7f9c8edbf ("PCI/ACPI: Call _REG when transitioning D-states")
added support for calling _REG when transitioning D-states, but this only
works if the device actually "transitions" D-states.
Commit 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI
devices") added support for runtime PM on unbound PCI devices, but it never
explicitly puts the device into D0.
To make sure that devices are in D0 and that platform methods such as
_REG are called, explicitly set all devices into D0 during initialization.
Fixes: 967577b062417 ("PCI/PM: Keep runtime PM enabled for unbound PCI devices")
Signed-off-by: Mario Limonciello <mario.limoncie...@amd.com>
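As a sanity check, whether a device actually sits in D0 can be read from
userspace via the PMCSR register in its PM capability. A minimal sketch (not
part of the patch; the sysfs path and BDF are just examples, and reading past
the first 64 config bytes needs root):

/* dstate.c: print a PCI device's current D-state from PMCSR (sketch).
 * Build: cc -o dstate dstate.c
 * Usage: ./dstate /sys/bus/pci/devices/0000:00:04.0/config
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

static uint8_t cfg8(int fd, int off)
{
    uint8_t v = 0;
    pread(fd, &v, 1, off);
    return v;
}

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <pci config file>\n", argv[0]);
        return 1;
    }
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* Walk the capability list (pointer at offset 0x34) looking for
     * the Power Management capability (ID 0x01). */
    uint8_t pos = cfg8(fd, 0x34);
    while (pos) {
        if (cfg8(fd, pos) == 0x01) {
            uint16_t pmcsr = 0;
            pread(fd, &pmcsr, 2, pos + 4); /* PMCSR, little-endian host */
            printf("PMCSR=0x%04x -> D%u\n", pmcsr, pmcsr & 0x3);
            return 0;
        }
        pos = cfg8(fd, pos + 1); /* next-capability pointer */
    }
    fprintf(stderr, "no PM capability found\n");
    return 1;
}

If the change works as described, devices should report D0 here right after
boot-time initialization.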
Applied to pci/pm for v6.16, thanks!
I've hit a regression after this commit.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4d4c10f763d7
I started a QEMU guest with "-cpu host" on an AMD machine (AMD Ryzen 5 3600
6-Core Processor) with virtio-net interfaces. When I try to start testpmd (a
DPDK app), it cannot find the virtio port: the VFIO_GROUP_GET_DEVICE_FD ioctl
fails (a minimal standalone reproducer follows).
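To take DPDK out of the picture, the failing call can be exercised with a
small standalone program. A sketch, assuming no-IOMMU mode and group 0 as in
the setup below (check /dev/vfio/ for the real group number):

/* vfio-getfd.c: minimal reproducer for the VFIO_GROUP_GET_DEVICE_FD
 * failure; a sketch, not DPDK code. Build: cc -o vfio-getfd vfio-getfd.c
 * Assumes the device is bound to vfio-pci in no-IOMMU mode and landed in
 * group 0 (/dev/vfio/noiommu-0); adjust the path and BDF as needed.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

int main(void)
{
    int container = open("/dev/vfio/vfio", O_RDWR);
    int group = open("/dev/vfio/noiommu-0", O_RDWR); /* assumed group */
    struct vfio_group_status status = { .argsz = sizeof(status) };

    if (container < 0 || group < 0) {
        perror("open");
        return 1;
    }
    if (ioctl(group, VFIO_GROUP_GET_STATUS, &status) ||
        !(status.flags & VFIO_GROUP_FLAGS_VIABLE)) {
        fprintf(stderr, "group not viable\n");
        return 1;
    }
    if (ioctl(group, VFIO_GROUP_SET_CONTAINER, &container) ||
        ioctl(container, VFIO_SET_IOMMU, VFIO_NOIOMMU_IOMMU)) {
        perror("container setup");
        return 1;
    }
    /* This is the ioctl that fails in the testpmd log below. */
    int fd = ioctl(group, VFIO_GROUP_GET_DEVICE_FD, "0000:00:04.0");
    if (fd < 0) {
        perror("VFIO_GROUP_GET_DEVICE_FD");
        return 1;
    }
    printf("got device fd %d\n", fd);
    return 0;
}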
To reproduce the issue:
qemu-system-x86_64 --enable-kvm -m 5G -cpu host \
-smp sockets=1,cores=2,threads=2 \
-snapshot -vga none -display none -nographic \
-drive if=none,file=/opt/vm/ubuntu-24.04-with-linux-net.qcow2,id=hda \
-device virtio-blk,drive=hda \
-device virtio-net,netdev=eth0,addr=03 -netdev user,id=eth0 \
-device virtio-net,netdev=eth1,addr=04 -netdev socket,id=eth1,mcast=230.0.0.1:1234
git clone git://dpdk.org/dpdk
cd dpdk/
meson build-static --werror --default-library=static --debug
ninja -C build-static
echo 3 > /proc/sys/vm/drop_caches
echo 256 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
modprobe vfio-pci
lspci
python3 ./usertools/dpdk-devbind.py --noiommu-mode -b vfio-pci 0000:00:04.0
./build-static/app/dpdk-testpmd -l 1,2 --socket-mem 512,0 -a 0000:00:04.0 -- -i
Here is the output:
EAL: Detected CPU lcores: 4
EAL: Detected NUMA nodes: 1
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
EAL: Getting a vfio_dev_fd for 0000:00:04.0 failed
PCI_BUS: Cannot get offset of region 0.
PCI_BUS: fail to disable req notifier.
PCI_BUS: fail to disable req notifier.
VIRTIO_INIT: eth_virtio_pci_init(): Failed to init PCI device
PCI_BUS: Requested device 0000:00:04.0 cannot be used
EAL: Bus (pci) probe failed.
testpmd: No probed ethernet devices
Interactive-mode selected
testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Done
testpmd>
=> the problem starts at the line "Getting a vfio_dev_fd for 0000:00:04.0 failed"
https://git.dpdk.org/dpdk/tree/lib/eal/linux/eal_vfio.c#n966
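For readers not opening the link, the failure boils down to a single ioctl on
the group fd; roughly (paraphrased from the linked source, not verbatim, and
the code may have drifted since):

/* Paraphrase of lib/eal/linux/eal_vfio.c around the linked line; not
 * verbatim DPDK source. By this point DPDK has already attached the
 * group to a container and selected the no-IOMMU backend. */
*vfio_dev_fd = ioctl(vfio_group_fd, VFIO_GROUP_GET_DEVICE_FD, dev_addr);
if (*vfio_dev_fd < 0) {
    /* emits: "Getting a vfio_dev_fd for 0000:00:04.0 failed" */
    return -1;
}

So the rejection comes from the kernel side, which fits a kernel regression.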
FWIW, here is the output when it starts correctly:
EAL: Detected CPU lcores: 4
EAL: Detected NUMA nodes: 1
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 8 (No-IOMMU)
Interactive-mode selected
Warning: NUMA should be configured manually by using --port-numa-config and
--ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool <mb_pool_0>: n=155456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Warning! port-topology=paired and odd forward ports number, the last port will
pair with itself.
Configuring Port 0 (socket 0)
EAL: Error disabling MSI-X interrupts for fd 277
Port 0: DE:ED:01:E0:1B:75
Checking link statuses...
Done
testpmd>
Any help would be appreciated.
Regards,
Nicolas
+AlexW
Thanks for the report and especially for the repro steps. This sounds just
like the QAT regression also reported in this thread:
https://lore.kernel.org/linux-pci/aems+oql7ibjd...@gcabiddu-mobl.ger.corp.intel.com/T/#m7e8929d6421690dc8bd6dc639d86c2b4db27cbc4
I'm traveling this week, but since your report doesn't depend on QAT
hardware, I'll try to reproduce it next week to understand what's going on.
Alex, if you have any ideas, please let me know.