On Thu, 2 Oct 2025 17:40:07 +0300 Oleksandr Nahnybida <[email protected]> wrote:
> Hi all,
>
> We are working on migrating our DPDK application from igb_uio to vfio-pci.
> Our target environment is a VMware ESXi host running on an AMD Epyc server
> with NICs configured for PCI Passthrough to a guest VM with Debian Bookworm
> (Kernel 6.1.0-39-amd64)
>
> We've encountered a couple of issues.
>
> Problem 1:
>
> Initially, attempting to use vfio-pci failed with an error code of -22, and
> the /sys/class/iommu/ directory was empty. We discovered the "expose IOMMU
> to guest OS" option in VMware and enabled it.
>
> This led to a new error:
> "The virtual machine cannot be powered on because IOMMU virtualization is
> not compatible with PCI passthru on AMD platforms"
>
> We found a workaround by adding amd.iommu.supportsPcip = "TRUE" to the VM's
> configuration. The VM now boots, and the IOMMU is visible in the guest.
>
> However, when we run our DPDK application, it hangs after printing "EAL:
> VFIO support initialized", and shortly after, the guest kernel panics with
> a soft lockup error, making the system eventually unresponsive.
> BUG: soft lockup - CPU#34 stuck for 75s! [kcompactd0:529]
>
> Problem 2:
>
> Separately, we've noticed that our IOMMU groups are not ideal. Many groups
> contain not only the NICs we need to bind, but also other devices like PCI
> bridges.
>
> IOMMU Group 7:
> 0000:00:17.0 - PCI bridge: VMware PCI Express Root Port
> 0000:00:17.1
> 0000:00:17.2
> 0000:00:17.3
> 0000:00:17.4
> 0000:00:17.5
> 0000:00:17.6
> 0000:00:17.7
> 0000:13:00.0 - nic
> 0000:13:00.1 - nic
> 0000:14:00.0 - nic
> 0000:14:00.1 - nic
>
> Questions:
>
> 1. Is enabling guest IOMMU virtualization in VMware with the
> amd.iommu.supportsPcip workaround, the correct approach here?
> 2. I see that vfio-pci can be run in unsafe mode, but is there any
> benefit to using it over igb_uio in this case?
>    1. In my understanding, the hypervisor is using actual hardware IOMMU
>    to implement PCI pass-through anyway, so what's the point of having it
>    inside of a guest VM again?
>    2. Also, usually enable_unsafe_noiommu_mode is not compiled in, so we
>    need to recompile vfio separately, and since we already compile igb_uio
>    anyway, vfio won't be any better in terms of deployment.
> 3. There also seems to be an option of using SR-IOV instead of
> passthrough, but we haven't explored this option yet. The question here is,
> do you still need to "expose iommu" to be able to bind VF to vfio? And
> what's the correct workflow here in general?
>
> Best Regards,
> Oleksandr

You may have to use vfio-pci in no-IOMMU mode (the
enable_unsafe_noiommu_mode option you mention).
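
To see whether the guest kernel offers no-IOMMU mode at all, and how the
devices are grouped, a small sysfs check along these lines might help. This
is only a sketch: it assumes the parameter shows up under
/sys/module/vfio/parameters/ when the kernel was built with no-IOMMU support,
and the helper names (noiommu_status, iommu_groups) are made up for
illustration.

```python
from pathlib import Path


def noiommu_status(sysfs_root="/sys"):
    """Return the parameter value ('Y' or 'N'), or None if it is not exposed.

    None usually means vfio is not loaded or was built without no-IOMMU
    support (an assumption about how the kernel exposes the parameter).
    """
    param = Path(sysfs_root) / "module/vfio/parameters/enable_unsafe_noiommu_mode"
    try:
        return param.read_text().strip()
    except OSError:
        return None


def iommu_groups(sysfs_root="/sys"):
    """Map IOMMU group number -> sorted list of PCI addresses in that group."""
    base = Path(sysfs_root) / "kernel/iommu_groups"
    groups = {}
    if not base.is_dir():
        return groups
    for group in base.iterdir():
        devices = group / "devices"
        if group.name.isdigit() and devices.is_dir():
            groups[int(group.name)] = sorted(d.name for d in devices.iterdir())
    return groups


if __name__ == "__main__":
    print("enable_unsafe_noiommu_mode:", noiommu_status())
    for number, devices in sorted(iommu_groups().items()):
        print(f"IOMMU group {number}: {', '.join(devices)}")
```

If the first line prints None, the parameter is not there and vfio would
need to be rebuilt, as you suspected. An empty group list just means the
guest sees no IOMMU.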
