On Sun, Jun 15, 2025 at 02:47:15PM -0400, Demi Marie Obenour wrote:
> Virtio-IOMMU interrupt remapping turned out to be much harder than I
> realized. The main problem is that interrupt remapping is set up
> very early in boot. In fact, Linux calls the interrupt remapping probe
> function from the APIC initialization code: x86_64_probe_apic ->
> enable_IR_x2apic -> irq_remapping_prepare(). This is almost certainly
> much before PCI has been initialized. Also, the order in which devices
> will be initialized is not something Linux guarantees at all, which is a
> problem because interrupt remapping must be initialized before drivers
> start setting up interrupts. Otherwise, the interrupt remapping table
> won't include entries for already-existing interrupts, and things will
> either break badly, not get the benefit of interrupt remapping
> security-wise, or both.
>
> The reason I expect this doesn't cause problems for address translation
> is that the IOMMU probably starts in bypass mode by default, meaning
> that all DMA is permitted. If the IOMMU is only used by VFIO or
> IOMMUFD, it will not be needed until userspace starts up, which is after
> the IOMMU has been initialized. This isn't ideal, though, as it means
> that kernel drivers operate without DMA protection.
>
> Is a paravirtualized IOMMU with interrupt remapping something that makes
> sense? Absolutely! However, the IOMMU should be considered a platform
> device that must be initialized very early in boot. Using virtio-IOMMU
> with MMIO transport as the interface might be a reasonable option, but
> the IOMMU needs to be enumerated via ACPI, device tree, or kernel
> command line argument. This allows it to be brought up before anything
> capable of DMA is initialized.
>
> Is this the right path to go down? What do others think about this?
> --
> Sincerely,
> Demi Marie Obenour (she/her/hers)

The right venue for this discussion is also virtio-comment; this ML is for driver work.