Virtio-IOMMU interrupt remapping turned out to be much harder than I realized. The main problem is that interrupt remapping is set up very early in boot. In fact, Linux calls the interrupt remapping probe function from the APIC initialization code: x86_64_probe_apic -> enable_IR_x2apic -> irq_remapping_prepare(). This is almost certainly well before PCI has been initialized.

Also, Linux makes no guarantees about the order in which devices are initialized. That is a problem because interrupt remapping must be initialized before any driver starts setting up interrupts. Otherwise, the interrupt remapping table won't include entries for interrupts that already exist, and things will either break badly, lose the security benefit of interrupt remapping, or both.
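As a rough illustration of the ordering hazard, here is a toy model in Python. It is not kernel code: ToyRemapper, boot(), the device names, and the IRQ numbers are all made up for the sketch. It only shows that any interrupt set up before the remapper is initialized ends up without a table entry.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRemapper:
    """Hypothetical stand-in for an interrupt remapping unit."""
    ready: bool = False
    table: dict = field(default_factory=dict)  # irq -> owning device

    def init(self):
        self.ready = True

    def setup_irq(self, device, irq):
        # Returns True only if the interrupt received a remapping entry.
        if self.ready:
            self.table[irq] = device
            return True
        return False


def boot(order):
    """Run init steps in order; return devices whose IRQ was never remapped."""
    remapper = ToyRemapper()
    unprotected = []
    next_irq = 32  # arbitrary starting vector for the toy model
    for step in order:
        if step == "iommu":
            remapper.init()
            continue
        if not remapper.setup_irq(step, next_irq):
            unprotected.append(step)
        next_irq += 1
    return unprotected


early = boot(["iommu", "nic", "disk"])  # remapper comes up first
late = boot(["nic", "iommu", "disk"])   # a driver wins the race
print(early)
print(late)
```

In the second ordering the "nic" interrupt is set up before the remapper exists, which is the situation that arises if the IOMMU is probed like an ordinary device.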
The reason I expect this doesn't cause problems for address translation is that the IOMMU probably starts in bypass mode by default, meaning that all DMA is permitted. If the IOMMU is only used by VFIO or IOMMUFD, it will not be needed until userspace starts up, which is after the IOMMU has been initialized. This isn't ideal, though, as it means that kernel drivers operate without DMA protection.

Is a paravirtualized IOMMU with interrupt remapping something that makes sense? Absolutely! However, the IOMMU should be considered a platform device that must be initialized very early in boot. Using virtio-IOMMU with MMIO transport as the interface might be a reasonable option, but the IOMMU needs to be enumerated via ACPI, device tree, or a kernel command line argument. This allows it to be brought up before anything capable of DMA is initialized.

Is this the right path to go down? What do others think about this?

-- 
Sincerely,
Demi Marie Obenour (she/her/hers)