On Fri, 3 Sep 2021 12:03:06 -0400 Peter Xu <pet...@redhat.com> wrote:
> On Fri, Sep 03, 2021 at 03:00:05PM +0200, Igor Mammedov wrote:
> > PS:
> > Another, albeit machine-dependent, approach to resolve the IOMMU ordering
> > problem can be adding IOMMU handling to a specific machine's pre_plug hook.
> > It is called during IOMMU realize time and checks if existing buses
> > without bypass enabled (iommu managed) have any children. And if they
> > have devices attached, error out telling user to reorder '-device iommu'
> > before affected devices/bus.
> > It should cover mixed IOMMU+bypass case and doesn't require fixing
> > vfio-pci address space initialization nor defining any priorities
> > for PCI devices.
>
> This sounds appealing among the approaches.

That's the easy one, compared to moving address space (re)initialization
to reset time (at least to me, since vfio realize looks intimidating at
first glance, but its maintainer(s) probably should know enough
to implement the change properly).

> Does it need to be a pre_plug hook?  I thought we might just need a flag in
> the pci device classes showing that it should be after vIOMMUs, then in vIOMMU
> realize functions we walk pci bus to make sure no such device exist?
>
> We could have a base vIOMMU class, then that could be in the realize() of the
> common class.

We basically don't know if a device needs an IOMMU or not; it can work
with/without it just fine. In this case I'd think about the IOMMU as a board
feature that morphs PCI buses (some of them) (address space, bus numbers, ...).
So I don't perceive any iommu flag as a device property at all.

As for realize vs pre_plug, the latter is part of abstract realize
(see: device_set_realized) and is already used by some PCI infrastructure:
  ex: pcie_cap_slot_pre_plug_cb/spapr_pci_pre_plug

Its purpose is to check pre-conditions and possibly pre-configure
some wiring on behalf of the device's parent hot-plug handler (bus owner/machine),
and fail cleanly if something is wrong without leaving side effects.
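
To make the shape of that pre-condition check concrete, here is a minimal
standalone sketch in C. It is not QEMU code: SketchBus and
sketch_iommu_pre_plug() are hypothetical names made up for illustration, and
a real handler would walk the machine's actual PCI buses and report the
failure through the usual Error **errp argument instead of a char buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a PCI bus; this is not QEMU's PCIBus. */
typedef struct SketchBus {
    const char *name;
    bool iommu_bypass;   /* true: bus is not managed by the IOMMU */
    size_t num_children; /* devices already attached to this bus */
} SketchBus;

/*
 * Model of the check a machine pre_plug hook could run for the IOMMU.
 * Returns true if the IOMMU may be plugged. Otherwise it writes a
 * message to errbuf and returns false; no bus is modified either way,
 * so a failure leaves no side effects behind.
 */
static bool sketch_iommu_pre_plug(const SketchBus *buses, size_t n,
                                  char *errbuf, size_t errlen)
{
    assert(buses != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        /* Only IOMMU-managed (non-bypass) buses are affected by ordering. */
        if (!buses[i].iommu_bypass && buses[i].num_children > 0) {
            snprintf(errbuf, errlen,
                     "bus '%s' already has devices; put '-device iommu' "
                     "before them on the command line", buses[i].name);
            return false;
        }
    }
    return true;
}
```

A bypass bus with devices already attached passes the check, which is how
the mixed IOMMU+bypass case would be covered in this model.
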
See 0ed48fd32eb8 for the boilerplate required to set up a custom hot-plug handler.
You might need only parts of it, but still it's something that has to be done
for each affected machine type, to implement error checking at the proper layer.

So I'd rather look into the 'reset' approach and only if that doesn't
look possible, resort to adding a pre_plug/error check.

PS:
yours d2321d31ff98b & c6cbc29d36f look to me like another candidate
for pre_plug for a pci device instead of adding a dedicated hook just
for vfio-pci to the generic machine.

> > (but I think it's more a hack compared earlier suggested
> > address space initialization at reset time, and it would need to be
> > done for every affected machine)
>
> Agreed.