On Wed, Aug 25, 2021 at 02:28:55PM +0200, Markus Armbruster wrote:
> Markus Armbruster <arm...@redhat.com> writes:
>
> > Peter Xu <pet...@redhat.com> writes:
> >
> >> On Mon, Aug 23, 2021 at 05:56:23PM -0400, Eduardo Habkost wrote:
> >>> I don't have any other example, but I assume address assignment
> >>> based on ordering is a common pattern in device code.
> >>>
> >>> I would take a very close and careful look at the devices with
> >>> non-default vmsd priority.  If you can prove that the 13 device
> >>> types with non-default priority are all order-insensitive, a
> >>> custom sort function as you describe might be safe.
> >>
> >> Besides virtio-mem-pci, there'll also be a similar devfn issue with all
> >> MIG_PRI_PCI_BUS, as they'll be allocated just like other pci devices.  Say,
> >> below two cmdlines will generate different pci topology too:
> >>
> >>   $ qemu-system-x86_64 -device pcie-root-port,chassis=0 \
> >>                        -device pcie-root-port,chassis=1 \
> >>                        -device virtio-net-pci
> >>
> >> And:
> >>
> >>   $ qemu-system-x86_64 -device pcie-root-port,chassis=0 \
> >>                        -device virtio-net-pci \
> >>                        -device pcie-root-port,chassis=1
> >>
> >> This cannot be solved by keeping priority==0 ordering.
> >>
> >> After a second thought, I think I was initially wrong on seeing migration
> >> priority and device realization as the same problem.
> >>
> >> For example, for live migration we have a requirement on PCI_BUS being
> >> migrated earlier than MIG_PRI_IOMMU because there's bus number information
> >> required because IOMMU relies on the bus number to find address spaces.
> >> However that's definitely not a requirement for device realizations, say,
> >> realizing vIOMMU after pci buses is fine (bus assigned during bios).
> >>
> >> I've probably messed up with the ideas (though they really look alike!).
> >> Sorry about that.
> >>
> >> Since the only ordering constraint so far is IOMMU vs all the rest of
> >> devices, I'll introduce a new priority mechanism and only make sure
> >> vIOMMUs are realized earlier.  That'll also avoid other implications on
> >> pci devfn allocations.
> >>
> >> Will rework a new version tomorrow.  Thanks a lot for all the comments,
> >
> > Is it really a good idea to magically reorder device realization just to
> > make a non-working command line work?  Why can't we just fail the
> > non-working command line in a way that tells users how to get a working
> > one?  We have way too much ordering magic already...
> >
> > If we decide we want more magic, then I'd argue for *dependencies*
> > instead of priorities.  Dependencies are specific and local: $this needs
> > to go after $that because $reasons.  Priorities are unspecific and
> > global.
>
> Having thought about this a bit more...
>
> Constraints on realize order are nothing new.  For instance, when a
> device plugs into a bus, it needs to be realized after the device
> providing the bus.
>
> We ensure this by having the device refer to the bus, e.g. bus=pci.0.
> The reference may be implicit, but it's there.  It must resolve for
> device creation to succeed, and if it resolves, the device providing the
> bus will be realized in time.
>
> I believe what's getting us into trouble with IOMMU is not having such a
> reference.  Or in other words, keeping the dependence between the IOMMU
> and the devices relying on it *implicit*, and thus hidden from the
> existing realize-ordering machinery.
>
> Instead of inventing another such machinery, let's try to use the one we
> already have.
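
To make sure we mean the same thing by dependency-driven ordering, here is a
minimal sketch of it as a toy model.  This is not QEMU code: the device table,
the "net0" id and the realize_order() helper are all made up for illustration,
and it only assumes that each device names the bus it plugs into.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Toy model only (hypothetical, not QEMU code):
# maps a device id to the id of the bus it plugs into.
def realize_order(devices):
    """Return device ids ordered so each bus provider precedes its users."""
    # A device depends on its bus only if another listed device provides it;
    # buses built into the machine (e.g. pcie.0 on q35) need no ordering.
    graph = {dev: ({bus} if bus in devices else set())
             for dev, bus in devices.items()}
    return list(TopologicalSorter(graph).static_order())

# Listed in "command-line order": the NIC comes before its root port.
devices = {
    "net0": "pcie.1",    # like -device virtio-net-pci,bus=pcie.1
    "pcie.1": "pcie.0",  # like -device pcie-root-port,id=pcie.1,bus=pcie.0
}
order = realize_order(devices)
```

With ordering of that kind, the root port would be realized before the NIC
no matter where each one appears in the list.
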

Hmm... I just found that we don't have such machinery, do we?

This does not really work:

$ ./qemu-system-x86_64 -M q35 -device virtio-net-pci,bus=pcie.1 \
                       -device pcie-root-port,id=pcie.1,bus=pcie.0
qemu-system-x86_64: -device virtio-net-pci,bus=pcie.1: Bus 'pcie.1' not found

While this will:

$ ./qemu-system-x86_64 -M q35 -device pcie-root-port,id=pcie.1,bus=pcie.0 \
                       -device virtio-net-pci,bus=pcie.1

Thanks,

-- 
Peter Xu