On Mon, Sep 30, 2013 at 11:55:47AM +0200, Markus Armbruster wrote:
> "Michael S. Tsirkin" <m...@redhat.com> writes:
>
> > On Fri, Sep 27, 2013 at 07:06:44PM +0200, Markus Armbruster wrote:
> >> Marcel Apfelbaum <marcel.apfelb...@gmail.com> writes:
> >>
> >> > On Wed, 2013-09-25 at 10:01 +0300, Michael S. Tsirkin wrote:
> >> >> On Tue, Sep 24, 2013 at 06:01:02AM -0400, Laine Stump wrote:
> >> >> > When I added support for the Q35-based machinetypes to libvirt, I
> >> >> > specifically prohibited attaching any PCI devices (with the
> >> >> > exception of graphics controllers) to the PCIe root complex,
> >> >>
> >> >> That's wrong I think. Anything attached to RC is an integrated
> >> >> endpoint, and these can be PCI devices.
> >> > I couldn't find in the PCIe spec any mention that a "Root Complex
> >> > Integrated Endpoint" must be PCIe. But, from spec 1.3.2.3:
> >> > - A Root Complex Integrated Endpoint must not require I/O resources
> >> >   claimed through BAR(s).
> >> > - A Root Complex Integrated Endpoint must not generate I/O Requests.
> >> > - A Root Complex Integrated Endpoint is required to support MSI or
> >> >   MSI-X or both if an interrupt resource is requested.
> >> > I suppose that this restriction can be removed for PCI devices that
> >> > 1. Actually work when plugged into an RC Integrated Endpoint
> >> > 2. Respect the above limitations
> >> >
> >> >>
> >> >> > and had planned to
> >> >> > prevent attaching them to PCIe root ports (ioh3420 device) and PCIe
> >> >> > downstream switch ports (xio-3130 device) as well. I did this
> >> >> > because, even though qemu currently allows attaching a normal PCI
> >> >> > device in any of these three places, the restriction exists for
> >> >> > real hardware and I didn't see any guarantee that qemu wouldn't add
> >> >> > the restriction in the future in order to more closely emulate real
> >> >> > hardware.
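[Editorial aside: the three spec constraints Marcel quotes (PCIe 1.3.2.3) are mechanical enough that a management layer could check them per device rather than hardcode a blanket ban. A minimal sketch in Python, assuming a hypothetical device description with BAR types and interrupt capabilities; `DeviceProfile` and its fields are invented for illustration, not a real QEMU or libvirt API:]

```python
# Hypothetical sketch: check whether a device's resource profile
# satisfies the Root Complex Integrated Endpoint rules quoted above.
# The DeviceProfile fields are illustrative, not a real QEMU/libvirt API.
from dataclasses import dataclass, field
from typing import List

@dataclass
class DeviceProfile:
    bars: List[str] = field(default_factory=list)  # each entry "mem" or "io"
    generates_io_requests: bool = False
    supports_msi: bool = False
    supports_msix: bool = False
    needs_interrupt: bool = True

def ok_as_integrated_endpoint(dev: DeviceProfile) -> bool:
    if "io" in dev.bars:                  # must not claim I/O via BAR(s)
        return False
    if dev.generates_io_requests:         # must not generate I/O Requests
        return False
    if dev.needs_interrupt and not (dev.supports_msi or dev.supports_msix):
        return False                      # must support MSI and/or MSI-X
    return True
```

Under this sketch a device with only memory BARs and MSI support passes, while a legacy device with an I/O BAR (or INTx-only interrupts) is rejected.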
> >> >> >
> >> >> > However, since I did that, I've learned that many of the qemu "pci"
> >> >> > devices really should be considered as "pci or pcie". Gerd Hoffman
> >> >> > lists some of these cases in a bug he filed against libvirt:
> >> >> >
> >> >> > https://bugzilla.redhat.com/show_bug.cgi?id=1003983
> >> >> >
> >> >> > I would like to loosen up the restrictions in libvirt, but want to
> >> >> > make sure that I don't allow something that could later be
> >> >> > forbidden by qemu (thus creating a compatibility problem during
> >> >> > upgrades). Beyond Gerd's specific requests to allow ehci, uhci, and
> >> >> > hda controllers to attach to PCIe ports, are there any other
> >> >> > devices that I specifically should or shouldn't allow? (I would
> >> >> > rather be conservative in what I allow - it's easy to allow more
> >> >> > things later, but nearly impossible to revoke permission once it's
> >> >> > been allowed.)
> >> > For the moment I would remove only the restrictions that somebody has
> >> > requested and verified.
> >> >
> >> >>
> >> >> IMO, we really need to grow an interface to query this kind of thing.
> >> > Basically libvirt needs to know:
> >> > 1. for (libvirt) controllers: what kind of devices can be plugged in
> >> > 2. for devices (a controller is also a device):
> >> >    - to which controllers can it be plugged in
> >> >    - does it support hot-plug?
> >> > 3. implicit controllers of the machine types (q35 - "pcie-root",
> >> >    i440fx - "pci-root")
> >> > All the above must be exported to libvirt.
> >> >
> >> > Implementation options:
> >> > 1. Add a compliance field on PCI/PCIe devices and controllers stating
> >> >    whether it supports PCI, PCIe, or both (and maybe hot-plug)
> >> >    - consider plug type + compliance to figure out whether a plug can
> >> >      go into a socket
> >> >
> >> > 2.
> >> >    Use Markus Armbruster's idea of introducing a concept of "plugs
> >> >    and sockets":
> >> >    - dividing the devices into adapters and plugs
> >> >    - adding sockets to bridges (buses?).
> >> >    In this way it would be clear which devices can connect to bridges.
> >>
> >> This isn't actually my idea. It's how things are designed to work in
> >> qdev, at least in my admittedly limited understanding of qdev.
> >>
> >> In traditional qdev, a device has exactly one plug (its "bus type",
> >> shown by -device help), and it may have one or more buses. Each bus
> >> has a type, and you can plug a device only into a bus of the matching
> >> type. This was too limiting, and is not how things work now.
> >>
> >> As far as I know, libvirt already understands that a device can only
> >> plug into a matching bus.
> >>
> >> In my understanding of where we're headed with qdev, things are / will
> >> become more general, yet stay really simple:
> >>
> >> * A device can have an arbitrary number of sockets and plugs.
> >>
> >> * Each socket / plug has a type.
> >>
> >> * A plug can go into a socket of the same type, and only there.
> >>
> >> Pretty straightforward generalization of traditional qdev. I wouldn't
> >> expect libvirt to have serious trouble coping with it, as long as we
> >> provide the necessary information on device models' plugs and sockets.
> >>
> >> In this framework, there's no such thing as a device model that can
> >> plug either into a PCI or a PCIe socket. Makes sense to me, because
> >> there's no such thing in the physical world, either. Instead, and just
> >> like in the physical world, you have one separate device variant per
> >> desired plug type.
> >>
> >
> > Two types of bus is not how things work in the real world though.
> > In the real world there are 3 types of express bus besides the
> > classical PCI bus, and limitations on which devices go where
> > that actually apply to devices qemu emulates.
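[Editorial aside: the typed plug/socket rule Markus describes reduces "what can plug into what" to simple type equality. A toy Python sketch; the class names, type strings, and device names are invented for illustration and are not actual qdev/QOM code:]

```python
# Toy model of the typed plug/socket rule: a plug fits a socket
# if and only if the types match exactly. Names are illustrative only.
class Socket:
    def __init__(self, sock_type):
        self.type = sock_type
        self.occupant = None

class Device:
    def __init__(self, name, plug_type, sockets=()):
        self.name = name
        self.plug_type = plug_type
        self.sockets = [Socket(t) for t in sockets]

def plug(device, socket):
    if socket.type != device.plug_type:
        raise TypeError(f"{device.name}: plug {device.plug_type!r} "
                        f"does not fit socket {socket.type!r}")
    if socket.occupant is not None:
        raise ValueError("socket already occupied")
    socket.occupant = device

# One separate device variant per plug type, as in the physical world:
root_port = Device("ioh3420", plug_type="pcie-root",
                   sockets=["pcie-downstream"])
nic_pcie = Device("virtio-net-pcie", plug_type="pcie-downstream")
nic_pci  = Device("virtio-net-pci",  plug_type="pci")

plug(nic_pcie, root_port.sockets[0])   # fits: the types match
```

With this model, a management layer never needs ad hoc compatibility rules: it only compares type strings, and plugging `nic_pci` into the root port's socket fails with a `TypeError`.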
> > For example, a downstream switch port can only go on the
> > internal virtual express bus of a switch.
> > Devices with multiple interfaces actually do exist
> > in the real world - e.g. esata/usb -
>
> I think the orthodox way to model a disk with both eSATA and USB
> connectors would be two separate plugs, where only one of them can be
> used at the same time. Not that I can see why anyone would want to
> model such a device when you can just as well have separate eSATA-only
> and USB-only devices, and use the one you want.
>
> > I never heard of a pci/pci express
> > one but it's not impossible I think.
>
> PCI on one side of the card, PCIe on the other, and a switchable
> backplate? Weird :)
>
> Again, I can't see why we'd want to model this, even if it existed.
>
> Nevertheless, point taken: devices with multiple interfaces of which
> only one can be used at the same time exist, and we can't exclude the
> possibility that we want to model such a device one day.
>
> >> To get that, you have to split the device into a common core and bus
> >> adapters. You compose the core with the PCI adapter to get the PCI
> >> device, with the PCIe adapter to get the PCIe device, and so forth.
> >> I'm not claiming that's the best way to do PCI + PCIe. It's a purely
> >> theoretical approach, concerned only with conceptual cleanliness, not
> >> practical coding difficulties.
> >
> > I don't mind if that's the internal implementation,
> > but I don't think we should expose this split
> > in the user interface.
>
> I'm not so sure.
>
> The current interface munges together all PCIish connectors, and the
> result is a mess: users can't see which device can be plugged into
> which socket. Libvirt needs to know, and it has grown a bunch of
> hardcoded ad hoc rules, which aren't quite right.
>
> With separate types for incompatible plugs and sockets, the "what can
> plug into what" question remains as trivial as it was by the initial
> design.
Yes but, same as in the initial design, it really makes it the user's
problem. So we'd have
	virtio-net-pci-conventional
	virtio-net-pci-express
	virtio-net-pci-integrated
All this while users really just want to say "virtio" (that's the
expert user; what most people want is for the guest to be faster).

> >> What we have now is entirely different: we've overloaded the existing
> >> PCI plug with all the other PCI-related plug types that came with the
> >> PCIe support, so it means pretty much nothing anymore. In particular,
> >> there's no way for libvirt to figure out programmatically whether some
> >> alleged "PCI" device can go into some alleged "PCI" bus. I call that
> >> a mess.
> >
> > There are lots of problems.
> >
> > First, bus type is not the only factor that can limit
> > which devices go where.
> > For example, specific slots might not support hotplug.
>
> We made "can hotplug" a property of the bus. Perhaps it should be a
> property of the slot.

Sure.

> > Another example, all devices in the same pci slot must have the
> > "multifunction" property set.
>
> PCI multifunction devices are simply done wrong in the current code.
>
> I think the orthodox way to model multifunction devices would involve
> composing the functions with a container device, resulting in a
> composite PCI device that can only be plugged as a whole.

It would also presumably involve a new bus which has no basis in
reality, and a new type of device for when it's a function within a
multifunction device.

> Again, this is a conceptually clean approach, unconcerned with
> practical coding difficulties.

And we'd need to grow new interfaces to specify these containers.

> > Second, there's no way to find out what is a valid
> > bus address. For example, with express you can
> > only use slot 0.
>
> Device introspection via QOM should let you enumerate available
> sockets.

Slots would need to become sockets for this to work :)

> > Hotplug is only supported ATM if no two devices
> > share a pci slot.
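[Editorial aside: "slots would need to become sockets" is the crux here. If each slot were a socket object carrying its own properties, both "what is a valid bus address" and "can this slot hotplug" become enumeration instead of hardcoded rules. A hypothetical Python sketch; the bus kinds, slot counts, and hotplug flags below are invented for illustration, not real QEMU behavior:]

```python
# Hypothetical sketch: model slots as sockets so that valid addresses
# and hotplug capability can be enumerated rather than hardcoded.
# Bus kinds, slot counts, and flags are invented for illustration.
class SlotSocket:
    def __init__(self, slot, hotpluggable):
        self.slot = slot
        self.hotpluggable = hotpluggable

def make_bus_sockets(bus_kind):
    if bus_kind == "pcie-root-port":
        # Express: only slot 0 exists behind a root/downstream port.
        return [SlotSocket(slot=0, hotpluggable=True)]
    if bus_kind == "pci":
        # Conventional PCI: many slots; assume (for illustration) that
        # slot 0 is occupied by the chipset and not hotpluggable.
        return [SlotSocket(slot=n, hotpluggable=(n != 0)) for n in range(32)]
    raise ValueError(f"unknown bus kind: {bus_kind}")

def valid_hotplug_slots(bus_kind):
    return [s.slot for s in make_bus_sockets(bus_kind) if s.hotpluggable]
```

A management layer would then ask the bus for its sockets instead of carrying per-bus-type address rules of its own.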
>
> Do you mean "hotplug works only with multifunction off"? If yes, see
> above. If no, please elaborate.

Well, it does kind of work with most guests. The way it works is a
hack, though.

> > Also, there's apparently no way to figure
> > out what kind of bus (or multiple buses) is behind each device.
>
> Again, introspection via QOM should let you enumerate available
> sockets.

Maybe it should, but it doesn't seem to let me.

> > Solution proposed above (separate each device into
> > two parts) only solves the pci versus express issue without
> > addressing any of the other issues.
> > So I'm not sure it's worth the effort.
>
> As I said, I'm not claiming I know the only sane solution to this
> problem. I've only described a solution that stays true to the qdev /
> QOM design as I understand it.

Right. And I don't argue from the implementation point of view.
What I am saying is, encoding everything in a single string isn't a
good user interface. We really should let users say "virtio", detect
that it's connected to PCI on one end and a network on the other, and
instantiate virtio-net-pci.

> True qdev / QOM experts, please help out with advice.

-- 
MST
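[Editorial aside: MST's closing point - let the user say "virtio" and derive the concrete model from what it is connected to - could look roughly like the sketch below. The alias table and the way backend/bus kinds are detected are invented for illustration; QEMU has no such interface:]

```python
# Illustrative sketch of resolving a generic device alias to a concrete
# model from the bus it sits on and the backend it is wired to.
# The table and kinds below are invented, not a real QEMU interface.
ALIAS_TABLE = {
    ("virtio", "net", "pci"): "virtio-net-pci",
    ("virtio", "net", "ccw"): "virtio-net-ccw",
    ("virtio", "blk", "pci"): "virtio-blk-pci",
}

def resolve_alias(alias, backend_kind, bus_kind):
    """Pick the concrete device model for a generic alias."""
    try:
        return ALIAS_TABLE[(alias, backend_kind, bus_kind)]
    except KeyError:
        raise LookupError(f"no {alias} device for a {backend_kind} "
                          f"backend on a {bus_kind} bus")

# The user says "virtio"; it's wired to a network backend on a PCI bus:
print(resolve_alias("virtio", "net", "pci"))   # prints "virtio-net-pci"
```

The point is exactly MST's: the resolution from "virtio" to "virtio-net-pci" happens inside the tool, so the user never has to spell out the bus flavor in a device-name string.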