Re: [ovirt-devel] [vdsm] Infrastructure design for node (host) devices
On Jun 29, 2014, at 16:55, Saggi Mizrahi <smizr...@redhat.com> wrote:

> ----- Original Message -----
> From: "Martin Polednik" <mpole...@redhat.com>
> To: devel@ovirt.org
> Sent: Tuesday, June 24, 2014 1:26:17 PM
> Subject: [ovirt-devel] [vdsm] Infrastructure design for node (host) devices
>
>> Hello,
>>
>> I'm actively working on getting host device passthrough (pci, usb and scsi)
>> exposed in VDSM, but I've encountered growing complexity of this feature.
>>
>> The devices are currently created in the same manner as virtual devices and
>> their reporting is done via the hostDevices list in getCaps. As I
>> implemented usb and scsi devices, the size of this list almost doubled -
>> and that is on a laptop.
>
> There should be a separate verb with the ability to filter by type.

+1

>> A similar problem exists with the devices themselves: they are closely tied
>> to the host, and currently the engine would have to keep their mapping to
>> VMs, reattach loose devices and handle all of this in case of migration.
>
> Migration sounds very complicated, especially at the phase where the VM
> actually starts running on the target host. The hardware state is completely
> different, but the guest OS wouldn't have any idea that happened. So
> detaching before migration and then reattaching on the destination is a
> must, but that could cause issues in the guest. I'd imagine this would be an
> issue when hibernating on one host and waking up on another.

If qemu actually supports this at all, it would need to be very specific for
each device; restoring/setting a concrete HW state is a challenging task.
I would also see it as pin-to-host, and then in specific cases detach/attach
(or SR-IOV's fancy temporary emulated device).

>> I would like to hear your opinion on building something like a host device
>> pool in VDSM. The pool would be populated and periodically updated (to
>> handle hot(un)plugs), and VMs/engine could query it for free/assigned/
>> possibly problematic devices (which could be reattached by the pool). This
>> has the added benefit of requiring fewer libvirt calls, but a bit more
>> complexity and possibly one thread. The persistence of the pool across VDSM
>> restarts could be kept in config or constructed from XML.
>
> I'd much rather VDSM not cache state unless this is absolutely necessary.
> This sounds like something that doesn't need to be queried every 3 seconds,
> so it's best if we just ask libvirt.

Well, unless we try to persist it, a cache doesn't hurt. I don't see a
particular problem in reconstructing the structures on startup.

> I do wonder how that kind of thing can be configured in the VM creation
> phase, as you would sometimes want to just specify a type of device and
> sometimes a specific one. Also, I'd assume there will be a fallback policy
> stating whether the VM should run if said resource is unavailable.
>
>> I'd need new API verbs to allow the engine to communicate with the pool,
>> possibly leaving caps as they are, and the engine could detect the presence
>> of a newer vdsm by the presence of these API verbs.
>
> Again, I think that getting a list of devices filterable by kind/type might
> be better than a real pool. We might want to return whether a device is in
> use (it could also be in use by the host operating system and not just by
> VMs).
>
>> The vmCreate call would remain almost the same, only with the addition of a
>> new device for VMs (where the detach and tracking routine would be
>> communicated with the pool).
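As an illustration of the "just ask libvirt" point above: libvirt can already
enumerate node devices filtered by capability, so a filterable "list host
devices" verb would not need any VDSM-side cache. A minimal sketch (the verb
name and return shape are hypothetical, not VDSM's actual API):

    import libvirt

    # Map the verb's type filter onto libvirt's node-device capability flags.
    _CAP_FLAGS = {
        'pci':  libvirt.VIR_CONNECT_LIST_NODE_DEVICES_CAP_PCI_DEV,
        'usb':  libvirt.VIR_CONNECT_LIST_NODE_DEVICES_CAP_USB_DEV,
        'scsi': libvirt.VIR_CONNECT_LIST_NODE_DEVICES_CAP_SCSI,
    }

    def list_host_devices(conn, dev_type=None):
        """Return node device names, optionally restricted to one capability."""
        flags = _CAP_FLAGS.get(dev_type, 0)
        return [dev.name() for dev in conn.listAllDevices(flags)]

    # e.g. list_host_devices(libvirt.open('qemu:///system'), 'pci')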
Re: [ovirt-devel] [vdsm] Infrastructure design for node (host) devices
----- Original Message -----
From: "Martin Polednik" <mpole...@redhat.com>
To: devel@ovirt.org
Sent: Tuesday, June 24, 2014 12:26:17 PM
Subject: [ovirt-devel] [vdsm] Infrastructure design for node (host) devices

> Hello,

Sorry for the late reply.

> A similar problem exists with the devices themselves: they are closely tied
> to the host, and currently the engine would have to keep their mapping to
> VMs, reattach loose devices and handle all of this in case of migration.

This is the simplest and most consistent solution, so I'd start from here as a
first option. Is your concern the sheer number of (host) devices and the added
traffic to/from Engine, or is there something else?

I was also trying to guess which devices we should expect in the common case,
and whether it is safe for VDSM to filter out some device classes by default;
or, to flip the side, to have whitelists. This could help make migration work
smoothly and without surprises, like a host device being available on host A
but not on host B.

For example, and just for my understanding, what is the intended use case of a
host SCSI device?

This is on top of any filter verbs (which someone else, probably Saggi,
already mentioned, and I like the concept).

> I would like to hear your opinion on building something like a host device
> pool in VDSM. The pool would be populated and periodically updated (to handle
> hot(un)plugs), and VMs/engine could query it for free/assigned/possibly
> problematic devices (which could be reattached by the pool). This has the
> added benefit of requiring fewer libvirt calls, but a bit more complexity and
> possibly one thread. The persistence of the pool across VDSM restarts could
> be kept in config or constructed from XML.

The drawbacks of the added complexity seem to outweigh the gains, and I'm
concerned about migrations in particular, so I tend to dislike the pool
concept. I can be convinced, anyway, if you can outline the design wins of the
pool concept, and especially the pain points this approach is supposed to
solve.

--
Francesco Romani
Red Hat Engineering Virtualization R&D
Phone: 8261328
IRC: fromani
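To make the whitelist idea above concrete, a minimal sketch of what a default
device-class filter could look like on the VDSM side (the class names and the
policy are purely illustrative, not an agreed-upon design):

    # Only device classes listed here would be reported/offered by default;
    # anything else (e.g. scsi) would stay hidden unless explicitly requested.
    ALLOWED_DEVICE_CLASSES = frozenset(['pci', 'usb_device'])

    def filter_devices(devices):
        """Keep only host devices whose capability class is whitelisted."""
        return [d for d in devices if d['capability'] in ALLOWED_DEVICE_CLASSES]

    # filter_devices([{'name': 'pci_0000_00_1f_2', 'capability': 'pci'},
    #                 {'name': 'scsi_0_0_0_0', 'capability': 'scsi'}])
    # -> keeps only the PCI entry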
Re: [ovirt-devel] [vdsm] Infrastructure design for node (host) devices
----- Original Message -----
From: "Martin Polednik" <mpole...@redhat.com>
To: devel@ovirt.org
Sent: Tuesday, June 24, 2014 1:26:17 PM
Subject: [ovirt-devel] [vdsm] Infrastructure design for node (host) devices

> Hello,
>
> I'm actively working on getting host device passthrough (pci, usb and scsi)
> exposed in VDSM, but I've encountered growing complexity of this feature.
>
> The devices are currently created in the same manner as virtual devices and
> their reporting is done via the hostDevices list in getCaps. As I implemented
> usb and scsi devices, the size of this list almost doubled - and that is on a
> laptop.

There should be a separate verb with the ability to filter by type.

> A similar problem exists with the devices themselves: they are closely tied
> to the host, and currently the engine would have to keep their mapping to
> VMs, reattach loose devices and handle all of this in case of migration.

Migration sounds very complicated, especially at the phase where the VM
actually starts running on the target host. The hardware state is completely
different, but the guest OS wouldn't have any idea that happened. So detaching
before migration and then reattaching on the destination is a must, but that
could cause issues in the guest. I'd imagine this would be an issue when
hibernating on one host and waking up on another.

> I would like to hear your opinion on building something like a host device
> pool in VDSM. The pool would be populated and periodically updated (to handle
> hot(un)plugs), and VMs/engine could query it for free/assigned/possibly
> problematic devices (which could be reattached by the pool). This has the
> added benefit of requiring fewer libvirt calls, but a bit more complexity and
> possibly one thread. The persistence of the pool across VDSM restarts could
> be kept in config or constructed from XML.

I'd much rather VDSM not cache state unless this is absolutely necessary.
This sounds like something that doesn't need to be queried every 3 seconds,
so it's best if we just ask libvirt.

I do wonder how that kind of thing can be configured in the VM creation phase,
as you would sometimes want to just specify a type of device and sometimes a
specific one. Also, I'd assume there will be a fallback policy stating whether
the VM should run if said resource is unavailable.

> I'd need new API verbs to allow the engine to communicate with the pool,
> possibly leaving caps as they are, and the engine could detect the presence
> of a newer vdsm by the presence of these API verbs.

Again, I think that getting a list of devices filterable by kind/type might be
better than a real pool. We might want to return whether a device is in use
(it could also be in use by the host operating system and not just by VMs).

> The vmCreate call would remain almost the same, only with the addition of a
> new device for VMs (where the detach and tracking routine would be
> communicated with the pool).
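As an illustration of the "is the device in use?" flag suggested above, one
coarse heuristic for PCI devices is the bound kernel driver: vfio-pci or
pci-stub usually means "assigned to a guest", any other driver means "in use
by the host OS", and no driver means "free". A rough sketch (the helper and
the heuristic are assumptions, not existing VDSM code):

    import os

    PASSTHROUGH_DRIVERS = frozenset(['vfio-pci', 'pci-stub'])

    def pci_device_state(pci_addr):
        """pci_addr like '0000:00:1f.2' -> 'assigned' | 'host' | 'free'."""
        # /sys/.../driver is a symlink to the bound driver, absent if unbound.
        driver_link = '/sys/bus/pci/devices/%s/driver' % pci_addr
        if not os.path.exists(driver_link):
            return 'free'
        driver = os.path.basename(os.readlink(driver_link))
        return 'assigned' if driver in PASSTHROUGH_DRIVERS else 'host'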
Re: [ovirt-devel] [vdsm] Infrastructure design for node (host) devices
----- Original Message -----
From: "Michal Skrivanek" <michal.skriva...@redhat.com>
To: "Martin Polednik" <mpole...@redhat.com>
Cc: devel@ovirt.org
Sent: Wednesday, June 25, 2014 9:15:38 AM
Subject: Re: [ovirt-devel] [vdsm] Infrastructure design for node (host) devices

> On Jun 24, 2014, at 12:26, Martin Polednik <mpole...@redhat.com> wrote:
>
>> Hello,
>>
>> I'm actively working on getting host device passthrough (pci, usb and scsi)
>> exposed in VDSM, but I've encountered growing complexity of this feature.
>>
>> The devices are currently created in the same manner as virtual devices and
>> their reporting is done via the hostDevices list in getCaps. As I
>> implemented usb and scsi devices, the size of this list almost doubled -
>> and that is on a laptop.

To be fair, laptops do have quite a few devices :P

>> A similar problem exists with the devices themselves: they are closely tied
>> to the host, and currently the engine would have to keep their mapping to
>> VMs, reattach loose devices and handle all of this in case of migration.

In general, with host device passthrough the simpler first step would be that
the VM gets pinned to the host if it uses a host device. If we want to make
migration possible, we should whitelist devices or device classes for which it
is not troublesome, but I would start small and have the VM pinned.

>> I would like to hear your opinion on building something like a host device
>> pool in VDSM. The pool would be populated and periodically updated (to
>> handle hot(un)plugs), and VMs/engine could query it for
>> free/assigned/possibly problematic

I'm not sure about making a pool. Having a verb for getting host devices
sounds good, though (especially with the pinning solution, as the engine would
only need to poll when the VM is pinned). How costly is it to list them? It
shouldn't be much costlier than navigating sysfs and a potential libvirt call,
right?

>> devices (which could be reattached by the pool). This has the added benefit
>> of requiring fewer libvirt calls, but a bit more complexity and possibly one
>> thread. The persistence of the pool across VDSM restarts could be kept in
>> config or constructed from XML.
>
> Best would be if we can live without too much persistence. Can we find out
> the actual state of things, including VM mapping, on vdsm startup?

I would actually go for not keeping this data in memory unless it proves
really expensive (as I say above).

>> I'd need new API verbs to allow the engine to communicate with the pool,
>> possibly leaving caps as they are, and the engine could detect the presence
>> of a newer vdsm by the presence of these API verbs.

+1

>> The vmCreate call would remain almost the same, only with the addition of a
>> new device for VMs (where the detach and tracking routine would be
>> communicated with the pool).
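For reference, with the pin-to-host approach the per-VM side stays small: a
passed-through PCI device ultimately becomes a <hostdev> element in the domain
XML that vmCreate hands to libvirt. A rough sketch (the helper name is made
up; only the <hostdev> element itself follows libvirt's schema):

    def pci_hostdev_xml(domain, bus, slot, function):
        """Build the libvirt <hostdev> element for one PCI host device."""
        return (
            "<hostdev mode='subsystem' type='pci' managed='yes'>"
            "<source>"
            "<address domain='0x%04x' bus='0x%02x' slot='0x%02x' function='0x%x'/>"
            "</source>"
            "</hostdev>" % (domain, bus, slot, function)
        )

    # e.g. pci_hostdev_xml(0x0000, 0x00, 0x1f, 0x2)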
Re: [ovirt-devel] [vdsm] Infrastructure design for node (host) devices
On Jun 24, 2014, at 12:26, Martin Polednik <mpole...@redhat.com> wrote:

> Hello,
>
> I'm actively working on getting host device passthrough (pci, usb and scsi)
> exposed in VDSM, but I've encountered growing complexity of this feature.
>
> The devices are currently created in the same manner as virtual devices and
> their reporting is done via the hostDevices list in getCaps. As I implemented
> usb and scsi devices, the size of this list almost doubled - and that is on a
> laptop.
>
> A similar problem exists with the devices themselves: they are closely tied
> to the host, and currently the engine would have to keep their mapping to
> VMs, reattach loose devices and handle all of this in case of migration.
>
> I would like to hear your opinion on building something like a host device
> pool in VDSM. The pool would be populated and periodically updated (to handle
> hot(un)plugs), and VMs/engine could query it for free/assigned/possibly
> problematic devices (which could be reattached by the pool). This has the
> added benefit of requiring fewer libvirt calls, but a bit more complexity and
> possibly one thread. The persistence of the pool across VDSM restarts could
> be kept in config or constructed from XML.

Best would be if we can live without too much persistence. Can we find out the
actual state of things, including VM mapping, on vdsm startup?

> I'd need new API verbs to allow the engine to communicate with the pool,
> possibly leaving caps as they are, and the engine could detect the presence
> of a newer vdsm by the presence of these API verbs.
>
> The vmCreate call would remain almost the same, only with the addition of a
> new device for VMs (where the detach and tracking routine would be
> communicated with the pool).
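On the "can we find out the actual state on vdsm startup?" question: the
mapping is recoverable without persistence by reading the <hostdev> sources
back out of the running domains' XML. A minimal sketch of that idea (the
helper name and return shape are illustrative, not VDSM API):

    import xml.etree.ElementTree as ET
    import libvirt

    def hostdev_mapping(conn):
        """Return {vm_name: [hostdev source address attrs]} for running domains."""
        mapping = {}
        for dom in conn.listAllDomains(libvirt.VIR_CONNECT_LIST_DOMAINS_ACTIVE):
            root = ET.fromstring(dom.XMLDesc(0))
            # Each <hostdev><source><address .../> identifies one host device.
            addrs = [dict(addr.attrib) for addr in
                     root.findall('./devices/hostdev/source/address')]
            if addrs:
                mapping[dom.name()] = addrs
        return mapping

    # e.g. hostdev_mapping(libvirt.open('qemu:///system'))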