Boris-Michel Deschenes wrote:
> John,
>
> Sorry for my late response.
>
> It would be great to collaborate. Like I said, I prefer to keep the libvirt
> layer as it works great with OpenStack and many other tools (collectd,
> virt-manager, etc.); the virsh tool is also very useful for us.
>
> You say:
> -----------
> We have GPU passthrough working with NVIDIA GPUs in Xen 4.1.2, if I recall
> correctly. We don't yet have a stable Xen + libvirt installation working,
> but we're looking at it. Perhaps it would be worth collaborating since it
> sounds like this could be a win for both of us.
> -----------
> I have Jim Fehlig in CC since this could be of interest to him.
>
> We managed to get GPU passthrough of NVIDIA cards working using Xen 4.1.2,
> but ONLY with xenapi (actually the whole XCP toolstack). With libvirt/Xen
> 4.1.2, and even libvirt/Xen 4.1.3, I only manage to pass through Radeon
> GPUs. The reason could be:
>
> 1. The inability to pass the gfx_passthru parameter through libvirt (IIRC
> this parameter passes the PCI device as the main VGA card and not a second
> one).
> 2. Bad FLR reset support (or other low-level PCI function) on the NVIDIA
> boards.

I've noticed this issue with some Broadcom multifunction NICs. No FLR, so
the fallback is a secondary bus reset, which is problematic if another
function is being used by a different VM.

> 3. Something else entirely.
>
> Anyway, like I said, this GPU passthrough of NVIDIA cards worked well with
> XCP using xenapi but not with libvirt/Xen.

Hmm, would be nice to get that fixed. To date, I haven't tried GPU
passthrough with Xen, so I'm not familiar with the issues.

> Now, as for the libvirt/Xen setup we have, I don't know if I would call it
> stable, but it does the job as a POC cloud and is actually used by real
> people with real GPU needs (for example, developing on OpenCL 1.2). The
> main thing is that it integrates seamlessly with OpenStack (because of
> libvirt), and with instance_type_extra_specs you can actually add a couple
> of these "special" nodes to an existing plain KVM cloud and they will
> receive the instances requesting GPUs without any problem.
>
> The setup (this only refers to compute nodes, as controller nodes are
> unmodified):
>
> 1. Install CentOS 6.2 and make your own project Zeus (turning a CentOS
> install into Xen):
> http://www.howtoforge.com/virtualization-with-xen-on-centos-6.2-x86_64-paravirtualization-and-hardware-virtualization
> (first page only, and skip the bridge setup, as openstack-nova-compute does
> this at startup). You end up with a Xen hypervisor with libvirt; the
> libvirt patch is actually a single-line config change IIRC. Pretty
> straightforward.
>
> 2. Install openstack-nova from EPEL (so all this refers only to Essex,
> OpenStack 2012.1).
>
> 3. Configure the compute node accordingly (libvirt_type=xen).
>
> That's the first part. At this point, you can spawn a VM and attach a GPU
> manually with:
>
> virsh nodedev-dettach pci_0000_02_00_1
> (edit the VM's nova libvirt.xml to add a PCI node device definition like
> this:
> http://docs.fedoraproject.org/en-US/Fedora/13/html/Virtualization_Guide/chap-Virtualization-PCI_passthrough.html
> )
> virsh define libvirt.xml
> virsh start instance-0000000x
>
> Now, this is all manual, and we wish to automate it in OpenStack. So this
> is what I've done; I can currently launch VMs in my cloud and the
> passthrough occurs without any intervention.
>
> These files were modified from an original Essex installation to make this
> possible:
>
> (on the controller)
> - Create a g1.small instance_type with {'free_gpus': '1'} as
> instance_type_extra_specs.
> - Select the compute_filter filter to enforce extra_specs in scheduling
> (also, the host_passes function of the filter is slightly modified so that
> it reads key>=value instead of key=value... free_gpus>=1 is good; it does
> not need to be strictly equal to 1).

I think this has already been done for you in Folsom via the
ComputeCapabilitiesFilter and Jinwoo Suh's addition of
instance_type_extra_specs operators. See commit 90f77d71.

> (on the compute node)
> - nova/virt/libvirt/gpu.py: a new file that contains functions like
> detach_all_gpus and get_free_gpus, simple stuff

Have you considered pushing this upstream?

> using virsh and lspci.
> - nova/virt/libvirt/connection.py:
>   - calls gpu.detach_all_gpus on startup (virsh nodedev-dettach)
>   - builds the VM's libvirt.xml as normal but also adds the PCI nodedev
>     definition
>   - advertises the free_gpus capability so that the scheduler gets it
>     through host_state calls
>
> That's about it. With that we get:
>
> 1. compute nodes that detach all GPUs on startup
> 2. compute nodes that advertise the number of free GPUs to the scheduler
> 3. compute nodes that are able to build the VM's libvirt.xml with a valid,
> free GPU definition when a VM is launched
> 4. a controller that runs a scheduler that knows where to send VMs
> (free_gpus >= 1)
>
> It does the trick for now. With a Radeon 6950 I get 100% success: I spawn a
> VM and in 20 seconds I get a Windows 7 instance with a real GPU available
> through RDC.
>
> I'll try to find out what the problem is regarding NVIDIA passthrough. If I
> do, I'll be sure to inform Jim Fehlig so that we can work this into
> libvirt.

Yes, please do.

> All this is in OpenStack Essex (2012.1), so I will probably never send the
> code upstream, as most of this has changed in Folsom (for example, the
> extra_specs handling is already different in Folsom), but if you want to
> have a look, let me know.

As mentioned above, I think that one has already been done for you. Seems
you just need to work on getting your nova/virt/libvirt/gpu.py addition
upstream.

Regards,
Jim

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp
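P.S. The gpu.py helpers discussed in the thread are not shown anywhere above; the following is a minimal sketch of what such helpers might look like, based only on the description in the mail (find GPUs with lspci, detach them with virsh nodedev-dettach, build the hostdev element for the VM's libvirt.xml). All function names and details here are illustrative assumptions, not the actual Essex code.

```python
# Sketch of nova/virt/libvirt/gpu.py-style helpers, as described in the
# thread. Illustrative only; the real file was never posted.
import subprocess


def pci_to_nodedev(addr):
    """Convert an lspci-style address '0000:02:00.1' into the libvirt
    node-device name 'pci_0000_02_00_1' that virsh nodedev-dettach expects."""
    domain, bus, slotfunc = addr.split(":")
    slot, func = slotfunc.split(".")
    return "pci_%s_%s_%s_%s" % (domain, bus, slot, func)


def list_vga_devices(lspci_output):
    """Parse `lspci -D` output and return the addresses of VGA devices."""
    devices = []
    for line in lspci_output.splitlines():
        if "VGA compatible controller" in line:
            devices.append(line.split()[0])
    return devices


def build_hostdev_xml(addr):
    """Render the <hostdev> fragment appended to the VM's libvirt.xml,
    in the shape shown in the Fedora PCI-passthrough guide linked above."""
    domain, bus, slotfunc = addr.split(":")
    slot, func = slotfunc.split(".")
    return (
        "<hostdev mode='subsystem' type='pci' managed='yes'>\n"
        "  <source>\n"
        "    <address domain='0x%s' bus='0x%s' slot='0x%s' function='0x%s'/>\n"
        "  </source>\n"
        "</hostdev>" % (domain, bus, slot, func)
    )


def detach_all_gpus(run=subprocess.check_call):
    """Detach every VGA device from the host, as done on compute-node
    startup (shells out to lspci and virsh)."""
    out = subprocess.check_output(["lspci", "-D"]).decode()
    for addr in list_vga_devices(out):
        run(["virsh", "nodedev-dettach", pci_to_nodedev(addr)])
```

Note that "nodedev-dettach" (double t) is the spelling virsh used at the time; newer libvirt also accepts "nodedev-detach".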