On Wed, Jul 06, 2016 at 12:32:26PM -0400, Jonathan D. Proulx wrote:
:
:I do have an odd remaining issue where I can run CUDA jobs in the VM
:but snapshots fail and after pause (for snapshotting) the PCI device
:can't be reattached (which is where I think it deletes the snapshot
:it took). Got the same issue with 3.16 and 4.4 kernels.
:
:Not very well categorized yet, but I'm hoping it's because the VM I
:was hacking on had its libvirt.xml written out with the older qemu
:maybe? It had been through a couple of reboots of the physical system
:though.
:
:Currently building a fresh instance and bashing more keys...

After an ugly bout of bashing I've solved my failing snapshot issue,
which I'll post here in hopes of saving someone else the trouble.

Short version:

  add "/dev/vfio/vfio rw," to /etc/apparmor.d/abstractions/libvirt-qemu
  add "ulimit -l unlimited" to /etc/init/libvirt-bin.conf

Longer version:

What was happening:

 * send snapshot request
 * instance pauses while snapshot is pending
 * instance attempts to resume
 * fails to reattach PCI device
   * nova-compute.log
     Exception during message handling: internal error: unable to execute QEMU command 'device_add': Device initialization failed
   * qemu/<id>.log
     vfio: failed to open /dev/vfio/vfio: Permission denied
     vfio: failed to setup container for group 48
     vfio: failed to get group 48
 * snapshot disappears
 * instance resumes but without the passed-through device (a hard
   reboot reattaches it)

Seeing "Permission denied" I thought this would be an easy fix, but the
device node is already world read/write, so plain file permissions
aren't the problem:

  # ls -l /dev/vfio/vfio
  crw-rw-rw- 1 root root 10, 196 Jul  6 14:05 /dev/vfio/vfio

So I guessed I was in apparmor hell. I tried adding
"/dev/vfio/vfio rw," to /etc/apparmor.d/abstractions/libvirt-qemu,
rebooting the hypervisor, and trying again, which got me a different
libvirt error set:

  VFIO_MAP_DMA: -12
  vfio_dma_map(0x5633a5fa69b0, 0x0, 0xa0000, 0x7f4e7be00000) = -12 (Cannot allocate memory)

with kern.log (and thus dmesg) showing:

  vfio_pin_pages: RLIMIT_MEMLOCK (65536) exceeded

Getting rid of this one required inserting 'ulimit -l unlimited' into
/etc/init/libvirt-bin.conf in the 'script' section:

  <previous bits excluded>
  script
      [ -r /etc/default/libvirt-bin ] && . /etc/default/libvirt-bin
      ulimit -l unlimited
      exec /usr/sbin/libvirtd $libvirtd_opts
  end script

-Jon

_______________________________________________
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
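
For anyone who wants to check whether a hypervisor already has both changes from the short version in place, something like the following may help. It is only a sketch: the helper names has_vfio_rule and memlock_soft_limit are made up for illustration, and it assumes the limits file has the usual /proc/<pid>/limits layout, where the soft limit is the fourth whitespace-separated field on the "Max locked memory" line.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of
# libvirt, nova, or apparmor.

# Return 0 if the given AppArmor abstraction file already contains a
# rule granting rw access to /dev/vfio/vfio.
# $1: path to the file, e.g. /etc/apparmor.d/abstractions/libvirt-qemu
has_vfio_rule() {
    grep -Eq '^[[:space:]]*/dev/vfio/vfio[[:space:]]+rw,' "$1"
}

# Print the soft "Max locked memory" limit from a limits file.
# $1: path to the file, e.g. /proc/<pid>/limits for the libvirtd process
# Assumes the soft limit is the 4th field on that line.
memlock_soft_limit() {
    awk '/^Max locked memory/ { print $4 }' "$1"
}
```

Possible usage, assuming libvirtd is running and pgrep is available:

  has_vfio_rule /etc/apparmor.d/abstractions/libvirt-qemu || echo "vfio rule missing"
  memlock_soft_limit "/proc/$(pgrep -o -x libvirtd)/limits"

With the ulimit change applied and libvirtd restarted, I would expect the second command to print "unlimited" rather than 65536.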