This bug was fixed in the package qemu - 2.0.0+dfsg-2ubuntu1.25 --------------- qemu (2.0.0+dfsg-2ubuntu1.25) trusty; urgency=medium
[Kai Storbeck] * backport patch to fix guest hangs after live migration (LP: #1297218) -- Serge Hallyn <serge.hal...@ubuntu.com> Fri, 01 Jul 2016 14:25:20 -0500 ** Changed in: qemu (Ubuntu Trusty) Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1297218 Title: guest hangs after live migration due to tsc jump Status in QEMU: New Status in glusterfs package in Ubuntu: Invalid Status in qemu package in Ubuntu: Fix Released Status in glusterfs source package in Trusty: Confirmed Status in qemu source package in Trusty: Fix Released Bug description: ===================================== SRU Justification: 1. Impact: guests hang after live migration with 100% cpu 2. Upstream fix: a set of four patches fix this upstream 3. Stable fix: we have a backport of the four patches into a single patch. 4. Test case: try a set of migrations of different VMS (it is unfortunately not 100% reproducible) 5. Regression potential: the patch is not trivial, however the lp:qa-regression-tests testsuite passed 100% with this package. ===================================== We have two identical Ubuntu servers running libvirt/kvm/qemu, sharing a Gluster filesystem. Guests can be live migrated between them. However, live migration often leads to the guest being stuck at 100% for a while. In that case, the dmesg output for such a guest will show (once it recovers): Clocksource tsc unstable (delta = 662463064082 ns). In this particular example, a guest was migrated and only after 11 minutes (662 seconds) did it become responsive again. It seems that newly booted guests doe not suffer from this problem, these can be migrated back and forth at will. After a day or so, the problem becomes apparent. It also seems that migrating from server A to server B causes much more problems than going from B back to A. If necessary, I can do more measurements to qualify these observations. The VM servers run Ubuntu 13.04 with these packages: Kernel: 3.8.0-35-generic x86_64 Libvirt: 1.0.2 Qemu: 1.4.0 Gluster-fs: 3.4.2 (libvirt access the images via the filesystem, not using libgfapi yet as the Ubuntu libvirt is not linked against libgfapi). The interconnect between both machines (both for migration and gluster) is 10GbE. Both servers are synced to NTP and well within 1ms form one another. Guests are either Ubuntu 13.04 or 13.10. On the guests, the current_clocksource is kvm-clock. The XML definition of the guests only contains: <clock offset='utc'/> Now as far as I've read in the documentation of kvm-clock, it specifically supports live migrations, so I'm a bit surprised at these problems. There isn't all that much information to find on these issue, although I have found postings by others that seem to have run into the same issues, but without a solution. --- ApportVersion: 2.14.1-0ubuntu3 Architecture: amd64 DistroRelease: Ubuntu 14.04 Package: libvirt (not installed) ProcCmdline: BOOT_IMAGE=/boot/vmlinuz-3.13.0-24-generic root=UUID=1b0c3c6d-a9b8-4e84-b076-117ae267d178 ro console=ttyS1,115200n8 BOOTIF=01-00-25-90-75-b5-c8 ProcVersionSignature: Ubuntu 3.13.0-24.47-generic 3.13.9 Tags: trusty apparmor apparmor apparmor apparmor apparmor Uname: Linux 3.13.0-24-generic x86_64 UpgradeStatus: No upgrade log present (probably fresh install) UserGroups: _MarkForUpload: True modified.conffile..etc.default.libvirt.bin: [modified] modified.conffile..etc.libvirt.libvirtd.conf: [modified] modified.conffile..etc.libvirt.qemu.conf: [modified] modified.conffile..etc.libvirt.qemu.networks.default.xml: [deleted] mtime.conffile..etc.default.libvirt.bin: 2014-05-12T19:07:40.020662 mtime.conffile..etc.libvirt.libvirtd.conf: 2014-05-13T14:40:25.894837 mtime.conffile..etc.libvirt.qemu.conf: 2014-05-12T18:58:27.885506 To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1297218/+subscriptions