I've repeated the experiment without any shared storage, so that eliminates GlusterFS as a suspect.

  server-a# virsh migrate --live --persistent --undefinesource --copy-storage-inc guest qemu+tls://server-b/system

Result: After about a week of uptime, the guest froze solid for 27 seconds after the migration. I know the freeze happened after the migration because the guest was already running on the destination server, using up a full core, and was no longer present on the originating server. CPU usage went back to normal once the guest became responsive again.

Just before the migration, NTP was locked to well within 100us. Right after the machine became responsive again, the NTP status shows that the machine simply lost more than 27 seconds (ntpq reports the offset in milliseconds):

  root@guest:~# ntpq -p
       remote           refid      st t when poll reach   delay   offset  jitter
  ==============================================================================
  *cl0             xx.xx.xx.xx      3 u   15   16  377    0.457  27388.3   0.100
   cl1             xx.xx.xx.xx      3 u   13   16  377    0.429  27388.4   0.178

  root@guest:~# uptime
   16:03:30 up 8 days, 23:45,  1 user,  load average: 0.02, 0.02, 0.05

During these 27 seconds, the guest did not respond to any network activity or on the (virtual) console. There is no mention of clock jumps or anything else in dmesg this time.

Note that I have now reproduced this on two different pairs of machines: our original KVM cluster, and two compute nodes (different hardware) used to test this with a supported Ubuntu release.

-- 
https://bugs.launchpad.net/bugs/1297218

Title:
  guest hangs after live migration due to tsc jump