I can answer some of the questions, though it's been 3 months or so since I looked into it. I ended up disabling kvmclock on the qemu command line and moving on. I saw the problem with CentOS 6.5 and Ubuntu 12.04 guests. Leaving the guest at the BIOS CLI or in PXE did not reproduce the issue. I didn't test a range of qemu versions, but I can say that it did occur on 1.7.0 and 1.6.1, with the host running kernel 3.10 or 3.12. The CPUs are Intel E5-2650.
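For anyone who wants to apply the same workaround through libvirt instead of editing the command line by hand: the timer element quoted in the command-line comparison in this message lives under the domain's <clock> element. Here is a minimal sketch of adding it with Python's standard xml.etree.ElementTree. The helper name disable_kvmclock and the sample domain XML are made up for illustration and are not taken from the affected VM.

```python
import xml.etree.ElementTree as ET


def disable_kvmclock(domain_xml: str) -> str:
    """Return domain XML with <timer name='kvmclock' present='no'/> under <clock>."""
    root = ET.fromstring(domain_xml)
    clock = root.find("clock")
    if clock is None:
        # Assumption: fall back to a UTC clock, in line with "-rtc base=utc"
        # in the command lines from this message.
        clock = ET.SubElement(root, "clock", {"offset": "utc"})
    for timer in clock.findall("timer"):
        if timer.get("name") == "kvmclock":
            # A kvmclock timer entry already exists: just switch it off.
            timer.set("present", "no")
            break
    else:
        # No kvmclock entry yet: add one.
        ET.SubElement(clock, "timer", {"name": "kvmclock", "present": "no"})
    return ET.tostring(root, encoding="unicode")


# Hypothetical, stripped-down domain definition used only for the demo.
SAMPLE = """<domain type='kvm'>
  <name>example-vm</name>
  <clock offset='utc'/>
</domain>"""

if __name__ == "__main__":
    print(disable_kvmclock(SAMPLE))
```

The helper is idempotent, so running it on XML that already carries the element leaves a single timer entry. The edited XML would still have to be redefined in libvirt and the guest restarted before the qemu command line changes.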
As mentioned by others, the way to reproduce it is to launch the VM, wait about an hour, and then try to migrate it.

Here are example qemu command lines (as generated by libvirt, the only difference being "<timer name='kvmclock' present='no'/>"):

Broken:

/usr/bin/qemu-system-x86_64 -machine accel=kvm -name i-382-5388-VM -S \
  -machine pc-i440fx-1.7,accel=kvm,usb=off \
  -m 4096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 \
  -uuid ced153e7-63d3-4fca-a786-964c9755f0de \
  -no-user-config -nodefaults \
  -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/i-382-5388-VM.monitor,server,nowait \
  -mon chardev=charmonitor,id=monitor,mode=control \
  -rtc base=utc -no-shutdown \
  -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
  -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 \
  -drive file=/dev/md/volumes/82163218-7a4a-422c-a667-11f723ec7a1d,if=none,id=drive-virtio-disk0,format=raw,serial=82163218-7a4a-422c-a667-11f723ec7a1d,cache=none \
  -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 \
  -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,cache=none \
  -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1 \
  -netdev tap,fd=29,id=hostnet0,vhost=on,vhostfd=30 \
  -device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:2d:5d:00:05,bus=pci.0,addr=0x3 \
  -chardev pty,id=charserial0 \
  -device isa-serial,chardev=charserial0,id=serial0 \
  -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/i-382-5388-VM.agent,server,nowait \
  -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=i-382-5388-VM.vport \
  -device usb-tablet,id=input0 \
  -vnc 0.0.0.0:0 \
  -device cirrus-vga,id=video0,bus=pci.0,addr=0x2

Working:

/usr/bin/qemu-system-x86_64 -machine accel=kvm -name i-382-5388-VM -S \
  -machine pc-i440fx-1.7,accel=kvm,usb=off \
  -cpu qemu64,-kvmclock \
  -m 4096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 \
  -uuid ced153e7-63d3-4fca-a786-964c9755f0de \
  -no-user-config -nodefaults \
  -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/i-382-5388-VM.monitor,server,nowait \
  -mon chardev=charmonitor,id=monitor,mode=control \
  -rtc base=utc -no-shutdown \
  -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
  -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 \
  -drive file=/dev/md/volumes/82163218-7a4a-422c-a667-11f723ec7a1d,if=none,id=drive-virtio-disk0,format=raw,serial=82163218-7a4a-422c-a667-11f723ec7a1d,cache=none \
  -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2 \
  -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,cache=none \
  -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1 \
  -netdev tap,fd=29,id=hostnet0,vhost=on,vhostfd=30 \
  -device virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:2d:5d:00:05,bus=pci.0,addr=0x3 \
  -chardev pty,id=charserial0 \
  -device isa-serial,chardev=charserial0,id=serial0 \
  -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/i-382-5388-VM.agent,server,nowait \
  -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=i-382-5388-VM.vport \
  -device usb-tablet,id=input0 \
  -vnc 0.0.0.0:0 \
  -device cirrus-vga,id=video0,bus=pci.0,addr=0x2

On Tue, Apr 15, 2014 at 1:59 AM, Dr. David Alan Gilbert <dgilb...@redhat.com> wrote:
> * Marcus (shadow...@gmail.com) wrote:
>> Dang, I was hoping some ground was being made on this.
>
> Can you answer the same questions I asked Marcin?
> What's the latest version of QEMU you've seen this on, what CPU are you
> using, what guest OS and what's your QEMU command line?
>
> Dave
>
>>
>> On Wed, Apr 2, 2014 at 11:05 AM, Marcin Gibuła <m.gib...@beyond.pl> wrote:
>> >>> Yes, that's where it gets weird. I've never seen this on fresh VM.
>> >>> It needs to be idle for couple of hours at least. And even then it
>> >>> doesn't always hang.
>> >>
>> >>
>> >> So your OS is just sitting at a text console, running nothing special?
>> >> When you reboot after the migration what's the last thing you see
>> >> in the guests logs? Is there anything from after the migration?
>> >
>> >
>> > Yes, it's completely idle. After reboot there is nothing in logs. I've
>> > dumped memory of one of hanged test VMs and found kernel message buffer.
>> > The last entries were:
>> >
>> >
>> > init: failsafe main process (659) killed by TERM signal
>> > init: plymouth-upstart-bridge main process (651) killed by TERM signal
>> >
>> > <migration goes here, guest hangs>
>> >
>> > Clocksource tsc unstable (delta = 470666274 ns)
>> >
>> > <inject-nmi to test>
>> >
>> > Uhhuh. NMI received for unknown reason 30 on CPU 0.
>> > Do you have a strange power saving mode enabled?I:
>> >
>> > Dazed and confused, but trying to continue
>> > Uhhuh. NMI received for unknown reason 20 on CPU 0.
>> > Do you have a strange power saving mode enabled?I:
>> >
>> > Dazed and confused, but trying to continue
>> > <0>Dazed and confused, but trying to continue
>> >
>> >
>> >
>> > I've tried to disassemble where VM kernel (3.8.something from Ubuntu) is
>> > spinning (using qemu-monitor, registers info and symbols from guest kernel)
>> > and it was loop inside __run_timers function from kernel/timer.c:
>> >
>> > while (time_after_eq(jiffies, base->timer_jiffies)) {
>> >     ...
>> > }
>> >
>> > However my disassembly and qemu debugging skills are limited, would it help
>> > if I dump memory of broken VM and send it you somehow?
>> >
>> > --
>> > mg
>> >
>> >
>> >
>> > --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
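
A side note on the __run_timers loop quoted above, in case it helps anyone reading along. The kernel's time_after_eq(a, b) is a wraparound-safe comparison, defined as ((long)((a) - (b)) >= 0). The sketch below models it in Python and counts how many passes such a loop makes. It assumes a 64-bit unsigned long and assumes that each pass advances base->timer_jiffies by exactly one, as I believe kernels of that era do. The function catch_up_iterations is a made-up name for illustration. This only shows how the loop's cost scales with the gap between the two counters and is not a diagnosis of the hang.

```python
BITS = 64                # assumption: 64-bit unsigned long, as on x86_64
MASK = (1 << BITS) - 1


def time_after_eq(a: int, b: int) -> bool:
    """Model of the kernel macro ((long)((a) - (b)) >= 0)."""
    # The unsigned difference, read as a signed long, is non-negative
    # exactly when it falls in the lower half of the value range.
    return ((a - b) & MASK) < (1 << (BITS - 1))


def catch_up_iterations(jiffies: int, timer_jiffies: int) -> int:
    """Count loop passes, assuming timer_jiffies advances by one per pass."""
    passes = 0
    while time_after_eq(jiffies, timer_jiffies):
        timer_jiffies = (timer_jiffies + 1) & MASK
        passes += 1
    return passes


if __name__ == "__main__":
    # Small gap: the loop ends after a handful of passes.
    print(catch_up_iterations(1000, 990))
    # The comparison still holds across counter wraparound.
    print(time_after_eq(5, MASK - 5))
```

Under those assumptions the number of passes equals the gap between jiffies and timer_jiffies plus one, so a large apparent jump in jiffies turns into a correspondingly long run of passes.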