I can answer some of the questions. It's been 3 months or so since I
looked into it. I ended up disabling kvmclock from the qemu command
line and moving on. I saw it with CentOS 6.5 and Ubuntu 12.04 guests.
Sending the guest to the BIOS CLI or PXE would not reproduce the
issue. I didn't attempt an array of qemu versions, but I can say that
it did occur on 1.7.0 and 1.6.1, with the host running kernel 3.10 or
3.12. The CPUs are Intel E5-2650.

As mentioned by others, the way to reproduce it is to launch the VM,
wait about an hour, and then try to migrate it.

Here are example qemu command lines (as generated by libvirt, the only
difference being "<timer name='kvmclock' present='no'/>"):

Broken:
/usr/bin/qemu-system-x86_64 -machine accel=kvm -name i-382-5388-VM -S
-machine pc-i440fx-1.7,accel=kvm,usb=off -m 4096 -realtime mlock=off
-smp 2,sockets=2,cores=1,threads=1 -uuid
ced153e7-63d3-4fca-a786-964c9755f0de -no-user-config -nodefaults
-chardev 
socket,id=charmonitor,path=/var/lib/libvirt/qemu/i-382-5388-VM.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
-no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive
file=/dev/md/volumes/82163218-7a4a-422c-a667-11f723ec7a1d,if=none,id=drive-virtio-disk0,format=raw,serial=82163218-7a4a-422c-a667-11f723ec7a1d,cache=none
-device 
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2
-drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,cache=none
-device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1
-netdev tap,fd=29,id=hostnet0,vhost=on,vhostfd=30 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:2d:5d:00:05,bus=pci.0,addr=0x3
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -chardev
socket,id=charchannel0,path=/var/lib/libvirt/qemu/i-382-5388-VM.agent,server,nowait
-device 
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=i-382-5388-VM.vport
-device usb-tablet,id=input0 -vnc 0.0.0.0:0 -device
cirrus-vga,id=video0,bus=pci.0,addr=0x2

Working:
/usr/bin/qemu-system-x86_64 -machine accel=kvm -name i-382-5388-VM -S
-machine pc-i440fx-1.7,accel=kvm,usb=off -cpu qemu64,-kvmclock -m 4096
-realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid
ced153e7-63d3-4fca-a786-964c9755f0de -no-user-config -nodefaults
-chardev 
socket,id=charmonitor,path=/var/lib/libvirt/qemu/i-382-5388-VM.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc
-no-shutdown -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive
file=/dev/md/volumes/82163218-7a4a-422c-a667-11f723ec7a1d,if=none,id=drive-virtio-disk0,format=raw,serial=82163218-7a4a-422c-a667-11f723ec7a1d,cache=none
-device 
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=2
-drive if=none,id=drive-ide0-1-0,readonly=on,format=raw,cache=none
-device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1
-netdev tap,fd=29,id=hostnet0,vhost=on,vhostfd=30 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=02:00:2d:5d:00:05,bus=pci.0,addr=0x3
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -chardev
socket,id=charchannel0,path=/var/lib/libvirt/qemu/i-382-5388-VM.agent,server,nowait
-device 
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=i-382-5388-VM.vport
-device usb-tablet,id=input0 -vnc 0.0.0.0:0 -device
cirrus-vga,id=video0,bus=pci.0,addr=0x2
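
For anyone who wants to double-check that the two command lines differ
only in the -cpu argument, here is a rough sketch in Python that splits
both into tokens and prints what was added or removed. The helper name
arg_diff is just something I made up for illustration, and the two
strings are shortened to the first few options of the command lines
above; paste in the full ones to compare everything.

```python
import difflib
import shlex


def arg_diff(cmd_a, cmd_b):
    """Return the tokens that differ between two command lines.

    Tokens only present in cmd_a are prefixed with '- ',
    tokens only present in cmd_b are prefixed with '+ '.
    """
    tokens_a = shlex.split(cmd_a)
    tokens_b = shlex.split(cmd_b)
    # ndiff also emits unchanged ('  ') and hint ('? ') lines; drop those.
    return [line for line in difflib.ndiff(tokens_a, tokens_b)
            if line[0] in "+-"]


# Shortened copies of the command lines above (assumption: truncated
# after "-realtime mlock=off" to keep the example small).
broken = ("/usr/bin/qemu-system-x86_64 -machine accel=kvm "
          "-name i-382-5388-VM -S "
          "-machine pc-i440fx-1.7,accel=kvm,usb=off "
          "-m 4096 -realtime mlock=off")
working = ("/usr/bin/qemu-system-x86_64 -machine accel=kvm "
           "-name i-382-5388-VM -S "
           "-machine pc-i440fx-1.7,accel=kvm,usb=off "
           "-cpu qemu64,-kvmclock "
           "-m 4096 -realtime mlock=off")

for line in arg_diff(broken, working):
    print(line)
```

With the shortened strings this should report only the added -cpu
option and its qemu64,-kvmclock value.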

On Tue, Apr 15, 2014 at 1:59 AM, Dr. David Alan Gilbert
<dgilb...@redhat.com> wrote:
> * Marcus (shadow...@gmail.com) wrote:
>> Dang, I was hoping some ground was being made on this.
>
> Can you answer the same questions I asked Marcin?
> What's the latest version of QEMU you've seen this on, what CPU are you
> using, what guest OS and what's your QEMU command line?
>
> Dave
>
>>
>> On Wed, Apr 2, 2014 at 11:05 AM, Marcin Gibuła <m.gib...@beyond.pl> wrote:
>> >>> Yes, that's where it gets weird. I've never seen this on fresh VM.
>> >>> It needs to be idle for couple of hours at least. And even then it
>> >>> doesn't always hang.
>> >>
>> >>
>> >> So your OS is just sitting at a text console, running nothing special?
>> >> When you reboot after the migration what's the last thing you see
>> >> in the guests logs? Is there anything from after the migration?
>> >
>> >
>> > Yes, it's completely idle. After reboot there is nothing in logs. I've
>> > dumped memory of one of hanged test VMs and found kernel message buffer.
>> > The last entries were:
>> >
>> >
>> > init: failsafe main process (659) killed by TERM signal
>> > init: plymouth-upstart-bridge main process (651) killed by TERM signal
>> >
>> > <migration goes here, guest hangs>
>> >
>> > Clocksource tsc unstable (delta = 470666274 ns)
>> >
>> > <inject-nmi to test>
>> >
>> > Uhhuh. NMI received for unknown reason 30 on CPU 0.
>> > Do you have a strange power saving mode enabled?I:
>> >
>> > Dazed and confused, but trying to continue
>> > Uhhuh. NMI received for unknown reason 20 on CPU 0.
>> > Do you have a strange power saving mode enabled?I:
>> >
>> > Dazed and confused, but trying to continue
>> > <0>Dazed and confused, but trying to continue
>> >
>> >
>> >
>> > I've tried to disassemble where VM kernel (3.8.something from Ubuntu) is
>> > spinning (using qemu-monitor, registers info and symbols from guest kernel)
>> > and it was loop inside __run_timers function from kernel/timer.c:
>> >
>> > while (time_after_eq(jiffies, base->timer_jiffies)) {
>> >   ...
>> > }
>> >
>> > However my disassembly and qemu debugging skills are limited, would it help
>> > if I dump memory of broken VM and send it you somehow?
>> >
>> > --
>> > mg
>> >
>> >
>> >
>>
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
