On Mon, Apr 3, 2017 at 9:11 PM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
> On Fri, Mar 31, 2017 at 02:12:36PM -0600, Chris Friesen wrote:
>> I'm running into an issue with live-migrating a guest from a host running
>> qemu-kvm-ev 2.3.0-31 to a host running qemu-kvm-ev 2.6.0-27.1.  This is a
>> libvirt-tunnelled migration, in the context of upgrading an OpenStack
>> install to newer software.  The source host is running CentOS 7.2.1511,
>> while the dest host is running CentOS 7.3.1611.
>>
>> I'll include the qemu commandlines for the source/dest at the bottom.
>>
>> Initially we have a bunch of guests running on compute-2 (which is running
>> qemu-kvm-ev 2.3.0).  We then started live-migrating them one at a time to
>> compute-0 (which is running qemu-kvm-ev 2.6.0).  Three of them migrated
>> successfully.  The fourth (which was essentially identical in configuration
>> to the first three) failed, as per the following logs in
>> /var/log/libvirt/qemu/instance-0000000e.log:
>>
>>
>> 2017-03-29T06:38:37.886940Z qemu-kvm: VQ 2 size 0x80 < last_avail_idx 0x47b
>> - used_idx 0x47c
>> 2017-03-29T06:38:37.886974Z qemu-kvm: error while loading state for instance
>> 0x0 of device '0000:00:07.0/virtio-balloon'
>> 2017-03-29T06:38:37.888684Z qemu-kvm: load of migration failed: Operation
>> not permitted
>> 2017-03-29 06:38:37.896+0000: shutting down
>>
>>
>> Does anyone know of an existing bug report covering this issue?  (I took a
>> look and didn't see anything obviously related.)
>
> This is the virtio-balloon device.  If you remove the device the live
> migration should work reliably.
>
> Alternatively, you can temporarily rmmod virtio_balloon inside the guest
> for live migration.  After migration you can modprobe virtio_balloon
> again.
>
> last_avail_idx 0x47b with used_idx 0x47c is an invalid device state.
> I've diffed qemu-kvm-ev 2.6.0-27.1 hw/virtio/virtio-balloon.c against
> qemu.git/master and do not see an obvious bug.  I also compared
> qemu-kvm-ev 2.3.0-31 with qemu-kvm-ev 2.6.0-27.1.

The device likely got into the invalid state as part of a previous
migration to an unfixed QEMU. I second Stefan's suggestion to
temporarily remove the device or unload the driver.
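
For anyone reading along, the arithmetic behind the error message can be
sketched as follows. This is only an illustration in Python, assuming the
destination treats the number of in-flight buffers as the 16-bit unsigned
difference last_avail_idx - used_idx, which is what the error text
suggests. The function names are made up for this example and are not
QEMU's.

```python
# Illustrative sketch only: hypothetical helper names, not QEMU code.
# Assumption: ring indices are free-running 16-bit counters, so the
# in-flight count is the difference taken modulo 2**16.

def vq_inuse(last_avail_idx: int, used_idx: int) -> int:
    """Return the in-flight buffer count as a 16-bit unsigned difference."""
    return (last_avail_idx - used_idx) & 0xFFFF


def vq_state_is_valid(size: int, last_avail_idx: int, used_idx: int) -> bool:
    """A queue cannot have more buffers in flight than it has entries."""
    return vq_inuse(last_avail_idx, used_idx) <= size


# Values taken from the log line for VQ 2 of the balloon device.
size, last_avail_idx, used_idx = 0x80, 0x47B, 0x47C

print(hex(vq_inuse(last_avail_idx, used_idx)))            # 0xffff
print(vq_state_is_valid(size, last_avail_idx, used_idx))  # False
```

With used_idx one ahead of last_avail_idx, the difference wraps around to
0xffff, which is far larger than the queue size of 0x80, so the incoming
state is rejected.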

Thanks!
Ladi

>>
>>
>> The qemu commandline on the source compute node is:
>>
>>
>> /usr/libexec/qemu-kvm -c 0x00000000000000000000000000000001 -n 4
>> --proc-type=secondary --file-prefix=vs -- -enable-dpdk -name
>> instance-0000000e -S -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -m 512
>> -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -object 
>> memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/mnt/huge-2048kB/libvirt/qemu,share=yes,size=536870912,host-nodes=1,policy=bind
>> -numa node,nodeid=0,cpus=0,memdev=ram-node0 -uuid
>> 57ae849f-aa66-422a-90a2-62db6c59db29 -smbios type=1,manufacturer=Fedora
>> Project,product=OpenStack 
>> Nova,version=13.0.0-0.tis.4,serial=4c8121f1-d927-424e-8712-88b1de45be37,uuid=57ae849f-aa66-422a-90a2-62db6c59db29,family=Virtual
>> Machine -no-user-config -nodefaults -chardev 
>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-instance-0000000e/monitor.sock,server,nowait
>> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew
>> -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot
>> reboot-timeout=5000,strict=on -device
>> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive 
>> file=/dev/disk/by-path/ip-192.168.205.6:3260-iscsi-iqn.2010-10.org.openstack:volume-ac57fcaa-7ecd-4d3b-8671-3bc740337a42-lun-0,if=none,id=drive-virtio-disk0,format=raw,serial=ac57fcaa-7ecd-4d3b-8671-3bc740337a42,cache=none,aio=native
>> -device 
>> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>> -chardev 
>> socket,id=charnet0,path=/var/run/vswitch/usvhost-9e574d3c-32dd-4d39-97e6-447b15fb00b4
>> -netdev type=vhost-user,id=hostnet0,chardev=charnet0 -device 
>> virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:b0:59:a9,bus=pci.0,addr=0x3
>> -chardev 
>> socket,id=charnet1,path=/var/run/vswitch/usvhost-7bc48d91-f215-4394-99ff-eb7f20d9ff1e
>> -netdev type=vhost-user,id=hostnet1,chardev=charnet1 -device 
>> virtio-net-pci,netdev=hostnet1,id=net1,mac=fa:16:3e:8b:6f:09,bus=pci.0,addr=0x4
>> -chardev 
>> socket,id=charnet2,path=/var/run/vswitch/usvhost-c32e2d0d-9ed4-4f4b-abc9-539a12a86008
>> -netdev type=vhost-user,id=hostnet2,chardev=charnet2 -device 
>> virtio-net-pci,netdev=hostnet2,id=net2,mac=fa:16:3e:07:ca:a0,bus=pci.0,addr=0x5
>> -chardev 
>> file,id=charserial0,path=/etc/nova/instances/57ae849f-aa66-422a-90a2-62db6c59db29/console.log
>> -device isa-serial,chardev=charserial0,id=serial0 -chardev
>> pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device
>> usb-tablet,id=input0 -vnc 0.0.0.0:11 -k en-us -device
>> cirrus-vga,id=video0,bus=pci.0,addr=0x2 -incoming fd:25 -device
>> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg timestamp=on
>>
>>
>>
>> The complete instance-0000000e.log file on the destination is:
>>
>> 2017-03-29 06:38:35.962+0000: starting up libvirt version: 2.0.0, package:
>> 10.el7_3.2.tis.24 (Unknown, 2017-03-15-14:59:22,
>> yow-dsulliva-lx-vm1.wrs.com), qemu version: 2.6.0
>> (qemu-kvm-ev-2.6.0-27.1.el7.tis.31), hostname: compute-0
>> LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
>> QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm '-c
>> 0x00000000000000000000000000000001' '-n 4' --proc-type=secondary
>> --file-prefix=vs -- -enable-dpdk -name
>> guest=instance-0000000e,debug-threads=on -S -object 
>> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-10-instance-0000000e/master-key.aes
>> -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -m 512 -realtime mlock=off
>> -smp 1,sockets=1,cores=1,threads=1 -object 
>> memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/mnt/huge-2048kB/libvirt/qemu,share=yes,size=536870912,host-nodes=0,policy=bind
>> -numa node,nodeid=0,cpus=0,memdev=ram-node0 -uuid
>> 57ae849f-aa66-422a-90a2-62db6c59db29 -smbios 'type=1,manufacturer=Fedora
>> Project,product=OpenStack 
>> Nova,version=13.0.0-0.tis.4,serial=4c8121f1-d927-424e-8712-88b1de45be37,uuid=57ae849f-aa66-422a-90a2-62db6c59db29,family=Virtual
>> Machine' -no-user-config -nodefaults -chardev 
>> socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-10-instance-0000000e/monitor.sock,server,nowait
>> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew
>> -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot
>> reboot-timeout=5000,strict=on -device
>> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive 
>> file=/dev/disk/by-path/ip-192.168.205.6:3260-iscsi-iqn.2010-10.org.openstack:volume-ac57fcaa-7ecd-4d3b-8671-3bc740337a42-lun-0,format=raw,if=none,id=drive-virtio-disk0,serial=ac57fcaa-7ecd-4d3b-8671-3bc740337a42,cache=none,aio=native
>> -device 
>> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>> -chardev 
>> socket,id=charnet0,path=/var/run/vswitch/usvhost-9e574d3c-32dd-4d39-97e6-447b15fb00b4
>> -netdev type=vhost-user,id=hostnet0,chardev=charnet0 -device 
>> virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:b0:59:a9,bus=pci.0,addr=0x3
>> -chardev 
>> socket,id=charnet1,path=/var/run/vswitch/usvhost-7bc48d91-f215-4394-99ff-eb7f20d9ff1e
>> -netdev type=vhost-user,id=hostnet1,chardev=charnet1 -device 
>> virtio-net-pci,netdev=hostnet1,id=net1,mac=fa:16:3e:8b:6f:09,bus=pci.0,addr=0x4
>> -chardev 
>> socket,id=charnet2,path=/var/run/vswitch/usvhost-c32e2d0d-9ed4-4f4b-abc9-539a12a86008
>> -netdev type=vhost-user,id=hostnet2,chardev=charnet2 -device 
>> virtio-net-pci,netdev=hostnet2,id=net2,mac=fa:16:3e:07:ca:a0,bus=pci.0,addr=0x5
>> -add-fd set=0,fd=51 -chardev file,id=charserial0,path=/dev/fdset/0,append=on
>> -device isa-serial,chardev=charserial0,id=serial0 -chardev
>> pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device
>> usb-tablet,id=input0 -vnc 0.0.0.0:9 -k en-us -device
>> cirrus-vga,id=video0,bus=pci.0,addr=0x2 -incoming defer -device
>> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg timestamp=on
>> Domain id=10 is tainted: high-privileges
>> EAL:eal_memory.c:1591: WARNING: Address Space Layout Randomization (ASLR) is
>> enabled in the kernel.
>> EAL:eal_memory.c:1593:    This may cause issues with mapping memory into
>> secondary processes
>> char device redirected to /dev/pts/9 (label charserial1)
>> 2017-03-29T06:38:37.886940Z qemu-kvm: VQ 2 size 0x80 < last_avail_idx 0x47b
>> - used_idx 0x47c
>> 2017-03-29T06:38:37.886974Z qemu-kvm: error while loading state for instance
>> 0x0 of device '0000:00:07.0/virtio-balloon'
>> 2017-03-29T06:38:37.888684Z qemu-kvm: load of migration failed: Operation
>> not permitted
>> 2017-03-29 06:38:37.896+0000: shutting down
>>
>>
>> For what it's worth, the differences between the two qemu command lines are
>> as follows:
>>
>> source:
>> -name instance-0000000e -chardev 
>> file,id=charserial0,path=/etc/nova/instances/57ae849f-aa66-422a-90a2-62db6c59db29/console.log
>> -vnc 0.0.0.0:11 -incoming fd:25
>>
>> destination:
>> -name guest=instance-0000000e,debug-threads=on -object 
>> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-10-instance-0000000e/master-key.aes
>> -add-fd set=0,fd=51 -chardev file,id=charserial0,path=/dev/fdset/0,append=on
>> -vnc 0.0.0.0:9 -incoming defer
>>
>> Thanks,
>> Chris
>>
