Hi,

On Wed, Jun 14, 2017 at 11:56 PM, Fernando Casas Schössow <casasferna...@hotmail.com> wrote:
> Hi there,
>
> I recently migrated a Hyper-V host to qemu/kvm running on Alpine Linux 3.6.1
> (kernel 4.9.30 -with grsec patches- and qemu 2.8.1).
>
> Almost on a daily basis at least one of the guests is showing the following
> error in the log and then it needs to be terminated and restarted to recover
> it:
>
> qemu-system-x86_64: Virtqueue size exceeded
>
> It is not always the same guest, and the error is appearing for both Linux
> (CentOS 7.3) and Windows (2012R2) guests.
> As soon as this error appears the guest is not really working anymore. It may
> respond to ping or you can even try to log in but then everything is very slow
> or completely unresponsive. Restarting the guest from within the guest OS is
> not working either and the only thing I can do is to terminate it (virsh
> destroy) and start it again until the next failure.
>
> In Windows guests the error seems to be related to disk:
> "Reset to device, \Device\RaidPort2, was issued" and the source is viostor
>
> And in Linux guests the error is always (with the process and pid changing):
>
> INFO: task <process>:<pid> blocked for more than 120 seconds
>
> But unfortunately I was not able to find any other indication of a problem in
> the guests logs nor in the host logs except for the error regarding the
> virtqueue size. The problem is happening at different times of day and I
> couldn't find any patterns yet.
>
> All the Windows guests are using virtio drivers version 126 and all Linux
> guests are CentOS 7.3 using the latest kernel available in the distribution
> (3.10.0-514.21.1). They all run qemu-guest agent as well.
> All the guest disks are qcow2 images with cache=none and aio=threads
> (tried native mode before but with the same results).
>
> Example qemu command for a Linux guest:
>
> /usr/bin/qemu-system-x86_64 -name guest=DOCKER01,debug-threads=on -S -object
> secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-24-DOCKER01/master-key.aes
> -machine pc-i440fx-2.8,accel=kvm,usb=off,dump-guest-core=off -cpu
> IvyBridge,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+monitor,+ds_cpl,+vmx,+smx,+est,+tm2,+xtpr,+pdcm,+pcid,+osxsave,+arat,+xsaveopt
> -drive
> file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd,if=pflash,format=raw,unit=0,readonly=on
> -drive
> file=/var/lib/libvirt/qemu/nvram/DOCKER01_VARS.fd,if=pflash,format=raw,unit=1
> -m 2048 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid
> 4705b146-3b14-4c20-923c-42105d47e7fc -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-24-DOCKER01/monitor.sock,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew
> -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global
> PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device
> ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x4.0x7 -device
> ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x4
> -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x4.0x1
> -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x4.0x2
> -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive
> file=/storage/storage-ssd-vms/virtual_machines_ssd/docker01.qcow2,format=qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=threads
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> -netdev tap,fd=35,id=hostnet0,vhost=on,vhostfd=45 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:1c:af:ce,bus=pci.0,addr=0x3
> -chardev pty,id=charserial0 -device
> isa-serial,chardev=charserial0,id=serial0 -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-24-DOCKER01/org.qemu.guest_agent.0,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0
> -chardev spicevmc,id=charchannel1,name=vdagent -device
> virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0
> -device usb-tablet,id=input0,bus=usb.0,port=1 -spice
> port=5905,addr=127.0.0.1,disable-ticketing,seamless-migration=on -device
> qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2
> -chardev spicevmc,id=charredir0,name=usbredir -device
> usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev
> spicevmc,id=charredir1,name=usbredir -device
> usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device
> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -object
> rng-random,id=objrng0,filename=/dev/random -device
> virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x8 -msg timestamp=on
>
> For what it's worth, the same guests were working fine for years on Hyper-V on
> the same hardware (Intel Xeon E3, 32GB RAM, Supermicro mainboard, 6x3TB
> Western Digital Red disks and 6x120MB Kingston V300 SSD all connected to an
> LSI LSISAS2008 controller).
> Except for this stability issue that I hope to solve, everything else is
> working great and outperforming Hyper-V.
>
> Any ideas, thoughts or suggestions to try to narrow down the problem?

Would you be able to enhance the error message and rebuild QEMU?

--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -856,7 +856,7 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz)
     max = vq->vring.num;

     if (vq->inuse >= vq->vring.num) {
-        virtio_error(vdev, "Virtqueue size exceeded");
+        virtio_error(vdev, "Virtqueue %u device %s size exceeded", vq->queue_index, vdev->name);
         goto done;
     }

This would at least confirm the theory that it's caused by
virtio-blk-pci.

If rebuilding is not feasible I would start by removing other virtio
devices -- particularly balloon which has had quite a few virtio
related bugs fixed recently.

Does your environment involve VM migrations or saving/resuming, or
does the crashing QEMU process always run the VM from its boot?

Thanks!

> Thanks in advance and sorry for the long email but I wanted to be as
> descriptive as possible.
>
> Fer
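
For readers unfamiliar with the check that the patch above touches, here is a minimal standalone sketch of the condition it guards. It is not QEMU code: struct toy_vq, toy_pop() and toy_complete() are hypothetical names invented for illustration, and only the comparison (inuse >= ring size) is taken from the diff. It assumes, as a simplification, that "inuse" counts elements popped from the queue but not yet completed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for QEMU's VirtQueue. Only the two
 * values compared in the patched check are modeled. */
struct toy_vq {
    unsigned int num;   /* ring size, playing the role of vq->vring.num */
    unsigned int inuse; /* elements popped but not yet completed */
};

/* Same comparison as the guard in the diff: once every ring slot is
 * accounted for as in flight, another pop is refused. Returning false
 * here corresponds to the "Virtqueue size exceeded" error path. */
static bool toy_pop(struct toy_vq *vq)
{
    if (vq->inuse >= vq->num) {
        return false;
    }
    vq->inuse++;
    return true;
}

/* Completing a request releases one slot (assumed accounting). */
static void toy_complete(struct toy_vq *vq)
{
    assert(vq->inuse > 0);
    vq->inuse--;
}
```

With num set to 2, two calls to toy_pop() succeed and a third is refused until toy_complete() has released a slot. The sketch only shows when the check fires; it says nothing about why the counter would reach the ring size on a real host, which is what the extra fields in the enhanced error message are meant to help narrow down.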