I can test again with QEMU 3.1, but with previous versions, yes, it happened the same way with both virtio-blk and virtio-scsi. For 3.1 I can confirm it happens with virtio-scsi (already tested), and I can test with virtio-blk again if that would add value to the investigation. I'm also attaching a screenshot of the guest console showing the errors the guest displays when it becomes unresponsive, in case it helps.
Thanks for the patch. I will build the custom QEMU binary and try to reproduce the issue. This may take a couple of days since I cannot reproduce it at will: sometimes it takes 12 hours, sometimes 2 days, until it happens. Hopefully the code below will shed more light on this problem.

Thanks,
Fernando

On Mon, Feb 4, 2019 at 7:06 AM, Stefan Hajnoczi <stefa...@gmail.com> wrote:

Are you sure this happens with both virtio-blk and virtio-scsi?

The following patch adds more debug output. You can build as follows:

  $ git clone https://git.qemu.org/git/qemu.git
  $ cd qemu
  $ patch -p1
  ...paste the patch here...
  ^D

  # For info on build dependencies see https://wiki.qemu.org/Hosts/Linux
  $ ./configure --target-list=x86_64-softmmu
  $ make -j4

You can configure a libvirt domain to use your custom QEMU binary by
changing the <devices><emulator> tag to the
qemu/x86_64-softmmu/qemu-system-x86_64 path.

---

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 22bd1ac34e..aa44bffa1f 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -879,6 +879,9 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz)
     max = vq->vring.num;
 
     if (vq->inuse >= vq->vring.num) {
+        fprintf(stderr, "vdev %p (\"%s\")\n", vdev, vdev->name);
+        fprintf(stderr, "vq %p (idx %u)\n", vq, (unsigned int)(vq - vdev->vq));
+        fprintf(stderr, "inuse %u vring.num %u\n", vq->inuse, vq->vring.num);
         virtio_error(vdev, "Virtqueue size exceeded");
         goto done;
     }