The commits 03de2f527 "virtio-blk: do not use vring in dataplane" and 9ffe337c08 "virtio-blk: always use dataplane path if ioeventfd is active" substantially changed how notifications are done for virtio-blk. Due to a race condition, interrupts are lost when the irqfd behind the guest notifier is torn down after notify_guest_bh was scheduled but before it actually runs.

Let's fix this by forcing guest notifications before cleaning up the
irqfd's. Let's also add some explanatory comments.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Halil Pasic <pa...@linux.vnet.ibm.com>
Reported-by: Michael A. Tebolt <mi...@us.ibm.com>
Tested-by: Michael A. Tebolt <mi...@us.ibm.com>
Suggested-by: Paolo Bonzini <pbonz...@redhat.com>
---

This patch withstood the test case which discovered the problem for
several days (as reported by Michael Tebolt).

v1 --> v2:
* Fixed typo pointed out by Connie
* Added Tested-by
---
 hw/block/dataplane/virtio-blk.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 5556f0e..045a580 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -258,9 +258,16 @@ void virtio_blk_data_plane_stop(VirtIODevice *vdev)
         virtio_queue_aio_set_host_notifier_handler(vq, s->ctx, NULL);
     }
 
-    /* Drain and switch bs back to the QEMU main loop */
+    /* Drain and switch bs back to the QEMU main loop. After drain, the
+     * device will not submit (nor complete) any requests until dataplane
+     * starts again.
+     */
     blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
 
+    /* Notify guest before the guest notifiers get cleaned up */
+    qemu_bh_cancel(s->bh);
+    notify_guest_bh(s);
+
     aio_context_release(s->ctx);
 
     for (i = 0; i < nvqs; i++) {
-- 
2.8.4
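
For readers who want to see the ordering problem in isolation, here is a minimal standalone sketch of the race and of the cancel-then-notify fix. It is a toy model, not QEMU code: ToyNotifier, ToyBH, toy_stop and the other toy_* names are hypothetical stand-ins for the guest notifier, the bottom half and the stop path, and the real notify_guest_bh does more work than the single counter increment shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a guest notifier backed by an irqfd. */
typedef struct {
    bool irqfd_active;   /* false once the notifier has been torn down */
    int delivered;       /* interrupts that actually reached the "guest" */
} ToyNotifier;

/* Hypothetical stand-in for a bottom half: a deferred callback plus a
 * pending flag. */
typedef struct {
    bool scheduled;
    void (*cb)(void *opaque);
    void *opaque;
} ToyBH;

static void toy_bh_schedule(ToyBH *bh) { bh->scheduled = true; }
static void toy_bh_cancel(ToyBH *bh)   { bh->scheduled = false; }

/* Run the callback only if it is pending, as one event loop pass would. */
static void toy_bh_poll(ToyBH *bh)
{
    assert(bh->cb != NULL);
    if (bh->scheduled) {
        bh->scheduled = false;
        bh->cb(bh->opaque);
    }
}

/* Toy counterpart of the notification callback: the interrupt only
 * arrives while the irqfd is still set up. */
static void toy_notify_guest(void *opaque)
{
    ToyNotifier *n = opaque;
    if (n->irqfd_active) {
        n->delivered++;
    }
}

/* Model the stop path and return how many interrupts were delivered. */
static int toy_stop(bool flush_before_teardown)
{
    ToyNotifier n = { .irqfd_active = true, .delivered = 0 };
    ToyBH bh = { .scheduled = false, .cb = toy_notify_guest, .opaque = &n };

    /* A request completed; the guest notification is deferred. */
    toy_bh_schedule(&bh);

    if (flush_before_teardown) {
        /* The fix: drop the pending run and notify synchronously. */
        toy_bh_cancel(&bh);
        toy_notify_guest(&n);
    }

    /* Guest notifiers get cleaned up. */
    n.irqfd_active = false;

    /* The deferred callback, if still pending, runs too late. */
    toy_bh_poll(&bh);

    return n.delivered;
}
```

In this model toy_stop(false) returns 0, because the deferred callback only runs after teardown, while toy_stop(true) returns 1, because the notification is forced out while the irqfd is still in place. Cancelling before the direct call keeps the later poll from invoking the callback a second time.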