Saving or migrating a vhost-blk guest under disk load can fail to load on the destination:

  qemu-kvm: VQ 0 size 0x100 < last_avail_idx 0xb8ab - used_idx 0xb934
  qemu-kvm: Failed to load vhost-blk:virtio
  qemu-kvm: error while loading state for instance 0x0 of device vhost-blk
  load of migration failed: Operation not permitted

virtio_load() rejects the device because the saved used_idx is ahead of
last_avail_idx, which is impossible for a coherent vring.

The root cause is that vhost-blk has no "stop fetching" step before the
device is stopped. On stop, QEMU's vhost_dev_stop() reads last_avail_idx
via VHOST_GET_VRING_BASE, but the vhost worker is still running: it
keeps pulling the avail-ring backlog and completing those requests,
advancing the guest used->idx past the last_avail_idx that was just
sampled. The saved state is therefore incoherent.

vhost-net does not hit this because it detaches the backend
(VHOST_NET_SET_BACKEND, fd == -1) before VHOST_GET_VRING_BASE, so its
worker stops fetching. vhost-blk had no equivalent operation.

Teach VHOST_BLK_SET_BACKEND to treat a negative fd as "stop the device":
detach the backend from every vq (vhost_blk_handle_guest_kick() bails on
a NULL backend), drain in-flight requests with vhost_blk_flush(), and
release the backing file. After this the worker no longer advances the
rings, so the subsequent VHOST_GET_VRING_BASE reports a final, coherent
last_avail_idx. The unconsumed avail backlog stays in the ring and is
reprocessed once the device is restarted.

The companion QEMU change issues this stop before vhost_dev_stop().

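For reference, the numbers in the error message above can be checked by
hand. The sketch below assumes the load-time check computes the
in-flight count as a 16-bit difference of the two indices and compares
it with the queue size, which is what the quoted message suggests. The
helper names are hypothetical and are not taken from QEMU or from this
patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Descriptors fetched by the device but not yet completed.
 * Both indices are free-running 16-bit counters, so the difference
 * is taken modulo 2^16. (Hypothetical helper, for illustration only.)
 */
static uint16_t vring_inuse(uint16_t last_avail_idx, uint16_t used_idx)
{
	return (uint16_t)(last_avail_idx - used_idx);
}

/*
 * A saved state is plausible only if the in-flight count fits in the
 * queue. (Hypothetical helper, assumed to mirror the load-time check.)
 */
static bool vring_state_is_coherent(uint16_t vq_size,
				    uint16_t last_avail_idx,
				    uint16_t used_idx)
{
	return vring_inuse(last_avail_idx, used_idx) <= vq_size;
}
```

With the logged values, vring_inuse(0xb8ab, 0xb934) wraps to 0xff77,
far above the queue size of 0x100, so the state is rejected. Had
used_idx stopped at or behind last_avail_idx, the difference would have
stayed within the queue size.
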
https://virtuozzo.atlassian.net/browse/VSTOR-133464

Fixes: 40a5928ec730 ("drivers/vhost: vhost-blk accelerator for virtio-blk guests")
Signed-off-by: Andrey Drobyshev <[email protected]>
---
 drivers/vhost/blk.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/vhost/blk.c b/drivers/vhost/blk.c
index b11f08f878f4..1b073011c445 100644
--- a/drivers/vhost/blk.c
+++ b/drivers/vhost/blk.c
@@ -744,6 +744,24 @@ static long vhost_blk_set_backend(struct vhost_blk *blk, int fd)
 	if (ret)
 		goto out_dev;
 
+	/*
+	 * fd < 0 means "stop the device". Detach the backend from every vq so
+	 * vhost_blk_handle_guest_kick() stops fetching descriptors, drain the
+	 * in-flight requests, and release the backing file.
+	 */
+	if (fd < 0) {
+		if (!blk->backend) {
+			mutex_unlock(&blk->dev.mutex);
+			return 0; /* already stopped */
+		}
+		vhost_blk_drop_backends(blk);
+		vhost_blk_flush(blk);
+		fput(blk->backend);
+		blk->backend = NULL;
+		mutex_unlock(&blk->dev.mutex);
+		return 0;
+	}
+
 	if (blk->backend) {
 		ret = -EBUSY;
 		goto out_dev;
-- 
2.47.1
