My apologies for the late review here. I appreciate the need to work
around these issues, but I do feel the approach complicates QEMU
significantly, and it may be possible to achieve similar results by
managing state inside the backend. More comments inline.

I like a lot of the cleanups here. Maybe consider breaking some of
them out into a separate series?

On Wed, Aug 13, 2025 at 12:56 PM Vladimir Sementsov-Ogievskiy
<[email protected]> wrote:
>
> Hi all!
>
> Local migration of vhost-user-blk requires non-trivial actions
> from management layer, it should provide a new connection for new
> QEMU process and handle disk operation movement from one connection
> to another.
>
> Such switching, including reinitialization of vhost-user connection,
> draining disk requests, etc, adds significant value to local migration
> downtime.

I see how draining IO requests adds downtime and is impactful. That
said, we need to stop and start the device anyway, so I'm not
convinced that setting up mappings and sending messages back and forth
costs enough to warrant adding a whole new migration mode. Am I
missing anything here?

>
> This all leads to an idea: why not to just pass all we need from
> old QEMU process to the new one (including open file descriptors),
> and don't touch the backend at all? This way, the vhost user backend
> server will not even know, that QEMU process is changed, as live
> vhost-user connection is migrated.

Alternatively, if it really is about avoiding IO draining, what if
QEMU advertised a new vhost-user protocol feature which would query
whether the backend already has state for the device? Then, if the
backend indicates that it does, QEMU and the backend can take a
different path in vhost-user, exchanging the relevant information,
including the descriptor indexes for the VQs, so that draining can be
avoided. I expect that could also cut down a lot of the other
vhost-user overhead (e.g. you could skip setting the memory table).
If nothing else, it would probably help other device types take
advantage of this without adding more options to QEMU.
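
To make the idea concrete, here is a rough toy model of the two paths
in Python. It is only a sketch: the feature name RESUME_STATE, the
Backend class and frontend_connect() are all hypothetical, nothing
here is part of the vhost-user spec today, and the message names are
just labels for the existing setup steps.

```python
from dataclasses import dataclass, field

# Hypothetical feature name; not part of the vhost-user specification.
F_RESUME_STATE = "RESUME_STATE"


@dataclass
class Backend:
    """Toy backend that may still hold state for a device."""
    features: set = field(default_factory=set)
    # Per-virtqueue descriptor index, kept across frontend restarts.
    vq_index: dict = field(default_factory=dict)
    # Record of the setup messages the frontend had to send.
    log: list = field(default_factory=list)

    def has_state(self):
        return bool(self.vq_index)

    def handle(self, msg, payload=None):
        self.log.append(msg)
        if msg == "SET_VRING_BASE":
            vq, idx = payload
            self.vq_index[vq] = idx


def frontend_connect(backend, num_vqs):
    """Return the per-VQ indexes the frontend should start from."""
    if F_RESUME_STATE in backend.features and backend.has_state():
        # Fast path: take the indexes from the backend, with no drain
        # and no memory table setup.
        return dict(backend.vq_index)
    # Slow path: the usual full initialization.
    backend.handle("SET_MEM_TABLE")
    for vq in range(num_vqs):
        backend.handle("SET_VRING_BASE", (vq, 0))
    return dict(backend.vq_index)
```

The point is that the decision lives in the vhost-user handshake, so
a backend without the feature (or without saved state) falls back to
the existing path unchanged.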

Thoughts?

>
> So this series realize the idea. No requests are done to backend
> during migration, instead all backend-related state and all related
> file descriptors (vhost-user connection, guest/host notifiers,
> inflight region) are passed to new process. Of course, migration
> should go through unix socket.
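
For anyone following along, the unix socket requirement comes from
the fd passing itself: descriptors travel as ancillary data
(SCM_RIGHTS), which only works over AF_UNIX. A minimal Python sketch
of just that mechanism, assuming Python 3.9+ on a Unix host; pass_fd()
is a made-up helper and this is not how the series implements it:

```python
import os
import socket


def pass_fd(fd, payload=b"state"):
    """Send fd plus a payload over a Unix socket pair; return what arrives."""
    src, dst = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # The fd rides along with the payload as SCM_RIGHTS ancillary data.
        socket.send_fds(src, [payload], [fd])
        data, fds, _flags, _addr = socket.recv_fds(dst, 1024, 1)
        return data, fds[0]
    finally:
        src.close()
        dst.close()


# The receiver gets a new descriptor number for the same open pipe end.
r, w = os.pipe()
data, w2 = pass_fd(w)
os.write(w2, b"ping")
received = os.read(r, 4)
```

The received descriptor refers to the same open file description as
the original, which is what lets the target keep using a live
connection without the other end noticing.
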
>
> The most of the series are refactoring patches. The core feature is
> spread between 24, 28-31 patches.
>
> Why not CPR-transfer?
>
> 1. In the new mode of local migration we need to pass not only
> file descriptors, but additional parts of backend-related state,
> which we don't want (or even can't) reinitialize in target process.
> And it's a lot simpler to add new fields to common migration stream.
> And why not to pass fds in the same stream?
>
> 2. No benefit of vhost-user connection fd passed to target in early
> stage before device creation: we can't use it together with source
> QEMU process anyway. So, we need a moment, when source qemu stops using
> the fd, and target start doing it. And native place for this moment is
> usual save/load of the device in migration process. And yes, we have to
> deeply update initialization/starting of the device to not reinitialize
> the backend, but just continue to work with it in a new QEMU process.
>
> 3. So, if we can't actually use fd, passed early before device creation,
> no reason to care about:
> - non-working QMP connection on target until "migrate" command on source
> - additional migration channel
> - implementing code to pass additional non-fd fields together with fds in CPR
>
> However, the series doesn't conflict with CPR-transfer, as it's actually
> a usual migration with some additional capabilities. The only
> requirement is that main migration channel should be a unix socket.
>
> Vladimir Sementsov-Ogievskiy (33):
>   vhost: introduce vhost_ops->vhost_set_vring_enable_supported method
>   vhost: drop backend_features field
>   vhost-user: introduce vhost_user_has_prot() helper
>   vhost: move protocol_features to vhost_user
>   vhost-user-gpu: drop code duplication
>   vhost: make vhost_dev.features private
>   virtio: move common part of _set_guest_notifier to generic code
>   virtio: drop *_set_guest_notifier_fd_handler() helpers
>   vhost-user: keep QIOChannelSocket for backend channel
>   vhost: vhost_virtqueue_start(): fix failure path
>   vhost: make vhost_memory_unmap() null-safe
>   vhost: simplify calls to vhost_memory_unmap()
>   vhost: move vrings mapping to the top of vhost_virtqueue_start()
>   vhost: vhost_virtqueue_start(): drop extra local variables
>   vhost: final refactoring of vhost vrings map/unmap
>   vhost: simplify vhost_dev_init() error-path
>   vhost: move busyloop timeout initialization to vhost_virtqueue_init()
>   vhost: introduce check_memslots() helper
>   vhost: vhost_dev_init(): drop extra features variable
>   hw/virtio/virtio-bus: refactor virtio_bus_set_host_notifier()
>   vhost-user: make trace events more readable
>   vhost-user-blk: add some useful trace-points
>   vhost: add some useful trace-points
>   chardev-add: support local migration
>   virtio: introduce .skip_vhost_migration_log() handler
>   io/channel-socket: introduce qio_channel_socket_keep_nonblock()
>   migration/socket: keep fds non-block
>   vhost: introduce backend migration
>   vhost-user: support backend migration
>   virtio: support vhost backend migration
>   vhost-user-blk: support vhost backend migration
>   test/functional: exec_command_and_wait_for_pattern: add vm arg
>   tests/functional: add test_x86_64_vhost_user_blk_fd_migration.py
>
>  backends/cryptodev-vhost.c                    |   1 -
>  chardev/char-socket.c                         | 101 +++-
>  hw/block/trace-events                         |  10 +
>  hw/block/vhost-user-blk.c                     | 201 ++++++--
>  hw/display/vhost-user-gpu.c                   |  11 +-
>  hw/net/vhost_net.c                            |  27 +-
>  hw/scsi/vhost-scsi.c                          |   1 -
>  hw/scsi/vhost-user-scsi.c                     |   1 -
>  hw/virtio/trace-events                        |  12 +-
>  hw/virtio/vdpa-dev.c                          |   3 +-
>  hw/virtio/vhost-user-base.c                   |   8 +-
>  hw/virtio/vhost-user.c                        | 326 +++++++++---
>  hw/virtio/vhost.c                             | 474 ++++++++++++------
>  hw/virtio/virtio-bus.c                        |  20 +-
>  hw/virtio/virtio-hmp-cmds.c                   |   2 -
>  hw/virtio/virtio-mmio.c                       |  41 +-
>  hw/virtio/virtio-pci.c                        |  34 +-
>  hw/virtio/virtio-qmp.c                        |  10 +-
>  hw/virtio/virtio.c                            | 120 ++++-
>  include/chardev/char-socket.h                 |   3 +
>  include/hw/virtio/vhost-backend.h             |  10 +
>  include/hw/virtio/vhost-user-blk.h            |   2 +
>  include/hw/virtio/vhost.h                     |  42 +-
>  include/hw/virtio/virtio-pci.h                |   3 -
>  include/hw/virtio/virtio.h                    |  11 +-
>  include/io/channel-socket.h                   |   3 +
>  io/channel-socket.c                           |  16 +-
>  migration/options.c                           |  14 +
>  migration/options.h                           |   2 +
>  migration/socket.c                            |   1 +
>  net/vhost-vdpa.c                              |   7 +-
>  qapi/char.json                                |  16 +-
>  qapi/migration.json                           |  19 +-
>  qapi/virtio.json                              |   3 -
>  stubs/meson.build                             |   1 +
>  stubs/qemu_file.c                             |  15 +
>  stubs/vmstate.c                               |   6 +
>  tests/functional/qemu_test/cmd.py             |   7 +-
>  ...test_x86_64_vhost_user_blk_fd_migration.py | 279 +++++++++++
>  tests/qtest/meson.build                       |   2 +-
>  tests/unit/meson.build                        |   4 +-
>  41 files changed, 1420 insertions(+), 449 deletions(-)
>  create mode 100644 stubs/qemu_file.c
>  create mode 100644 tests/functional/test_x86_64_vhost_user_blk_fd_migration.py
>
> --
> 2.48.1
>
>
