On 2026-06-19 3:55 AM, Andrey Drobyshev wrote:
> During CPR migration, instead of reopening /dev/vhost-vsock device on
> each initialization, let's make sure the already open FD is preserved and
> properly reused on a subsequent initialization.
> 
> Signed-off-by: Andrey Drobyshev <[email protected]>
> ---
>  hw/virtio/vhost-vsock.c | 25 ++++++++++++++++++++-----
>  1 file changed, 20 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/virtio/vhost-vsock.c b/hw/virtio/vhost-vsock.c
> index da244eb1657..ea9a7d2149d 100644
> --- a/hw/virtio/vhost-vsock.c
> +++ b/hw/virtio/vhost-vsock.c
> @@ -20,6 +20,7 @@
>  #include "hw/core/qdev-properties.h"
>  #include "hw/virtio/vhost-vsock.h"
>  #include "monitor/monitor.h"
> +#include "migration/cpr.h"
>  
>  static void vhost_vsock_get_config(VirtIODevice *vdev, uint8_t *config)
>  {
> @@ -126,6 +127,7 @@ static void vhost_vsock_device_realize(DeviceState *dev, 
> Error **errp)
>      VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(dev);
>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>      VHostVSock *vsock = VHOST_VSOCK(dev);
> +    DeviceState *proxy = qdev_get_parent_bus(DEVICE(vsock))->parent;
>      int vhostfd;
>      int ret;
>  
> @@ -141,7 +143,8 @@ static void vhost_vsock_device_realize(DeviceState *dev, 
> Error **errp)
>      }
>  
>      if (vsock->conf.vhostfd) {
> -        vhostfd = monitor_fd_param(monitor_cur(), vsock->conf.vhostfd, errp);
> +        vhostfd = cpr_get_fd_param(proxy->id, vsock->conf.vhostfd, 0,
> +                                   errp);
>          if (vhostfd == -1) {
>              error_prepend(errp, "vhost-vsock: unable to parse vhostfd: ");
>              return;
> @@ -151,10 +154,20 @@ static void vhost_vsock_device_realize(DeviceState 
> *dev, Error **errp)
>              return;
>          }
>      } else {
> -        vhostfd = open("/dev/vhost-vsock", O_RDWR);

I think this patchset also implicitly fixes the duplicate vhost thread issue
noticed when testing vhost-scsi.

With vhost-scsi, I noticed that if QEMU opens the vhost device fd without
O_CLOEXEC, the fd may survive cpr-exec unexpectedly. Then the new QEMU opens
another vhost fd, and we can end up with duplicate vhost threads.

I did not see the same issue when the fd was opened outside QEMU and passed in
via QMP getfd, probably because QEMU sets close-on-exec on fds received that 
way.

With this patchset, vhost-vsock no longer opens a new fd on the incoming side.
It restores the fd from CPR state, so the "duplicate vhost threads" issue should
not happen for this path.

I am still thinking about whether it is worth sending a temporary patch to add
O_CLOEXEC to these vhost device opens, or whether we should enhance vhost-scsi
in the same way as this patchset instead.

Thank you very much!

Dongli Zhang

> -        if (vhostfd < 0) {
> -            error_setg_file_open(errp, errno, "/dev/vhost-vsock");
> -            return;
> +        if (cpr_is_incoming()) {
> +            vhostfd = cpr_find_fd(proxy->id, 0);
> +            if (vhostfd < 0) {
> +                error_setg(errp, "vhost-vsock: could not restore vhost FD "
> +                           "for %s", proxy->id);
> +                return;
> +            }
> +        } else {
> +            vhostfd = open("/dev/vhost-vsock", O_RDWR);
> +            if (vhostfd < 0) {
> +                error_setg_file_open(errp, errno, "/dev/vhost-vsock");
> +                return;
> +            }
> +            cpr_save_fd(proxy->id, 0, vhostfd);
>          }
>  
>          if (!qemu_set_blocking(vhostfd, false, errp)) {
> @@ -193,10 +206,12 @@ static void vhost_vsock_device_unrealize(DeviceState 
> *dev)
>  {
>      VHostVSockCommon *vvc = VHOST_VSOCK_COMMON(dev);
>      VirtIODevice *vdev = VIRTIO_DEVICE(dev);
> +    DeviceState *proxy = qdev_get_parent_bus(dev)->parent;
>  
>      /* This will stop vhost backend if appropriate. */
>      vhost_vsock_set_status(vdev, 0);
>  
> +    cpr_delete_fd(proxy->id, 0);
>      vhost_dev_cleanup(&vvc->vhost_dev);
>      vhost_vsock_common_unrealize(vdev);
>  }


Reply via email to