On Mon, Feb 1, 2021 at 7:29 AM Jason Wang <jasow...@redhat.com> wrote:
>
>
> On 2021/1/30 4:54 AM, Eugenio Pérez wrote:
> > Shadow virtqueue notifications forwarding is disabled when vhost_dev
> > stops.
> >
> > Signed-off-by: Eugenio Pérez <epere...@redhat.com>
> > ---
> >   hw/virtio/vhost-shadow-virtqueue.h |   5 ++
> >   include/hw/virtio/vhost.h          |   4 +
> >   hw/virtio/vhost-shadow-virtqueue.c | 123 +++++++++++++++++++++++++-
> >   hw/virtio/vhost.c                  | 135 ++++++++++++++++++++++++++++-
> >   4 files changed, 264 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
> > index 6cc18d6acb..466f8ae595 100644
> > --- a/hw/virtio/vhost-shadow-virtqueue.h
> > +++ b/hw/virtio/vhost-shadow-virtqueue.h
> > @@ -17,6 +17,11 @@
> >
> >   typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
> >
> > +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
> > +                               VhostShadowVirtqueue *svq);
> > +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
> > +                              VhostShadowVirtqueue *svq);
> > +
> >   VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx);
> >
> >   void vhost_shadow_vq_free(VhostShadowVirtqueue *vq);
> > diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> > index 2be782cefd..732a4b2a2b 100644
> > --- a/include/hw/virtio/vhost.h
> > +++ b/include/hw/virtio/vhost.h
> > @@ -55,6 +55,8 @@ struct vhost_iommu {
> >       QLIST_ENTRY(vhost_iommu) iommu_next;
> >   };
> >
> > +typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;
> > +
> >   typedef struct VhostDevConfigOps {
> >       /* Vhost device config space changed callback
> >        */
> > @@ -83,7 +85,9 @@ struct vhost_dev {
> >       uint64_t backend_cap;
> >       bool started;
> >       bool log_enabled;
> > +    bool sw_lm_enabled;
> >       uint64_t log_size;
> > +    VhostShadowVirtqueue **shadow_vqs;
> >       Error *migration_blocker;
> >       const VhostOps *vhost_ops;
> >       void *opaque;
> > diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
> > index c0c967a7c5..908c36c66d 100644
> > --- a/hw/virtio/vhost-shadow-virtqueue.c
> > +++ b/hw/virtio/vhost-shadow-virtqueue.c
> > @@ -8,15 +8,129 @@
> >    */
> >
> >   #include "hw/virtio/vhost-shadow-virtqueue.h"
> > +#include "hw/virtio/vhost.h"
> > +#include "hw/virtio/virtio-access.h"
> > +
> > +#include "standard-headers/linux/vhost_types.h"
> > +#include "standard-headers/linux/virtio_ring.h"
> >
> >   #include "qemu/error-report.h"
> > -#include "qemu/event_notifier.h"
> > +#include "qemu/main-loop.h"
> >
> >   typedef struct VhostShadowVirtqueue {
> >       EventNotifier kick_notifier;
> >       EventNotifier call_notifier;
> > +    const struct vhost_virtqueue *hvq;
> > +    VirtIODevice *vdev;
> > +    VirtQueue *vq;
> >   } VhostShadowVirtqueue;
>
>
> So instead of doing things at the virtio level, how about doing the
> shadow stuff at the vhost level?
>
> It works like:
>
> virtio -> [shadow vhost backend] -> vhost backend
>
> Then QMP is used to plug the shadow vhost backend in the middle or not.
>
> It looks kind of easier since we don't need to deal with virtqueue
> handlers etc. Instead, we just need to deal with eventfd stuff:
>
> When shadow vhost mode is enabled, we just intercept the host_notifiers
> and guest_notifiers. When it is disabled, we just pass the host/guest
> notifiers to the real vhost backends?
>

Hi Jason.

Sure, we can try that model, but it seems to me that it comes with a
different set of problems.

For example, there is code in vhost.c that checks whether a callback is
implemented in vhost_ops before calling it, like:

if (dev->vhost_ops->vhost_vq_get_addr) {
    r = dev->vhost_ops->vhost_vq_get_addr(dev, &addr, vq);
    ...
}

I can count 14 of these, checking:

dev->vhost_ops->vhost_backend_can_merge
dev->vhost_ops->vhost_backend_mem_section_filter
dev->vhost_ops->vhost_force_iommu
dev->vhost_ops->vhost_requires_shm_log
dev->vhost_ops->vhost_set_backend_cap
dev->vhost_ops->vhost_set_vring_busyloop_timeout
dev->vhost_ops->vhost_vq_get_addr
hdev->vhost_ops->vhost_dev_start
hdev->vhost_ops->vhost_get_config
hdev->vhost_ops->vhost_get_inflight_fd
hdev->vhost_ops->vhost_net_set_backend
hdev->vhost_ops->vhost_set_config
hdev->vhost_ops->vhost_set_inflight_fd
hdev->vhost_ops->vhost_set_iotlb_callback

So we would have to implement all of the vhost_ops callbacks, forwarding
them to the actual vhost backend, and conditionally omit the ones listed
above? In other words, dynamically generate the shadow vq's vhost_ops?
If a new callback is added to any vhost backend in the future, do we
then have to force adding it to (or checking for NULL in) the shadow
backend's vhost_ops as well? Would this be a good moment to check
whether all backends implement these and delete the checks?

There are also checks like:

if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER)

How would the shadow_vq backend expose itself? (I guess as the backend actually in use.)

I can modify this patchset so that guest->host notifications are relayed
via eventfd handlers instead of vq handlers (a rough sketch follows
below). Although this would make it independent of the actual virtio
device kind used, I can see two drawbacks:
* The very fact that it is independent of the virtio device kind. If a
device does not use the notifiers and polls the ring by itself, it has
no way of knowing that it should stop. What happens if the virtio-net tx
timer is armed when we start the shadow vq?
* The fixes (current and future) in vq notifications, like the one
currently implemented in virtio_notify_irqfd for Windows drivers
regarding ISR bit 0. I think this one in particular is OK not to carry,
but many changes affecting either of the two functions will have to be
mirrored in the other.
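
Here is the rough sketch I mentioned above. Everything in it (struct
layout, field and function names) is made up for the example and does
not match this patchset:

#include "qemu/osdep.h"
#include "qemu/event_notifier.h"
#include "qemu/main-loop.h"

typedef struct ShadowVirtqueue {
    EventNotifier guest_kick;   /* signalled by the guest (ioeventfd) */
    EventNotifier device_kick;  /* the fd handed to the vhost backend */
} ShadowVirtqueue;

/* Relay a guest kick to the vhost backend without going through the
 * virtio device's virtqueue handler at all. */
static void shadow_vq_guest_kick(EventNotifier *n)
{
    ShadowVirtqueue *svq = container_of(n, ShadowVirtqueue, guest_kick);

    if (!event_notifier_test_and_clear(n)) {
        return;
    }
    event_notifier_set(&svq->device_kick);
}

/* When shadow mode starts, QEMU listens on the guest's kick eventfd and
 * points the backend's kick at device_kick instead. */
static void shadow_vq_enable(ShadowVirtqueue *svq)
{
    event_notifier_set_handler(&svq->guest_kick, shadow_vq_guest_kick);
}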

Thoughts on this?

Thanks!

> Thanks
>
>
> >
> > +static uint16_t vhost_shadow_vring_used_flags(VhostShadowVirtqueue *svq)
> > +{
> > +    const struct vring_used *used = svq->hvq->used;
> > +    return virtio_tswap16(svq->vdev, used->flags);
> > +}
> > +
> > +static bool vhost_shadow_vring_should_kick(VhostShadowVirtqueue *vq)
> > +{
> > +    return !(vhost_shadow_vring_used_flags(vq) & VRING_USED_F_NO_NOTIFY);
> > +}
> > +
> > +static void vhost_shadow_vring_kick(VhostShadowVirtqueue *vq)
> > +{
> > +    if (vhost_shadow_vring_should_kick(vq)) {
> > +        event_notifier_set(&vq->kick_notifier);
> > +    }
> > +}
> > +
> > +static void handle_shadow_vq(VirtIODevice *vdev, VirtQueue *vq)
> > +{
> > +    struct vhost_dev *hdev = vhost_dev_from_virtio(vdev);
> > +    uint16_t idx = virtio_get_queue_index(vq);
> > +
> > +    VhostShadowVirtqueue *svq = hdev->shadow_vqs[idx];
> > +
> > +    vhost_shadow_vring_kick(svq);
> > +}
> > +
> > +/*
> > + * Start shadow virtqueue operation.
> > + * @dev vhost device
> > + * @svq Shadow Virtqueue
> > + *
> > + * Run in RCU context
> > + */
> > +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev,
> > +                               VhostShadowVirtqueue *svq)
> > +{
> > +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(dev->vdev);
> > +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
> > +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
> > +    struct vhost_vring_file kick_file = {
> > +        .index = idx,
> > +        .fd = event_notifier_get_fd(&svq->kick_notifier),
> > +    };
> > +    int r;
> > +    bool ok;
> > +
> > +    /* Check that notifications are still going directly to vhost dev */
> > +    assert(virtio_queue_host_notifier_status(svq->vq));
> > +
> > +    ok = k->set_vq_handler(dev->vdev, idx, handle_shadow_vq);
> > +    if (!ok) {
> > +        error_report("Couldn't set the vq handler");
> > +        goto err_set_kick_handler;
> > +    }
> > +
> > +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
> > +    if (r != 0) {
> > +        error_report("Couldn't set kick fd: %s", strerror(errno));
> > +        goto err_set_vring_kick;
> > +    }
> > +
> > +    event_notifier_set_handler(vq_host_notifier,
> > +                               virtio_queue_host_notifier_read);
> > +    virtio_queue_set_host_notifier_enabled(svq->vq, false);
> > +    virtio_queue_host_notifier_read(vq_host_notifier);
> > +
> > +    return true;
> > +
> > +err_set_vring_kick:
> > +    k->set_vq_handler(dev->vdev, idx, NULL);
> > +
> > +err_set_kick_handler:
> > +    return false;
> > +}
> > +
> > +/*
> > + * Stop shadow virtqueue operation.
> > + * @dev vhost device
> > + * @svq Shadow Virtqueue
> > + *
> > + * Run in RCU context
> > + */
> > +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev,
> > +                              VhostShadowVirtqueue *svq)
> > +{
> > +    const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(svq->vdev);
> > +    unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq);
> > +    EventNotifier *vq_host_notifier = virtio_queue_get_host_notifier(svq->vq);
> > +    struct vhost_vring_file kick_file = {
> > +        .index = idx,
> > +        .fd = event_notifier_get_fd(vq_host_notifier),
> > +    };
> > +    int r;
> > +
> > +    /* Restore vhost kick */
> > +    r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file);
> > +    /* Cannot do a lot of things */
> > +    assert(r == 0);
> > +
> > +    event_notifier_set_handler(vq_host_notifier, NULL);
> > +    virtio_queue_set_host_notifier_enabled(svq->vq, true);
> > +    k->set_vq_handler(svq->vdev, idx, NULL);
> > +}
> > +
> >   /*
> >    * Creates vhost shadow virtqueue, and instruct vhost device to use the shadow
> >    * methods and file descriptors.
> > @@ -24,8 +138,13 @@ typedef struct VhostShadowVirtqueue {
> >   VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
> >   {
> >       g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
> > +    int vq_idx = dev->vhost_ops->vhost_get_vq_index(dev, dev->vq_index + idx);
> >       int r;
> >
> > +    svq->vq = virtio_get_queue(dev->vdev, vq_idx);
> > +    svq->hvq = &dev->vqs[idx];
> > +    svq->vdev = dev->vdev;
> > +
> >       r = event_notifier_init(&svq->kick_notifier, 0);
> >       if (r != 0) {
> >           error_report("Couldn't create kick event notifier: %s",
> > @@ -40,7 +159,7 @@ VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx)
> >           goto err_init_call_notifier;
> >       }
> >
> > -    return svq;
> > +    return g_steal_pointer(&svq);
> >
> >   err_init_call_notifier:
> >       event_notifier_cleanup(&svq->kick_notifier);
> > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > index 42836e45f3..bde688f278 100644
> > --- a/hw/virtio/vhost.c
> > +++ b/hw/virtio/vhost.c
> > @@ -25,6 +25,7 @@
> >   #include "exec/address-spaces.h"
> >   #include "hw/virtio/virtio-bus.h"
> >   #include "hw/virtio/virtio-access.h"
> > +#include "hw/virtio/vhost-shadow-virtqueue.h"
> >   #include "migration/blocker.h"
> >   #include "migration/qemu-file-types.h"
> >   #include "sysemu/dma.h"
> > @@ -945,6 +946,82 @@ static void vhost_log_global_stop(MemoryListener *listener)
> >       }
> >   }
> >
> > +static int vhost_sw_live_migration_stop(struct vhost_dev *dev)
> > +{
> > +    int idx;
> > +
> > +    WITH_RCU_READ_LOCK_GUARD() {
> > +        dev->sw_lm_enabled = false;
> > +
> > +        for (idx = 0; idx < dev->nvqs; ++idx) {
> > +            vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[idx]);
> > +        }
> > +    }
> > +
> > +    for (idx = 0; idx < dev->nvqs; ++idx) {
> > +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
> > +    }
> > +
> > +    g_free(dev->shadow_vqs);
> > +    dev->shadow_vqs = NULL;
> > +    return 0;
> > +}
> > +
> > +static int vhost_sw_live_migration_start(struct vhost_dev *dev)
> > +{
> > +    int idx;
> > +
> > +    dev->shadow_vqs = g_new0(VhostShadowVirtqueue *, dev->nvqs);
> > +    for (idx = 0; idx < dev->nvqs; ++idx) {
> > +        dev->shadow_vqs[idx] = vhost_shadow_vq_new(dev, idx);
> > +        if (unlikely(dev->shadow_vqs[idx] == NULL)) {
> > +            goto err;
> > +        }
> > +    }
> > +
> > +    WITH_RCU_READ_LOCK_GUARD() {
> > +        for (idx = 0; idx < dev->nvqs; ++idx) {
> > +            int stop_idx = idx;
> > +            bool ok = vhost_shadow_vq_start_rcu(dev,
> > +                                                dev->shadow_vqs[idx]);
> > +
> > +            if (!ok) {
> > +                while (--stop_idx >= 0) {
> > +                    vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[stop_idx]);
> > +                }
> > +
> > +                goto err;
> > +            }
> > +        }
> > +    }
> > +
> > +    dev->sw_lm_enabled = true;
> > +    return 0;
> > +
> > +err:
> > +    for (; idx >= 0; --idx) {
> > +        vhost_shadow_vq_free(dev->shadow_vqs[idx]);
> > +    }
> > +    g_free(dev->shadow_vqs[idx]);
> > +
> > +    return -1;
> > +}
> > +
> > +static int vhost_sw_live_migration_enable(struct vhost_dev *dev,
> > +                                          bool enable_lm)
> > +{
> > +    int r;
> > +
> > +    if (enable_lm == dev->sw_lm_enabled) {
> > +        return 0;
> > +    }
> > +
> > +    r = enable_lm ? vhost_sw_live_migration_start(dev)
> > +                  : vhost_sw_live_migration_stop(dev);
> > +
> > +    return r;
> > +}
> > +
> >   static void vhost_log_start(MemoryListener *listener,
> >                               MemoryRegionSection *section,
> >                               int old, int new)
> > @@ -1389,6 +1466,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
> >       hdev->log = NULL;
> >       hdev->log_size = 0;
> >       hdev->log_enabled = false;
> > +    hdev->sw_lm_enabled = false;
> >       hdev->started = false;
> >       memory_listener_register(&hdev->memory_listener, &address_space_memory);
> >       QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
> > @@ -1816,6 +1894,11 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
> >           hdev->vhost_ops->vhost_dev_start(hdev, false);
> >       }
> >       for (i = 0; i < hdev->nvqs; ++i) {
> > +        if (hdev->sw_lm_enabled) {
> > +            vhost_shadow_vq_stop_rcu(hdev, hdev->shadow_vqs[i]);
> > +            vhost_shadow_vq_free(hdev->shadow_vqs[i]);
> > +        }
> > +
> >           vhost_virtqueue_stop(hdev,
> >                                vdev,
> >                                hdev->vqs + i,
> > @@ -1829,6 +1912,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev)
> >           memory_listener_unregister(&hdev->iommu_listener);
> >       }
> >       vhost_log_put(hdev, true);
> > +    g_free(hdev->shadow_vqs);
> > +    hdev->sw_lm_enabled = false;
> >       hdev->started = false;
> >       hdev->vdev = NULL;
> >   }
> > @@ -1845,5 +1930,53 @@ int vhost_net_set_backend(struct vhost_dev *hdev,
> >
> >   void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error **errp)
> >   {
> > -    error_setg(errp, "Shadow virtqueue still not implemented.");
> > +    struct vhost_dev *hdev;
> > +    const char *err_cause = NULL;
> > +    const VirtioDeviceClass *k;
> > +    int r;
> > +    ErrorClass err_class = ERROR_CLASS_GENERIC_ERROR;
> > +
> > +    QLIST_FOREACH(hdev, &vhost_devices, entry) {
> > +        if (hdev->vdev && 0 == strcmp(hdev->vdev->name, name)) {
> > +            break;
> > +        }
> > +    }
> > +
> > +    if (!hdev) {
> > +        err_class = ERROR_CLASS_DEVICE_NOT_FOUND;
> > +        err_cause = "Device not found";
> > +        goto err;
> > +    }
> > +
> > +    if (!hdev->started) {
> > +        err_cause = "Device is not started";
> > +        goto err;
> > +    }
> > +
> > +    if (hdev->acked_features & BIT_ULL(VIRTIO_F_RING_PACKED)) {
> > +        err_cause = "Use packed vq";
> > +        goto err;
> > +    }
> > +
> > +    if (vhost_dev_has_iommu(hdev)) {
> > +        err_cause = "Device use IOMMU";
> > +        goto err;
> > +    }
> > +
> > +    k = VIRTIO_DEVICE_GET_CLASS(hdev->vdev);
> > +    if (!k->set_vq_handler) {
> > +        err_cause = "Virtio device type does not support reset of vq 
> > handler";
> > +        goto err;
> > +    }
> > +
> > +    r = vhost_sw_live_migration_enable(hdev, enable);
> > +    if (unlikely(r)) {
> > +        err_cause = "Error enabling (see monitor)";
> > +    }
> > +
> > +err:
> > +    if (err_cause) {
> > +        error_set(errp, err_class,
> > +                  "Can't enable shadow vq on %s: %s", name, err_cause);
> > +    }
> >   }
>

