On Mon, Feb 1, 2021 at 7:29 AM Jason Wang <jasow...@redhat.com> wrote: > > > On 2021/1/30 上午4:54, Eugenio Pérez wrote: > > Shadow virtqueue notifications forwarding is disabled when vhost_dev > > stops. > > > > Signed-off-by: Eugenio Pérez <epere...@redhat.com> > > --- > > hw/virtio/vhost-shadow-virtqueue.h | 5 ++ > > include/hw/virtio/vhost.h | 4 + > > hw/virtio/vhost-shadow-virtqueue.c | 123 +++++++++++++++++++++++++- > > hw/virtio/vhost.c | 135 ++++++++++++++++++++++++++++- > > 4 files changed, 264 insertions(+), 3 deletions(-) > > > > diff --git a/hw/virtio/vhost-shadow-virtqueue.h > > b/hw/virtio/vhost-shadow-virtqueue.h > > index 6cc18d6acb..466f8ae595 100644 > > --- a/hw/virtio/vhost-shadow-virtqueue.h > > +++ b/hw/virtio/vhost-shadow-virtqueue.h > > @@ -17,6 +17,11 @@ > > > > typedef struct VhostShadowVirtqueue VhostShadowVirtqueue; > > > > +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev, > > + VhostShadowVirtqueue *svq); > > +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev, > > + VhostShadowVirtqueue *svq); > > + > > VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx); > > > > void vhost_shadow_vq_free(VhostShadowVirtqueue *vq); > > diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h > > index 2be782cefd..732a4b2a2b 100644 > > --- a/include/hw/virtio/vhost.h > > +++ b/include/hw/virtio/vhost.h > > @@ -55,6 +55,8 @@ struct vhost_iommu { > > QLIST_ENTRY(vhost_iommu) iommu_next; > > }; > > > > +typedef struct VhostShadowVirtqueue VhostShadowVirtqueue; > > + > > typedef struct VhostDevConfigOps { > > /* Vhost device config space changed callback > > */ > > @@ -83,7 +85,9 @@ struct vhost_dev { > > uint64_t backend_cap; > > bool started; > > bool log_enabled; > > + bool sw_lm_enabled; > > uint64_t log_size; > > + VhostShadowVirtqueue **shadow_vqs; > > Error *migration_blocker; > > const VhostOps *vhost_ops; > > void *opaque; > > diff --git a/hw/virtio/vhost-shadow-virtqueue.c > > b/hw/virtio/vhost-shadow-virtqueue.c 
> > index c0c967a7c5..908c36c66d 100644
> > --- a/hw/virtio/vhost-shadow-virtqueue.c
> > +++ b/hw/virtio/vhost-shadow-virtqueue.c
> > @@ -8,15 +8,129 @@
> >   */
> >
> >  #include "hw/virtio/vhost-shadow-virtqueue.h"
> > +#include "hw/virtio/vhost.h"
> > +#include "hw/virtio/virtio-access.h"
> > +
> > +#include "standard-headers/linux/vhost_types.h"
> > +#include "standard-headers/linux/virtio_ring.h"
> >
> >  #include "qemu/error-report.h"
> > -#include "qemu/event_notifier.h"
> > +#include "qemu/main-loop.h"
> >
> >  typedef struct VhostShadowVirtqueue {
> >      EventNotifier kick_notifier;
> >      EventNotifier call_notifier;
> > +    const struct vhost_virtqueue *hvq;
> > +    VirtIODevice *vdev;
> > +    VirtQueue *vq;
> >  } VhostShadowVirtqueue;
>
>
> So instead of doing things at virtio level, how about do the shadow
> stuffs at vhost level?
>
> It works like:
>
> virtio -> [shadow vhost backend] -> vhost backend
>
> Then the QMP is used to plug the shadow vhost backend in the middle or not.
>
> It looks kind of easier since we don't need to deal with virtqueue
> handlers etc.. Instead, we just need to deal with eventfd stuffs:
>
> When shadow vhost mode is enabled, we just intercept the host_notifiers
> and guest_notifiers. When it was disabled, we just pass the host/guest
> notifiers to the real vhost backends?
>

Hi Jason.

Sure, we can try that model, but it seems to me that it comes with a
different set of problems.

For example, there is code in vhost.c that checks whether an
implementation is available in vhost_ops, like:

if (dev->vhost_ops->vhost_vq_get_addr) {
    r = dev->vhost_ops->vhost_vq_get_addr(dev, &addr, vq);
    ...
}

I can count 14 of these, checking:

dev->vhost_ops->vhost_backend_can_merge
dev->vhost_ops->vhost_backend_mem_section_filter
dev->vhost_ops->vhost_force_iommu
dev->vhost_ops->vhost_requires_shm_log
dev->vhost_ops->vhost_set_backend_cap
dev->vhost_ops->vhost_set_vring_busyloop_timeout
dev->vhost_ops->vhost_vq_get_addr
hdev->vhost_ops->vhost_dev_start
hdev->vhost_ops->vhost_get_config
hdev->vhost_ops->vhost_get_inflight_fd
hdev->vhost_ops->vhost_net_set_backend
hdev->vhost_ops->vhost_set_config
hdev->vhost_ops->vhost_set_inflight_fd
hdev->vhost_ops->vhost_set_iotlb_callback

So should we implement all of the vhost_ops callbacks, forwarding them
to the actual vhost backend, and conditionally delete the ones the
backend lacks? In other words, dynamically generate the new shadow vq
vhost_ops? If a new callback is added to any vhost backend in the
future, do we have to force adding it (or a NULL check for it) to the
shadow backend vhost_ops? Would this be a good moment to check whether
all backends implement these and delete the checks?

There are also checks like:

if (dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_USER)

How would the shadow_vq backend expose itself? (I guess as the actual
backend in use.)

I can modify this patchset to relay the guest->host notifications on
eventfd handlers instead of on vq handlers. Although this will make it
independent of the actual virtio device kind used, I can see two
drawbacks:

* The very fact that it becomes independent of the virtio device kind.
If a device does not use the notifiers and polls the ring by itself, it
has no chance of knowing that it should stop. What happens if the
virtio-net tx timer is armed when we start shadow vq?
* Fixes (current and future) to vq notifications, like the one
currently implemented in virtio_notify_irqfd for Windows drivers
regarding ISR bit 0, would not be picked up automatically. I think this
one in particular is OK not to carry, but I think many changes
affecting either notification path will have to be mirrored in the
other.

Thoughts on this? Thanks!

> Thanks
>
>
> >
> > +static uint16_t vhost_shadow_vring_used_flags(VhostShadowVirtqueue *svq)
> > +{
> > +    const struct vring_used *used = svq->hvq->used;
> > +    return virtio_tswap16(svq->vdev, used->flags);
> > +}
> > +
> > +static bool vhost_shadow_vring_should_kick(VhostShadowVirtqueue *vq)
> > +{
> > +    return !(vhost_shadow_vring_used_flags(vq) & VRING_USED_F_NO_NOTIFY);
> > +}
> > +
> > +static void vhost_shadow_vring_kick(VhostShadowVirtqueue *vq)
> > +{
> > +    if (vhost_shadow_vring_should_kick(vq)) {
> > +        event_notifier_set(&vq->kick_notifier);
> > +    }
> > +}
> > +
> > +static void handle_shadow_vq(VirtIODevice *vdev, VirtQueue *vq)
> > +{
> > +    struct vhost_dev *hdev = vhost_dev_from_virtio(vdev);
> > +    uint16_t idx = virtio_get_queue_index(vq);
> > +
> > +    VhostShadowVirtqueue *svq = hdev->shadow_vqs[idx];
> > +
> > +    vhost_shadow_vring_kick(svq);
> > +}
> > +
> > +/*
> > + * Start shadow virtqueue operation.
> > + * @dev vhost device > > + * @svq Shadow Virtqueue > > + * > > + * Run in RCU context > > + */ > > +bool vhost_shadow_vq_start_rcu(struct vhost_dev *dev, > > + VhostShadowVirtqueue *svq) > > +{ > > + const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(dev->vdev); > > + EventNotifier *vq_host_notifier = > > virtio_queue_get_host_notifier(svq->vq); > > + unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq); > > + struct vhost_vring_file kick_file = { > > + .index = idx, > > + .fd = event_notifier_get_fd(&svq->kick_notifier), > > + }; > > + int r; > > + bool ok; > > + > > + /* Check that notifications are still going directly to vhost dev */ > > + assert(virtio_queue_host_notifier_status(svq->vq)); > > + > > + ok = k->set_vq_handler(dev->vdev, idx, handle_shadow_vq); > > + if (!ok) { > > + error_report("Couldn't set the vq handler"); > > + goto err_set_kick_handler; > > + } > > + > > + r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file); > > + if (r != 0) { > > + error_report("Couldn't set kick fd: %s", strerror(errno)); > > + goto err_set_vring_kick; > > + } > > + > > + event_notifier_set_handler(vq_host_notifier, > > + virtio_queue_host_notifier_read); > > + virtio_queue_set_host_notifier_enabled(svq->vq, false); > > + virtio_queue_host_notifier_read(vq_host_notifier); > > + > > + return true; > > + > > +err_set_vring_kick: > > + k->set_vq_handler(dev->vdev, idx, NULL); > > + > > +err_set_kick_handler: > > + return false; > > +} > > + > > +/* > > + * Stop shadow virtqueue operation. 
> > + * @dev vhost device > > + * @svq Shadow Virtqueue > > + * > > + * Run in RCU context > > + */ > > +void vhost_shadow_vq_stop_rcu(struct vhost_dev *dev, > > + VhostShadowVirtqueue *svq) > > +{ > > + const VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(svq->vdev); > > + unsigned idx = virtio_queue_get_idx(svq->vdev, svq->vq); > > + EventNotifier *vq_host_notifier = > > virtio_queue_get_host_notifier(svq->vq); > > + struct vhost_vring_file kick_file = { > > + .index = idx, > > + .fd = event_notifier_get_fd(vq_host_notifier), > > + }; > > + int r; > > + > > + /* Restore vhost kick */ > > + r = dev->vhost_ops->vhost_set_vring_kick(dev, &kick_file); > > + /* Cannot do a lot of things */ > > + assert(r == 0); > > + > > + event_notifier_set_handler(vq_host_notifier, NULL); > > + virtio_queue_set_host_notifier_enabled(svq->vq, true); > > + k->set_vq_handler(svq->vdev, idx, NULL); > > +} > > + > > /* > > * Creates vhost shadow virtqueue, and instruct vhost device to use the > > shadow > > * methods and file descriptors. 
> > @@ -24,8 +138,13 @@ typedef struct VhostShadowVirtqueue { > > VhostShadowVirtqueue *vhost_shadow_vq_new(struct vhost_dev *dev, int idx) > > { > > g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, > > 1); > > + int vq_idx = dev->vhost_ops->vhost_get_vq_index(dev, dev->vq_index + > > idx); > > int r; > > > > + svq->vq = virtio_get_queue(dev->vdev, vq_idx); > > + svq->hvq = &dev->vqs[idx]; > > + svq->vdev = dev->vdev; > > + > > r = event_notifier_init(&svq->kick_notifier, 0); > > if (r != 0) { > > error_report("Couldn't create kick event notifier: %s", > > @@ -40,7 +159,7 @@ VhostShadowVirtqueue *vhost_shadow_vq_new(struct > > vhost_dev *dev, int idx) > > goto err_init_call_notifier; > > } > > > > - return svq; > > + return g_steal_pointer(&svq); > > > > err_init_call_notifier: > > event_notifier_cleanup(&svq->kick_notifier); > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > > index 42836e45f3..bde688f278 100644 > > --- a/hw/virtio/vhost.c > > +++ b/hw/virtio/vhost.c > > @@ -25,6 +25,7 @@ > > #include "exec/address-spaces.h" > > #include "hw/virtio/virtio-bus.h" > > #include "hw/virtio/virtio-access.h" > > +#include "hw/virtio/vhost-shadow-virtqueue.h" > > #include "migration/blocker.h" > > #include "migration/qemu-file-types.h" > > #include "sysemu/dma.h" > > @@ -945,6 +946,82 @@ static void vhost_log_global_stop(MemoryListener > > *listener) > > } > > } > > > > +static int vhost_sw_live_migration_stop(struct vhost_dev *dev) > > +{ > > + int idx; > > + > > + WITH_RCU_READ_LOCK_GUARD() { > > + dev->sw_lm_enabled = false; > > + > > + for (idx = 0; idx < dev->nvqs; ++idx) { > > + vhost_shadow_vq_stop_rcu(dev, dev->shadow_vqs[idx]); > > + } > > + } > > + > > + for (idx = 0; idx < dev->nvqs; ++idx) { > > + vhost_shadow_vq_free(dev->shadow_vqs[idx]); > > + } > > + > > + g_free(dev->shadow_vqs); > > + dev->shadow_vqs = NULL; > > + return 0; > > +} > > + > > +static int vhost_sw_live_migration_start(struct vhost_dev *dev) > > +{ > > + int idx; > 
> + > > + dev->shadow_vqs = g_new0(VhostShadowVirtqueue *, dev->nvqs); > > + for (idx = 0; idx < dev->nvqs; ++idx) { > > + dev->shadow_vqs[idx] = vhost_shadow_vq_new(dev, idx); > > + if (unlikely(dev->shadow_vqs[idx] == NULL)) { > > + goto err; > > + } > > + } > > + > > + WITH_RCU_READ_LOCK_GUARD() { > > + for (idx = 0; idx < dev->nvqs; ++idx) { > > + int stop_idx = idx; > > + bool ok = vhost_shadow_vq_start_rcu(dev, > > + dev->shadow_vqs[idx]); > > + > > + if (!ok) { > > + while (--stop_idx >= 0) { > > + vhost_shadow_vq_stop_rcu(dev, > > dev->shadow_vqs[stop_idx]); > > + } > > + > > + goto err; > > + } > > + } > > + } > > + > > + dev->sw_lm_enabled = true; > > + return 0; > > + > > +err: > > + for (; idx >= 0; --idx) { > > + vhost_shadow_vq_free(dev->shadow_vqs[idx]); > > + } > > + g_free(dev->shadow_vqs[idx]); > > + > > + return -1; > > +} > > + > > +static int vhost_sw_live_migration_enable(struct vhost_dev *dev, > > + bool enable_lm) > > +{ > > + int r; > > + > > + if (enable_lm == dev->sw_lm_enabled) { > > + return 0; > > + } > > + > > + r = enable_lm ? 
vhost_sw_live_migration_start(dev) > > + : vhost_sw_live_migration_stop(dev); > > + > > + return r; > > +} > > + > > static void vhost_log_start(MemoryListener *listener, > > MemoryRegionSection *section, > > int old, int new) > > @@ -1389,6 +1466,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void > > *opaque, > > hdev->log = NULL; > > hdev->log_size = 0; > > hdev->log_enabled = false; > > + hdev->sw_lm_enabled = false; > > hdev->started = false; > > memory_listener_register(&hdev->memory_listener, > > &address_space_memory); > > QLIST_INSERT_HEAD(&vhost_devices, hdev, entry); > > @@ -1816,6 +1894,11 @@ void vhost_dev_stop(struct vhost_dev *hdev, > > VirtIODevice *vdev) > > hdev->vhost_ops->vhost_dev_start(hdev, false); > > } > > for (i = 0; i < hdev->nvqs; ++i) { > > + if (hdev->sw_lm_enabled) { > > + vhost_shadow_vq_stop_rcu(hdev, hdev->shadow_vqs[i]); > > + vhost_shadow_vq_free(hdev->shadow_vqs[i]); > > + } > > + > > vhost_virtqueue_stop(hdev, > > vdev, > > hdev->vqs + i, > > @@ -1829,6 +1912,8 @@ void vhost_dev_stop(struct vhost_dev *hdev, > > VirtIODevice *vdev) > > memory_listener_unregister(&hdev->iommu_listener); > > } > > vhost_log_put(hdev, true); > > + g_free(hdev->shadow_vqs); > > + hdev->sw_lm_enabled = false; > > hdev->started = false; > > hdev->vdev = NULL; > > } > > @@ -1845,5 +1930,53 @@ int vhost_net_set_backend(struct vhost_dev *hdev, > > > > void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error > > **errp) > > { > > - error_setg(errp, "Shadow virtqueue still not implemented."); > > + struct vhost_dev *hdev; > > + const char *err_cause = NULL; > > + const VirtioDeviceClass *k; > > + int r; > > + ErrorClass err_class = ERROR_CLASS_GENERIC_ERROR; > > + > > + QLIST_FOREACH(hdev, &vhost_devices, entry) { > > + if (hdev->vdev && 0 == strcmp(hdev->vdev->name, name)) { > > + break; > > + } > > + } > > + > > + if (!hdev) { > > + err_class = ERROR_CLASS_DEVICE_NOT_FOUND; > > + err_cause = "Device not found"; > > + goto err; > > + } > > 
+ > > + if (!hdev->started) { > > + err_cause = "Device is not started"; > > + goto err; > > + } > > + > > + if (hdev->acked_features & BIT_ULL(VIRTIO_F_RING_PACKED)) { > > + err_cause = "Use packed vq"; > > + goto err; > > + } > > + > > + if (vhost_dev_has_iommu(hdev)) { > > + err_cause = "Device use IOMMU"; > > + goto err; > > + } > > + > > + k = VIRTIO_DEVICE_GET_CLASS(hdev->vdev); > > + if (!k->set_vq_handler) { > > + err_cause = "Virtio device type does not support reset of vq > > handler"; > > + goto err; > > + } > > + > > + r = vhost_sw_live_migration_enable(hdev, enable); > > + if (unlikely(r)) { > > + err_cause = "Error enabling (see monitor)"; > > + } > > + > > +err: > > + if (err_cause) { > > + error_set(errp, err_class, > > + "Can't enable shadow vq on %s: %s", name, err_cause); > > + } > > } >
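
To make the "dynamically generate the shadow vq vhost_ops" question from
the reply concrete, here is a minimal sketch of what such a table could
look like. Everything in it is hypothetical: ToyOps, struct toy_dev and
the toy_* helpers are reduced stand-ins invented for illustration, not
QEMU's VhostOps or struct vhost_dev. It only shows one possible shape:
the shadow table is built per device at plug time, and an optional
callback is left NULL whenever the real backend lacks it, so existing
"if (ops->cb)" checks keep giving the same answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_dev;

/* Hypothetical, reduced ops table: one mandatory and one optional callback. */
typedef struct ToyOps {
    int (*set_vring_kick)(struct toy_dev *dev, int fd); /* mandatory */
    int (*vq_get_addr)(struct toy_dev *dev, int vq);    /* optional, may be NULL */
} ToyOps;

struct toy_dev {
    const ToyOps *ops;   /* table the generic code calls through */
    const ToyOps *inner; /* real backend; non-NULL only while shadowed */
    ToyOps shadow_ops;   /* per-device table, filled in at plug time */
};

/* Toy "real" backend callbacks. */
static int real_set_vring_kick(struct toy_dev *dev, int fd)
{
    (void)dev;
    return fd >= 0 ? 0 : -1;
}

static int real_vq_get_addr(struct toy_dev *dev, int vq)
{
    (void)dev;
    return vq;
}

/* One backend implements the optional callback, the other does not. */
static const ToyOps toy_ops_full = {
    .set_vring_kick = real_set_vring_kick,
    .vq_get_addr = real_vq_get_addr,
};

static const ToyOps toy_ops_minimal = {
    .set_vring_kick = real_set_vring_kick,
    .vq_get_addr = NULL,
};

/* Shadow callbacks: a real shadow backend would intercept here first. */
static int shadow_set_vring_kick(struct toy_dev *dev, int fd)
{
    return dev->inner->set_vring_kick(dev, fd);
}

static int shadow_vq_get_addr(struct toy_dev *dev, int vq)
{
    return dev->inner->vq_get_addr(dev, vq);
}

/* Put the shadow table in front of the real backend. */
static void toy_shadow_plug(struct toy_dev *dev)
{
    assert(dev->inner == NULL);
    dev->inner = dev->ops;
    dev->shadow_ops.set_vring_kick = shadow_set_vring_kick;
    /* Mirror the backend: expose the optional callback only if it exists. */
    dev->shadow_ops.vq_get_addr =
        dev->inner->vq_get_addr ? shadow_vq_get_addr : NULL;
    dev->ops = &dev->shadow_ops;
}

/* Restore the real backend's table. */
static void toy_shadow_unplug(struct toy_dev *dev)
{
    assert(dev->inner != NULL);
    dev->ops = dev->inner;
    dev->inner = NULL;
}

/* The kind of availability check that vhost.c performs today. */
static bool toy_has_get_addr(const struct toy_dev *dev)
{
    return dev->ops->vq_get_addr != NULL;
}
```

The sketch also shows the maintenance cost raised in the reply: every
optional callback needs its own line in toy_shadow_plug(), so a callback
added to a backend later is silently hidden until the shadow table
learns about it.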