Memory operations like pinning may take a long time at the destination. Currently they are done after the source of the migration is stopped and before the workload is resumed at the destination. This is a period where neither traffic can flow nor the VM workload can continue (downtime).
We can do better, as we know the memory layout of the guest RAM at the
destination from the moment all devices are initialized. Moving that
operation earlier allows QEMU to communicate the maps to the kernel while
the workload is still running on the source, so Linux can start pinning
them.

As a small drawback, there is a period during initialization where QEMU
cannot respond to QMP etc. In testing, this period is about 0.2 seconds.
It may be further reduced (or increased) depending on the vDPA driver and
the platform hardware, and it is dominated by the cost of memory pinning.
This matches the time that is moved out of the so-called downtime window.

The downtime is measured by comparing trace timestamps, from the moment
the source suspends the device to the moment the destination starts the
eighth and last virtqueue pair. For a 39G guest, it goes from ~2.2526
seconds to ~2.0949 seconds.

Signed-off-by: Eugenio Pérez <epere...@redhat.com>
---
 hw/virtio/vhost-vdpa.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 521a889104..eae8b790dd 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -660,7 +660,13 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error **errp)
     vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
                                VIRTIO_CONFIG_S_DRIVER);
 
+    /*
+     * Being optimistic and listening address space memory. If the device
+     * uses vIOMMU, it is changed at vhost_vdpa_dev_start.
+     */
     v->shared->listener = vhost_vdpa_memory_listener;
+    memory_listener_register(&v->shared->listener, &address_space_memory);
+    v->shared->listener_registered = true;
 
     return 0;
 }
@@ -1331,6 +1337,11 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, bool started)
                          "IOMMU and try again");
             return -1;
         }
+        if (v->shared->listener_registered &&
+            dev->vdev->dma_as != v->shared->listener.address_space) {
+            memory_listener_unregister(&v->shared->listener);
+            v->shared->listener_registered = false;
+        }
         if (!v->shared->listener_registered) {
             memory_listener_register(&v->shared->listener, dev->vdev->dma_as);
             v->shared->listener_registered = true;
-- 
2.39.3
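
A side note for reviewers: the register-early-then-fix-up flow above can be
summarized with a standalone toy model. Everything below uses hypothetical
simplified stand-ins (the local AddressSpace, MemoryListener, dev_init and
dev_start definitions shadow QEMU's real types and functions); it is a
minimal sketch of the pattern under those assumptions, not the actual
implementation:

/*
 * Toy model (not QEMU code): register the listener optimistically at
 * init time so pinning overlaps with the source still running, then at
 * device start swap to the device's DMA address space if it differs
 * (the vIOMMU case).
 */
#include <stdbool.h>
#include <stdio.h>

typedef struct AddressSpace {
    const char *name;
} AddressSpace;

typedef struct MemoryListener {
    AddressSpace *address_space;
} MemoryListener;

static AddressSpace address_space_memory = { "memory" };

static void memory_listener_register(MemoryListener *l, AddressSpace *as)
{
    l->address_space = as;
    printf("register on '%s': pinning can start now\n", as->name);
}

static void memory_listener_unregister(MemoryListener *l)
{
    printf("unregister from '%s'\n", l->address_space->name);
    l->address_space = NULL;
}

struct shared {
    MemoryListener listener;
    bool listener_registered;
};

/* Mirrors the vhost_vdpa_init() hunk: optimistic early registration. */
static void dev_init(struct shared *s)
{
    memory_listener_register(&s->listener, &address_space_memory);
    s->listener_registered = true;
}

/*
 * Mirrors the vhost_vdpa_dev_start() hunk: re-register only if the
 * device's DMA address space is not the one guessed at init time.
 */
static void dev_start(struct shared *s, AddressSpace *dma_as)
{
    if (s->listener_registered && dma_as != s->listener.address_space) {
        memory_listener_unregister(&s->listener);
        s->listener_registered = false;
    }
    if (!s->listener_registered) {
        memory_listener_register(&s->listener, dma_as);
        s->listener_registered = true;
    }
}

int main(void)
{
    struct shared s = { .listener_registered = false };
    AddressSpace viommu_as = { "viommu" };

    dev_init(&s);              /* pinning starts, workload still on source */
    dev_start(&s, &viommu_as); /* vIOMMU device: guessed wrong, re-register */
    dev_start(&s, &viommu_as); /* already correct: no-op */
    return 0;
}

Running it prints one register on 'memory', one unregister, and one
re-register on 'viommu', matching the sequence the patch produces for a
device behind a vIOMMU; for the common non-vIOMMU case the init-time
registration is already correct and dev_start does nothing.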