From: "Maciej S. Szmigiero" <[email protected]>
When multiple VFIO devices are present in a VM, their state transitions
during migration happen sequentially, which has a visible impact on
migration downtime.
This is because both the PRE_COPY -> PRE_COPY_P2P -> STOP_COPY transitions
on the source and the RESUMING -> RUNNING_P2P -> RUNNING transitions on the
target happen during the switchover phase.
The VM is stopped during this phase, so the downtime clock is ticking.
These device state transitions are performed by VM state change handlers
registered by the VFIO device migration code.
Instead of performing such a state transition synchronously, launch a
thread that performs the state change in parallel with other VFIO devices
and with other VM state change handlers at the same qdev tree depth as the
particular VFIO device.
Only wait for this thread to finish *after* all other handlers at this
tree depth have finished their work.
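
The start-early / wait-late split described above can be sketched with
plain POSIX threads. This is only an illustration under stated assumptions,
not the code from this series: DemoDevice and every demo_* function are
hypothetical names, and a trivial assignment stands in for the real device
state transition.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device bookkeeping; this is not QEMU's VFIODevice. */
typedef struct DemoDevice {
    pthread_t thread;
    bool thread_started;
    int state;              /* 0 = before the transition, 1 = after it */
} DemoDevice;

/* Thread body: stands in for a slow device state transition. */
static void *demo_transition_thread(void *opaque)
{
    DemoDevice *dev = opaque;

    dev->state = 1;
    return NULL;
}

/* "Early" handler: launch the transition and return immediately. */
static int demo_handler_start(DemoDevice *dev)
{
    int ret = pthread_create(&dev->thread, NULL, demo_transition_thread, dev);

    dev->thread_started = (ret == 0);
    return ret;
}

/* "Late" handler: wait for the transition launched by the early handler. */
static int demo_handler_finish(DemoDevice *dev)
{
    if (!dev->thread_started) {
        return 0;
    }
    dev->thread_started = false;
    return pthread_join(dev->thread, NULL);
}

/*
 * Start all transitions, then join them all.
 * Returns the number of devices that reached state 1, or -1 on error.
 */
static int demo_run(DemoDevice *devs, size_t n)
{
    int done = 0;
    int err = 0;

    assert(devs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (demo_handler_start(&devs[i]) != 0) {
            err = 1;
        }
    }

    /* Ordinary (serialized) handlers at this depth would run here. */

    for (size_t i = 0; i < n; i++) {
        if (demo_handler_finish(&devs[i]) != 0) {
            err = 1;
        }
        done += devs[i].state;  /* safe to read: the thread has been joined */
    }

    return err ? -1 : done;
}
```

With four zero-initialized DemoDevice entries, demo_run() starts four
threads before joining any of them, so the stand-in transitions overlap
instead of running back to back.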
To implement this scheme, allow adjusting the priority of VM state change
handlers. Specifically, allow registering qdev VM state change handlers
below and above the normal priority level for the registering device's
qdev tree depth, while keeping them properly ordered with respect to
handlers registered at other tree depths.
This way these state transitions can happen in parallel not only with
respect to other VFIO device instances but also with respect to ordinary
(serialized) handlers for other devices at this qdev tree depth.
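
One way to picture the "below / normal / above" levels is to give every
tree depth three priority slots. The sketch below assumes exactly that
mapping (depth * 3 plus an offset), which is an illustrative choice and not
necessarily how the series encodes priorities. DemoHandler, DemoPrioAdjust
and the demo_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical adjustment relative to the normal level of a tree depth. */
typedef enum {
    DEMO_PRIO_BELOW  = -1,
    DEMO_PRIO_NORMAL = 0,
    DEMO_PRIO_ABOVE  = 1,
} DemoPrioAdjust;

typedef struct DemoHandler {
    int depth;              /* qdev tree depth of the registering device */
    DemoPrioAdjust adjust;  /* below / normal / above that depth's level */
    int id;                 /* registration order, used as a tie-breaker */
} DemoHandler;

/*
 * Assumed mapping: three slots per depth, so an adjusted handler never
 * crosses into the range that belongs to a neighbouring depth.
 */
static int demo_effective_priority(const DemoHandler *h)
{
    return h->depth * 3 + ((int)h->adjust + 1);
}

static int demo_cmp(const void *a, const void *b)
{
    const DemoHandler *ha = a;
    const DemoHandler *hb = b;
    int pa = demo_effective_priority(ha);
    int pb = demo_effective_priority(hb);

    if (pa != pb) {
        return pa < pb ? -1 : 1;
    }
    /* qsort() is not stable, so break ties by registration order. */
    return (ha->id > hb->id) - (ha->id < hb->id);
}

/* Sort handlers into ascending effective priority. */
static void demo_sort_handlers(DemoHandler *handlers, size_t n)
{
    assert(handlers != NULL || n == 0);
    qsort(handlers, n, sizeof(*handlers), demo_cmp);
}
```

Under this mapping a handler registered "below" at depth 1 still sorts
after every handler at depth 0, and one registered "above" at depth 1 still
sorts before every handler at depth 2, which is the ordering property
described above.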
Downtime results:
                4 VFs     2 VFs     1 VF
Disabled:     1385 ms    758 ms   497 ms
Enabled:       986 ms    653 ms   493 ms
Improvement:    ~29 %     ~14 %     ~0 %
Test VM shape:
vCPU 12 cores x 2 threads, 15 GiB RAM.
VFIO devices in the source and target machine:
Mellanox ConnectX-7 with 100GbE link and ~100 MiB of device state per VF.
Maciej S. Szmigiero (2):
  system/runstate: Allow adjustment of priority for VM state change
    handlers
  vfio/migration: Parallelize device state transitions
 hw/core/vm-change-state-handler.c |  22 ++--
 hw/vfio/migration.c               | 174 ++++++++++++++++++++++++++++--
 hw/vfio/pci.c                     |   2 +
 hw/vfio/vfio-migration-internal.h |   4 +-
 include/hw/vfio/vfio-device.h     |   1 +
 include/system/runstate.h         |   2 +-
 6 files changed, 189 insertions(+), 16 deletions(-)