From: "Maciej S. Szmigiero" <maciej.szmigi...@oracle.com>

VFIO device state transfer is currently done via the main migration channel.
This means that transfers from multiple VFIO devices are done sequentially
and via just a single common migration channel.

Such way of transferring VFIO device state migration data reduces
performance and severally impacts the migration downtime (~50%) for VMs
that have multiple such devices with large state size - see the test
results below.

However, we already have a way to transfer migration data using multiple
connections - that's what multifd channels are.

Unfortunately, multifd channels are currently utilized for RAM transfer
only.
This patch set adds a new framework allowing their use for device state
transfer too.

The wire protocol is based on Avihai's x-channel-header patches, which
introduce a header for migration channels that allow the migration source
to explicitly indicate the migration channel type without having the
target deduce the channel type by peeking in the channel's content.

The new wire protocol can be switch on and off via migration.x-channel-header
option for compatibility with older QEMU versions and testing.
Switching the new wire protocol off also disables device state transfer via
multifd channels.

The device state transfer can happen either via the same multifd channels
as RAM data is transferred, mixed with RAM data (when
migration.x-multifd-channels-device-state is 0) or exclusively via
dedicated device state transfer channels (when
migration.x-multifd-channels-device-state > 0).

Using dedicated device state transfer multifd channels brings further
performance benefits since these channels don't need to participate in
the RAM sync process.


These patches introduce a few new SaveVMHandlers:
* "save_live_complete_precopy_async{,wait}" handlers that allow device to
  provide its own asynchronous transmission of the remaining data at the
  end of a precopy phase.

  The "save_live_complete_precopy_async" handler is supposed to start such
  transmission (for example, by launching appropriate threads) while the
  "save_live_complete_precopy_async_wait" handler is supposed to wait until
  such transfer has finished (for example, until the sending threads
  have exited).

* "load_state_buffer" and its caller qemu_loadvm_load_state_buffer() that
  allow providing device state buffer to explicitly specified device via
  its idstr and instance id.

* "load_finish" the allows migration code to poll whether a device-specific
  asynchronous device state loading operation had finished before proceeding
  further in the migration process (with associated condition variable for
  notification to avoid unnecessary polling).


A VFIO device migration consumer for the new multifd channels device state
migration framework was implemented with a reassembly process for the multifd
received data since device state packets sent via different multifd channels
can arrive out-of-order.

The VFIO device data loading process happens in a separate thread in order
to avoid blocking a multifd receive thread during this fairly long process.


Test setup:
Source machine: 2x Xeon Gold 5218 / 192 GiB RAM
                Mellanox ConnectX-7 with 100GbE link
                6.9.0-rc1+ kernel
Target machine: 2x Xeon Platinum 8260 / 376 GiB RAM
                Mellanox ConnectX-7 with 100GbE link
                6.6.0+ kernel
VM: CPU 12cores x 2threads / 15 GiB RAM / 4x Mellanox ConnectX-7 VF


Migration config: 15 multifd channels total
                  new way had 4 channels dedicated to device state transfer
                  x-return-path=true, x-switchover-ack=true

Downtime with ~400 MiB VFIO total device state size:
                                                       TLS off     TLS on
        migration.x-channel-header=false (old way)    ~2100 ms   ~2300 ms
        migration.x-channel-header=true (new way)     ~1100 ms   ~1200 ms
        IMPROVEMENT                                       ~50%       ~50%


This patch set is also available as a git tree:
https://github.com/maciejsszmigiero/qemu/tree/multifd-device-state-transfer-vfio


Avihai Horon (7):
  migration: Add x-channel-header pseudo-capability
  migration: Add migration channel header send/receive
  migration: Add send/receive header for main channel
  migration: Allow passing migration header in migration channel
    creation
  migration: Add send/receive header for postcopy preempt channel
  migration: Add send/receive header for multifd channel
  migration: Enable x-channel-header pseudo-capability

Maciej S. Szmigiero (19):
  multifd: change multifd_new_send_channel_create() param type
  migration: Add a DestroyNotify parameter to
    socket_send_channel_create()
  multifd: pass MFDSendChannelConnectData when connecting sending socket
  migration/postcopy: pass PostcopyPChannelConnectData when connecting
    sending preempt socket
  migration/options: Mapped-ram is not channel header compatible
  vfio/migration: Add save_{iterate,complete_precopy}_started trace
    events
  migration/ram: Add load start trace event
  migration/multifd: Zero p->flags before starting filling a packet
  migration: Add save_live_complete_precopy_async{,wait} handlers
  migration: Add qemu_loadvm_load_state_buffer() and its handler
  migration: Add load_finish handler and associated functions
  migration: Add x-multifd-channels-device-state parameter
  migration: Add MULTIFD_DEVICE_STATE migration channel type
  migration/multifd: Device state transfer support - receive side
  migration/multifd: Convert multifd_send_pages::next_channel to atomic
  migration/multifd: Device state transfer support - send side
  migration/multifd: Add migration_has_device_state_support()
  vfio/migration: Multifd device state transfer support - receive side
  vfio/migration: Multifd device state transfer support - send side

 hw/core/machine.c              |   1 +
 hw/vfio/migration.c            | 530 ++++++++++++++++++++++++++++++++-
 hw/vfio/trace-events           |  15 +-
 include/hw/vfio/vfio-common.h  |  25 ++
 include/migration/misc.h       |   5 +
 include/migration/register.h   |  67 +++++
 migration/channel.c            |  68 +++++
 migration/channel.h            |  17 ++
 migration/file.c               |   5 +-
 migration/migration-hmp-cmds.c |   7 +
 migration/migration.c          | 110 ++++++-
 migration/migration.h          |   6 +
 migration/multifd-zlib.c       |   2 +-
 migration/multifd-zstd.c       |   2 +-
 migration/multifd.c            | 512 ++++++++++++++++++++++++++-----
 migration/multifd.h            |  62 +++-
 migration/options.c            |  66 ++++
 migration/options.h            |   2 +
 migration/postcopy-ram.c       |  81 ++++-
 migration/ram.c                |   1 +
 migration/savevm.c             | 112 +++++++
 migration/savevm.h             |   7 +
 migration/socket.c             |   6 +-
 migration/socket.h             |   3 +-
 migration/trace-events         |   3 +
 qapi/migration.json            |  16 +-
 26 files changed, 1626 insertions(+), 105 deletions(-)


Reply via email to