Hey,

The cost of switchover is usually not accounted for in the migration
algorithm, which reduces it all to "pending bytes" fitting within a
"threshold" (derived from some available or proactively measured link
bandwidth) as the rule of thumb for estimating downtime.
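To make the rule of thumb concrete, here is a minimal sketch (illustrative
only, not the actual QEMU code; the helper names and units are mine) of
how "pending bytes fitting a threshold" translates into a switchover
decision and an expected-downtime estimate:

```c
/*
 * Illustrative sketch of the convergence rule of thumb: switch over
 * once the remaining dirty data could be sent within the allowed
 * downtime at the measured bandwidth. Names are hypothetical.
 */
#include <stdbool.h>
#include <stdint.h>

/* threshold = bandwidth (bytes/ms) * downtime limit (ms) */
static uint64_t threshold_size(uint64_t bw_bytes_per_ms,
                               uint64_t downtime_limit_ms)
{
    return bw_bytes_per_ms * downtime_limit_ms;
}

/* Expected downtime (ms) if we switched over right now. */
static uint64_t expected_downtime_ms(uint64_t pending_bytes,
                                     uint64_t bw_bytes_per_ms)
{
    return bw_bytes_per_ms ? pending_bytes / bw_bytes_per_ms
                           : UINT64_MAX;
}

static bool should_switchover(uint64_t pending_bytes,
                              uint64_t bw_bytes_per_ms,
                              uint64_t downtime_limit_ms)
{
    return pending_bytes <= threshold_size(bw_bytes_per_ms,
                                           downtime_limit_ms);
}
```

Note this estimate only covers transferring the pending bytes; the OS-,
QEMU- and device-side latencies discussed below are on top of it, which
is exactly what the breakdown here tries to surface.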
External latencies (in the OS or in QEMU), as well as the presence of
VFs, may affect how big or small the switchover ends up being. Given the
wide range of possible configurations, the cost of switchover is neither
exactly deterministic nor predictable by some generic rule. This series
aims at improving observability of what contributes to the
switchover/downtime in particular. The breakdown:

* The first two patches move storage of downtime timestamps into a
  dedicated data structure, and then add a couple of key places where
  those timestamps are measured.
* The next two patches make use of those timestamps, calculating the
  downtime breakdown when the data is asked for, as well as adding a
  tracepoint.
* Finally, the last patch provides introspection into the calculated
  expected-downtime (pending_bytes vs threshold_size), which is when we
  decide to switch over, and prints that data when available to give
  some comparison.

For now this covers mainly precopy data, and I added both tracepoints
and QMP stats via query-migrate. Postcopy is still missing.

Thoughts, comments appreciated as usual.

Thanks!
	Joao

Joao Martins (5):
  migration: Store downtime timestamps in an array
  migration: Collect more timestamps during switchover
  migration: Add a tracepoint for the downtime stats
  migration: Provide QMP access to downtime stats
  migration: Print expected-downtime on completion

 qapi/migration.json    | 50 +++++++++++++++++++++++++
 migration/migration.h  |  7 +++-
 migration/migration.c  | 85 ++++++++++++++++++++++++++++++++++++++++--
 migration/savevm.c     |  2 +
 migration/trace-events |  1 +
 5 files changed, 139 insertions(+), 6 deletions(-)

-- 
2.39.3