This small series (really only the last patch; the first two are cleanups)
aims to improve QEMU downtime analysis, similar to what Joao previously
proposed here:

  https://lore.kernel.org/r/20230926161841.98464-1-joao.m.mart...@oracle.com

But with a few differences:

  - Nothing is exported to QAPI yet; it's all tracepoints so far

  - Instead of only major checkpoints (stop, iterable, non-iterable,
    resume-rp), provide finer-grained downtime measurements for each
    vmstate (in microseconds, to be accurate).  So far the iterable /
    non-iterable phases seem to be the core of the problem, and I want to
    nail it down to per-device (sketched below).

  - Trace dest QEMU too

For the last bullet: consider the case where a device's save() can be
super fast while its load() is super slow.  Both contribute to the
ultimate downtime, but not as a simple sum: while the src QEMU is
save()ing device1, the dst QEMU can be load()ing device2, so the two
sides run in parallel.  The only way to account for every component of
the downtime is to record both sides.

Please have a look, thanks.

Peter Xu (3):
  migration: Set downtime_start even for postcopy
  migration: Add migration_downtime_start|end() helpers
  migration: Add per vmstate downtime tracepoints

 migration/migration.c  | 38 +++++++++++++++++++++-----------
 migration/savevm.c     | 49 ++++++++++++++++++++++++++++++++++++++----
 migration/trace-events |  2 ++
 3 files changed, 72 insertions(+), 17 deletions(-)

-- 
2.41.0

