This small series (actually only the last patch; the first two are cleanups) aims to improve QEMU downtime analysis, similar to what Joao proposed here:

https://lore.kernel.org/r/20230926161841.98464-1-joao.m.mart...@oracle.com

But with a few differences:

  - Nothing is exported to QAPI yet; it's all tracepoints so far.

  - Instead of major checkpoints (stop, iterable, non-iterable,
    resume-rp), use a finer granularity by providing downtime
    measurements for each vmstate (I chose microseconds as the unit
    for accuracy).  So far it seems iterable / non-iterable is the
    core of the problem, and I want to nail it down per device.

  - Trace the destination QEMU too.

For the last bullet: consider the case where a device's save() can be
super fast, while its load() can actually be super slow.  Both of them
contribute to the ultimate downtime, but not as a simple sum: while
the src QEMU is save()ing device1, the dst QEMU can be load()ing
device2, so they can run in parallel.  However, the only way to
account for all the components of the downtime is to record both
sides.

Please have a look, thanks.

Peter Xu (3):
  migration: Set downtime_start even for postcopy
  migration: Add migration_downtime_start|end() helpers
  migration: Add per vmstate downtime tracepoints

 migration/migration.c  | 38 +++++++++++++++++++++-----------
 migration/savevm.c     | 49 ++++++++++++++++++++++++++++++++++++++----
 migration/trace-events |  2 ++
 3 files changed, 72 insertions(+), 17 deletions(-)

-- 
2.41.0