These two APIs, state_pending_exact and state_pending_estimate, partly
duplicate each other. For example, a few users directly pass the same
function to both.
Providing two hooks is also error prone, because it makes it easier for
one module to report different things via the two hooks.

In reality, they should always report the same thing. The only difference
is whether to use a fast path when the slow path might be too slow, as
QEMU may query this information quite frequently during the migration
process.

Merge them into one API, with a bool indicating whether the query is an
exact query or not. No functional change intended.

Export qemu_savevm_query_pending(). New users that need to do the query
should use the new API provided here. This will happen very soon.

Cc: Halil Pasic <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Eric Farman <[email protected]>
Cc: Matthew Rosato <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ilya Leoshkevich <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Cornelia Huck <[email protected]>
Cc: Eric Blake <[email protected]>
Cc: Vladimir Sementsov-Ogievskiy <[email protected]>
Cc: John Snow <[email protected]>
Reviewed-by: Jason J. Herne <[email protected]>
Reviewed-by: Juraj Marcin <[email protected]>
Reviewed-by: Avihai Horon <[email protected]>
Acked-by: Vladimir Sementsov-Ogievskiy <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Peter Xu <[email protected]>
---
 docs/devel/migration/main.rst  |  9 ++----
 docs/devel/migration/vfio.rst  |  9 ++----
 include/migration/register.h   | 52 ++++++++++++----------------------
 migration/savevm.h             |  3 ++
 hw/s390x/s390-stattrib.c       |  9 +++---
 hw/vfio/migration.c            | 48 ++++++++++++++-----------------
 migration/block-dirty-bitmap.c | 10 +++----
 migration/ram.c                | 33 +++++++--------------
 migration/savevm.c             | 42 +++++++++++++--------------
 hw/vfio/trace-events           |  3 +-
 10 files changed, 86 insertions(+), 132 deletions(-)

diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
index 2de7050764..430673a499 100644
--- a/docs/devel/migration/main.rst
+++ b/docs/devel/migration/main.rst
@@ -515,13 +515,8 @@ An iterative device must provide:
   - A ``load_setup`` function that initialises the data structures on the
     destination.
 
-  - A ``state_pending_exact`` function that indicates how much more
-    data we must save.  The core migration code will use this to
-    determine when to pause the CPUs and complete the migration.
-
-  - A ``state_pending_estimate`` function that indicates how much more
-    data we must save.  When the estimated amount is smaller than the
-    threshold, we call ``state_pending_exact``.
+  - A ``save_query_pending`` function that indicates how much more
+    data we must save.
 
   - A ``save_live_iterate`` function should send a chunk of data until
     the point that stream bandwidth limits tell it to stop.  Each call
diff --git a/docs/devel/migration/vfio.rst b/docs/devel/migration/vfio.rst
index 0790e5031d..691061d182 100644
--- a/docs/devel/migration/vfio.rst
+++ b/docs/devel/migration/vfio.rst
@@ -50,13 +50,8 @@ VFIO implements the device hooks for the iterative approach as follows:
 * A ``load_setup`` function that sets the VFIO device on the destination in
   _RESUMING state.
 
-* A ``state_pending_estimate`` function that reports an estimate of the
-  remaining pre-copy data that the vendor driver has yet to save for the VFIO
-  device.
-
-* A ``state_pending_exact`` function that reads pending_bytes from the vendor
-  driver, which indicates the amount of data that the vendor driver has yet to
-  save for the VFIO device.
+* A ``save_query_pending`` function that reports the remaining data that
+  the vendor driver has yet to save for the VFIO device.
 
 * An ``is_active_iterate`` function that indicates ``save_live_iterate`` is
   active only when the VFIO device is in pre-copy states.
diff --git a/include/migration/register.h b/include/migration/register.h
index d0f37f5f43..e2117e8dd4 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -16,6 +16,13 @@
 
 #include "hw/core/vmstate-if.h"
 
+typedef struct MigPendingData {
+    /* Amount of pending bytes can be transferred in precopy or stopcopy */
+    uint64_t precopy_bytes;
+    /* Amount of pending bytes can be transferred in postcopy */
+    uint64_t postcopy_bytes;
+} MigPendingData;
+
 /**
  * struct SaveVMHandlers: handler structure to finely control
  * migration of complex subsystems and devices, such as RAM, block and
@@ -197,46 +204,23 @@ typedef struct SaveVMHandlers {
     bool (*save_postcopy_prepare)(QEMUFile *f, void *opaque, Error **errp);
 
     /**
-     * @state_pending_estimate
-     *
-     * This estimates the remaining data to transfer
+     * @save_query_pending
      *
-     * Sum of @can_postcopy and @must_postcopy is the whole amount of
-     * pending data.
-     *
-     * @opaque: data pointer passed to register_savevm_live()
-     * @must_precopy: amount of data that must be migrated in precopy
-     *                or in stopped state, i.e. that must be migrated
-     *                before target start.
-     * @can_postcopy: amount of data that can be migrated in postcopy
-     *                or in stopped state, i.e. after target start.
-     *                Some can also be migrated during precopy (RAM).
-     *                Some must be migrated after source stops
-     *                (block-dirty-bitmap)
-     */
-    void (*state_pending_estimate)(void *opaque, uint64_t *must_precopy,
-                                   uint64_t *can_postcopy);
-
-    /**
-     * @state_pending_exact
+     * This estimates the remaining data to transfer on the source side.
      *
-     * This calculates the exact remaining data to transfer
+     * When @exact is true, a module must report accurate results.  When
+     * @exact is false, a module may report estimates.
      *
-     * Sum of @can_postcopy and @must_postcopy is the whole amount of
-     * pending data.
+     * It's highly recommended that modules implement a faster version of
+     * the query path (for example, by proper caching on the counters) if
+     * an accurate query will be time-consuming.
      *
      * @opaque: data pointer passed to register_savevm_live()
-     * @must_precopy: amount of data that must be migrated in precopy
-     *                or in stopped state, i.e. that must be migrated
-     *                before target start.
-     * @can_postcopy: amount of data that can be migrated in postcopy
-     *                or in stopped state, i.e. after target start.
-     *                Some can also be migrated during precopy (RAM).
-     *                Some must be migrated after source stops
-     *                (block-dirty-bitmap)
+     * @pending: pointer to a MigPendingData struct
+     * @exact: set to true for an accurate (slow) query
      */
-    void (*state_pending_exact)(void *opaque, uint64_t *must_precopy,
-                                uint64_t *can_postcopy);
+    void (*save_query_pending)(void *opaque, MigPendingData *pending,
+                               bool exact);
 
     /**
      * @load_state
diff --git a/migration/savevm.h b/migration/savevm.h
index b3d1e8a13c..e4efd243f3 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -14,6 +14,8 @@
 #ifndef MIGRATION_SAVEVM_H
 #define MIGRATION_SAVEVM_H
 
+#include "migration/register.h"
+
 #define QEMU_VM_FILE_MAGIC           0x5145564d
 #define QEMU_VM_FILE_VERSION_COMPAT  0x00000002
 #define QEMU_VM_FILE_VERSION         0x00000003
@@ -43,6 +45,7 @@ int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy);
 void qemu_savevm_state_cleanup(void);
 void qemu_savevm_state_complete_postcopy(QEMUFile *f);
 int qemu_savevm_state_complete_precopy(MigrationState *s);
+void qemu_savevm_query_pending(MigPendingData *pending, bool exact);
 void qemu_savevm_state_pending_exact(uint64_t *must_precopy,
                                      uint64_t *can_postcopy);
 void qemu_savevm_state_pending_estimate(uint64_t *must_precopy,
diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
index 2e83aa211c..dfbd452e44 100644
--- a/hw/s390x/s390-stattrib.c
+++ b/hw/s390x/s390-stattrib.c
@@ -187,15 +187,15 @@ static int cmma_save_setup(QEMUFile *f, void *opaque, Error **errp)
     return 0;
 }
 
-static void cmma_state_pending(void *opaque, uint64_t *must_precopy,
-                               uint64_t *can_postcopy)
+static void cmma_state_pending(void *opaque, MigPendingData *pending,
+                               bool exact)
 {
     S390StAttribState *sas = S390_STATTRIB(opaque);
     S390StAttribClass *sac = S390_STATTRIB_GET_CLASS(sas);
     long long res = sac->get_dirtycount(sas);
 
     if (res >= 0) {
-        *must_precopy += res;
+        pending->precopy_bytes += res;
     }
 }
 
@@ -340,8 +340,7 @@ static SaveVMHandlers savevm_s390_stattrib_handlers = {
     .save_setup = cmma_save_setup,
     .save_live_iterate = cmma_save_iterate,
     .save_complete = cmma_save_complete,
-    .state_pending_exact = cmma_state_pending,
-    .state_pending_estimate = cmma_state_pending,
+    .save_query_pending = cmma_state_pending,
     .save_cleanup = cmma_save_cleanup,
     .load_state = cmma_load,
     .is_active = cmma_active,
diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
index 5d5fca09bd..e965ba51fb 100644
--- a/hw/vfio/migration.c
+++ b/hw/vfio/migration.c
@@ -571,42 +571,39 @@ static void vfio_save_cleanup(void *opaque)
     trace_vfio_save_cleanup(vbasedev->name);
 }
 
-static void vfio_state_pending_estimate(void *opaque, uint64_t *must_precopy,
-                                        uint64_t *can_postcopy)
+static void vfio_state_pending_sync(VFIODevice *vbasedev)
 {
-    VFIODevice *vbasedev = opaque;
     VFIOMigration *migration = vbasedev->migration;
 
-    if (!vfio_device_state_is_precopy(vbasedev)) {
-        return;
-    }
-
-    *must_precopy +=
-        migration->precopy_init_size + migration->precopy_dirty_size;
+    vfio_query_stop_copy_size(vbasedev);
 
-    trace_vfio_state_pending_estimate(vbasedev->name, *must_precopy,
-                                      *can_postcopy,
-                                      migration->precopy_init_size,
-                                      migration->precopy_dirty_size);
+    if (vfio_device_state_is_precopy(vbasedev)) {
+        vfio_query_precopy_size(migration);
+    }
 }
 
-static void vfio_state_pending_exact(void *opaque, uint64_t *must_precopy,
-                                     uint64_t *can_postcopy)
+static void vfio_state_pending(void *opaque, MigPendingData *pending,
+                               bool exact)
 {
     VFIODevice *vbasedev = opaque;
     VFIOMigration *migration = vbasedev->migration;
+    uint64_t remain;
 
-    vfio_query_stop_copy_size(vbasedev);
-    *must_precopy += migration->stopcopy_size;
-
-    if (vfio_device_state_is_precopy(vbasedev)) {
-        vfio_query_precopy_size(migration);
+    if (exact) {
+        vfio_state_pending_sync(vbasedev);
+        remain = migration->stopcopy_size;
+    } else {
+        if (!vfio_device_state_is_precopy(vbasedev)) {
+            return;
+        }
+        remain = migration->precopy_init_size + migration->precopy_dirty_size;
     }
 
-    trace_vfio_state_pending_exact(vbasedev->name, *must_precopy, *can_postcopy,
-                                   migration->stopcopy_size,
-                                   migration->precopy_init_size,
-                                   migration->precopy_dirty_size);
+    pending->precopy_bytes += remain;
+
+    trace_vfio_state_pending(vbasedev->name, migration->stopcopy_size,
+                             migration->precopy_init_size,
+                             migration->precopy_dirty_size, exact);
 }
 
 static bool vfio_is_active_iterate(void *opaque)
@@ -851,8 +848,7 @@ static const SaveVMHandlers savevm_vfio_handlers = {
     .save_prepare = vfio_save_prepare,
     .save_setup = vfio_save_setup,
     .save_cleanup = vfio_save_cleanup,
-    .state_pending_estimate = vfio_state_pending_estimate,
-    .state_pending_exact = vfio_state_pending_exact,
+    .save_query_pending = vfio_state_pending,
     .is_active_iterate = vfio_is_active_iterate,
     .save_live_iterate = vfio_save_iterate,
     .save_complete = vfio_save_complete_precopy,
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index a061aad817..15d417013c 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -766,9 +766,8 @@ static int dirty_bitmap_save_complete(QEMUFile *f, void *opaque)
     return 0;
 }
 
-static void dirty_bitmap_state_pending(void *opaque,
-                                       uint64_t *must_precopy,
-                                       uint64_t *can_postcopy)
+static void dirty_bitmap_state_pending(void *opaque, MigPendingData *data,
+                                       bool exact)
 {
     DBMSaveState *s = &((DBMState *)opaque)->save;
     SaveBitmapState *dbms;
@@ -788,7 +787,7 @@ static void dirty_bitmap_state_pending(void *opaque,
 
     trace_dirty_bitmap_state_pending(pending);
 
-    *can_postcopy += pending;
+    data->postcopy_bytes += pending;
 }
 
 /* First occurrence of this bitmap. It should be created if doesn't exist */
@@ -1250,8 +1249,7 @@ static SaveVMHandlers savevm_dirty_bitmap_handlers = {
     .save_setup = dirty_bitmap_save_setup,
     .save_complete = dirty_bitmap_save_complete,
     .has_postcopy = dirty_bitmap_has_postcopy,
-    .state_pending_exact = dirty_bitmap_state_pending,
-    .state_pending_estimate = dirty_bitmap_state_pending,
+    .save_query_pending = dirty_bitmap_state_pending,
     .save_live_iterate = dirty_bitmap_save_iterate,
     .is_active_iterate = dirty_bitmap_is_active_iterate,
     .load_state = dirty_bitmap_load,
diff --git a/migration/ram.c b/migration/ram.c
index 2046f16caa..44503bf3f7 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3449,30 +3449,18 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
     return qemu_fflush(f);
 }
 
-static void ram_state_pending_estimate(void *opaque, uint64_t *must_precopy,
-                                       uint64_t *can_postcopy)
-{
-    RAMState **temp = opaque;
-    RAMState *rs = *temp;
-
-    uint64_t remaining_size = rs->migration_dirty_pages * TARGET_PAGE_SIZE;
-
-    if (migrate_postcopy_ram()) {
-        /* We can do postcopy, and all the data is postcopiable */
-        *can_postcopy += remaining_size;
-    } else {
-        *must_precopy += remaining_size;
-    }
-}
-
-static void ram_state_pending_exact(void *opaque, uint64_t *must_precopy,
-                                    uint64_t *can_postcopy)
+static void ram_state_pending(void *opaque, MigPendingData *pending,
+                              bool exact)
 {
     RAMState **temp = opaque;
     RAMState *rs = *temp;
     uint64_t remaining_size;
 
-    if (!migration_in_postcopy()) {
+    /*
+     * Sync is not needed either with: (1) a fast query, or (2) after
+     * postcopy has started (no new dirty will generate anymore).
+     */
+    if (exact && !migration_in_postcopy()) {
         bql_lock();
         WITH_RCU_READ_LOCK_GUARD() {
             migration_bitmap_sync_precopy(false);
@@ -3484,9 +3472,9 @@ static void ram_state_pending_exact(void *opaque, uint64_t *must_precopy,
 
     if (migrate_postcopy_ram()) {
         /* We can do postcopy, and all the data is postcopiable */
-        *can_postcopy += remaining_size;
+        pending->postcopy_bytes += remaining_size;
     } else {
-        *must_precopy += remaining_size;
+        pending->precopy_bytes += remaining_size;
     }
 }
 
@@ -4709,8 +4697,7 @@ static SaveVMHandlers savevm_ram_handlers = {
     .save_live_iterate = ram_save_iterate,
     .save_complete = ram_save_complete,
     .has_postcopy = ram_has_postcopy,
-    .state_pending_exact = ram_state_pending_exact,
-    .state_pending_estimate = ram_state_pending_estimate,
+    .save_query_pending = ram_state_pending,
     .load_state = ram_load,
     .save_cleanup = ram_save_cleanup,
     .load_setup = ram_load_setup,
diff --git a/migration/savevm.c b/migration/savevm.c
index 765df8ce2d..41f1906598 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1796,46 +1796,44 @@ int qemu_savevm_state_complete_precopy(MigrationState *s)
     return qemu_fflush(f);
 }
 
-/* Give an estimate of the amount left to be transferred,
- * the result is split into the amount for units that can and
- * for units that can't do postcopy.
- */
-void qemu_savevm_state_pending_estimate(uint64_t *must_precopy,
-                                        uint64_t *can_postcopy)
+void qemu_savevm_query_pending(MigPendingData *pending, bool exact)
 {
     SaveStateEntry *se;
 
-    *must_precopy = 0;
-    *can_postcopy = 0;
+    pending->precopy_bytes = 0;
+    pending->postcopy_bytes = 0;
 
     QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
-        if (!se->ops || !se->ops->state_pending_estimate) {
+        if (!se->ops || !se->ops->save_query_pending) {
             continue;
         }
         if (!qemu_savevm_state_active(se)) {
             continue;
         }
-        se->ops->state_pending_estimate(se->opaque, must_precopy, can_postcopy);
+        se->ops->save_query_pending(se->opaque, pending, exact);
     }
 }
 
+void qemu_savevm_state_pending_estimate(uint64_t *must_precopy,
+                                        uint64_t *can_postcopy)
+{
+    MigPendingData pending;
+
+    qemu_savevm_query_pending(&pending, false);
+
+    *must_precopy = pending.precopy_bytes;
+    *can_postcopy = pending.postcopy_bytes;
+}
+
 void qemu_savevm_state_pending_exact(uint64_t *must_precopy,
                                      uint64_t *can_postcopy)
 {
-    SaveStateEntry *se;
+    MigPendingData pending;
 
-    *must_precopy = 0;
-    *can_postcopy = 0;
+    qemu_savevm_query_pending(&pending, true);
 
-    QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
-        if (!se->ops || !se->ops->state_pending_exact) {
-            continue;
-        }
-        if (!qemu_savevm_state_active(se)) {
-            continue;
-        }
-        se->ops->state_pending_exact(se->opaque, must_precopy, can_postcopy);
-    }
+    *must_precopy = pending.precopy_bytes;
+    *can_postcopy = pending.postcopy_bytes;
 }
 
 void qemu_savevm_state_cleanup(void)
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index 846e3625c5..287df0b8cb 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -173,8 +173,7 @@ vfio_save_device_config_state(const char *name) " (%s)"
 vfio_save_iterate(const char *name, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy initial size %"PRIu64" precopy dirty size %"PRIu64
 vfio_save_iterate_start(const char *name) " (%s)"
 vfio_save_setup(const char *name, uint64_t data_buffer_size) " (%s) data buffer size %"PRIu64
-vfio_state_pending_estimate(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy %"PRIu64" postcopy %"PRIu64" precopy initial size %"PRIu64" precopy dirty size %"PRIu64
-vfio_state_pending_exact(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t stopcopy_size, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy %"PRIu64" postcopy %"PRIu64" stopcopy size %"PRIu64" precopy initial size %"PRIu64" precopy dirty size %"PRIu64
+vfio_state_pending(const char *name, uint64_t stopcopy_size, uint64_t precopy_init_size, uint64_t precopy_dirty_size, bool exact) " (%s) stopcopy size %"PRIu64" precopy initial size %"PRIu64" precopy dirty size %"PRIu64 " exact %d"
 vfio_vmstate_change(const char *name, int running, const char *reason, const char *dev_state) " (%s) running %d reason %s device state %s"
 vfio_vmstate_change_prepare(const char *name, int running, const char *reason, const char *dev_state) " (%s) running %d reason %s device state %s"
-- 
2.53.0
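
For readers who want the shape of the merged hook outside the QEMU tree, here
is a minimal standalone sketch. It is not part of the patch. Only
MigPendingData and the save_query_pending signature are taken from the diff
above; DemoHandlers, DemoDevice, DemoEntry, demo_query_pending() and
demo_query_all() are hypothetical names made up for illustration, and the
"cached counter" is just one assumed way to provide a fast path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the struct this patch adds to register.h. */
typedef struct MigPendingData {
    uint64_t precopy_bytes;
    uint64_t postcopy_bytes;
} MigPendingData;

/* Hypothetical stand-in for the single SaveVMHandlers member of interest. */
typedef struct DemoHandlers {
    void (*save_query_pending)(void *opaque, MigPendingData *pending,
                               bool exact);
} DemoHandlers;

/* Hypothetical device: a cheap cached counter plus the "real" value. */
typedef struct DemoDevice {
    uint64_t cached_dirty_bytes;  /* cheap to read, may be stale */
    uint64_t actual_dirty_bytes;  /* stands in for an expensive sync */
    unsigned slow_queries;        /* how many times the slow path ran */
} DemoDevice;

/* One hook serves both query kinds; @exact selects the slow path. */
static void demo_query_pending(void *opaque, MigPendingData *pending,
                               bool exact)
{
    DemoDevice *dev = opaque;

    assert(pending != NULL);

    if (exact) {
        /* Slow path: refresh the cache from the source of truth. */
        dev->cached_dirty_bytes = dev->actual_dirty_bytes;
        dev->slow_queries++;
    }

    /* Handlers accumulate; the caller zeroes the struct beforehand. */
    pending->precopy_bytes += dev->cached_dirty_bytes;
}

/* Hypothetical registry entry, loosely modelled on SaveStateEntry. */
typedef struct DemoEntry {
    const DemoHandlers *ops;
    void *opaque;
} DemoEntry;

/* Same loop shape as qemu_savevm_query_pending(): reset, then sum up. */
static void demo_query_all(const DemoEntry *entries, size_t n,
                           MigPendingData *pending, bool exact)
{
    pending->precopy_bytes = 0;
    pending->postcopy_bytes = 0;

    for (size_t i = 0; i < n; i++) {
        if (!entries[i].ops || !entries[i].ops->save_query_pending) {
            continue;
        }
        entries[i].ops->save_query_pending(entries[i].opaque, pending, exact);
    }
}
```

A caller would pass false for frequent polling and true only when an accurate
number is needed. In this sketch a fast query issued after an exact one
returns the refreshed value without running the slow path again.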
