This series introduces a new POSTCOPY_DEVICE state that is active on both the source and destination sides while the destination loads the device state. Before this series, if the destination machine failed during the device load, the source side would stay stuck in POSTCOPY_ACTIVE with no way to recover. With this series, if the migration fails while in the POSTCOPY_DEVICE state, the source side can safely resume, as the destination has not started yet.
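
To illustrate the recovery rule described above, here is a minimal sketch in C. It is not code from this series: the enum and function names are hypothetical and only model the decision "the source may resume if the failure happened before the destination started running".

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified model of the two postcopy phases discussed
 * above. These names are assumptions for illustration only; they are
 * not QEMU's MigrationStatus values.
 */
typedef enum {
    SKETCH_POSTCOPY_DEVICE, /* destination is still loading device state */
    SKETCH_POSTCOPY_ACTIVE, /* destination VM has already started */
} SketchPostcopyState;

/*
 * Returns true if the source side can safely resume its VM after a
 * migration failure observed in 'state'. Resuming is only safe while
 * the destination has not started running the guest.
 */
bool sketch_source_can_resume(SketchPostcopyState state)
{
    return state == SKETCH_POSTCOPY_DEVICE;
}
```

In this model, a failure in SKETCH_POSTCOPY_DEVICE lets the source resume, while a failure in SKETCH_POSTCOPY_ACTIVE does not, because the guest may already have run on the destination.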

RFC: https://lore.kernel.org/all/[email protected]/
V1: https://lore.kernel.org/all/[email protected]/
V2: https://lore.kernel.org/all/[email protected]/

V3 changes:
- rebased on top of https://gitlab.com/peterx/qemu/-/commits/staging
Patch 2:
- split into two separate patches
Patches 4, 5, 6 (was Patch 3):
- return to previous migration_incoming_state_destroy() that will not
  exit on error
- use existing exit-on-error option also for postcopy, in separate patch
- moved conversion of the postcopy listen thread to a joinable thread
  into a separate patch
Patch 7:
- added reset of postcopy_package_loaded_event

Juraj Marcin (6):
  migration: Move postcopy_ram_listen_thread() to postcopy-ram.c
  migration: Introduce postcopy incoming setup and cleanup functions
  migration: Refactor all incoming cleanup info
    migration_incoming_destroy()
  migration: Respect exit-on-error when migration fails before resuming
  migration: Make postcopy listen thread joinable
  migration: Introduce POSTCOPY_DEVICE state

Peter Xu (1):
  migration: Do not try to start VM if disk activation fails

 migration/migration.c                 | 116 ++++++++++++-------
 migration/migration.h                 |   4 +
 migration/postcopy-ram.c              | 160 ++++++++++++++++++++++++++
 migration/postcopy-ram.h              |   3 +
 migration/savevm.c                    | 134 +--------------------
 migration/savevm.h                    |   2 +
 migration/trace-events                |   3 +-
 qapi/migration.json                   |   8 +-
 tests/qtest/migration/precopy-tests.c |   3 +-
 9 files changed, 260 insertions(+), 173 deletions(-)

-- 
2.51.0
