Devices may opt in to migration FAILED notifiers, which are invoked when
migration fails. Currently, the notifications happen in migration_cleanup().
That is normally fine, but not ideal if a device's fallback needs to run
before the VM is restarted.
This patch moves the FAILED notification earlier, so that if the failure
happens during switchover, devices are notified before the VM is restarted.
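
For reference, a migration state notifier roughly takes the following shape.
This is a minimal sketch following the style of include/migration/misc.h; the
mydev_* names are hypothetical and only for illustration:

    #include "qemu/osdep.h"
    #include "qemu/notify.h"
    #include "migration/misc.h"

    static NotifierWithReturn mydev_migration_state;

    /* Invoked via migration_call_notifiers() on migration state changes */
    static int mydev_migration_notifier(NotifierWithReturn *notifier,
                                        MigrationEvent *e, Error **errp)
    {
        switch (e->type) {
        case MIG_EVENT_PRECOPY_SETUP:
            /* Prepare the device for migration */
            break;
        case MIG_EVENT_PRECOPY_FAILED:
            /* Fall back; until now this was delivered from migration_cleanup() */
            break;
        default:
            break;
        }
        return 0;
    }

    static void mydev_register_migration_notifier(void)
    {
        migration_add_notifier(&mydev_migration_state,
                               mydev_migration_notifier);
    }
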
After walking through all existing FAILED notifier users, I concluded that
this is also a cleaner approach, at least from a design point of view.
We have these notifier users, where the first two do not need to trap
FAILED:
|----------------------------+-------------------------------------+---------------------|
| device                     | handler                             | events needed       |
|----------------------------+-------------------------------------+---------------------|
| gicv3                      | kvm_arm_gicv3_notifier              | DONE                |
| vfio_iommufd / vfio_legacy | vfio_cpr_reboot_notifier            | SETUP               |
| cpr-exec                   | cpr_exec_notifier                   | FAILED, DONE        |
| virtio-net                 | virtio_net_migration_state_notifier | SETUP, FAILED       |
| vfio                       | vfio_migration_state_notifier       | FAILED              |
| vdpa                       | vdpa_net_migration_state_notifier   | SETUP, FAILED       |
| spice [*]                  | migration_state_notifier            | SETUP, FAILED, DONE |
|----------------------------+-------------------------------------+---------------------|
For cpr-exec, the notifier cleans up some cpr-exec specific fds and
environment variables. This should be fine either way, as long as it happens
before migration_cleanup().
For virtio-net, we need to re-plug the primary device back into the guest in
failover mode. Likely benign.
VFIO needs to restart the device if FAILED. IIUC it should do that before
vm_start(): if the VFIO device can be put into a STOP state due to migration,
we should logically make it running again before vCPUs run.
VDPA will disable SVQ when migration is FAILED. Likely benign too, but it
looks better if we can do it before resuming vCPUs.
For spice, we should rely on "spice_server_migrate_end(false)" to retake the
ownership. Benign, but it looks more reasonable if the spice client does it
before the VM runs again.
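
As an illustration of the kind of fallback that will now run before the VM is
resumed (the mydev_* names below are hypothetical, not taken from any of the
handlers above):

    /* Hypothetical FAILED-only handler; with this patch it runs before vm_start() */
    static int mydev_migration_state_notifier(NotifierWithReturn *notifier,
                                              MigrationEvent *e, Error **errp)
    {
        if (e->type == MIG_EVENT_PRECOPY_FAILED) {
            /*
             * The device may have been stopped for switchover; make sure it
             * is running again before vCPUs are resumed.
             */
            mydev_set_running(true);
        }
        return 0;
    }
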
Note that this change may introduce slightly more downtime if the migration
fails exactly at the switchover phase. But that is very rare, and even if it
happens, none of the handlers above is expected to take long; the short delay
will likely be buried in the total downtime of the failed migration.
Cc: Cédric Le Goater <[email protected]>
Cc: Marc-André Lureau <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
---
migration/migration.c | 20 ++++++++++++++++----
1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 91775f8472..1d9a2fc068 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1481,7 +1481,6 @@ static void migration_cleanup_json_writer(MigrationState *s)
static void migration_cleanup(MigrationState *s)
{
- MigrationEventType type;
QEMUFile *tmp = NULL;
trace_migration_cleanup();
@@ -1535,9 +1534,15 @@ static void migration_cleanup(MigrationState *s)
/* It is used on info migrate. We can't free it */
error_report_err(error_copy(s->error));
}
- type = migration_has_failed(s) ? MIG_EVENT_PRECOPY_FAILED :
- MIG_EVENT_PRECOPY_DONE;
- migration_call_notifiers(s, type, NULL);
+
+ /*
+ * FAILED notification should have already happened. Notify DONE if
+ * migration completed successfully.
+ */
+ if (!migration_has_failed(s)) {
+ migration_call_notifiers(s, MIG_EVENT_PRECOPY_DONE, NULL);
+ }
+
yank_unregister_instance(MIGRATION_YANK_INSTANCE);
}
@@ -3589,6 +3594,13 @@ static void migration_iteration_finish(MigrationState *s)
error_free(local_err);
break;
}
+
+ /*
+ * Notify FAILED before starting VM, so that devices can invoke
+ * necessary fallbacks before vCPUs run again.
+ */
+ migration_call_notifiers(s, MIG_EVENT_PRECOPY_FAILED, NULL);
+
if (runstate_is_live(s->vm_old_state)) {
if (!runstate_check(RUN_STATE_SHUTDOWN)) {
vm_start();
--
2.50.1