On 2026-06-12 9:51 AM, Andrey Drobyshev wrote:
> For CPR-style migration, we need the VM to stop earlier on, before
> cpr_state_save() fires.  That is to prevent the race between source I/O
> and target device init, as well as make sure VHOST_RESET_OWNER and
> VHOST_SET_OWNER are fired in the right order (cpr-transfer case).  Thus,
> for CPR case migration, let's move the stop to qmp_migrate(), directly
> before launching cpr_state_save().
> 
> This patch is a rework of the change originally proposed by Steve and
> Ben at [0].
> 
> [0] 
> https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/[email protected]__;!!ACWV5N9M2RV99hQ!McDgSIuX3a-mCdNkiaf0TCITdQB-bEHCppcpig8fe6BPBWlU7qkvjW5FhLN0twb2fq9pC2qDYPH_FQZJYglJy4C71FMad5DMog$
>  
> 
> Originally-by: Steve Sistare <[email protected]>
> Originally-by: Ben Chaney <[email protected]>
> Signed-off-by: Andrey Drobyshev <[email protected]>
> ---
>  migration/migration.c | 73 +++++++++++++++++++++++++++++++++----------
>  1 file changed, 56 insertions(+), 17 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 3c70467d314..c02f1da8b0e 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1380,10 +1380,13 @@ static void migration_cleanup(MigrationState *s)
>  
>      /*
>       * FAILED notification should have already happened.  Notify DONE if
> -     * migration completed successfully.
> +     * migration completed successfully.  On a failed/canceled path,
> +     * resume VM for the CPR case.
>       */
>      if (!migration_has_failed(s)) {
>          migration_call_notifiers(MIG_EVENT_DONE, NULL);
> +    } else if (migrate_mode_is_cpr()) {
> +        vm_resume(s->vm_old_state);

In migration_iteration_finish(), the VM is resumed only after
migration_block_activate() succeeds. With this patch, CPR failure handling is
moved to migration_cleanup(), where vm_resume() is called *unconditionally* for
failed CPR migrations.

>      }
>  
>      yank_unregister_instance(MIGRATION_YANK_INSTANCE);
> @@ -2130,6 +2133,45 @@ void qmp_migrate(const char *uri, bool has_channels,
>       */
>      Error *local_err = NULL;
>  
> +    /*
> +     * CPR-transfer  ordering:
> +     *
> +     *   SOURCE                              TARGET
> +     *   ------                              ------
> +     *                                       cpr_state_load() blocks
> +     *   |                                        |
> +     *   |  1. migration_stop_vm()                |
> +     *   |     VM stopped, devices quiesced       |
> +     *   |                                        | Waiting for
> +     *   |  2. notifiers (SETUP)                  | FDs from source
> +     *   |     vhost_reset_owner() releases       |
> +     *   |     device ownership                   |
> +     *   |                                        |
> +     *   |  3. cpr_state_save() ---- FDs -------> |
> +     *   |                                        |
> +     *   v                                        v
> +     *   postmigrate                         Device init begins
> +     *                                       - cpr_find_fd()
> +     *                                       - vhost_dev_init()
> +     *                                       - VHOST_SET_OWNER
> +     *
> +     * Step 3 is the synchronization/cut-over point. Target proceeds 
> immediately
> +     * upon receiving FDs, so steps 1-2 must complete otherwise:

Complete successfully?

> +     * - Target's VHOST_SET_OWNER fails with -EBUSY (source still owns)
> +     * - Race between source I/O and target device init
> +     *
> +     *  We stop the VM early (before FD transfer) to prevent this race.

How about move some of comments to the commit message instead?

When reading this comment in the final code, I found the race description a bit
confusing.

The vhost_dev_init change is introduced only in the next patch.

With this patch applied, the VM is already stopped at step 1 before SETUP and
before cpr_state_save(), so a future reader may wonder why there is still a race
between source I/O and target device init.

The race is the motivation for the change, not the behavior of the code after
the change.

It may be clearer to keep the comment focused on the required ordering.

Thank you very much!

Dongli Zhang

> +     *  Unlike regular migration, CPR-transfer passes memory via FD (memfd)
> +     *  rather than copying RAM, so early VM stop should have minimal 
> downtime.
> +     */
> +    if (migrate_mode_is_cpr()) {
> +        int ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
> +        if (ret < 0) {
> +            error_setg(&local_err, "migration_stop_vm failed, error %d", 
> -ret);
> +            goto out;
> +        }
> +    }
> +
>      /* Notify before starting migration thread and before starting CPR */
>      if (migration_call_notifiers(MIG_EVENT_SETUP, &local_err)) {
>          goto out;
> @@ -3482,13 +3524,19 @@ static void migration_iteration_finish(MigrationState 
> *s)
>           */
>          migration_call_notifiers(MIG_EVENT_FAILED, NULL);
>  
> -        if (runstate_is_live(s->vm_old_state)) {
> -            if (!runstate_check(RUN_STATE_SHUTDOWN)) {
> -                vm_start();
> -            }
> -        } else {
> -            if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
> -                runstate_set(s->vm_old_state);
> +        /*
> +         * For cpr the VM resume on failure is centralized in
> +         * migration_cleanup(), so don't resume here as well.
> +         */
> +        if (!migrate_mode_is_cpr()) {
> +            if (runstate_is_live(s->vm_old_state)) {
> +                if (!runstate_check(RUN_STATE_SHUTDOWN)) {
> +                    vm_start();
> +                }
> +            } else {
> +                if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
> +                    runstate_set(s->vm_old_state);
> +                }
>              }
>          }
>          break;
> @@ -3903,7 +3951,6 @@ void migration_start_outgoing(MigrationState *s)
>      Error *local_err = NULL;
>      uint64_t rate_limit;
>      bool resume = (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER_SETUP);
> -    int ret;
>  
>      if (resume) {
>          /* This is a resumed migration */
> @@ -3944,14 +3991,6 @@ void migration_start_outgoing(MigrationState *s)
>          return;
>      }
>  
> -    if (migrate_mode_is_cpr()) {
> -        ret = migration_stop_vm(s, RUN_STATE_FINISH_MIGRATE);
> -        if (ret < 0) {
> -            error_setg(&local_err, "migration_stop_vm failed, error %d", 
> -ret);
> -            goto fail;
> -        }
> -    }
> -
>      /*
>       * Take a refcount to make sure the migration object won't get freed by
>       * the main thread already in migration_shutdown().


Reply via email to