On Thu, Jun 04, 2026 at 06:36:14PM +0300, Avihai Horon wrote:
>
> On 6/3/2026 11:42 PM, Peter Xu wrote:
> > External email: Use caution opening links or attachments
> >
> >
> > On Tue, Jun 02, 2026 at 12:26:15PM +0300, Avihai Horon wrote:
> > > Switchover ACK is checked only during precopy while the guest is still
> > > running. The last migration_can_switchover() decision and guest stop are
> > > not atomic, so a device may want to request another switchover ACK in
> > > the gap after switchover decision has been made but before the guest is
> > > stopped. Migration would then miss that request, which can increase
> > > downtime.
> > >
> > > Cover this case by failing the migration if a switchover-ack was
> > > requested during that time.
> > >
> > > Ideally, precopy iterations should be resumed in this case, however,
> > > VFIO doesn't support going back to precopy after being stopped, so
> > > implementing such logic would require non-trivial changes to the guest
> > > start/stop flow. Given the above and that this case should be rare,
> > > failing the migration seems reasonable.
> > >
> > > Signed-off-by: Avihai Horon <[email protected]>
> > Reviewed-by: Peter Xu <[email protected]>
> >
> > One nit:
> >
> > > ---
> > > migration/migration.c | 30 ++++++++++++++++++++++++++++++
> > > 1 file changed, 30 insertions(+)
> > >
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index 4bb649a467..6ee1c795ff 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -2810,6 +2810,27 @@ static bool
> > > migration_switchover_prepare(MigrationState *s)
> > > return s->state == MIGRATION_STATUS_DEVICE;
> > > }
> > >
> > > +static bool
> > > migration_switchover_check_switchover_ack_pending(MigrationState *s,
> > The name is slightly confusing. It says "check if there is pending ack"
> > but then it returns true if there's no pending ACK..
> >
> > Maybe migration_switchover_is_acknowledged()?
>
> Sure, or maybe migration_have_new_switchover_ack_pending()?
Sure.
>
> I just noticed that I overlooked one corner case:
> If guest is stopped during precopy and source has pending acks > 0 (e.g.,
> virsh migrate with --timeout param), then we will fail in this check and
> unnecessarily abort migration.
>
> I thought about doing the change below. It's not as clean as I'd want it to
> be, but I have no other ideas.
Yes, it's not good to bleed switchover_ack_pending_num all over the
places.. Maybe you can make qemu_savevm_query_pending_final() return an
error when seeing any newly popped ack request while querying? Then the
new function is not needed.
Thanks,
>
> ===========8<===========
> diff --git a/migration/migration.c b/migration/migration.c
> index 6ee1c795ff..6054453b18 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -2810,37 +2810,47 @@ static bool
> migration_switchover_prepare(MigrationState *s)
> return s->state == MIGRATION_STATUS_DEVICE;
> }
>
> -static bool
> migration_switchover_check_switchover_ack_pending(MigrationState *s,
> - Error **errp)
> +static bool migration_have_new_switchover_ack_pending(
> + MigrationState *s, uint32_t old_switchover_ack_pending_num, Error
> **errp)
> {
> uint32_t pending_num;
>
> if (!migrate_switchover_ack() || migrate_switchover_ack_legacy()) {
> - return true;
> + return false;
> }
>
> pending_num = qatomic_read(&s->switchover_ack_pending_num);
> - if (pending_num > 0) {
> + if (pending_num > old_switchover_ack_pending_num) {
> error_setg(errp,
> "Switchover ACK was requested by %" PRIu32
> " devices during switchover",
> - pending_num);
> - return false;
> + pending_num - old_switchover_ack_pending_num);
> + return true;
> }
>
> - return true;
> + return false;
> }
>
> static bool migration_switchover_start(MigrationState *s, Error **errp)
> {
> ERRP_GUARD();
> MigPendingData pending = {};
> + uint32_t old_switchover_ack_pending_num;
>
> if (!migration_switchover_prepare(s)) {
> error_setg(errp, "Switchover is interrupted");
> return false;
> }
>
> + /*
> + * If the guest was stopped during precopy (e.g., by management), we
> may get
> + * here with switchover_ack_pending_num > 0. Store the current
> + * switchover_ack_pending_num value before the final pending query to
> + * distinguish between existing pending ACKs and new ones.
> + */
> + old_switchover_ack_pending_num =
> + qatomic_read(&s->switchover_ack_pending_num);
> +
> qemu_savevm_query_pending_final(s, &pending);
>
> /*
> @@ -2848,7 +2858,8 @@ static bool migration_switchover_start(MigrationState
> *s, Error **errp)
> * Fail the migration in this case since we currently don't support
> going
> * back to precopy.
> */
> - if (!migration_switchover_check_switchover_ack_pending(s, errp)) {
> + if (migration_have_new_switchover_ack_pending(
> + s, old_switchover_ack_pending_num, errp)) {
> return false;
> }
>
> ===========8<===========
>
> Thanks.
>
> >
> > > + Error
> > > **errp)
> > > +{
> > > + uint32_t pending_num;
> > > +
> > > + if (!migrate_switchover_ack() || migrate_switchover_ack_legacy()) {
> > > + return true;
> > > + }
> > > +
> > > + pending_num = qatomic_read(&s->switchover_ack_pending_num);
> > > + if (pending_num > 0) {
> > > + error_setg(errp,
> > > + "Switchover ACK was requested by %" PRIu32
> > > + " devices during switchover",
> > > + pending_num);
> > > + return false;
> > > + }
> > > +
> > > + return true;
> > > +}
> > > +
> > > static bool migration_switchover_start(MigrationState *s, Error **errp)
> > > {
> > > ERRP_GUARD();
> > > @@ -2822,6 +2843,15 @@ static bool
> > > migration_switchover_start(MigrationState *s, Error **errp)
> > >
> > > qemu_savevm_query_pending_final(s, &pending);
> > >
> > > + /*
> > > + * Switchover-ack requests done after switchover decision, are not
> > > allowed.
> > > + * Fail the migration in this case since we currently don't support
> > > going
> > > + * back to precopy.
> > > + */
> > > + if (!migration_switchover_check_switchover_ack_pending(s, errp)) {
> > > + return false;
> > > + }
> > > +
> > > /* Inactivate disks except in COLO */
> > > if (!migrate_colo()) {
> > > /*
> > > --
> > > 2.40.1
> > >
> > --
> > Peter Xu
> >
>
--
Peter Xu