On 3/13/26 12:55, Denis V. Lunev wrote:
> When libvirtd reconnects to a running QEMU process that had an
> in-progress migration, qemuProcessReconnect first connects the
> monitor and only later recovers the migration job. During this window
> the async job is VIR_ASYNC_JOB_NONE, so any MIGRATION status events
> from QEMU are silently dropped by qemuProcessHandleMigrationStatus.
>
> If the migration was already cancelled or completed by QEMU during
> this window, no further events will be emitted. When
> qemuMigrationSrcCancelUnattended later restores the async job and
> calls qemuMigrationSrcCancel with wait=true, the wait loop calls
> qemuDomainObjWait (virCondWait with no timeout) and blocks forever
> waiting for an event that will never arrive.
>
> Fix this by querying QEMU migration status with query-migrate
> immediately after sending migrate_cancel, while still inside the
> monitor session. This ensures the job's migration status is up to
> date before entering the wait loop, so if QEMU already reached a
> terminal state (cancelled/completed/error), the loop exits
> immediately.
>
> Signed-off-by: Denis V. Lunev <[email protected]>
> CC: Peter Krempa <[email protected]>
> CC: Michal Privoznik <[email protected]>
> ---
>  src/qemu/qemu_migration.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>
> diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
> index fec808ccfb..3a9185f65c 100644
> --- a/src/qemu/qemu_migration.c
> +++ b/src/qemu/qemu_migration.c
> @@ -4876,6 +4876,21 @@ qemuMigrationSrcCancel(virDomainObj *vm,
>          return -1;
>  
>      rc = qemuMonitorMigrateCancel(priv->mon);
> +
> +    if (rc == 0 && wait) {
> +        virDomainJobData *jobData = vm->job->current;
> +        qemuDomainJobDataPrivate *privJob = jobData->privateData;
> +        qemuMonitorMigrationStats stats;
> +
> +        /* During reconnect the async job is not yet restored when migration
> +         * events can arrive from QEMU, causing
> +         * qemuProcessHandleMigrationStatus() to drop them. In that case
> +         * QEMU won't send any more events and the wait loop would block
> +         * forever. */
> +        if (qemuMonitorGetMigrationStats(priv->mon, &stats, NULL) == 0)
> +            privJob->stats.mig.status = stats.status;
> +    }
> +
>      qemuDomainObjExitMonitor(vm);
>  
>      if (rc < 0)
ping
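
For anyone skimming the thread, here is a minimal standalone sketch of the idea in the commit message: refresh the cached migration status from an explicit query before deciding whether to block on an event. All names here (mig_status, struct job, struct hypervisor, must_wait, refresh_status) are hypothetical stand-ins, not libvirt or QEMU APIs, and the model is deliberately single-threaded, so it only illustrates the ordering, not the real locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not libvirt types. */
enum mig_status {
    MIG_ACTIVE,
    MIG_CANCELLING,
    MIG_CANCELLED,
    MIG_COMPLETED,
    MIG_ERROR
};

struct job {
    enum mig_status cached;   /* last status recorded from events */
};

struct hypervisor {
    enum mig_status actual;   /* status an explicit query would report */
};

/* Terminal states: no further status events are expected after these. */
static bool is_terminal(enum mig_status s)
{
    return s == MIG_CANCELLED || s == MIG_COMPLETED || s == MIG_ERROR;
}

/* True if the caller would have to block until the next status event. */
static bool must_wait(const struct job *job)
{
    assert(job != NULL);
    return !is_terminal(job->cached);
}

/* Models the explicit status query issued right after the cancel request:
 * copy the hypervisor's current status into the job's cached copy. */
static void refresh_status(struct job *job, const struct hypervisor *hv)
{
    assert(job != NULL && hv != NULL);
    job->cached = hv->actual;
}
```

If the cancel event was dropped while the job was not yet restored, the cached status stays non-terminal and must_wait() keeps returning true, even though the hypervisor side is already finished. Calling refresh_status() first makes the wait decision depend on the queried state rather than on an event that may never come.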
