On Tue, Nov 04, 2025 at 09:36:06AM +0800, Li Zhijian wrote:
> Commit 4881411136 ("migration: Always set DEVICE state") set a new DEVICE
> state before completed during migration, which broke the original transition
> to COLO. The migration flow for precopy has changed to:
> active -> pre-switchover -> device -> completed.
> 
> This patch updates the transition state to ensure that the Pre-COLO
> state corresponds to DEVICE state correctly.
> 
> Fixes: 4881411136 ("migration: Always set DEVICE state")
> Signed-off-by: Li Zhijian <[email protected]>
> ---
>  migration/migration.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index a63b46bbef..6ec7f3cec8 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3095,9 +3095,9 @@ static void migration_completion(MigrationState *s)
>          goto fail;
>      }
>  
> -    if (migrate_colo() && s->state == MIGRATION_STATUS_ACTIVE) {
> +    if (migrate_colo() && s->state == MIGRATION_STATUS_DEVICE) {
>          /* COLO does not support postcopy */
> -        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> +        migrate_set_state(&s->state, MIGRATION_STATUS_DEVICE,
>                            MIGRATION_STATUS_COLO);
>      } else {
>          migration_completion_end(s);
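
To spell out why the old check stopped matching, here is a minimal standalone
sketch, not the QEMU code itself. All `sketch_*` / `SKETCH_*` names are
hypothetical, and it assumes that the state setter only moves the state when
the current value equals the expected old one, and that the non-COLO branch
ends in the completed state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical subset of the states named in the commit message. */
typedef enum {
    SKETCH_ACTIVE,
    SKETCH_PRE_SWITCHOVER,
    SKETCH_DEVICE,
    SKETCH_COLO,
    SKETCH_COMPLETED,
} SketchState;

/*
 * Assumed semantics of the state setter: switch to new_state only if the
 * current state equals old_state.  Returns true if the switch happened.
 */
static bool sketch_set_state(_Atomic int *state, SketchState old_state,
                             SketchState new_state)
{
    int expected = old_state;
    return atomic_compare_exchange_strong(state, &expected, new_state);
}

/*
 * Completion step: enter COLO only when the current state equals `from`,
 * otherwise finish as a plain migration (assumed to end in COMPLETED).
 */
static SketchState sketch_complete(SketchState current, SketchState from,
                                   bool colo_enabled)
{
    _Atomic int state;

    atomic_init(&state, current);
    if (colo_enabled && atomic_load(&state) == (int)from) {
        sketch_set_state(&state, from, SKETCH_COLO);
    } else {
        atomic_store(&state, SKETCH_COMPLETED);
    }
    return (SketchState)atomic_load(&state);
}

static void sketch_self_check(void)
{
    /* Old check against ACTIVE: already in DEVICE, so COLO is skipped. */
    assert(sketch_complete(SKETCH_DEVICE, SKETCH_ACTIVE, true)
           == SKETCH_COMPLETED);
    /* Check against DEVICE: the COLO transition is taken. */
    assert(sketch_complete(SKETCH_DEVICE, SKETCH_DEVICE, true)
           == SKETCH_COLO);
}
```

Under those assumptions, a COLO migration that has already reached the device
state silently completes as a normal migration when the check still compares
against the active state, which matches the breakage described in the commit
message.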

Thanks a lot for fixing it, Zhijian.  It means I already broke COLO for
10.0/10.1.

Hailiang/Chen, do you still know anyone who is using COLO, especially in
enterprise?  I don't expect any individual to be using it.  It definitely
complicates the migration logic all over the place.  Fabiano and I have
discussed removing legacy code a few times, and COLO was always on the list.

We used to discuss RDMA obsoletion too; that's when Huawei developers at
least tried to re-implement the whole of RDMA using rsocket, which didn't
land only because of a perf regression.  Meanwhile, Zhijian also provided a
unit test, which we have relied on recently to at least not break RDMA.

If we do not have known users, I sincerely want to discuss with you the
obsoletion and removal of COLO from the QEMU codebase.  Do you see that as
feasible?

Zhijian, do you have any input here?

Thanks,

-- 
Peter Xu

