Hi, I've been following the qemu-colo.rst steps to test COLO and
encountered a couple of issues. Unfortunately, I don't have the cycles to
investigate further. Both issues occur with QEMU master (also tested on
versions going back to the COLO fix 0b5bf4ea76).
1) Deadlock at fdmon_io_uring_wait:
(steps from qemu-colo.rst)
- Secondary Failover
- Secondary resume replication
- Start the new Secondary
- Sync
- Wait until the disk is synced, then (also scripted in the sketch below,
  after the backtrace):
  {"execute": "stop"}
  {"execute": "block-job-cancel", "arguments": {"device": "resync"}}
The above results in the old secondary hanging indefinitely at:
do {
    ret = io_uring_submit_and_wait(&ctx->fdmon_io_uring, wait_nr);
} while (ret == -EINTR);
(gdb) bt
#0 syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1 0x00007f5519e0204e in ??? () at //usr/lib64/liburing.so.2
#2 0x00007f5519e01b00 in ??? () at //usr/lib64/liburing.so.2
#3 0x0000563c2dc06cc9 in fdmon_io_uring_wait (ctx=0x563c30411b00, ready_list=0x7ffd0bad8f58, timeout=575708467831) at ../util/fdmon-io_uring.c:416
#4 0x0000563c2dc00976 in aio_poll (ctx=0x563c30411b00, blocking=true) at ../util/aio-posix.c:699
#5 0x0000563c2daa01c6 in bdrv_drain_all_begin () at ../block/io.c:529
#6 0x0000563c2daa03d8 in bdrv_drain_all () at ../block/io.c:574
#7 0x0000563c2d764aae in do_vm_stop (state=RUN_STATE_PAUSED, send_stop=true) at ../system/cpus.c:312
#8 0x0000563c2d765964 in vm_stop (state=RUN_STATE_PAUSED) at ../system/cpus.c:754
#9 0x0000563c2d7f3378 in qmp_stop (errp=0x7ffd0bad9080) at ../monitor/qmp-cmds.c:62
#10 0x0000563c2dba7a72 in qmp_marshal_stop (args=0x563c306ac070, ret=0x7f5518dffda8, errp=0x7f5518dffda0) at qapi/qapi-commands-misc.c:197
#11 0x0000563c2dbf1316 in do_qmp_dispatch_bh (opaque=0x7f5518dffe40) at ../qapi/qmp-dispatch.c:128
#12 0x0000563c2dc1de48 in aio_bh_call (bh=0x563c3040fef0) at ../util/async.c:173
#13 0x0000563c2dc1df64 in aio_bh_poll (ctx=0x563c3040c070) at ../util/async.c:220
#14 0x0000563c2dbffff0 in aio_dispatch (ctx=0x563c3040c070) at ../util/aio-posix.c:389
#15 0x0000563c2dc1e3cd in aio_ctx_dispatch (source=0x563c3040c070, callback=0x0, user_data=0x0) at ../util/async.c:365
#16 0x00007f551b114f4c in g_main_dispatch (context=0x563c304120f0) at ../glib/gmain.c:3476
#17 g_main_context_dispatch_unlocked (context=context@entry=0x563c304120f0) at ../glib/gmain.c:4284
#18 0x00007f551b1170c9 in g_main_context_dispatch (context=0x563c304120f0) at ../glib/gmain.c:4272
#19 0x0000563c2dc1fa0b in glib_pollfds_poll () at ../util/main-loop.c:290
#20 0x0000563c2dc1fa85 in os_host_main_loop_wait (timeout=0) at ../util/main-loop.c:313
#21 0x0000563c2dc1fb8a in main_loop_wait (nonblocking=0) at ../util/main-loop.c:592
#22 0x0000563c2d78eb60 in qemu_main_loop () at ../system/runstate.c:903
#23 0x0000563c2db412fc in qemu_default_main (opaque=0x0) at ../system/main.c:50
#24 0x0000563c2db413ab in main (argc=40, argv=0x7ffd0bad94d8) at ../system/main.c:93
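For reference, here is a minimal QMP client sketch that scripts the last
two steps against the old secondary's monitor. The socket path and the
-qmp option used to expose it are assumptions about the setup; the
"resync" job name is the one from qemu-colo.rst:

/*
 * qmp-hang-repro.c - minimal sketch, assuming the old secondary was
 * started with "-qmp unix:/tmp/qmp-old-secondary.sock,server,nowait"
 * (hypothetical path) and the resync job is named "resync".
 * Build: cc -o qmp-hang-repro qmp-hang-repro.c
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>

static void send_cmd(int fd, const char *json)
{
    printf("-> %s\n", json);
    if (write(fd, json, strlen(json)) != (ssize_t)strlen(json)) {
        perror("write");
        exit(1);
    }
}

int main(void)
{
    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    char buf[4096];
    ssize_t n;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd < 0) {
        perror("socket");
        return 1;
    }
    strncpy(addr.sun_path, "/tmp/qmp-old-secondary.sock",
            sizeof(addr.sun_path) - 1);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("connect");
        return 1;
    }

    /* QMP requires capabilities negotiation before any other command. */
    send_cmd(fd, "{\"execute\": \"qmp_capabilities\"}");
    send_cmd(fd, "{\"execute\": \"stop\"}");
    send_cmd(fd, "{\"execute\": \"block-job-cancel\", "
                 "\"arguments\": {\"device\": \"resync\"}}");

    /*
     * Print the greeting and replies as they arrive. With the hang
     * described above, the reply to "stop" never arrives and the
     * read() below blocks forever.
     */
    while ((n = read(fd, buf, sizeof(buf) - 1)) > 0) {
        buf[n] = '\0';
        fputs(buf, stdout);
    }
    return 0;
}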
---
2) Race at colo_process_checkpoint
The following pattern seems inherently racy: whether the switch statement
sees the state as FAILOVER_STATUS_COMPLETED varies from run to run,
presumably depending on how far the failover code has progressed when the
COLO thread samples the state:
colo_process_checkpoint()
{
    ...
out:
    ...
    /*
     * There are only two reasons we can get here, some error happened
     * or the user triggered failover.
     */
--> switch (failover_get_state()) {
    case FAILOVER_STATUS_COMPLETED:
        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
                                  COLO_EXIT_REASON_REQUEST);
        break;
    default:
        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
                                  COLO_EXIT_REASON_ERROR);
    }

    /* Hope this not to be too long to wait here */
--> qemu_event_wait(&s->colo_exit_event);
    ...
}
This results in what seems like a spurious error event even though
failover was requested:
{"timestamp": {"seconds": 1770984655, "microseconds": 216464}, "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "error"}}
I'm not sure whether the intention is to just ignore this, but moving the
qemu_event_wait() above the state check would seem to eliminate the race;
see the sketch below.
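Concretely, something like this (an untested sketch of the suggested
reordering, not a proper patch; it assumes the failover path sets
colo_exit_event only after the failover state has reached its final
value, and the rest of the function stays as-is):

out:
    ...
    /*
     * Wait for failover work (if any) to finish first, so that
     * failover_get_state() below can no longer observe an
     * in-progress failover.
     */
    qemu_event_wait(&s->colo_exit_event);

    /*
     * There are only two reasons we can get here, some error happened
     * or the user triggered failover.
     */
    switch (failover_get_state()) {
    case FAILOVER_STATUS_COMPLETED:
        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
                                  COLO_EXIT_REASON_REQUEST);
        break;
    default:
        qapi_event_send_colo_exit(COLO_MODE_PRIMARY,
                                  COLO_EXIT_REASON_ERROR);
    }
    ...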