On 03.06.2026 22:38, Fabiano Rosas wrote:
It's not possible to access the image file while there is an incoming
migration in progress, the QEMU process doesn't hold any locks to the
storage at this point so nodes are inactive. Attempting to flush leads
to an assert at bdrv_co_write_req_prepare():

    assert(!(bs->open_flags & BDRV_O_INACTIVE))

The issue is reproducible by running iotest 181 on a host under cpu
load. The migration must coincide with the header already containing
the QED_F_NEED_CHECK flag.

The sequence of events is as follows, with the respective call stacks
referenced below:

During block device init, bdrv_qed_attach_aio_context() starts the
'need_check' timer. The timer will not fire during incoming migration
as it uses QEMU_CLOCK_VIRTUAL (to avoid this very issue, as the code
comment indicates).                                                   (0)

However, there's still bdrv_qed_drain_begin() which uses the fact that
the timer is live to decide whether to start the
qed_need_check_timer_entry() directly.                                (1)

The qed_need_check_timer_entry() eventually calls into
qed_write_header() -> bdrv_co_pwrite() leading to the assert.         (2)

Skip creating the 'need_check' timer whenever the image is inactive.
...
Note that this issue is not exactly the same as what's been reported
in Gitlab, but given how easily this reproduces, I imagine it has to
be happening in that setup as well.

Link: https://gitlab.com/qemu-project/qemu/-/work_items/3515
Signed-off-by: Fabiano Rosas <[email protected]>
---
  block/qed.c | 16 +++++++++++-----
  1 file changed, 11 insertions(+), 5 deletions(-)

I'm picking this one up for the active qemu-stable series.
Dunno how popular qed is these days, though - back in the day it was
a useful alternative to qcow2, now it feels more like a toy :)

Thanks,

/mjt

Reply via email to