On 03.06.2026 22:38, Fabiano Rosas wrote:
It's not possible to access the image file while there is an incoming migration in progress; the QEMU process doesn't hold any locks to the storage at this point, so nodes are inactive. Attempting to flush leads to an assert at bdrv_co_write_req_prepare():

  assert(!(bs->open_flags & BDRV_O_INACTIVE))

The issue is reproducible by running iotest 181 on a host under cpu load. The migration must coincide with the header already containing the QED_F_NEED_CHECK flag.

The sequence of events is as follows, with the respective call stacks referenced below:

During block device init, bdrv_qed_attach_aio_context() starts the 'need_check' timer. The timer will not fire during incoming migration as it uses QEMU_CLOCK_VIRTUAL (to avoid this very issue, as the code comment indicates). (0)

However, there's still bdrv_qed_drain_begin() which uses the fact that the timer is live to decide whether to start the qed_need_check_timer_entry() directly. (1)

The qed_need_check_timer_entry() eventually calls into qed_write_header() -> bdrv_co_pwrite() leading to the assert. (2)

Skip creating the 'need_check' timer whenever the image is inactive.
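
For readers without the tree at hand, the sequence above can be modelled with a small standalone sketch. This is not the actual patch and does not use QEMU's types: every name here (ToyQEDState, ToyTimer, TOY_O_INACTIVE, toy_attach(), toy_drain_begin(), toy_write_header()) is a hypothetical stand-in, and the real assert is replaced by a counter so that both behaviours can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for BDRV_O_INACTIVE; NOT the real QEMU value. */
#define TOY_O_INACTIVE 0x1

/* Hypothetical model of a timer: only tracks whether it is armed. */
typedef struct ToyTimer {
    bool pending;
} ToyTimer;

/* Hypothetical model of the per-image QED state. */
typedef struct ToyQEDState {
    int open_flags;
    ToyTimer timer_storage;
    ToyTimer *need_check_timer;  /* NULL when no timer was created */
    int header_writes;           /* header writes attempted */
    int inactive_writes;         /* writes that would hit the assert */
} ToyQEDState;

/* Models the header write; counts writes issued on an inactive image
 * instead of asserting as the real write path does. */
static void toy_write_header(ToyQEDState *s)
{
    s->header_writes++;
    if (s->open_flags & TOY_O_INACTIVE) {
        s->inactive_writes++;
    }
}

/* Models the attach step. With skip_when_inactive set, no timer is
 * created for an inactive image (the approach the commit message
 * describes); otherwise the timer is always created and armed. */
static void toy_attach(ToyQEDState *s, bool skip_when_inactive)
{
    assert(s != NULL);
    s->need_check_timer = NULL;
    if (skip_when_inactive && (s->open_flags & TOY_O_INACTIVE)) {
        return;
    }
    s->timer_storage.pending = true;
    s->need_check_timer = &s->timer_storage;
}

/* Models the drain step: a live timer is cancelled and its work is
 * run directly, which is what ends in the header write. */
static void toy_drain_begin(ToyQEDState *s)
{
    assert(s != NULL);
    if (s->need_check_timer && s->need_check_timer->pending) {
        s->need_check_timer->pending = false;
        toy_write_header(s);
    }
}
```

In this model, an inactive image attached with skip_when_inactive=false gets a header write on drain, while the same image attached with skip_when_inactive=true has no timer, so drain does nothing. An active image behaves the same in both modes.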
...
Note that this issue is not exactly the same as what's been reported in Gitlab, but given how easily this reproduces, I imagine it has to be happening in that setup as well.

Link: https://gitlab.com/qemu-project/qemu/-/work_items/3515
Signed-off-by: Fabiano Rosas <[email protected]>
---
 block/qed.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

I'm picking this one up for the active qemu-stable series.

Dunno how popular qed is these days, though - back in the day it was a useful alternative to qcow2, now it feels more like a toy :)

Thanks,

/mjt
