Hi,

I'm working on a v3 of my query-block series [1] and I'm a bit confused about how to convert a QMP command into a coroutine.

In case you missed the context: in that series I'm turning query-block into a coroutine so we can avoid holding the BQL for too long in the case of a misbehaving (slow) syscall at the end of the call chain (get_allocated_file_size -> fstat in my case).

The issue: after converting qmp_query_block into a coroutine, I'm hitting the assert(false) bug at qcow2_get_specific_info(), which was already fixed for non-coroutines [2]. The bug was caused by qmp_query_block() running during bdrv_activate_all():

  bdrv_activate_all
   ...
   bdrv_invalidate_cache
    bdrv_poll_co
    |-> aio_co_enter
    |    ...
    |    qcow2_co_invalidate_cache
    |     memset(s, 0, ...)
    |     qcow2_do_open
    |      blk_co_pread
    |       ...
    |       qemu_coroutine_yield
    |-> AIO_WAIT_WHILE
    |    aio_poll
    |     reschedule of qmp_dispatch
    |      qmp_query_block
    |       ...
    |       qcow2_get_specific_info
    |        sees s->qcow_version == 0
    |        assert(false)

So my question is: how do we expect to be able to convert a QMP command into a coroutine if we're rescheduling all coroutines into qemu_aio_context (at qmp_dispatch)? I don't see how to prevent an arbitrary aio_poll from dispatching the coroutine in the middle of something else.

If I keep the QMP command in the iohandler context, then the bug never happens. Rescheduling back into the iohandler would also work, were it not for the HMP path, which only polls on qemu_aio_context and so causes a deadlock.

What's the recommended approach here?

Thank you

1- https://lore.kernel.org/r/20230609201910.12100-1-faro...@suse.de
2- https://gitlab.com/qemu-project/qemu/-/issues/1933
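
To make the reentrancy in the trace concrete, here is a toy model in plain C. Nothing in it is QEMU code: ToyState, toy_schedule, toy_poll, toy_query, toy_invalidate and toy_run are hypothetical names, and the single-slot queue is only an assumption standing in for work scheduled on an AioContext. It just shows a callback that was queued earlier being run by a nested poll while the state is transiently zeroed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the qcow2 driver state; only the field from the trace. */
typedef struct {
    int qcow_version;
} ToyState;

typedef void (*ToyCallback)(void *opaque);

/* One pending callback, standing in for the rescheduled qmp_dispatch. */
static ToyCallback pending_cb;
static void *pending_opaque;

/* Version seen by the query; -1 means the query has not run yet. */
static int observed_version = -1;

/* Queue a callback to be run by the next poll (hypothetical helper). */
static void toy_schedule(ToyCallback cb, void *opaque)
{
    pending_cb = cb;
    pending_opaque = opaque;
}

/* Stand-in for aio_poll(): runs whatever is pending in this context. */
static void toy_poll(void)
{
    if (pending_cb) {
        ToyCallback cb = pending_cb;
        pending_cb = NULL;
        cb(pending_opaque);
    }
}

/* Stand-in for qmp_query_block -> qcow2_get_specific_info. */
static void toy_query(void *opaque)
{
    ToyState *s = opaque;
    observed_version = s->qcow_version;
}

/* Stand-in for qcow2_co_invalidate_cache plus the nested poll. */
static void toy_invalidate(ToyState *s)
{
    memset(s, 0, sizeof(*s));   /* state is transiently zeroed */
    toy_poll();                 /* nested poll while "waiting for I/O" */
    s->qcow_version = 3;        /* reopen completes, state is valid again */
}

/* Returns the version the query saw when it ran inside the nested poll. */
int toy_run(void)
{
    ToyState s = { .qcow_version = 3 };

    observed_version = -1;
    toy_schedule(toy_query, &s);   /* query is queued before activation */
    toy_invalidate(&s);            /* activation polls and runs the query */
    return observed_version;
}
```

Calling toy_run() from a main() shows the query observing the zeroed state rather than the valid one, which is the condition the assert in the trace trips on. In this model, the query only sees a valid version if it is not runnable from the poll that toy_invalidate() performs, which is roughly what keeping the command in a different context achieves.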