As discussed in another thread [1], here's an RFC addressing a VCPU softlockup encountered when issuing QMP commands that target a disk placed on NFS.

Since QMP commands happen with the qemu_global_mutex locked, any
command that takes too long to finish will block other threads waiting
to take the global mutex. One such thread could be a VCPU thread going
out of the guest to handle IO.

This is the case when issuing the QMP command query-block, which
eventually calls raw_co_get_allocated_file_size(). This function makes
an 'fstat' call that has been observed to take a long time (seconds)
over NFS.

NFS latency issues aside, we can improve the situation by not blocking
VCPU threads while the command is running.

Move the 'fstat' call into the thread-pool and make the necessary
adaptations to ensure raw_co_get_allocated_file_size runs in a
coroutine in the block driver aio_context.

1- Question about QMP and BQL
https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg03141.html

CI run: https://gitlab.com/farosas/qemu/-/pipelines/876583685

Fabiano Rosas (3):
  block: Remove bdrv_query_block_node_info
  block: Mark bdrv_co_get_allocated_file_size() as mixed
  block: Allow bdrv_get_allocated_file_size to run in bdrv context

João Silva (1):
  block: Add a thread-pool version of fstat

Lin Ma (2):
  Convert query-block/info_block to coroutine
  Convert query-block/info_block to coroutine

 block/file-posix.c             | 40 ++++++++++++++++++++++++--
 block/monitor/block-hmp-cmds.c |  2 +-
 block/qapi.c                   | 51 +++++++++++++++-------------------
 blockdev.c                     |  6 ++--
 hmp-commands-info.hx           |  1 +
 include/block/block-hmp-cmds.h |  2 +-
 include/block/block-io.h       |  2 +-
 include/block/qapi.h           |  3 --
 include/block/raw-aio.h        |  4 ++-
 qapi/block-core.json           |  5 ++--
 10 files changed, 72 insertions(+), 44 deletions(-)

--
2.35.3