On 2020-09-17 16:44, Stefan Hajnoczi wrote:
> On Thu, Sep 17, 2020 at 03:36:57PM +0800, Zhenyu Ye wrote:
> > When the hang occurs, the QEMU is blocked at:
> >
> > #0  0x0000ffff95762b64 in ?? () from target:/usr/lib64/libpthread.so.0
> > #1  0x0000ffff9575bd88 in pthread_mutex_lock () from
> > target:/usr/lib64/libpthread.so.0
> > #2  0x0000aaaabb1f5948 in qemu_mutex_lock_impl (mutex=0xaaaacc8e1860,
> > file=0xaaaabb4e1bd0
> > "/Images/eillon/CODE/5-opensource/qemu/util/async.c", line=605)
> > #3  0x0000aaaabb20acd4 in aio_context_acquire (ctx=0xaaaacc8e1800)
> > #4  0x0000aaaabb105e90 in bdrv_query_image_info (bs=0xaaaacc934620,
> > p_info=0xaaaaccc41e18, errp=0xffffca669118)
> > #5  0x0000aaaabb105968 in bdrv_block_device_info (blk=0xaaaacdca19f0,
> > bs=0xaaaacc934620,
> > flat=false, errp=0xffffca6692b8)
> > #6  0x0000aaaabb1063dc in bdrv_query_info (blk=0xaaaacdca19f0,
> > p_info=0xaaaacd29c9a8,
> > errp=0xffffca6692b8)
> > #7  0x0000aaaabb106c14 in qmp_query_block (errp=0x0)
> > #8  0x0000aaaabacb8e6c in hmp_info_block (mon=0xffffca6693d0,
> > qdict=0xaaaacd089790)
>
> Great, this shows that the main loop thread is stuck waiting for the
> AioContext lock.
>
> Please post backtraces from all QEMU threads ((gdb) thread apply all bt)
> so we can figure out which thread is holding up the main loop.

I think that is reflected in the perf backtrace posted by Zhenyu already:

    And in the host, the information of sys_enter_io_submit() is:

    Samples: 3K of event 'syscalls:sys_enter_io_submit', Event count (approx.): 3150
      Children      Self  Trace output
    -   66.70%    66.70%  ctx_id: 0xffff9c044000, nr: 0x00000001, iocbpp: 0xffff9f7fad28
         0xffffae7f871c
         0xffffae8a27c4
         qemu_thread_start
         iothread_run
         aio_poll
         aio_dispatch_ready_handlers
         aio_dispatch_handler
         virtio_queue_host_notifier_aio_read
         virtio_queue_notify_aio_vq
         virtio_blk_data_plane_handle_output
         virtio_blk_handle_vq
         blk_io_unplug
         bdrv_io_unplug
         bdrv_io_unplug
         raw_aio_unplug
         laio_io_unplug
         syscall

So the iothread is blocked in a slow io_submit while holding the AioContext
lock.

It would be interesting to know what in the kernel is blocking io_submit from
returning.

Fam
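
The contention pattern described above can be reproduced in miniature. The following is only an illustrative sketch in Python, not QEMU code: `ctx_lock` stands in for the AioContext lock, `time.sleep()` stands in for a slow io_submit, and the function name `measure_lock_wait` is made up for this example. It shows that the thread trying to take the lock stalls for roughly as long as the holder stays inside the slow call.

```python
import threading
import time


def measure_lock_wait(hold_seconds):
    """Return how long a 'main loop' waits for a lock held by a slow worker."""
    ctx_lock = threading.Lock()   # stand-in for the AioContext lock
    holding = threading.Event()   # set once the worker owns the lock

    def iothread():
        with ctx_lock:
            holding.set()
            time.sleep(hold_seconds)  # stand-in for a slow io_submit()

    worker = threading.Thread(target=iothread)
    worker.start()
    holding.wait()                # make sure the worker took the lock first

    start = time.monotonic()
    with ctx_lock:                # stand-in for aio_context_acquire()
        waited = time.monotonic() - start
    worker.join()
    return waited


if __name__ == "__main__":
    print(f"main loop waited {measure_lock_wait(0.2):.3f}s for the lock")
```

The wait measured by the second thread tracks the time the first thread spends in the slow call, which is why finding out what delays io_submit in the kernel is the next step.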