On 2020-09-17 16:44, Stefan Hajnoczi wrote:
> On Thu, Sep 17, 2020 at 03:36:57PM +0800, Zhenyu Ye wrote:
> > When the hang occurs, the QEMU is blocked at:
> > 
> >     #0  0x0000ffff95762b64 in ?? () from target:/usr/lib64/libpthread.so.0
> >     #1  0x0000ffff9575bd88 in pthread_mutex_lock () from target:/usr/lib64/libpthread.so.0
> >     #2  0x0000aaaabb1f5948 in qemu_mutex_lock_impl (mutex=0xaaaacc8e1860,
> >         file=0xaaaabb4e1bd0 "/Images/eillon/CODE/5-opensource/qemu/util/async.c", line=605)
> >     #3  0x0000aaaabb20acd4 in aio_context_acquire (ctx=0xaaaacc8e1800)
> >     #4  0x0000aaaabb105e90 in bdrv_query_image_info (bs=0xaaaacc934620,
> >         p_info=0xaaaaccc41e18, errp=0xffffca669118)
> >     #5  0x0000aaaabb105968 in bdrv_block_device_info (blk=0xaaaacdca19f0, bs=0xaaaacc934620,
> >         flat=false, errp=0xffffca6692b8)
> >     #6  0x0000aaaabb1063dc in bdrv_query_info (blk=0xaaaacdca19f0, p_info=0xaaaacd29c9a8,
> >         errp=0xffffca6692b8)
> >     #7  0x0000aaaabb106c14 in qmp_query_block (errp=0x0)
> >     #8  0x0000aaaabacb8e6c in hmp_info_block (mon=0xffffca6693d0, qdict=0xaaaacd089790)
> 
> Great, this shows that the main loop thread is stuck waiting for the
> AioContext lock.
> 
> Please post backtraces from all QEMU threads ((gdb) thread apply all bt)
> so we can figure out which thread is holding up the main loop.

I think that is reflected in the perf backtrace posted by Zhenyu already:

And in the host, the information of sys_enter_io_submit() is:

Samples: 3K of event 'syscalls:sys_enter_io_submit', Event count (approx.): 3150
   Children      Self  Trace output
   -   66.70%    66.70%  ctx_id: 0xffff9c044000, nr: 0x00000001, iocbpp: 0xffff9f7fad28
   0xffffae7f871c
   0xffffae8a27c4
   qemu_thread_start
   iothread_run
   aio_poll
   aio_dispatch_ready_handlers
   aio_dispatch_handler
   virtio_queue_host_notifier_aio_read
   virtio_queue_notify_aio_vq
   virtio_blk_data_plane_handle_output
   virtio_blk_handle_vq
   blk_io_unplug
   bdrv_io_unplug
   bdrv_io_unplug
   raw_aio_unplug
   laio_io_unplug
   syscall

So the iothread is stuck in a slow io_submit while it holds the
AioContext lock, which is why the main loop cannot acquire it.
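
As a toy model of this pattern (not QEMU code; the function and variable
names are made up for illustration, and time.sleep() merely stands in for
the slow syscall), the following Python sketch has one thread hold a lock
across a slow call while a second thread measures how long it waits for
the same lock:

```python
import threading
import time


def measure_lock_wait(hold_seconds: float) -> float:
    """Return how long a 'main loop' waits for a lock held across a slow call."""
    ctx_lock = threading.Lock()   # stands in for the AioContext lock
    holding = threading.Event()   # set once the worker owns the lock

    def iothread() -> None:
        with ctx_lock:
            holding.set()
            time.sleep(hold_seconds)  # stands in for a slow io_submit()

    worker = threading.Thread(target=iothread)
    worker.start()
    holding.wait()                # make sure the worker took the lock first

    start = time.monotonic()
    with ctx_lock:                # stands in for aio_context_acquire()
        waited = time.monotonic() - start
    worker.join()
    return waited


if __name__ == "__main__":
    print(f"main loop waited {measure_lock_wait(0.5):.2f}s")
```

The waiting thread is delayed for roughly as long as the slow call takes,
which matches the symptom in the backtrace: the main loop sits in
pthread_mutex_lock until the lock holder returns.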

It would be interesting to know what in the kernel is blocking io_submit
from returning.

Fam
