Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On 2020/12/15 0:33, Stefan Hajnoczi wrote:
> On Tue, Dec 08, 2020 at 08:47:42AM -0500, Glauber Costa wrote:
>> The work we did at the time was in fixing those things in the kernel
>> as much as we could.
>> But the API is just like that...
> The best way for us is to replace io_submit with io_uring.

Thanks for the discussion!

Zhenyu
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On Tue, Dec 08, 2020 at 08:47:42AM -0500, Glauber Costa wrote:
> The work we did at the time was in fixing those things in the kernel
> as much as we could.
> But the API is just like that...

Thanks!

Stefan
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On Tue, Dec 8, 2020 at 8:11 AM Stefan Hajnoczi wrote: > > On Thu, Oct 22, 2020 at 05:29:16PM +0100, Fam Zheng wrote: > > On Tue, 2020-10-20 at 09:34 +0800, Zhenyu Ye wrote: > > > On 2020/10/19 21:25, Paolo Bonzini wrote: > > > > On 19/10/20 14:40, Zhenyu Ye wrote: > > > > > The kernel backtrace for io_submit in GUEST is: > > > > > > > > > > guest# ./offcputime -K -p `pgrep -nx fio` > > > > > b'finish_task_switch' > > > > > b'__schedule' > > > > > b'schedule' > > > > > b'io_schedule' > > > > > b'blk_mq_get_tag' > > > > > b'blk_mq_get_request' > > > > > b'blk_mq_make_request' > > > > > b'generic_make_request' > > > > > b'submit_bio' > > > > > b'blkdev_direct_IO' > > > > > b'generic_file_read_iter' > > > > > b'aio_read' > > > > > b'io_submit_one' > > > > > b'__x64_sys_io_submit' > > > > > b'do_syscall_64' > > > > > b'entry_SYSCALL_64_after_hwframe' > > > > > -fio (1464) > > > > > 40031912 > > > > > > > > > > And Linux io_uring can avoid the latency problem. > > > > Thanks for the info. What this tells us is basically the inflight > > requests are high. It's sad that the linux-aio is in practice > > implemented as a blocking API. it is. > > > > Host side backtrace will be of more help. Can you get that too? > > I guess Linux AIO didn't set the BLK_MQ_REQ_NOWAIT flag so the task went > to sleep when it ran out of blk-mq tags. The easiest solution is to move > to io_uring. Linux AIO is broken - it's not AIO :). Agree! > > If we know that no other process is writing to the host block device > then maybe we can determine the blk-mq tags limit (the queue depth) and > avoid sending more requests. That way QEMU doesn't block, but I don't > think this approach works when other processes are submitting I/O to the > same host block device :(. > > Fam's original suggestion of invoking io_submit(2) from a worker thread > is an option, but I'm afraid it will slow down the uncontended case. > > I'm CCing Glauber in case he battled this in the past in ScyllaDB. We have, and a lot. 
I don't recall seeing this particular lock, but XFS would block us all
the time if it had to update metadata to submit the operation, lock
inodes, etc.

The work we did at the time was in fixing those things in the kernel as
much as we could. But the API is just like that...

>
> Stefan
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On Thu, Oct 22, 2020 at 05:29:16PM +0100, Fam Zheng wrote:
> On Tue, 2020-10-20 at 09:34 +0800, Zhenyu Ye wrote:
> > On 2020/10/19 21:25, Paolo Bonzini wrote:
> > > On 19/10/20 14:40, Zhenyu Ye wrote:
> > > > The kernel backtrace for io_submit in GUEST is:
> > > >
> > > > guest# ./offcputime -K -p `pgrep -nx fio`
> > > > b'finish_task_switch'
> > > > b'__schedule'
> > > > b'schedule'
> > > > b'io_schedule'
> > > > b'blk_mq_get_tag'
> > > > b'blk_mq_get_request'
> > > > b'blk_mq_make_request'
> > > > b'generic_make_request'
> > > > b'submit_bio'
> > > > b'blkdev_direct_IO'
> > > > b'generic_file_read_iter'
> > > > b'aio_read'
> > > > b'io_submit_one'
> > > > b'__x64_sys_io_submit'
> > > > b'do_syscall_64'
> > > > b'entry_SYSCALL_64_after_hwframe'
> > > > -fio (1464)
> > > > 40031912
> > > >
> > > > And Linux io_uring can avoid the latency problem.
>
> Thanks for the info. What this tells us is basically the inflight
> requests are high. It's sad that the linux-aio is in practice
> implemented as a blocking API.
>
> Host side backtrace will be of more help. Can you get that too?

I guess Linux AIO didn't set the BLK_MQ_REQ_NOWAIT flag so the task went
to sleep when it ran out of blk-mq tags. The easiest solution is to move
to io_uring. Linux AIO is broken - it's not AIO :).

If we know that no other process is writing to the host block device
then maybe we can determine the blk-mq tags limit (the queue depth) and
avoid sending more requests. That way QEMU doesn't block, but I don't
think this approach works when other processes are submitting I/O to the
same host block device :(.

Fam's original suggestion of invoking io_submit(2) from a worker thread
is an option, but I'm afraid it will slow down the uncontended case.

I'm CCing Glauber in case he battled this in the past in ScyllaDB.

Stefan
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On Tue, 2020-10-20 at 09:34 +0800, Zhenyu Ye wrote: > On 2020/10/19 21:25, Paolo Bonzini wrote: > > On 19/10/20 14:40, Zhenyu Ye wrote: > > > The kernel backtrace for io_submit in GUEST is: > > > > > > guest# ./offcputime -K -p `pgrep -nx fio` > > > b'finish_task_switch' > > > b'__schedule' > > > b'schedule' > > > b'io_schedule' > > > b'blk_mq_get_tag' > > > b'blk_mq_get_request' > > > b'blk_mq_make_request' > > > b'generic_make_request' > > > b'submit_bio' > > > b'blkdev_direct_IO' > > > b'generic_file_read_iter' > > > b'aio_read' > > > b'io_submit_one' > > > b'__x64_sys_io_submit' > > > b'do_syscall_64' > > > b'entry_SYSCALL_64_after_hwframe' > > > -fio (1464) > > > 40031912 > > > > > > And Linux io_uring can avoid the latency problem. Thanks for the info. What this tells us is basically the inflight requests are high. It's sad that the linux-aio is in practice implemented as a blocking API. Host side backtrace will be of more help. Can you get that too? Fam > > > > What filesystem are you using? > > > > On host, the VM image and disk images are based on ext4 filesystem. > In guest, the '/' uses xfs filesystem, and the disks are raw devices. > > guest# df -hT > Filesystem Type Size Used Avail Use% Mounted on > devtmpfsdevtmpfs 16G 0 16G 0% /dev > tmpfs tmpfs 16G 0 16G 0% /dev/shm > tmpfs tmpfs 16G 976K 16G 1% /run > /dev/mapper/fedora-root xfs 8.0G 3.2G 4.9G 40% / > tmpfs tmpfs 16G 0 16G 0% /tmp > /dev/sda1 xfs 1014M 181M 834M 18% /boot > tmpfs tmpfs 3.2G 0 3.2G 0% /run/user/0 > > guest# lsblk > NAMEMAJ:MIN RM SIZE RO TYPE MOUNTPOINT > sda 8:00 10G 0 disk > ├─sda18:10 1G 0 part /boot > └─sda28:20 9G 0 part > ├─fedora-root 253:00 8G 0 lvm / > └─fedora-swap 253:10 1G 0 lvm [SWAP] > vda 252:00 10G 0 disk > vdb 252:16 0 10G 0 disk > vdc 252:32 0 10G 0 disk > vdd 252:48 0 10G 0 disk > > Thanks, > Zhenyu >
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On 2020/10/19 21:25, Paolo Bonzini wrote:
> On 19/10/20 14:40, Zhenyu Ye wrote:
>> The kernel backtrace for io_submit in GUEST is:
>>
>> guest# ./offcputime -K -p `pgrep -nx fio`
>> b'finish_task_switch'
>> b'__schedule'
>> b'schedule'
>> b'io_schedule'
>> b'blk_mq_get_tag'
>> b'blk_mq_get_request'
>> b'blk_mq_make_request'
>> b'generic_make_request'
>> b'submit_bio'
>> b'blkdev_direct_IO'
>> b'generic_file_read_iter'
>> b'aio_read'
>> b'io_submit_one'
>> b'__x64_sys_io_submit'
>> b'do_syscall_64'
>> b'entry_SYSCALL_64_after_hwframe'
>> -fio (1464)
>> 40031912
>>
>> And Linux io_uring can avoid the latency problem.
>
> What filesystem are you using?
>

On host, the VM image and disk images are based on ext4 filesystem.
In guest, the '/' uses xfs filesystem, and the disks are raw devices.

guest# df -hT
Filesystem              Type      Size  Used Avail Use% Mounted on
devtmpfs                devtmpfs   16G     0   16G   0% /dev
tmpfs                   tmpfs      16G     0   16G   0% /dev/shm
tmpfs                   tmpfs      16G  976K   16G   1% /run
/dev/mapper/fedora-root xfs       8.0G  3.2G  4.9G  40% /
tmpfs                   tmpfs      16G     0   16G   0% /tmp
/dev/sda1               xfs      1014M  181M  834M  18% /boot
tmpfs                   tmpfs     3.2G     0  3.2G   0% /run/user/0

guest# lsblk
NAME            MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda               8:0    0  10G  0 disk
├─sda1            8:1    0   1G  0 part /boot
└─sda2            8:2    0   9G  0 part
  ├─fedora-root 253:0    0   8G  0 lvm  /
  └─fedora-swap 253:1    0   1G  0 lvm  [SWAP]
vda             252:0    0  10G  0 disk
vdb             252:16   0  10G  0 disk
vdc             252:32   0  10G  0 disk
vdd             252:48   0  10G  0 disk

Thanks,
Zhenyu
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On 19/10/20 14:40, Zhenyu Ye wrote:
> The kernel backtrace for io_submit in GUEST is:
>
> guest# ./offcputime -K -p `pgrep -nx fio`
> b'finish_task_switch'
> b'__schedule'
> b'schedule'
> b'io_schedule'
> b'blk_mq_get_tag'
> b'blk_mq_get_request'
> b'blk_mq_make_request'
> b'generic_make_request'
> b'submit_bio'
> b'blkdev_direct_IO'
> b'generic_file_read_iter'
> b'aio_read'
> b'io_submit_one'
> b'__x64_sys_io_submit'
> b'do_syscall_64'
> b'entry_SYSCALL_64_after_hwframe'
> -fio (1464)
> 40031912
>
> And Linux io_uring can avoid the latency problem.

What filesystem are you using?

Paolo
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
Hi Stefan,

On 2020/10/13 18:00, Stefan Hajnoczi wrote:
>
> Sorry, I lost track of this on-going email thread.
>
> Thanks for the backtrace. It shows the io_submit call is done while the
> AioContext lock is held. The monitor thread is waiting for the
> IOThread's AioContext lock. vcpus threads can get stuck waiting on the
> big QEMU lock (BQL) that is held by the monitor in the meantime.
>
> Please collect the kernel backtrace for io_submit so we can understand
> why multi-second io_submit latencies happen.
>
> I also suggest trying aio=io_uring to check if Linux io_uring avoids the
> latency problem.
>

The kernel backtrace for io_submit in GUEST is:

guest# ./offcputime -K -p `pgrep -nx fio`
b'finish_task_switch'
b'__schedule'
b'schedule'
b'io_schedule'
b'blk_mq_get_tag'
b'blk_mq_get_request'
b'blk_mq_make_request'
b'generic_make_request'
b'submit_bio'
b'blkdev_direct_IO'
b'generic_file_read_iter'
b'aio_read'
b'io_submit_one'
b'__x64_sys_io_submit'
b'do_syscall_64'
b'entry_SYSCALL_64_after_hwframe'
-fio (1464)
40031912

And Linux io_uring can avoid the latency problem.

Thanks,
Zhenyu
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On Mon, Sep 21, 2020 at 11:14:35AM +, Fam Zheng wrote:
> On 2020-09-19 10:22, Zhenyu Ye wrote:
> > On 2020/9/18 22:06, Fam Zheng wrote:
> > >
> > > I can see how blocking in a slow io_submit can cause trouble for main
> > > thread. I think one way to fix it (until it's made truly async in new
> > > kernels) is moving the io_submit call to thread pool, and wrapped in a
> > > coroutine, perhaps.
> > >
> >
> > I'm not sure if any other operation will block the main thread, other
> > than io_submit().
>
> Then that's a problem with io_submit which should be fixed. Or more
> precisely, that is a long held lock that we should avoid in QEMU's event
> loops.
>
> > > I'm not sure qmp timeout is a complete solution because we would still
> > > suffer from a blocked state for a period, in this exact situation before
> > > the timeout.
> >
> > Anyway, the qmp timeout may be the last measure to prevent the VM
> > soft lockup.
>
> Maybe, but I don't think baking such a workaround into the QMP API is a
> good idea. No QMP command should be synchronously long running, so
> having a timeout parameter is just a wrong design.

Sorry, I lost track of this on-going email thread.

Thanks for the backtrace. It shows the io_submit call is done while the
AioContext lock is held. The monitor thread is waiting for the
IOThread's AioContext lock. vcpus threads can get stuck waiting on the
big QEMU lock (BQL) that is held by the monitor in the meantime.

Please collect the kernel backtrace for io_submit so we can understand
why multi-second io_submit latencies happen.

I also suggest trying aio=io_uring to check if Linux io_uring avoids the
latency problem.

Stefan
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On 2020-09-19 10:22, Zhenyu Ye wrote:
> On 2020/9/18 22:06, Fam Zheng wrote:
> >
> > I can see how blocking in a slow io_submit can cause trouble for main
> > thread. I think one way to fix it (until it's made truly async in new
> > kernels) is moving the io_submit call to thread pool, and wrapped in a
> > coroutine, perhaps.
> >
>
> I'm not sure if any other operation will block the main thread, other
> than io_submit().

Then that's a problem with io_submit which should be fixed. Or more
precisely, that is a long held lock that we should avoid in QEMU's event
loops.

> > I'm not sure qmp timeout is a complete solution because we would still
> > suffer from a blocked state for a period, in this exact situation before
> > the timeout.
>
> Anyway, the qmp timeout may be the last measure to prevent the VM
> soft lockup.

Maybe, but I don't think baking such a workaround into the QMP API is a
good idea. No QMP command should be synchronously long running, so
having a timeout parameter is just a wrong design.

Thanks,
Fam
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On 2020/9/18 22:06, Fam Zheng wrote:
>
> I can see how blocking in a slow io_submit can cause trouble for main
> thread. I think one way to fix it (until it's made truly async in new
> kernels) is moving the io_submit call to thread pool, and wrapped in a
> coroutine, perhaps.
>

I'm not sure if any other operation will block the main thread, other
than io_submit().

> I'm not sure qmp timeout is a complete solution because we would still
> suffer from a blocked state for a period, in this exact situation before
> the timeout.

Anyway, the qmp timeout may be the last measure to prevent the VM
soft lockup. Ideally, after your fix of io_submit, this mechanism will
never be triggered.

Thanks,
Zhenyu
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On 2020-09-18 19:23, Zhenyu Ye wrote: > Thread 5 (LWP 4802): > #0 0x83086b54 in syscall () at /lib64/libc.so.6 > #1 0x834598b8 in io_submit () at /lib64/libaio.so.1 > #2 0xe851e89c in ioq_submit (s=0xfffd3c001bb0) at > ../block/linux-aio.c:299 > #3 0xe851eb50 in laio_io_unplug (bs=0xef0f2340, > s=0xfffd3c001bb0) > at ../block/linux-aio.c:344 > #4 0xe8559f1c in raw_aio_unplug (bs=0xef0f2340) at > ../block/file-posix.c:2063 > #5 0xe8538344 in bdrv_io_unplug (bs=0xef0f2340) at > ../block/io.c:3135 > #6 0xe8538360 in bdrv_io_unplug (bs=0xef0eb020) at > ../block/io.c:3140 > #7 0xe8496104 in blk_io_unplug (blk=0xef0e8f20) > at ../block/block-backend.c:2147 > #8 0xe830e1a4 in virtio_blk_handle_vq (s=0xf0374280, > vq=0x700fc1d8) > at ../hw/block/virtio-blk.c:796 > #9 0xe82e6b68 in virtio_blk_data_plane_handle_output > (vdev=0xf0374280, vq=0x700fc1d8) at > ../hw/block/dataplane/virtio-blk.c:165 > #10 0xe83878fc in virtio_queue_notify_aio_vq (vq=0x700fc1d8) > at ../hw/virtio/virtio.c:2325 > #11 0xe838ab50 in virtio_queue_host_notifier_aio_poll > (opaque=0x700fc250) > at ../hw/virtio/virtio.c:3545 > #12 0xe85fab3c in run_poll_handlers_once > (ctx=0xef0a87b0, now=77604310618960, timeout=0x73ffdf78) > at ../util/aio-posix.c:398 > #13 0xe85fae5c in run_poll_handlers > (ctx=0xef0a87b0, max_ns=4000, timeout=0x73ffdf78) at > ../util/aio-posix.c:492 > #14 0xe85fb078 in try_poll_mode (ctx=0xef0a87b0, > timeout=0x73ffdf78) > at ../util/aio-posix.c:535 > #15 0xe85fb180 in aio_poll (ctx=0xef0a87b0, blocking=true) > at ../util/aio-posix.c:571 > #16 0xe8027004 in iothread_run (opaque=0xeee79a00) at > ../iothread.c:73 > #17 0xe85f269c in qemu_thread_start (args=0xef0a8d10) > at ../util/qemu-thread-posix.c:521 > #18 0x831428bc in () at /lib64/libpthread.so.0 > #19 0x8308aa1c in () at /lib64/libc.so.6 I can see how blocking in a slow io_submit can cause trouble for main thread. 
I think one way to fix it (until it's made truly async in new kernels)
is moving the io_submit call to thread pool, and wrapped in a coroutine,
perhaps.

I'm not sure qmp timeout is a complete solution because we would still
suffer from a blocked state for a period, in this exact situation before
the timeout.

Fam
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
Hi Stefan, Fam, On 2020/9/18 0:01, Fam Zheng wrote: > On 2020-09-17 16:44, Stefan Hajnoczi wrote: >> On Thu, Sep 17, 2020 at 03:36:57PM +0800, Zhenyu Ye wrote: >>> When the hang occurs, the QEMU is blocked at: >>> >>> #0 0x95762b64 in ?? () from target:/usr/lib64/libpthread.so.0 >>> #1 0x9575bd88 in pthread_mutex_lock () from >>> target:/usr/lib64/libpthread.so.0 >>> #2 0xbb1f5948 in qemu_mutex_lock_impl (mutex=0xcc8e1860, >>> file=0xbb4e1bd0 >>> "/Images/eillon/CODE/5-opensource/qemu/util/async.c", line=605) >>> #3 0xbb20acd4 in aio_context_acquire (ctx=0xcc8e1800) >>> #4 0xbb105e90 in bdrv_query_image_info (bs=0xcc934620, >>> p_info=0xccc41e18, errp=0xca669118) >>> #5 0xbb105968 in bdrv_block_device_info (blk=0xcdca19f0, >>> bs=0xcc934620, >>> flat=false, errp=0xca6692b8) >>> #6 0xbb1063dc in bdrv_query_info (blk=0xcdca19f0, >>> p_info=0xcd29c9a8, >>> errp=0xca6692b8) >>> #7 0xbb106c14 in qmp_query_block (errp=0x0) >>> #8 0xbacb8e6c in hmp_info_block (mon=0xca6693d0, >>> qdict=0xcd089790) >> >> Great, this shows that the main loop thread is stuck waiting for the >> AioContext lock. >> >> Please post backtraces from all QEMU threads ((gdb) thread apply all bt) >> so we can figure out which thread is holding up the main loop. > > I think that is reflected in the perf backtrace posted by Zheny already: > > And in the host, the information of sys_enter_io_submit() is: > > Samples: 3K of event 'syscalls:sys_enter_io_submit', Event count > (approx.): 3150 >Children Self Trace output >- 66.70%66.70% ctx_id: 0x9c044000, >nr: 0x0001, iocbpp: 0x9f7fad28 >0xae7f871c >0xae8a27c4 >qemu_thread_start >iothread_run >aio_poll >aio_dispatch_ready_handlers >aio_dispatch_handler >virtio_queue_host_notifier_aio_read >virtio_queue_notify_aio_vq >virtio_blk_data_plane_handle_output >virtio_blk_handle_vq >blk_io_unplug >bdrv_io_unplug >bdrv_io_unplug >raw_aio_unplug >laio_io_unplug >syscall > > So the iothread is blocked by a slow io_submit holding the AioContext > lock. 
> > It would be interesting to know what in kernel is blocking io_submit > from returning. > Sorry, I only get the backtraces of all QEMU threads, like below: (gdb) thread apply all bt Thread 35 (LWP 49700): #0 0x83148fe4 in pthread_cond_timedwait () at /lib64/libpthread.so.0 #1 0xe85f2080 in qemu_sem_timedwait (sem=0xef12e748, ms=1) at ../util/qemu-thread-posix.c:282 #2 0xe861c600 in worker_thread (opaque=0xef12e6c0) at ../util/thread-pool.c:91 #3 0xe85f269c in qemu_thread_start (args=0xf0204be0) at ../util/qemu-thread-posix.c:521 #4 0x831428bc in () at /lib64/libpthread.so.0 #5 0x8308aa1c in () at /lib64/libc.so.6 Thread 34 (LWP 45337): #0 0x83148fe4 in pthread_cond_timedwait () at /lib64/libpthread.so.0 #1 0xe85f2080 in qemu_sem_timedwait (sem=0xef12e748, ms=1) at ../util/qemu-thread-posix.c:282 #2 0xe861c600 in worker_thread (opaque=0xef12e6c0) at ../util/thread-pool.c:91 #3 0xe85f269c in qemu_thread_start (args=0xfffcc8000b20) at ../util/qemu-thread-posix.c:521 #4 0x831428bc in () at /lib64/libpthread.so.0 #5 0x8308aa1c in () at /lib64/libc.so.6 Thread 33 (LWP 45336): #0 0x83148fe4 in pthread_cond_timedwait () at /lib64/libpthread.so.0 #1 0xe85f2080 in qemu_sem_timedwait (sem=0xef12e748, ms=1) at ../util/qemu-thread-posix.c:282 #2 0xe861c600 in worker_thread (opaque=0xef12e6c0) at ../util/thread-pool.c:91 #3 0xe85f269c in qemu_thread_start (args=0xfffd14000b20) at ../util/qemu-thread-posix.c:521 #4 0x831428bc in () at /lib64/libpthread.so.0 #5 0x8308aa1c in () at /lib64/libc.so.6 Thread 32 (LWP 45335): #0 0x83148fe4 in pthread_cond_timedwait () at /lib64/libpthread.so.0 #1 0xe85f2080 in qemu_sem_timedwait (sem=0xef12e748, ms=1) at ../util/qemu-thread-posix.c:282 #2 0xe861c600 in worker_thread (opaque=0xef12e6c0) at ../util/thread-pool.c:91 #3 0xe85f269c in qemu_thread_start (args=0xefb00f00) at ../util/qemu-thread-posix.c:521 #4 0x831428bc in () at /lib64/libpthread.so.0 #5 0x8308aa1c in () at /lib64/libc.so.6 Thread 12 (LWP 4849): #0 0x83082a0c in i
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On 2020-09-17 16:44, Stefan Hajnoczi wrote: > On Thu, Sep 17, 2020 at 03:36:57PM +0800, Zhenyu Ye wrote: > > When the hang occurs, the QEMU is blocked at: > > > > #0 0x95762b64 in ?? () from target:/usr/lib64/libpthread.so.0 > > #1 0x9575bd88 in pthread_mutex_lock () from > > target:/usr/lib64/libpthread.so.0 > > #2 0xbb1f5948 in qemu_mutex_lock_impl (mutex=0xcc8e1860, > > file=0xbb4e1bd0 > > "/Images/eillon/CODE/5-opensource/qemu/util/async.c", line=605) > > #3 0xbb20acd4 in aio_context_acquire (ctx=0xcc8e1800) > > #4 0xbb105e90 in bdrv_query_image_info (bs=0xcc934620, > > p_info=0xccc41e18, errp=0xca669118) > > #5 0xbb105968 in bdrv_block_device_info (blk=0xcdca19f0, > > bs=0xcc934620, > > flat=false, errp=0xca6692b8) > > #6 0xbb1063dc in bdrv_query_info (blk=0xcdca19f0, > > p_info=0xcd29c9a8, > > errp=0xca6692b8) > > #7 0xbb106c14 in qmp_query_block (errp=0x0) > > #8 0xbacb8e6c in hmp_info_block (mon=0xca6693d0, > > qdict=0xcd089790) > > Great, this shows that the main loop thread is stuck waiting for the > AioContext lock. > > Please post backtraces from all QEMU threads ((gdb) thread apply all bt) > so we can figure out which thread is holding up the main loop. I think that is reflected in the perf backtrace posted by Zheny already: And in the host, the information of sys_enter_io_submit() is: Samples: 3K of event 'syscalls:sys_enter_io_submit', Event count (approx.): 3150 Children Self Trace output - 66.70%66.70% ctx_id: 0x9c044000, nr: 0x0001, iocbpp: 0x9f7fad28 0xae7f871c 0xae8a27c4 qemu_thread_start iothread_run aio_poll aio_dispatch_ready_handlers aio_dispatch_handler virtio_queue_host_notifier_aio_read virtio_queue_notify_aio_vq virtio_blk_data_plane_handle_output virtio_blk_handle_vq blk_io_unplug bdrv_io_unplug bdrv_io_unplug raw_aio_unplug laio_io_unplug syscall So the iothread is blocked by a slow io_submit holding the AioContext lock. It would be interesting to know what in kernel is blocking io_submit from returning. Fam
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On Thu, Sep 17, 2020 at 03:36:57PM +0800, Zhenyu Ye wrote:
> When the hang occurs, the QEMU is blocked at:
>
> #0 0x95762b64 in ?? () from target:/usr/lib64/libpthread.so.0
> #1 0x9575bd88 in pthread_mutex_lock () from target:/usr/lib64/libpthread.so.0
> #2 0xbb1f5948 in qemu_mutex_lock_impl (mutex=0xcc8e1860,
>     file=0xbb4e1bd0 "/Images/eillon/CODE/5-opensource/qemu/util/async.c", line=605)
> #3 0xbb20acd4 in aio_context_acquire (ctx=0xcc8e1800)
> #4 0xbb105e90 in bdrv_query_image_info (bs=0xcc934620,
>     p_info=0xccc41e18, errp=0xca669118)
> #5 0xbb105968 in bdrv_block_device_info (blk=0xcdca19f0, bs=0xcc934620,
>     flat=false, errp=0xca6692b8)
> #6 0xbb1063dc in bdrv_query_info (blk=0xcdca19f0, p_info=0xcd29c9a8,
>     errp=0xca6692b8)
> #7 0xbb106c14 in qmp_query_block (errp=0x0)
> #8 0xbacb8e6c in hmp_info_block (mon=0xca6693d0, qdict=0xcd089790)

Great, this shows that the main loop thread is stuck waiting for the
AioContext lock.

Please post backtraces from all QEMU threads ((gdb) thread apply all bt)
so we can figure out which thread is holding up the main loop.

Thanks,
Stefan
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On 2020-09-17 15:36, Zhenyu Ye wrote: > Hi Stefan, > > On 2020/9/14 21:27, Stefan Hajnoczi wrote: > >> > >> Theoretically, everything running in an iothread is asynchronous. However, > >> some 'asynchronous' actions are not non-blocking entirely, such as > >> io_submit(). This will block while the iodepth is too big and I/O pressure > >> is too high. If we do some qmp actions, such as 'info block', at this > >> time, > >> may cause vm soft lockup. This series can make these qmp actions safer. > >> > >> I constructed the scene as follow: > >> 1. create a vm with 4 disks, using iothread. > >> 2. add press to the CPU on the host. In my scene, the CPU usage exceeds > >> 95%. > >> 3. add press to the 4 disks in the vm at the same time. I used the fio and > >> some parameters are: > >> > >> fio -rw=randrw -bs=1M -size=1G -iodepth=512 -ioengine=libaio -numjobs=4 > >> > >> 4. do block query actions, for example, by virsh: > >> > >>virsh qemu-monitor-command [vm name] --hmp info block > >> > >> Then the vm will soft lockup, the calltrace is: > >> > >> [ 192.311393] watchdog: BUG: soft lockup - CPU#1 stuck for 42s! > >> [kworker/1:1:33] > > > > Hi, > > Sorry I haven't had time to investigate this myself yet. > > > > Do you also have a QEMU backtrace when the hang occurs? > > > > Let's find out if QEMU is stuck in the io_submit(2) syscall or whether > > there's an issue in QEMU itself that causes the softlockup (for example, > > aio_poll() with the global mutex held). > > Sorry for the delay. > > I traced the io_submit() within the guest. 
The maximum execution time exceeds > 16s: > > guest# perf trace -p `pidof -s fio` --duration 1 > 14.808 (3843.082 ms): io_submit(ctx_id: 281472924459008, nr: 1, > iocbpp: 0x19a87160 ) = 1 > 3861.336 (11470.608 ms): io_submit(ctx_id: 281472924459008, nr: 1, > iocbpp: 0x19a87160 ) = 1 >15341.998 (479.283 ms): io_submit(ctx_id: 281472924459008, nr: 1, > iocbpp: 0x19a87160 ) = 1 >15831.380 (3704.997 ms): io_submit(ctx_id: 281472924459008, nr: 1, > iocbpp: 0x19a87160 ) = 1 >19547.869 (3412.839 ms): io_submit(ctx_id: 281472924459008, nr: 1, > iocbpp: 0x19a87160 ) = 1 >22966.953 (1080.854 ms): io_submit(ctx_id: 281472924459008, nr: 1, > iocbpp: 0x19a87160 ) = 1 >24062.689 (6939.813 ms): io_submit(ctx_id: 281472924459008, nr: 1, > iocbpp: 0x19a87160 ) = 1 >31019.219 (532.147 ms): io_submit(ctx_id: 281472924459008, nr: 1, > iocbpp: 0x19a87160 ) = 1 >31556.610 (3469.920 ms): io_submit(ctx_id: 281472924459008, nr: 1, > iocbpp: 0x19a87160 ) = 1 >35038.885 (9007.175 ms): io_submit(ctx_id: 281472924459008, nr: 1, > iocbpp: 0x19a87160 ) = 1 >44053.578 (16006.405 ms): io_submit(ctx_id: 281472924459008, nr: 1, > iocbpp: 0x19a87160 ) = 1 >60068.944 (3068.748 ms): io_submit(ctx_id: 281472924459008, nr: 1, > iocbpp: 0x19a87160 ) = 1 >63138.474 (13012.499 ms): io_getevents(ctx_id: 281472924459008, > min_nr: 511, nr: 511, events: 0x19a83150) = 511 >76191.073 (2872.657 ms): io_submit(ctx_id: 281472924459008, nr: 1, > iocbpp: 0x19a87160 ) = 1 > ... 
> > And in the host, the information of sys_enter_io_submit() is: > > Samples: 3K of event 'syscalls:sys_enter_io_submit', Event count > (approx.): 3150 > Children Self Trace output > - 66.70%66.70% ctx_id: 0x9c044000, nr: 0x0001, iocbpp: > 0x9f7fad28 > 0xae7f871c > 0xae8a27c4 > qemu_thread_start > iothread_run > aio_poll > aio_dispatch_ready_handlers > aio_dispatch_handler > virtio_queue_host_notifier_aio_read > virtio_queue_notify_aio_vq > virtio_blk_data_plane_handle_output > virtio_blk_handle_vq > blk_io_unplug > bdrv_io_unplug > bdrv_io_unplug > raw_aio_unplug > laio_io_unplug > syscall > > > When the hang occurs, the QEMU is blocked at: > > #0 0x95762b64 in ?? () from target:/usr/lib64/libpthread.so.0 > #1 0x9575bd88 in pthread_mutex_lock () from > target:/usr/lib64/libpthread.so.0 > #2 0xbb1f5948 in qemu_mutex_lock_impl (mutex=0xcc8e1860, > file=0xbb4e1bd0 > "/Images/eillon/CODE/5-opensource/qemu/util/async.c", line=605) > #3 0xbb20acd4 in aio_context_acquire (ctx=0xcc8e1800) > #4 0xbb105e90 in bdrv_query_image_info (bs=0xcc934620, > p_info=0xccc41e18, errp=0xca669118) > #5 0xbb105968 in bdrv_block_device_info (blk=0xcdca19f0, > bs=0xcc934620, > flat=false, errp=0xca6692b8) > #6 0xbb1063dc in bdrv_query_info (blk=0xcdca19f0, > p_info=0xcd29c9a8, > errp=0xca6692b8) > #7 0xbb106c14 in qmp_query_block (errp=0x0) >
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
Hi Daniel, On 2020/9/14 22:42, Daniel P. Berrangé wrote: > On Tue, Aug 11, 2020 at 09:54:08PM +0800, Zhenyu Ye wrote: >> Hi Kevin, >> >> On 2020/8/10 23:38, Kevin Wolf wrote: >>> Am 10.08.2020 um 16:52 hat Zhenyu Ye geschrieben: Before doing qmp actions, we need to lock the qemu_global_mutex, so the qmp actions should not take too long time. Unfortunately, some qmp actions need to acquire aio context and this may take a long time. The vm will soft lockup if this time is too long. >>> >>> Do you have a specific situation in mind where getting the lock of an >>> AioContext can take a long time? I know that the main thread can >>> block for considerable time, but QMP commands run in the main thread, so >>> this patch doesn't change anything for this case. It would be effective >>> if an iothread blocks, but shouldn't everything running in an iothread >>> be asynchronous and therefore keep the AioContext lock only for a short >>> time? >>> >> >> Theoretically, everything running in an iothread is asynchronous. However, >> some 'asynchronous' actions are not non-blocking entirely, such as >> io_submit(). This will block while the iodepth is too big and I/O pressure >> is too high. If we do some qmp actions, such as 'info block', at this time, >> may cause vm soft lockup. This series can make these qmp actions safer. >> >> I constructed the scene as follow: >> 1. create a vm with 4 disks, using iothread. >> 2. add press to the CPU on the host. In my scene, the CPU usage exceeds 95%. >> 3. add press to the 4 disks in the vm at the same time. I used the fio and >> some parameters are: >> >> fio -rw=randrw -bs=1M -size=1G -iodepth=512 -ioengine=libaio -numjobs=4 >> >> 4. do block query actions, for example, by virsh: >> >> virsh qemu-monitor-command [vm name] --hmp info block >> >> Then the vm will soft lockup, the calltrace is: > > [snip] > >> This problem can be avoided after this series applied. > > At what cost though ? 
With this timeout, QEMU is going to start
> reporting bogus failures for various QMP commands when running
> under high load, even if those commands would actually run
> successfully. This will turn into an error report from libvirt
> which will in turn probably cause an error in the mgmt application
> using libvirt, and in turn could break the user's automation.
>

I think it's worth reporting an error to avoid the VM softlockup. The VM
may even crash if kernel.softlockup_panic is configured! We can increase
the timeout value (close to the VM cpu soft lockup time) to avoid
unnecessary errors.

Thanks,
Zhenyu
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
Hi Stefan, On 2020/9/14 21:27, Stefan Hajnoczi wrote: >> >> Theoretically, everything running in an iothread is asynchronous. However, >> some 'asynchronous' actions are not non-blocking entirely, such as >> io_submit(). This will block while the iodepth is too big and I/O pressure >> is too high. If we do some qmp actions, such as 'info block', at this time, >> may cause vm soft lockup. This series can make these qmp actions safer. >> >> I constructed the scene as follow: >> 1. create a vm with 4 disks, using iothread. >> 2. add press to the CPU on the host. In my scene, the CPU usage exceeds 95%. >> 3. add press to the 4 disks in the vm at the same time. I used the fio and >> some parameters are: >> >> fio -rw=randrw -bs=1M -size=1G -iodepth=512 -ioengine=libaio -numjobs=4 >> >> 4. do block query actions, for example, by virsh: >> >> virsh qemu-monitor-command [vm name] --hmp info block >> >> Then the vm will soft lockup, the calltrace is: >> >> [ 192.311393] watchdog: BUG: soft lockup - CPU#1 stuck for 42s! >> [kworker/1:1:33] > > Hi, > Sorry I haven't had time to investigate this myself yet. > > Do you also have a QEMU backtrace when the hang occurs? > > Let's find out if QEMU is stuck in the io_submit(2) syscall or whether > there's an issue in QEMU itself that causes the softlockup (for example, > aio_poll() with the global mutex held). Sorry for the delay. I traced the io_submit() within the guest. 
The maximum execution time exceeds 16s: guest# perf trace -p `pidof -s fio` --duration 1 14.808 (3843.082 ms): io_submit(ctx_id: 281472924459008, nr: 1, iocbpp: 0x19a87160 ) = 1 3861.336 (11470.608 ms): io_submit(ctx_id: 281472924459008, nr: 1, iocbpp: 0x19a87160 ) = 1 15341.998 (479.283 ms): io_submit(ctx_id: 281472924459008, nr: 1, iocbpp: 0x19a87160 ) = 1 15831.380 (3704.997 ms): io_submit(ctx_id: 281472924459008, nr: 1, iocbpp: 0x19a87160 ) = 1 19547.869 (3412.839 ms): io_submit(ctx_id: 281472924459008, nr: 1, iocbpp: 0x19a87160 ) = 1 22966.953 (1080.854 ms): io_submit(ctx_id: 281472924459008, nr: 1, iocbpp: 0x19a87160 ) = 1 24062.689 (6939.813 ms): io_submit(ctx_id: 281472924459008, nr: 1, iocbpp: 0x19a87160 ) = 1 31019.219 (532.147 ms): io_submit(ctx_id: 281472924459008, nr: 1, iocbpp: 0x19a87160 ) = 1 31556.610 (3469.920 ms): io_submit(ctx_id: 281472924459008, nr: 1, iocbpp: 0x19a87160 ) = 1 35038.885 (9007.175 ms): io_submit(ctx_id: 281472924459008, nr: 1, iocbpp: 0x19a87160 ) = 1 44053.578 (16006.405 ms): io_submit(ctx_id: 281472924459008, nr: 1, iocbpp: 0x19a87160 ) = 1 60068.944 (3068.748 ms): io_submit(ctx_id: 281472924459008, nr: 1, iocbpp: 0x19a87160 ) = 1 63138.474 (13012.499 ms): io_getevents(ctx_id: 281472924459008, min_nr: 511, nr: 511, events: 0x19a83150) = 511 76191.073 (2872.657 ms): io_submit(ctx_id: 281472924459008, nr: 1, iocbpp: 0x19a87160 ) = 1 ... 
And in the host, the information of sys_enter_io_submit() is:

  Samples: 3K of event 'syscalls:sys_enter_io_submit', Event count (approx.): 3150
    Children      Self  Trace output
  -   66.70%    66.70%  ctx_id: 0x9c044000, nr: 0x0001, iocbpp: 0x9f7fad28
          0xae7f871c
          0xae8a27c4
          qemu_thread_start
          iothread_run
          aio_poll
          aio_dispatch_ready_handlers
          aio_dispatch_handler
          virtio_queue_host_notifier_aio_read
          virtio_queue_notify_aio_vq
          virtio_blk_data_plane_handle_output
          virtio_blk_handle_vq
          blk_io_unplug
          bdrv_io_unplug
          bdrv_io_unplug
          raw_aio_unplug
          laio_io_unplug
          syscall

When the hang occurs, the QEMU is blocked at:

  #0  0x95762b64 in ?? () from target:/usr/lib64/libpthread.so.0
  #1  0x9575bd88 in pthread_mutex_lock () from target:/usr/lib64/libpthread.so.0
  #2  0xbb1f5948 in qemu_mutex_lock_impl (mutex=0xcc8e1860, file=0xbb4e1bd0 "/Images/eillon/CODE/5-opensource/qemu/util/async.c", line=605)
  #3  0xbb20acd4 in aio_context_acquire (ctx=0xcc8e1800)
  #4  0xbb105e90 in bdrv_query_image_info (bs=0xcc934620, p_info=0xccc41e18, errp=0xca669118)
  #5  0xbb105968 in bdrv_block_device_info (blk=0xcdca19f0, bs=0xcc934620, flat=false, errp=0xca6692b8)
  #6  0xbb1063dc in bdrv_query_info (blk=0xcdca19f0, p_info=0xcd29c9a8, errp=0xca6692b8)
  #7  0xbb106c14 in qmp_query_block (errp=0x0)
  #8  0xbacb8e6c in hmp_info_block (mon=0xca6693d0, qdict=0xcd089790)
  #9  0xbb0068f0 in handle_hmp_command (mon=0xca6693d0, cmdline
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On Tue, Aug 11, 2020 at 09:54:08PM +0800, Zhenyu Ye wrote:
> Hi Kevin,
>
> On 2020/8/10 23:38, Kevin Wolf wrote:
> > Am 10.08.2020 um 16:52 hat Zhenyu Ye geschrieben:
> >> Before doing qmp actions, we need to lock the qemu_global_mutex,
> >> so the qmp actions should not take too long time.
> >>
> >> Unfortunately, some qmp actions need to acquire aio context and
> >> this may take a long time. The vm will soft lockup if this time
> >> is too long.
> >
> > Do you have a specific situation in mind where getting the lock of an
> > AioContext can take a long time? I know that the main thread can
> > block for considerable time, but QMP commands run in the main thread, so
> > this patch doesn't change anything for this case. It would be effective
> > if an iothread blocks, but shouldn't everything running in an iothread
> > be asynchronous and therefore keep the AioContext lock only for a short
> > time?
> >
>
> Theoretically, everything running in an iothread is asynchronous. However,
> some 'asynchronous' actions are not non-blocking entirely, such as
> io_submit(). This will block while the iodepth is too big and I/O pressure
> is too high. If we do some qmp actions, such as 'info block', at this time,
> may cause vm soft lockup. This series can make these qmp actions safer.
>
> I constructed the scene as follow:
> 1. create a vm with 4 disks, using iothread.
> 2. add press to the CPU on the host. In my scene, the CPU usage exceeds 95%.
> 3. add press to the 4 disks in the vm at the same time. I used the fio and
>    some parameters are:
>
>      fio -rw=randrw -bs=1M -size=1G -iodepth=512 -ioengine=libaio -numjobs=4
>
> 4. do block query actions, for example, by virsh:
>
>      virsh qemu-monitor-command [vm name] --hmp info block
>
> Then the vm will soft lockup, the calltrace is:

[snip]

> This problem can be avoided after this series applied.

At what cost though?
With this timeout, QEMU is going to start reporting bogus failures for
various QMP commands when running under high load, even if those commands
would actually run successfully. This will turn into an error report from
libvirt which will in turn probably cause an error in the mgmt application
using libvirt, and in turn could break the user's automation.

Regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On Tue, Aug 11, 2020 at 09:54:08PM +0800, Zhenyu Ye wrote:
> Hi Kevin,
>
> On 2020/8/10 23:38, Kevin Wolf wrote:
> > Am 10.08.2020 um 16:52 hat Zhenyu Ye geschrieben:
> >> Before doing qmp actions, we need to lock the qemu_global_mutex,
> >> so the qmp actions should not take too long time.
> >>
> >> Unfortunately, some qmp actions need to acquire aio context and
> >> this may take a long time. The vm will soft lockup if this time
> >> is too long.
> >
> > Do you have a specific situation in mind where getting the lock of an
> > AioContext can take a long time? I know that the main thread can
> > block for considerable time, but QMP commands run in the main thread, so
> > this patch doesn't change anything for this case. It would be effective
> > if an iothread blocks, but shouldn't everything running in an iothread
> > be asynchronous and therefore keep the AioContext lock only for a short
> > time?
> >
>
> Theoretically, everything running in an iothread is asynchronous. However,
> some 'asynchronous' actions are not non-blocking entirely, such as
> io_submit(). This will block while the iodepth is too big and I/O pressure
> is too high. If we do some qmp actions, such as 'info block', at this time,
> may cause vm soft lockup. This series can make these qmp actions safer.
>
> I constructed the scene as follow:
> 1. create a vm with 4 disks, using iothread.
> 2. add press to the CPU on the host. In my scene, the CPU usage exceeds 95%.
> 3. add press to the 4 disks in the vm at the same time. I used the fio and
>    some parameters are:
>
>      fio -rw=randrw -bs=1M -size=1G -iodepth=512 -ioengine=libaio -numjobs=4
>
> 4. do block query actions, for example, by virsh:
>
>      virsh qemu-monitor-command [vm name] --hmp info block
>
> Then the vm will soft lockup, the calltrace is:
>
> [  192.311393] watchdog: BUG: soft lockup - CPU#1 stuck for 42s! [kworker/1:1:33]

Hi,
Sorry I haven't had time to investigate this myself yet.

Do you also have a QEMU backtrace when the hang occurs?

Let's find out if QEMU is stuck in the io_submit(2) syscall or whether
there's an issue in QEMU itself that causes the softlockup (for example,
aio_poll() with the global mutex held).

Stefan
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On Tue, Aug 11, 2020 at 09:54:08PM +0800, Zhenyu Ye wrote:
> Hi Kevin,
>
> On 2020/8/10 23:38, Kevin Wolf wrote:
> > Am 10.08.2020 um 16:52 hat Zhenyu Ye geschrieben:
> >> Before doing qmp actions, we need to lock the qemu_global_mutex,
> >> so the qmp actions should not take too long time.
> >>
> >> Unfortunately, some qmp actions need to acquire aio context and
> >> this may take a long time. The vm will soft lockup if this time
> >> is too long.
> >
> > Do you have a specific situation in mind where getting the lock of an
> > AioContext can take a long time? I know that the main thread can
> > block for considerable time, but QMP commands run in the main thread, so
> > this patch doesn't change anything for this case. It would be effective
> > if an iothread blocks, but shouldn't everything running in an iothread
> > be asynchronous and therefore keep the AioContext lock only for a short
> > time?
> >
>
> Theoretically, everything running in an iothread is asynchronous. However,
> some 'asynchronous' actions are not non-blocking entirely, such as
> io_submit(). This will block while the iodepth is too big and I/O pressure
> is too high. If we do some qmp actions, such as 'info block', at this time,
> may cause vm soft lockup. This series can make these qmp actions safer.
>
> I constructed the scene as follow:
> 1. create a vm with 4 disks, using iothread.
> 2. add press to the CPU on the host. In my scene, the CPU usage exceeds 95%.
> 3. add press to the 4 disks in the vm at the same time. I used the fio and
>    some parameters are:
>
>      fio -rw=randrw -bs=1M -size=1G -iodepth=512 -ioengine=libaio -numjobs=4
>
> 4. do block query actions, for example, by virsh:
>
>      virsh qemu-monitor-command [vm name] --hmp info block

Great, thanks for describing a reproducer. Do you also have a QEMU
backtrace?
I'm curious whether io_submit(2) is blocking for 40+ seconds or whether
the softlockup is caused by QEMU code waiting to drain in-flight I/O
requests with the global mutex held.

I'll try to reproduce this here and start with:

  host# perf record -a \
        -e syscalls:sys_enter_io_submit \
        -e syscalls:sys_exit_io_submit \
        -e kvm:kvm_exit \
        -e kvm:kvm_entry

This will allow us to check the io_submit() and vmexit->vmenter timings.

I want to understand this better to be sure that the timeout is really
the only solution before merging this patch. My concern is that relying
on timeouts will hide QEMU bugs that could be fixed instead.

> Then the vm will soft lockup, the calltrace is:
>
> [  192.311393] watchdog: BUG: soft lockup - CPU#1 stuck for 42s! [kworker/1:1:33]
> [  192.314241] Kernel panic - not syncing: softlockup: hung tasks
> [  192.316370] CPU: 1 PID: 33 Comm: kworker/1:1 Kdump: loaded Tainted: G OEL 4.19.36+ #16
> [  192.318765] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
> [  192.325638] Workqueue: events drm_fb_helper_dirty_work
> [  192.327238] Call trace:
> [  192.331528]  dump_backtrace+0x0/0x198
> [  192.332695]  show_stack+0x24/0x30
> [  192.334219]  dump_stack+0xa4/0xcc
> [  192.335578]  panic+0x12c/0x314
> [  192.336565]  watchdog_timer_fn+0x3e4/0x3e8
> [  192.339984]  __hrtimer_run_queues+0x114/0x358
> [  192.341576]  hrtimer_interrupt+0x104/0x2d8
> [  192.343247]  arch_timer_handler_virt+0x38/0x58
> [  192.345074]  handle_percpu_devid_irq+0x90/0x248
> [  192.347238]  generic_handle_irq+0x34/0x50
> [  192.349536]  __handle_domain_irq+0x68/0xc0
> [  192.351193]  gic_handle_irq+0x6c/0x150
> [  192.352639]  el1_irq+0xb8/0x140
> [  192.353855]  vp_notify+0x28/0x38 [virtio_pci]
> [  192.355646]  virtqueue_kick+0x3c/0x78 [virtio_ring]
> [  192.357539]  virtio_gpu_queue_ctrl_buffer_locked+0x180/0x248 [virtio_gpu]
> [  192.359869]  virtio_gpu_queue_ctrl_buffer+0x50/0x78 [virtio_gpu]
> [  192.361456]  virtio_gpu_cmd_resource_flush+0x8c/0xb0 [virtio_gpu]
> [  192.363422]  virtio_gpu_surface_dirty+0x60/0x110 [virtio_gpu]
> [  192.365215]  virtio_gpu_framebuffer_surface_dirty+0x34/0x48 [virtio_gpu]
> [  192.367452]  drm_fb_helper_dirty_work+0x178/0x1c0
> [  192.368912]  process_one_work+0x1b4/0x3f8
> [  192.370192]  worker_thread+0x54/0x470
> [  192.371370]  kthread+0x134/0x138
> [  192.379241]  ret_from_fork+0x10/0x18
> [  192.382688] kernel fault(0x5) notification starting on CPU 1
> [  192.385059] kernel fault(0x5) notification finished on CPU 1
> [  192.387311] SMP: stopping secondary CPUs
> [  192.391024] Kernel Offset: disabled
> [  192.392111] CPU features: 0x0,a1806008
> [  192.393306] Memory Limit: none
> [  192.396701] Starting crashdump kernel...
> [  192.397821] Bye!
>
> This problem can be avoided after this series applied.
>
> Thanks,
> Zhenyu
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
Hi Stefan,

On 2020/8/12 21:51, Stefan Hajnoczi wrote:
> On Mon, Aug 10, 2020 at 10:52:44PM +0800, Zhenyu Ye wrote:
>> Before doing qmp actions, we need to lock the qemu_global_mutex,
>> so the qmp actions should not take too long time.
>>
>> Unfortunately, some qmp actions need to acquire aio context and
>> this may take a long time. The vm will soft lockup if this time
>> is too long.
>>
>> So add a timeout mechanism while doing qmp actions.
>
> aio_context_acquire_timeout() is a workaround for code that blocks the
> event loop. Ideally there should be no code that blocks the event loop.
>
> Which cases have you found where the event loop is blocked?
>

Currently I have only found that io_submit() will block while I/O
pressure is too high; for details, see:

https://lore.kernel.org/qemu-devel/c6d75e49-3e36-6a76-fdc8-cdf09e7c3...@huawei.com/

io_submit() cannot be guaranteed to be non-blocking at all times.

> I think they should be investigated and fixed (if possible) before
> introducing an API like aio_context_acquire_timeout().
>

We cannot ensure that everything in an iothread is non-blocking, because
some actions look asynchronous but can still block at times (such as
io_submit). Anyway, the _timeout() API can make the qmp commands that
need to take the aio_context safer.

Thanks,
Zhenyu
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
On Mon, Aug 10, 2020 at 10:52:44PM +0800, Zhenyu Ye wrote:
> Before doing qmp actions, we need to lock the qemu_global_mutex,
> so the qmp actions should not take too long time.
>
> Unfortunately, some qmp actions need to acquire aio context and
> this may take a long time. The vm will soft lockup if this time
> is too long.
>
> So add a timeout mechanism while doing qmp actions.

aio_context_acquire_timeout() is a workaround for code that blocks the
event loop. Ideally there should be no code that blocks the event loop.

Which cases have you found where the event loop is blocked?

I think they should be investigated and fixed (if possible) before
introducing an API like aio_context_acquire_timeout().

Stefan
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
Hi Kevin,

On 2020/8/10 23:38, Kevin Wolf wrote:
> Am 10.08.2020 um 16:52 hat Zhenyu Ye geschrieben:
>> Before doing qmp actions, we need to lock the qemu_global_mutex,
>> so the qmp actions should not take too long time.
>>
>> Unfortunately, some qmp actions need to acquire aio context and
>> this may take a long time. The vm will soft lockup if this time
>> is too long.
>
> Do you have a specific situation in mind where getting the lock of an
> AioContext can take a long time? I know that the main thread can
> block for considerable time, but QMP commands run in the main thread, so
> this patch doesn't change anything for this case. It would be effective
> if an iothread blocks, but shouldn't everything running in an iothread
> be asynchronous and therefore keep the AioContext lock only for a short
> time?
>

Theoretically, everything running in an iothread is asynchronous. However,
some 'asynchronous' actions are not non-blocking entirely, such as
io_submit(). This will block while the iodepth is too big and I/O pressure
is too high. If we do some qmp actions, such as 'info block', at this time,
may cause vm soft lockup. This series can make these qmp actions safer.

I constructed the scene as follow:
1. create a vm with 4 disks, using iothread.
2. add press to the CPU on the host. In my scene, the CPU usage exceeds 95%.
3. add press to the 4 disks in the vm at the same time. I used the fio and
   some parameters are:

     fio -rw=randrw -bs=1M -size=1G -iodepth=512 -ioengine=libaio -numjobs=4

4. do block query actions, for example, by virsh:

     virsh qemu-monitor-command [vm name] --hmp info block

Then the vm will soft lockup, the calltrace is:

[  192.311393] watchdog: BUG: soft lockup - CPU#1 stuck for 42s! [kworker/1:1:33]
[  192.314241] Kernel panic - not syncing: softlockup: hung tasks
[  192.316370] CPU: 1 PID: 33 Comm: kworker/1:1 Kdump: loaded Tainted: G OEL 4.19.36+ #16
[  192.318765] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  192.325638] Workqueue: events drm_fb_helper_dirty_work
[  192.327238] Call trace:
[  192.331528]  dump_backtrace+0x0/0x198
[  192.332695]  show_stack+0x24/0x30
[  192.334219]  dump_stack+0xa4/0xcc
[  192.335578]  panic+0x12c/0x314
[  192.336565]  watchdog_timer_fn+0x3e4/0x3e8
[  192.339984]  __hrtimer_run_queues+0x114/0x358
[  192.341576]  hrtimer_interrupt+0x104/0x2d8
[  192.343247]  arch_timer_handler_virt+0x38/0x58
[  192.345074]  handle_percpu_devid_irq+0x90/0x248
[  192.347238]  generic_handle_irq+0x34/0x50
[  192.349536]  __handle_domain_irq+0x68/0xc0
[  192.351193]  gic_handle_irq+0x6c/0x150
[  192.352639]  el1_irq+0xb8/0x140
[  192.353855]  vp_notify+0x28/0x38 [virtio_pci]
[  192.355646]  virtqueue_kick+0x3c/0x78 [virtio_ring]
[  192.357539]  virtio_gpu_queue_ctrl_buffer_locked+0x180/0x248 [virtio_gpu]
[  192.359869]  virtio_gpu_queue_ctrl_buffer+0x50/0x78 [virtio_gpu]
[  192.361456]  virtio_gpu_cmd_resource_flush+0x8c/0xb0 [virtio_gpu]
[  192.363422]  virtio_gpu_surface_dirty+0x60/0x110 [virtio_gpu]
[  192.365215]  virtio_gpu_framebuffer_surface_dirty+0x34/0x48 [virtio_gpu]
[  192.367452]  drm_fb_helper_dirty_work+0x178/0x1c0
[  192.368912]  process_one_work+0x1b4/0x3f8
[  192.370192]  worker_thread+0x54/0x470
[  192.371370]  kthread+0x134/0x138
[  192.379241]  ret_from_fork+0x10/0x18
[  192.382688] kernel fault(0x5) notification starting on CPU 1
[  192.385059] kernel fault(0x5) notification finished on CPU 1
[  192.387311] SMP: stopping secondary CPUs
[  192.391024] Kernel Offset: disabled
[  192.392111] CPU features: 0x0,a1806008
[  192.393306] Memory Limit: none
[  192.396701] Starting crashdump kernel...
[  192.397821] Bye!

This problem can be avoided after this series applied.

Thanks,
Zhenyu
Re: [PATCH v1 0/2] Add timeout mechanism to qmp actions
Am 10.08.2020 um 16:52 hat Zhenyu Ye geschrieben:
> Before doing qmp actions, we need to lock the qemu_global_mutex,
> so the qmp actions should not take too long time.
>
> Unfortunately, some qmp actions need to acquire aio context and
> this may take a long time. The vm will soft lockup if this time
> is too long.

Do you have a specific situation in mind where getting the lock of an
AioContext can take a long time? I know that the main thread can
block for considerable time, but QMP commands run in the main thread, so
this patch doesn't change anything for this case. It would be effective
if an iothread blocks, but shouldn't everything running in an iothread
be asynchronous and therefore keep the AioContext lock only for a short
time?

Kevin
[PATCH v1 0/2] Add timeout mechanism to qmp actions
Before doing qmp actions, we need to lock the qemu_global_mutex,
so the qmp actions should not take too long time.

Unfortunately, some qmp actions need to acquire aio context and
this may take a long time. The vm will soft lockup if this time
is too long.

So add a timeout mechanism while doing qmp actions.

Zhenyu Ye (2):
  util: introduce aio_context_acquire_timeout
  qmp: use aio_context_acquire_timeout replace aio_context_acquire

 block/qapi-sysemu.c         |  7 ++-
 block/qapi.c                |  6 +-
 blockdev.c                  | 35 ++-
 include/block/aio.h         |  6 ++
 include/qemu/thread-posix.h |  1 +
 include/qemu/thread.h       |  1 +
 util/async.c                | 10 ++
 util/qemu-thread-posix.c    |  6 ++
 8 files changed, 65 insertions(+), 7 deletions(-)

--
2.22.0.windows.1