These patches enable Linux io_uring flags that can improve performance.

Bernd Schubert mentioned io_uring_setup(2) flags that may improve performance:
- IORING_SETUP_SINGLE_ISSUER: optimization when only 1 thread uses an io_uring context
- IORING_SETUP_COOP_TASKRUN: avoids IPIs
- IORING_SETUP_TASKRUN_FLAG: makes COOP_TASKRUN work with userspace CQ ring polling
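
For illustration only (this is not code from the patches): a minimal sketch of how a caller might request these flags, plus the IORING_SETUP_NO_SQARRAY flag that the last patch in this series enables, and fall back when an older kernel rejects an unknown flag with -EINVAL. The SKETCH_* names, ring_setup_fn, ring_setup_best_effort() and fake_setup() are all hypothetical; the flag values mirror <linux/io_uring.h> and are repeated only so the sketch builds with older headers. The fallback order is an assumption, not necessarily what util/fdmon-io_uring.c does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Same bit values as the IORING_SETUP_* flags in <linux/io_uring.h>. */
#define SKETCH_SETUP_COOP_TASKRUN  (1U << 8)
#define SKETCH_SETUP_TASKRUN_FLAG  (1U << 9)
#define SKETCH_SETUP_SINGLE_ISSUER (1U << 12)
#define SKETCH_SETUP_NO_SQARRAY    (1U << 16)

/*
 * Hypothetical callback standing in for io_uring_setup(2) or a liburing
 * wrapper: returns 0 on success or a negative errno.
 */
typedef int (*ring_setup_fn)(unsigned entries, unsigned flags, void *opaque);

/*
 * Try the full flag set first, then drop the newest flags step by step
 * while the setup call keeps failing with -EINVAL.
 * (COOP_TASKRUN/TASKRUN_FLAG: Linux 5.19, SINGLE_ISSUER: 6.0,
 * NO_SQARRAY: 6.6.)
 */
static int ring_setup_best_effort(ring_setup_fn setup, void *opaque,
                                  unsigned entries, unsigned *flags_used)
{
    static const unsigned candidates[] = {
        SKETCH_SETUP_SINGLE_ISSUER | SKETCH_SETUP_COOP_TASKRUN |
            SKETCH_SETUP_TASKRUN_FLAG | SKETCH_SETUP_NO_SQARRAY,
        SKETCH_SETUP_SINGLE_ISSUER | SKETCH_SETUP_COOP_TASKRUN |
            SKETCH_SETUP_TASKRUN_FLAG,
        SKETCH_SETUP_COOP_TASKRUN | SKETCH_SETUP_TASKRUN_FLAG,
        0,
    };
    size_t i;
    int ret = -EINVAL;

    assert(setup != NULL && flags_used != NULL);

    for (i = 0; i < sizeof(candidates) / sizeof(candidates[0]); i++) {
        ret = setup(entries, candidates[i], opaque);
        if (ret != -EINVAL) {
            break; /* success, or an error that dropping flags won't fix */
        }
    }
    if (ret == 0) {
        *flags_used = candidates[i];
    }
    return ret;
}

/*
 * Stand-in "kernel" so the sketch runs anywhere: accepts only flags that
 * are contained in the mask pointed to by opaque.
 */
static int fake_setup(unsigned entries, unsigned flags, void *opaque)
{
    unsigned supported = *(unsigned *)opaque;

    (void)entries;
    return (flags & ~supported) ? -EINVAL : 0;
}
```

With a real ring, the callback would wrap the actual setup call; fake_setup() only exists to exercise the fallback path without depending on the running kernel.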

Jens Axboe recently confirmed that SINGLE_ISSUER makes sense.

Suraj Shirvankar already started work on SINGLE_ISSUER in the past:
https://lore.kernel.org/qemu-devel/[email protected]/

Where this differs from Suraj's previous work is that I have worked around the
need for the main loop AioContext to be shared by multiple threads (vCPU
threads and the migration thread).

Here are the performance numbers for fio bs=4k in a 4 vCPU guest with 1
IOThread using a virtio-blk disk backed by a local NVMe drive:

                          IOPS               IOPS               IOPS
Benchmark                 SINGLE_ISSUER      +TASKRUN           +NO_SQARRAY
randread  iodepth=1        99108 (+0.33%)    100816 (+2.1%)     104411 (+5.7%)
randread  iodepth=64      276314 (+0.12%)    275939 (-0.012%)   275899 (-0.026%)
randwrite iodepth=1        99997 (-0.11%)    102866 (+2.8%)     105588 (+5.5%)
randwrite iodepth=64      272205 (-0.2%)     271973 (-0.29%)    273257 (+0.18%)

You can find detailed benchmarking results here including the fio output, fio
command-line, and guest libvirt domain XML:
https://gitlab.com/stefanha/virt-playbooks/-/tree/io_uring-flags/notebook/fio-output
https://gitlab.com/stefanha/virt-playbooks/-/blob/io_uring-flags/files/fio.sh
https://gitlab.com/stefanha/virt-playbooks/-/blob/io_uring-flags/files/test.xml.j2

Stefan Hajnoczi (4):
  iothread: create AioContext in iothread_run()
  aio-posix: enable IORING_SETUP_SINGLE_ISSUER
  aio-posix: enable IORING_SETUP_COOP_TASKRUN | IORING_SETUP_TASKRUN_FLAG
  aio-posix: enable IORING_SETUP_NO_SQARRAY

 include/system/iothread.h |   1 -
 iothread.c                | 140 +++++++++++++++++++++-----------------
 util/fdmon-io_uring.c     |  38 ++++++++++-
 3 files changed, 113 insertions(+), 66 deletions(-)

-- 
2.53.0
