Recent performance investigation by Karl Rister shows that the guest->host notification takes around 20 us.  This is more than the "overhead" of QEMU itself (e.g. the block layer).

One way to avoid the costly exit is to use polling instead of notification.  The main drawback of polling is that it consumes CPU resources.  To benefit performance, the host must have extra CPU cycles available on physical CPUs that aren't used by the guest.

This is an experimental AioContext polling implementation.  It adds a polling callback into the event loop.  Polling functions are implemented for the virtio-blk virtqueue guest->host kick and Linux AIO completion.

The QEMU_AIO_POLL_MAX_NS environment variable sets the number of nanoseconds to poll before entering the usual blocking poll(2) syscall.  Try setting this variable to the time from old request completion to new virtqueue kick.

By default no polling is done.  QEMU_AIO_POLL_MAX_NS must be set to get any polling!
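To make the QEMU_AIO_POLL_MAX_NS semantics concrete, here is a minimal,
self-contained C sketch of the poll-then-block idea.  This is not the patch
code itself: check_progress() and wait_for_events() are illustrative
stand-ins for the per-handler poll callbacks and the event loop's wait step.

  /* Sketch: busy-poll a progress function for up to max_ns nanoseconds,
   * then fall back to a blocking poll(2).
   */
  #include <poll.h>
  #include <stdbool.h>
  #include <stdint.h>
  #include <stdlib.h>
  #include <time.h>

  static int64_t now_ns(void)
  {
      struct timespec ts;
      clock_gettime(CLOCK_MONOTONIC, &ts);
      return ts.tv_sec * INT64_C(1000000000) + ts.tv_nsec;
  }

  /* Stand-in for a poll callback: returns true if it made progress,
   * e.g. found a new virtqueue buffer or a completed Linux AIO request.
   */
  static bool check_progress(void)
  {
      return false; /* placeholder */
  }

  static void wait_for_events(struct pollfd *fds, nfds_t nfds)
  {
      const char *env = getenv("QEMU_AIO_POLL_MAX_NS");
      int64_t max_ns = env ? strtoll(env, NULL, 10) : 0;

      if (max_ns > 0) {
          int64_t deadline = now_ns() + max_ns;

          while (now_ns() < deadline) {
              if (check_progress()) {
                  return; /* progress made without a blocking syscall */
              }
          }
      }

      poll(fds, nfds, -1); /* fall back to the usual blocking wait */
  }

  int main(void)
  {
      struct pollfd pfd = { .fd = 0 /* stdin */, .events = POLLIN };

      wait_for_events(&pfd, 1);
      return 0;
  }

For example, QEMU_AIO_POLL_MAX_NS=16000 (an arbitrary illustrative value)
would busy-poll for up to 16 microseconds before falling back to the
blocking syscall.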
Karl: I hope you can try this patch series with several QEMU_AIO_POLL_MAX_NS values.  If you don't find a good value we should double-check the tracing data to see if this experimental code can be improved.

Stefan Hajnoczi (3):
  aio-posix: add aio_set_poll_handler()
  virtio: poll virtqueues for new buffers
  linux-aio: poll ring for completions

 aio-posix.c         | 133 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 block/linux-aio.c   |  17 +++++++
 hw/virtio/virtio.c  |  19 ++++++++
 include/block/aio.h |  16 +++++++
 4 files changed, 185 insertions(+)

-- 
2.7.4