On Sat, Jun 28, 2014 at 5:51 AM, Paolo Bonzini <pbonz...@redhat.com> wrote:
> On 27/06/2014 20:01, Ming Lei wrote:
>
>> I just implemented plug&unplug based batching, and it is working now.
>> But throughput still has no obvious improvement.
>>
>> Looks like loading in the IOthread is a bit low, so I am wondering if
>> there is a blocking point caused by the QEMU block layer.
>
> What does perf say? Also, you can try using the QEMU trace subsystem and
> see where the latency goes.

Here are some test results against commit 8589744aaf07b62 of upstream
QEMU; the tests were run on my 2-core (4-thread) laptop:

1. With my draft batch patches [1] (only linux-aio is supported for now)
- throughput: +16% compared to upstream QEMU
- average time spent in handle_notify(): 310us
- average time between two handle_notify() calls: 1591us
  (this time reflects the latency of handling host_notifier)

2. Same tests on the 2.0.0 release (which uses custom Linux AIO)
- average time spent in handle_notify(): 68us
- average time between two handle_notify() calls: 269us
  (this time reflects the latency of handling host_notifier)

From the above tests, it looks like the root cause is late handling of
notify, and the QEMU block layer has become 4 times slower than the
previous custom Linux AIO used by dataplane.

[1] git://kernel.ubuntu.com/ming/qemu.git #master-0626-batch

Thanks,
--
Ming Lei
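
For reference, a minimal sketch of how the two averages quoted above could be derived from trace timestamps, and how the reported figures compare. This is not the script used for the measurements: the `notify_stats` helper and the sample timestamps are hypothetical, and it assumes the "time between two handle_notify()" is measured from one entry to the next.

```python
from statistics import mean

def notify_stats(events):
    """events: (enter_us, exit_us) pairs for successive handle_notify() calls.

    Returns (average time spent per call, average entry-to-entry interval).
    Hypothetical helper; the entry-to-entry definition is an assumption.
    """
    durations = [exit_us - enter_us for enter_us, exit_us in events]
    intervals = [b[0] - a[0] for a, b in zip(events, events[1:])]
    return mean(durations), mean(intervals)

# Made-up timestamps, purely for illustration (not measured data).
sample = [(0, 100), (400, 520), (900, 980)]
avg_duration, avg_interval = notify_stats(sample)

# Ratios of the averages reported in this mail
# (draft batch patches vs. the 2.0.0 release).
duration_ratio = 310 / 68      # time spent in handle_notify()
interval_ratio = 1591 / 269    # time between two handle_notify() calls

print(f"avg duration: {avg_duration}us, avg interval: {avg_interval}us")
print(f"duration ratio: {duration_ratio:.2f}x, interval ratio: {interval_ratio:.2f}x")
```

Both ratios come straight from the four averages listed in the results; the interval ratio is the one that reflects host_notifier handling latency.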