On Thu, Jun 26, 2014 at 11:14:16PM +0800, Ming Lei wrote:
> Hi Stefan,
>
> I found that VM block I/O throughput has decreased by more than 40%
> on my laptop, and it looks much worse in my server environment. It is
> caused by your commit 580b6b2aa2:
>
>     dataplane: use the QEMU block layer for I/O
>
> I run fio with the config below to test random read:
>
> [global]
> direct=1
> size=4G
> bsrange=4k-4k
> timeout=20
> numjobs=4
> ioengine=libaio
> iodepth=64
> filename=/dev/vdc
> group_reporting=1
>
> [f]
> rw=randread
>
> Together with the throughput drop, the latency has improved a little.
>
> With this commit, the block I/O submitted to the fs becomes much
> smaller than before, and more io_submit() calls need to be made to the
> kernel, which means the effective iodepth may become much lower.
>
> I am not surprised by the result, since I compared VM I/O performance
> between qemu and lkvm before. lkvm has no big qemu lock problem and
> handles I/O in a dedicated thread, but its block I/O is still much
> worse than qemu's in terms of throughput, because lkvm doesn't submit
> block I/O in batches the way the previous dataplane did, IMO.
>
> But now that you have changed the way I/O is submitted, could you
> share the motivation for the change? Is the throughput drop what you
> expect?
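To make the io_submit() batching point above concrete, here is a minimal,
self-contained libaio sketch (an illustration only, not QEMU's dataplane
code): it queues 64 4k reads and hands them all to the kernel with a single
io_submit() call, which is what keeps the effective iodepth at 64. Issuing
the same reads with one io_submit() per iocb performs the same I/O but costs
64 syscalls and lets the device queue drain between calls. The /dev/vdc path
and the batch size simply mirror the fio job above; build with -laio.

#define _GNU_SOURCE            /* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <libaio.h>

#define BATCH    64            /* mirrors iodepth=64 in the fio job */
#define REQ_SIZE 4096          /* mirrors bsrange=4k-4k */

int main(void)
{
    io_context_t ctx = 0;
    struct iocb iocbs[BATCH];
    struct iocb *list[BATCH];
    struct io_event events[BATCH];
    void *bufs[BATCH];
    int fd, i;

    fd = open("/dev/vdc", O_RDONLY | O_DIRECT);   /* example device */
    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (io_setup(BATCH, &ctx) < 0) {
        fprintf(stderr, "io_setup failed\n");
        return 1;
    }

    /* Prepare a whole batch of 4k reads before talking to the kernel. */
    for (i = 0; i < BATCH; i++) {
        if (posix_memalign(&bufs[i], 4096, REQ_SIZE)) {
            return 1;
        }
        io_prep_pread(&iocbs[i], fd, bufs[i], REQ_SIZE,
                      (long long)i * REQ_SIZE);
        list[i] = &iocbs[i];
    }

    /*
     * One system call hands all 64 requests to the kernel, so the
     * device sees the full queue depth at once.  A loop of
     * io_submit(ctx, 1, ...) calls would issue the same I/O but cost
     * 64 syscalls and let the queue drain between calls.
     */
    if (io_submit(ctx, BATCH, list) != BATCH) {
        fprintf(stderr, "io_submit failed\n");
        return 1;
    }
    if (io_getevents(ctx, BATCH, BATCH, events, NULL) != BATCH) {
        fprintf(stderr, "io_getevents failed\n");
        return 1;
    }

    io_destroy(ctx);
    close(fd);
    return 0;
}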
Thanks for reporting this.  40% is a serious regression.

We were expecting a regression, since the custom Linux AIO codepath has
been replaced with the QEMU block layer (which offers features like
image formats, snapshots, and I/O throttling).

Let me know if you get stuck working on a patch.  Implementing batching
sounds like a good idea.  I never measured the impact when I wrote the
ioq code; it just seemed like a natural way to structure the code.
Hopefully this 40% number is purely due to batching and we can get most
of the performance back.

Stefan
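One possible shape for the batching discussed above is a plug/unplug-style
queue, sketched below with invented names (this is not QEMU's ioq or block
layer code): callers plug the queue, enqueue any number of requests, then
unplug it, at which point everything accumulated is flushed with a single
io_submit(). It builds on the same libaio interface as the sketch earlier in
the thread.

/* Hypothetical plug/unplug request queue (illustration only). */
#include <libaio.h>

#define QUEUE_MAX 128

struct req_queue {
    io_context_t ctx;                   /* from io_setup() */
    struct iocb *pending[QUEUE_MAX];
    int n_pending;
    int plugged;                        /* >0 while callers are batching */
};

static long queue_flush(struct req_queue *q)
{
    long ret = 0;

    if (q->n_pending > 0) {
        /* One syscall for however many requests have accumulated. */
        ret = io_submit(q->ctx, q->n_pending, q->pending);
        q->n_pending = 0;
    }
    return ret;
}

static void queue_plug(struct req_queue *q)
{
    q->plugged++;
}

static long queue_unplug(struct req_queue *q)
{
    /* Flush only when the outermost caller unplugs. */
    return --q->plugged == 0 ? queue_flush(q) : 0;
}

static long queue_enqueue(struct req_queue *q, struct iocb *req)
{
    if (q->n_pending == QUEUE_MAX) {
        /* Queue is full: flush early rather than dropping requests. */
        long ret = queue_flush(q);
        if (ret < 0) {
            return ret;
        }
    }
    q->pending[q->n_pending++] = req;

    /* With no plug in effect, submit immediately (unbatched behaviour). */
    return q->plugged ? 0 : queue_flush(q);
}

A virtio-blk-style event loop could then wrap each pass over the guest's
request ring in queue_plug()/queue_unplug(), so guest requests that arrive
together leave in a single io_submit() rather than one syscall each.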