Stefan Hajnoczi <stefa...@gmail.com> wrote on 21/02/2013 10:11:12 AM:
> From: Stefan Hajnoczi <stefa...@gmail.com>
> To: Loic Dachary <l...@dachary.org>
> Cc: qemu-devel <qemu-devel@nongnu.org>
> Date: 21/02/2013 10:11 AM
> Subject: Re: [Qemu-devel] Block I/O optimizations
> Sent by: qemu-devel-bounces+abelg=il.ibm....@nongnu.org
>
> On Mon, Feb 18, 2013 at 7:19 PM, Loic Dachary <l...@dachary.org> wrote:
> > I recently tried to figure out the best and easiest ways to
> > increase block I/O performance with qemu. Not being a qemu expert,
> > I expected to find a few optimization tricks. Much to my surprise,
> > it appears that there are many significant improvements being worked
> > on. This is excellent news :-)
> >
> > However, I'm not sure I understand how they all fit together. It's
> > probably quite obvious from the developer's point of view, but I would
> > very much appreciate an overview of how dataplane, vhost-blk, ELVIS,
> > etc. should be used or developed to maximize I/O performance. Are
> > there documents I should read? If not, would someone be willing to
> > share bits of wisdom?
>
> Hi Loic,
> There will be more information on dataplane shortly. I'll write up a
> blog post and share the link with you.

Hi Stefan,

I assume dataplane could provide a significant performance boost and approximate vhost-blk performance. If I understand correctly, that's because dataplane finally removes the dependency on the global mutex and uses eventfd to process notifications.

However, I am concerned that dataplane may not solve the scalability problem, because QEMU will still be running 1 thread per VCPU plus 1 per virtual I/O device for each VM. Assuming we run N VMs with 1 VCPU and 1 virtual I/O device each, we will have 2N threads competing for CPU cycles. In a cloud-like environment running I/O-intensive VMs, that could be a problem because the I/O threads and VCPU threads may starve each other. Furthermore, the Linux kernel can't make good scheduling decisions (from an I/O perspective) because it has no information about the contents of the I/O queues.
We did some experiments with a modified vhost-blk back-end that uses a single thread (or a few threads) to process I/O for many VMs, as opposed to 1 thread per VM (I/O device). These threads decide when, and for how long, to process the requests of each VM based on the I/O activity of each queue. We noticed that this model (part of what we call ELVIS) significantly improves the scalability of the system when you run many I/O-intensive guests. I was wondering if you have considered this type of threading model for dataplane as well.

With vhost-blk (or -net) it's relatively easy to use a kernel thread to process I/O for many VMs (user-space processes). However, with a QEMU back-end (like dataplane/virtio-blk) the shared-thread model may be challenging, because it requires a shared user-space process (for the I/O threads) to handle I/O for many QEMU processes.

Any thoughts/opinions on the shared-thread direction?