Stefan Hajnoczi <stefa...@gmail.com> wrote on 21/02/2013 10:11:12 AM:
> From: Stefan Hajnoczi <stefa...@gmail.com>
> To: Loic Dachary <l...@dachary.org>
> Cc: qemu-devel <qemu-devel@nongnu.org>
> Date: 21/02/2013 10:11 AM
> Subject: Re: [Qemu-devel] Block I/O optimizations
> Sent by: qemu-devel-bounces+abelg=il.ibm....@nongnu.org
>
> On Mon, Feb 18, 2013 at 7:19 PM, Loic Dachary <l...@dachary.org> wrote:
> > I recently tried to figure out the best and easiest ways to
> > increase block I/O performance with qemu. Not being a qemu expert,
> > I expected to find a few optimization tricks. Much to my surprise,
> > it appears that there are many significant improvements being worked
> > on. This is excellent news :-)
> >
> > However, I'm not sure I understand how they all fit together. It's
> > probably quite obvious from the developer's point of view, but I would
> > very much appreciate an overview of how dataplane, vhost-blk, ELVIS,
> > etc. should be used or developed to maximize I/O performance. Are
> > there documents I should read? If not, would someone be willing to
> > share bits of wisdom?
>
> Hi Loic,
> There will be more information on dataplane shortly. I'll write up a
> blog post and share the link with you.

Hi Stefan,

I assume dataplane could provide a significant performance boost and approximate vhost-blk performance. If I understand correctly, that's because dataplane finally removes the dependency on the global mutex and uses eventfd to process notifications.

However, I am concerned that dataplane may not solve the scalability problem, because QEMU will still be running 1 thread per VCPU plus 1 per virtual I/O device for each VM. Assuming we run N VMs with 1 VCPU and 1 virtual I/O device each, we will have 2N threads competing for CPU cycles. In a cloud-like environment running I/O-intensive VMs, that could be a problem because the I/O threads and VCPU threads may starve each other. Furthermore, the Linux kernel can't make good scheduling decisions (from an I/O perspective) because it has no information about the contents of the I/O queues.
We did some experiments with a modified vhost-blk back-end that uses a single thread (or a few threads) to process I/O for many VMs, as opposed to 1 thread per VM (I/O device). These threads decide when, and for how long, to process the requests of each VM based on the I/O activity of each queue. We noticed that this model (part of what we call ELVIS) significantly improves the scalability of the system when you run many I/O-intensive guests. I was wondering if you have considered this type of threading model for dataplane as well.

With vhost-blk (or -net) it's relatively easy to use a kernel thread to process I/O for many VMs (user-space processes). However, with a QEMU back-end (like dataplane/virtio-blk) the shared-thread model may be challenging, because it requires a shared user-space process (for the I/O threads) to handle I/O for many QEMU processes.

Any thoughts/opinions on the shared-thread direction?