On Tue, Mar 05, 2013 at 09:46:30PM +0100, Benoît Canet wrote:
> > You need to set a more specific goal.  Some questions to get started:
> >  * Which workloads do you care about and what are their
> > characteristics (sequential or random I/O, queue depth)?
> >  * Do you care about 1 vcpu guests or 4+ vcpu guests?  (SMP scalability)
> >  * Are you using an image format?
> 
> The usage would be a typical HPC workload: 4 vcpu in SMP and random IO on raw
> devices.

Okay.  If you want to do performance analysis on the existing stack,
then the next step is to choose a benchmark that represents this
workload.  Then you can collect profiles and see where there is room for
improvement.
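
As a rough illustration of the workload shape described above (several
concurrent workers issuing random, block-aligned reads against a raw
device or file), here is a minimal Python sketch.  The function names,
the 4 KiB block size and the default of 4 threads are assumptions made
for this example only.  It reads through the page cache and does not
control queue depth, so the timings are not meaningful measurements; a
real run would use a dedicated benchmark tool such as fio inside the
guest.

```python
import os
import random
import tempfile
import threading
import time


def random_read_worker(fd, num_blocks, block_size, ops, seed, counts, index):
    """Issue `ops` block-aligned reads at random offsets; record full reads."""
    rng = random.Random(seed)  # per-worker RNG so offsets are reproducible
    done = 0
    for _ in range(ops):
        offset = rng.randrange(num_blocks) * block_size
        # os.pread takes an explicit offset, so workers can share one fd.
        data = os.pread(fd, block_size, offset)
        if len(data) == block_size:
            done += 1
    counts[index] = done


def run_random_reads(path, block_size=4096, num_threads=4, ops_per_thread=1000):
    """Run the workers concurrently; return (completed_ops, elapsed_seconds)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size of regular files and of Linux
        # block devices alike.
        size = os.lseek(fd, 0, os.SEEK_END)
        num_blocks = size // block_size
        if num_blocks == 0:
            raise ValueError("target is smaller than one block")
        counts = [0] * num_threads
        threads = [
            threading.Thread(
                target=random_read_worker,
                args=(fd, num_blocks, block_size, ops_per_thread, i, counts, i),
            )
            for i in range(num_threads)
        ]
        start = time.perf_counter()
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return sum(counts), elapsed


def make_test_file(size_bytes):
    """Create a temporary file of random data to stand in for a raw device."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path
```

For example, run_random_reads(make_test_file(1 << 20)) exercises the
sketch against a 1 MiB scratch file; pointing it at a real block device
needs read permission on that device.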

In the virtio-blk data plane world, I'm currently converting core QEMU code
to support multiple AioContexts.  This is needed in order to use
BlockDriverStates from threads (without holding the global mutex).  This
is a pretty linear task.
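
To make the idea concrete, here is a toy Python model of per-context
locking: each context has its own thread, request queue and lock, so
handling a request never takes a process-wide mutex.  ToyContext and
run_contexts are hypothetical names invented for this sketch; this is
not QEMU's AioContext API or implementation, only an illustration of
the locking structure.

```python
import queue
import threading


class ToyContext:
    """Hypothetical per-thread event loop context (not QEMU's AioContext)."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()   # protects only this context's state
        self.completed = []            # state owned by this context
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run)

    def start(self):
        self._thread.start()

    def submit(self, request):
        self._requests.put(request)

    def stop(self):
        self._requests.put(None)       # sentinel: tells the loop to exit
        self._thread.join()

    def _run(self):
        while True:
            request = self._requests.get()
            if request is None:
                break
            # Only the per-context lock is taken; no global lock exists here.
            with self.lock:
                self.completed.append(request)


def run_contexts(num_contexts, requests_per_context):
    """Start the contexts, feed each its own requests, return what each handled."""
    contexts = [ToyContext(f"ctx{i}") for i in range(num_contexts)]
    for ctx in contexts:
        ctx.start()
    for ctx in contexts:
        for n in range(requests_per_context):
            ctx.submit((ctx.name, n))
    for ctx in contexts:
        ctx.stop()
    return {ctx.name: list(ctx.completed) for ctx in contexts}
```

Each context handles its requests in submission order, and contexts
never contend with each other because no lock is shared between them.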

The second half of this work is enabling device emulation code to run in
threads without the global mutex.  The trickiest part here is probably
guest memory access: making guest memory access and DMA work safely
from a thread.  I haven't really started on this.  Ping Fan Liu has
worked on it in the past, so you could chat with him to find out the
current status.

Stefan
