Re: [RFC] virtio: Support releasing lock during kick

2011-08-10 Thread Stefan Hajnoczi
On Wed, Aug 10, 2011 at 09:18:01AM -0400, Christoph Hellwig wrote:
> Any progress on these patches?

Khoa ran ffsb benchmarks on his rig and we didn't see any benefit.  I
have not started investigating yet, been working on other things.

It will be necessary to compare against the old patches which did show
an improvment last year when we ran them.  Then we'll know if I broke
something in the new patches.

If you like I can send the latest patches for you to take a look.  I
will not get a chance to play with this until after LinuxCon.

Stefan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] virtio: Support releasing lock during kick

2011-08-10 Thread Christoph Hellwig
Any progress on these patches?

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] virtio: Support releasing lock during kick

2011-06-24 Thread Stefan Hajnoczi
On Mon, Jun 20, 2011 at 4:27 PM, Stefan Hajnoczi  wrote:
> On Sun, Jun 19, 2011 at 8:14 AM, Michael S. Tsirkin  wrote:
>> On Wed, Jun 23, 2010 at 10:24:02PM +0100, Stefan Hajnoczi wrote:
>>> The virtio block device holds a lock during I/O request processing.
>>> Kicking the virtqueue while the lock is held results in long lock hold
>>> times and increases contention for the lock.
>>>
>>> This patch modifies virtqueue_kick() to optionally release a lock while
>>> notifying the host.  Virtio block is modified to pass in its lock.  This
>>> allows other vcpus to queue I/O requests during the time spent servicing
>>> the virtqueue notify in the host.
>>>
>>> The virtqueue_kick() function is modified to know about locking because
>>> it changes the state of the virtqueue and should execute with the lock
>>> held (it would not be correct for virtio block to release the lock
>>> before calling virtqueue_kick()).
>>>
>>> Signed-off-by: Stefan Hajnoczi 
>>
>> While the optimization makes sense, the API's pretty hairy IMHO.
>> Why don't we split the kick functionality instead?
>> E.g.
>>        /* Report whether host notification is necessary. */
>>        bool virtqueue_kick_prepare(struct virtqueue *vq)
>>        /* Can be done in parallel with add_buf/get_buf */
>>        void virtqueue_kick_notify(struct virtqueue *vq)
>
> This is a nice idea, it makes the code cleaner.  I am testing patches
> that implement this and after Khoa has measured the performance I will
> send them out.

Just an update that benchmarks are being run.  Will send out patches
and results as soon as they are in.

Stefan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] virtio: Support releasing lock during kick

2011-06-20 Thread Stefan Hajnoczi
On Sun, Jun 19, 2011 at 8:14 AM, Michael S. Tsirkin  wrote:
> On Wed, Jun 23, 2010 at 10:24:02PM +0100, Stefan Hajnoczi wrote:
>> The virtio block device holds a lock during I/O request processing.
>> Kicking the virtqueue while the lock is held results in long lock hold
>> times and increases contention for the lock.
>>
>> This patch modifies virtqueue_kick() to optionally release a lock while
>> notifying the host.  Virtio block is modified to pass in its lock.  This
>> allows other vcpus to queue I/O requests during the time spent servicing
>> the virtqueue notify in the host.
>>
>> The virtqueue_kick() function is modified to know about locking because
>> it changes the state of the virtqueue and should execute with the lock
>> held (it would not be correct for virtio block to release the lock
>> before calling virtqueue_kick()).
>>
>> Signed-off-by: Stefan Hajnoczi 
>
> While the optimization makes sense, the API's pretty hairy IMHO.
> Why don't we split the kick functionality instead?
> E.g.
>        /* Report whether host notification is necessary. */
>        bool virtqueue_kick_prepare(struct virtqueue *vq)
>        /* Can be done in parallel with add_buf/get_buf */
>        void virtqueue_kick_notify(struct virtqueue *vq)

This is a nice idea, it makes the code cleaner.  I am testing patches
that implement this and after Khoa has measured the performance I will
send them out.

Stefan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] virtio: Support releasing lock during kick

2011-06-19 Thread Christoph Hellwig
On Sun, Jun 19, 2011 at 10:48:41AM +0300, Michael S. Tsirkin wrote:
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 4ce953f..a8672ec 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -433,6 +433,8 @@ void blk_run_queue(struct request_queue *q)
>   spin_lock_irqsave(q->queue_lock, flags);
>   __blk_run_queue(q);
>   spin_unlock_irqrestore(q->queue_lock, flags);
> + if (q->request_done)
> + q->request_done(q);

We have quite a few cases where __blk_run_queue is called directly, and
one more (although not applicable to virtio-blk) that calls ->request_fn
directly.

I think Stefan's way is the way to go for now, releasing and reacquiring
the queue lock once in ->request_fn is much less than the common IDE and
SCSI setups do today.

Eventually ->queue_lock should be split from the driver-internal lock,
and we could do a more efficient calling convention than the one per
request blk_peek_request.  I've started looking into that, but it's
going to take a while.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] virtio: Support releasing lock during kick

2011-06-19 Thread Michael S. Tsirkin
On Wed, Jun 23, 2010 at 10:24:02PM +0100, Stefan Hajnoczi wrote:
> The virtio block device holds a lock during I/O request processing.
> Kicking the virtqueue while the lock is held results in long lock hold
> times and increases contention for the lock.

As you point out the problem with dropping
and getting this lock in the request function is
that we double the number of atomics here.

How about we teach the block core to call a separate
function with spinlock not held? This way we don't need
to do extra lock/unlock operations, and kick can be
done with lock not held and interrupts enabled.


diff --git a/block/blk-core.c b/block/blk-core.c
index 4ce953f..a8672ec 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -433,6 +433,8 @@ void blk_run_queue(struct request_queue *q)
spin_lock_irqsave(q->queue_lock, flags);
__blk_run_queue(q);
spin_unlock_irqrestore(q->queue_lock, flags);
+   if (q->request_done)
+   q->request_done(q);
 }
 EXPORT_SYMBOL(blk_run_queue);
 

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] virtio: Support releasing lock during kick

2011-06-19 Thread Michael S. Tsirkin
On Wed, Jun 23, 2010 at 10:24:02PM +0100, Stefan Hajnoczi wrote:
> The virtio block device holds a lock during I/O request processing.
> Kicking the virtqueue while the lock is held results in long lock hold
> times and increases contention for the lock.
> 
> This patch modifies virtqueue_kick() to optionally release a lock while
> notifying the host.  Virtio block is modified to pass in its lock.  This
> allows other vcpus to queue I/O requests during the time spent servicing
> the virtqueue notify in the host.
> 
> The virtqueue_kick() function is modified to know about locking because
> it changes the state of the virtqueue and should execute with the lock
> held (it would not be correct for virtio block to release the lock
> before calling virtqueue_kick()).
> 
> Signed-off-by: Stefan Hajnoczi 

While the optimization makes sense, the API's pretty hairy IMHO.
Why don't we split the kick functionality instead?
E.g.
/* Report whether host notification is necessary. */
bool virtqueue_kick_prepare(struct virtqueue *vq)
/* Can be done in parallel with add_buf/get_buf */
void virtqueue_kick_notify(struct virtqueue *vq)

And users can

r = virtqueue_kick_prepare(vq);
spin_unlock_irqrestore(...);
if (r)
virtqueue_kick_notify(struct virtqueue *vq)

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] virtio: Support releasing lock during kick

2010-06-29 Thread Avi Kivity

On 06/29/2010 10:08 AM, Stefan Hajnoczi wrote:


Is it incorrect to have the following pattern?
spin_lock_irqsave(q->queue_lock);
spin_unlock(q->queue_lock);
spin_lock(q->queue_lock);
spin_unlock_irqrestore(q->queue_lock);
   


Perfectly legitimate.  spin_lock_irqsave() is equivalent to 
local_irq_save() followed by spin_lock() (with the potential 
optimization that we can service interrupts while spinning).


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] virtio: Support releasing lock during kick

2010-06-29 Thread Stefan Hajnoczi
On Mon, Jun 28, 2010 at 4:55 PM, Marcelo Tosatti  wrote:
> On Wed, Jun 23, 2010 at 10:24:02PM +0100, Stefan Hajnoczi wrote:
>> The virtio block device holds a lock during I/O request processing.
>> Kicking the virtqueue while the lock is held results in long lock hold
>> times and increases contention for the lock.
>>
>> This patch modifies virtqueue_kick() to optionally release a lock while
>> notifying the host.  Virtio block is modified to pass in its lock.  This
>> allows other vcpus to queue I/O requests during the time spent servicing
>> the virtqueue notify in the host.
>>
>> The virtqueue_kick() function is modified to know about locking because
>> it changes the state of the virtqueue and should execute with the lock
>> held (it would not be correct for virtio block to release the lock
>> before calling virtqueue_kick()).
>>
>> Signed-off-by: Stefan Hajnoczi 
>> ---
>> I am not yet 100% happy with this patch which aims to reduce guest CPU
>> consumption related to vblk->lock contention.  Although this patch reduces
>> wait/hold times it does not affect I/O throughput or guest CPU utilization.
>> More investigation is required to get to the bottom of why guest CPU
>> utilization does not decrease when a lock bottleneck has been removed.
>>
>> Performance figures:
>>
>> Host: 2.6.34 upstream kernel, qemu-kvm-0.12.4 if=virtio,cache=none
>> Guest: 2.6.35-rc3-kvm.git upstream kernel
>> Storage: 12 disks as striped LVM volume
>> Benchmark: 4 concurrent dd bs=4k iflag=direct
>>
>> Lockstat data for &vblk->lock:
>>
>> test       con-bounces contentions  waittime-min waittime-max waittime-total
>> unmodified 7097        7108         0.31         956.09       161165.4
>> patched    11484       11550        0.30         411.80       50245.83
>>
>> The maximum wait time went down by 544.29 us (-57%) and the total wait time
>> decreased by 69%.  This shows that the virtqueue kick is indeed hogging the
>> lock.
>>
>> The patched version actually has higher contention than the unmodified 
>> version.
>> I think the reason for this is that each virtqueue kick now includes a short
>> release and reacquire.  This short release gives other vcpus a chance to
>> acquire the lock and progress, hence more contention but overall better wait
>> time numbers.
>>
>> name       acq-bounces acquisitions holdtime-min holdtime-max holdtime-total
>> unmodified 10771       5038346      0.00         3271.81      59016905.47
>> patched    31594       5857813      0.00         219.76       24104915.55
>>
>> Here we see the full impact of this patch: hold time reduced to 219.76 us
>> (-93%).
>>
>> Again the acquisitions have increased since we're now doing an extra
>> unlock+lock per virtqueue kick.
>>
>> Testing, ideas, and comments appreciated.
>>
>>  drivers/block/virtio_blk.c          |    2 +-
>>  drivers/char/hw_random/virtio-rng.c |    2 +-
>>  drivers/char/virtio_console.c       |    6 +++---
>>  drivers/net/virtio_net.c            |    6 +++---
>>  drivers/virtio/virtio_balloon.c     |    6 +++---
>>  drivers/virtio/virtio_ring.c        |   13 +++--
>>  include/linux/virtio.h              |    3 ++-
>>  net/9p/trans_virtio.c               |    2 +-
>>  8 files changed, 25 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
>> index 258bc2a..de033bf 100644
>> --- a/drivers/block/virtio_blk.c
>> +++ b/drivers/block/virtio_blk.c
>> @@ -187,7 +187,7 @@ static void do_virtblk_request(struct request_queue *q)
>>       }
>>
>>       if (issued)
>> -             virtqueue_kick(vblk->vq);
>> +             virtqueue_kick(vblk->vq, &vblk->lock);
>>  }
>>
>>  static void virtblk_prepare_flush(struct request_queue *q, struct request 
>> *req)
>> diff --git a/drivers/char/hw_random/virtio-rng.c 
>> b/drivers/char/hw_random/virtio-rng.c
>> index 75f1cbd..852d563 100644
>> --- a/drivers/char/hw_random/virtio-rng.c
>> +++ b/drivers/char/hw_random/virtio-rng.c
>> @@ -49,7 +49,7 @@ static void register_buffer(u8 *buf, size_t size)
>>       if (virtqueue_add_buf(vq, &sg, 0, 1, buf) < 0)
>>               BUG();
>>
>> -     virtqueue_kick(vq);
>> +     virtqueue_kick(vq, NULL);
>>  }
>>
>>  static int virtio_read(struct hwrng *rng, void *buf, size_t size, bool wait)
>> diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
>> index 942a982..677714d 100644
>> --- a/drivers/char/virtio_console.c
>> +++ b/drivers/char/virtio_console.c
>> @@ -328,7 +328,7 @@ static int add_inbuf(struct virtqueue *vq, struct 
>> port_buffer *buf)
>>       sg_init_one(sg, buf->buf, buf->size);
>>
>>       ret = virtqueue_add_buf(vq, sg, 0, 1, buf);
>> -     virtqueue_kick(vq);
>> +     virtqueue_kick(vq, NULL);
>>       return ret;
>>  }
>>
>> @@ -400,7 +400,7 @@ static ssize_t __send_control_msg(struct ports_device 
>> *portdev, u32 port_id,
>>
>>       sg_init_one(sg, &cpkt, sizeof(cpkt));
>>       if (virtqueue_add_buf(vq, sg, 1, 0, &cpkt) >= 0) {
>> -             virtqueue_kick(vq);
>

Re: [RFC] virtio: Support releasing lock during kick

2010-06-28 Thread Marcelo Tosatti
On Wed, Jun 23, 2010 at 10:24:02PM +0100, Stefan Hajnoczi wrote:
> The virtio block device holds a lock during I/O request processing.
> Kicking the virtqueue while the lock is held results in long lock hold
> times and increases contention for the lock.
> 
> This patch modifies virtqueue_kick() to optionally release a lock while
> notifying the host.  Virtio block is modified to pass in its lock.  This
> allows other vcpus to queue I/O requests during the time spent servicing
> the virtqueue notify in the host.
> 
> The virtqueue_kick() function is modified to know about locking because
> it changes the state of the virtqueue and should execute with the lock
> held (it would not be correct for virtio block to release the lock
> before calling virtqueue_kick()).
> 
> Signed-off-by: Stefan Hajnoczi 
> ---
> I am not yet 100% happy with this patch which aims to reduce guest CPU
> consumption related to vblk->lock contention.  Although this patch reduces
> wait/hold times it does not affect I/O throughput or guest CPU utilization.
> More investigation is required to get to the bottom of why guest CPU
> utilization does not decrease when a lock bottleneck has been removed.
> 
> Performance figures:
> 
> Host: 2.6.34 upstream kernel, qemu-kvm-0.12.4 if=virtio,cache=none
> Guest: 2.6.35-rc3-kvm.git upstream kernel
> Storage: 12 disks as striped LVM volume
> Benchmark: 4 concurrent dd bs=4k iflag=direct
> 
> Lockstat data for &vblk->lock:
> 
> test   con-bounces contentions  waittime-min waittime-max waittime-total
> unmodified 70977108 0.31 956.09   161165.4
> patched11484   115500.30 411.80   50245.83
> 
> The maximum wait time went down by 544.29 us (-57%) and the total wait time
> decreased by 69%.  This shows that the virtqueue kick is indeed hogging the
> lock.
> 
> The patched version actually has higher contention than the unmodified 
> version.
> I think the reason for this is that each virtqueue kick now includes a short
> release and reacquire.  This short release gives other vcpus a chance to
> acquire the lock and progress, hence more contention but overall better wait
> time numbers.
> 
> name   acq-bounces acquisitions holdtime-min holdtime-max holdtime-total
> unmodified 10771   5038346  0.00 3271.81  59016905.47
> patched31594   5857813  0.00 219.76   24104915.55
> 
> Here we see the full impact of this patch: hold time reduced to 219.76 us
> (-93%).
> 
> Again the acquisitions have increased since we're now doing an extra
> unlock+lock per virtqueue kick.
> 
> Testing, ideas, and comments appreciated.
> 
>  drivers/block/virtio_blk.c  |2 +-
>  drivers/char/hw_random/virtio-rng.c |2 +-
>  drivers/char/virtio_console.c   |6 +++---
>  drivers/net/virtio_net.c|6 +++---
>  drivers/virtio/virtio_balloon.c |6 +++---
>  drivers/virtio/virtio_ring.c|   13 +++--
>  include/linux/virtio.h  |3 ++-
>  net/9p/trans_virtio.c   |2 +-
>  8 files changed, 25 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 258bc2a..de033bf 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -187,7 +187,7 @@ static void do_virtblk_request(struct request_queue *q)
>   }
>  
>   if (issued)
> - virtqueue_kick(vblk->vq);
> + virtqueue_kick(vblk->vq, &vblk->lock);
>  }
>  
>  static void virtblk_prepare_flush(struct request_queue *q, struct request 
> *req)
> diff --git a/drivers/char/hw_random/virtio-rng.c 
> b/drivers/char/hw_random/virtio-rng.c
> index 75f1cbd..852d563 100644
> --- a/drivers/char/hw_random/virtio-rng.c
> +++ b/drivers/char/hw_random/virtio-rng.c
> @@ -49,7 +49,7 @@ static void register_buffer(u8 *buf, size_t size)
>   if (virtqueue_add_buf(vq, &sg, 0, 1, buf) < 0)
>   BUG();
>  
> - virtqueue_kick(vq);
> + virtqueue_kick(vq, NULL);
>  }
>  
>  static int virtio_read(struct hwrng *rng, void *buf, size_t size, bool wait)
> diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
> index 942a982..677714d 100644
> --- a/drivers/char/virtio_console.c
> +++ b/drivers/char/virtio_console.c
> @@ -328,7 +328,7 @@ static int add_inbuf(struct virtqueue *vq, struct 
> port_buffer *buf)
>   sg_init_one(sg, buf->buf, buf->size);
>  
>   ret = virtqueue_add_buf(vq, sg, 0, 1, buf);
> - virtqueue_kick(vq);
> + virtqueue_kick(vq, NULL);
>   return ret;
>  }
>  
> @@ -400,7 +400,7 @@ static ssize_t __send_control_msg(struct ports_device 
> *portdev, u32 port_id,
>  
>   sg_init_one(sg, &cpkt, sizeof(cpkt));
>   if (virtqueue_add_buf(vq, sg, 1, 0, &cpkt) >= 0) {
> - virtqueue_kick(vq);
> + virtqueue_kick(vq, NULL);
>   while (!virtqueue_get_buf(vq, &len))
>   cpu_relax();
>   }
> @@ 

Re: [RFC] virtio: Support releasing lock during kick

2010-06-25 Thread Stefan Hajnoczi
On Fri, Jun 25, 2010 at 06:32:20PM +0300, Michael S. Tsirkin wrote:
> On Fri, Jun 25, 2010 at 04:31:44PM +0100, Stefan Hajnoczi wrote:
> > On Fri, Jun 25, 2010 at 01:43:17PM +0300, Michael S. Tsirkin wrote:
> > > On Fri, Jun 25, 2010 at 12:39:21PM +0930, Rusty Russell wrote:
> > > > On Thu, 24 Jun 2010 03:00:30 pm Stefan Hajnoczi wrote:
> > > > > On Wed, Jun 23, 2010 at 11:12 PM, Anthony Liguori 
> > > > >  wrote:
> > > > > > Shouldn't it be possible to just drop the lock before invoking
> > > > > > virtqueue_kick() and reacquire it afterwards?  There's nothing in 
> > > > > > that
> > > > > > virtqueue_kick() path that the lock is protecting AFAICT.
> > > > > 
> > > > > No, that would lead to a race condition because vq->num_added is
> > > > > modified by both virtqueue_add_buf_gfp() and virtqueue_kick().
> > > > > Without a lock held during virtqueue_kick() another vcpu could add
> > > > > bufs while vq->num_added is used and cleared by virtqueue_kick():
> > > > 
> > > > Right, this dovetails with another proposed change (was it Michael?)
> > > > where we would update the avail idx inside add_buf, rather than waiting
> > > > until kick.  This means a barrier inside add_buf, but that's probably
> > > > fine.
> > > > 
> > > > If we do that, then we don't need a lock on virtqueue_kick.
> > > > 
> > > > Michael, thoughts?
> > > 
> > > Maybe not even that: I think we could just do virtio_wmb()
> > > in add, and keep the mb() in kick.
> > > 
> > > What I'm a bit worried about is contention on the cacheline
> > > including index and flags: the more we write to that line,
> > > the worse it gets.
> > > 
> > > So need to test performance impact of this change:
> > > I didn't find time to do this yet, as I am trying
> > > to finalize the used index publishing patches.
> > > Any takers?
> > > 
> > > Do we see performance improvement after making kick lockless?
> > 
> > There was no guest CPU reduction or I/O throughput increase with my
> > patch when running 4 dd iflag=direct bs=4k if=/dev/vdb of=/dev/null
> > processes.  However, the lock_stat numbers above show clear improvement
> > of the lock hold/wait times.
> > 
> > I was hoping to see guest CPU utilization go down and I/O throughput go
> > up, so there is still investigation to do with my patch in isolation.
> > Although I'd like to try it later, putting my patch on top of your avail
> > idx work is too early because it will be harder to reason about the
> > performance with both patches present at the same time.
> > 
> > Stefan
> 
> What about host CPU utilization?

There is data available for host CPU utilization, I need to dig it up.

> Also, are you using PARAVIRT_SPINLOCKS?

No.  I haven't found much documentation on paravirt spinlocks other than
the commit that introduced them:

  commit 8efcbab674de2bee45a2e4cdf97de16b8e609ac8
  Author: Jeremy Fitzhardinge 
  Date:   Mon Jul 7 12:07:51 2008 -0700

  paravirt: introduce a "lock-byte" spinlock implementation

PARAVIRT_SPINLOCKS is not set in the config I use, probably because of
the associated performance issue that causes distros to build without
them:

  commit b4ecc126991b30fe5f9a59dfacda046aeac124b2
  Author: Jeremy Fitzhardinge 
  Date:   Wed May 13 17:16:55 2009 -0700

  x86: Fix performance regression caused by paravirt_ops on native
  kernels

I would expect performance results to be smoother with
PARAVIRT_SPINLOCKS for the guest kernel.  I will add it for future runs,
thanks for pointing it out.

Stefan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] virtio: Support releasing lock during kick

2010-06-25 Thread Michael S. Tsirkin
On Fri, Jun 25, 2010 at 04:31:44PM +0100, Stefan Hajnoczi wrote:
> On Fri, Jun 25, 2010 at 01:43:17PM +0300, Michael S. Tsirkin wrote:
> > On Fri, Jun 25, 2010 at 12:39:21PM +0930, Rusty Russell wrote:
> > > On Thu, 24 Jun 2010 03:00:30 pm Stefan Hajnoczi wrote:
> > > > On Wed, Jun 23, 2010 at 11:12 PM, Anthony Liguori 
> > > >  wrote:
> > > > > Shouldn't it be possible to just drop the lock before invoking
> > > > > virtqueue_kick() and reacquire it afterwards?  There's nothing in that
> > > > > virtqueue_kick() path that the lock is protecting AFAICT.
> > > > 
> > > > No, that would lead to a race condition because vq->num_added is
> > > > modified by both virtqueue_add_buf_gfp() and virtqueue_kick().
> > > > Without a lock held during virtqueue_kick() another vcpu could add
> > > > bufs while vq->num_added is used and cleared by virtqueue_kick():
> > > 
> > > Right, this dovetails with another proposed change (was it Michael?)
> > > where we would update the avail idx inside add_buf, rather than waiting
> > > until kick.  This means a barrier inside add_buf, but that's probably
> > > fine.
> > > 
> > > If we do that, then we don't need a lock on virtqueue_kick.
> > > 
> > > Michael, thoughts?
> > 
> > Maybe not even that: I think we could just do virtio_wmb()
> > in add, and keep the mb() in kick.
> > 
> > What I'm a bit worried about is contention on the cacheline
> > including index and flags: the more we write to that line,
> > the worse it gets.
> > 
> > So need to test performance impact of this change:
> > I didn't find time to do this yet, as I am trying
> > to finalize the used index publishing patches.
> > Any takers?
> > 
> > Do we see performance improvement after making kick lockless?
> 
> There was no guest CPU reduction or I/O throughput increase with my
> patch when running 4 dd iflag=direct bs=4k if=/dev/vdb of=/dev/null
> processes.  However, the lock_stat numbers above show clear improvement
> of the lock hold/wait times.
> 
> I was hoping to see guest CPU utilization go down and I/O throughput go
> up, so there is still investigation to do with my patch in isolation.
> Although I'd like to try it later, putting my patch on top of your avail
> idx work is too early because it will be harder to reason about the
> performance with both patches present at the same time.
> 
> Stefan

What about host CPU utilization?
Also, are you using PARAVIRT_SPINLOCKS?

-- 
MST
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] virtio: Support releasing lock during kick

2010-06-25 Thread Stefan Hajnoczi
On Fri, Jun 25, 2010 at 01:43:17PM +0300, Michael S. Tsirkin wrote:
> On Fri, Jun 25, 2010 at 12:39:21PM +0930, Rusty Russell wrote:
> > On Thu, 24 Jun 2010 03:00:30 pm Stefan Hajnoczi wrote:
> > > On Wed, Jun 23, 2010 at 11:12 PM, Anthony Liguori  
> > > wrote:
> > > > Shouldn't it be possible to just drop the lock before invoking
> > > > virtqueue_kick() and reacquire it afterwards?  There's nothing in that
> > > > virtqueue_kick() path that the lock is protecting AFAICT.
> > > 
> > > No, that would lead to a race condition because vq->num_added is
> > > modified by both virtqueue_add_buf_gfp() and virtqueue_kick().
> > > Without a lock held during virtqueue_kick() another vcpu could add
> > > bufs while vq->num_added is used and cleared by virtqueue_kick():
> > 
> > Right, this dovetails with another proposed change (was it Michael?)
> > where we would update the avail idx inside add_buf, rather than waiting
> > until kick.  This means a barrier inside add_buf, but that's probably
> > fine.
> > 
> > If we do that, then we don't need a lock on virtqueue_kick.
> > 
> > Michael, thoughts?
> 
> Maybe not even that: I think we could just do virtio_wmb()
> in add, and keep the mb() in kick.
> 
> What I'm a bit worried about is contention on the cacheline
> including index and flags: the more we write to that line,
> the worse it gets.
> 
> So need to test performance impact of this change:
> I didn't find time to do this yet, as I am trying
> to finalize the used index publishing patches.
> Any takers?
> 
> Do we see performance improvement after making kick lockless?

There was no guest CPU reduction or I/O throughput increase with my
patch when running 4 dd iflag=direct bs=4k if=/dev/vdb of=/dev/null
processes.  However, the lock_stat numbers above show clear improvement
of the lock hold/wait times.

I was hoping to see guest CPU utilization go down and I/O throughput go
up, so there is still investigation to do with my patch in isolation.
Although I'd like to try it later, putting my patch on top of your avail
idx work is too early because it will be harder to reason about the
performance with both patches present at the same time.

Stefan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] virtio: Support releasing lock during kick

2010-06-25 Thread Michael S. Tsirkin
On Fri, Jun 25, 2010 at 12:39:21PM +0930, Rusty Russell wrote:
> On Thu, 24 Jun 2010 03:00:30 pm Stefan Hajnoczi wrote:
> > On Wed, Jun 23, 2010 at 11:12 PM, Anthony Liguori  
> > wrote:
> > > Shouldn't it be possible to just drop the lock before invoking
> > > virtqueue_kick() and reacquire it afterwards?  There's nothing in that
> > > virtqueue_kick() path that the lock is protecting AFAICT.
> > 
> > No, that would lead to a race condition because vq->num_added is
> > modified by both virtqueue_add_buf_gfp() and virtqueue_kick().
> > Without a lock held during virtqueue_kick() another vcpu could add
> > bufs while vq->num_added is used and cleared by virtqueue_kick():
> 
> Right, this dovetails with another proposed change (was it Michael?)
> where we would update the avail idx inside add_buf, rather than waiting
> until kick.  This means a barrier inside add_buf, but that's probably
> fine.
> 
> If we do that, then we don't need a lock on virtqueue_kick.
> 
> Michael, thoughts?

Maybe not even that: I think we could just do virtio_wmb()
in add, and keep the mb() in kick.

What I'm a bit worried about is contention on the cacheline
including index and flags: the more we write to that line,
the worse it gets.

So need to test performance impact of this change:
I didn't find time to do this yet, as I am trying
to finalize the used index publishing patches.
Any takers?

Do we see performance improvement after making kick lockless?

> Thanks,
> Rusty.
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] virtio: Support releasing lock during kick

2010-06-24 Thread Stefan Hajnoczi
On Fri, Jun 25, 2010 at 4:09 AM, Rusty Russell  wrote:
> On Thu, 24 Jun 2010 03:00:30 pm Stefan Hajnoczi wrote:
>> On Wed, Jun 23, 2010 at 11:12 PM, Anthony Liguori  
>> wrote:
>> > Shouldn't it be possible to just drop the lock before invoking
>> > virtqueue_kick() and reacquire it afterwards?  There's nothing in that
>> > virtqueue_kick() path that the lock is protecting AFAICT.
>>
>> No, that would lead to a race condition because vq->num_added is
>> modified by both virtqueue_add_buf_gfp() and virtqueue_kick().
>> Without a lock held during virtqueue_kick() another vcpu could add
>> bufs while vq->num_added is used and cleared by virtqueue_kick():
>
> Right, this dovetails with another proposed change (was it Michael?)
> where we would update the avail idx inside add_buf, rather than waiting
> until kick.  This means a barrier inside add_buf, but that's probably
> fine.
>
> If we do that, then we don't need a lock on virtqueue_kick.

That would be nice, we could push the change up into just virtio-blk.

I did wonder if virtio-net can take advantage of unlocked kick, too,
but haven't investigated yet.  The virtio-net kick in start_xmit()
happens with the netdev _xmit_lock held.  Any ideas?

Stefan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] virtio: Support releasing lock during kick

2010-06-24 Thread Rusty Russell
On Thu, 24 Jun 2010 03:00:30 pm Stefan Hajnoczi wrote:
> On Wed, Jun 23, 2010 at 11:12 PM, Anthony Liguori  
> wrote:
> > Shouldn't it be possible to just drop the lock before invoking
> > virtqueue_kick() and reacquire it afterwards?  There's nothing in that
> > virtqueue_kick() path that the lock is protecting AFAICT.
> 
> No, that would lead to a race condition because vq->num_added is
> modified by both virtqueue_add_buf_gfp() and virtqueue_kick().
> Without a lock held during virtqueue_kick() another vcpu could add
> bufs while vq->num_added is used and cleared by virtqueue_kick():

Right, this dovetails with another proposed change (was it Michael?)
where we would update the avail idx inside add_buf, rather than waiting
until kick.  This means a barrier inside add_buf, but that's probably
fine.

If we do that, then we don't need a lock on virtqueue_kick.

Michael, thoughts?

Thanks,
Rusty.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] virtio: Support releasing lock during kick

2010-06-23 Thread Stefan Hajnoczi
On Wed, Jun 23, 2010 at 11:12 PM, Anthony Liguori  wrote:
> Shouldn't it be possible to just drop the lock before invoking
> virtqueue_kick() and reacquire it afterwards?  There's nothing in that
> virtqueue_kick() path that the lock is protecting AFAICT.

No, that would lead to a race condition because vq->num_added is
modified by both virtqueue_add_buf_gfp() and virtqueue_kick().
Without a lock held during virtqueue_kick() another vcpu could add
bufs while vq->num_added is used and cleared by virtqueue_kick():

void virtqueue_kick(struct virtqueue *_vq, spinlock_t *lock)
{
struct vring_virtqueue *vq = to_vvq(_vq);
START_USE(vq);
/* Descriptors and available array need to be set before we expose the
 * new available array entries. */
virtio_wmb();

vq->vring.avail->idx += vq->num_added;
vq->num_added = 0;

/* Need to update avail index before checking if we should notify */
virtio_mb();

if (!(vq->vring.used->flags & VRING_USED_F_NO_NOTIFY)) {
/* Release lock while doing the kick because the guest should
 * not exit with the lock held. */
if (lock)
spin_unlock(lock);

/* Prod other side to tell it about changes. */
vq->notify(&vq->vq);

if (lock)
spin_lock(lock);
}

END_USE(vq);
}

Stefan
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] virtio: Support releasing lock during kick

2010-06-23 Thread Anthony Liguori

On 06/23/2010 04:24 PM, Stefan Hajnoczi wrote:

The virtio block device holds a lock during I/O request processing.
Kicking the virtqueue while the lock is held results in long lock hold
times and increases contention for the lock.

This patch modifies virtqueue_kick() to optionally release a lock while
notifying the host.  Virtio block is modified to pass in its lock.  This
allows other vcpus to queue I/O requests during the time spent servicing
the virtqueue notify in the host.

The virtqueue_kick() function is modified to know about locking because
it changes the state of the virtqueue and should execute with the lock
held (it would not be correct for virtio block to release the lock
before calling virtqueue_kick()).

Signed-off-by: Stefan Hajnoczi
---
I am not yet 100% happy with this patch which aims to reduce guest CPU
consumption related to vblk->lock contention.  Although this patch reduces
wait/hold times it does not affect I/O throughput or guest CPU utilization.
More investigation is required to get to the bottom of why guest CPU
utilization does not decrease when a lock bottleneck has been removed.

Performance figures:

Host: 2.6.34 upstream kernel, qemu-kvm-0.12.4 if=virtio,cache=none
Guest: 2.6.35-rc3-kvm.git upstream kernel
Storage: 12 disks as striped LVM volume
Benchmark: 4 concurrent dd bs=4k iflag=direct

Lockstat data for&vblk->lock:

test   con-bounces contentions  waittime-min waittime-max waittime-total
unmodified 70977108 0.31 956.09   161165.4
patched11484   115500.30 411.80   50245.83

The maximum wait time went down by 544.29 us (-57%) and the total wait time
decreased by 69%.  This shows that the virtqueue kick is indeed hogging the
lock.

The patched version actually has higher contention than the unmodified version.
I think the reason for this is that each virtqueue kick now includes a short
release and reacquire.  This short release gives other vcpus a chance to
acquire the lock and progress, hence more contention but overall better wait
time numbers.

name   acq-bounces acquisitions holdtime-min holdtime-max holdtime-total
unmodified 10771   5038346  0.00 3271.81  59016905.47
patched31594   5857813  0.00 219.76   24104915.55

Here we see the full impact of this patch: hold time reduced to 219.76 us
(-93%).

Again the acquisitions have increased since we're now doing an extra
unlock+lock per virtqueue kick.

Testing, ideas, and comments appreciated.

  drivers/block/virtio_blk.c  |2 +-
  drivers/char/hw_random/virtio-rng.c |2 +-
  drivers/char/virtio_console.c   |6 +++---
  drivers/net/virtio_net.c|6 +++---
  drivers/virtio/virtio_balloon.c |6 +++---
  drivers/virtio/virtio_ring.c|   13 +++--
  include/linux/virtio.h  |3 ++-
  net/9p/trans_virtio.c   |2 +-
  8 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 258bc2a..de033bf 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -187,7 +187,7 @@ static void do_virtblk_request(struct request_queue *q)
}

if (issued)
-   virtqueue_kick(vblk->vq);
+   virtqueue_kick(vblk->vq,&vblk->lock);
  }
   


Shouldn't it be possible to just drop the lock before invoking 
virtqueue_kick() and reacquire it afterwards?  There's nothing in that 
virtqueue_kick() path that the lock is protecting AFAICT.


Regards,

Anthony Liguori

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC] virtio: Support releasing lock during kick

2010-06-23 Thread Stefan Hajnoczi
The virtio block device holds a lock during I/O request processing.
Kicking the virtqueue while the lock is held results in long lock hold
times and increases contention for the lock.

This patch modifies virtqueue_kick() to optionally release a lock while
notifying the host.  Virtio block is modified to pass in its lock.  This
allows other vcpus to queue I/O requests during the time spent servicing
the virtqueue notify in the host.

The virtqueue_kick() function is modified to know about locking because
it changes the state of the virtqueue and should execute with the lock
held (it would not be correct for virtio block to release the lock
before calling virtqueue_kick()).

Signed-off-by: Stefan Hajnoczi 
---
I am not yet 100% happy with this patch which aims to reduce guest CPU
consumption related to vblk->lock contention.  Although this patch reduces
wait/hold times it does not affect I/O throughput or guest CPU utilization.
More investigation is required to get to the bottom of why guest CPU
utilization does not decrease when a lock bottleneck has been removed.

Performance figures:

Host: 2.6.34 upstream kernel, qemu-kvm-0.12.4 if=virtio,cache=none
Guest: 2.6.35-rc3-kvm.git upstream kernel
Storage: 12 disks as striped LVM volume
Benchmark: 4 concurrent dd bs=4k iflag=direct

Lockstat data for &vblk->lock:

test   con-bounces contentions  waittime-min waittime-max waittime-total
unmodified 70977108 0.31 956.09   161165.4
patched11484   115500.30 411.80   50245.83

The maximum wait time went down by 544.29 us (-57%) and the total wait time
decreased by 69%.  This shows that the virtqueue kick is indeed hogging the
lock.

The patched version actually has higher contention than the unmodified version.
I think the reason for this is that each virtqueue kick now includes a short
release and reacquire.  This short release gives other vcpus a chance to
acquire the lock and progress, hence more contention but overall better wait
time numbers.

name   acq-bounces acquisitions holdtime-min holdtime-max holdtime-total
unmodified 10771   5038346  0.00 3271.81  59016905.47
patched31594   5857813  0.00 219.76   24104915.55

Here we see the full impact of this patch: hold time reduced to 219.76 us
(-93%).

Again the acquisitions have increased since we're now doing an extra
unlock+lock per virtqueue kick.

Testing, ideas, and comments appreciated.

 drivers/block/virtio_blk.c  |2 +-
 drivers/char/hw_random/virtio-rng.c |2 +-
 drivers/char/virtio_console.c   |6 +++---
 drivers/net/virtio_net.c|6 +++---
 drivers/virtio/virtio_balloon.c |6 +++---
 drivers/virtio/virtio_ring.c|   13 +++--
 include/linux/virtio.h  |3 ++-
 net/9p/trans_virtio.c   |2 +-
 8 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 258bc2a..de033bf 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -187,7 +187,7 @@ static void do_virtblk_request(struct request_queue *q)
}
 
if (issued)
-   virtqueue_kick(vblk->vq);
+   virtqueue_kick(vblk->vq, &vblk->lock);
 }
 
 static void virtblk_prepare_flush(struct request_queue *q, struct request *req)
diff --git a/drivers/char/hw_random/virtio-rng.c 
b/drivers/char/hw_random/virtio-rng.c
index 75f1cbd..852d563 100644
--- a/drivers/char/hw_random/virtio-rng.c
+++ b/drivers/char/hw_random/virtio-rng.c
@@ -49,7 +49,7 @@ static void register_buffer(u8 *buf, size_t size)
if (virtqueue_add_buf(vq, &sg, 0, 1, buf) < 0)
BUG();
 
-   virtqueue_kick(vq);
+   virtqueue_kick(vq, NULL);
 }
 
 static int virtio_read(struct hwrng *rng, void *buf, size_t size, bool wait)
diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 942a982..677714d 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -328,7 +328,7 @@ static int add_inbuf(struct virtqueue *vq, struct 
port_buffer *buf)
sg_init_one(sg, buf->buf, buf->size);
 
ret = virtqueue_add_buf(vq, sg, 0, 1, buf);
-   virtqueue_kick(vq);
+   virtqueue_kick(vq, NULL);
return ret;
 }
 
@@ -400,7 +400,7 @@ static ssize_t __send_control_msg(struct ports_device 
*portdev, u32 port_id,
 
sg_init_one(sg, &cpkt, sizeof(cpkt));
if (virtqueue_add_buf(vq, sg, 1, 0, &cpkt) >= 0) {
-   virtqueue_kick(vq);
+   virtqueue_kick(vq, NULL);
while (!virtqueue_get_buf(vq, &len))
cpu_relax();
}
@@ -444,7 +444,7 @@ static ssize_t send_buf(struct port *port, void *in_buf, 
size_t in_count,
ret = virtqueue_add_buf(out_vq, sg, 1, 0, in_buf);
 
/* Tell Host to go! */
-   virtqueue_kick(out_vq);
+   virtqueue_kick(out_vq, NULL);
 
if (r