[PATCH] improve performance by 2% or more by reducing spin_lock contention

2013-06-07 Thread Qinchuanyu
In vhost_work_queue(), the call to wake_up_process() is currently made inside the
spin_lock/unlock pair, but it can be done outside the lock.
I tested this with kernel 3.0.27 and a suse11-sp2 guest using iperf; the numbers
are below.
            original               |       modified
thread_num  tp(Gbps)   vhost(%)    |  tp(Gbps)   vhost(%)
1           9.59       28.82       |  9.59       27.49
8           9.61       32.92       |  9.62       26.77
64          9.58       46.48       |  9.55       38.99
256         9.6        63.7        |  9.6        52.59

Signed-off-by: Chuanyu Qin <qinchua...@huawei.com>
---
 drivers/vhost/vhost.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 94dbd25..8bee109 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -146,9 +146,10 @@ static inline void vhost_work_queue(struct vhost_dev *dev,
 	if (list_empty(&work->node)) {
 		list_add_tail(&work->node, &dev->work_list);
 		work->queue_seq++;
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 		wake_up_process(dev->worker);
-	}
-	spin_unlock_irqrestore(&dev->work_lock, flags);
+	} else
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 }
 
 void vhost_poll_queue(struct vhost_poll *poll)
-- 
1.7.3.1.msysgit.0
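
For reference, the queueing path after this change looks roughly like the sketch
below. It is reconstructed from the hunk above alone (3.0-era vhost); the
spin_lock_irqsave() call and the local flags variable are assumed from context
rather than copied from the tree.

static inline void vhost_work_queue(struct vhost_dev *dev,
                                    struct vhost_work *work)
{
        unsigned long flags;

        spin_lock_irqsave(&dev->work_lock, flags);
        if (list_empty(&work->node)) {
                /* First time this work is queued: link it and bump the
                 * sequence number while still holding work_lock. */
                list_add_tail(&work->node, &dev->work_list);
                work->queue_seq++;
                spin_unlock_irqrestore(&dev->work_lock, flags);
                /* The wake-up now happens after the lock is dropped, so the
                 * woken worker does not immediately contend on work_lock. */
                wake_up_process(dev->worker);
        } else
                spin_unlock_irqrestore(&dev->work_lock, flags);
}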

[PATCH] vhost: wake up worker outside spin_lock

2013-06-07 Thread Qinchuanyu
In vhost_work_queue(), the call to wake_up_process() is currently made inside the
spin_lock/unlock pair, but it can be done outside the lock.
I tested this with kernel 3.0.27 and a suse11-sp2 guest using iperf; the numbers
are below.
            original               |       modified
thread_num  tp(Gbps)   vhost(%)    |  tp(Gbps)   vhost(%)
1           9.59       28.82       |  9.59       27.49
8           9.61       32.92       |  9.62       26.77
64          9.58       46.48       |  9.55       38.99
256         9.6        63.7        |  9.6        52.59

Signed-off-by: Chuanyu Qin <qinchua...@huawei.com>
---
 drivers/vhost/vhost.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 94dbd25..8bee109 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -146,9 +146,10 @@ static inline void vhost_work_queue(struct vhost_dev *dev,
 	if (list_empty(&work->node)) {
 		list_add_tail(&work->node, &dev->work_list);
 		work->queue_seq++;
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 		wake_up_process(dev->worker);
-	}
-	spin_unlock_irqrestore(&dev->work_lock, flags);
+	} else
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 }
 
 void vhost_poll_queue(struct vhost_poll *poll)
--
1.7.3.1.msysgit.0

Re: [PATCH] vhost: improve performance by 2% by reducing spin_lock contention in vhost_work_queue

2013-05-20 Thread Qinchuanyu
From: Chuanyu Qin <qinchua...@huawei.com>
Subject: [PATCH] improve performance by 2% or more by reducing spin_lock contention in vhost_work_queue

In vhost_work_queue(), the call to wake_up_process() is currently made inside the
spin_lock/unlock pair, but it can be done outside the lock.
I tested this with kernel 3.0.27 and a suse11-sp2 guest using iperf; the numbers
are below.
            original               |       modified
thread_num  tp(Gbps)   vhost(%)    |  tp(Gbps)   vhost(%)
1           9.59       28.82       |  9.59       27.49
8           9.61       32.92       |  9.62       26.77
64          9.58       46.48       |  9.55       38.99
256         9.6        63.7        |  9.6        52.59

Signed-off-by: Chuanyu Qin <qinchua...@huawei.com>
---
 drivers/vhost/vhost.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 94dbd25..8bee109 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -146,9 +146,10 @@ static inline void vhost_work_queue(struct vhost_dev *dev,
 	if (list_empty(&work->node)) {
 		list_add_tail(&work->node, &dev->work_list);
 		work->queue_seq++;
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 		wake_up_process(dev->worker);
-	}
-	spin_unlock_irqrestore(&dev->work_lock, flags);
+	} else
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 }
 
 void vhost_poll_queue(struct vhost_poll *poll)
-- 
1.7.3.1.msysgit.0


> On 05/20/2013 12:22 PM, Qinchuanyu wrote:
>> The patch below is based on
>> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/tree/drivers/vhost/vhost.c?id=refs/tags/next-20130517
>>
>> Signed-off-by: Chuanyu Qin <qinchua...@huawei.com>
>> --- a/drivers/vhost/vhost.c 2013-05-20 11:47:05.0 +0800
>> +++ b/drivers/vhost/vhost.c 2013-05-20 11:48:24.0 +0800
>> @@ -154,9 +154,10 @@
>>  	if (list_empty(&work->node)) {
>>  		list_add_tail(&work->node, &dev->work_list);
>>  		work->queue_seq++;
>> +		spin_unlock_irqrestore(&dev->work_lock, flags);
>>  		wake_up_process(dev->worker);
>> -	}
>> -	spin_unlock_irqrestore(&dev->work_lock, flags);
>> +	} else
>> +		spin_unlock_irqrestore(&dev->work_lock, flags);
>>  }
>>
>>  void vhost_poll_queue(struct vhost_poll *poll)
>>
>> I did the test using iperf in a 10G environment; the numbers are below:
>>             original               |       modified
>> thread_num  tp(Gbps)   vhost(%)    |  tp(Gbps)   vhost(%)
>> 1           9.59       28.82       |  9.59       27.49
>> 8           9.61       32.92       |  9.62       26.77
>> 64          9.58       46.48       |  9.55       38.99
>> 256         9.6        63.7        |  9.6        52.59
>>
>> The vhost CPU cost is reduced while the throughput is almost unchanged.
>
> Thanks, and please generate a formal patch based on
> Documentation/SubmittingPatches (put the description and perf numbers
> in the commit log). Then resubmit it so the maintainer can apply it.


provide a vhost thread per virtqueue for the forwarding scenario

2013-05-19 Thread Qinchuanyu
A vhost thread provides both tx and rx handling for virtio-net.
In forwarding scenarios, tx and rx share the same vhost thread, so throughput
is limited by that single thread.

So I wrote a patch that provides a vhost thread per virtqueue rather than per vhost_net.

Of course, multi-queue virtio-net is the final solution, but it requires a new
version of virtio-net running in the guest.
If you have to run suse10/11 or redhat 5.x as the guest and want to improve
forwarding throughput, a vhost thread per queue seems to be the only option.

I tested with kernel 3.0.27, qemu-1.4.0 and a suse11-sp2 guest; two vhost threads
provide double the tx/rx forwarding performance of a single vhost thread.
vhost_blk has only one virtqueue, so it still uses a single vhost thread, unchanged.

Is there something wrong with this approach? If not, I will post the patch later.
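
No code is attached in this mail, so the sketch below is only an illustration of
the idea, not the author's patch: a per-virtqueue work list and worker. All names
here (struct vhost_vq_worker, vhost_vq_work_queue and its fields) are invented for
illustration and do not exist in mainline vhost, which keeps a single dev->worker
per device.

/* Hypothetical sketch only: one worker kthread per virtqueue instead of
 * one per vhost_dev.  None of these names exist in mainline vhost. */
struct vhost_vq_worker {
        spinlock_t work_lock;           /* protects work_list */
        struct list_head work_list;     /* pending vhost_work items for this vq */
        struct task_struct *worker;     /* kthread serving only this virtqueue */
};

/* Queue work on the virtqueue's own worker; mirrors vhost_work_queue(),
 * including the "wake up outside the lock" change from the other thread. */
static void vhost_vq_work_queue(struct vhost_vq_worker *vqw,
                                struct vhost_work *work)
{
        unsigned long flags;

        spin_lock_irqsave(&vqw->work_lock, flags);
        if (list_empty(&work->node)) {
                list_add_tail(&work->node, &vqw->work_list);
                spin_unlock_irqrestore(&vqw->work_lock, flags);
                wake_up_process(vqw->worker);   /* wake only this vq's thread */
        } else
                spin_unlock_irqrestore(&vqw->work_lock, flags);
}

With one such structure per virtqueue, tx and rx of the same device are served by
different threads, which is the effect described above.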

Best regards
King


[PATCH] vhost: improve performance by 2% by reducing spin_lock contention in vhost_work_queue

2013-05-19 Thread Qinchuanyu
Right now the call to wake_up_process() is made inside the spin_lock/unlock pair,
but it can be done outside the lock.
I tested this with kernel 3.0.27 and a suse11-sp2 guest; it gives a 2%-3% net
performance improvement.

Signed-off-by: Chuanyu Qin <qinchua...@huawei.com>
--- a/drivers/vhost/vhost.c 2013-05-20 10:36:30.0 +0800
+++ b/drivers/vhost/vhost.c 2013-05-20 10:36:54.0 +0800
@@ -144,9 +144,10 @@
 	if (list_empty(&work->node)) {
 		list_add_tail(&work->node, &dev->work_list);
 		work->queue_seq++;
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 		wake_up_process(dev->worker);
-	}
-	spin_unlock_irqrestore(&dev->work_lock, flags);
+	} else
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 }
 
 void vhost_poll_queue(struct vhost_poll *poll)


Re: [PATCH] vhost: improve performance by 2% by reducing spin_lock contention in vhost_work_queue

2013-05-19 Thread Qinchuanyu
The patch below is based on
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/tree/drivers/vhost/vhost.c?id=refs/tags/next-20130517

Signed-off-by: Chuanyu Qin <qinchua...@huawei.com>
--- a/drivers/vhost/vhost.c 2013-05-20 11:47:05.0 +0800
+++ b/drivers/vhost/vhost.c 2013-05-20 11:48:24.0 +0800
@@ -154,9 +154,10 @@
 	if (list_empty(&work->node)) {
 		list_add_tail(&work->node, &dev->work_list);
 		work->queue_seq++;
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 		wake_up_process(dev->worker);
-	}
-	spin_unlock_irqrestore(&dev->work_lock, flags);
+	} else
+		spin_unlock_irqrestore(&dev->work_lock, flags);
 }
 
 void vhost_poll_queue(struct vhost_poll *poll)

I did the test using iperf in a 10G environment; the numbers are below:
            original               |       modified
thread_num  tp(Gbps)   vhost(%)    |  tp(Gbps)   vhost(%)
1           9.59       28.82       |  9.59       27.49
8           9.61       32.92       |  9.62       26.77
64          9.58       46.48       |  9.55       38.99
256         9.6        63.7        |  9.6        52.59

The vhost CPU cost is reduced while the throughput is almost unchanged.
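
As background on why issuing the wake-up after dropping work_lock helps: the
worker takes the same lock as soon as it runs. Below is a simplified sketch of a
3.0-era style worker loop (flush/sequence handling omitted, so details differ from
the actual vhost_worker()):

static int vhost_worker(void *data)
{
        struct vhost_dev *dev = data;
        struct vhost_work *work;

        for (;;) {
                /* Arm the sleep before checking for queued work. */
                set_current_state(TASK_INTERRUPTIBLE);

                if (kthread_should_stop()) {
                        __set_current_state(TASK_RUNNING);
                        break;
                }

                /* If the producer still held work_lock across the wake-up,
                 * a freshly woken worker would spin right here. */
                spin_lock_irq(&dev->work_lock);
                if (!list_empty(&dev->work_list)) {
                        work = list_first_entry(&dev->work_list,
                                                struct vhost_work, node);
                        list_del_init(&work->node);
                } else
                        work = NULL;
                spin_unlock_irq(&dev->work_lock);

                if (work) {
                        __set_current_state(TASK_RUNNING);
                        work->fn(work);         /* run the queued callback */
                } else
                        schedule();             /* nothing queued, sleep */
        }
        return 0;
}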

On 05/20/2013 11:06 AM, Qinchuanyu wrote:
> Right now the call to wake_up_process() is made inside the spin_lock/unlock pair,
> but it can be done outside the lock.
> I tested this with kernel 3.0.27 and a suse11-sp2 guest; it gives a 2%-3% net
> performance improvement.
>
> Signed-off-by: Chuanyu Qin <qinchua...@huawei.com>

Makes sense to me, but you need to generate a patch against net-next.git or
vhost.git on git.kernel.org.

Btw, how did you test this? Care to share the perf numbers?

Thanks
> --- a/drivers/vhost/vhost.c 2013-05-20 10:36:30.0 +0800
> +++ b/drivers/vhost/vhost.c 2013-05-20 10:36:54.0 +0800
> @@ -144,9 +144,10 @@
>  	if (list_empty(&work->node)) {
>  		list_add_tail(&work->node, &dev->work_list);
>  		work->queue_seq++;
> +		spin_unlock_irqrestore(&dev->work_lock, flags);
>  		wake_up_process(dev->worker);
> -	}
> -	spin_unlock_irqrestore(&dev->work_lock, flags);
> +	} else
> +		spin_unlock_irqrestore(&dev->work_lock, flags);
>  }
>
>  void vhost_poll_queue(struct vhost_poll *poll)