[PATCH] vhost: Add polling mode

2014-08-10 Thread Razya Ladelsky
From: Razya Ladelsky ra...@il.ibm.com
Date: Thu, 31 Jul 2014 09:47:20 +0300
Subject: [PATCH] vhost: Add polling mode

When vhost is waiting for buffers from the guest driver (e.g., more packets to
send in vhost-net's transmit queue), it normally goes to sleep and waits for the
guest to kick it. This kick involves a PIO in the guest, and therefore an exit
(and possibly userspace involvement in translating this PIO exit into a file
descriptor event), all of which hurt performance.

If the system is under-utilized (has cpu time to spare), vhost can continuously
poll the virtqueues for new buffers, and avoid asking the guest to kick us.
This patch adds an optional polling mode to vhost, which can be enabled via the
kernel module parameter poll_start_rate.
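
For reference, a rough sketch of how the two module parameters might be declared
(illustrative only, not the actual hunk from this patch; the defaults and
descriptions below are assumptions based on the text above):

#include <linux/module.h>
#include <linux/jiffies.h>

/* Work rate above which a virtqueue is switched to polling (0 = never poll). */
static int poll_start_rate = 0;
module_param(poll_start_rate, int, 0644);
MODULE_PARM_DESC(poll_start_rate, "Start polling a virtqueue once its work rate exceeds this threshold (0 disables polling)");

/* How long (in jiffies) polling may find no work before we stop polling. */
static int poll_stop_idle = 3 * HZ;	/* the 3-second default mentioned above */
module_param(poll_stop_idle, int, 0644);
MODULE_PARM_DESC(poll_stop_idle, "Stop polling a virtqueue after this many idle jiffies");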

When polling is active for a virtqueue, the guest is asked to disable
notification (kicks), and the worker thread continuously checks for new buffers.
When it does discover new buffers, it simulates a kick by invoking the
underlying backend driver (such as vhost-net), which thinks it got a real kick
from the guest, and acts accordingly. If the underlying driver asks not to be
kicked, we disable polling on this virtqueue.

We start polling on a virtqueue when we notice it has work to do. Polling on
this virtqueue is later disabled after 3 seconds in which polling turns up no
new work, since at that point we are better off returning to the exit-based
notification mechanism. The default timeout of 3 seconds can be changed with
the poll_stop_idle kernel module parameter.
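
To illustrate the flow just described, here is a simplified, self-contained
sketch of the per-virtqueue polling decision (not the patch's actual code;
vq_has_new_buffers(), simulate_guest_kick() and enable_guest_notification() are
hypothetical stand-ins for the real vhost internals):

#include <stdbool.h>
#include <time.h>

struct polled_vq {
	bool polling;		/* is this virtqueue currently being polled? */
	time_t last_work;	/* last time polling found new buffers */
};

/* Hypothetical helpers standing in for the real vhost/backend calls. */
bool vq_has_new_buffers(struct polled_vq *vq);
void simulate_guest_kick(struct polled_vq *vq);		/* run the backend handler */
void enable_guest_notification(struct polled_vq *vq);	/* re-enable guest kicks */

#define POLL_STOP_IDLE_SECS 3	/* default poll_stop_idle, per the text above */

/* One pass of the worker thread over a virtqueue that is in polling mode. */
static void poll_pass(struct polled_vq *vq)
{
	if (!vq->polling)
		return;

	if (vq_has_new_buffers(vq)) {
		/* Pretend the guest kicked us: the backend (e.g. vhost-net)
		 * handles the new buffers exactly as after a real kick. */
		simulate_guest_kick(vq);
		vq->last_work = time(NULL);
	} else if (time(NULL) - vq->last_work > POLL_STOP_IDLE_SECS) {
		/* No new work for too long: fall back to the exit-based
		 * notification mechanism instead of burning cpu. */
		enable_guest_notification(vq);
		vq->polling = false;
	}
}

In the patch itself this logic is driven by the vhost worker thread, as
described above.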

This polling approach makes a lot of sense for new hardware with posted
interrupts, which give us exitless host-to-guest notifications. But even with
support for posted interrupts, guest-to-host communication still causes exits.
Polling adds the missing part.

When systems are overloaded, there won't be enough cpu time for the various
vhost threads to poll their guests' devices. For these scenarios, we plan to add
support for vhost threads that can be shared by multiple devices, even ones
belonging to different vms.
Our ultimate goal is to implement the I/O acceleration features described in:
KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon)
https://www.youtube.com/watch?v=9EyweibHfEs
and
https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html

I ran some experiments with the TCP stream netperf benchmark and with filebench
(2 threads performing random reads) on an IBM System x3650 M4.
I have two machines, A and B. A hosts the vms, B runs the netserver.
The vms (on A) run netperf; the destination netserver runs on B.
All runs loaded the guests to the point of (cpu) saturation. For example, I ran
netperf with 64B messages, which heavily loads the vm (which is why its
throughput is low).
The idea was to get it 100% loaded, so we could see whether polling gets it to
produce higher throughput.

The system had two cores per guest, so as to allow both the vcpu and the vhost
thread to run concurrently for maximum throughput (but I didn't pin the threads
to specific cores).
My experiments were fair in the sense that in both cases, with or without
polling, I ran both threads, vcpu and vhost, on 2 cores (I set their affinity
that way). The only difference was whether polling was enabled or disabled.

Results:

Netperf, 1 vm:
The polling patch improved throughput by ~33% (1516 MB/sec -> 2046 MB/sec).
Number of exits/sec decreased 6x.
The same improvement was shown when I tested with 3 vms running netperf
(4086 MB/sec -> 5545 MB/sec).

filebench, 1 vm:
ops/sec improved by 13% with the polling patch. Number of exits was reduced by
31%.
The same experiment with 3 vms running filebench showed similar numbers.

Signed-off-by: Razya Ladelsky ra...@il.ibm.com
---
 drivers/vhost/net.c   |6 +-
 drivers/vhost/scsi.c  |6 +-
 drivers/vhost/vhost.c |  245 +++--
 drivers/vhost/vhost.h |   38 +++-
 4 files changed, 277 insertions(+), 18 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 971a760..558aecb 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -742,8 +742,10 @@ static int vhost_net_open(struct inode *inode, struct file *f)
 	}
 	vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
 
-	vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
-	vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
+	vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT,
+			vqs[VHOST_NET_VQ_TX]);
+	vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN,
+			vqs[VHOST_NET_VQ_RX]);
 
 	f->private_data = n;
 
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 4f4ffa4..665eeeb 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1528,9 +1528,9 @@ static int vhost_scsi_open(struct inode *inode, struct file *f)
 	if (!vqs)
 		goto err_vqs;
 
-   

RFC: [PATCH v1] KVM: Use trace_printk() for vcpu_unimpl() for performance reasons

2014-08-10 Thread Ilari Stenroth

vcpu_unimpl() is called to notify, for example, about unhandled wrmsr
requests made by KVM guests. It used to call printk(), but in certain setups
printk() can have a severe performance impact; replacing it with trace_printk(),
which is guaranteed to be buffered, avoids this problem.
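
To illustrate the effect, here is a hypothetical caller (the real call sites
live in the arch code, e.g. the WRMSR emulation path); with CONFIG_TRACING
enabled, the message now ends up in the ftrace ring buffer (readable via
/sys/kernel/debug/tracing/trace) instead of going through printk():

#include <linux/kvm_host.h>

/* Hypothetical example of a vcpu_unimpl() caller; arguments are illustrative. */
static void report_unhandled_wrmsr(struct kvm_vcpu *vcpu, u32 msr, u64 data)
{
	/* Expands to kvm_pr_unimpl(), which with this patch and CONFIG_TRACING
	 * becomes a buffered trace_printk() rather than a printk(). */
	vcpu_unimpl(vcpu, "unhandled wrmsr: 0x%x data 0x%llx\n", msr, data);
}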

Signed-off-by: Ilari Stenroth ilari.stenr...@gmail.com
---
 include/linux/kvm_host.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a4c33b3..b79ce59 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -24,6 +24,8 @@
 #include <linux/err.h>
 #include <linux/irqflags.h>
 #include <linux/context_tracking.h>
+#include <linux/kernel.h>
+#include <linux/kern_levels.h>
 #include <asm/signal.h>
 
 #include <linux/kvm.h>
@@ -408,9 +410,15 @@ struct kvm {
 	pr_info("kvm [%i]: " fmt, task_pid_nr(current), ## __VA_ARGS__)
 #define kvm_debug(fmt, ...) \
 	pr_debug("kvm [%i]: " fmt, task_pid_nr(current), ## __VA_ARGS__)
+#ifdef CONFIG_TRACING
+#define kvm_pr_unimpl(fmt, ...) \
+	trace_printk(pr_fmt(KERN_ERR "kvm [%i]: " fmt), \
+		     task_tgid_nr(current), ## __VA_ARGS__)
+#else
 #define kvm_pr_unimpl(fmt, ...) \
 	pr_err_ratelimited("kvm [%i]: " fmt, \
 			   task_tgid_nr(current), ## __VA_ARGS__)
+#endif
 
 /* The guest did something we don't support. */
 #define vcpu_unimpl(vcpu, fmt, ...)	\
--
2.0.4




Re: [PATCH] vhost: Add polling mode

2014-08-10 Thread Michael S. Tsirkin
On Sun, Aug 10, 2014 at 11:30:35AM +0300, Razya Ladelsky wrote:
 I ran some experiments with the TCP stream netperf benchmark and with filebench
 (2 threads performing random reads) on an IBM System x3650 M4.
 I have two machines, A and B. A hosts the vms, B runs the netserver.
 The vms (on A) run netperf; the destination netserver runs on B.
 All runs loaded the guests to the point of (cpu) saturation. For example, I ran
 netperf with 64B messages, which heavily loads the vm (which is why its
 throughput is low).
 The idea was to get it 100% loaded, so we could see whether polling gets it to
 produce higher throughput.

And, did your tests actually produce 100% load on both host CPUs?

Re: [PATCH v3] powerpc/kvm: support to handle sw breakpoint

2014-08-10 Thread Madhavan Srinivasan
On Sunday 03 August 2014 09:21 PM, Segher Boessenkool wrote:
 +/*
 + * KVMPPC_INST_BOOK3S_DEBUG is debug Instruction for supporting Software Breakpoint.
 + * Based on PowerISA v2.07, Instruction with opcode 0s will be treated as illegal
 + * instruction.
 + */
 
 primary opcode 0 instead?
 
ok sure.

 +#define OP_ZERO 0x0
 
 Using 0x0 where you mean 0, making a #define for 0 in the first place...
 This all looks rather silly, doesn't it?
 

I wanted to avoid a bare zero in the case statement, but I can add a
comment explaining it.

 +case OP_ZERO:
 +if ((inst & 0x0000) == KVMPPC_INST_BOOK3S_DEBUG) {
 
 You either shouldn't mask at all here, or the mask is wrong (the primary
 op is the top six bits, not the top eight).

Yes. I guess I don't need to check here. Will resend the patch.
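
For the resend, I'm thinking of something along these lines (primary_opcode()
is just an illustrative helper; I believe asm/disassemble.h already has
get_op() for exactly this):

#include <linux/types.h>

/* Power ISA: the primary opcode is the top six bits of the 32-bit
 * instruction word, so no byte-granular mask is needed. */
static inline unsigned int primary_opcode(u32 inst)
{
	return inst >> 26;
}

so the emulation path can simply switch on primary_opcode(inst) (or
get_op(inst)) and handle case 0 there, without a masked compare against
the full instruction word.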

Thanks for review
Regards
Maddy

 
 Segher
 
