[kvm-devel] [Virtio-for-kvm] [PATCH 7/7] userspace virtio

2007-12-21 Thread Dor Laor
 From f244bcad756c4f761627557bb7f315b1d8f22fb2 Mon Sep 17 00:00:00 2001
From: Dor Laor <[EMAIL PROTECTED]>
Date: Thu, 20 Dec 2007 13:26:30 +0200
Subject: [PATCH] [VIRTIO-NET] Rx performance improvement
 The current performance is not good enough; the problem lies
 in qemu's tap handling code, which passes packets one at
 a time and also copies them into a temporary buffer.

This patch prevents qemu handlers from reading the tap and instead
selects on the tap descriptors of the virtio devices.
This eliminates copies and also batches guest notifications (interrupts).

Using this patch the rx performance reaches 800Mbps.
The patch does not follow qemu's API, since the intention is first to get
better I/O in kvm and then to polish it properly.

Signed-off-by: Dor Laor <[EMAIL PROTECTED]>
---
 qemu/hw/pc.h |2 +-
 qemu/hw/virtio-net.c |  114 +++--
 qemu/sysemu.h|3 +
 qemu/vl.c|   23 ++-
 4 files changed, 107 insertions(+), 35 deletions(-)

diff --git a/qemu/hw/pc.h b/qemu/hw/pc.h
index 95471f3..5d4c747 100644
--- a/qemu/hw/pc.h
+++ b/qemu/hw/pc.h
@@ -145,7 +145,7 @@ void isa_ne2000_init(int base, qemu_irq irq, NICInfo *nd);
 /* virtio-net.c */
 
 void *virtio_net_init(PCIBus *bus, NICInfo *nd, int devfn);
-
+void virtio_net_poll(void);
 
 /* virtio-blk.h */
 void *virtio_blk_init(PCIBus *bus, uint16_t vendor, uint16_t device,
diff --git a/qemu/hw/virtio-net.c b/qemu/hw/virtio-net.c
index f6f1f28..b955a5e 100644
--- a/qemu/hw/virtio-net.c
+++ b/qemu/hw/virtio-net.c
@@ -60,8 +60,13 @@ typedef struct VirtIONet
 VirtQueue *tx_vq;
 VLANClientState *vc;
 int can_receive;
+int tap_fd;
+struct VirtIONet *next;
+int do_notify;
 } VirtIONet;
 
+static VirtIONet *VirtIONetHead = NULL;
+
 static VirtIONet *to_virtio_net(VirtIODevice *vdev)
 {
 return (VirtIONet *)vdev;
@@ -96,42 +101,81 @@ static int virtio_net_can_receive(void *opaque)
 return (n->vdev.status & VIRTIO_CONFIG_S_DRIVER_OK) && n->can_receive;
 }
 
-static void virtio_net_receive(void *opaque, const uint8_t *buf, int size)
+void virtio_net_poll(void)
 {
-VirtIONet *n = opaque;
+VirtIONet *vnet;
+int len;
+fd_set rfds;
+struct timeval tv;
+int max_fd = -1;
 VirtQueueElement elem;
 struct virtio_net_hdr *hdr;
-int offset, i;
-
-/* FIXME: the drivers really need to set their status better */
-if (n->rx_vq->vring.avail == NULL) {
-n->can_receive = 0;
-return;
-}
-
-if (virtqueue_pop(n->rx_vq, &elem) == 0) {
-/* wait until the guest adds some rx bufs */
-n->can_receive = 0;
-return;
-}
-
-hdr = (void *)elem.in_sg[0].iov_base;
-hdr->flags = 0;
-hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE;
-
-/* copy in packet.  ugh */
-offset = 0;
-i = 1;
-while (offset < size && i < elem.in_num) {
-int len = MIN(elem.in_sg[i].iov_len, size - offset);
-memcpy(elem.in_sg[i].iov_base, buf + offset, len);
-offset += len;
-i++;
-}
+int did_notify;
+
+FD_ZERO(&rfds);
+tv.tv_sec = 0;
+tv.tv_usec = 0;
+
+    while (1) {
+
+        /* Prepare the set of devices to select from */
+        for (vnet = VirtIONetHead; vnet; vnet = vnet->next) {
+
+            vnet->do_notify = 0;
+            /* first check that the driver is ok */
+            if (!virtio_net_can_receive(vnet))
+                continue;
+
+            /* FIXME: the drivers really need to set their status better */
+            if (vnet->rx_vq->vring.avail == NULL) {
+                vnet->can_receive = 0;
+                continue;
+            }
+
+            FD_SET(vnet->tap_fd, &rfds);
+            if (max_fd < vnet->tap_fd)
+                max_fd = vnet->tap_fd;
+        }
+
+        if (select(max_fd + 1, &rfds, NULL, NULL, &tv) <= 0)
+            break;
+
+        /* Now check who has data pending in the tap */
+        for (vnet = VirtIONetHead; vnet; vnet = vnet->next) {
+
+            if (!FD_ISSET(vnet->tap_fd, &rfds))
+                continue;
+
+            if (virtqueue_pop(vnet->rx_vq, &elem) == 0) {
+                vnet->can_receive = 0;
+                continue;
+            }
+
+            hdr = (void *)elem.in_sg[0].iov_base;
+            hdr->flags = 0;
+            hdr->gso_type = VIRTIO_NET_HDR_GSO_NONE;
+again:
+            len = readv(vnet->tap_fd, &elem.in_sg[1], elem.in_num - 1);
+            if (len == -1) {
+                if (errno == EINTR || errno == EAGAIN)
+                    goto again;
+                fprintf(stderr, "reading network error %d\n", errno);
+                len = 0;
+            }
+
+            virtqueue_push(vnet->rx_vq, &elem, sizeof(*hdr) + len);
+            vnet->do_notify = 1;
+        }
+
+        /* signal the other side */
+        did_notify = 0;
+        for (vnet = VirtIONetHead; vnet; vnet = vnet->next)
+            if (vnet->do_notify) {
+                virtio_notify(&vnet->vdev, vnet->rx_vq);
+                did_notify++;

Re: [kvm-devel] [Virtio-for-kvm] [PATCH 7/7] userspace virtio

2007-12-23 Thread Avi Kivity
Dor Laor wrote:
>  From f244bcad756c4f761627557bb7f315b1d8f22fb2 Mon Sep 17 00:00:00 2001
> From: Dor Laor <[EMAIL PROTECTED]>
> Date: Thu, 20 Dec 2007 13:26:30 +0200
> Subject: [PATCH] [VIRTIO-NET] Rx performance improvement
>  The current performance is not good enough; the problem lies
>  in qemu's tap handling code, which passes packets one at
>  a time and also copies them into a temporary buffer.
>
> This patch prevents qemu handlers from reading the tap and instead
> selects on the tap descriptors of the virtio devices.
> This eliminates copies and also batches guest notifications (interrupts).
>
> Using this patch the rx performance reaches 800Mbps.
> The patch does not follow qemu's API, since the intention is first to get
> better I/O in kvm and then to polish it properly.
>
>   

This breaks -net user, which is one of the motivations for having a qemu 
net device.  We need to maintain the old path for that, and only use the 
new fast path if using -net tap.


-- 
error compiling committee.c: too many arguments to function


-
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [Virtio-for-kvm] [PATCH 7/7] userspace virtio

2008-01-02 Thread Anthony Liguori
I think we should hold off on this sort of patch at first.  I know it 
improves performance, but it's very hack-ish.  I have a similar patch[1] 
that improves performance more but is even more hack-ish.

I think we have to approach this not by special-casing virtio-net to know 
about the tap fd, but by figuring out the interface that virtio-net would 
need to be efficient, and then refactoring the net interface to look like 
that.  Then we can still support user, pcap, and the other network 
transports.

[1] http://hg.codemonkey.ws/qemu-virtio/file/75cefe566cea/aio-net.diff

Regards,

Anthony Liguori

Dor Laor wrote:
> From f244bcad756c4f761627557bb7f315b1d8f22fb2 Mon Sep 17 00:00:00 2001
> From: Dor Laor <[EMAIL PROTECTED]>
> Date: Thu, 20 Dec 2007 13:26:30 +0200
> Subject: [PATCH] [VIRTIO-NET] Rx performance improvement
> [full patch quoted above snipped]

Re: [kvm-devel] [Virtio-for-kvm] [PATCH 7/7] userspace virtio

2008-01-02 Thread Avi Kivity
Anthony Liguori wrote:
> I think we should hold off on this sort of patch at first.  I know it 
> improves performance, but it's very hack-ish.  I have a similar patch[1] 
> that improves performance more but is even more hack-ish.
>
> I think we have to approach this by not special cases virtio-net to know 
> about the tap fd, but to figure out the interface that virtio-net would 
> need to be efficient, and then refactor the net interface to look like 
> that.  Then we can still support user, pcap, and the other network 
> transports.
>
> [1] http://hg.codemonkey.ws/qemu-virtio/file/75cefe566cea/aio-net.diff
>   

While you are right in principle, high-performance networking is long 
overdue in kvm, so I applied the patch.  Once a mega async dma framework 
is added to qemu, we'll just revert this patch before adding the glue 
to said framework.

-- 
error compiling committee.c: too many arguments to function




Re: [kvm-devel] [Virtio-for-kvm] [PATCH 7/7] userspace virtio

2008-01-02 Thread Dor Laor
Avi Kivity wrote:
> Anthony Liguori wrote:
>> I think we should hold off on this sort of patch at first.  I know it 
>> improves performance, but it's very hack-ish.  I have a similar 
>> patch[1] that improves performance more but is even more hack-ish.
>>
>> I think we have to approach this not by special-casing virtio-net to 
>> know about the tap fd, but by figuring out the interface that 
>> virtio-net would need to be efficient, and then refactor the net 
>> interface to look like that.  Then we can still support user, pcap, 
>> and the other network transports.
>>
>> [1] http://hg.codemonkey.ws/qemu-virtio/file/75cefe566cea/aio-net.diff
>>   
>
> While you are right in principle, high performance networking is long 
> overdue in kvm so I applied that patch.  Once a mega async dma 
> framework is added to qemu, we'll just revert that patch prior to 
> adding the glue to said framework.
>
I second Avi. The reason we wanted the quick (and ugly) performance patch 
merged is that people are trying kvm and are sometimes
disappointed by its I/O performance. Since we have had unmerged pv net 
for quite a while, we're pushing towards merging it.
We can't wait for the qemu guys to merge it, and we also plan to develop 
the host-kernel pv network side.

btw: my hack keeps -net user support, but at the previous performance.
I'll add the tx improvement too.

Thanks for the backward compat patch :)
Dor
