Re: [RFC PATCH 1/4] Add a new API to virtio-pci
On Mon, Sep 13, 2010 at 07:00:51PM +0200, Avi Kivity wrote:
> On 09/13/2010 06:30 PM, Michael S. Tsirkin wrote:
> > > Trouble is, each vhost-net device is associated with 1 tun/tap
> > > device, which means that each vhost-net device is associated with a
> > > transmit and receive queue. I don't know if you'll always have an
> > > equal number of transmit and receive queues, but there's certainly a
> > > challenge in terms of flexibility with this model.
> > >
> > > Regards,
> > >
> > > Anthony Liguori
> >
> > Not really, TX and RX can be mapped to different devices, or you can
> > only map one of these. What is the trouble?
>
> Suppose you have one multiqueue-capable ethernet card. How can you
> connect it to multiple rx/tx queues? tx is in principle doable, but
> what about rx?
>
> What does "only map one of these" mean? Connect the device with one
> queue (presumably rx), and terminate the others? Will packet
> classification work (does the current multiqueue proposal support it)?

This is a non-trivial problem, but it needs to be handled in tap, not in
vhost-net. If tap gives you multiple queues, vhost-net will happily let
you connect vqs to these.

> --
> error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 1/4] Add a new API to virtio-pci
On Mon, Sep 13, 2010 at 12:40:11PM -0500, Anthony Liguori wrote:
> On 09/13/2010 11:30 AM, Michael S. Tsirkin wrote:
> > On Mon, Sep 13, 2010 at 10:59:34AM -0500, Anthony Liguori wrote:
> > > On 09/13/2010 04:04 AM, Michael S. Tsirkin wrote:
> > > > On Mon, Sep 13, 2010 at 09:50:42AM +0530, Krishna Kumar2 wrote:
> > > > > Michael S. Tsirkin m...@redhat.com wrote on 09/12/2010 05:16:37 PM:
> > > > > > On Thu, Sep 09, 2010 at 07:19:33PM +0530, Krishna Kumar2 wrote:
> > > > > > > Unfortunately I need a constant in vhost for now.
> > > > > > Maybe not even that: you create multiple vhost-net devices, so
> > > > > > vhost-net in kernel does not care about these either, right?
> > > > > > So this can be just part of vhost_net.h in qemu.
> > > > > Sorry, I didn't understand what you meant. I can remove all
> > > > > socks[] arrays/constants by pre-allocating sockets in
> > > > > vhost_setup_vqs. Then I can remove all socks parameters in
> > > > > vhost_net_stop, vhost_net_release and vhost_net_reset_owner.
> > > > > Does this make sense?
> > > > >
> > > > > Thanks, - KK
> > > > Here's what I mean: each vhost device includes 1 TX and 1 RX VQ.
> > > > Instead of teaching vhost about multiqueue, we could simply open
> > > > /dev/vhost-net multiple times. How many times would be up to qemu.
> > > Trouble is, each vhost-net device is associated with 1 tun/tap
> > > device, which means that each vhost-net device is associated with a
> > > transmit and receive queue. I don't know if you'll always have an
> > > equal number of transmit and receive queues, but there's certainly
> > > a challenge in terms of flexibility with this model.
> > >
> > > Regards,
> > >
> > > Anthony Liguori
> > Not really, TX and RX can be mapped to different devices,
> It's just a little odd. Would you bond multiple tun/tap devices to
> achieve multi-queue TX? For RX, do you somehow limit RX to only one of
> those devices?

Exactly in the way the patches we discuss here do this: we already have
a per-queue fd.

> If we were doing this in QEMU (and btw, there need to be userspace
> patches before we implement this on the kernel side),

I agree that feature parity is nice to have, but I don't see a huge
problem with (hopefully temporarily) only supporting feature X with
kernel acceleration, BTW. This is already the case with checksum
offloading features.

> I think it would make more sense to just rely on doing a multithreaded
> write to a single tun/tap device and then to hope that it can be made
> smarter at the macvtap layer.

No, an fd serializes access, so you need separate fds for multithreaded
writes to work. Think about how e.g. select will work.

> Regards,
>
> Anthony Liguori
> > or you can only map one of these. What is the trouble?
> > What other features would you desire in terms of flexibility?
Re: [RFC PATCH 1/4] Add a new API to virtio-pci
On Mon, Sep 13, 2010 at 09:50:42AM +0530, Krishna Kumar2 wrote:
> Michael S. Tsirkin m...@redhat.com wrote on 09/12/2010 05:16:37 PM:
> > On Thu, Sep 09, 2010 at 07:19:33PM +0530, Krishna Kumar2 wrote:
> > > Unfortunately I need a constant in vhost for now.
> > Maybe not even that: you create multiple vhost-net devices, so
> > vhost-net in kernel does not care about these either, right? So this
> > can be just part of vhost_net.h in qemu.
> Sorry, I didn't understand what you meant. I can remove all socks[]
> arrays/constants by pre-allocating sockets in vhost_setup_vqs. Then I
> can remove all socks parameters in vhost_net_stop, vhost_net_release
> and vhost_net_reset_owner. Does this make sense?
>
> Thanks, - KK

Here's what I mean: each vhost device includes 1 TX and 1 RX VQ.
Instead of teaching vhost about multiqueue, we could simply open
/dev/vhost-net multiple times. How many times would be up to qemu.

--
MST
Re: [RFC PATCH 1/4] Add a new API to virtio-pci
On 09/13/2010 04:04 AM, Michael S. Tsirkin wrote:
> On Mon, Sep 13, 2010 at 09:50:42AM +0530, Krishna Kumar2 wrote:
> > Michael S. Tsirkin m...@redhat.com wrote on 09/12/2010 05:16:37 PM:
> > > On Thu, Sep 09, 2010 at 07:19:33PM +0530, Krishna Kumar2 wrote:
> > > > Unfortunately I need a constant in vhost for now.
> > > Maybe not even that: you create multiple vhost-net devices, so
> > > vhost-net in kernel does not care about these either, right? So
> > > this can be just part of vhost_net.h in qemu.
> > Sorry, I didn't understand what you meant. I can remove all socks[]
> > arrays/constants by pre-allocating sockets in vhost_setup_vqs. Then I
> > can remove all socks parameters in vhost_net_stop, vhost_net_release
> > and vhost_net_reset_owner. Does this make sense?
> >
> > Thanks, - KK
> Here's what I mean: each vhost device includes 1 TX and 1 RX VQ.
> Instead of teaching vhost about multiqueue, we could simply open
> /dev/vhost-net multiple times. How many times would be up to qemu.

Trouble is, each vhost-net device is associated with 1 tun/tap device,
which means that each vhost-net device is associated with a transmit and
receive queue. I don't know if you'll always have an equal number of
transmit and receive queues, but there's certainly a challenge in terms
of flexibility with this model.

Regards,

Anthony Liguori
Re: [RFC PATCH 1/4] Add a new API to virtio-pci
On Mon, Sep 13, 2010 at 10:59:34AM -0500, Anthony Liguori wrote:
> On 09/13/2010 04:04 AM, Michael S. Tsirkin wrote:
> > On Mon, Sep 13, 2010 at 09:50:42AM +0530, Krishna Kumar2 wrote:
> > > Michael S. Tsirkin m...@redhat.com wrote on 09/12/2010 05:16:37 PM:
> > > > On Thu, Sep 09, 2010 at 07:19:33PM +0530, Krishna Kumar2 wrote:
> > > > > Unfortunately I need a constant in vhost for now.
> > > > Maybe not even that: you create multiple vhost-net devices, so
> > > > vhost-net in kernel does not care about these either, right? So
> > > > this can be just part of vhost_net.h in qemu.
> > > Sorry, I didn't understand what you meant. I can remove all socks[]
> > > arrays/constants by pre-allocating sockets in vhost_setup_vqs. Then
> > > I can remove all socks parameters in vhost_net_stop,
> > > vhost_net_release and vhost_net_reset_owner. Does this make sense?
> > >
> > > Thanks, - KK
> > Here's what I mean: each vhost device includes 1 TX and 1 RX VQ.
> > Instead of teaching vhost about multiqueue, we could simply open
> > /dev/vhost-net multiple times. How many times would be up to qemu.
> Trouble is, each vhost-net device is associated with 1 tun/tap device,
> which means that each vhost-net device is associated with a transmit
> and receive queue. I don't know if you'll always have an equal number
> of transmit and receive queues, but there's certainly a challenge in
> terms of flexibility with this model.
>
> Regards,
>
> Anthony Liguori

Not really, TX and RX can be mapped to different devices, or you can
only map one of these. What is the trouble?

What other features would you desire in terms of flexibility?

--
MST
Re: [RFC PATCH 1/4] Add a new API to virtio-pci
On 09/13/2010 11:30 AM, Michael S. Tsirkin wrote:
> On Mon, Sep 13, 2010 at 10:59:34AM -0500, Anthony Liguori wrote:
> > On 09/13/2010 04:04 AM, Michael S. Tsirkin wrote:
> > > On Mon, Sep 13, 2010 at 09:50:42AM +0530, Krishna Kumar2 wrote:
> > > > Michael S. Tsirkin m...@redhat.com wrote on 09/12/2010 05:16:37 PM:
> > > > > On Thu, Sep 09, 2010 at 07:19:33PM +0530, Krishna Kumar2 wrote:
> > > > > > Unfortunately I need a constant in vhost for now.
> > > > > Maybe not even that: you create multiple vhost-net devices, so
> > > > > vhost-net in kernel does not care about these either, right? So
> > > > > this can be just part of vhost_net.h in qemu.
> > > > Sorry, I didn't understand what you meant. I can remove all
> > > > socks[] arrays/constants by pre-allocating sockets in
> > > > vhost_setup_vqs. Then I can remove all socks parameters in
> > > > vhost_net_stop, vhost_net_release and vhost_net_reset_owner.
> > > > Does this make sense?
> > > >
> > > > Thanks, - KK
> > > Here's what I mean: each vhost device includes 1 TX and 1 RX VQ.
> > > Instead of teaching vhost about multiqueue, we could simply open
> > > /dev/vhost-net multiple times. How many times would be up to qemu.
> > Trouble is, each vhost-net device is associated with 1 tun/tap
> > device, which means that each vhost-net device is associated with a
> > transmit and receive queue. I don't know if you'll always have an
> > equal number of transmit and receive queues, but there's certainly a
> > challenge in terms of flexibility with this model.
> >
> > Regards,
> >
> > Anthony Liguori
> Not really, TX and RX can be mapped to different devices,

It's just a little odd. Would you bond multiple tun/tap devices to
achieve multi-queue TX? For RX, do you somehow limit RX to only one of
those devices?

If we were doing this in QEMU (and btw, there need to be userspace
patches before we implement this on the kernel side), I think it would
make more sense to just rely on doing a multithreaded write to a single
tun/tap device and then to hope that it can be made smarter at the
macvtap layer.

Regards,

Anthony Liguori

> or you can only map one of these. What is the trouble?
> What other features would you desire in terms of flexibility?
Re: [RFC PATCH 1/4] Add a new API to virtio-pci
Michael S. Tsirkin m...@redhat.com wrote on 09/12/2010 05:16:37 PM:
> On Thu, Sep 09, 2010 at 07:19:33PM +0530, Krishna Kumar2 wrote:
> > Unfortunately I need a constant in vhost for now.
> Maybe not even that: you create multiple vhost-net devices, so
> vhost-net in kernel does not care about these either, right? So this
> can be just part of vhost_net.h in qemu.

Sorry, I didn't understand what you meant. I can remove all socks[]
arrays/constants by pre-allocating sockets in vhost_setup_vqs. Then I
can remove all socks parameters in vhost_net_stop, vhost_net_release
and vhost_net_reset_owner. Does this make sense?

Thanks, - KK
Re: [RFC PATCH 1/4] Add a new API to virtio-pci
Rusty Russell ru...@rustcorp.com.au wrote on 09/09/2010 05:44:25 PM:
> > > This seems a bit weird. I mean, the driver used
> > > vdev->config->find_vqs to find the queues, which returns them (in
> > > order). So, can't you put this into your struct send_queue?
> > I am saving the vqs in the send_queue, but the cb needs to locate the
> > device txq from the svq. The only other way I could think of is to
> > iterate through the send_queues and compare svq against sq[i]->svq,
> > but cbs happen quite a bit. Is there a better way?
> Ah, good point. Move the queue index into the struct virtqueue?

Is it OK to move the queue_index from virtio_pci_vq_info to virtqueue?
I didn't want to change any data structures in virtio for this patch,
but I can do it either way.

> > > Also, why define VIRTIO_MAX_TXQS? If the driver can't handle all of
> > > them, it should simply not use them...
> > The main reason was vhost :) Since vhost_net_release should not fail
> > (__fput can't handle f_op->release() failure), I needed a maximum
> > number of socks to clean up:
> Ah, then it belongs in the vhost headers. The guest shouldn't see such
> a restriction if it doesn't apply; it's a host thing.
>
> Oh, and I think you could profitably use virtio_config_val(), too.

OK, I will make those changes. Thanks for the reference to
virtio_config_val(), I will use it in guest probe instead of the
cumbersome way I am doing now. Unfortunately I need a constant in vhost
for now.

Thanks, - KK
Re: [RFC PATCH 1/4] Add a new API to virtio-pci
On Thu, 9 Sep 2010 11:19:33 pm Krishna Kumar2 wrote:
> Rusty Russell ru...@rustcorp.com.au wrote on 09/09/2010 05:44:25 PM:
> > Ah, good point. Move the queue index into the struct virtqueue?
> Is it OK to move the queue_index from virtio_pci_vq_info to virtqueue?
> I didn't want to change any data structures in virtio for this patch,
> but I can do it either way.

Yep, it's logical to me.

Thanks!
Rusty.
Re: [RFC PATCH 1/4] Add a new API to virtio-pci
On Wed, 8 Sep 2010 04:59:05 pm Krishna Kumar wrote:
> Add virtio_get_queue_index() to get the queue index of a vq. This is
> needed by the cb handler to locate the queue that should be processed.

This seems a bit weird. I mean, the driver used vdev->config->find_vqs
to find the queues, which returns them (in order). So, can't you put
this into your struct send_queue?

Also, why define VIRTIO_MAX_TXQS? If the driver can't handle all of
them, it should simply not use them...

Thanks!
Rusty.
Re: [RFC PATCH 1/4] Add a new API to virtio-pci
Rusty Russell ru...@rustcorp.com.au wrote on 09/09/2010 09:19:39 AM:
> On Wed, 8 Sep 2010 04:59:05 pm Krishna Kumar wrote:
> > Add virtio_get_queue_index() to get the queue index of a vq. This is
> > needed by the cb handler to locate the queue that should be
> > processed.
> This seems a bit weird. I mean, the driver used vdev->config->find_vqs
> to find the queues, which returns them (in order). So, can't you put
> this into your struct send_queue?

I am saving the vqs in the send_queue, but the cb needs to locate the
device txq from the svq. The only other way I could think of is to
iterate through the send_queues and compare svq against sq[i]->svq, but
cbs happen quite a bit. Is there a better way?

static void skb_xmit_done(struct virtqueue *svq)
{
	struct virtnet_info *vi = svq->vdev->priv;
	int qnum = virtio_get_queue_index(svq) - 1;	/* 0 is RX vq */

	/* Suppress further interrupts. */
	virtqueue_disable_cb(svq);

	/* We were probably waiting for more output buffers. */
	netif_wake_subqueue(vi->dev, qnum);
}

> Also, why define VIRTIO_MAX_TXQS? If the driver can't handle all of
> them, it should simply not use them...

The main reason was vhost :) Since vhost_net_release should not fail
(__fput can't handle f_op->release() failure), I needed a maximum
number of socks to clean up:

#define MAX_VQS		(1 + VIRTIO_MAX_TXQS)

static int vhost_net_release(struct inode *inode, struct file *f)
{
	struct vhost_net *n = f->private_data;
	struct vhost_dev *dev = &n->dev;
	struct socket *socks[MAX_VQS];
	int i;

	vhost_net_stop(n, socks);
	vhost_net_flush(n);
	vhost_dev_cleanup(dev);
	for (i = n->dev.nvqs - 1; i >= 0; i--)
		if (socks[i])
			fput(socks[i]->file);
	...
}

Thanks, - KK