Re: copyless virtio net thoughts?
On Thu, Feb 19, 2009 at 10:06:17PM +1030, Rusty Russell wrote:
> On Thursday 19 February 2009 10:01:42 Simon Horman wrote:
> > On Wed, Feb 18, 2009 at 10:08:00PM +1030, Rusty Russell wrote:
> > > 2) Direct NIC attachment
> > > This is particularly interesting with SR-IOV or other multiqueue
> > > nics, but for boutique cases or benchmarks, could be for normal
> > > NICs. So far I have some very sketched-out patches: for the
> > > attached nic dev_alloc_skb() gets an skb from the guest (which
> > > supplies them via some kind of AIO interface), and a branch in
> > > netif_receive_skb() which returned it to the guest. This bypasses
> > > all firewalling in the host though; we're basically having the
> > > guest process drive the NIC directly.
> >
> > Can I clarify that the idea with utilising SR-IOV would be to assign
> > virtual functions to guests? That is, something conceptually similar
> > to PCI pass-through in Xen (although I'm not sure that anyone has
> > virtual function pass-through working yet).
>
> Not quite: I think PCI passthrough is the *wrong* way to do it: it
> makes migration complicated (if not impossible), and requires
> emulation or the same NIC on the destination host.
>
> This would be the *host* seeing the virtual functions as multiple
> NICs, then the ability to attach a given NIC directly to a process.
>
> This isn't guest-visible: the kvm process is configured to connect
> directly to a NIC, rather than (say) bridging through the host.

Hi Rusty, Hi Chris,

Thanks for the clarification. I think that the approach that Xen
recommends for migration is to use a bonding device that uses the
pass-through device if present and falls back to a virtual NIC. The
idea that you outline above does sound somewhat cleaner :-)

> > If so, wouldn't this also be useful on machines that have multiple
> > NICs?
>
> Yes, but mainly as a benchmark hack AFAICT :)

Ok, I was under the impression that at least in the Xen world it was
something people actually used. But I could easily be mistaken.

> Hope that clarifies,
> Rusty.

On Thu, Feb 19, 2009 at 03:37:52AM -0800, Chris Wright wrote:
> * Simon Horman (ho...@verge.net.au) wrote:
> > On Wed, Feb 18, 2009 at 10:08:00PM +1030, Rusty Russell wrote:
> > > 2) Direct NIC attachment
> > > This is particularly interesting with SR-IOV or other multiqueue
> > > nics, but for boutique cases or benchmarks, could be for normal
> > > NICs. So far I have some very sketched-out patches: for the
> > > attached nic dev_alloc_skb() gets an skb from the guest (which
> > > supplies them via some kind of AIO interface), and a branch in
> > > netif_receive_skb() which returned it to the guest. This bypasses
> > > all firewalling in the host though; we're basically having the
> > > guest process drive the NIC directly.
> >
> > Can I clarify that the idea with utilising SR-IOV would be to assign
> > virtual functions to guests? That is, something conceptually similar
> > to PCI pass-through in Xen (although I'm not sure that anyone has
> > virtual function pass-through working yet). If so, wouldn't this
> > also be useful on machines that have multiple NICs?
>
> This would be the typical usecase for sr-iov. But I think Rusty is
> referring to giving a nic "directly" to a guest but the guest is still
> seeing a virtio nic (not pass-through/device-assignment). So there's
> no bridge, and zero copy so the dma buffers are supplied by guest,
> but host has the driver for the physical nic or the VF.

--
Simon Horman
VA Linux Systems Japan K.K., Sydney, Australia Satellite Office
H: www.vergenet.net/~horms/  W: www.valinux.co.jp/en
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: copyless virtio net thoughts?
On Thursday 19 February 2009, Rusty Russell wrote:
> Not quite: I think PCI passthrough is the *wrong* way to do it:
> it makes migration complicated (if not impossible), and requires
> emulation or the same NIC on the destination host.
>
> This would be the *host* seeing the virtual functions as multiple
> NICs, then the ability to attach a given NIC directly to a process.

I guess what you mean then is what Intel calls VMDq, not SR-IOV. Eddie
has some slides about this at
http://docs.huihoo.com/kvm/kvmforum2008/kdf2008_7.pdf .

The latest network cards support both operation modes, and it appears
to me that there is a place for both. VMDq gives you the best
performance without limiting flexibility, while SR-IOV performance in
theory can be even better, but sacrifices a lot of flexibility and
potentially local (guest-to-guest) performance. AFAICT, any card that
supports SR-IOV should also allow a VMDq-like model, as you describe.

	Arnd <><
Re: copyless virtio net thoughts?
* Simon Horman (ho...@verge.net.au) wrote:
> On Wed, Feb 18, 2009 at 10:08:00PM +1030, Rusty Russell wrote:
> > 2) Direct NIC attachment
> > This is particularly interesting with SR-IOV or other multiqueue
> > nics, but for boutique cases or benchmarks, could be for normal
> > NICs. So far I have some very sketched-out patches: for the
> > attached nic dev_alloc_skb() gets an skb from the guest (which
> > supplies them via some kind of AIO interface), and a branch in
> > netif_receive_skb() which returned it to the guest. This bypasses
> > all firewalling in the host though; we're basically having the
> > guest process drive the NIC directly.
>
> Can I clarify that the idea with utilising SR-IOV would be to assign
> virtual functions to guests? That is, something conceptually similar
> to PCI pass-through in Xen (although I'm not sure that anyone has
> virtual function pass-through working yet). If so, wouldn't this also
> be useful on machines that have multiple NICs?

This would be the typical usecase for sr-iov. But I think Rusty is
referring to giving a nic "directly" to a guest but the guest is still
seeing a virtio nic (not pass-through/device-assignment). So there's
no bridge, and zero copy so the dma buffers are supplied by guest,
but host has the driver for the physical nic or the VF.

thanks,
-chris
Re: copyless virtio net thoughts?
On Thursday 19 February 2009 10:01:42 Simon Horman wrote:
> On Wed, Feb 18, 2009 at 10:08:00PM +1030, Rusty Russell wrote:
> > 2) Direct NIC attachment
> > This is particularly interesting with SR-IOV or other multiqueue
> > nics, but for boutique cases or benchmarks, could be for normal
> > NICs. So far I have some very sketched-out patches: for the
> > attached nic dev_alloc_skb() gets an skb from the guest (which
> > supplies them via some kind of AIO interface), and a branch in
> > netif_receive_skb() which returned it to the guest. This bypasses
> > all firewalling in the host though; we're basically having the
> > guest process drive the NIC directly.
>
> Hi Rusty,
>
> Can I clarify that the idea with utilising SR-IOV would be to assign
> virtual functions to guests? That is, something conceptually similar
> to PCI pass-through in Xen (although I'm not sure that anyone has
> virtual function pass-through working yet).

Not quite: I think PCI passthrough is the *wrong* way to do it: it
makes migration complicated (if not impossible), and requires emulation
or the same NIC on the destination host.

This would be the *host* seeing the virtual functions as multiple NICs,
then the ability to attach a given NIC directly to a process.

This isn't guest-visible: the kvm process is configured to connect
directly to a NIC, rather than (say) bridging through the host.

> If so, wouldn't this also be useful
> on machines that have multiple NICs?

Yes, but mainly as a benchmark hack AFAICT :)

Hope that clarifies,
Rusty.
Re: copyless virtio net thoughts?
On Thursday 19 February 2009 02:54:06 Arnd Bergmann wrote:
> On Wednesday 18 February 2009, Rusty Russell wrote:
> > 2) Direct NIC attachment
> > This is particularly interesting with SR-IOV or other multiqueue
> > nics, but for boutique cases or benchmarks, could be for normal
> > NICs. So far I have some very sketched-out patches: for the
> > attached nic dev_alloc_skb() gets an skb from the guest (which
> > supplies them via some kind of AIO interface), and a branch in
> > netif_receive_skb() which returned it to the guest. This bypasses
> > all firewalling in the host though; we're basically having the
> > guest process drive the NIC directly.
>
> If this is not passing the PCI device directly to the guest, but uses
> your concept, wouldn't it still be possible to use the firewalling in
> the host? You can always inspect the headers, drop the frame, etc
> without copying the whole frame at any point.

It's possible, but you don't want routing or parsing, etc: the NIC is
just "directly" attached to the guest. You could do it in qemu or
whatever, but it would not be the kernel scheme (netfilter/iptables).

> > 3) Direct interguest networking
> > Anthony has been thinking here: vmsplice has already been mentioned.
> > The idea of passing directly from one guest to another is an
> > interesting one: using dma engines might be possible too. Again,
> > host can't firewall this traffic. Simplest as a dedicated "internal
> > lan" NIC, but we could theoretically do a fast-path for certain MAC
> > addresses on a general guest NIC.
>
> Another option would be to use an SR-IOV adapter from multiple guests,
> with a virtual ethernet bridge in the adapter. This moves the overhead
> from the CPU to the bus and/or adapter, so it may or may not be a real
> benefit depending on the workload.

Yes, I guess this should work. Even different SR-IOV adapters will
simply send to one another. I'm not sure this obviates the desire to
have direct inter-guest, which is more generic, though.

Thanks!
Rusty.
RE: copyless virtio net thoughts?
Simon Horman wrote:
> On Wed, Feb 18, 2009 at 10:08:00PM +1030, Rusty Russell wrote:
> > 2) Direct NIC attachment
> > This is particularly interesting with SR-IOV or other multiqueue
> > nics, but for boutique cases or benchmarks, could be for normal
> > NICs. So far I have some very sketched-out patches: for the
> > attached nic dev_alloc_skb() gets an skb from the guest (which
> > supplies them via some kind of AIO interface), and a branch in
> > netif_receive_skb() which returned it to the guest. This bypasses
> > all firewalling in the host though; we're basically having the
> > guest process drive the NIC directly.
>
> Hi Rusty,
>
> Can I clarify that the idea with utilising SR-IOV would be to assign
> virtual functions to guests? That is, something conceptually similar
> to PCI pass-through in Xen (although I'm not sure that anyone has
> virtual function pass-through working yet). If so, wouldn't this also
> be useful on machines that have multiple NICs?

Yes, and we have successfully got it running with VF assignment to
guests in both Xen & KVM, but we are still working on pushing those
patches out since they need Linux PCI subsystem support & driver
support.

Thx, eddie
Re: copyless virtio net thoughts?
On Wed, Feb 18, 2009 at 10:08:00PM +1030, Rusty Russell wrote:
> 2) Direct NIC attachment
> This is particularly interesting with SR-IOV or other multiqueue
> nics, but for boutique cases or benchmarks, could be for normal
> NICs. So far I have some very sketched-out patches: for the
> attached nic dev_alloc_skb() gets an skb from the guest (which
> supplies them via some kind of AIO interface), and a branch in
> netif_receive_skb() which returned it to the guest. This bypasses
> all firewalling in the host though; we're basically having the
> guest process drive the NIC directly.

Hi Rusty,

Can I clarify that the idea with utilising SR-IOV would be to assign
virtual functions to guests? That is, something conceptually similar to
PCI pass-through in Xen (although I'm not sure that anyone has virtual
function pass-through working yet). If so, wouldn't this also be useful
on machines that have multiple NICs?

--
Simon Horman
VA Linux Systems Japan K.K., Sydney, Australia Satellite Office
H: www.vergenet.net/~horms/  W: www.valinux.co.jp/en
Re: copyless virtio net thoughts?
On Wednesday 18 February 2009, Rusty Russell wrote:
> 2) Direct NIC attachment
> This is particularly interesting with SR-IOV or other multiqueue nics,
> but for boutique cases or benchmarks, could be for normal NICs. So
> far I have some very sketched-out patches: for the attached nic
> dev_alloc_skb() gets an skb from the guest (which supplies them via
> some kind of AIO interface), and a branch in netif_receive_skb()
> which returned it to the guest. This bypasses all firewalling in
> the host though; we're basically having the guest process drive
> the NIC directly.

If this is not passing the PCI device directly to the guest, but uses
your concept, wouldn't it still be possible to use the firewalling in
the host? You can always inspect the headers, drop the frame, etc
without copying the whole frame at any point.

When it gets to the point of actually giving the device (real pf or
sr-iov vf) to one guest, you really get to the point where you can't do
local firewalling any more.

> 3) Direct interguest networking
> Anthony has been thinking here: vmsplice has already been mentioned.
> The idea of passing directly from one guest to another is an
> interesting one: using dma engines might be possible too. Again,
> host can't firewall this traffic. Simplest as a dedicated "internal
> lan" NIC, but we could theoretically do a fast-path for certain MAC
> addresses on a general guest NIC.

Another option would be to use an SR-IOV adapter from multiple guests,
with a virtual ethernet bridge in the adapter. This moves the overhead
from the CPU to the bus and/or adapter, so it may or may not be a real
benefit depending on the workload.

	Arnd <><
Re: copyless virtio net thoughts?
On Wed, Feb 18, 2009 at 10:08:00PM +1030, Rusty Russell wrote:
> 4) Multiple queues
> This is Herbert's. Should be fairly simple to add; it was in the back
> of my mind when we started. Not sure whether the queues should be
> static or dynamic (imagine direct interguest networking, one queue
> pair for each other guest), and how xmit queues would be selected by
> the guest (anything anywhere, or dst mac?).

The primary purpose of multiple queues is to maximise CPU utilisation,
so the number of queues is simply dependent on the number of CPUs
allotted to the guest.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~}
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: copyless virtio net thoughts?
On Thursday 05 February 2009 12:37:32 Chris Wright wrote:
> There's been a number of different discussions re: getting copyless
> virtio net (esp. for KVM). This is just a poke in that general
> direction to stir the discussion. I'm interested to hear current
> thoughts?

This thread seems to have died out, time for me to weigh in!

There are four promising areas that I see when looking at virtio_net
performance. I list them all here because they may interact:
1) Async tap access.
2) Direct NIC attachment.
3) Direct interguest networking.
4) Multiqueue virtio_net.

1) Async tap access
Either via aio, or something like the prototype virtio_ring patches I
produced last year. This is potentially copyless networking for xmit
(bar header), with one copy on recv.

2) Direct NIC attachment
This is particularly interesting with SR-IOV or other multiqueue nics,
but for boutique cases or benchmarks, could be for normal NICs. So far
I have some very sketched-out patches: for the attached nic
dev_alloc_skb() gets an skb from the guest (which supplies them via
some kind of AIO interface), and a branch in netif_receive_skb() which
returned it to the guest. This bypasses all firewalling in the host
though; we're basically having the guest process drive the NIC
directly.

3) Direct interguest networking
Anthony has been thinking here: vmsplice has already been mentioned.
The idea of passing directly from one guest to another is an
interesting one: using dma engines might be possible too. Again, host
can't firewall this traffic. Simplest as a dedicated "internal lan"
NIC, but we could theoretically do a fast-path for certain MAC
addresses on a general guest NIC.

4) Multiple queues
This is Herbert's. Should be fairly simple to add; it was in the back
of my mind when we started. Not sure whether the queues should be
static or dynamic (imagine direct interguest networking, one queue pair
for each other guest), and how xmit queues would be selected by the
guest (anything anywhere, or dst mac?).

Anyone else want to make comments?

Thanks,
Rusty.
Re: copyless virtio net thoughts?
From: Arnd Bergmann
Date: Sat, 7 Feb 2009 12:56:06 +0100

> Having the load spread evenly over all guests sounds like a much rarer
> use case.

Totally agreed.
Re: copyless virtio net thoughts?
On Friday 06 February 2009, Avi Kivity wrote:
> > Well, these guests will suck both on baremetal and in virtualisation,
> > big deal :) Multiqueue at 10GbE speeds and above is simply not an
> > optional feature.
>
> Each guest may only use a part of the 10Gb/s bandwidth; if you have 10
> guests each using 1Gb/s, then we should be able to support this
> without multiqueue in the guests.

I would expect that even people with 10 simultaneous guests would like
to be able to saturate the link when only one or two of them are doing
much traffic on the interface. Having the load spread evenly over all
guests sounds like a much rarer use case.

	Arnd <><
Re: copyless virtio net thoughts?
Herbert Xu wrote:
> On Fri, Feb 06, 2009 at 10:46:37AM +0200, Avi Kivity wrote:
> > The guest's block layer is copyless. The host block layer is -><-
> > this far from being copyless -- all we need is preadv()/pwritev()
> > or to replace our thread pool implementation in qemu with
> > linux-aio. Everything else is copyless. Since we are actively
> > working on this, expect this limitation to disappear soon.
>
> Great, when that happens I'll promise to revisit zero-copy transmit :)

I was hoping to get some concurrency here, but okay.

> > I support this, but it should be in addition to copylessness, not
> > on its own.
>
> I was talking about it in the context of zero-copy receive, where you
> mentioned that the virtio/kvm copy may not occur on the CPU of the
> guest's copy. My point is that using multiqueue you can avoid this
> change of CPU.
>
> But yeah I think zero-copy receive is much more useful than zero-copy
> transmit at the moment. Although I'd prefer to wait for you guys to
> finish the block layer work before contemplating pushing the copy on
> receive into the guest :)

We'll get the block layer done soon, so it won't be a barrier.

> > - many guests will not support multiqueue
>
> Well, these guests will suck both on baremetal and in virtualisation,
> big deal :) Multiqueue at 10GbE speeds and above is simply not an
> optional feature.

Each guest may only use a part of the 10Gb/s bandwidth; if you have 10
guests each using 1Gb/s, then we should be able to support this without
multiqueue in the guests.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
Re: copyless virtio net thoughts?
On Fri, Feb 06, 2009 at 10:46:37AM +0200, Avi Kivity wrote:
> The guest's block layer is copyless. The host block layer is -><-
> this far from being copyless -- all we need is preadv()/pwritev() or
> to replace our thread pool implementation in qemu with linux-aio.
> Everything else is copyless.
>
> Since we are actively working on this, expect this limitation to
> disappear soon.

Great, when that happens I'll promise to revisit zero-copy transmit :)

> I support this, but it should be in addition to copylessness, not on
> its own.

I was talking about it in the context of zero-copy receive, where you
mentioned that the virtio/kvm copy may not occur on the CPU of the
guest's copy. My point is that using multiqueue you can avoid this
change of CPU.

But yeah I think zero-copy receive is much more useful than zero-copy
transmit at the moment. Although I'd prefer to wait for you guys to
finish the block layer work before contemplating pushing the copy on
receive into the guest :)

> - many guests will not support multiqueue

Well, these guests will suck both on baremetal and in virtualisation,
big deal :) Multiqueue at 10GbE speeds and above is simply not an
optional feature.

> - for some threaded workloads, you cannot predict where the final
> read() will come from; this renders multiqueue ineffective for
> keeping cache locality
>
> - usually you want virtio to transfer large amounts of data; but if
> you want your copies to be cache-hot, you need to limit transfers to
> half the cache size (a quarter if hyperthreading); this limits virtio
> effectiveness

Agreed on both counts.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~}
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: copyless virtio net thoughts?
Herbert Xu wrote:
> On Thu, Feb 05, 2009 at 02:37:07PM +0200, Avi Kivity wrote:
> > I believe that copyless networking is absolutely essential.
>
> I used to think it was important, but I'm now of the opinion that
> it's quite useless for virtualisation as it stands.
>
> > For transmit, copyless is needed to properly support sendfile()
> > type workloads - http/ftp/nfs serving. These are usually
> > high-bandwidth, cache-cold workloads where a copy is most
> > expensive.
>
> This is totally true for baremetal, but useless for virtualisation
> right now because the block layer is not zero-copy. That is, the data
> is going to be cache hot anyway so zero-copy networking doesn't buy
> you much at all.

The guest's block layer is copyless. The host block layer is -><- this
far from being copyless -- all we need is preadv()/pwritev() or to
replace our thread pool implementation in qemu with linux-aio.
Everything else is copyless.

Since we are actively working on this, expect this limitation to
disappear soon.

(even if it doesn't, the effect of block layer copies is multiplied by
the cache miss percentage which can be quite low for many workloads;
but again, we're not building on that)

> Please also recall that for the time being, block speeds are way
> slower than network speeds. So the really interesting case is
> actually network-to-network transfers. Again due to the RX copy this
> is going to be cache hot.

Block speeds are not way slower. We're at 4Gb/sec for Fibre and 10Gb/s
for networking. With dual channels or a decent cache hit rate they're
evenly matched.

> > For receive, the guest will almost always do an additional copy,
> > but it will most likely do the copy from another cpu. Xen
> > netchannel2
>
> That's what we should strive to avoid. The best scenario with modern
> 10GbE NICs is to stay on one CPU if at all possible. The NIC will
> pick a CPU when it delivers the packet into one of the RX queues and
> we should stick with it for as long as possible.
>
> So what I'd like to see next in virtualised networking is virtual
> multiqueue support in guest drivers. No I'm not talking about making
> one or more of the physical RX/TX queues available to the guest (aka
> passthrough), but actually turning something like the virtio-net
> interface into a multiqueue interface.

I support this, but it should be in addition to copylessness, not on
its own.

- many guests will not support multiqueue

- for some threaded workloads, you cannot predict where the final
read() will come from; this renders multiqueue ineffective for keeping
cache locality

- usually you want virtio to transfer large amounts of data; but if
you want your copies to be cache-hot, you need to limit transfers to
half the cache size (a quarter if hyperthreading); this limits virtio
effectiveness

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
Re: copyless virtio net thoughts?
On Thu, Feb 05, 2009 at 02:37:07PM +0200, Avi Kivity wrote:
> I believe that copyless networking is absolutely essential.

I used to think it was important, but I'm now of the opinion that it's
quite useless for virtualisation as it stands.

> For transmit, copyless is needed to properly support sendfile() type
> workloads - http/ftp/nfs serving. These are usually high-bandwidth,
> cache-cold workloads where a copy is most expensive.

This is totally true for baremetal, but useless for virtualisation
right now because the block layer is not zero-copy. That is, the data
is going to be cache hot anyway so zero-copy networking doesn't buy you
much at all.

Please also recall that for the time being, block speeds are way slower
than network speeds. So the really interesting case is actually
network-to-network transfers. Again due to the RX copy this is going to
be cache hot.

> For receive, the guest will almost always do an additional copy, but
> it will most likely do the copy from another cpu. Xen netchannel2

That's what we should strive to avoid. The best scenario with modern
10GbE NICs is to stay on one CPU if at all possible. The NIC will pick
a CPU when it delivers the packet into one of the RX queues and we
should stick with it for as long as possible.

So what I'd like to see next in virtualised networking is virtual
multiqueue support in guest drivers. No I'm not talking about making
one or more of the physical RX/TX queues available to the guest (aka
passthrough), but actually turning something like the virtio-net
interface into a multiqueue interface. This is the best way to get
cache locality and minimise CPU waste.

So I'm certainly not rushing out to do any zero-copy virtual
networking. However, I would like to start working on a virtual
multiqueue NIC interface.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~}
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: copyless virtio net thoughts?
Avi Kivity wrote:
> Chris Wright wrote:
> > There's been a number of different discussions re: getting copyless
> > virtio net (esp. for KVM). This is just a poke in that general
> > direction to stir the discussion. I'm interested to hear current
> > thoughts?
>
> I believe that copyless networking is absolutely essential.
>
> For transmit, copyless is needed to properly support sendfile() type
> workloads - http/ftp/nfs serving. These are usually high-bandwidth,
> cache-cold workloads where a copy is most expensive.
>
> For receive, the guest will almost always do an additional copy, but
> it will most likely do the copy from another cpu. Xen netchannel2
> mitigates this somewhat by having the guest request the hypervisor to
> perform the copy when the rx interrupt is processed, but this may
> still be too early (the packet may be destined to a process that is
> on another vcpu), and the extra hypercall is expensive.
>
> In my opinion, it would be ideal to linux-aio enable taps and packet
> sockets. io_submit() allows submitting multiple buffers in one
> syscall and supports scatter/gather. io_getevents() supports
> dequeuing multiple packet completions in one syscall.

splice() has some nice properties too. It disconnects the notion of
moving packets around from actually copying them. It also fits well
into a more performant model of interguest IO.

You can't publish multiple buffers with splice, but I don't think we
can do that today practically speaking because of mergeable RX buffers.
You would have to extend the linux-aio interface to hand it a bunch of
buffers and for it to tell you where the packet boundaries were.

Regards,

Anthony Liguori
Re: copyless virtio net thoughts?
Chris Wright wrote:
> There's been a number of different discussions re: getting copyless
> virtio net (esp. for KVM). This is just a poke in that general
> direction to stir the discussion. I'm interested to hear current
> thoughts?

I believe that copyless networking is absolutely essential.

For transmit, copyless is needed to properly support sendfile() type
workloads - http/ftp/nfs serving. These are usually high-bandwidth,
cache-cold workloads where a copy is most expensive.

For receive, the guest will almost always do an additional copy, but it
will most likely do the copy from another cpu. Xen netchannel2
mitigates this somewhat by having the guest request the hypervisor to
perform the copy when the rx interrupt is processed, but this may still
be too early (the packet may be destined to a process that is on
another vcpu), and the extra hypercall is expensive.

In my opinion, it would be ideal to linux-aio enable taps and packet
sockets. io_submit() allows submitting multiple buffers in one syscall
and supports scatter/gather. io_getevents() supports dequeuing multiple
packet completions in one syscall.

--
error compiling committee.c: too many arguments to function