On Mon, May 27, 2013 at 12:13 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
>
> On Mon, May 27, 2013 at 12:01:07PM -0500, Anthony Liguori wrote:
> > Paolo Bonzini <pbonz...@redhat.com> writes:
> >
> > Finally, the destination QEMU process can vmsplice() from the pipe which
> > will copy the data (this is the only copy).
>
> AFAIK splice is mostly useless for networking as there's no way to
> get notified when packet has been sent.

I suspect you could use a thread pool to work around this.  It's
certainly not useless if your goal is to do userspace switching...

> > If vswitch needs to route externally, then it would need to splice() to
> > a macvtap.
> >
> > macvtap should be able to send the packet without copying the data.  Not
> > sure that this last work will work as expected but if it doesn't, that's
> > a bug that can/should be fixed.
> >
> > The kernel cannot do better than the above modulo any overhead from
> > userspace context switching[*].
>
> Also modulo scheduler latency - kernel processes packets
> in interrupt context.  There's a reason e.g. OVS runs data-path in
> kernel.

Ack.  Like I say below, I think network routing belongs in the kernel.

Regards,

Anthony Liguori

> > Guest-to-guest requires a copy.
> > Normally macvtap is undesirable because it's tightly connected to a
> > network adapter but that is a desirable trait in this case.
> >
> > N.B., I'm not advocating making all switching decisions in
> > userspace.  Just pointing out how it can be done efficiently.
> >
> > [*] in theory the kernel could do zero copy receive but i'm not sure
> > it's feasible in practice.
> >
> > Regards,
> >
> > Anthony Liguori
> >
> > >
> > > Paolo
> > >
> > >>> It would be slower than vhost-net, for example no zero-copy
> > >>> transmission.
> > >>
> > >> With splice, I think you could at least get single copy guest-to-guest
> > >> networking which is about as good as can be done.
> > >>
> > >> Regards,
> > >>
> > >> Anthony Liguori
> > >>
> > >>>> 3. Use the kernel as a middle-man. Create a double-ended "veth"
> > >>>>    interface and have Snabb Switch and QEMU each open a PF_PACKET
> > >>>>    socket and accelerate it with VHOST_NET.
> > >>>
> > >>> As Michael, mentioned, this could be macvtap on the interface that you
> > >>> have already created in the switch and passed to vhost-net.  Then you do
> > >>> not have to do anything in QEMU.
> > >>>
> > >>> Paolo
> > >>>
> > >>>> If you are using the Linux network stack then it might be better to
> > >>>> integrate with vhost maybe as a tun-like device driver.
> > >>>>
> > >>>> Stefan
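
For anyone following along, here is a rough sketch of the single-copy
pipe path discussed above. It is only an illustration under stated
assumptions, not code from QEMU or from any switch: pipe_single_copy()
is a hypothetical helper, both ends of the pipe live in one process, and
the 4096-byte limit is there only to keep both calls from blocking. The
sender side uses vmsplice() on the write end, which hands the pipe
references to the user pages; the receiver side uses vmsplice() on the
read end, which per the Linux man page copies the data into user memory.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>      /* vmsplice() */
#include <sys/types.h>  /* ssize_t */
#include <sys/uio.h>    /* struct iovec */
#include <unistd.h>     /* pipe(), close() */

/*
 * Hypothetical helper for illustration only: push `len` bytes from `src`
 * through a pipe and into `dst` using vmsplice() on both ends.
 * Returns the number of bytes that arrived in `dst`, or -1 on error.
 */
ssize_t pipe_single_copy(const void *src, void *dst, size_t len)
{
    int fds[2];
    ssize_t sent;
    ssize_t got = -1;

    /* Assumption: stay well under the default pipe capacity so
     * neither vmsplice() call can block in this one-process demo. */
    assert(len > 0 && len <= 4096);

    if (pipe(fds) < 0)
        return -1;

    /* Sender side: the pipe takes references to the pages backing
     * `src`, so the payload is not copied at this step. */
    struct iovec out = { .iov_base = (void *)src, .iov_len = len };
    sent = vmsplice(fds[1], &out, 1, 0);

    if (sent > 0) {
        /* Receiver side: vmsplice() on the read end copies the data
         * into `dst`. This is the one copy on the path. */
        struct iovec in = { .iov_base = dst, .iov_len = (size_t)sent };
        got = vmsplice(fds[0], &in, 1, 0);
    }

    close(fds[0]);
    close(fds[1]);
    return got;
}
```

In a real setup the two ends would sit in different processes, with the
switch moving data between pipes using splice(), and the sender would
have to leave the source pages untouched until the data had been
consumed. That is the notification problem Michael raises above.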