Re: [PATCH/RFC 0/1] Vhost User Cross Cable: Intro

2020-02-21 Thread Stefan Hajnoczi
On Thu, Feb 13, 2020 at 03:48:59PM +0200, Nikos Dragazis wrote:
> On Tue, 14 Jan 2020 at 10:20 Stefan Hajnoczi
>  wrote:
> > On Fri, Jan 10, 2020 at 10:34 AM Marc-André Lureau
> >  wrote:
> > > On Wed, Jan 8, 2020 at 5:57 AM V.  wrote:
> >
> > Hi V.,
> > I think I remember you from Etherboot/gPXE days :).
> >
> > > > 3.
> > > > Now if Cross Cable is actually a new and (after a code-rewrite of 10) a
> > > > viable way to connect 2 QEMU's together, could I actually
> > > > suggest a better way?
> > > > I am thinking of a '-netdev vhost-user-slave' option to connect (as 
> > > > client
> > > > or server) to a master QEMU running '-netdev vhost-user'.
> > > > This way there is no need for any external program at all, the master 
> > > > can
> > > > have its memory unshared and everything would just work
> > > > and be fast.
> > > > Also the whole thing can fall back to normal virtio if memory is not 
> > > > shared
> > > > and it would even work in pure usermode without any
> > > > context switch.
> > > >
> > > > Building a patch for this idea I could maybe get around to, don't 
> > > > clearly
> > > > have an idea how much work this would be but I've done
> > > > crazier things.
> > > > But is this something that someone might be able to whip up in an 
> > > > hour
> > > > or two? Someone who actually does have a clue about vhost
> > > > and virtio maybe? ;-)
> > >
> > > I believe https://wiki.qemu.org/Features/VirtioVhostUser is what you
> > > are after. It's still being discussed and non-trivial, but not very
> > > active lately afaik.
> >
> > virtio-vhost-user is being experimented with in the SPDK community but
> > there has been no activity on VIRTIO standardization or the QEMU
> > patches for some time.  More info here:
> > https://ndragazis.github.io/spdk.html
> >
> > I think the new ivshmem v2 feature may provide the functionality you
> > are looking for, but I haven't looked at it yet.  Here is the link:
> > https://www.mail-archive.com/address@hidden/msg668749.html
> >
> > And here is Jan's KVM Forum presentation on ivshmem v2:
> > https://www.youtube.com/watch?v=TiZrngLUFMA
> >
> > Stefan
> 
> 
> Hi Stefan,
> 
> First of all, sorry for the delayed response. The mail got lost
> somewhere in my inbox. Please keep Cc-ing me on all threads related to
> virtio-vhost-user.
> 
> As you correctly pointed out, I have been experimenting with
> virtio-vhost-user on SPDK and [1] is a working demo on this matter. I
> have been working on getting it merged with SPDK and, to this end, I
> have been interacting with the SPDK team [2][3] and mostly with Darek
> Stojaczyk (Cc-ing him).
> 
> The reason that you haven’t seen any activity for the last few months is
> that I got a job and hence my schedule has become tighter. But I will
> definitely find some space and fit it into my schedule. Let me give you
> a heads up, so that we get synced:
> 
> Originally, I created and sent a patch (influenced by your DPDK patch
> [4]) to SPDK that enhanced SPDK’s internal rte_vhost library with
> the virtio-vhost-user transport. However, a few weeks later, the SPDK
> team decided to switch from their internal rte_vhost library to using
> DPDK’s rte_vhost library directly [3]. Given that, I refactored and sent
> the patch for the virtio-vhost-user transport to the DPDK mailing list
> [5]. Regarding the virtio-vhost-user device, I have made some
> enhancements [6] to your original RFC device implementation and, as you
> may remember, I have sent some revised versions of the spec to the
> virtio mailing list [7].
> 
> At the moment, the blocker is the virtio spec. The last update on this
> was my discussion with Michael Tsirkin (Cc-ing him as well) about the
> need for the VIRTIO_PCI_CAP_DOORBELL_CFG and
> VIRTIO_PCI_CAP_NOTIFICATION_CFG configuration structures [8].
> 
> So, I think the next steps should be the following:
> 
> 1. merging the spec
> 2. adding the device to QEMU
> 3. adding the vvu transport to DPDK
> 4. extending SPDK to make use of the new vhost-user transport
> 
> To conclude, I still believe that this device is useful and should be
> part of virtio/qemu/dpdk/spdk and I will continue working in this
> direction.

Thanks for letting me know.  Feel free to resend the latest VIRTIO spec
changes to restart the discussion.  I am certainly interested.

Stefan



Re: [PATCH/RFC 0/1] Vhost User Cross Cable: Intro

2020-02-13 Thread Michael S. Tsirkin
On Wed, Jan 08, 2020 at 02:54:30AM +0100, V. wrote:
> Hi List,
> 
> For my VM setup I tend to use a lot of VM to VM single network links to do 
> routing, switching and bridging in VM's instead of the host.
> Also stemming from a silly fetish to sometimes use some OpenBSD VMs as 
> firewall, but that is beside the point here.
> I am using the standard, tested and true method of using a whole bunch of 
> bridges, having 2 vhost taps each.
> This works and it's fast, but it is a nightmare to manage with all the 
> interfaces on the host.
> 
> So, I looked a bit into how I can improve this, basically coming down to "How 
> to connect 2 VM's together in a really fast and easy way".
> This however, is not as straightforward as I thought, without going the whole 
> route of OVS/Snabb/any other big feature bloated
> software switch.
> Cause really, all I want is to connect 2 VM's in a fast and easy way. 
> Shouldn't be that hard right?
> 
> Anyways, I ended up finding tests/vhost-user-bridge.c, which is very nicely 
> doing half of what I wanted.

BTW you can easily run two vhost user bridges and connect them back to
back, right?

> After some doubling of the vhosts and eliminating udp, I came up with a Vhost 
> User Cross Cable. (patch in next post).

Hmm you forgot --thread=shallow so your posts aren't linked.



> It just opens 2 vhost sockets instead of 1 and does the forwarding between 
> them.
> A terrible hack and slash of vhost-user-bridge.c, probably now with bugs 
> causing the death of many puppies and the end of humanity,
> but it works!

I think generally this approach has value, maybe as a separate utility,
maybe as a flag for vhost-user-bridge.

-- 
MST




Re: [PATCH/RFC 0/1] Vhost User Cross Cable: Intro

2020-02-13 Thread Nikos Dragazis

On Tue, 14 Jan 2020 at 10:20 Stefan Hajnoczi
 wrote:
> On Fri, Jan 10, 2020 at 10:34 AM Marc-André Lureau
>  wrote:
> > On Wed, Jan 8, 2020 at 5:57 AM V.  wrote:
>
> Hi V.,
> I think I remember you from Etherboot/gPXE days :).
>
> > > 3.
> > > Now if Cross Cable is actually a new and (after a code-rewrite of 10) a
> > > viable way to connect 2 QEMU's together, could I actually
> > > suggest a better way?
> > > I am thinking of a '-netdev vhost-user-slave' option to connect (as client
> > > or server) to a master QEMU running '-netdev vhost-user'.
> > > This way there is no need for any external program at all, the master can
> > > have its memory unshared and everything would just work
> > > and be fast.
> > > Also the whole thing can fall back to normal virtio if memory is not 
shared
> > > and it would even work in pure usermode without any
> > > context switch.
> > >
> > > Building a patch for this idea I could maybe get around to, don't clearly
> > > have an idea how much work this would be but I've done
> > > crazier things.
> > > But is this something that someone might be able to whip up in an hour
> > > or two? Someone who actually does have a clue about vhost
> > > and virtio maybe? ;-)
> >
> > I believe https://wiki.qemu.org/Features/VirtioVhostUser is what you
> > are after. It's still being discussed and non-trivial, but not very
> > active lately afaik.
>
> virtio-vhost-user is being experimented with in the SPDK community but
> there has been no activity on VIRTIO standardization or the QEMU
> patches for some time.  More info here:
> https://ndragazis.github.io/spdk.html
>
> I think the new ivshmem v2 feature may provide the functionality you
> are looking for, but I haven't looked at it yet.  Here is the link:
> https://www.mail-archive.com/address@hidden/msg668749.html
>
> And here is Jan's KVM Forum presentation on ivshmem v2:
> https://www.youtube.com/watch?v=TiZrngLUFMA
>
> Stefan


Hi Stefan,

First of all, sorry for the delayed response. The mail got lost
somewhere in my inbox. Please keep Cc-ing me on all threads related to
virtio-vhost-user.

As you correctly pointed out, I have been experimenting with
virtio-vhost-user on SPDK and [1] is a working demo on this matter. I
have been working on getting it merged with SPDK and, to this end, I
have been interacting with the SPDK team [2][3] and mostly with Darek
Stojaczyk (Cc-ing him).

The reason that you haven’t seen any activity for the last few months is
that I got a job and hence my schedule has become tighter. But I will
definitely find some space and fit it into my schedule. Let me give you
a heads up, so that we get synced:

Originally, I created and sent a patch (influenced by your DPDK patch
[4]) to SPDK that enhanced SPDK’s internal rte_vhost library with
the virtio-vhost-user transport. However, a few weeks later, the SPDK
team decided to switch from their internal rte_vhost library to using
DPDK’s rte_vhost library directly [3]. Given that, I refactored and sent
the patch for the virtio-vhost-user transport to the DPDK mailing list
[5]. Regarding the virtio-vhost-user device, I have made some
enhancements [6] to your original RFC device implementation and, as you
may remember, I have sent some revised versions of the spec to the
virtio mailing list [7].

At the moment, the blocker is the virtio spec. The last update on this
was my discussion with Michael Tsirkin (Cc-ing him as well) about the
need for the VIRTIO_PCI_CAP_DOORBELL_CFG and
VIRTIO_PCI_CAP_NOTIFICATION_CFG configuration structures [8].

So, I think the next steps should be the following:

1. merging the spec
2. adding the device to QEMU
3. adding the vvu transport to DPDK
4. extending SPDK to make use of the new vhost-user transport

To conclude, I still believe that this device is useful and should be
part of virtio/qemu/dpdk/spdk and I will continue working in this
direction.

Best regards,
Nikos


[1] https://ndragazis.github.io/spdk.html
[2] 
https://lists.01.org/hyperkitty/list/s...@lists.01.org/thread/UR4FM45LEQIBJNQ4MTDZFH6SLTXHTGDR/#ZGPRKS47QWHXHFBEKSCA7Z66E2AGSLHN
[3] 
https://lists.01.org/hyperkitty/list/s...@lists.01.org/thread/WLUREJGPK5UJVTHIQ5GRL3CDWR5NN5BI/#G7P3D4KF6OQDI2RYASXQOZCMITKT5DEP
[4] http://mails.dpdk.org/archives/dev/2018-January/088155.html
[5] 
https://lore.kernel.org/dpdk-dev/e03dcc29-d472-340a-9825-48d13e472...@redhat.com/T/
[6] https://lists.gnu.org/archive/html/qemu-devel/2019-04/msg02910.html
[7] https://lists.oasis-open.org/archives/virtio-dev/201906/msg00036.html
[8] https://lists.oasis-open.org/archives/virtio-dev/201908/msg00014.html



Re: [PATCH/RFC 0/1] Vhost User Cross Cable: Intro

2020-01-14 Thread Stefan Hajnoczi
On Fri, Jan 10, 2020 at 10:34 AM Marc-André Lureau
 wrote:
> On Wed, Jan 8, 2020 at 5:57 AM V.  wrote:

Hi V.,
I think I remember you from Etherboot/gPXE days :).

> > 3.
> > Now if Cross Cable is actually a new and (after a code-rewrite of 10) a 
> > viable way to connect 2 QEMU's together, could I actually
> > suggest a better way?
> > I am thinking of a '-netdev vhost-user-slave' option to connect (as client 
> > or server) to a master QEMU running '-netdev vhost-user'.
> > This way there is no need for any external program at all, the master can 
> > have its memory unshared and everything would just work
> > and be fast.
> > Also the whole thing can fall back to normal virtio if memory is not shared 
> > and it would even work in pure usermode without any
> > context switch.
> >
> > Building a patch for this idea I could maybe get around to, don't clearly 
> > have an idea how much work this would be but I've done
> > crazier things.
> > But is this something that someone might be able to whip up in an hour 
> > or two? Someone who actually does have a clue about vhost
> > and virtio maybe? ;-)
>
> I believe https://wiki.qemu.org/Features/VirtioVhostUser is what you
> are after. It's still being discussed and non-trivial, but not very
> active lately afaik.

virtio-vhost-user is being experimented with in the SPDK community but
there has been no activity on VIRTIO standardization or the QEMU
patches for some time.  More info here:
https://ndragazis.github.io/spdk.html

I think the new ivshmem v2 feature may provide the functionality you
are looking for, but I haven't looked at it yet.  Here is the link:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg668749.html

And here is Jan's KVM Forum presentation on ivshmem v2:
https://www.youtube.com/watch?v=TiZrngLUFMA

Stefan



Re: [PATCH/RFC 0/1] Vhost User Cross Cable: Intro

2020-01-10 Thread Marc-André Lureau
Hi

On Wed, Jan 8, 2020 at 5:57 AM V.  wrote:
>
> Hi List,
>
> For my VM setup I tend to use a lot of VM to VM single network links to do 
> routing, switching and bridging in VM's instead of the host.
> Also stemming from a silly fetish to sometimes use some OpenBSD VMs as 
> firewall, but that is beside the point here.
> I am using the standard, tested and true method of using a whole bunch of 
> bridges, having 2 vhost taps each.
> This works and it's fast, but it is a nightmare to manage with all the 
> interfaces on the host.
>
> So, I looked a bit into how I can improve this, basically coming down to "How 
> to connect 2 VM's together in a really fast and easy way".
> This however, is not as straightforward as I thought, without going the whole 
> route of OVS/Snabb/any other big feature bloated
> software switch.
> Cause really, all I want is to connect 2 VM's in a fast and easy way. 
> Shouldn't be that hard right?
>
> Anyways, I ended up finding tests/vhost-user-bridge.c, which is very nicely 
> doing half of what I wanted.
> After some doubling of the vhosts and eliminating udp, I came up with a Vhost 
> User Cross Cable. (patch in next post).
> It just opens 2 vhost sockets instead of 1 and does the forwarding between 
> them.
> A terrible hack and slash of vhost-user-bridge.c, probably now with bugs 
> causing the death of many puppies and the end of humanity,
> but it works!
>
> However... I now am left with some questions, which I hope some of you can 
> answer.
>
> 1.
> I looked, googled, read and tried things, but it is likely that I am a 
> complete and utter moron and my google-fu has just been awful...
> Very likely... But is there really no other way than the one I have found to just 
> link up 2 QEMU's in a fast non-bridge way? (No, not sockets.)
> Not that OVS and the likes are not fine software, but do we really need the 
> whole DPDK to do this?

By "not sockets", you mean the data path should use shared memory?
Then, I don't think there is another way.

>
> 2.
> In the unlikely chance that I'm not an idiot, then I guess now we have a nice 
> simple cross cable.
> However, I am still a complete vhost/virtio idiot who has no clue how it 
> works and just randomly brute-forced code into submission.
> Maybe not entirely true, but I would still appreciate it very much if someone 
> with more knowledge of vhost could have a quick look at
> how things are done in cc.
>
> Specifically this monstrosity in TX (speed_killer is a 1MB buffer and kills 
> any speed):
>   ret = iov_from_buf(sg, num, 0, speed_killer,
>                      iov_to_buf(out_sg, out_num, 0, speed_killer,
>                                 MIN(iov_size(out_sg, out_num),
>                                     sizeof speed_killer)));
>
>   vs. the commented:
>   //ret = iov_copy(sg, num, out_sg, out_num, 0,
>   //   MIN(iov_size(sg, num), iov_size(out_sg, out_num)));
>
> The first is obviously a quick fix to get things working, however, in my 
> meager understanding, should the 2nd one not work?
> Maybe I'm messing up my vectors here, or I am messing up my understanding of 
> iov_copy, but shouldn't the 2nd form be the way to zero
> copy?


As you noted, the data must be copied from source to dest memory.
iov_copy() doesn't actually do that; I don't think we have an iov
function for that.
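
Open-coding it is not hard, though: walk both vectors and memcpy chunk by
chunk. Something along these lines should do it (untested sketch using
plain struct iovec and memcpy, no existing QEMU iov helper assumed):

  #include <string.h>
  #include <sys/uio.h>

  /* Copy bytes from one scatter-gather list into another without an
   * intermediate bounce buffer.  Returns the number of bytes copied,
   * i.e. the smaller of the two total sizes. */
  static size_t iov_copy_bytes(struct iovec *dst, unsigned int dst_cnt,
                               const struct iovec *src, unsigned int src_cnt)
  {
      size_t total = 0, d_off = 0, s_off = 0;
      unsigned int d = 0, s = 0;

      while (d < dst_cnt && s < src_cnt) {
          size_t n = dst[d].iov_len - d_off;
          size_t avail = src[s].iov_len - s_off;

          if (avail < n) {
              n = avail;
          }
          memcpy((char *)dst[d].iov_base + d_off,
                 (const char *)src[s].iov_base + s_off, n);
          total += n;
          d_off += n;
          s_off += n;
          if (d_off == dst[d].iov_len) {
              d++;
              d_off = 0;
          }
          if (s_off == src[s].iov_len) {
              s++;
              s_off = 0;
          }
      }
      return total;
  }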

>
> 3.
> Now if Cross Cable is actually a new and (after a code-rewrite of 10) a 
> viable way to connect 2 QEMU's together, could I actually
> suggest a better way?
> I am thinking of a '-netdev vhost-user-slave' option to connect (as client or 
> server) to a master QEMU running '-netdev vhost-user'.
> This way there is no need for any external program at all, the master can 
> have its memory unshared and everything would just work
> and be fast.
> Also the whole thing can fall back to normal virtio if memory is not shared 
> and it would even work in pure usermode without any
> context switch.
>
> Building a patch for this idea I could maybe get around to, don't clearly 
> have an idea how much work this would be but I've done
> crazier things.
> But is this something that someone might be able to whip up in an hour or 
> two? Someone who actually does have a clue about vhost
> and virtio maybe? ;-)

I believe https://wiki.qemu.org/Features/VirtioVhostUser is what you
are after. It's still being discussed and non-trivial, but not very
active lately afaik.

>
> 4.
> Hacking together cc from bridge I noticed the use of container_of() to get 
> from vudev to state in the vu callbacks.
> Would it be an idea to add a context pointer to the callbacks (possibly 
> gotten from VuDevIface)?
> And I know. First post and I have the forwardness to even suggest an API 
> change! I know!
> But it makes things a bit simpler to avoid globals and it makes sense to have 
> some context in a callback to know what's going on,
> right? ;-)

Well, the callbacks are called with the VuDev, so container_of() is
quite fine since 
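
For reference, the pattern looks roughly like this (a sketch; MyState and
its fields are made-up names, but tests/vhost-user-bridge.c embeds the
VuDev in its own state struct the same way, and the queue handler
signature shown is the libvhost-user vu_queue_handler_cb as far as I
recall):

  #include "qemu/osdep.h"        /* container_of(); adjust include paths */
  #include "libvhost-user.h"     /* VuDev */

  /* Application state with the VuDev embedded (not a pointer), so any
   * callback that receives the VuDev can recover the enclosing state. */
  typedef struct MyState {
      VuDev vudev;
      int backend_fd;            /* ...whatever per-device state... */
  } MyState;

  static void my_queue_handler(VuDev *dev, int qidx)
  {
      MyState *s = container_of(dev, MyState, vudev);

      /* use s->backend_fd etc. here instead of a global */
      (void)s;
      (void)qidx;
  }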

[PATCH/RFC 0/1] Vhost User Cross Cable: Intro

2020-01-07 Thread V.
Hi List,

For my VM setup I tend to use a lot of VM to VM single network links to do 
routing, switching and bridging in VM's instead of the host.
Also stemming from a silly fetish to sometimes use some OpenBSD VMs as 
firewall, but that is beside the point here.
I am using the standard, tested and true method of using a whole bunch of 
bridges, having 2 vhost taps each.
This works and it's fast, but it is a nightmare to manage with all the 
interfaces on the host.

So, I looked a bit into how I can improve this, basically coming down to "How 
to connect 2 VM's together in a really fast and easy way".
This however, is not as straightforward as I thought, without going the whole 
route of OVS/Snabb/any other big feature bloated
software switch.
Cause really, all I want is to connect 2 VM's in a fast and easy way. Shouldn't 
be that hard right?

Anyways, I ended up finding tests/vhost-user-bridge.c, which is very nicely doing 
half of what I wanted.
After some doubling of the vhosts and eliminating udp, I came up with a Vhost 
User Cross Cable. (patch in next post).
It just opens 2 vhost sockets instead of 1 and does the forwarding between them.
A terrible hack and slash of vhost-user-bridge.c, probably now with bugs 
causing the death of many puppies and the end of humanity,
but it works!

However... I now am left with some questions, which I hope some of you can 
answer.

1.
I looked, googled, read and tried things, but it is likely that I am a 
complete and utter moron and my google-fu has just been awful...
Very likely... But is there really no other way than the one I have found to just link 
up 2 QEMU's in a fast non-bridge way? (No, not sockets.)
Not that OVS and the likes are not fine software, but do we really need the 
whole DPDK to do this?

2.
In the unlikely chance that I'm not an idiot, then I guess now we have a nice 
simple cross cable.
However, I am still a complete vhost/virtio idiot who has no clue how it works 
and just randomly brute-forced code into submission.
Maybe not entirely true, but I would still appreciate it very much if someone 
with more knowledge of vhost could have a quick look at
how things are done in cc.

Specifically this monstrosity in TX (speed_killer is a 1MB buffer and kills any 
speed):
  ret = iov_from_buf(sg, num, 0, speed_killer,
                     iov_to_buf(out_sg, out_num, 0, speed_killer,
                                MIN(iov_size(out_sg, out_num),
                                    sizeof speed_killer)));

  vs. the commented:
  //ret = iov_copy(sg, num, out_sg, out_num, 0,
  //   MIN(iov_size(sg, num), iov_size(out_sg, out_num)));

The first is obviously a quick fix to get things working, however, in my meager 
understanding, should the 2nd one not work?
Maybe I'm messing up my vectors here, or I am messing up my understanding of 
iov_copy, but shouldn't the 2nd form be the way to zero
copy?

3.
Now if Cross Cable is actually a new and (after a code-rewrite of 10) a viable 
way to connect 2 QEMU's together, could I actually
suggest a better way?
I am thinking of a '-netdev vhost-user-slave' option to connect (as client or 
server) to a master QEMU running '-netdev vhost-user'.
This way there is no need for any external program at all, the master can have 
its memory unshared and everything would just work
and be fast.
Also the whole thing can fall back to normal virtio if memory is not shared and 
it would even work in pure usermode without any
context switch.

Building a patch for this idea I could maybe get around to, don't clearly have 
an idea how much work this would be but I've done
crazier things.
But is this something that someone might be able to whip up in an hour or 
two? Someone who actually does have a clue about vhost
and virtio maybe? ;-)

4.
Hacking together cc from bridge I noticed the use of container_of() to get from 
vudev to state in the vu callbacks.
Would it be an idea to add a context pointer to the callbacks (possibly gotten 
from VuDevIface)?
And I know. First post and I have the forwardness to even suggest an API 
change! I know!
But it makes things a bit simpler to avoid globals and it makes sense to have 
some context in a callback to know what's going on,
right? ;-)

5.
Last one, promise.
I'm very much in the church of "less software == less bugs == less security 
problems".
Running cc or a vhost-user-slave means QEMU has fast networking in usermode 
without the need for anything other than AF_UNIX + shared 
mem.
So might it be possible to weed out any modern fancy stuff like the Internet 
Protocol, TCP, taps, bridges, ethernet and tokenring
from a kernel and run QEMU on that?
The idea here is a kernel with storage, a serial console, AF_UNIX and vfio-pci, 
only running QEMU.
Would this be feasible? Or does QEMU need a kernel which at least has a grasp 
of what AF_INET and ethernet are?
(Does a modern kernel even still support tokenring? No idea, probably does.)


Finally, an example