Re: Performance of WireGuard on InfiniBand 40G

2017-05-14 Thread Jason A. Donenfeld
Hey Greg,

On Sun, May 14, 2017 at 12:48 PM, Greg KH  wrote:
> 4.9 is 6 months old, I'd be curious if 4.11 is any faster given the rate
> of change in the network stack :)

I imagine it might be. I think the biggest bottleneck, in any case,
is still the poor algorithm in padata. Hopefully we'll get this sorted
with the help of Samuel's research this summer!

Jason


Re: Performance of WireGuard on InfiniBand 40G

2017-05-14 Thread Greg KH
On Sun, May 14, 2017 at 11:55:52AM +0200, Baptiste Jonglez wrote:
> On Sun, May 14, 2017 at 12:52:11AM +0200, Jason A. Donenfeld wrote:
> > One small and unfortunate thought just occurred to me: I'm pretty sure
> > the backport to really old kernels is far less efficient on RX than
> > newer kernels, due to some missing core fast-path APIs in those old
> > kernels. In particular, I had to wrap the UDP layer with some nasty
> > hacks to get packets out, whereas newer kernels have an elegant API
> > for that which integrates in the right place. Just a thought... I
> > haven't actually done concrete measurements, though.
> 
> Good idea, I have redone the same setup with kernel 4.9.18 from
> jessie-backports.
> 
> TL;DR: when switching from kernel 3.16 to 4.9, WireGuard has a 50%
> performance gain in the most favourable case (large MTU).  Also, iperf
> seems generally faster than iperf3, most likely because iperf3 has no
> multi-threading.

4.9 is 6 months old, I'd be curious if 4.11 is any faster given the rate
of change in the network stack :)

thanks,

greg k-h


Re: Performance of WireGuard on InfiniBand 40G

2017-05-13 Thread Jason A. Donenfeld
Hey again,

On Sat, May 13, 2017 at 9:37 AM, Baptiste Jonglez
 wrote:
> - Debian jessie
> - Linux 3.16.43-2

One small and unfortunate thought just occurred to me: I'm pretty sure
the backport to really old kernels is far less efficient on RX than
newer kernels, due to some missing core fast-path APIs in those old
kernels. In particular, I had to wrap the UDP layer with some nasty
hacks to get packets out, whereas newer kernels have an elegant API
for that which integrates in the right place. Just a thought... I
haven't actually done concrete measurements, though.

Jason


Re: Performance of WireGuard on InfiniBand 40G

2017-05-13 Thread Jason A. Donenfeld
Hey Baptiste,

Awesome test! Thanks for reporting the results.

On Sat, May 13, 2017 at 9:37 AM, Baptiste Jonglez
 wrote:
> Using iperf (TCP mode) over the WireGuard interface, performance was
> around 1.6 Gbit/s.  In bidirectional mode (iperf -d), performance was
> 700 Mbit/s + 800 Mbit/s.

Indeed, the current multicore algorithm has a lot of issues. Samuel,
CC'd, is going to be working on optimizing this algorithm this
summer.

> After raising the MTU of the WireGuard interface to 65450, performance
> went up to 7.6 Gbit/s (unidirectional iperf).

It makes sense that it'd be higher, since CPUs work best when running
uninterrupted, but this still indicates that padata is a very
suboptimal algorithm. Expect some improvements on this in the coming
months. Hopefully you'll be able to test on similar hardware at some
point when things are finished.

> Note that InfiniBand has an MTU of 65520 bytes, but WireGuard still
> selects an MTU of 1420 bytes for its interface.

Yeah, the 1420 is just a hard-coded "default". I'll probably add
something clever to autoselect an MTU when configuring the first
peer's first endpoint (by computing the route, taking its interface's
MTU, and subtracting the tunnel overhead), but the long-term solution,
I think, will be to do some more clever PMTU discovery from within
WireGuard. I'm still working out exactly how to do this, but it should
be possible.
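
For reference, the arithmetic behind the default: the per-packet
overhead is 32 bytes of WireGuard framing (a 16-byte message header
plus the 16-byte Poly1305 tag) on top of UDP and IP, so:

     1500 - 40 (IPv6) - 8 (UDP) - 32 (WireGuard) =  1420
    65520 - 20 (IPv4) - 8 (UDP) - 32 (WireGuard) = 65460

On an InfiniBand link with IPv4 endpoints, then, something like this
should work (untested sketch; "wg0" is just an illustrative interface
name):

    ip link set dev wg0 mtu 65460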


> - Xeon E5520 @2.27GHz (2 CPUs, 4 cores each)
> - Mellanox ConnectX IB 4X QDR MT26428

*drools* That's some awesome hardware!

> - iperf 2.0.5

iperf2 has the -d bidirectional mode, which is nice, but it seems like
most people are using iperf3 now. Out of curiosity, is there a reason
for preferring iperf2, beyond the -d switch?
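
For anyone comparing the two, the closest equivalents look roughly
like this (addresses illustrative):

    # iperf2: both directions at once, one TCP stream each way
    iperf -s                 # receiver
    iperf -c 10.0.0.2 -d     # sender, dual test

    # iperf3: single-threaded, so -P adds streams but not threads,
    # and -R only reverses direction rather than running both at once
    iperf3 -s
    iperf3 -c 10.0.0.2 -P 8
    iperf3 -c 10.0.0.2 -R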

Jason


Performance of WireGuard on InfiniBand 40G

2017-05-13 Thread Baptiste Jonglez
Hi,

Just for information, I did a quick test of WireGuard over a 40G
InfiniBand network, between two machines with a Xeon E5520.

Using iperf (TCP mode) over the WireGuard interface, performance was
around 1.6 Gbit/s.  In bidirectional mode (iperf -d), performance was
700 Mbit/s + 800 Mbit/s.

Note that InfiniBand has an MTU of 65520 bytes, but WireGuard still
selects an MTU of 1420 bytes for its interface.
After raising the MTU of the WireGuard interface to 65450, performance
went up to 7.6 Gbit/s (unidirectional iperf).
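
For reference, raising the MTU is a one-liner on both ends (the
interface name "wg0" is illustrative):

    ip link set dev wg0 mtu 65450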

Using the InfiniBand network directly, iperf's performance is 21.7 Gbit/s
(iperf maxes out the CPU at the receiver, even when using 8 threads).
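
For reproducibility, a multi-threaded iperf2 run is just the parallel
client mode (address illustrative; -P spawns one thread per stream):

    iperf -s                # receiving machine
    iperf -c 10.0.0.2 -P 8  # sending machine, 8 parallel threads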

Hardware used:

- Xeon E5520 @2.27GHz (2 CPUs, 4 cores each)
- Mellanox ConnectX IB 4X QDR MT26428

Versions used:

- Debian jessie
- Linux 3.16.43-2
- Wireguard 0.0.20170421-2
- iperf 2.0.5
- Mellanox ConnectX InfiniBand driver v2.2-1

Baptiste

