Re: L2 network namespaces + macvlan performances

2007-07-09 Thread Rick Jones
Between the "normal" case and the "net namespace + macvlan" case, 
results are about the same for both the throughput and the local CPU 
load for the following test types: TCP_MAERTS, TCP_RR, UDP_STREAM, UDP_RR.


macvlan looks like a very good candidate for network namespace in these 
cases.


But, with the TCP_STREAM test, I observed that the CPU load is about the
same (that's what we wanted) but the throughput decreases by about 5%:
from 850 Mb/s down to 810 Mb/s.
I haven't investigated yet why the throughput decreases in this case.
Does it come from my setup, from macvlan's additional processing, or 
something else? I don't know yet.


Given that your "normal" case doesn't hit link-rate on the TCP_STREAM, 
but it does with UDP_STREAM, it could be that there isn't quite enough 
TCP window available, particularly given it seems the default settings 
for sockets/windows are in use.  You might try your normal case with the 
test-specific -S and -s options to increase the socket buffer size:


netperf -H 192.168.76.1 -i 30,3 -l 20 -t TCP_STREAM -- -m 1400 -s 128K 
-S 128K


and see if that gets you link-rate.  One other possibility there is the 
use of the 1400 byte send - that probably doesn't interact terribly well 
with TSO.  Also, it isn't (?) likely the MSS for the connection, which 
you can have reported by adding a "-v 2" to the global options.  You 
could/should then use the MSS in a subsequent test, or perhaps better 
still use a rather larger send size for TCP_STREAM|TCP_MAERTS - I myself 
for no particular reason tend to use either 32KB or 64KB as the send 
size in the netperf TCP_STREAM tests I run.
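As a rough cross-check of the window sizing above, the bandwidth-delay product gives the minimum window needed to keep the pipe full; a minimal sketch (the 1 ms RTT is an assumed figure, not a value from this thread):

```python
# Bandwidth-delay product: the minimum TCP window (in bytes) needed to
# keep a link of a given rate busy at a given round-trip time.

def bdp_bytes(link_bits_per_sec: int, rtt_sec: float) -> int:
    return int(link_bits_per_sec * rtt_sec / 8)

# 1 Gbit/s link with an assumed 1 ms RTT.
print(bdp_bytes(1_000_000_000, 0.001))  # 125000
# Already larger than the 87380-byte default receive socket shown in the
# results below, so larger windows (-s/-S) can plausibly help TCP_STREAM.
```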


A final WAG: the 1400 byte send size may have interacted poorly with the 
Nagle algorithm, since it was a sub-MSS send.  When Nagle is involved, 
things can be very timing-sensitive; change the timing ever so slightly 
and you can get a rather larger change in throughput. That could be 
dealt with either with the larger send sizes mentioned above, or by 
adding a test-specific -D option to set TCP_NODELAY.
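For illustration, netperf's test-specific -D simply sets TCP_NODELAY on the data socket; the equivalent in a small Python sketch (no connection is made, the socket is only configured):

```python
import socket

# netperf's test-specific -D option sets TCP_NODELAY on the data socket,
# disabling Nagle so sub-MSS sends (like the 1400-byte ones here) go out
# immediately instead of waiting to be coalesced with later data.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Confirm the option took effect (nonzero means set).
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)  # True
sock.close()
```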


happy benchmarking,

rick jones


Re: L2 network namespaces + macvlan performances

2007-07-09 Thread Herbert Poetzl
On Fri, Jul 06, 2007 at 06:48:15PM +0200, Benjamin Thery wrote:
> Following a discussion we had at OLS concerning L2 network namespace
> performances and how the new macvlan driver could potentially improve
> them, I've ported the macvlan patchset on top of Eric's net namespace
> patchset on 2.6.22-rc4-mm2.
> 
> A little bit of history:
> 
> Some months ago, when we ran some performance tests (using netperf)
> on net namespace, we observed the following things:
> 
> Using 'etun', the virtual ethernet tunnel driver, and IP routes
> from inside a network namespace,
> 
> - The throughput is the same as the "normal" case(*)
>   (* normal case: no namespace, using physical adapters).
>   No regression. Good.
> 
> - But the CPU load increases a lot. Bad.
>   The reasons are:
>   - All checksums are done in software. No hardware offloading.
>   - Every TCP packet going through the etun devices is
> duplicated in ip_forward() before we decrease the TTL.
>   (packets are routed between both ends of etun)
> 
> We also made some testing with bridges, and obtained the same results:
>   CPU load increase:
>   - No hardware offloading
>   - Packets are duplicated somewhere in the bridge+netfilter
>   code (can't remember where right now)
> 
> 
> This time, I've replaced the etun interface with the new macvlan,
> which should benefit from the hardware offloading capabilities of the
> physical adapter and suppress the forwarding step.
> 
> My test setup is:
> 
>       Host A                          Host B
>   _________________               _____________
>  |   ___________   |             |             |
>  |  |  Netns 1  |  |             |             |
>  |  |           |  |             |             |
>  |  |  macvlan0 |  |             |             |
>  |  |_____|_____|  |             |             |
>  |        |        |             |             |
>  |________|________|             |_____________|
>      | eth0 (192.168.0.2)           | eth0 (192.168.0.1)
>      |______________________________|
> 
>    macvlan0 (192.168.0.3)
> 
> - netperf runs on host A
> - netserver runs on host B
> - Adapter speed is 1 Gb/s
> 
> On this setup I ran the following netperf tests: TCP_STREAM, 
> TCP_MAERTS, TCP_RR, UDP_STREAM, UDP_RR.
> 
> Between the "normal" case and the "net namespace + macvlan" case, 
> results are about the same for both the throughput and the local CPU 
> load for the following test types: TCP_MAERTS, TCP_RR, UDP_STREAM, UDP_RR.
> 
> macvlan looks like a very good candidate for network namespace in 
> these cases.
> 
> But, with the TCP_STREAM test, I observed that the CPU load is about the
> same (that's what we wanted) but the throughput decreases by about 5%:
> from 850 Mb/s down to 810 Mb/s.
> I haven't investigated yet why the throughput decreases in this case.
> Does it come from my setup, from macvlan's additional processing, or 
> something else? I don't know yet.
> 
> Attached to this email you'll find the raw netperf outputs for the 
> three cases:
> 
> - netperf through a physical adapter, no namespace:
>   netperf-results-2.6.22-rc4-mm2-netns1-vanilla.txt   
> - netperf through etun, inside a namespace:
>   netperf-results-2.6.22-rc4-mm2-netns1-using-etun.txt
> - netperf through macvlan, inside a namespace:
>   netperf-results-2.6.22-rc4-mm2-netns1-using-macvlan.txt
> 
> macvlan looks promising.

nice, any performance tests for multiple network
namespaces sharing the same eth0 (with different macvlans)?
how does that compare to IP isolation performance-wise?

TIA,
Herbert

> Regards,
> Benjamin
> 
> -- 
> B e n j a m i n   T h e r y  - BULL/DT/Open Software R&D
> 
>http://www.bull.com

> NETPERF RESULTS: the "normal" case : 
> 
> No network namespace, traffic goes through real 1 Gb/s physical adapters.
> 
> 
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.76.1 
> (192.168.76.1) port 0 AF_INET : +/-2.5% @ 95% conf.
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> 
>  87380  16384   1400     20.03       857.39   6.39     9.75     2.444   3.727
> 
>  
> 
> TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.76.1 
> (192.168.76.1) port 0 AF_INET : +/-2.5% @ 95% conf.
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> 
>  87380  16384  87380     20.03       763.15   4.75     10.33    2.038   4.434

Re: L2 network namespaces + macvlan performances

2007-07-07 Thread Daniel Lezcano

Benjamin Thery wrote:

Following a discussion we had at OLS concerning L2 network namespace
performances and how the new macvlan driver could potentially improve
them, I've ported the macvlan patchset on top of Eric's net namespace
patchset on 2.6.22-rc4-mm2.

A little bit of history:

Some months ago, when we ran some performance tests (using netperf)
on net namespace, we observed the following things:

Using 'etun', the virtual ethernet tunnel driver, and IP routes
from inside a network namespace,

- The throughput is the same as the "normal" case(*)
  (* normal case: no namespace, using physical adapters).
  No regression. Good.

- But the CPU load increases a lot. Bad.
  The reasons are:
- All checksums are done in software. No hardware offloading.
- Every TCP packet going through the etun devices is
  duplicated in ip_forward() before we decrease the TTL.
  (packets are routed between both ends of etun)

We also made some testing with bridges, and obtained the same results:
CPU load increase:
- No hardware offloading
- Packets are duplicated somewhere in the bridge+netfilter
  code (can't remember where right now)


This time, I've replaced the etun interface with the new macvlan,
which should benefit from the hardware offloading capabilities of the
physical adapter and suppress the forwarding step.

My test setup is:

      Host A                          Host B
  _________________               _____________
 |   ___________   |             |             |
 |  |  Netns 1  |  |             |             |
 |  |           |  |             |             |
 |  |  macvlan0 |  |             |             |
 |  |_____|_____|  |             |             |
 |        |        |             |             |
 |________|________|             |_____________|
     | eth0 (192.168.0.2)            | eth0 (192.168.0.1)
     |_______________________________|

   macvlan0 (192.168.0.3)

- netperf runs on host A
- netserver runs on host B
- Adapter speed is 1 Gb/s

On this setup I ran the following netperf tests: TCP_STREAM, TCP_MAERTS, 
TCP_RR, UDP_STREAM, UDP_RR.


Between the "normal" case and the "net namespace + macvlan" case, 
results are about the same for both the throughput and the local CPU 
load for the following test types: TCP_MAERTS, TCP_RR, UDP_STREAM, UDP_RR.


macvlan looks like a very good candidate for network namespace in these 
cases.


But, with the TCP_STREAM test, I observed that the CPU load is about the
same (that's what we wanted) but the throughput decreases by about 5%:
from 850 Mb/s down to 810 Mb/s.
I haven't investigated yet why the throughput decreases in this case.
Does it come from my setup, from macvlan's additional processing, or 
something else? I don't know yet.


Attached to this email you'll find the raw netperf outputs for the three 
cases:


- netperf through a physical adapter, no namespace:
netperf-results-2.6.22-rc4-mm2-netns1-vanilla.txt   
- netperf through etun, inside a namespace:
netperf-results-2.6.22-rc4-mm2-netns1-using-etun.txt   
- netperf through macvlan, inside a namespace:

netperf-results-2.6.22-rc4-mm2-netns1-using-macvlan.txt


macvlan looks promising.

Regards,
Benjamin


Very interesting.
Thank you very much Benjamin for investigating this.
I will update the http://lxc.sf.net web site with your description and 
results.






NETPERF RESULTS: the "normal" case : 


No network namespace, traffic goes through real 1 Gb/s physical adapters.


TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.76.1 
(192.168.76.1) port 0 AF_INET : +/-2.5% @ 95% conf.
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384   1400     20.03       857.39   6.39     9.75     2.444   3.727




TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.76.1 
(192.168.76.1) port 0 AF_INET : +/-2.5% @ 95% conf.
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  87380     20.03       763.15   4.75     10.33    2.038   4.434




TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.76.1 
(192.168.76.1) port 0 AF_INET : +/-2.5% @ 95% conf.
Local /Remote
Socket Size   Request Resp.  Elapsed Trans.  

L2 network namespaces + macvlan performances

2007-07-06 Thread Benjamin Thery

Following a discussion we had at OLS concerning L2 network namespace
performances and how the new macvlan driver could potentially improve
them, I've ported the macvlan patchset on top of Eric's net namespace
patchset on 2.6.22-rc4-mm2.

A little bit of history:

Some months ago, when we ran some performance tests (using netperf)
on net namespace, we observed the following things:

Using 'etun', the virtual ethernet tunnel driver, and IP routes
from inside a network namespace,

- The throughput is the same as the "normal" case(*)
  (* normal case: no namespace, using physical adapters).
  No regression. Good.

- But the CPU load increases a lot. Bad.
  The reasons are:
- All checksums are done in software. No hardware offloading.
- Every TCP packet going through the etun devices is
  duplicated in ip_forward() before we decrease the TTL.
  (packets are routed between both ends of etun)

We also made some testing with bridges, and obtained the same results:
CPU load increase:
- No hardware offloading
- Packets are duplicated somewhere in the bridge+netfilter
  code (can't remember where right now)


This time, I've replaced the etun interface with the new macvlan,
which should benefit from the hardware offloading capabilities of the
physical adapter and suppress the forwarding step.

My test setup is:

      Host A                          Host B
  _________________               _____________
 |   ___________   |             |             |
 |  |  Netns 1  |  |             |             |
 |  |           |  |             |             |
 |  |  macvlan0 |  |             |             |
 |  |_____|_____|  |             |             |
 |        |        |             |             |
 |________|________|             |_____________|
     | eth0 (192.168.0.2)            | eth0 (192.168.0.1)
     |_______________________________|

   macvlan0 (192.168.0.3)

- netperf runs on host A
- netserver runs on host B
- Adapter speed is 1 Gb/s
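Purely for illustration, with today's iproute2 the setup above could be sketched as follows. The 2007 patchset predates the `ip netns` tooling, so these exact commands are an assumption on my part; the interface names and addresses follow the diagram:

```shell
# Hypothetical reproduction of the diagrammed setup with modern iproute2
# (the 2007 patchset used different tooling); requires root on Host A.
ip netns add netns1
ip link add link eth0 name macvlan0 type macvlan
ip link set macvlan0 netns netns1
ip netns exec netns1 ip addr add 192.168.0.3/24 dev macvlan0
ip netns exec netns1 ip link set macvlan0 up
ip netns exec netns1 ip link set lo up
```

The macvlan device shares eth0's physical port but has its own MAC address, which is what lets it keep the adapter's checksum/TSO offloads inside the namespace.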

On this setup I ran the following netperf tests: TCP_STREAM, 
TCP_MAERTS, TCP_RR, UDP_STREAM, UDP_RR.


Between the "normal" case and the "net namespace + macvlan" case, 
results are about the same for both the throughput and the local CPU 
load for the following test types: TCP_MAERTS, TCP_RR, UDP_STREAM, UDP_RR.


macvlan looks like a very good candidate for network namespace in 
these cases.


But, with the TCP_STREAM test, I observed that the CPU load is about the
same (that's what we wanted) but the throughput decreases by about 5%:
from 850 Mb/s down to 810 Mb/s.
I haven't investigated yet why the throughput decreases in this case.
Does it come from my setup, from macvlan's additional processing, or 
something else? I don't know yet.


Attached to this email you'll find the raw netperf outputs for the 
three cases:


- netperf through a physical adapter, no namespace:
netperf-results-2.6.22-rc4-mm2-netns1-vanilla.txt   
- netperf through etun, inside a namespace:
netperf-results-2.6.22-rc4-mm2-netns1-using-etun.txt
- netperf through macvlan, inside a namespace:
netperf-results-2.6.22-rc4-mm2-netns1-using-macvlan.txt


macvlan looks promising.

Regards,
Benjamin

--
B e n j a m i n   T h e r y  - BULL/DT/Open Software R&D

   http://www.bull.com
NETPERF RESULTS: the "normal" case : 

No network namespace, traffic goes through real 1 Gb/s physical adapters.


TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.76.1 
(192.168.76.1) port 0 AF_INET : +/-2.5% @ 95% conf.
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384   1400     20.03       857.39   6.39     9.75     2.444   3.727

 

TCP MAERTS TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.76.1 
(192.168.76.1) port 0 AF_INET : +/-2.5% @ 95% conf.
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

 87380  16384  87380     20.03       763.15   4.75     10.33    2.038   4.434

 

TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.76.1 
(192.168.76.1) port 0 AF_INET : +/-2.5% @ 95% conf.
Local /Remote
Socket Size   Request  Resp.   Elapsed  Trans.   CPU    CPU    S.dem   S.dem
Send   Recv   Size     Size    Time     Rate     local  remote local   remote
bytes  bytes  bytes   bytes