On 21 February 2012 07:20, Richard Croucher <richard.crouc...@informatix-sol.com> wrote:
> There is nothing special about this specific configuration. Both
> measurements were made on the same physical server with the same OS
> config, so that can be ruled out as a factor in the difference. I've
> seen similar results on many other systems as well. They had been
> getting closer and closer for some time, but now 10G Ethernet is lower
> latency than InfiniBand. I'm sure you are also aware that there are
> similar results comparing verbs-level programs on RoCE and InfiniBand,
> which show that Ethernet has the edge. I'm a big fan of InfiniBand, but
> it has to deliver the results.
>
> --
> Richard Croucher
> www.informatix-sol.com
> +44-7802-213901
>
> On Mon, 2012-02-20 at 17:03 +0000, Gilad Shainer wrote:
>
> Richard,
>
> What's critically missing is the setup information. What is the server,
> CPU, etc.? Can you please provide it?
>
> Gilad
>
> From: ewg-boun...@lists.openfabrics.org [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of richard Croucher
> Sent: Monday, February 20, 2012 8:50 AM
> To: ewg@lists.openfabrics.org
> Subject: [ewg] disappointing IPoIB performance
>
> I've been undertaking some internal QA testing of the Mellanox CX3s.
>
> An observation I've made for some time, and one that is most likely down
> to the IPoIB implementation rather than the HCA, is that IPoIB latency
> compares increasingly poorly with the standard kernel TCP/IP stack over
> 10G Ethernet.
>
> Looking at the results for the CX3s running both 10G Ethernet and 40G
> InfiniBand on the same server hardware, I get the following median
> latencies with my test setup. The results are from my own test program,
> so they are only meaningful as a comparison between configurations
> running the same test.
>
> Running OFED 1.5.3 and RH 6.0:
>
>   IPoIB (connected)  TCP   33.67 µs   (switchless)
>   IPoIB (datagram)   TCP   31.63 µs   (switchless)
>   IPoIB (connected)  UDP   24.78 µs   (switchless)
>   IPoIB (datagram)   UDP   24.28 µs   (switchless)
>   IPoIB (connected)  UDP   25.37 µs   (1 hop, between ports on the same switch)
>   IPoIB (connected)  TCP   34.48 µs   (1 hop)
>   10G Ethernet       UDP   24.04 µs   (2 hops, across a LAG-connected pair of Ethernet switches)
>   10G Ethernet       TCP   34.59 µs   (2 hops)
>
> The Mellanox Ethernet drivers are tuned for low latency rather than
> throughput, but I would have hoped that the 4x extra bandwidth available
> to InfiniBand would have helped its drivers outperform.
>
> I've seen similar results for the CX2. 10G Ethernet increasingly looks
> like the better option for low latency, particularly with the current
> generation of low-latency Ethernet switches. Switchless Ethernet has
> been better than switchless InfiniBand for some time, but that now
> appears to be the case in switched environments as well. I think this
> reflects the large amount of effort that has gone into tweaking and
> tuning TCP/IP over Ethernet and its low-level drivers, with very little
> activity on the IPoIB front. Unless we see improvements here, it will
> become increasingly difficult to justify InfiniBand deployments.
>
> --
> Richard Croucher
> www.informatix-sol.com
> +44-7802-213901

IPoIB is only a ULP layered on top of IB, not native IB, and it has never
had great performance. If performance is the concern, native IB verbs
will greatly outperform TCP/IP on any hardware platform, be it Ethernet
or InfiniBand.
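You don't even need a custom test program to reproduce both sides of
that comparison: qperf, which ships with OFED, measures socket-level and
verbs-level latency over the same link. A minimal sketch, reusing the
hostnames from the session below as placeholders and default options
throughout:

    # on one node, start the qperf server (it just listens):
    spartan02 $ qperf

    # on the other node, measure TCP, UDP and RC (verbs) latency
    # across the same fabric in a single run:
    spartan01 $ qperf spartan02 tcp_lat udp_lat rc_lat

Since qperf keeps its control connection separate from the data path,
the only thing that changes between the tcp_lat/udp_lat numbers and the
rc_lat number is the transport being measured.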
See for instance some cheap test hardware under my desk:

    spartan01 # ping spartan02
    PING spartan02.prod.eq3.syd.au.ovs (10.1.33.2) 56(84) bytes of data.
    64 bytes from spartan02.prod.eq3.syd.au.ovs (10.1.33.2): icmp_req=1 ttl=64 time=0.067 ms
    64 bytes from spartan02.prod.eq3.syd.au.ovs (10.1.33.2): icmp_req=2 ttl=64 time=0.060 ms
    64 bytes from spartan02.prod.eq3.syd.au.ovs (10.1.33.2): icmp_req=3 ttl=64 time=0.059 ms
    64 bytes from spartan02.prod.eq3.syd.au.ovs (10.1.33.2): icmp_req=4 ttl=64 time=0.058 ms
    64 bytes from spartan02.prod.eq3.syd.au.ovs (10.1.33.2): icmp_req=5 ttl=64 time=0.058 ms

    --- spartan02 ping statistics ---
    5 packets transmitted, 5 received, 0% packet loss, time 3999ms
    rtt min/avg/max/mdev = 0.058/0.060/0.067/0.007 ms

    spartan01 # rdma_lat
    local address:  LID 0x06 QPN 0x7e006d PSN 0x7b0dd5 RKey 0xa042400 VAddr 0x00000001012001
    remote address: LID 0x01 QPN 0x3e0050 PSN 0x58a76e RKey 0xa002400 VAddr 0x00000001051001
    Latency typical: 2.01 usec
    Latency best   : 1.95 usec
    Latency worst  : 2.8  usec

Notice that the ping over IPoIB is in the realm of 60 µs versus about
2 µs for the native RDMA ping? This is also quite old, mostly untuned
DDR hardware, which probably cost a total of $2200 including the switch
and two HCAs.

One needs to compare apples with apples. InfiniBand does TCP/IP entirely
in software, whereas Ethernet NICs have receive and transmit offload
engines for TCP/IP, because TCP/IP is, so to speak, Ethernet's native
protocol. So if you must compare, compare TCP/IP on Ethernet against
RDMA on IB. TCP/IP over IB is supported for compatibility reasons, but
it is by no means a good choice for low latency or high throughput,
because the design of the TCP/IP stack facilitates neither.
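Two quick checks make both halves of that point visible on a given box
(interface names here are just examples):

    # see which TCP offloads the Ethernet NIC provides in hardware
    # (TSO, GRO and friends):
    $ ethtool -k eth2

    # see whether an IPoIB interface is in connected or datagram mode --
    # the same knob behind the connected/datagram rows in Richard's table:
    $ cat /sys/class/net/ib0/mode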
Joseph.

--
Founder | Director | VP Research
Orion Virtualisation Solutions | www.orionvm.com.au
Phone: 1300 56 99 52 | Mobile: 0428 754 84