Re: [ewg] IPoIB to Ethernet routing performance

2011-01-03 Thread Christoph Lameter
On Sat, 25 Dec 2010, Ali Ayoub wrote:

 On Thu, Dec 9, 2010 at 3:46 PM, Christoph Lameter c...@linux.com wrote:
  On Mon, 6 Dec 2010, sebastien dugue wrote:
 
   The Mellanox BridgeX looks a better hardware solution with 12x 10Ge
   ports but when I tested this they could only provide vNIC
   functionality and would not commit to adding IPoIB gateway on their
   roadmap.
 
    Right, we did some evaluation on it and this was really a show stopper.
 
  Did the same thing here and came to the same conclusions.

 May I ask why you need IPoIB when you have EoIB (the vNic driver)?
 Why is it a show stopper?

EoIB is immature for some use cases, such as financial services: there is no
multicast support, for example, so all multicast becomes broadcast. IPoIB has
extensive multicast support, and the various gotchas and hiccups that were
there initially have mostly been worked out.
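
A minimal illustration of that point: a plain IP multicast receiver runs
unchanged over an IPoIB interface such as ib0, since the kernel maps the IP
group join onto an IB multicast group. The sketch below is Python; the group,
port and interface address are made-up placeholders, not values from this
thread.

# Standard IP multicast receiver; nothing IPoIB-specific in the code,
# it works the same whether LOCAL_IF belongs to an Ethernet NIC or ib0.
import socket
import struct

GROUP = "239.1.1.1"       # hypothetical multicast group
PORT = 5000               # hypothetical port
LOCAL_IF = "10.10.0.5"    # example IP address assigned to ib0

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", PORT))

# struct ip_mreq: multicast group + local interface address.
mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton(LOCAL_IF))
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

while True:
    data, addr = sock.recvfrom(65535)
    print(f"got {len(data)} bytes from {addr}")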

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

Re: [ewg] IPoIB to Ethernet routing performance

2010-12-30 Thread Richard Croucher
IPoIB is far easier to use and does not carry the additional management
burden of vNICs.

With vNICs you have to manage the MAC address mapping to each Ethernet gateway
port. In some situations, such as when multiple gateways are used for
resiliency, this can amount to a lot of separate vNICs to manage on each
server.  In a small configuration I had, we ended up with 6 vNICs per server
to manage.  On a large configuration this additional management would be a big
burden.

My experience with IPoIB has always been very positive. All my existing
socket programs have worked, even some esoteric ioctls I use for multicast
and buffer management.
Performance could always be better, but in my experience it's not great for
the vNICs either.  Latency in particular was very disappointing when I
tested.
If you want high performance you have to avoid TCP/IP.

-Original Message-
From: Jabe [mailto:jabe.chap...@shiftmail.org] 
Sent: 27 December 2010 11:51
To: richard.crouc...@informatix-sol.com
Cc: Richard Croucher; 'Ali Ayoub'; 'Christoph Lameter'; 'linux-rdma';
'sebastien dugue'; 'OF EWG'
Subject: Re: [ewg] IPoIB to Ethernet routing performance

On 12/26/2010 11:57 AM, Richard Croucher wrote:
 The vNIC driver only works when you have Ethernet/InfiniBand hardware
 gateways in your environment.   It is useful when you have external hosts
 to communicate with which do not have direct InfiniBand connectivity.
 IPoIB is still heavily used in these environments to provide TCP/IP
 connectivity within the InfiniBand fabric.
 The primary use case for vNICs is probably for virtualization servers, so
 that individual guests can be presented with a virtual Ethernet NIC and do
 not need to load any InfiniBand drivers.  Only the hypervisor needs to
 have the InfiniBand software stack loaded.
 I've also applied vNICs in the Financial Services arena, for connectivity
 to external TCP/IP services, but there the IPoIB gateway function is
 arguably more useful.

 The whole vNIC arena is complicated by different, incompatible
 implementations from each of Qlogic and Mellanox.

 Richard



Richard, with your explanation I understand why vNIC / EoIB is used in 
the case you cite, but I don't understand why it is NOT used in the 
other cases (like Ali says).

I can *guess* it's probably because with a virtual Ethernet fabric you 
have to run the whole IP stack in software, probably without even having the 
stateless offloads (so it would be a performance reason). Is that the 
reason?

Thank you

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] IPoIB to Ethernet routing performance

2010-12-29 Thread Tziporet Koren

 On 12/28/2010 5:30 PM, Reeted wrote:


You and Richard seem to have good experience of infiniband in 
virtualized environments. May I ask one thing?
We were thinking about buying some Mellanox ConnectX-2 cards for use with 
SR-IOV (hardware virtualization for PCI passthrough, supposedly supported 
by ConnectX-2) in KVM (which also supposedly supports SR-IOV and PCI passthrough).

Do you have any info on whether this will work, in KVM or other hypervisors?
I asked on the KVM mailing list but they have not tried this card (which 
is the only SR-IOV card among InfiniBand ones, so they have not tried 
InfiniBand).



We are working on enabling SR-IOV on ConnectX-2 cards.
Once we have it working with KVM we will submit the patches to the 
linux-rdma list.

It should be ready in a few months - but don't ask how many is a few :-)

Tziporet

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] IPoIB to Ethernet routing performance

2010-12-28 Thread Reeted

On 12/28/2010 01:06 AM, Ali Ayoub wrote:

EoIB's primary use is not virtualization, although it can support it in
better ways than other ULPs.
FYI, today a fully or para-virtualized driver in the guest OS is also
needed for IPoIB.
Only when a platform-virtualization solution is available will the guest OS
run the IB stack (for any ULP).
   


You and Richard seem to have good experience of infiniband in 
virtualized environments. May I ask one thing?
We were thinking about buying some Mellanox ConnectX-2 cards for use with 
SR-IOV (hardware virtualization for PCI passthrough, supposedly supported by 
ConnectX-2) in KVM (which also supposedly supports SR-IOV and PCI passthrough).

Do you have any info on whether this will work, in KVM or other hypervisors?
I asked on the KVM mailing list but they have not tried this card (which is 
the only SR-IOV card among InfiniBand ones, so they have not tried 
InfiniBand).

We would be interested in both native InfiniBand and IPoIB support.
Thank you.
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] IPoIB to Ethernet routing performance

2010-12-27 Thread Jabe

On 12/26/2010 11:57 AM, Richard Croucher wrote:

The vNIC driver only works when you have Ethernet/InfiniBand hardware
gateways in your environment.   It is useful when you have external hosts to
communicate with which do not have direct InfiniBand connectivity.
IPoIB is still heavily used in these environments to provide TCP/IP
connectivity within the InfiniBand fabric.
The primary use case for vNICs is probably for virtualization servers, so
that individual guests can be presented with a virtual Ethernet NIC and do
not need to load any InfiniBand drivers.  Only the hypervisor needs to have
the InfiniBand software stack loaded.
I've also applied vNICs in the Financial Services arena, for connectivity to
external TCP/IP services, but there the IPoIB gateway function is arguably
more useful.

The whole vNIC arena is complicated by different, incompatible
implementations from each of Qlogic and Mellanox.

Richard
   



Richard, with your explanation I understand why vNIC / EoIB is used in 
the case you cite, but I don't understand why it is NOT used in the 
other cases (like Ali says).


I can *guess* it's probably because with a virtual Ethernet fabric you 
have to run the whole IP stack in software, probably without even having the 
stateless offloads (so it would be a performance reason). Is that the 
reason?


Thank you
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] IPoIB to Ethernet routing performance

2010-12-27 Thread Ali Ayoub
On Sun, Dec 26, 2010 at 2:57 AM, Richard Croucher
rich...@informatix-sol.com wrote:
 The vNIC driver only works when you have Ethernet/InfiniBand hardware
 gateways in your environment.   It is useful when you have external hosts to
 communicate with which do not have direct InfiniBand connectivity.
 IPoIB is still heavily used in these environments to provide TCP/IP
 connectivity within the InfiniBand fabric.

Once you have the BridgeX HW, the Mellanox vNic (EoIB) driver provides IB to
Ethernet connectivity, as well as IB to IB connectivity.
Note that IB to IB connectivity doesn't involve the bridge HW
(it's peer-to-peer communication), so any packet sent internally within the
IB fabric doesn't reach the bridge HW.
Today EoIB requires the BridgeX HW; in the future, Mellanox may
support a bridge-less mode where it can work without the bridge HW.

 The primary use case for vNICs is probably for virtualization servers, so
 that individual guests can be presented with a virtual Ethernet NIC and do
 not need to load any InfiniBand drivers.  Only the hypervisor needs to have
 the InfiniBand software stack loaded.

EoIB's primary use is not virtualization, although it can support it in
better ways than other ULPs.
FYI, today a fully or para-virtualized driver in the guest OS is also
needed for IPoIB.
Only when a platform-virtualization solution is available will the guest OS
run the IB stack (for any ULP).

 -Original Message-
 From: ewg-boun...@lists.openfabrics.org
 [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Ali Ayoub
 Sent: 26 December 2010 07:43
 To: Christoph Lameter
 Cc: linux-rdma; sebastien dugue; Richard Croucher; OF EWG
 Subject: Re: [ewg] IPoIB to Ethernet routing performance

 On Thu, Dec 9, 2010 at 3:46 PM, Christoph Lameter c...@linux.com wrote:
 On Mon, 6 Dec 2010, sebastien dugue wrote:

  The Mellanox BridgeX looks a better hardware solution with 12x 10Ge
  ports but when I tested this they could only provide vNIC
  functionality and would not commit to adding IPoIB gateway on their
  roadmap.

   Right, we did some evaluation on it and this was really a show stopper.

 Did the same thing here and came to the same conclusions.

 May I ask why you need IPoIB when you have EoIB (the vNic driver)?
 Why is it a show stopper?
 ___
 ewg mailing list
 ewg@lists.openfabrics.org
 http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] IPoIB to Ethernet routing performance

2010-12-26 Thread Richard Croucher
The vNIC driver only works when you have Ethernet/InfiniBand hardware
gateways in your environment.   It is useful when you have external hosts to
communicate with which do not have direct InfiniBand connectivity.
IPoIB is still heavily used in these environments to provide TCP/IP
connectivity within the InfiniBand fabric.
The primary use case for vNICs is probably for virtualization servers, so
that individual guests can be presented with a virtual Ethernet NIC and do
not need to load any InfiniBand drivers.  Only the hypervisor needs to have
the InfiniBand software stack loaded.
I've also applied vNICs in the Financial Services arena, for connectivity to
external TCP/IP services, but there the IPoIB gateway function is arguably
more useful.

The whole vNIC arena is complicated by different, incompatible
implementations from each of Qlogic and Mellanox.

Richard

-Original Message-
From: ewg-boun...@lists.openfabrics.org
[mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Ali Ayoub
Sent: 26 December 2010 07:43
To: Christoph Lameter
Cc: linux-rdma; sebastien dugue; Richard Croucher; OF EWG
Subject: Re: [ewg] IPoIB to Ethernet routing performance

On Thu, Dec 9, 2010 at 3:46 PM, Christoph Lameter c...@linux.com wrote:
 On Mon, 6 Dec 2010, sebastien dugue wrote:

  The Mellanox BridgeX looks a better hardware solution with 12x 10Ge
  ports but when I tested this they could only provide vNIC
  functionality and would not commit to adding IPoIB gateway on their
  roadmap.

   Right, we did some evaluation on it and this was really a show stopper.

 Did the same thing here and came to the same conclusions.

May I ask why you need IPoIB when you have EoIB (the vNic driver)?
Why is it a show stopper?
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] IPoIB to Ethernet routing performance

2010-12-25 Thread Ali Ayoub
On Thu, Dec 9, 2010 at 3:46 PM, Christoph Lameter c...@linux.com wrote:
 On Mon, 6 Dec 2010, sebastien dugue wrote:

  The Mellanox BridgeX looks a better hardware solution with 12x 10Ge
  ports but when I tested this they could only provide vNIC
  functionality and would not commit to adding IPoIB gateway on their
  roadmap.

   Right, we did some evaluation on it and this was really a show stopper.

 Did the same thing here and came to the same conclusions.

May I ask why you need IPoIB when you have EoIB (the vNic driver)?
Why is it a show stopper?
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] IPoIB to Ethernet routing performance

2010-12-17 Thread sebastien dugue

  Hi Matthieu,

On Thu, 16 Dec 2010 23:20:35 +0100
matthieu hautreux matthieu.hautr...@gmail.com wrote:

  The router is fitted with one ConnectX2 QDR HCA and one dual port
 Myricom 10G
   Ethernet adapter.
  
   ...
  
  Here are some numbers:
  
  - 1 IPoIB stream between client and router: 20 Gbits/sec
  
Looks OK.
  
  - 2 Ethernet streams between router and server: 19.5 Gbits/sec
  
Looks OK.
  
 
 
  Actually I am amazed you can get such a speed with IPoIB. Trying with
  NPtcp on my DDR infiniband I can only obtain about 4.6Gbit/sec at the
  best packet size (that is 1/4 of the infiniband bandwidth) with this
  chip embedded in the mainboard: InfiniBand: Mellanox Technologies
  MT25204 [InfiniHost III Lx HCA]; and dual E5430 xeon (not nehalem).
  That's with 2.6.37 kernel and vanilla ib_ipoib module. What's wrong with
  my setup?
  I always assumed that such a slow speed was due to the lack of
  offloading capabilities you get with ethernet cards, but maybe I was
  wrong...?
 
 Hi,
 
 I ran the same kind of experiments as Sebastien and got results
 similar to yours, Jabe, with about ~4.6 Gbit/s.
 
 I am using a QDR HCA and IPoIB in connected mode on the InfiniBand part of the
 testbed and 2 * 10GbE Ethernet cards in bonding on the Ethernet side of the
 router.
 To get better results, I had to increase the MTU on the Ethernet side from
 1500 to 9000. Indeed, due to TCP path MTU discovery, during routed
 exchanges the MTU used on the IPoIB link for TCP messages was automatically
 set to the minimum MTU of 1500. This small but very standard MTU value
 does not seem to be handled well by the ipoib_cm layer.
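
One way to observe that clamping from user space is to ask the kernel for the
path MTU it currently holds towards a destination. A rough Python sketch using
the Linux IP_MTU socket options (numeric constants from <bits/in.h>; the
addresses are placeholders):

import socket

IP_MTU_DISCOVER = 10   # Linux value
IP_PMTUDISC_DO = 2     # always perform path MTU discovery
IP_MTU = 14            # getsockopt: path MTU currently known for this socket

def path_mtu(dest_ip, dest_port=9):
    # connect() on a UDP socket sends nothing; it only fixes the route,
    # so IP_MTU returns what the kernel currently knows for that route
    # (the first-hop MTU until a smaller path MTU has been learned).
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
    s.connect((dest_ip, dest_port))
    try:
        return s.getsockopt(socket.IPPROTO_IP, IP_MTU)
    finally:
        s.close()

print(path_mtu("192.0.2.10"))   # e.g. 1500 for a host behind the router
print(path_mtu("10.10.0.7"))    # e.g. 65520 for an IPoIB peer in CM mode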

  This may be due to the fact that the IB MTU is 2048. Every 1500-byte packet
is padded to 2048 bytes before being sent over the wire, so you're losing
roughly 25% of the bandwidth compared to an IPoIB MTU which is a multiple of 2048.

 
 Is this issue already known and/or reported? It would be really
 interesting to understand why a small MTU value is such a problem for
 ipoib_cm. After a quick look at the code, it seems that ipoib packet
 processing is single threaded and that each IP packet is
 transmitted/received and processed as a single unit. If that turns out to be
 the bottleneck, do you think that packet aggregation and/or processing
 parallelization could be feasible in a future ipoib module? A large share of
 Ethernet networks are configured with an MTU of 1500, and 10GbE cards
 currently employ parallelization strategies in their kernel modules to cope
 with this problem. It is clear that a bigger MTU is better, but it is not
 always achievable with existing equipment and machines. IMHO,
 that is a real problem for InfiniBand/Ethernet interoperability.
 
 Sebastien, concerning your bad performance of 9.3 Gbit/s when routing 2
 streams from your InfiniBand client to your Ethernet server, what is the mode
 of your bonding on the Ethernet side during the test? Are you using
 balance-rr or LACP?

  I did not use any Ethernet teaming; I only declared 2 aliases on the
clients' ib0 and set the routing tables accordingly.
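
  For reference, a rough sketch of that kind of setup driven from Python via
iproute2; every address below is a made-up example rather than one used in
these tests, and it needs root:

import subprocess

def sh(cmd):
    print("+", cmd)
    subprocess.run(cmd.split(), check=True)

# Two address aliases on the client's ib0 (example addresses).
sh("ip addr add 10.10.1.1/24 dev ib0 label ib0:0")
sh("ip addr add 10.10.2.1/24 dev ib0 label ib0:1")

# Route each server address through a different router-side address,
# pinning the source so each stream leaves from its own alias.
sh("ip route add 192.168.1.10/32 via 10.10.1.254 src 10.10.1.1")
sh("ip route add 192.168.1.11/32 via 10.10.2.254 src 10.10.2.1")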

  Sébastien.

 I got this kind of result with LACP, as only one link
 is really used during the transmissions, and which link is used depends on the
 layer-2 information of the peers involved in the communication (as long as you
 use the default xmit_hash_policy).
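
For illustration, the default layer2 policy picks the slave from the MAC pair
alone -- the bonding documentation gives the formula as (source MAC XOR
destination MAC) modulo the slave count, taken on the low byte -- so every
stream between the same two hosts lands on the same link. A tiny sketch with
made-up MACs:

def layer2_slave(src_mac: str, dst_mac: str, n_slaves: int) -> int:
    # Low byte of (src XOR dst), modulo the number of bond slaves,
    # as described for xmit_hash_policy=layer2 in the bonding docs.
    src = int(src_mac.replace(":", ""), 16)
    dst = int(dst_mac.replace(":", ""), 16)
    return ((src ^ dst) & 0xFF) % n_slaves

router = "00:11:22:33:44:55"   # example MACs
server = "66:77:88:99:aa:bb"

# Every TCP stream between these two MACs hashes to the same slave index:
print(layer2_slave(router, server, 2))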
 
 HTH
 Regards,
 Matthieu
 
  Also what application did you use for the benchmark?
  Thank you
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

Re: [ewg] IPoIB to Ethernet routing performance

2010-12-17 Thread Roland Dreier
   This may be due to the fact that the IB MTU is 2048. Every 1500-byte packet
   is padded to 2048 bytes before being sent over the wire, so you're losing
   roughly 25% of the bandwidth compared to an IPoIB MTU which is a multiple of 2048.

This isn't true.  IB packets are only padded to a multiple of 4 bytes.

However there's no point in using IPoIB connected mode to pass packets
smaller than the IB MTU -- you might as well use datagram mode.
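
For anyone who wants to compare the two modes, ipoib exposes the mode through
sysfs (/sys/class/net/<dev>/mode, values "datagram" or "connected"). A small
sketch; reading works as any user, switching needs root and typically changes
the interface MTU:

from pathlib import Path

def get_mode(dev="ib0"):
    return Path(f"/sys/class/net/{dev}/mode").read_text().strip()

def set_mode(dev="ib0", mode="datagram"):
    assert mode in ("datagram", "connected")
    Path(f"/sys/class/net/{dev}/mode").write_text(mode + "\n")

print(get_mode("ib0"))
# set_mode("ib0", "datagram")   # uncomment when running as root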

 - R.
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg



Re: [ewg] IPoIB to Ethernet routing performance

2010-12-17 Thread matthieu hautreux
2010/12/17 Roland Dreier rdre...@cisco.com

   This may be due to the fact that the IB MTU is 2048. Every 1500-byte packet
   is padded to 2048 bytes before being sent over the wire, so you're losing
   roughly 25% of the bandwidth compared to an IPoIB MTU which is a multiple of 2048.

 This isn't true.  IB packets are only padded to a multiple of 4 bytes.

 However there's no point in using IPoIB connected mode to pass packets
 smaller than the IB MTU -- you might as well use datagram mode.


We are using InfiniBand as an HPC cluster interconnect network, and our
compute nodes use this technology to exchange data over IPoIB with an MTU of
65520, do RDMA MPI communications and access Lustre filesystems. On top of
that, some nodes are connected to both the IB interconnect and an external
Ethernet network. These nodes act as IP routers and enable compute nodes to
access site-centric resources (home directories using NFS, LDAP, ...).
Compute nodes use IPoIB with a large MTU to contact the router nodes,
so we get really good performance when we only communicate with the
routers. However, as soon as the compute nodes communicate with the external
Ethernet world, TCP path MTU discovery automatically reduces the IPoIB MTU
to 1500, the Ethernet MTU, and we hit this 4.6 Gbit/s wall.

Using datagram mode in our scenario is not possible, as it would reduce the
cluster-internal IPoIB performance. What we'd rather have is an
ipoib_cm that handles small packets better. Do you think this
limitation is an HCA hardware limitation (number of packets per second) or a
software limitation (number of packets processed per second)? I would think
that it is a software limitation, as better results are achieved in datagram
mode with the same 1500-byte MTU. IPoIB in connected mode seems to use a
single completion queue with a single MSI vector for all the queue pairs it
creates to communicate. Perhaps multiplying the number of completion
queues and MSI vectors could help to spread/parallelize the load and get
better results. What is your feeling about that?
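
A quick back-of-the-envelope on the packets-per-second side of that question
(pure arithmetic, no claim about where the limit actually sits):

def pps(gbit_per_s: float, mtu_bytes: int) -> float:
    # Packets per second needed to carry a given throughput at a given MTU.
    return gbit_per_s * 1e9 / 8 / mtu_bytes

for mtu in (1500, 9000, 65520):
    print(f"MTU {mtu:>5}: {pps(20.0, mtu):>12,.0f} pkt/s for 20 Gbit/s, "
          f"{pps(4.6, mtu):>12,.0f} pkt/s for 4.6 Gbit/s")

# At MTU 1500, 20 Gbit/s means roughly 1.7 million packets per second through
# a single-threaded path, versus about 38 thousand at MTU 65520, which is at
# least consistent with a per-packet software cost rather than a link limit.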

Regards,
Matthieu


In fact, we really need to use IPoIB connected mode, as datagram mode would
reduce our cluster-internal IPoIB performance.



  - R.

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

Re: [ewg] IPoIB to Ethernet routing performance

2010-12-14 Thread Richard Croucher
Qlogic claim their QLE7340 has the lowest latency for MPI applications, but it
is restricted to a single port.   I've not carried out IPoIB testing of this
card.  There are plenty of published results for the Mellanox ConnectX cards,
which tend to account for the majority of HCAs deployed.

My opinion is that the Mellanox ConnectX has more capabilities on board and
is probably the best all-rounder, but the Qlogic TrueScale lets certain apps
get closer to the metal and therefore achieve lower latency.

You really have to carry out your own testing, since it depends on what you
consider important.

Richard

-Original Message-
From: ewg-boun...@lists.openfabrics.org
[mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Jabe
Sent: 13 December 2010 16:02
To: Jason Gunthorpe
Cc: linux-rdma; OF EWG
Subject: Re: [ewg] IPoIB to Ethernet routing performance

On 12/06/2010 10:27 PM, Jason Gunthorpe wrote:
 On Mon, Dec 06, 2010 at 09:47:42PM +0100, Jabe wrote:

 Technologies MT25204 [InfiniHost III Lx HCA]; and dual E5430 xeon
 (not nehalem).
  
 Newer Mellanox cards have most of the offloads you see for ethernet so
 they get better performance. Plus  Nehalem is just better at TCP in
 the first place..


Very interesting

Do you know if new Qlogic IB cards like QLE7340 also have such offloads?

In general which brand would you recommend for IB and for IPoIB?

Thank you
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] IPoIB to Ethernet routing performance

2010-12-13 Thread Christoph Lameter
On Mon, 6 Dec 2010, sebastien dugue wrote:

  The Mellanox BridgeX looks a better hardware solution with 12x 10Ge
  ports but when I tested this they could only provide vNIC
  functionality and would not commit to adding IPoIB gateway on their
  roadmap.

   Right, we did some evaluation on it and this was really a show stopper.

Did the same thing here and came to the same conclusions.

  Qlogic also offer the 12400 Gateway.  This has 6x 10ge ports.
  However, like the Mellanox, I understand they only provide host vNIC
  support.

Really? I was hoping that they would have something worth looking at.
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] IPoIB to Ethernet routing performance

2010-12-06 Thread sebastien dugue

  Hi,

  I know this might be off topic, but somebody may have already run into the 
same
problem before.

  I'm trying to use a server as a router between an IB fabric and an Ethernet 
network.

  The router is fitted with one ConnectX2 QDR HCA and one dual port Myricom 10G
Ethernet adapter.

  I did some bandwidth measurements using iperf with the following setup:

  +----------+            +----------+            +----------+
  |          |            |          |  10G Eth   |          |
  |          |   QDR IB   |          +------------+          |
  |  client  +------------+  Router  |  10G Eth   |  Server  |
  |          |            |          +------------+          |
  |          |            |          |            |          |
  +----------+            +----------+            +----------+
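
  For reference, a stripped-down version of what such an iperf run measures,
as a Python sketch (the port is arbitrary, and a single Python stream will not
saturate links like these; it only illustrates the methodology):

import socket, sys, time

PORT = 5001          # arbitrary example port
CHUNK = 1 << 20      # 1 MiB per send/recv
DURATION = 10.0      # seconds to transmit

def server():
    with socket.create_server(("", PORT)) as srv:
        conn, addr = srv.accept()
        with conn:
            total, start = 0, time.time()
            while True:
                data = conn.recv(CHUNK)
                if not data:
                    break
                total += len(data)
            secs = time.time() - start
            print(f"{addr}: {total * 8 / secs / 1e9:.2f} Gbit/s")

def client(host):
    buf = b"\0" * CHUNK
    with socket.create_connection((host, PORT)) as conn:
        end = time.time() + DURATION
        while time.time() < end:
            conn.sendall(buf)

if __name__ == "__main__":
    server() if sys.argv[1] == "server" else client(sys.argv[2])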

  
  However, the routing performance is far from what I would have expected.

  Here are some numbers:

  - 1 IPoIB stream between client and router: 20 Gbits/sec

Looks OK.

  - 2 Ethernet streams between router and server: 19.5 Gbits/sec

Looks OK.

  - routing 1 IPoIB stream to 1 Ethernet stream from client to server: 9.8 
Gbits/sec

We manage to saturate the Ethernet link, looks good so far.

  - routing 2 IPoIB streams to 2 Ethernet streams from client to server: 9.3 
Gbits/sec

Argh, even less than when routing a single stream. I would have expected
a bit more than this.


  Has anybody ever tried to do some routing between an IB fabric and an Ethernet
network and achieved some sensible bandwidth figures?

  Are there some known limitations in what I try to achieve?


  Thanks,

  Sébastien.




___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

Re: [ewg] IPoIB to Ethernet routing performance

2010-12-06 Thread Richard Croucher
You may be able to improve things by doing some OS tuning.  All this data should stay 
in kernel mode, but there are lots of bottlenecks in the TCP/IP stack that limit 
scalability.  The IPoIB code has not been optimized for this use case.
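
To illustrate the kind of OS tuning meant here: the usual first knobs are the
TCP and socket buffer limits under /proc/sys. The values below are generic
examples only, not recommendations from this thread, and writing them needs
root:

from pathlib import Path

# Generic TCP/socket buffer and backlog limits (illustrative values).
SYSCTLS = {
    "net/core/rmem_max":           "16777216",
    "net/core/wmem_max":           "16777216",
    "net/ipv4/tcp_rmem":           "4096 87380 16777216",
    "net/ipv4/tcp_wmem":           "4096 65536 16777216",
    "net/core/netdev_max_backlog": "30000",
}

for key, value in SYSCTLS.items():
    path = Path("/proc/sys") / key
    print(f"{key}: {path.read_text().strip()} -> {value}")
    path.write_text(value + "\n")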

You don't mention what Server, kernel and OFED distro you are running.

The best performance is achieved using InfiniBand/Ethernet hardware gateways.  
Most of these provide virtual Ethernet NICs to InfiniBand hosts, but the 
Voltaire 4036E does provide an IPoIB to Ethernet gateway capability.  This is 
FPGA based, so it provides much higher performance than you will achieve using 
a standard server solution. 

-Original Message-
From: ewg-boun...@lists.openfabrics.org 
[mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of sebastien dugue
Sent: 06 December 2010 10:25
To: OF EWG
Cc: linux-rdma
Subject: [ewg] IPoIB to Ethernet routing performance


  Hi,

  I know this might be off topic, but somebody may have already run into the 
same
problem before.

  I'm trying to use a server as a router between an IB fabric and an Ethernet 
network.

  The router is fitted with one ConnectX2 QDR HCA and one dual port Myricom 10G
Ethernet adapter.

  I did some bandwidth measurements using iperf with the following setup:

  +----------+            +----------+            +----------+
  |          |            |          |  10G Eth   |          |
  |          |   QDR IB   |          +------------+          |
  |  client  +------------+  Router  |  10G Eth   |  Server  |
  |          |            |          +------------+          |
  |          |            |          |            |          |
  +----------+            +----------+            +----------+

  
  However, the routing performance is far from what I would have expected.

  Here are some numbers:

  - 1 IPoIB stream between client and router: 20 Gbits/sec

Looks OK.

  - 2 Ethernet streams between router and server: 19.5 Gbits/sec

Looks OK.

  - routing 1 IPoIB stream to 1 Ethernet stream from client to server: 9.8 
Gbits/sec

We manage to saturate the Ethernet link, looks good so far.

  - routing 2 IPoIB streams to 2 Ethernet streams from client to server: 9.3 
Gbits/sec

Argh, even less than when routing a single stream. I would have expected
a bit more than this.


  Has anybody ever tried to do some routing between an IB fabric and an Ethernet
network and achieved some sensible bandwidth figures?

  Are there some known limitations in what I try to achieve?


  Thanks,

  Sébastien.




___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

Re: [ewg] IPoIB to Ethernet routing performance

2010-12-06 Thread sebastien dugue
On Mon, 6 Dec 2010 10:49:58 -
Richard Croucher rich...@informatix-sol.com wrote:

 You may be able to improve by doing some OS tuning.

  Right, I tried a few things concerning the TCP/IP stack tuning but nothing
really came out of it.

  All this data should stay in kernel mode but there are lots of bottlenecks in
 the TCP/IP stack that limit scalability.

  That may be my problem in fact.

  The IPoIB code has not been optimized for this use case.

  I don't think IPoIB is the bottleneck in this case, as I managed to feed
2 IPoIB streams between the client and the router, yielding about 40 Gbits/s 
of bandwidth.

 
 You don't mention what Server, kernel and OFED distro you are running.

  Right, sorry. The router is one of our 4-socket Nehalem-EX boxes with 2 IOHs, 
which is running OFED 1.5.2.

 
 The best performance is achieved using InfiniBand/Ethernet hardware gateways.
 Most of these provide virtual Ethernet NICs to InfiniBand hosts, but the 
 Voltaire
 4036E does provide a  IPoIB to Ethernet gateway capability.  This is FPGA 
 based
 so does provide much higher performance than you will achieve using a 
 standard server solution.

  That may be a solution indeed. Are there any real world figures out there
concerning the 4036E performance?

  Thanks Richard,

  Sébastien.


 
 -Original Message-
 From: ewg-boun...@lists.openfabrics.org 
 [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of sebastien dugue
 Sent: 06 December 2010 10:25
 To: OF EWG
 Cc: linux-rdma
 Subject: [ewg] IPoIB to Ethernet routing performance
 
 
   Hi,
 
   I know this might be off topic, but somebody may have already run into the 
 same
 problem before.
 
   I'm trying to use a server as a router between an IB fabric and an Ethernet 
 network.
 
   The router is fitted with one ConnectX2 QDR HCA and one dual port Myricom 
 10G
 Ethernet adapter.
 
   I did some bandwidth measurements using iperf with the following setup:
 
   +----------+            +----------+            +----------+
   |          |            |          |  10G Eth   |          |
   |          |   QDR IB   |          +------------+          |
   |  client  +------------+  Router  |  10G Eth   |  Server  |
   |          |            |          +------------+          |
   |          |            |          |            |          |
   +----------+            +----------+            +----------+
 
   
   However, the routing performance is far from what I would have expected.
 
   Here are some numbers:
 
   - 1 IPoIB stream between client and router: 20 Gbits/sec
 
 Looks OK.
 
   - 2 Ethernet streams between router and server: 19.5 Gbits/sec
 
 Looks OK.
 
   - routing 1 IPoIB stream to 1 Ethernet stream from client to server: 9.8 
 Gbits/sec
 
 We manage to saturate the Ethernet link, looks good so far.
 
   - routing 2 IPoIB streams to 2 Ethernet streams from client to server: 9.3 
 Gbits/sec
 
  Argh, even less than when routing a single stream. I would have expected
 a bit more than this.
 
 
   Has anybody ever tried to do some routing between an IB fabric and an 
 Ethernet
 network and achieved some sensible bandwidth figures?
 
   Are there some known limitations in what I try to achieve?
 
 
   Thanks,
 
   Sébastien.
 
 
 
 
 ___
 ewg mailing list
 ewg@lists.openfabrics.org
 http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
 
 ___
 ewg mailing list
 ewg@lists.openfabrics.org
 http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

Re: [ewg] IPoIB to Ethernet routing performance

2010-12-06 Thread Richard Croucher
Unfortunately, the 4036E only has two 10G Ethernet ports which will ultimately 
limit the throughput.  

The Mellanox BridgeX looks a better hardware solution with 12x 10Ge ports but 
when I tested this they could only provide vNIC functionality and would not 
commit to adding IPoIB gateway on their roadmap.  

Qlogic also offer the 12400 Gateway.  This has 6x 10ge ports.   However, like 
the Mellanox, I understand they only provide host vNIC support.

I'll leave it to representatives from Voltaire, Mellanox and Qlogic to update 
us, particularly on support for an InfiniBand to Ethernet gateway for RoCEE.  This 
is needed so that RDMA sessions can be run between InfiniBand and RoCEE 
connected hosts.  I don't believe this will work over any of today's 
available products.

Richard

-Original Message-
From: sebastien dugue [mailto:sebastien.du...@bull.net] 
Sent: 06 December 2010 11:40
To: Richard Croucher
Cc: 'OF EWG'; 'linux-rdma'
Subject: Re: [ewg] IPoIB to Ethernet routing performance

On Mon, 6 Dec 2010 10:49:58 -
Richard Croucher rich...@informatix-sol.com wrote:

 You may be able to improve by doing some OS tuning.

  Right, I tried a few things concerning the TCP/IP stack tuning but nothing
really came out of it.

  All this data should stay in kernel mode but there are lots of bottlenecks in
 the TCP/IP stack that limit scalability.

  That may be my problem in fact.

  The IPoIB code has not been optimized for this use case.

  I don't think IPoIB is the bottleneck in this case, as I managed to feed
2 IPoIB streams between the client and the router, yielding about 40 Gbits/s 
of bandwidth.

 
 You don't mention what Server, kernel and OFED distro you are running.

  Right, sorry. The router is one of our 4-socket Nehalem-EX boxes with 2 IOHs, 
which is running OFED 1.5.2.

 
 The best performance is achieved using InfiniBand/Ethernet hardware gateways.
 Most of these provide virtual Ethernet NICs to InfiniBand hosts, but the 
 Voltaire
 4036E does provide a  IPoIB to Ethernet gateway capability.  This is FPGA 
 based
 so does provide much higher performance than you will achieve using a 
 standard server solution.

  That may be a solution indeed. Are there any real world figures out there
concerning the 4036E performance?

  Thanks Richard,

  Sébastien.


 
 -Original Message-
 From: ewg-boun...@lists.openfabrics.org 
 [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of sebastien dugue
 Sent: 06 December 2010 10:25
 To: OF EWG
 Cc: linux-rdma
 Subject: [ewg] IPoIB to Ethernet routing performance
 
 
   Hi,
 
   I know this might be off topic, but somebody may have already run into the 
 same
 problem before.
 
   I'm trying to use a server as a router between an IB fabric and an Ethernet 
 network.
 
   The router is fitted with one ConnectX2 QDR HCA and one dual port Myricom 
 10G
 Ethernet adapter.
 
   I did some bandwidth measurements using iperf with the following setup:
 
   +----------+            +----------+            +----------+
   |          |            |          |  10G Eth   |          |
   |          |   QDR IB   |          +------------+          |
   |  client  +------------+  Router  |  10G Eth   |  Server  |
   |          |            |          +------------+          |
   |          |            |          |            |          |
   +----------+            +----------+            +----------+
 
   
   However, the routing performance is far from what I would have expected.
 
   Here are some numbers:
 
   - 1 IPoIB stream between client and router: 20 Gbits/sec
 
 Looks OK.
 
   - 2 Ethernet streams between router and server: 19.5 Gbits/sec
 
 Looks OK.
 
   - routing 1 IPoIB stream to 1 Ethernet stream from client to server: 9.8 
 Gbits/sec
 
 We manage to saturate the Ethernet link, looks good so far.
 
   - routing 2 IPoIB streams to 2 Ethernet streams from client to server: 9.3 
 Gbits/sec
 
  Argh, even less than when routing a single stream. I would have expected
 a bit more than this.
 
 
   Has anybody ever tried to do some routing between an IB fabric and an 
 Ethernet
 network and achieved some sensible bandwidth figures?
 
   Are there some known limitations in what I try to achieve?
 
 
   Thanks,
 
   Sébastien.
 
 
 
 
 ___
 ewg mailing list
 ewg@lists.openfabrics.org
 http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
 
 ___
 ewg mailing list
 ewg@lists.openfabrics.org
 http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

Re: [ewg] IPoIB to Ethernet routing performance

2010-12-06 Thread sebastien dugue
On Mon, 6 Dec 2010 12:08:43 -
Richard Croucher rich...@informatix-sol.com wrote:

 Unfortunately, the 4036E only has two 10G Ethernet ports which will 
 ultimately limit the throughput.

  I'll need to look into this option.

 
 The Mellanox BridgeX looks a better hardware solution with 12x 10Ge ports but 
 when I tested this they could only provide vNIC functionality and would not 
 commit to adding IPoIB gateway on their roadmap.

  Right, we did some evaluation on it and this was really a show stopper.

  Thanks,

  Sébastien.

 
 Qlogic also offer the 12400 Gateway.  This has 6x 10ge ports.   However, like 
 the Mellanox, I understand they only provide host vNIC support.
 
 I'll leave it to representatives from Voltaire, Mellanox and Qlogic to update 
 us. Particularly on support for InfiniBand to Ethernet Gateway for RoCEE.  
 This is needed so that RDMA sessions can be run between InfiniBand and RoCEE 
  connected hosts.  I don't believe this will work over any of today's 
 available products.
 
 Richard
 
 -Original Message-
 From: sebastien dugue [mailto:sebastien.du...@bull.net] 
 Sent: 06 December 2010 11:40
 To: Richard Croucher
 Cc: 'OF EWG'; 'linux-rdma'
 Subject: Re: [ewg] IPoIB to Ethernet routing performance
 
 On Mon, 6 Dec 2010 10:49:58 -
 Richard Croucher rich...@informatix-sol.com wrote:
 
  You may be able to improve by doing some OS tuning.
 
   Right, I tried a few things concerning the TCP/IP stack tuning but nothing
 really came out of it.
 
   All this data should stay in kernel mode but there are lots of bottlenecks 
  in
  the TCP/IP stack that limit scalability.
 
   That may be my problem in fact.
 
   The IPoIB code has not been optimized for this use case.
 
   I don't think IPoIB is the bottleneck in this case, as I managed to feed
  2 IPoIB streams between the client and the router, yielding about 40 Gbits/s 
  of bandwidth.
 
  
  You don't mention what Server, kernel and OFED distro you are running.
 
   Right, sorry. The router is one of our 4-socket Nehalem-EX boxes with 2 IOHs, 
  which is running OFED 1.5.2.
 
  
  The best performance is achieved using InfiniBand/Ethernet hardware 
  gateways.
  Most of these provide virtual Ethernet NICs to InfiniBand hosts, but the 
  Voltaire
  4036E does provide a  IPoIB to Ethernet gateway capability.  This is FPGA 
  based
  so does provide much higher performance than you will achieve using a 
  standard server solution.
 
   That may be a solution indeed. Are there any real world figures out there
 concerning the 4036E performance?
 
   Thanks Richard,
 
   Sébastien.
 
 
  
  -Original Message-
  From: ewg-boun...@lists.openfabrics.org 
  [mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of sebastien dugue
  Sent: 06 December 2010 10:25
  To: OF EWG
  Cc: linux-rdma
  Subject: [ewg] IPoIB to Ethernet routing performance
  
  
Hi,
  
I know this might be off topic, but somebody may have already run into 
  the same
  problem before.
  
I'm trying to use a server as a router between an IB fabric and an 
  Ethernet network.
  
The router is fitted with one ConnectX2 QDR HCA and one dual port Myricom 
  10G
  Ethernet adapter.
  
I did some bandwidth measurements using iperf with the following setup:
  
    +----------+            +----------+            +----------+
    |          |            |          |  10G Eth   |          |
    |          |   QDR IB   |          +------------+          |
    |  client  +------------+  Router  |  10G Eth   |  Server  |
    |          |            |          +------------+          |
    |          |            |          |            |          |
    +----------+            +----------+            +----------+
  

However, the routing performance is far from what I would have expected.
  
Here are some numbers:
  
- 1 IPoIB stream between client and router: 20 Gbits/sec
  
  Looks OK.
  
- 2 Ethernet streams between router and server: 19.5 Gbits/sec
  
  Looks OK.
  
- routing 1 IPoIB stream to 1 Ethernet stream from client to server: 9.8 
  Gbits/sec
  
  We manage to saturate the Ethernet link, looks good so far.
  
- routing 2 IPoIB streams to 2 Ethernet streams from client to server: 
  9.3 Gbits/sec
  
   Argh, even less than when routing a single stream. I would have expected
  a bit more than this.
  
  
Has anybody ever tried to do some routing between an IB fabric and an 
  Ethernet
  network and achieved some sensible bandwidth figures?
  
Are there some known limitations in what I try to achieve?
  
  
Thanks,
  
Sébastien.
  
  
  
  
  ___
  ewg mailing list
  ewg@lists.openfabrics.org
  http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
  
  ___
  ewg mailing list
  ewg@lists.openfabrics.org
  http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

Re: [ewg] IPoIB to Ethernet routing performance

2010-12-06 Thread Jason Gunthorpe
On Mon, Dec 06, 2010 at 09:47:42PM +0100, Jabe wrote:

 Technologies MT25204 [InfiniHost III Lx HCA]; and dual E5430 xeon
 (not nehalem).

Newer Mellanox cards have most of the offloads you see for Ethernet, so
they get better performance. Plus Nehalem is just better at TCP in
the first place.

Jason
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] IPoIB to Ethernet routing performance

2010-12-06 Thread Jabe



   The router is fitted with one ConnectX2 QDR HCA and one dual port Myricom 10G
Ethernet adapter.

...

   Here are some numbers:

   - 1 IPoIB stream between client and router: 20 Gbits/sec

 Looks OK.

   - 2 Ethernet streams between router and server: 19.5 Gbits/sec

 Looks OK.
   



Actually I am amazed you can get such a speed with IPoIB. Trying with 
NPtcp on my DDR InfiniBand I can only obtain about 4.6 Gbit/sec at the 
best packet size (that is 1/4 of the InfiniBand bandwidth) with this 
chip embedded in the mainboard: InfiniBand: Mellanox Technologies 
MT25204 [InfiniHost III Lx HCA]; and dual E5430 Xeons (not Nehalem).
That's with a 2.6.37 kernel and the vanilla ib_ipoib module. What's wrong with 
my setup?
I always assumed that such a slow speed was due to the lack of the 
offloading capabilities you get with Ethernet cards, but maybe I was 
wrong...?

Also what application did you use for the benchmark?
Thank you
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] IPoIB to Ethernet routing performance

2010-12-06 Thread sebastien dugue

  Hi Jabe,

On Mon, 06 Dec 2010 21:47:42 +0100
Jabe jabe.chap...@shiftmail.org wrote:

 
 The router is fitted with one ConnectX2 QDR HCA and one dual port 
  Myricom 10G
  Ethernet adapter.
 
  ...
 
 Here are some numbers:
 
 - 1 IPoIB stream between client and router: 20 Gbits/sec
 
   Looks OK.
 
 - 2 Ethernet streams between router and server: 19.5 Gbits/sec
 
   Looks OK.
 
 
 
 Actually I am amazed you can get such a speed with IPoIB. Trying with 
 NPtcp on my DDR infiniband I can only obtain about 4.6Gbit/sec at the 
 best packet size (that is 1/4 of the infiniband bandwidth) with this 
 chip embedded in the mainboard: InfiniBand: Mellanox Technologies 
 MT25204 [InfiniHost III Lx HCA]; and dual E5430 xeon (not nehalem).
 That's with 2.6.37 kernel and vanilla ib_ipoib module. What's wrong with 
 my setup?
 I always assumed that such a slow speed was due to the lack of 
 offloading capabilities you get with ethernet cards, but maybe I was 
 wrong...?
 Also what application did you use for the benchmark?

 I'm using iperf.

  Sébastien.

 Thank you
 
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

Re: [ewg] IPoIB to Ethernet routing performance

2010-12-06 Thread sebastien dugue

 Hi Jason,

On Mon, 6 Dec 2010 14:27:59 -0700
Jason Gunthorpe jguntho...@obsidianresearch.com wrote:

 On Mon, Dec 06, 2010 at 09:47:42PM +0100, Jabe wrote:
 
  Technologies MT25204 [InfiniHost III Lx HCA]; and dual E5430 xeon
  (not nehalem).
 
 Newer Mellanox cards have most of the offloads you see for ethernet so
 they get better performance.

  What kind of offload capabilities are you referring to for IPoIB?

 Plus  Nehalem is just better at TCP in
 the first place..

  Well that depends on which Nehalem we're talking about. I've found that the EX
performs more poorly than the EP, though I didn't dig enough to find out why.

  Sébastien.
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg