Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-08 Thread Tziporet Koren

On 2/7/2010 6:39 PM, Steve Wise wrote:


> If ofed-1.5.1 is based on 2.6.33 then it will get this patch
> automatically (assuming it goes upstream and makes 2.6.33).  Or we can
> pull it in as a kernel_patches/fixes/ patch.

OFED 1.5.1 is not based on 2.6.33, but on 2.6.30, so we need the patch 
under fixes.

Steve - can you prepare such a patch?

Tziporet



--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-08 Thread Steve Wise

Tziporet Koren wrote:

> On 2/7/2010 6:39 PM, Steve Wise wrote:
>> If ofed-1.5.1 is based on 2.6.33 then it will get this patch
>> automatically (assuming it goes upstream and makes 2.6.33).  Or we can
>> pull it in as a kernel_patches/fixes/ patch.
>
> OFED 1.5.1 is not based on 2.6.33, but on 2.6.30, so we need the patch
> under fixes.
>
> Steve - can you prepare such a patch?
>
> Tziporet

I thought it was based on 2.6.33 because I see 2.6.33 git tags in the
ofed kernel tree.  I misinterpreted what that meant.


I can develop a patch, but it will disable _all_ 127.0.0.1 binds.  
Otherwise openmpi is still broken on IB.


Steve.



Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Tziporet Koren

On 2/7/2010 3:22 AM, Steve Wise wrote:
   


> Good catch, I'll update the patch and submit for 2.6.33 on Monday.
>
> NOTE: This doesn't solve our IB/openmpi regression for ofed-1.5.1.

If this patch is accepted into kernel 2.6.33, we can take it too.

Tziporet


Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Steve Wise

Tziporet Koren wrote:

> On 2/7/2010 3:22 AM, Steve Wise wrote:
>> Good catch, I'll update the patch and submit for 2.6.33 on Monday.
>>
>> NOTE: This doesn't solve our IB/openmpi regression for ofed-1.5.1.
>
> If this patch is accepted into kernel 2.6.33, we can take it too.

If ofed-1.5.1 is based on 2.6.33 then it will get this patch 
automatically (assuming it goes upstream and makes 2.6.33).  Or we can 
pull it in as a kernel_patches/fixes/ patch.


My point, though, is that even with this patch in ofed-1.5.1, we still 
have an openmpi/IB/rdmacm regression.  The only way to avoid this 
regression without changing openmpi is to disallow _all_ rdma binds to 
127.0.0.1.


Steve.





> Tziporet




Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Roland Dreier

> My point, though, is that even with this patch in ofed-1.5.1, we still
> have an openmpi/IB/rdmacm regression.  The only way to avoid this
> regression without changing openmpi is to disallow _all_ rdma binds to
> 127.0.0.1.

Can you identify the source of the regression?  I.e., what was the change
that broke things?

I'm most concerned that there is another regression in 2.6.33, and if so
I would like to try and avoid letting that get into the final release.
-- 
Roland Dreier rola...@cisco.com
Cisco.com - http://www.cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html


Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Steve Wise

Roland Dreier wrote:

>> My point, though, is that even with this patch in ofed-1.5.1, we still
>> have an openmpi/IB/rdmacm regression.  The only way to avoid this
>> regression without changing openmpi is to disallow _all_ rdma binds to
>> 127.0.0.1.
>
> Can you identify the source of the regression?  ie what was the change
> that broke things?

  


It is the same commit you cited earlier.  It enables binding rdma cm_ids
to 127.0.0.1.  Sean's proposed patch on top of that disables this only
for iwarp devices.




> I'm most concerned that there is another regression in 2.6.33, and if so
> I would like to try and avoid letting that get into the final release.
  




RE: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Sean Hefty

> Can you identify the source of the regression?  ie what was the change
> that broke things?

My understanding is that support for loopback addresses exposes an existing bug
in openmpi.  It tries to bind to 127.0.0.1, which now succeeds.  Openmpi passes
that address to a remote node for use in connections.
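
[Editorial sketch] The failure mode described above can be illustrated with a
small user-space filter. This is not Open MPI or librdmacm code; the helper
names is_loopback and advertisable_addresses are hypothetical and exist only
for this example. The idea is that an address which binds successfully is not
necessarily one a remote node can reach, so loopback addresses are dropped
before anything is sent to a peer. The check uses Python's standard ipaddress
module and treats the whole loopback range as unusable, which is an assumption
that goes slightly beyond the single 127.0.0.1 address named in the thread.

```python
import ipaddress


def is_loopback(addr: str) -> bool:
    """Return True if addr is a loopback address (e.g. 127.0.0.1 or ::1)."""
    return ipaddress.ip_address(addr).is_loopback


def advertisable_addresses(bound_addrs):
    """Keep only addresses that make sense to hand to a remote node.

    Hypothetical helper: a successful local bind to a loopback address
    says nothing about reachability from another host, so such
    addresses are filtered out, preserving the original order.
    """
    return [a for a in bound_addrs if not is_loopback(a)]


if __name__ == "__main__":
    candidates = ["127.0.0.1", "192.168.1.10"]
    print(advertisable_addresses(candidates))
```

With a filter of this kind on the sending side, a kernel that accepts loopback
binds would no longer cause a loopback address to be advertised.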

> I'm most concerned that there is another regression in 2.6.33, and if so
> I would like to try and avoid letting that get into the final release.

Unless we never support loopback addresses, openmpi will see a regression.  The
only other problem that I'm aware of for 2.6.33 is that the bind to a loopback
address will succeed, even though the RDMA device may not support loopback.
This is true for the Chelsio and Ammasso drivers.  Connections should still
fail, but the bind is basically useless in this case.  I will try to get a patch
for that tomorrow.

- Sean
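
[Editorial sketch] Two bind policies are discussed in this thread: rejecting
loopback binds only on iWARP devices (the patch attributed to Sean), and
rejecting all loopback binds (Steve's suggestion for ofed-1.5.1). The model
below compares them in user space. It is not kernel code and does not reflect
the actual rdma-cm implementation; the Transport enum and both function names
are hypothetical, and treating every loopback address like 127.0.0.1 is an
assumption.

```python
import ipaddress
from enum import Enum


class Transport(Enum):
    """Hypothetical stand-in for the RDMA device transport type."""
    IB = "ib"
    IWARP = "iwarp"


def bind_allowed_per_device(addr: str, transport: Transport) -> bool:
    """Policy A: reject loopback binds only on iWARP devices."""
    if ipaddress.ip_address(addr).is_loopback:
        return transport is not Transport.IWARP
    return True


def bind_allowed_without_loopback(addr: str, transport: Transport) -> bool:
    """Policy B: reject every loopback bind, whatever the transport."""
    return not ipaddress.ip_address(addr).is_loopback


if __name__ == "__main__":
    for transport in Transport:
        print(
            transport.name,
            bind_allowed_per_device("127.0.0.1", transport),
            bind_allowed_without_loopback("127.0.0.1", transport),
        )
```

Under policy A a loopback bind on an IB device still succeeds, which is the
case the thread says leaves openmpi broken on IB; policy B rejects it on every
transport.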
