Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-08 Thread Steve Wise
Tziporet Koren wrote: On 2/7/2010 6:39 PM, Steve Wise wrote: If ofed-1.5.1 is based on 2.6.33 then it will get this patch automatically (assuming it goes upstream and makes 2.6.33). Or we can pull it in as a kernel_patches/fixes/ patch. OFED 1.5.1 is not based on 2.6.33, but on 2.6.30, so

Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-08 Thread Tziporet Koren
On 2/7/2010 6:39 PM, Steve Wise wrote: If ofed-1.5.1 is based on 2.6.33 then it will get this patch automatically (assuming it goes upstream and makes 2.6.33). Or we can pull it in as a kernel_patches/fixes/ patch. OFED 1.5.1 is not based on 2.6.33, but on 2.6.30, so we need the patch unde

Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Steve Wise
Tziporet Koren wrote: On 2/5/2010 6:52 PM, Sean Hefty wrote: BTW: Was this change an artifact of rebasing ofed-1.5.1 on a new kernel version? apparently Sorry to jump late on this thread. OFED 1.5.1 was not rebased on a new kernel - it's still based on 2.6.30. But many times we tak

RE: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Sean Hefty
>Can you identify the source of the regression? ie what was the change >that broke things? My understanding is that support for loopback addresses exposes an existing bug in openmpi. It tries to bind to 127.0.0.1, which now succeeds. Openmpi passes that address to a remote node for use in conne

Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Tziporet Koren
On 2/5/2010 6:52 PM, Sean Hefty wrote: BTW: Was this change an artifact of rebasing ofed-1.5.1 on a new kernel version? apparently Sorry to jump late on this thread. OFED 1.5.1 was not rebased on a new kernel - it's still based on 2.6.30. But many times we take patches that were acce

Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Steve Wise
Roland Dreier wrote: > My point, though, is that even with this patch in ofed-1.5.1, we still > have an openmpi/IB/rdmacm regression. The only way to avoid this > regression without changing openmpi is to disallow _all_ rdma binds to > 127.0.0.1. Can you identify the source of the regressio

Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Roland Dreier
> My point, though, is that even with this patch in ofed-1.5.1, we still > have an openmpi/IB/rdmacm regression. The only way to avoid this > regression without changing openmpi is to disallow _all_ rdma binds to > 127.0.0.1. Can you identify the source of the regression? ie what was the cha

Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Steve Wise
Tziporet Koren wrote: On 2/7/2010 3:22 AM, Steve Wise wrote: Good catch, I'll update the patch and submit for 2.6.33 on Monday. NOTE: This doesn't solve our IB/openmpi regression for ofed-1.5.1. If this patch will be accepted to the kernel 2.6.33 we can take it too If ofed-

Re: [ewg] bug 1918 - openmpi broken due to rdma-cm changes

2010-02-07 Thread Tziporet Koren
On 2/7/2010 3:22 AM, Steve Wise wrote: Good catch, I'll update the patch and submit for 2.6.33 on Monday. NOTE: This doesn't solve our IB/openmpi regression for ofed-1.5.1. If this patch will be accepted to the kernel 2.6.33 we can take it too Tziporet -- To unsubscribe fro

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-06 Thread Steve Wise
Good catch, I'll update the patch and submit for 2.6.33 on Monday. NOTE: This doesn't solve our IB/openmpi regression for ofed-1.5.1.

RE: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-06 Thread Sean Hefty
>> -list_for_each_entry(cma_dev, &dev_list, list)
>> +list_for_each_entry(cma_dev, &dev_list, list) {
>> +if (rdma_node_get_transport(cma_dev->device->node_type) !=
>> +RDMA_TRANSPORT_IB)
>> +continue;
>> +
>> for (p = 1; p <= cma

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-06 Thread Steve Wise
Note, even though this patch resolved the openmpi failure on my iwarp nodes, ucmatose -b 127.0.0.1 doesn't fail. I haven't looked at the src, but something funny must be happening. So we still have a regression issue with ofed-1.5.1/upstream kernels and openmpi over IB with rdmacm. Steve.

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-06 Thread Steve Wise
rdma/cm: disallow loopback address for iwarp devices From: Sean Hefty The current RDMA iWarp devices cannot be used to establish connections using the loopback address. Prevent rdma_bind_addr from associating the loopback address with an iWarp device. This fixes an issue with openmpi, where
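The effect described in this patch summary can be modelled outside the kernel. The following is only a sketch under stated assumptions: `demo_cma_device`, `demo_transport` and `demo_bind_loopback` are hypothetical names invented for illustration, not kernel symbols, and the `-ENODEV` return for "no suitable device" is an assumption rather than something stated in the thread. It shows a loopback bind that skips every non-IB device, so that on an iWarp-only node the bind fails instead of silently picking an adapter that cannot do loopback.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's transport types. */
enum demo_transport { DEMO_TRANSPORT_IB, DEMO_TRANSPORT_IWARP };

/* Hypothetical, simplified view of one entry in the rdma_cm device list. */
struct demo_cma_device {
    const char *name;
    enum demo_transport transport;
};

/*
 * Model of a loopback bind: walk the device list in order and bind to the
 * first IB device. iWarp devices are skipped, as in the quoted patch.
 * Returns 0 and sets *bound on success, or -ENODEV (assumed error code)
 * when no IB device exists.
 */
int demo_bind_loopback(const struct demo_cma_device *devs, size_t n,
                       const struct demo_cma_device **bound)
{
    assert(bound != NULL);
    *bound = NULL;
    for (size_t i = 0; i < n; i++) {
        if (devs[i].transport != DEMO_TRANSPORT_IB)
            continue;           /* loopback is not usable on this device */
        *bound = &devs[i];
        return 0;
    }
    return -ENODEV;
}
```

With this behaviour an application that probes addresses by binding them would see the loopback bind fail on an iWarp-only node. As the thread notes, it does not change anything for nodes that do have an IB device.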

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-06 Thread Steve Wise
Sean Hefty wrote: There is still some inconsistency here. Sean, you claimed binds to 127.0.0.1 succeed in ofed-1.4 for IB devices. If so, then folks running IB/openmpi/rdmacm should be seeing issues. We need to dig a little more... You can verify this by running ucmatose -b 127.0.0.1 a

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Roland Dreier
> > Well, I think you are right. This kind of change seems appropriate to > > me for mainline, but OFED/RHEL should carry a responsibility to manage > > an identified incompatibility, either patch their kernel, patch their > > OMPI, or publish an errata. That is the role of a distribution. >

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Steve Wise
Sean Hefty wrote: There is still some inconsistency here. Sean, you claimed binds to 127.0.0.1 succeed in ofed-1.4 for IB devices. If so, then folks running IB/openmpi/rdmacm should be seeing issues. We need to dig a little more... You can verify this by running ucmatose -b 127.0.0.1 a

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Jeff Squyres
On Feb 5, 2010, at 4:53 PM, Steve Wise wrote: > There is still some inconsistency here. Sean, you claimed binds to > 127.0.0.1 succeed in ofed-1.4 for IB devices. If so, then folks running > IB/openmpi/rdmacm should be seeing issues. We need to dig a little more... FWIW, I can run Open MPI v1

RE: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Sean Hefty
>There is still some inconsistency here. Sean, you claimed binds to >127.0.0.1 succeed in ofed-1.4 for IB devices. If so, then folks running >IB/openmpi/rdmacm should be seeing issues. We need to dig a little more... You can verify this by running ucmatose -b 127.0.0.1 and see if the test ente

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Steve Wise
Jeff Squyres wrote: On Feb 5, 2010, at 4:14 PM, Jason Gunthorpe wrote: Well, I think you are right. This kind of change seems appropriate to me for mainline, but OFED/RHEL should carry a responsibility to manage an identified incompatibility, either patch their kernel, patch their OMPI, or p

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Jeff Squyres
On Feb 5, 2010, at 4:14 PM, Jason Gunthorpe wrote: > Well, I think you are right. This kind of change seems appropriate to > me for mainline, but OFED/RHEL should carry a responsibility to manage > an identified incompatibility, either patch their kernel, patch their > OMPI, or publish an errata.

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Jason Gunthorpe
On Fri, Feb 05, 2010 at 03:08:10PM -0500, Jeff Squyres wrote: > On Feb 5, 2010, at 1:56 PM, Jason Gunthorpe wrote: > > > > I think we should remove the feature of allowing binds to 127.0.0.1 > > > altogether based on Jeff's arguments and my assertion that 127.0.0.1 is > > > a sw-loopback mechani

RE: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Sean Hefty
>Ammasso and Chelsio T3 rnics do not support HW loopback. It looks like the NES driver doesn't support 127.0.0.1, but does support loopback connections (gurgle). Here's an untested patch for 2.6.33 (not even compile tested) for consideration then. I'll be testing this shortly unless there's disa

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Jeff Squyres
On Feb 5, 2010, at 1:56 PM, Jason Gunthorpe wrote: > > I think we should remove the feature of allowing binds to 127.0.0.1 > > altogether based on Jeff's arguments and my assertion that 127.0.0.1 is > > a sw-loopback mechanism anyway... > > I don't agree, the kernel should be free to provide a

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Roland Dreier
> > That should be the patch in question. I'm not sure about reaching > > consensus. :) > > If the other changes to the rdma_cm aren't closely tied to that change, we > > may > > be able to back that one patch out until we can get whatever other fix may > > be > > needed. > I'd like to

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Steve Wise
Sean Hefty wrote: Is the issue 6f8372b6 ("RDMA/cm: fix loopback address support")? This just went in for 2.6.33, which is still at -rc6, so if we can quickly reach a consensus, there is still time to get a fix in for 2.6.33. That should be the patch in question. I'm not sure about reachi

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Jason Gunthorpe
On Fri, Feb 05, 2010 at 12:32:51PM -0600, Steve Wise wrote: > I think we should remove the feature of allowing binds to 127.0.0.1 > altogether based on Jeff's arguments and my assertion that 127.0.0.1 is > a sw-loopback mechanism anyway... I don't agree, the kernel should be free to provide a

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Roland Dreier
> I think we should remove the feature of allowing binds to 127.0.0.1 > altogether based on Jeff's arguments and my assertion that 127.0.0.1 > is a sw-loopback mechanism anyway... Well, someone propose a patch please. -- Roland Dreier Cisco.com - http://www.cisco.com For corporate legal info

RE: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Sean Hefty
>Is the issue 6f8372b6 ("RDMA/cm: fix loopback address support")? This >just went in for 2.6.33, which is still at -rc6, so if we can quickly >reach a consensus, there is still time to get a fix in for 2.6.33. That should be the patch in question. I'm not sure about reaching consensus. :) If the

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Steve Wise
Jeff Squyres wrote: On Feb 5, 2010, at 12:51 PM, Roland Dreier (rdreier) wrote: > But Jeff, note that if someone uses the upstream kernel and OpenMPI, > its busted... Is the issue 6f8372b6 ("RDMA/cm: fix loopback address support")? This just went in for 2.6.33, which is still at -rc6, so

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Jeff Squyres
On Feb 5, 2010, at 12:51 PM, Roland Dreier (rdreier) wrote: > > But Jeff, note that if someone uses the upstream kernel and OpenMPI, > > its busted... > > Is the issue 6f8372b6 ("RDMA/cm: fix loopback address support")? This > just went in for 2.6.33, which is still at -rc6, so if we can quick

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Jeff Squyres
On Feb 5, 2010, at 11:16 AM, Steve Wise wrote: > > Note that it is highly unlikely that we will release open mpi 1.4.2 in > > time for ofed 1.5.1. > > Jeff, there is no way to handle high priority bug fixes in the current > released stream? We have 1.4.2 cooking, but it's not ready yet. I'll

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Roland Dreier
> But Jeff, note that if someone uses the upstream kernel and OpenMPI, > its busted... Is the issue 6f8372b6 ("RDMA/cm: fix loopback address support")? This just went in for 2.6.33, which is still at -rc6, so if we can quickly reach a consensus, there is still time to get a fix in for 2.6.33.

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Steve Wise
Sean Hefty wrote: My concern is breaking an existing working OpenMPI in a point release because we changed semantics of the rdma-cm in an ofed point release... OFED can call this release a point release, but in reality, the content makes it a major release... BTW: Was this change an

RE: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Sean Hefty
>My concern is breaking an existing working OpenMPI in a point release >because we changed semantics of the rdma-cm in an ofed point release... OFED can call this release a point release, but in reality, the content makes it a major release... >BTW: Was this change an artifact of rebasing ofed-1

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Steve Wise
I agree that we should probably not allow 127.0.0.1 binds in ofed-1.5.1 at all because it regresses OpenMPI. Even with IB systems, if the bind to 127.0.0.1 succeeds, then OpenMPI assumes 127.0.0.1 is bound to that rdma interface and advertises this address to its peer as an address to which
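The failure mode described here (a loopback address being handed to a remote peer) can also be guarded against on the application side. The sketch below is not Open MPI code and is not a fix proposed in this thread; `is_ipv4_loopback` and `filter_advertisable` are hypothetical helper names, and only IPv4 dotted-quad strings are handled. It drops every address in 127.0.0.0/8 from a candidate list before the list would be advertised.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if addr parses as IPv4 and lies in 127.0.0.0/8. */
bool is_ipv4_loopback(const char *addr)
{
    struct in_addr in;

    if (inet_pton(AF_INET, addr, &in) != 1)
        return false;                       /* not a valid IPv4 string */
    return (ntohl(in.s_addr) >> 24) == 127; /* top octet selects the /8 */
}

/*
 * Hypothetical helper: copy the pointers of all non-loopback entries from
 * addrs into out (which must have room for n entries), preserving order.
 * Returns the number of entries kept.
 */
size_t filter_advertisable(const char *const *addrs, size_t n,
                           const char **out)
{
    size_t kept = 0;

    assert(addrs != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        if (!is_ipv4_loopback(addrs[i]))
            out[kept++] = addrs[i];
    return kept;
}
```

A caller would run its list of successfully bound addresses through `filter_advertisable` and send only the surviving entries to the peer. Strings that do not parse as IPv4 are passed through unchanged, which is a simplification of this sketch.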

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Steve Wise
Sean Hefty wrote: Also note that trying to bind rdma cm to all interface ip addresses was the way that we were advised by openfabrics to figure out which devices are rdma- capable. As such, it is highly desirable to get the fix transparently in rdmacm and preserve the old semantic. More specific

RE: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Sean Hefty
>Also note that trying to bind rdma cm to all interface ip addresses was the way >that we were advised by openfabrics to figure out which devices are rdma- >capable. > >As such, it is highly desirable to get the fix transparently in rdmacm and >preserve the old semantic. More specifically, it seems

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-05 Thread Steve Wise
Jeff Squyres (jsquyres) wrote: Note that it is highly unlikely that we will release open mpi 1.4.2 in time for ofed 1.5.1. Jeff, there is no way to handle high priority bug fixes in the current released stream? Also note that trying to bind rdma cm to all interface ip addresses was the

RE: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-04 Thread Paul Grun
-----Original Message----- From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Roland Dreier Sent: Thursday, February 04, 2010 3:51 PM To: Steve Wise Cc: Sean Hefty; linux-rdma; OpenFabrics EWG; Jeff Squyres Subject: Re: bug 1918 - openmpi broken due to rdma-cm changes >

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-04 Thread Roland Dreier
> Is this only an iwarp issue? IE do all IB devices support hw > loopback? And will all future devices support it (IE is it an IBTA > requirement)? I do think IBA requires loopback to work. Can't quote chapter & verse off the top of my head.

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-04 Thread Steve Wise
Roland Dreier wrote: > Hey Roland, are you ok with a device attribute to indicate hw-loopback > support? Sigh, I guess so. Can we have the rdma-cm handle this somewhat automagically, eg only choose devices that do handle loopback when binding/connecting to 127.0.0.1? That's the plan. Or

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-04 Thread Roland Dreier
> Hey Roland, are you ok with a device attribute to indicate hw-loopback > support? Sigh, I guess so. Can we have the rdma-cm handle this somewhat automagically, eg only choose devices that do handle loopback when binding/connecting to 127.0.0.1? Or maybe can we put the handling of this into t

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-04 Thread Steve Wise
Sean Hefty wrote: Well then the rdma-cm needs to know which devices support hw loopback. Cuz on a T3-only system, no hwloop... The problem sounds like it's more than just whether 127.0.0.1 is usable. That check may fix openmpi, but it sounds more like the app needs to know whether the dev

RE: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-04 Thread Sean Hefty
>This solution would work. Will you code it up? I can do that. I just want to make sure that we address the full scope of the problem. - Sean

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-04 Thread Steve Wise
Sean Hefty wrote: At first thought, we can extend enum ib_device_cap_flags to indicate if a device supports loopback capabilities or not. The rdma_cm could then skip over such devices when dealing with a loopback address. This solution would work. Will you code it up? Stevo
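The capability-flag idea quoted above can be sketched in isolation. This is a user-space model, not kernel code: `DEMO_CAP_HW_LOOPBACK` is a hypothetical bit invented for illustration (the thread does not name the flag, and it is not a real `ib_device_cap_flags` value), and `demo_device` and `pick_loopback_device` are likewise assumptions. The sketch selects the first device that advertises the flag and reports that none is available otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability bit; NOT a real ib_device_cap_flags value. */
enum demo_cap_flags { DEMO_CAP_HW_LOOPBACK = 1u << 0 };

/* Hypothetical, simplified device record. */
struct demo_device {
    const char *name;
    unsigned int cap_flags;
};

/*
 * Return the first device that advertises hardware loopback, or NULL when
 * no device does (for example on a node with only loopback-incapable
 * adapters).
 */
const struct demo_device *
pick_loopback_device(const struct demo_device *devs, size_t n)
{
    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (devs[i].cap_flags & DEMO_CAP_HW_LOOPBACK)
            return &devs[i];
    return NULL;
}
```

Compared with a check on the transport type, a per-device flag would let each driver declare its own loopback support, which appears to be the motivation behind the suggestion.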

RE: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-04 Thread Sean Hefty
>Well then the rdma-cm needs to know which devices support hw loopback. >Cuz on a T3-only system, no hwloop... The problem sounds like it's more than just whether 127.0.0.1 is usable. That check may fix openmpi, but it sounds more like the app needs to know whether the device can actually support

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-04 Thread Steve Wise
Sean Hefty wrote: But how can you determine _which_ rdma device should be used if an app binds to 127.0.0.1? I think this is busted... The code just picks the first rdma device available. To me, this is preferable to simply disallowing the loopback device from working at all. I perso

RE: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-04 Thread Sean Hefty
>But how can you determine _which_ rdma device should be used if an app >binds to 127.0.0.1? I think this is busted... The code just picks the first rdma device available. To me, this is preferable to simply disallowing the loopback device from working at all. I personally use it all the tim

Re: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-04 Thread Steve Wise
Sean Hefty wrote: OpenMPI uses rdma_bind_addr() to figure out which ip addresses are valid for which IB devices. This logic is now broken. Regardless of whether OpenMPI should use another method for determining which IP address belong to which interfaces, we should probably rethink whether we'

RE: bug 1918 - openmpi broken due to rdma-cm changes

2010-02-04 Thread Sean Hefty
>OpenMPI uses rdma_bind_addr() to figure out which ip addresses are valid >for which IB devices. This logic is now broken. Regardless of whether >OpenMPI should use another method for determining which IP address >belong to which interfaces, we should probably rethink whether we're >breaking rdm