The key question, though, is: has anyone checked to see if the ofacm code even 
works any more??

Only oob and xoob components appear to be present - so unless someone fixed 
those since they were originally copied from openib, I doubt ofacm works.


On Nov 14, 2013, at 11:08 AM, Shamis, Pavel <sham...@ornl.gov> wrote:

> There is some confusion in the thread. UDCM is just another CPC, like XOOB, 
> OOB, and RDMACM (I think IBCM is officially dead).
> XOOB and OOB don't use UDCM; they rely on ORTE out-of-band communication.
> 
> OpenIB/connect supports UDCM, XOOB, OOB, and RDMACM
> OFACM supports (at least the last time we checked) OOB and XOOB
> 
> RDMACM was not moved to OFACM, because of iWarp's "first message" requirement 
> that used to break the abstraction.
> Moreover, RDMACM scalability used to be terrible; as a result, no one in the 
> IB community really used it.
> The situation is a bit different today, since ROCEE relies on RDMACM. It is 
> worth noting that you can set up ROCEE connections with regular OOB, with 
> some restrictions (we did it for mvapich-1).
> 
> The code in ofacm and openib is similar, but NOT the same. We changed the 
> API so that XRC QP management (there is a hash table that manages the 
> QP-to-EP mapping) is hidden in OFACM instead of OPENIB.
> This made the openib initialization code a bit cleaner. Here is my old tree 
> with the openib btl changes: https://bitbucket.org/pasha/ofacm
> 
> I hope it helps,
> 
> Best,
> Pasha
> 
> On Nov 14, 2013, at 1:17 PM, Joshua Ladd <josh...@mellanox.com> wrote:
> 
>> Unless someone went in and "fixed" the code in common (judging by the 
>> comments, fixed seems to imply porting (x)oob to use UDCM, which hasn't been 
>> done at all in the context of xoob and is incompletely patched and remains 
>> unusable as a replacement for oob in 1.7.4), there is no reason to believe 
>> it would work any differently than the cpcs under btl/openib/connect. IIRC, 
>> it's the same code - copy/pasted - just moved to a common location so 
>> Cheetah collectives can do their wireup. So, if oob cpc doesn't work, ofacm 
>> oob won't work either and, I guess, by extension, Cheetah IBoffload won't 
>> work. Pasha, correct me if you know different. 
>> 
>> 
>> Josh
>> 
>> 
>> -----Original Message-----
>> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph Castain
>> Sent: Thursday, November 14, 2013 1:05 PM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi r29703 
>> - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib 
>> ompi/mca/btl/openib/connect
>> 
>> 
>> On Nov 14, 2013, at 9:33 AM, Barrett, Brian W <bwba...@sandia.gov> wrote:
>> 
>>> On 11/14/13 9:51 AM, "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote:
>>> 
>>>> Does XRC work with the UDCM CPC?
>>>> 
>>>> 
>>>> On Nov 14, 2013, at 9:35 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>> 
>>>>> I think the problems in udcm were fixed by Nathan quite some time 
>>>>> ago, but never moved to 1.7 as everyone was told that the connect 
>>>>> code in openib was already deprecated pending merge with the new 
>>>>> ofacm common code. Looking over at that area, I see only oob and 
>>>>> xoob - so if the users of the common ofacm code are finding that it 
>>>>> works, the simple answer may just be to finally complete the switchover.
>>>>> 
>>>>> Meantime, perhaps someone can CMR and review a copying of the udcm 
>>>>> cpc to the 1.7 branch?
>>>>> 
>>>>> 
>>>>> On Nov 14, 2013, at 5:14 AM, Joshua Ladd <josh...@mellanox.com> wrote:
>>>>> 
>>>>>> Um, no. It's supposed to work with UDCM which doesn't appear to be 
>>>>>> enabled in 1.7.
>>>>>> 
>>>>>> Per Ralph's comment to me last night:
>>>>>> 
>>>>>> "... you cannot use the oob connection manager. It doesn't work and 
>>>>>> was deprecated. You must use udcm, which is why things are supposed 
>>>>>> to be set to do so by default. Please check the openib connect 
>>>>>> priorities and correct them if necessary."
>>>>>> 
>>>>>> However, it's never been enabled in 1.7 - don't know what "borked"
>>>>>> means, and from what Devendar tells me, several UDCM commits that 
>>>>>> are in the trunk have not been pushed over to 1.7:
>>>>>> 
>>>>>> So, as of this moment, OpenIB BTL is essentially dead-in-the-water 
>>>>>> in 1.7.
>>>>>> 
>>>>>> 
>>>>>> 
>>> 
>>> I'm going to start by admitting that I haven't been paying attention 
>>> to IB the last couple of months, so I'm out of my league a little bit 
>>> here.  I remember discussions of UDCM replacing OOB both because the 
>>> OOB CPC had some issues and because it would make it easier to move 
>>> the BTLs to the OPAL layer (i.e., below the OOB).  But I also thought 
>>> that was more future work than it clearly was.  So can someone let me know:
>>> 
>>> 1) What the status of UDCM is (does it work reliably, does it support 
>>> XRC, etc.)
>> 
>> Seems to be working okay on the IB systems at LANL and IU. Don't know about 
>> XRC - I seem to recall the answer is "no"
>> 
>>> 2) What's the difference between CPCs and OFACM, and what are our plans 
>>> w.r.t. 1.7 there?
>> 
>> Pasha created ofacm because some of the collective components now need to 
>> forge connections. So he created the common/ofacm code to meet those needs, 
>> with the intention of someday replacing the openib cpc's with the new common 
>> code. However, this was stalled by the iWarp issue, and so it fell off the 
>> table.
>> 
>> We now have two duplicate ways of doing the same thing, but with code in two 
>> different places. :-(
>> 
>>> 3) Someone mentioned that ofacm oob worked, but cpc oob didn't.  Can 
>>> someone explain why?
>> 
>> I'm not sure that is actually true as there is no indication that anyone is 
>> using or testing the collective components that use ofacm code.
>> 
>> 
>>> 
>>> Again, sorry for being dense; I've been spending too much time in 
>>> Portals land lately.
>>> 
>>> Brian
>>> 
>>> --
>>> Brian W. Barrett
>>> Scalable System Software Group
>>> Sandia National Laboratories
>>> 
>>> 
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
