The key question, though, is: has anyone checked to see if the ofacm code even works any more??
Only oob and xoob components appear to be present - so unless someone fixed those since they were originally copied from openib, I doubt ofacm works.

On Nov 14, 2013, at 11:08 AM, Shamis, Pavel <sham...@ornl.gov> wrote:

> There is some confusion in the thread. UDCM is just another CPC, like XOOB,
> OOB, and RDMACM (I think IBCM is officially dead).
> XOOB and OOB don't use UDCM, they rely on ORTE out-of-band communication.
>
> OpenIB/connect supports UDCM, XOOB, OOB, and RDMACM
> OFACM supports (at least last time when we checked) OOB and XOOB
>
> RDMACM was not moved to OFACM, because of iWarp's "first message" requirement
> that used to break the abstraction.
> Moreover, RDMACM scalability used to be terrible; as a result no one in the IB
> community really used it.
> The situation is a bit different today, since ROCEE relies on RDMACM. It
> is worth noting that you may set up
> ROCEE connections with a regular OOB with some restrictions (we did it for
> mvapich-1).
>
> The code between ofacm and openib is similar, but NOT the same. We changed the
> API in a way that allows us
> to hide XRC QP management (there is a hash table that manages QP to EP mapping)
> in OFACM instead of OPENIB.
> This made openib initialization code a bit cleaner. Here is my old tree with
> openib btl changes https://bitbucket.org/pasha/ofacm
>
> I hope it helps,
>
> Best,
> Pasha
>
> On Nov 14, 2013, at 1:17 PM, Joshua Ladd <josh...@mellanox.com> wrote:
>
>> Unless someone went in and "fixed" the code in common (judging by the
>> comments, fixed seems to imply porting (x)oob to use UDCM, which hasn't been
>> done at all in the context of xoob and is incompletely patched and remains
>> unusable as a replacement for oob in 1.7.4), there is no reason to believe
>> it would work any different than the cpcs under btl/openib/connect. IIRC,
>> it's the same code - copy/pasted - just moved to a common location so
>> Cheetah collectives can do their wireup.
>> So, if oob cpc doesn't work, ofacm
>> oob won't work either and, I guess, by extension, Cheetah IBoffload won't
>> work. Pasha, correct me if you know different.
>>
>>
>> Josh
>>
>>
>> -----Original Message-----
>> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph Castain
>> Sent: Thursday, November 14, 2013 1:05 PM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi r29703
>> - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib
>> ompi/mca/btl/openib/connect
>>
>>
>> On Nov 14, 2013, at 9:33 AM, Barrett, Brian W <bwba...@sandia.gov> wrote:
>>
>>> On 11/14/13 9:51 AM, "Jeff Squyres (jsquyres)" <jsquy...@cisco.com> wrote:
>>>
>>>> Does XRC work with the UDCM CPC?
>>>>
>>>>
>>>> On Nov 14, 2013, at 9:35 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>
>>>>> I think the problems in udcm were fixed by Nathan quite some time
>>>>> ago, but never moved to 1.7 as everyone was told that the connect
>>>>> code in openib was already deprecated pending merge with the new
>>>>> ofacm common code. Looking over at that area, I see only oob and
>>>>> xoob - so if the users of the common ofacm code are finding that it
>>>>> works, the simple answer may just be to finally complete the switchover.
>>>>>
>>>>> Meantime, perhaps someone can CMR and review a copying of the udcm
>>>>> cpc to the 1.7 branch?
>>>>>
>>>>>
>>>>> On Nov 14, 2013, at 5:14 AM, Joshua Ladd <josh...@mellanox.com> wrote:
>>>>>
>>>>>> Um, no. It's supposed to work with UDCM which doesn't appear to be
>>>>>> enabled in 1.7.
>>>>>>
>>>>>> Per Ralph's comment to me last night:
>>>>>>
>>>>>> "... you cannot use the oob connection manager. It doesn't work and
>>>>>> was deprecated. You must use udcm, which is why things are supposed
>>>>>> to be set to do so by default. Please check the openib connect
>>>>>> priorities and correct them if necessary."
>>>>>>
>>>>>> However, it's never been enabled in 1.7 - don't know what "borked"
>>>>>> means, and from what Devendar tells me, several UDCM commits that
>>>>>> are in the trunk have not been pushed over to 1.7:
>>>>>>
>>>>>> So, as of this moment, OpenIB BTL is essentially dead-in-the-water
>>>>>> in 1.7.
>>>>>>
>>>
>>> I'm going to start by admitting that I haven't been paying attention
>>> to IB the last couple of months, so I'm out of my league a little bit
>>> here. I remember discussions of UDCM replacing OOB both because the
>>> OOB CPC had some issues and because it would make it easier to move
>>> the BTLs to the OPAL layer (ie, below the OOB). But I also thought
>>> that was more future work than it clearly was. So can someone let me know:
>>>
>>> 1) What the status of UDCM is (does it work reliably, does it support
>>> XRC, etc.)
>>
>> Seems to be working okay on the IB systems at LANL and IU. Don't know about
>> XRC - I seem to recall the answer is "no"
>>
>>> 2) What's the difference between CPCs and OFACM and what's our plans
>>> w.r.t 1.7 there?
>>
>> Pasha created ofacm because some of the collective components now need to
>> forge connections. So he created the common/ofacm code to meet those needs,
>> with the intention of someday replacing the openib cpc's with the new common
>> code. However, this was stalled by the iWarp issue, and so it fell off the
>> table.
>>
>> We now have two duplicate ways of doing the same thing, but with code in two
>> different places. :-(
>>
>>> 3) Someone mentioned that ofacm oob worked, but cpc oob didn't. Can
>>> someone explain why?
>>
>> I'm not sure that is actually true as there is no indication that anyone is
>> using or testing the collective components that use ofacm code.
>>
>>>
>>> Again, sorry for being dense; I've been spending too much time in
>>> Portals land lately.
>>>
>>> Brian
>>>
>>> --
>>> Brian W. Barrett
>>> Scalable System Software Group
>>> Sandia National Laboratories
>>>
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel