On Aug 22, 2011, at 9:35 AM, Bhargava Ramu Kavati wrote:

> I am trying to explore the details of connection establishment in OpenMPI 
> using libibcm/librdmacm.  

Note that the IB community has given up on ibcm.  Our support of it is 
incomplete; I wouldn't look at it as an example.

> In the code, I could not find how OpenMPI app is getting service-id/lid of 
> remote node to which it wants to connect.  

In the normal case, we pass that information during MPI_INIT.  It's a global 
gather / broadcast operation that we refer to as the "modex" (module exchange). 
That is, each openib BTL module instance publishes its address information 
into the modex, which is then sent out.  Near the end of MPI_INIT, each MPI 
process receives the modex broadcast and caches it.

During connection establishment, an MPI process will look in its modex cache to 
find the connection information for the peer process that it wants to connect 
to.
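
As a rough illustration of the publish / gather / broadcast / cache-lookup 
flow described above, here is a minimal single-process simulation in Python. 
It is only a sketch: it does not use Open MPI's actual modex API, and every 
name in it (PeerAddress, modex_exchange, lookup_peer, and the lid / service_id 
fields and values) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerAddress:
    # Hypothetical payload; the real openib BTL modex data differs.
    lid: int
    service_id: int


def modex_exchange(published):
    """Simulate the gather + broadcast: every rank gets a copy of all entries."""
    gathered = dict(published)  # gather: rank -> published address
    # broadcast: each rank caches its own copy of the full table
    return {rank: dict(gathered) for rank in published}


def lookup_peer(cache, peer_rank):
    """Connection setup: read the peer's address from the local cache."""
    return cache[peer_rank]


# Each "process" publishes its address during (simulated) MPI_INIT.
published = {
    0: PeerAddress(lid=4, service_id=0x1000),
    1: PeerAddress(lid=7, service_id=0x1001),
}
caches = modex_exchange(published)

# Later, rank 0 wants to connect to rank 1 and consults only its own cache.
peer = lookup_peer(caches[0], 1)
```

The point of the sketch is that the lookup at connection time is purely 
local: nothing is queried over the network, because the exchange already 
happened during initialization.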

> Also, I did not see any query in the code related to service_record_get from 
> SA.  Can you please describe what is happening, or am I missing something here?

IIRC, we don't currently use the SA because of its serialization and other 
resource bottlenecks.  This is a hand-waving answer: I don't remember the exact 
reasons for not using the SA, but there were many discussions between the MPI 
and OpenFabrics communities a long time ago.  The SA issues were not resolved 
to the MPI community's liking, IIRC, but I don't even work for an IB vendor 
any more, so I might not be remembering this correctly.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

