[OMPI devel] Availability of hwloc topology info

Jeff Squyres Tue, 13 Sep 2011 10:59:57 -0400

I should clarify something: although opal_hwloc_base_open() is invoked during 
opal_init(), the global variable opal_hwloc_topology is not filled with 
topology information during opal_init(), because invoking 
hwloc_topology_load() in many processes simultaneously is too expensive.


Specifically: the process of loading the topology can be expensive, 
particularly on large-core-count machines (e.g., SGI machines).  Consider 64 
(or 128 or 256 or ...) MPI processes all hammering on /proc and/or /sys at the 
same time.  We've had reports on the hwloc users list that this is not a good 
idea; it *really* slows down the startup process.

Additionally, since opal is, by definition, a single-process abstraction, it 
made sense to push the decision of whether/how to load the topology information 
to a higher layer.  On the OMPI trunk, the orted loads the topology information 
and disseminates it to its local processes during the ESS handshake.

To be specific:

- the orted and HNP will have non-NULL values in opal_hwloc_topology after the 
ESS startup

- MPI processes will have non-NULL values of opal_hwloc_topology after 
orte_init()
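
To illustrate the lifecycle described above, here is a minimal, 
self-contained sketch in C.  It is a model only, not Open MPI code: every 
mock_* name is hypothetical, mock_topology_t stands in for the opaque 
hwloc_topology_t, and the "processes" are plain structs inside one program.  
It shows the two points that matter to callers: the topology pointer is NULL 
after the opal_init() step and must be checked before use, and the expensive 
load happens once per node rather than once per process.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for hwloc_topology_t (the real type is opaque). */
typedef struct mock_topology {
    int num_cores;
} mock_topology_t;

/* Hypothetical per-process state; models that process's copy of the
 * global opal_hwloc_topology pointer. */
typedef struct mock_proc {
    const mock_topology_t *topology;
} mock_proc_t;

/* Counts how often the expensive discovery step ran; models the number
 * of hwloc_topology_load() calls on one node. */
static int mock_load_count = 0;

/* Models opal_init(): the hwloc framework is opened, but no topology
 * is loaded, so the pointer stays NULL. */
static void mock_opal_init(mock_proc_t *proc)
{
    assert(proc != NULL);
    proc->topology = NULL;
}

/* Models the daemon doing the single expensive load for its node. */
static const mock_topology_t *mock_daemon_load_topology(int num_cores)
{
    static mock_topology_t topo;
    topo.num_cores = num_cores;
    mock_load_count++;
    return &topo;
}

/* Models orte_init() in an MPI process: the topology obtained from the
 * local daemon is installed, so the pointer becomes non-NULL. */
static void mock_orte_init(mock_proc_t *proc, const mock_topology_t *from_daemon)
{
    assert(proc != NULL);
    proc->topology = from_daemon;
}

/* Caller-side guard: returns -1 while the topology is not yet available,
 * otherwise the core count recorded in the topology. */
static int mock_core_count(const mock_proc_t *proc)
{
    assert(proc != NULL);
    if (NULL == proc->topology) {
        return -1;
    }
    return proc->topology->num_cores;
}
```

In this model, calling mock_core_count() between mock_opal_init() and 
mock_orte_init() returns -1, which corresponds to code that runs too early 
to rely on opal_hwloc_topology.  Initializing several mock_proc_t values 
from one mock_daemon_load_topology() result leaves mock_load_count at 1.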


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
