Thanks! I'll add the latter to our code.

Ralph

On Oct 12, 2011, at 3:11 PM, Brice Goglin wrote:

> On 12/10/2011 22:56, Jeff Squyres wrote:
>> One of the OMPI devs found a problem when I upgraded the OMPI SVN trunk to 
>> the hwloc 1.2.2ompi version last week that I think I am just now beginning 
>> to understand.
>> 
>> Brief reminder of our strategy:
>> 
>> - on each compute node, OMPI launches a local "orted" helper daemon
>> - this orted fork/exec's the local MPI processes
>> 
>> To avoid the penalty of each MPI process invoking hwloc discovery 
>> more-or-less simultaneously upon startup (which, as we've seen on this list 
>> before, can be painful when core counts are large), we have the orted do the 
>> hwloc discovery, serialize this into XML, and send it to each of its local 
>> processes.  The local processes receive this XML and then load it into hwloc 
>> and run from there.
>> 
>> However, it looks like the resulting loaded-from-XML topology->is_thissystem 
>> is set to 0, and therefore functions like hwloc_get_cpubind() actually get 
>> wired up to dontget_thisproc_cpubind() (instead of the proper Linux backend, 
>> for example).
>> 
>> How do we avoid this?  We need working hwloc functions after loading up an 
>> XML topology string.
> 
> export HWLOC_THISSYSTEM=1
> or
> hwloc_topology_set_flags(topology, HWLOC_TOPOLOGY_FLAG_IS_THISSYSTEM)
> between hwloc_topology_init() and hwloc_topology_load()
> 
> Brice
> 
> _______________________________________________
> hwloc-devel mailing list
> [email protected]
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-devel

