A few comments:
1. Have you guys considered using hwloc for level 4-7 detection?
2. Is L2 related to L2 cache? If not, is there some other term you
could use?
3. What do you see if the process is bound to multiple cores/hyperthreads?
4. What do you see if the process is not bound to any level 4-7 items?
5. What about treating L1 and L2 cache locality as levels? (hwloc exposes
these, but they sit at different depths depending on the platform.)
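To make the cases behind questions 3 and 4 concrete, here is a minimal sketch that classifies a process's CPU binding. It is not HiTopo or hwloc code: the function name classify_binding is hypothetical, the affinity query is Linux-only, and it assumes CPUs are numbered 0..N-1.

```python
import os


def classify_binding(allowed, online):
    """Classify a CPU affinity set relative to the set of online CPUs.

    Returns "unbound" when every online CPU is allowed, "single" when
    exactly one CPU is allowed, and "multiple" otherwise.
    """
    allowed, online = set(allowed), set(online)
    if allowed >= online:
        return "unbound"
    if len(allowed) == 1:
        return "single"
    return "multiple"


# os.sched_getaffinity exists only on some platforms (notably Linux).
if hasattr(os, "sched_getaffinity"):
    allowed = os.sched_getaffinity(0)      # CPUs this process may run on
    online = range(os.cpu_count() or 1)    # assumption: CPUs are 0..N-1
    print(classify_binding(allowed, online))
```

In the "multiple" and "unbound" cases there is no single core (or possibly socket) to report, which is where the answer to questions 3 and 4 matters.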
Note that I am working with Jeff Squyres and Josh Hursey on some new
paffinity code that uses hwloc. Though the paffinity code may not have a
direct relationship to HiTopo, using hwloc and standardizing what you
call levels 4-7 might help avoid some user confusion.
--td
On 11/15/2010 06:56 AM, Sylvain Jeaugey wrote:
As a follow-up to the Stuttgart developers' meeting, here is an RFC for
our topology detection framework.
WHAT: Add a framework for hardware topology detection to be used by
any other part of Open MPI to help optimization.
WHY: Collective operations and shared memory algorithms, among others,
may have optimizations that depend on the hardware relationship between
two MPI processes. HiTopo is an attempt to provide that information in a
unified manner.
WHERE: ompi/mca/hitopo/
WHEN: When wanted.
==========================================================================
We developed the HiTopo framework for our collective operation
component, but it may be useful for other parts of Open MPI, so we'd
like to contribute it.
A wiki page has been set up:
https://svn.open-mpi.org/trac/ompi/wiki/HiTopo
and a bitbucket repository:
http://bitbucket.org/jeaugeys/hitopo/
In a few words, HiTopo has 3 steps:
- Detection: each MPI process detects its topology at various levels:
    - core/socket: through the cpuid component
    - node: through gethostname
    - switch/island: through openib (mad) or slurm
  [Other topology detection components may be added for other
  resource managers, specific hardware, or whatever we want...]
- Collection: an allgather is performed so that each process has all
  other processes' addresses
- Renumbering: "string" addresses are converted to numbers starting
  at 0 (example: node names "foo" and "bar" are renumbered 0 and 1).
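The three steps can be sketched for the node level as follows. This is only an illustration, not the HiTopo implementation: the function names are hypothetical, the allgather is simulated with a plain list, and numbering by order of first appearance is an assumption (any rule that is deterministic across processes would do).

```python
import socket


def detect_node_address():
    """Detection, node level: use the host name as this process's address."""
    return socket.gethostname()


def renumber(addresses):
    """Renumbering: map each distinct string address to an integer from 0.

    Numbers are assigned in order of first appearance, so every process
    holding the same gathered list computes the same mapping.
    """
    ids = {}
    for addr in addresses:
        if addr not in ids:
            ids[addr] = len(ids)
    return [ids[addr] for addr in addresses]


# Collection is simulated here: in a real run this list would be filled
# by an allgather of detect_node_address() over all MPI ranks.
gathered = ["foo", "bar", "foo", "bar"]
print(renumber(gathered))  # [0, 1, 0, 1]
```

After renumbering, two ranks share a node exactly when their node numbers are equal, which is cheaper to compare than host name strings.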
Any comments welcome,
Sylvain
--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email: terry.don...@oracle.com