Just tested on a 96-core shared-memory machine, running lstopo under OpenMPI 1.6 mpiexec. Here are the execution times (the mpiexec launch time alone is 0.2-0.4s):
1 rank:   0.2s
8 ranks:  0.3-0.5s depending on binding (packed or scatter)
24 ranks: 0.8-3.7s depending on binding
48 ranks: 2.8-8.0s depending on binding
96 ranks: 14.2s
96 ranks from a single XML file: 0.4s (negligible against mpiexec launch time)

Brice

On 05/03/2013 20:23, Simon Hammond wrote:
> Hi HWLOC users,
>
> We are seeing some significant performance problems using HWLOC 1.6.2
> on Intel's MIC products. In one of our configurations we create 56 MPI
> ranks, each rank then queries the topology of the MIC card before
> creating threads. We are noticing that if we run 56 MPI ranks as
> opposed to one the calls to query the topology in HWLOC are very slow,
> runtime goes from seconds to minutes (and upwards).
>
> We guessed that this might be caused by the kernel serializing access
> to the /proc filesystem but this is just a hunch.
>
> Has anyone had this problem and found an easy way to change the
> library / calls to HWLOC so that the slow down is not experienced?
> Would you describe this as a bug?
>
> Thanks for your help.
>
> --
> Simon Hammond
>
> 1-(505)-845-7897 / MS-1319
> Scalable Computer Architectures
> Sandia National Laboratories, NM
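
For the "single XML file" case in the timings above, here is a rough sketch of how such a run might be set up. It is not the exact commands used for the measurements. It assumes lstopo writes XML when the output file name ends in .xml, that the application's hwloc honours the HWLOC_XMLFILE and HWLOC_THISSYSTEM environment variables, and that Open MPI's mpiexec forwards variables with -x. The helper names, the /tmp/topo.xml path and the ./app program are hypothetical.

```python
import shlex


def export_cmd(xml_path):
    """Command that discovers the topology once and saves it as XML.

    Assumption: lstopo picks the XML backend from the .xml extension.
    """
    return ["lstopo", xml_path]


def launch_cmd(nranks, xml_path, program):
    """mpiexec command whose ranks load the topology from xml_path.

    HWLOC_XMLFILE makes hwloc read the XML file instead of probing the OS.
    HWLOC_THISSYSTEM=1 tells hwloc the file describes the local machine,
    so binding calls remain usable.
    """
    return [
        "mpiexec", "-n", str(nranks),
        "-x", f"HWLOC_XMLFILE={xml_path}",
        "-x", "HWLOC_THISSYSTEM=1",
        *program,
    ]


if __name__ == "__main__":
    xml = "/tmp/topo.xml"  # hypothetical location readable by every rank
    for cmd in (export_cmd(xml), launch_cmd(96, xml, ["./app"])):
        print(shlex.join(cmd))
```

The point of the design is that only one process walks the OS topology interfaces, while every rank parses a small file, which is consistent with the 0.4s figure for 96 ranks.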