I just tested this on a 96-core shared-memory machine. Running lstopo
under OpenMPI 1.6's mpiexec ("mpiexec lstopo") gives the following
execution times (the mpiexec launch time itself is 0.2-0.4s):

1 rank:   0.2s
8 ranks:  0.3-0.5s depending on binding (packed or scatter)
24 ranks: 0.8-3.7s depending on binding
48 ranks: 2.8-8.0s depending on binding
96 ranks: 14.2s

96 ranks loading the topology from a single XML file: 0.4s (negligible
compared to the mpiexec launch time)


Brice



On 05/03/2013 20:23, Simon Hammond wrote:
> Hi HWLOC users,
>
> We are seeing some significant performance problems using HWLOC 1.6.2
> on Intel's MIC products. In one of our configurations we create 56 MPI
> ranks, each rank then queries the topology of the MIC card before
> creating threads. We are noticing that if we run 56 MPI ranks as
> opposed to one, the calls to query the topology in HWLOC are very slow:
> runtime goes from seconds to minutes (and upwards).
>
> We guessed that this might be caused by the kernel serializing access
> to the /proc filesystem but this is just a hunch. 
>
> Has anyone had this problem and found an easy way to change the
> library / calls to HWLOC so that the slowdown is not experienced?
> Would you describe this as a bug?
>
> Thanks for your help.
>
>
> --
> Simon Hammond
>
> 1-(505)-845-7897 / MS-1319
> Scalable Computer Architectures
> Sandia National Laboratories, NM
>
>
>
>
>
>
> _______________________________________________
> hwloc-users mailing list
> hwloc-us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/hwloc-users
