Is anyone currently working on the hierarchical collectives? I ask because I just did some runs on a cluster of 4 servers with 4 processors per server, using TCP over IB. I ran with np=16 and used the IMB benchmark to test MPI_Bcast. What I am seeing is that the hierarchical collectives appear to boost performance for messages of 2048 bytes and larger (roughly 2x to 3x), although they are somewhat slower than tuned for smaller messages. The IMB test rotates the root, so one could imagine that since the hierarchical component minimizes internode communication, performance increases. See the table at the end of this post for the MPI_Bcast comparison between tuned and hierarchical. This leads me to a few other questions.
1. From what I can tell from the debug messages, we still cannot stack the hierarchical on top of the tuned. I know that Brian Barrett did some work after the collectives meeting to allow for this, but I could not figure out how to get it to work.
2. Enabling the hierarchical collectives causes a massive slowdown during MPI_Init. I know it was discussed a little at the collectives meeting and it appears that this is still something we need to solve. For a simple hello_world, np=4, 2 node cluster, I see around 5 seconds to run for tuned collectives, but I see around 18 seconds for hierarchical.
3. Apart from the MPI_Init issue, is hierarchical ready to go?
4. As the nodes get fatter, I assume the need for hierarchical will increase, so this may become a larger issue for all of us?

RESULTS FROM TWO RUNS OF IMB-MPI1

#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 16
#                                 TUNED     HIERARCH
#----------------------------------------------------------------
       #bytes #repetitions  t_avg[usec]  t_avg[usec]
            0         1000         0.11         0.22
            1         1000       205.97       319.86
            2         1000       159.23       180.80
            4         1000       175.32       189.16
            8         1000       153.10       184.26
           16         1000       170.98       192.33
           32         1000       160.69       187.17
           64         1000       159.75       182.62
          128         1000       175.47       185.19
          256         1000       160.77       194.68
          512         1000       265.45       313.89
         1024         1000       185.66       215.43
         2048         1000       815.97       257.37
         4096         1000      1208.48       442.93
         8192         1000      1521.23       530.54
        16384         1000      2357.45       813.44
        32768         1000      3341.29      1455.78
        65536          640      6485.70      3387.02
       131072          320     13488.35      5261.65
       262144          160     24783.09     10747.28
       524288           80     50906.06     21817.64
      1048576           40     95466.82     41397.49
      2097152           20    180759.72     81319.54
      4194304           10    322327.71    163274.55

=========================
rolf.vandeva...@sun.com
781-442-3043
=========================
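
To make the internode-communication argument a bit more concrete, here is a rough counting model, not a description of what either component actually does. It assumes 16 ranks placed in blocks of 4 per node (rank // 4 is the node), uses a textbook binomial tree as a stand-in for a flat, topology-unaware broadcast, and models the hierarchical case as the root sending once to one process on each other node with everything else staying on-node. All function names are made up for this illustration.

```python
# Rough model: count broadcast messages that cross a node boundary.
# Assumptions (not taken from the tuned or hierarch source code):
#   - 4 nodes x 4 ranks, block placement: node = rank // 4
#   - "flat" = textbook binomial tree on ranks shifted by the root
#   - "hierarchical" = one message per remote node, the rest is on-node

NODES = 4
PPN = 4
NP = NODES * PPN


def node_of(rank):
    """Node index of a rank under the assumed block placement."""
    return rank // PPN


def binomial_edges(root, nprocs):
    """(parent, child) pairs of a binomial tree rooted at `root`."""
    edges = []
    for vrank in range(1, nprocs):
        # Parent in virtual rank space: clear the highest set bit.
        high_bit = 1 << (vrank.bit_length() - 1)
        vparent = vrank ^ high_bit
        # Shift virtual ranks back to real ranks.
        edges.append(((vparent + root) % nprocs, (vrank + root) % nprocs))
    return edges


def flat_internode(root):
    """Internode messages for the flat binomial tree."""
    return sum(1 for p, c in binomial_edges(root, NP)
               if node_of(p) != node_of(c))


def hierarchical_internode(root):
    """Internode messages in the two-level model (root-independent)."""
    return NODES - 1


if __name__ == "__main__":
    for root in range(NP):
        print(root, flat_internode(root), hierarchical_internode(root))
```

Under these assumptions the two-level scheme always sends 3 messages between nodes, while the flat tree sends at least 12 for every root. Message count is only part of the picture: it says nothing about the extra latency of the second level, which may be why the small-message numbers in the table favor tuned.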