Rolf,

Wow! That's actually good news, since in our own tests hierarch has always been slower. That might have various reasons, though, including the fact that we only have two cores per node. BTW: I would actually expect the IMB test to show worse performance for hierarch than many other benchmarks, since the rotating root causes some additional work/overhead.
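For reference, the IMB Bcast measurement basically does something like the following (just a simplified sketch from my side, not the actual IMB code): the root changes on every repetition, so any per-root setup a component has to do is paid on every iteration.

/* Simplified sketch of the IMB Bcast measurement loop:
 * the root rank is rotated on every repetition, so any
 * per-root setup a component does is paid every time. */
char buffer[4096];
int nprocs, i;
MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
for (i = 0; i < 1000; i++) {
    int root = i % nprocs;
    MPI_Bcast(buffer, sizeof(buffer), MPI_BYTE, root, MPI_COMM_WORLD);
}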

Rolf Vandevaart wrote:
I am curious whether anyone is currently doing any work on the hierarchical collectives. I ask because I just did some runs on a cluster made up of 4 servers with 4 processors per server, using TCP over IB. I was running with np=16 and using the IMB benchmark to test MPI_Bcast. What I am seeing is that the hierarchical collectives appear to boost performance. The IMB test rotates the root, so one could imagine that performance increases because the hierarchical collectives minimize internode communication. See the table at the end of this post for the comparison of MPI_Bcast between tuned and hierarchical. This leads me to a few other questions.

1. From what I can tell from the debug messages, we still cannot stack the hierarchical on top of the tuned. I know that Brian Barrett did some work after the collectives meeting to allow for this, but I could not figure out how to get it to work.

Actually, this should be possible. We are, however, experiencing some problems with the current trunk, so I cannot verify it right now. Just for the sake of clarity: did you run hierarch on top of tuned, or on top of basic and/or sm?
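In case it helps, this is roughly how I would try to force hierarch on top of tuned. The parameter names below just follow the usual coll_<component>_priority convention, so please double-check them with ompi_info against your build, and adjust the hostfile name:

mpirun -np 16 --hostfile myhosts \
    --mca coll_hierarch_priority 90 \
    --mca coll_tuned_priority 80 \
    ./IMB-MPI1 Bcast

ompi_info --param coll hierarch should show whether the component and its priority parameter are actually available in your build.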


2. Enabling the hierarchical collectives causes a massive slowdown during MPI_Init. I know it was discussed a little at the collectives meeting and it appears that this is still something we need to solve. For a simple hello_world on a 2-node cluster with np=4, I see around 5 seconds to run with the tuned collectives, but around 18 seconds with the hierarchical ones.

Yes. A faster, albeit simpler, hierarchy detection is implemented, but not yet committed.
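Just to give an idea of the direction (this is only a rough illustration of the principle, not the code that will go in): one simple way to detect the node-level hierarchy is to gather the processor names and group identical ones.

#include <mpi.h>
#include <stdlib.h>
#include <string.h>

/* Rough sketch of a simple node-level hierarchy detection:
 * every rank publishes its processor name, and ranks with the
 * same name end up in the same node-local communicator.
 * Illustration only -- not the actual hierarch code. */
static void detect_node_comm(MPI_Comm comm, MPI_Comm *node_comm)
{
    int rank, size, i, color = 0, len;
    char name[MPI_MAX_PROCESSOR_NAME];
    char *all;

    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    memset(name, 0, sizeof(name));
    MPI_Get_processor_name(name, &len);

    all = malloc((size_t)size * MPI_MAX_PROCESSOR_NAME);
    MPI_Allgather(name, MPI_MAX_PROCESSOR_NAME, MPI_CHAR,
                  all,  MPI_MAX_PROCESSOR_NAME, MPI_CHAR, comm);

    /* color = lowest rank that reported the same processor name */
    for (i = 0; i < size; i++) {
        if (0 == strcmp(name, &all[i * MPI_MAX_PROCESSOR_NAME])) {
            color = i;
            break;
        }
    }
    MPI_Comm_split(comm, color, rank, node_comm);
    free(all);
}

The real component obviously does more than this, but that is the general idea.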


3. Apart from the MPI_Init issue, is hierarchical ready to go?

Clearly, the algorithms in hierarch are very simple, and they still lack large-scale testing, so that is something which would have to happen first.

We have also experimented with various other hierarchical algorithms for bcast over the last few months, though our overall progress has been significantly slower than I had hoped. I know, however, that various other groups are also interested in the hierarch component and might be ready to invest some time to bring it up to speed.
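For those who have not looked at the code: the basic pattern behind a hierarchical bcast is one broadcast among the node leaders followed by a broadcast inside each node. A bare-bones sketch, as an illustration only, assuming node_comm groups the on-node processes, leader_comm connects the local rank-0 process of every node, and the data already sits at rank 0 of leader_comm:

/* Sketch of a two-level broadcast. Assumes node_comm groups the
 * processes on one node, leader_comm contains the local rank-0
 * process of every node (MPI_COMM_NULL elsewhere), and the data
 * starts at rank 0 of leader_comm. Illustration only. */
static int two_level_bcast(void *buf, int count, MPI_Datatype dtype,
                           MPI_Comm leader_comm, MPI_Comm node_comm)
{
    /* Phase 1: inter-node broadcast among the node leaders. */
    if (MPI_COMM_NULL != leader_comm) {
        MPI_Bcast(buf, count, dtype, 0, leader_comm);
    }
    /* Phase 2: intra-node broadcast from each leader. */
    MPI_Bcast(buf, count, dtype, 0, node_comm);
    return MPI_SUCCESS;
}

Pipelining the two phases for large messages is where it gets more interesting, but the sketch above shows the basic structure.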



Thanks
Edgar


4. As the nodes get fatter, I assume the need for hierarchical will increase, so this may become a larger issue for all of us?

RESULTS FROM TWO RUNS OF IMB-MPI1

#----------------------------------------------------------------
# Benchmarking Bcast
# #processes = 16             TUNED         HIERARCH
#----------------------------------------------------------------
        #bytes #repetitions  t_avg[usec]  t_avg[usec]
             0         1000         0.11         0.22
             1         1000       205.97       319.86
             2         1000       159.23       180.80
             4         1000       175.32       189.16
             8         1000       153.10       184.26
            16         1000       170.98       192.33
            32         1000       160.69       187.17
            64         1000       159.75       182.62
           128         1000       175.47       185.19
           256         1000       160.77       194.68
           512         1000       265.45       313.89
          1024         1000       185.66       215.43
          2048         1000       815.97       257.37
          4096         1000      1208.48       442.93
          8192         1000      1521.23       530.54
         16384         1000      2357.45       813.44
         32768         1000      3341.29      1455.78
         65536          640      6485.70      3387.02
        131072          320     13488.35      5261.65
        262144          160     24783.09     10747.28
        524288           80     50906.06     21817.64
       1048576           40     95466.82     41397.49
       2097152           20    180759.72     81319.54
       4194304           10    322327.71    163274.55


=========================
rolf.vandeva...@sun.com
781-442-3043
=========================
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
Edgar Gabriel
Assistant Professor
Parallel Software Technologies Lab      http://pstl.cs.uh.edu
Department of Computer Science          University of Houston
Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335
