Konstantinos,

I am afraid there is some confusion here.


The plm_rsh_no_tree_spawn option is only used at startup time (e.g. when remotely launching one orted daemon on each node other than the one running mpirun).

It has zero impact on the performance of MPI communications such as MPI_Bcast().


The coll/tuned module selects the broadcast algorithm based on the communicator and message sizes.
You can manually force a given algorithm with

mpirun --mca coll_tuned_use_dynamic_rules true --mca coll_tuned_bcast_algorithm <algo> ./my_test

where <algo> is the algorithm number as described by ompi_info --all:

         MCA coll tuned: parameter "coll_tuned_bcast_algorithm" (current value: "ignore", data source: default, level: 5 tuner/detail, type: int)
                         Which bcast algorithm is used. Can be locked down to choice of: 0 ignore, 1 basic linear, 2 chain, 3: pipeline, 4: split binary tree, 5: binary tree, 6: binomial tree.
                         Valid values: 0:"ignore", 1:"basic_linear", 2:"chain", 3:"pipeline", 4:"split_binary_tree", 5:"binary_tree", 6:"binomial"
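
For example, to force the binomial tree algorithm:

mpirun --mca coll_tuned_use_dynamic_rules true --mca coll_tuned_bcast_algorithm 6 ./my_test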

For some specific communicator and message sizes, you might see better performance. You also have the option to write your own rules (i.e. which algorithm should be used for which communicator and message sizes) if you are not happy with the default ones; that is done with the coll_tuned_dynamic_rules_filename MCA option (see the sketch below).
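
As a rough illustration only (please double check the exact file format and the collective ids against the coll/tuned sources of your Open MPI version), a dynamic rules file for bcast could look like

1
7
1
64
2
0 6 0 0
65536 3 0 8192

that is: one collective is described, it is collective id 7 (bcast), and there is one communicator-size section (size 64) with two message-size rules: starting at 0 bytes use algorithm 6 (binomial) with default fan in/out and segment size, and starting at 65536 bytes use algorithm 3 (pipeline) with an 8192 byte segment size.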

Note that coll/tuned does not take the topology (e.g. inter- vs. intra-node communication) into consideration when choosing the algorithm.
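
If you are not sure which message sizes matter in your shuffle, a quick timing loop such as the sketch below can help you compare the algorithms (the message sizes are only examples, adjust them to your workload); compile it with mpic++ and run it with the mpirun command above, once per algorithm value:

// bcast_timing.cc: rough timing of MPI_Bcast at a few message sizes
#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int iters = 50;
    const int sizes[] = {1024, 65536, 1048576, 16777216}; // bytes, pick sizes that match your workload
    const int nsizes = sizeof(sizes) / sizeof(sizes[0]);

    for (int s = 0; s < nsizes; ++s) {
        std::vector<char> buf(sizes[s], 0);
        MPI_Barrier(MPI_COMM_WORLD);          // start all ranks together
        double t0 = MPI_Wtime();
        for (int i = 0; i < iters; ++i)
            MPI_Bcast(&buf[0], sizes[s], MPI_BYTE, 0, MPI_COMM_WORLD);
        MPI_Barrier(MPI_COMM_WORLD);          // wait for the slowest rank
        double t1 = MPI_Wtime();
        if (rank == 0)
            printf("%d bytes: %.3f ms per MPI_Bcast on %d ranks\n",
                   sizes[s], 1000.0 * (t1 - t0) / iters, nprocs);
    }

    MPI_Finalize();
    return 0;
}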


Cheers,

Gilles

On 10/17/2017 3:30 PM, Konstantinos Konstantinidis wrote:
I have implemented some algorithms in C++ whose performance is greatly affected by the shuffling time among nodes, which is done through some broadcast calls. Up to now, I have been testing them by running something like

mpirun -mca btl ^openib -mca plm_rsh_no_tree_spawn 1 ./my_test

which I think makes MPI_Bcast work serially. Now I want to improve the communication time, so I have configured the appropriate SSH access from every node to every other node and enabled the binary tree implementation of Open MPI collective calls by running

mpirun -mca btl ^openib ./my_test

My problem is that, throughout various experiments with files of different sizes, I have seen no improvement in transmission time, even though theoretically I would expect a gain of approximately (log(k))/(k-1), where k is the size of the group within which the communication takes place.

I compile the code with

mpic++ my_test.cc -o my_test

and all of the experiments are done on Amazon EC2 r3.large or m3.large machines. I have also set different rate limits to avoid the bursty behavior of Amazon EC2's transmission rate. The Open MPI version I have installed is described in the attached txt produced by running ompi_info.

What can be wrong here?


_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
