Eric,

In 1.3 and some of the latest 1.2.x releases, tuned is the default component for collectives. However, the tuned component currently in the trunk is optimized for high-performance networks (such as IB or MX), and it does not deliver the best performance on slower interconnects such as Ethernet.

To experiment with the different allgather implementations, set the following MCA parameters either in $HOME/.openmpi/mca-params.conf or on the command line: 1) coll_tuned_use_dynamic_rules to 1, to enable fine-grained selection of the algorithms; 2) coll_tuned_allgather_algorithm to a value between 0 and 6 (once dynamic rules are enabled, 'ompi_info --param coll tuned' lists the algorithm corresponding to each value).

This lets you select a specific algorithm for allgather. You can tune it further by adjusting the fanout (for tree topologies) and the segment size (for the pipelined algorithms).
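
As a rough sketch of the settings described above, a per-user parameter file might look like the following. The algorithm value 3 is an arbitrary pick from the 0 to 6 range, and ./my_app and the process count are placeholders. The fanout and segment-size parameter names are assumptions and are left commented out, so check them against the ompi_info output for your build before using them.

```conf
# $HOME/.openmpi/mca-params.conf

# Enable fine-grained (per-collective) algorithm selection in the tuned component.
coll_tuned_use_dynamic_rules = 1

# Force one allgather algorithm; 3 is only an example value (valid range 0-6).
# 'ompi_info --param coll tuned' lists what each number means.
coll_tuned_allgather_algorithm = 3

# Assumed parameter names and illustrative values for further tuning;
# verify both with ompi_info before uncommenting.
# coll_tuned_allgather_algorithm_segmentsize = 0
# coll_tuned_allgather_algorithm_tree_fanout = 4

# Equivalent command line (./my_app and -np 4 are placeholders):
#   mpirun --mca coll_tuned_use_dynamic_rules 1 \
#          --mca coll_tuned_allgather_algorithm 3 -np 4 ./my_app
```

Setting the values on the command line is probably the more convenient route when comparing algorithms, since each run can use a different value without editing the file.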

  george.


On Oct 3, 2008, at 8:48 AM, Eric Thibodeau wrote:

Hello all,

I am currently profiling a simple case where I replace multiple S/R calls with Allgather calls and it would _seem_ the simple S/R calls are faster. Now, *before* I come to any conclusion on this, one of the pieces I am missing is more details on how/if/when the tuned coll MCA is selected. In other words, can I assume the tuned versions are used by default? I skimmed through the well-documented source code, but before I can even start to analyze the replacement's impact (in a small cluster), I need to know how and when the tuned coll MCA is used/selected.

Thanks,

Eric
