I am curious about the algorithm(s) used in Open MPI's implementations
of MPI_Alltoall and MPI_Alltoallv.  As many of you know, there are
alternative algorithms for all-to-all operations, such as that of
Plimpton et al. (2006), that basically accept a higher bandwidth cost in
exchange for a lower latency cost, which pays big dividends at large
processor counts, e.g. hundreds or thousands.  Does Open MPI, or any
other MPI distribution, test the processor count and switch to such an
all-to-all algorithm at some point?  I realize the switchover point
would depend heavily on the architecture, so hard-coding it could be a
risky decision in some cases.  Nevertheless, has it been considered?
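
To make the trade-off concrete, here is a minimal sketch in Python of the
kind of test I have in mind.  It uses the textbook alpha-beta cost model
(alpha = per-message latency, beta = per-byte transfer time) and compares a
direct pairwise exchange against a generic log-step (Bruck-style) exchange.
It is not taken from Open MPI or from the Plimpton paper; the function
names and the sample alpha and beta values are my own assumptions.

```python
def direct_cost(p, m, alpha, beta):
    """Pairwise exchange: each rank sends p - 1 messages of m bytes."""
    return (p - 1) * (alpha + beta * m)


def log_step_cost(p, m, alpha, beta):
    """Log-step exchange: ceil(log2 p) rounds, each moving about p/2 blocks."""
    rounds = (p - 1).bit_length()  # exact ceil(log2(p)) for p >= 1
    return rounds * (alpha + beta * m * p / 2)


def pick_algorithm(p, m, alpha, beta):
    """Return the name of the cheaper algorithm under this simple model."""
    if log_step_cost(p, m, alpha, beta) < direct_cost(p, m, alpha, beta):
        return "log_step"
    return "direct"


if __name__ == "__main__":
    # Hypothetical machine: 1 microsecond latency, 1 GB/s bandwidth.
    alpha, beta = 1e-6, 1e-9
    for p in (16, 256, 1024):
        for m in (8, 1024, 1_000_000):
            print(p, m, pick_algorithm(p, m, alpha, beta))
```

Under this model the log-step variant wins for small messages on many
ranks and loses for large messages, and the crossover moves with alpha and
beta, which is exactly why a fixed switchover point seems risky.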

Tom R.


