I am using Open MPI 2.1.0 on RHEL 7. My application has one unavoidable
pinch point where a large amount of data needs to be transferred (about 8
GB of data needs to be both sent to and received from all other ranks), and
I'm seeing worse performance than I would expect; this step has a major
impact on my overall runtime.
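[Editor's note, not part of the original message: the pattern described — every rank both sending to and receiving from every other rank — typically maps onto MPI_Alltoall. A minimal sketch, assuming contiguous byte buffers and an illustrative per-peer chunk size (the real application moves roughly 8 GB in total):]

```c
/* Sketch of the described communication pattern: every rank sends a
 * chunk to, and receives a chunk from, every other rank. The chunk
 * size here is hypothetical, purely for illustration. */
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int nranks;
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    const size_t chunk = 1 << 20;            /* 1 MiB per peer (illustrative) */
    char *sendbuf = malloc(chunk * nranks);  /* one chunk per destination */
    char *recvbuf = malloc(chunk * nranks);  /* one chunk per source      */

    /* This collective is the "pinch point": each rank contributes
     * 'chunk' bytes to every other rank and receives the same amount. */
    MPI_Alltoall(sendbuf, (int)chunk, MPI_BYTE,
                 recvbuf, (int)chunk, MPI_BYTE, MPI_COMM_WORLD);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}
```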
Adam,
first, you need to change the default send and receive socket buffers:
mpirun --mca btl_tcp_sndbuf 0 --mca btl_tcp_rcvbuf 0 ...
/* note this will be the default from Open MPI 2.1.2 */
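[Editor's note, a sketch not from the thread itself: the same MCA parameters can also be set persistently in a user-level params file, or exported as environment variables for a one-off run. Both mechanisms are standard Open MPI configuration; the file path below is the per-user default location.]

```
# $HOME/.openmpi/mca-params.conf — applies to every mpirun by this user
btl_tcp_sndbuf = 0
btl_tcp_rcvbuf = 0
```

```shell
# equivalent one-off form via environment variables
export OMPI_MCA_btl_tcp_sndbuf=0
export OMPI_MCA_btl_tcp_rcvbuf=0
mpirun ...
```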
hopefully, that will be enough to greatly improve the bandwidth for
large messages.
generally speaking,
Gilles,
Thanks for the fast response!
The --mca btl_tcp_sndbuf 0 --mca btl_tcp_rcvbuf 0 flags you recommended
made a huge difference - this got me up to 5.7 Gb/s! I wasn't aware of
these flags... with a little Googling, is
https://www.open-mpi.org/faq/?category=tcp the best place to look for this
kind of information?
Adam,
You can also set btl_tcp_links to 2 or 3 to allow multiple connections
between peers, with a potential higher aggregate bandwidth.
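[Editor's note, a sketch not from the thread: combining both suggestions on a single command line might look like the following; `./my_app` is a hypothetical application name, and whether 2 or 3 links helps depends on your network.]

```shell
mpirun --mca btl_tcp_sndbuf 0 --mca btl_tcp_rcvbuf 0 \
       --mca btl_tcp_links 3 \
       ./my_app
```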
George.
On Sun, Jul 9, 2017 at 10:04 AM, Adam Sylvester wrote:
> Gilles,
>
> Thanks for the fast response!
>
> The --mca btl_tcp_sndbuf 0 --mca btl_tcp_rcvbuf 0 flags you recommended
Adam,
Thanks for letting us know your performance issue has been resolved.
yes, https://www.open-mpi.org/faq/?category=tcp is the best place to
look for this kind of information.
i will add a reference to these parameters. i will also ask folks at AWS
if they have additional/other recommendations.