[Public] Hi Folks,
I can run benchmarks and find the pml+btl combination (ob1, ucx, uct, vader, etc.) that gives the best performance, but I wanted to hear from the community what is generally used in "high core count, intra-node" cases before jumping to conclusions.

As a newcomer to Open MPI, I don't want to end up using a combination only because it fared better in one benchmark (overfitting?).

Or is the choice of pml+btl for the intra-node case not so important, because Open MPI is mainly used inter-node and the networking equipment decides the pml+btl (e.g. UCX for IB)?

--Arun

-----Original Message-----
From: users <users-boun...@lists.open-mpi.org> On Behalf Of Chandran, Arun via users
Sent: Thursday, March 2, 2023 4:01 PM
To: users@lists.open-mpi.org
Cc: Chandran, Arun <arun.chand...@amd.com>
Subject: [OMPI users] What is the best choice of pml and btl for intranode communication

Hi Folks,

As the number of cores per socket keeps increasing, choosing the pml and btl (ucx, ob1, uct, vader, etc.) that give the best performance in the intra-node scenario is becoming important.

For openmpi-4.1.4, which pml and btl combination is best for intra-node communication at high core counts (point-to-point as well as collectives), and why?

Does the answer to the above question also hold for the upcoming ompi5 release?

--Arun