The UCX PML should work just fine even in a single-node scenario. As Jeff
indicated, you need to move the MCA param `--mca pml ucx` before your
command.
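
For example (a sketch based on the command from your earlier message; adjust
the process count and binary path to your setup):

    mpirun --mca pml ucx -np 32 --map-by core --bind-to core ./perf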

  George.


On Mon, Mar 6, 2023 at 9:48 AM Jeff Squyres (jsquyres) via users <
users@lists.open-mpi.org> wrote:

> If this run was on a single node, then UCX probably disabled itself since
> it wouldn't be using InfiniBand or RoCE to communicate between peers.
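>
> (If UCX is installed, running "ucx_info -d" on that node is one way to check
> which transports and devices UCX actually detects there.)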
>
> Also, I'm not sure your command line was correct:
>
> perf_benchmark $ mpirun -np 32 --map-by core --bind-to core ./perf  --mca pml ucx
>
> You probably need to list all of mpirun's CLI options *before* you list
> the ./perf executable.  In its left-to-right traversal, once mpirun hits a
> CLI token it does not recognize (e.g., "./perf"), it assumes that it is
> the user's executable name, and does not process the CLI options to the
> right of that.
>
> Hence, the output you show must have forced the UCX PML another way --
> perhaps you set an environment variable or something?
>
> ------------------------------
> *From:* users <users-boun...@lists.open-mpi.org> on behalf of Chandran,
> Arun via users <users@lists.open-mpi.org>
> *Sent:* Monday, March 6, 2023 3:33 AM
> *To:* Open MPI Users <users@lists.open-mpi.org>
> *Cc:* Chandran, Arun <arun.chand...@amd.com>
> *Subject:* Re: [OMPI users] What is the best choice of pml and btl for
> intranode communication
>
>
>
> Hi Gilles,
>
>
>
> Thanks very much for the information.
>
>
>
> I was looking for the best pml + btl combination for a standalone node with
> a high task count (>= 192) and no HPC-class networking installed.
>
>
>
> Just now realized that I can’t use pml ucx for such cases, as it is unable
> to find IB and fails.
>
>
>
> perf_benchmark $ mpirun -np 32 --map-by core --bind-to core ./perf  --mca pml ucx
>
> --------------------------------------------------------------------------
> No components were able to be opened in the pml framework.
>
> This typically means that either no components of this type were
> installed, or none of the installed components can be loaded.
> Sometimes this means that shared libraries required by these
> components are unable to be found/loaded.
>
>   Host:      lib-ssp-04
>   Framework: pml
> --------------------------------------------------------------------------
> [lib-ssp-04:753542] PML ucx cannot be selected
> [lib-ssp-04:753531] PML ucx cannot be selected
> [lib-ssp-04:753541] PML ucx cannot be selected
> [lib-ssp-04:753539] PML ucx cannot be selected
> [lib-ssp-04:753545] PML ucx cannot be selected
> [lib-ssp-04:753547] PML ucx cannot be selected
> [lib-ssp-04:753572] PML ucx cannot be selected
> [lib-ssp-04:753538] PML ucx cannot be selected
> [lib-ssp-04:753530] PML ucx cannot be selected
> [lib-ssp-04:753537] PML ucx cannot be selected
> [lib-ssp-04:753546] PML ucx cannot be selected
> [lib-ssp-04:753544] PML ucx cannot be selected
> [lib-ssp-04:753570] PML ucx cannot be selected
> [lib-ssp-04:753567] PML ucx cannot be selected
> [lib-ssp-04:753534] PML ucx cannot be selected
> [lib-ssp-04:753592] PML ucx cannot be selected
> [lib-ssp-04:753529] PML ucx cannot be selected
>
> <snip>
>
>
>
> That means my only choice is pml/ob1 + btl/vader.
>
>
>
> --Arun
>
>
>
> *From:* users <users-boun...@lists.open-mpi.org> *On Behalf Of *Gilles
> Gouaillardet via users
> *Sent:* Monday, March 6, 2023 12:56 PM
> *To:* Open MPI Users <users@lists.open-mpi.org>
> *Cc:* Gilles Gouaillardet <gilles.gouaillar...@gmail.com>
> *Subject:* Re: [OMPI users] What is the best choice of pml and btl for
> intranode communication
>
>
>
>
> Arun,
>
>
>
> First, Open MPI selects a pml for **all** the MPI tasks (for example,
> pml/ucx or pml/ob1).
>
>
>
> Then, if pml/ob1 ends up being selected, a btl component (e.g. btl/uct,
> btl/vader) is used for each pair of MPI tasks
>
> (tasks on the same node will use btl/vader, tasks on different nodes will
> use btl/uct)
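>
> If you want to see which pml and btl actually get selected at run time, one
> option is to bump the selection verbosity, e.g.
>
> mpirun --mca pml_base_verbose 10 --mca btl_base_verbose 10 -np 32 ./perf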
>
>
>
> Note that if UCX is available, pml/ucx takes the highest priority, so no
> btl is involved
>
> (in your case, it means intra-node communications will be handled by UCX
> and not btl/vader).
>
> You can force ob1 and try different combinations of btl with
>
> mpirun --mca pml ob1 --mca btl self,<btl1>,<btl2> ...
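>
> For example, to force shared memory for intra-node traffic on a 4.1.x
> install (btl/vader was renamed btl/sm in newer releases), something like:
>
> mpirun --mca pml ob1 --mca btl self,vader -np 32 --map-by core --bind-to core ./perf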
>
>
>
> I expect pml/ucx to be faster than pml/ob1 with btl/uct for inter-node
> communications.
>
>
>
> I have not benchmarked Open MPI for a while, and it is possible that
> btl/vader outperforms pml/ucx for intra-node communications, so if you run
> on a small number of InfiniBand-interconnected nodes with a large number of
> tasks per node, you might be able to get the best performance by forcing
> pml/ob1.
>
>
>
> Bottom line, I think it is best for you to benchmark your application and
> pick the combination that leads to the best performance, and you are more
> than welcome to share your conclusions.
>
>
>
> Cheers,
>
>
>
> Gilles
>
>
>
>
>
> On Mon, Mar 6, 2023 at 3:12 PM Chandran, Arun via users <
> users@lists.open-mpi.org> wrote:
>
>
> Hi Folks,
>
> I can run benchmarks and find the pml+btl (ob1, ucx, uct, vader, etc.)
> combination that gives the best performance,
> but I wanted to hear from the community about what is generally used in
> high-core-count intra-node cases before jumping to conclusions.
>
> As I am a newcomer to Open MPI, I don't want to end up using a combination
> only because it fared better in a benchmark (overfitting?).
>
> Or is the choice of pml+btl for the 'intranode' case not so important, since
> Open MPI is mainly used 'internode' and the networking equipment decides the
> pml+btl? (UCX for IB)
>
> --Arun
>
> -----Original Message-----
> From: users <users-boun...@lists.open-mpi.org> On Behalf Of Chandran,
> Arun via users
> Sent: Thursday, March 2, 2023 4:01 PM
> To: users@lists.open-mpi.org
> Cc: Chandran, Arun <arun.chand...@amd.com>
> Subject: [OMPI users] What is the best choice of pml and btl for intranode
> communication
>
> Hi Folks,
>
> As the number of cores in a socket keeps increasing, choosing the right
> pml/btl (ucx, ob1, uct, vader, etc.) that gives the best performance in the
> "intra-node" scenario is important.
>
> For openmpi-4.1.4, which pml/btl combination is best for intra-node
> communication in a high core-count scenario (p-to-p as well as coll), and
> why?
> Does the answer to the above question hold good for the upcoming ompi5
> release?
>
> --Arun
>
>
