Hi Ole,

you can disable openib at runtime via

OMPI_MCA_btl='^openib'

or at compile time by putting

configopts = '--without-verbs'

in the easyconfig. Neither UCX nor OpenIB (= verbs) is the best solution for
OmniPath.
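
For a quick one-off test you can pass the same exclusion on the command line
instead of via the environment (a sketch, assuming the usual Open MPI mpirun
launcher):

mpirun --mca btl '^openib' -n 2 a.out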

The other thing to look out for is whether psm2 support is compiled in. The
EasyBuild log, i.e.

grep mtl:psm2 $EBROOTOPENMPI/easybuild/easybuild-OpenMPI*log

(after loading the module), should have this line:

checking if MCA component mtl:psm2 can compile... yes
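
You can also ask the installed Open MPI directly; assuming a standard build,
something like

ompi_info | grep -i psm2

should list the psm2 mtl component if it was built in.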

Bart

On Mon, 6 Dec 2021 at 09:04, Ole Holm Nielsen <ole.h.niel...@fysik.dtu.dk>
wrote:

> Hi Bart,
>
> Thanks for your recommendations!  We had already tried this:
>
> export OMPI_MCA_osc='^ucx'
> export OMPI_MCA_pml='^ucx'
>
> and unfortunately this increased the CPU time of our benchmark code (GPAW)
> by about 30% compared to the same compute node without an Omni-Path
> adapter.  So this doesn't appear to be a viable solution.
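>
> (Is there a way to see which transport gets picked instead in that case? I
> guess something like
>
> export OMPI_MCA_pml_base_verbose=10
>
> before mpirun would show it, but I'm not sure how to read the output.)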
>
> We had also tried to rebuild with:
>
> $ eb --filter-deps=UCX OpenMPI-4.0.5-GCC-10.2.0.eb --force
>
> but then the job error log files had some warnings:
>
> > --------------------------------------------------------------------------
> > By default, for Open MPI 4.0 and later, infiniband ports on a device
> > are not used by default.  The intent is to use UCX for these devices.
> > You can override this policy by setting the btl_openib_allow_ib MCA
> > parameter to true.
> >
> >   Local host:              d063
> >   Local adapter:           hfi1_0
> >   Local port:              1
> > --------------------------------------------------------------------------
> > --------------------------------------------------------------------------
> > WARNING: There is at least non-excluded one OpenFabrics device found,
> > but there are no active ports detected (or Open MPI was unable to use
> > them).  This is most certainly not what you wanted.  Check your
> > cables, subnet manager configuration, etc.  The openib BTL will be
> > ignored for this job.
> >
> >   Local host: d063
> > --------------------------------------------------------------------------
> > [d063.nifl.fysik.dtu.dk:23605] 55 more processes have sent help message
> > help-mpi-btl-openib.txt / ib port not selected
> > [d063.nifl.fysik.dtu.dk:23605] Set MCA parameter "orte_base_help_aggregate"
> > to 0 to see all help / error messages
> > [d063.nifl.fysik.dtu.dk:23605] 55 more processes have sent help message
> > help-mpi-btl-openib.txt / no active ports found
>
> These warnings did sound rather bad, so we didn't pursue this approach any
> further.
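>
> The first message suggests setting btl_openib_allow_ib, which I suppose
> would be
>
> export OMPI_MCA_btl_openib_allow_ib=true
>
> but I assume we don't actually want openib on Omni-Path anyway?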
>
> Do you have any other ideas about OMPI_* variables that we could try?
> Since I'm not an MPI expert, complete commands and variables would be
> appreciated :-)
>
> I would like to remind you that we're running AlmaLinux 8.5 with new
> versions of libfabric etc. from the BaseOS.  On CentOS 7.9 we never had
> any problems with Omni-Path adapters.
>
> Thanks,
> Ole
>
> On 12/3/21 15:08, Bart Oldeman wrote:
> > Hi Ole,
> >
> > we found that UCX isn't very useful nor performant on OmniPath, so if your
> > compiled OpenMPI isn't used on both InfiniBand and OmniPath you can compile
> > OpenMPI using "eb --filter-deps=UCX ...".
> > Open MPI works well there either using libpsm2 directly (using the "cm"
> > pml and the "psm2" mtl), or via libfabric (using the same "cm" pml and the
> > "ofi" mtl).
> >
> > We use the same Open MPI binaries on multiple clusters but set this on
> > OmniPath:
> >
> > OMPI_MCA_btl='^openib'
> > OMPI_MCA_osc='^ucx'
> > OMPI_MCA_pml='^ucx'
> >
> > to disable UCX and openib at runtime. If you include UCX in EB's OpenMPI,
> > it will not compile in "openib", so the first one of those three would not
> > be needed.
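> >
> > One way to apply these per cluster is a profile script; a sketch (the
> > path is hypothetical):
> >
> > # /etc/profile.d/ompi-omnipath.sh (hypothetical path)
> > export OMPI_MCA_btl='^openib'
> > export OMPI_MCA_osc='^ucx'
> > export OMPI_MCA_pml='^ucx'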
> >
> > Regards,
> > Bart
> >
> > On Fri, 3 Dec 2021 at 07:29, Ole Holm Nielsen <ole.h.niel...@fysik.dtu.dk>
> > wrote:
> >
> >     Hi Åke,
> >
> >     On 12/3/21 08:27, Åke Sandgren wrote:
> >      >> On 02-12-2021 14:18, Åke Sandgren wrote:
> >      >>> On 12/2/21 2:06 PM, Ole Holm Nielsen wrote:
> >      >>>> These are updated observations of running OpenMPI codes with an
> >      >>>> Omni-Path network fabric on AlmaLinux 8.5:
> >      >>>>
> >      >>>> Using the foss-2021b toolchain and OpenMPI/4.1.1-GCC-11.2.0 my
> >      >>>> trivial MPI test code works correctly:
> >      >>>>
> >      >>>> $ ml OpenMPI
> >      >>>> $ ml
> >      >>>>
> >      >>>> Currently Loaded Modules:
> >      >>>>   1) GCCcore/11.2.0                    9) hwloc/2.5.0-GCCcore-11.2.0
> >      >>>>   2) zlib/1.2.11-GCCcore-11.2.0       10) OpenSSL/1.1
> >      >>>>   3) binutils/2.37-GCCcore-11.2.0     11) libevent/2.1.12-GCCcore-11.2.0
> >      >>>>   4) GCC/11.2.0                       12) UCX/1.11.2-GCCcore-11.2.0
> >      >>>>   5) numactl/2.0.14-GCCcore-11.2.0    13) libfabric/1.13.2-GCCcore-11.2.0
> >      >>>>   6) XZ/5.2.5-GCCcore-11.2.0          14) PMIx/4.1.0-GCCcore-11.2.0
> >      >>>>   7) libxml2/2.9.10-GCCcore-11.2.0    15) OpenMPI/4.1.1-GCC-11.2.0
> >      >>>>   8) libpciaccess/0.16-GCCcore-11.2.0
> >      >>>>
> >      >>>> $ mpicc mpi_test.c
> >      >>>> $ mpirun -n 2 a.out
> >      >>>>
> >      >>>> (null): There are 2 processes
> >      >>>>
> >      >>>> (null): Rank  1:  d008
> >      >>>>
> >      >>>> (null): Rank  0:  d008
> >      >>>>
> >      >>>>
> >      >>>> I also tried the OpenMPI/4.1.0-GCC-10.2.0 module, but this still
> >      >>>> gives the error messages:
> >      >>>>
> >      >>>> $ ml OpenMPI/4.1.0-GCC-10.2.0
> >      >>>> $ ml
> >      >>>>
> >      >>>> Currently Loaded Modules:
> >      >>>>   1) GCCcore/10.2.0                    8) libpciaccess/0.16-GCCcore-10.2.0
> >      >>>>   2) zlib/1.2.11-GCCcore-10.2.0        9) hwloc/2.2.0-GCCcore-10.2.0
> >      >>>>   3) binutils/2.35-GCCcore-10.2.0     10) libevent/2.1.12-GCCcore-10.2.0
> >      >>>>   4) GCC/10.2.0                       11) UCX/1.9.0-GCCcore-10.2.0
> >      >>>>   5) numactl/2.0.13-GCCcore-10.2.0    12) libfabric/1.11.0-GCCcore-10.2.0
> >      >>>>   6) XZ/5.2.5-GCCcore-10.2.0          13) PMIx/3.1.5-GCCcore-10.2.0
> >      >>>>   7) libxml2/2.9.10-GCCcore-10.2.0    14) OpenMPI/4.1.0-GCC-10.2.0
> >      >>>>
> >      >>>> $ mpicc mpi_test.c
> >      >>>> $ mpirun -n 2 a.out
> >      >>>> [1638449983.577933] [d008:910356:0]       ib_iface.c:966  UCX ERROR
> >      >>>> ibv_create_cq(cqe=4096) failed: Operation not supported
> >      >>>> [1638449983.577827] [d008:910355:0]       ib_iface.c:966  UCX ERROR
> >      >>>> ibv_create_cq(cqe=4096) failed: Operation not supported
> >      >>>> [d008.nifl.fysik.dtu.dk:910355] pml_ucx.c:273  Error: Failed to
> >      >>>> create UCP worker
> >      >>>> [d008.nifl.fysik.dtu.dk:910356] pml_ucx.c:273  Error: Failed to
> >      >>>> create UCP worker
> >      >>>>
> >      >>>> (null): There are 2 processes
> >      >>>>
> >      >>>> (null): Rank  0:  d008
> >      >>>>
> >      >>>> (null): Rank  1:  d008
> >      >>>>
> >      >>>> Conclusion: The foss-2021b toolchain with OpenMPI/4.1.1-GCC-11.2.0
> >      >>>> seems to be required on systems with an Omni-Path network fabric
> >      >>>> on AlmaLinux 8.5.  Perhaps the newer UCX/1.11.2-GCCcore-11.2.0 is
> >      >>>> really what's needed, compared to UCX/1.9.0-GCCcore-10.2.0 from
> >      >>>> foss-2020b.
> >      >>>>
> >      >>>> Does anyone have comments on this?
> >      >>>
> >      >>> UCX is the problem here in combination with libfabric I think.
> >      >>> Write a hook that upgrades the version of UCX to 1.11-something if
> >      >>> it's < 1.11-ish, or just that specific version if you have
> >      >>> older-and-working versions.
> >      >>
> >      >> You are right that the nodes with Omni-Path have different libfabric
> >      >> packages which come from the EL8.5 BaseOS as well as the latest
> >      >> Cornelis/Intel Omni-Path drivers:
> >      >>
> >      >> $ rpm -qa | grep libfabric
> >      >> libfabric-verbs-1.10.0-2.x86_64
> >      >> libfabric-1.12.1-1.el8.x86_64
> >      >> libfabric-devel-1.12.1-1.el8.x86_64
> >      >> libfabric-psm2-1.10.0-2.x86_64
> >      >>
> >      >> The 1.12 packages are from EL8.5, and the 1.10 packages are from
> >      >> Cornelis.
> >      >>
> >      >> Regarding UCX, I was first using the trusted foss-2020b toolchain
> >      >> which includes UCX/1.9.0-GCCcore-10.2.0.  I guess that we shouldn't
> >      >> mess with the toolchains?
> >      >>
> >      >> The foss-2021b toolchain includes the newer UCX 1.11, which seems
> >      >> to solve this particular problem.
> >      >>
> >      >> Can we make any best-practices recommendations from these
> >      >> observations?
> >      >
> >      > I didn't check properly, but UCX does not depend on libfabric,
> >      > OpenMPI does, so I'd write a hook that replaces libfabric < 1.12
> >      > with at least 1.12.1.
> >      > Sometimes you just have to mess with the toolchains, and this looks
> >      > like one of those situations.
> >      >
> >      > Or as a test, build your own OpenMPI-4.1.0 or 4.0.5 (that 2020b uses)
> >      > with an updated libfabric and check if that fixes the problem.  And
> >      > if it does, write a hook that replaces libfabric.  See the
> >      > framework/contrib for examples; I did that for UCX, so there is code
> >      > there to show you how.
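> >      >
> >      > A rough sketch of the wiring (untested, modeled loosely on the
> >      > contrib hooks; the exact layout of the ec['dependencies'] entries
> >      > may differ, so check the real examples before relying on it):
> >      >
> >      > cat > libfabric_hook.py <<'EOF'
> >      > from distutils.version import LooseVersion
> >      >
> >      > def parse_hook(ec, *args, **kwargs):
> >      >     # replace any libfabric dependency older than 1.12 with 1.12.1
> >      >     deps = ec['dependencies']
> >      >     for i, dep in enumerate(deps):
> >      >         if dep[0] == 'libfabric' and LooseVersion(dep[1]) < LooseVersion('1.12'):
> >      >             deps[i] = ('libfabric', '1.12.1')
> >      > EOF
> >      >
> >      > eb --hooks=$PWD/libfabric_hook.py OpenMPI-4.0.5-GCC-10.2.0.eb --force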
> >
> >     I don't feel qualified to mess around with modifying EB toolchains...
> >
> >     The foss-2021b toolchain including OpenMPI/4.1.1-GCC-11.2.0 seems to
> >     solve the present problem.  Do you think there are any disadvantages
> >     in asking users to go for foss-2021b?  Of course we may need several
> >     modules to be upgraded from foss-2020b to foss-2021b.
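> >
> >     (I suppose something like
> >
> >     eb SomeApp-1.0-foss-2020b.eb --try-toolchain=foss,2021b
> >
> >     could help with that; "SomeApp" is just a placeholder here, and I
> >     gather --try-toolchain doesn't always succeed.)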
> >
> >     Another possibility may be the coming driver upgrade from Cornelis
> >     Networks to support the Omni-Path fabric on EL 8.4 and EL 8.5.  I'm
> >     definitely going to check this when it becomes available.
> >
> >     Thanks,
> >     Ole
> >
>
>

-- 
Dr. Bart E. Oldeman | bart.olde...@mcgill.ca | bart.olde...@calculquebec.ca
Scientific Computing Analyst / Analyste en calcul scientifique
McGill HPC Centre / Centre de Calcul Haute Performance de McGill |
http://www.hpc.mcgill.ca
Calcul Québec | http://www.calculquebec.ca
Compute/Calcul Canada | http://www.computecanada.ca
Tel/Tél: 514-396-8926 | Fax/Télécopieur: 514-396-8934
