I was reading about this today. Isn't OpenMPI compiled --with-slurm by
default when installed through one of the package managers?

https://www.open-mpi.org/faq/?category=building#build-rte
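
For what it's worth, one way to check an existing install without rebuilding
is to look for slurm components in the output of ompi_info. Below is a rough
Python sketch of that check. The helper names are made up for illustration,
and the line format it parses ("MCA ras: slurm (MCA v2.0.0, ...)") is an
assumption based on what ompi_info typically prints, so treat it as a starting
point rather than a definitive test.

```python
import re
import subprocess

# Assumed shape of a component line in ompi_info output, e.g.
#   "MCA ras: slurm (MCA v2.0.0, API v2.0.0, Component v1.10.2)"
_MCA_LINE = re.compile(r"^\s*MCA\s+(\w+):\s+(\w+)\s+\(")


def slurm_frameworks(ompi_info_text):
    """Return the MCA frameworks that list a component named 'slurm'."""
    found = []
    for line in ompi_info_text.splitlines():
        match = _MCA_LINE.match(line)
        if match and match.group(2) == "slurm":
            found.append(match.group(1))
    return found


def installed_slurm_frameworks():
    """Run ompi_info (must be on PATH) and report its slurm components."""
    result = subprocess.run(
        ["ompi_info"], capture_output=True, text=True, check=True
    )
    return slurm_frameworks(result.stdout)
```

Calling installed_slurm_frameworks() on a node returns the list of frameworks
with a slurm component. An empty list would suggest the build has no Slurm
support, although it says nothing about whether PMI support was built in.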

Cheers
L.

------
The most dangerous phrase in the language is, "We've always done it this
way."

- Grace Hopper

On 13 April 2016 at 16:30, Craig Yoshioka <[email protected]> wrote:

>
> Thanks, I'll add that to my list of things to try. I did use --with-pmi
> but not --with-slurm.
>
> Sent from my iPhone
>
> > On Apr 12, 2016, at 11:19 PM, Jordan Willis <[email protected]>
> wrote:
> >
> >
> > Have you tried recompiling openmpi with the --with-slurm option? That did
> the trick for me.
> >
> >
> >> On Apr 12, 2016, at 10:52 PM, Craig Yoshioka <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> I have a strange situation that I could use assistance with.  We
> recently rebooted some nodes in our Slurm cluster and after the reboot,
> running MPI programs on these nodes results in complaints from OpenMPI
> about the Infiniband ports:
> >>
> >>
> --------------------------------------------------------------------------
> >> No OpenFabrics connection schemes reported that they were able to be
> >> used on a specific port.  As such, the openib BTL (OpenFabrics
> >> support) will be disabled for this port.
> >>
> >> Local host:           XXXXXXXXXX
> >> Local device:         mlx4_0
> >> Local port:           1
> >> CPCs attempted:       udcm
> >> --------------------------------------------------------------------------
> >>
> [XXXXXXXXXXX][[7024,1],1][btl_openib_proc.c:157:mca_btl_openib_proc_create]
> [btl_openib_proc.c:157] ompi_modex_recv failed for peer [[7024,1],0]
> >>
> >> These nodes did receive some updates, but are otherwise all running the
> same version of Slurm (15.08.7) and OpenMPI (1.10.2).  The weird thing is
> that if I ssh into the affected nodes and use mpirun directly Infiniband
> works correctly.  So the problem definitely involves an interaction between
> Slurm (maybe via PMI?) and OpenMPI.
> >>
> >> Any thoughts?
> >>
> >> Thanks!
> >> -Craig
> >>
>
