Just wanted to follow up on my own post.

Turns out there was a missing symlink (much embarrassment) on my build host.
That’s why “pmix_v1” doesn’t appear in the “srun --mpi=list” output in my
previous post.
Once I fixed that and rebuilt SLURM, I was able to launch existing OpenMPI 3.x 
apps with,

      srun --mpi=pmix_v1
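
For anyone who hits the same symptom, a quick check along these lines may save a rebuild cycle. It is only a sketch: has_mpi_plugin is a hypothetical helper name, and it assumes srun prints one plugin per line with an "srun: " prefix (as in the listing quoted below), possibly on stderr.

```shell
#!/bin/sh
# Hypothetical helper: reads `srun --mpi=list` style output on stdin and
# succeeds only if the plugin named in $1 appears on a line of its own.
has_mpi_plugin() {
    grep -Eq "^srun: $1[[:space:]]*\$"
}

# Only query srun if it is actually installed on this host.
if command -v srun >/dev/null 2>&1; then
    # Assumption: the list may go to stderr, so merge both streams.
    if srun --mpi=list 2>&1 | has_mpi_plugin pmix_v1; then
        echo "pmix_v1 plugin is available"
    else
        echo "pmix_v1 plugin is missing; check the PMIx 1.x install on the build host" >&2
    fi
fi
```

If the check fails right after a rebuild, the PMIx 1.x path given to --with-pmix is the first thing I would look at.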

Apologies for the wasted bandwidth.

Regards,

Charlie

> On Jun 28, 2018, at 8:14 AM, Charles A Taylor <chas...@ufl.edu> wrote:
> 
> There is a name for my pain and it is “OpenMPI + PMIx”.  :)
> 
> I’m looking at upgrading SLURM from 16.05.11 to 17.11.05 (bear with me, this 
> is not a SLURM question).
> 
> After building SLURM 17.11.05 with 
> ‘--with-pmix=/opt/pmix/1.1.5:/opt/pmix/2.1/1’ and installing a test instance, 
> I see
> 
> $ srun --mpi=list
> srun: MPI types are...
> srun: pmix
> srun: pmi2
> srun: pmix_v2
> srun: none
> srun: openmpi
> 
> Seems reasonable.
> 
> Now, we have applications built with OpenMPI 3.0.0 and 3.1.0 linked against 
> /opt/pmix/1.1.5 (--with-pmix=/opt/pmix/1.1.5).  When I attempt to launch 
> these applications using,
> 
>      srun --mpi=pmix <some mpi app>
> 
> I get the following ...
> 
> [c1a-s18.ufhpc:17995] Security mode none is not available
> [c1a-s18.ufhpc:17995] PMIX ERROR: UNREACHABLE in file 
> src/client/pmix_client.c at line 199
> --------------------------------------------------------------------------
> The application appears to have been direct launched using "srun",
> but OMPI was not built with SLURM's PMI support and therefore cannot
> execute. There are several options for building PMI support under
> SLURM, depending upon the SLURM version you are using:
> 
>  version 16.05 or later: you can use SLURM's PMIx support. This
>  requires that you configure and build SLURM --with-pmix.
> 
>  Versions earlier than 16.05: you must use either SLURM's PMI-1 or
>  PMI-2 support. SLURM builds PMI-1 by default, or you can manually
>  install PMI-2. You must then build Open MPI using --with-pmi pointing
>  to the SLURM PMI library location.
> 
> Please configure as appropriate and try again.
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> ***    and potentially your MPI job)
> --------------------------------------------------------------------------
> 
> So slurm/srun appear to have library support for both pmix and pmix_v2, and 
> OpenMPI 3.0.0 and OpenMPI 3.1.0 both have pmix support (1.1.5), since we 
> launch them every day with “srun --mpi=pmix” under slurm 16.05.11.
> 
> Is this a bug?   Am I overlooking something?  Is it possible to transition to 
> OpenMPI 3.x + PMIx 2.x + SLURM 17.x without rebuilding (essentially) 
> everything (including all applications)?
> 
> Charlie Taylor
> UF Research Computing
> 

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
