By the way, there was a change between 2.x and 3.0.x:

2.x:

Hello, world, I am 0 of 1, (Open MPI v2.1.2a1, package: Open MPI bbarrett@ip-172-31-64-10 Distribution, ident: 2.1.2a1, repo rev: v2.1.1-59-gdc049e4, Unreleased developer copy, 148)
Hello, world, I am 0 of 1, (Open MPI v2.1.2a1, package: Open MPI bbarrett@ip-172-31-64-10 Distribution, ident: 2.1.2a1, repo rev: v2.1.1-59-gdc049e4, Unreleased developer copy, 148)

3.0.x:

% srun -n 2 ./hello_c
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[ip-172-31-64-100:72545] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[ip-172-31-64-100:72546] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
srun: error: ip-172-31-64-100: tasks 0-1: Exited with exit code 1

Don’t think it really matters, since v2.x probably wasn’t what the customer wanted.

Brian

On Jun 19, 2017, at 7:18 AM, Howard Pritchard <hpprit...@gmail.com> wrote:

Hi Ralph

I think the alternative you mention below should suffice.

Howard

r...@open-mpi.org <r...@open-mpi.org> wrote on Mon, Jun 19, 2017 at 07:24:

So what you guys want is for me to detect that no opal/pmix framework components could run, detect that we are in a slurm job, and so print out an error message saying “hey dummy - you didn’t configure us with slurm pmi support”? It means embedding slurm job detection code in the heart of ORTE (as opposed to in a component), which bothers me a bit.
As an alternative, what if I print out a generic “you didn’t configure us with pmi support for this environment” instead of the “pmix select failed” message? I can mention how to configure the support in a general way, but it avoids having to embed slurm detection into ORTE outside of a component.

> On Jun 16, 2017, at 8:39 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
>
> +1 on the error message.
>
>> On Jun 16, 2017, at 10:06 AM, Howard Pritchard <hpprit...@gmail.com> wrote:
>>
>> Hi Ralph
>>
>> I think a helpful error message would suffice.
>>
>> Howard
>>
>> r...@open-mpi.org <r...@open-mpi.org> wrote on Tue, Jun 13, 2017 at 11:15:
>> Hey folks
>>
>> Brian brought this up today on the call, so I spent a little time investigating. After installing SLURM 17.02 (with just --prefix as config args), I configured OMPI with just --prefix config args. Getting an allocation and then executing “srun ./hello” failed, as expected.
>>
>> However, configuring OMPI --with-pmi=<path-to-slurm> resolved the problem. SLURM continues to default to PMI-1, and so we pick that option up and use it. Everything works fine.
>>
>> FWIW: I also went back and checked using SLURM 15.08 and got the identical behavior.
>>
>> So the issue is: we don’t pick up PMI support by default, and never have due to the SLURM license issue. Thus, we have always required that the user explicitly configure --with-pmi so they take responsibility for the license. This is an acknowledged way of avoiding having GPL pull OMPI under its umbrella as it is the user, and not the OMPI community, that is making the link.
>>
>> I’m not sure there is anything we can or should do about this, other than perhaps providing a nicer error message. Thoughts?
>> Ralph
>>
>> _______________________________________________
>> devel mailing list
>> devel@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
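For anyone skimming the thread later, here is a rough sketch of the two options Ralph describes: naming Slurm explicitly after detecting a Slurm job, versus printing a generic "no PMI support for this environment" message. This is not ORTE code. The function names and the message wording are made up for illustration, and the Slurm check assumes that Slurm exports SLURM_JOB_ID into the job environment. The only configure option mentioned is --with-pmi, as used earlier in the thread.

```c
/* Illustrative sketch only: none of these functions exist in Open MPI/ORTE. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the given job id string is present and non-empty, else 0. */
int looks_like_slurm_job(const char *job_id)
{
    return job_id != NULL && strlen(job_id) > 0;
}

/* Assumption: a Slurm job has SLURM_JOB_ID set in its environment. */
int in_slurm_job(void)
{
    return looks_like_slurm_job(getenv("SLURM_JOB_ID"));
}

/* Format the message to show when no opal/pmix component could be selected.
 * in_slurm != 0 gives the Slurm-specific wording (first option in the thread);
 * in_slurm == 0 gives the generic wording (the alternative).
 * Returns whatever snprintf returns. The message text is hypothetical. */
int pmi_missing_message(char *buf, size_t len, int in_slurm)
{
    assert(buf != NULL && len > 0);
    if (in_slurm) {
        return snprintf(buf, len,
            "Open MPI was not configured with Slurm PMI support; "
            "reconfigure with --with-pmi=<path-to-slurm>.");
    }
    return snprintf(buf, len,
        "Open MPI was not configured with PMI support for this environment; "
        "reconfigure with the option that matches your launcher "
        "(for example --with-pmi=<path>).");
}
```

Calling pmi_missing_message(buf, sizeof buf, 0) corresponds to the generic alternative and needs no environment detection at all, which is the point of that option: the Slurm-specific branch is the only part that would pull in_slurm_job() style detection into code outside a component.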