Not stupid at all. I suspect the problem is that OMPI was not configured
--with-pmi=. As a result, when you srun the
application, each process thinks it is a singleton and nothing works
correctly.
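One way to check for the singleton symptom described above is to compare the task count Slurm reports with the size each rank sees. The following is a minimal sketch, not part of the original exchange: the helper name `looks_like_singleton` is hypothetical, `SLURM_NTASKS` is the environment variable that carries the task count of the srun step, and the world size is assumed to come from the MPI library (for example via mpi4py, if it is installed).

```python
import os


def looks_like_singleton(world_size, env=None):
    """Return True when Slurm launched several tasks but MPI reports size 1.

    world_size: the size of MPI_COMM_WORLD as seen by this process.
    env: mapping to read SLURM_NTASKS from (defaults to os.environ).
    """
    env = os.environ if env is None else env
    try:
        ntasks = int(env.get("SLURM_NTASKS", "1"))
    except ValueError:
        # Unparseable value: no evidence either way.
        return False
    # Several tasks were started, yet this process believes it is alone.
    return ntasks > 1 and world_size == 1
```

With mpi4py, for instance, each rank could call `looks_like_singleton(MPI.COMM_WORLD.Get_size())` and print a warning. If every rank reports True under something like `srun -n 4`, the PMI wiring is a likely culprit.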
OMPI does not pick up the Slurm PMI support by default due to licensing issues, so
you have to ma
Thanks!
Indeed, it seems to work when I build Open MPI with "--with-pmi" and
add the flag --mpi=pmi2 to the srun command.
Much appreciated!
/jon
On 10/06/2017 09:18 PM, r...@open-mpi.org wrote:
Not stupid at all. I suspect the problem is that OMPI was not configured
--with-pmi=. As
But unfortunately, the parallel job started by srun is still killed as
soon as the higher-priority job enters the queue. That is, the parallel
job is killed without any grace time (for other, serial applications,
it seems to work as I expect).
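To see whether the parallel tasks receive any warning before being killed, each rank can log the signals it gets. This is a minimal sketch, assuming (as the Slurm preemption documentation describes for GraceTime) that a preempted job is sent SIGTERM before the final SIGKILL. The helper name `install_term_logger` and the `received` list are hypothetical, and the code only helps diagnose the behavior; it does not fix it.

```python
import os
import signal
import time


def install_term_logger(received):
    """Record each SIGTERM in `received` instead of exiting immediately."""

    def handler(signum, frame):
        # Keep the handler small: note the signal, this process's PID,
        # and the arrival time so ranks can be compared afterwards.
        received.append((signum, os.getpid(), time.time()))

    signal.signal(signal.SIGTERM, handler)
    return handler
```

If serial jobs log a SIGTERM well before they die but the srun-launched ranks log nothing, the signal is presumably not reaching the MPI processes, which narrows down where to look.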
Regards,
/jon
On 10/07/2017 10:01 AM,