Hi Slurm Devs,

I'm using Slurm 17.02.2 with MVAPICH2-GDR 2.2-5.


I'm having a couple of problems that seem to be both Slurm- and MPI-related.


But first, the problem I see immediately with my Slurm installation is
that it is not distributing MPI processes correctly.  When I run a
simple test with 'srun -n 64 -N 8 ./a.out', node 0 gets ranks 0 through
49, and the remaining nodes (1-7) share ranks 50 through 63.
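In case a reproducer helps, the test is essentially the following (a
minimal sketch, not my exact a.out):

    #include <mpi.h>
    #include <stdio.h>

    /* Print each rank's placement so the distribution is visible. */
    int main(int argc, char **argv)
    {
        int rank, size, len;
        char host[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(host, &len);
        printf("rank %d of %d on %s\n", rank, size, host);
        MPI_Finalize();
        return 0;
    }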

Interestingly, ranks 48 and 49 are segfaulting (somewhere in MPI_Init).

Adding the flag '--ntasks-per-node=8' allows the run to complete successfully.
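(For reference, the full invocation that works is

    srun -n 64 -N 8 --ntasks-per-node=8 ./a.out

which forces the expected 8 ranks per node.)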


Are you aware of something in my configuration or environment that would
cause this behavior?


The other behavior I see in MPI is that some collective calls are
misbehaving, in that they return without a number of the processes in
the communicator having participated.
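To be concrete, a check along these lines is the kind of test I mean (a
minimal sketch, not my exact code):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int one = 1, sum = 0, rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Every rank contributes 1, so the sum should come back
         * equal to the communicator size if all ranks took part. */
        MPI_Allreduce(&one, &sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
        if (sum != size)
            fprintf(stderr, "rank %d: allreduce saw %d of %d processes\n",
                    rank, sum, size);

        MPI_Finalize();
        return 0;
    }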


Any clues you may provide will be gratefully received!


Best -- Geof
