Hi Slurm Devs,
I'm using Slurm 17.02.2 with MVAPICH2-GDR 2.2-5, and I'm hitting a couple of problems that seem to involve both Slurm and MPI.

The problem I see immediately with my Slurm installation is that it is not distributing MPI processes correctly. When I run a simple test with 'srun -n 64 -N 8 ./a.out', node 0 gets ranks 0 through 49, and the remaining seven nodes get ranks 50 through 63 between them. Interestingly, ranks 48 and 49 segfault (somewhere in MPI_Init). Adding the flag '--ntasks-per-node=8' lets the job complete successfully. Are you aware of anything in my configuration or environment that would cause this behavior?

The other behavior I see on the MPI side is that some collective calls are misbehaving, in that they return even though a number of processes in the communicator have not participated in the call.

Any clues you may provide will be gratefully received!

Best -- Geof
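For reference, my test is essentially a trivial MPI program along these lines (a sketch, not my exact a.out): each rank reports which host it landed on, and an MPI_Allreduce sums one contribution per rank so rank 0 can check that the collective actually saw the whole communicator.

```c
/* Minimal reproducer sketch: print rank placement and sanity-check
 * that a collective involves every process in MPI_COMM_WORLD. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    int one = 1, total = 0;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(host, &len);

    /* Shows how srun distributed the ranks across nodes. */
    printf("rank %d of %d on %s\n", rank, size, host);

    /* Each rank contributes 1; total should equal size if the
     * collective really included every process. */
    MPI_Allreduce(&one, &total, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
    if (rank == 0 && total != size)
        fprintf(stderr, "Allreduce saw %d of %d ranks\n", total, size);

    MPI_Finalize();
    return 0;
}
```

I launch it with 'srun -n 64 -N 8 ./a.out' (and again with '--ntasks-per-node=8' added), comparing the printed placement in the two cases.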