Hi,

At our site here at the University of Karlsruhe we are running two large clusters with SLURM and HP-MPI. For our new cluster we want to keep SLURM and switch to OpenMPI. While testing I ran into the following problem:
With HP-MPI I do something like

    srun -N 2 -n 2 -b mpirun -srun helloworld

and get

    Hi, here is process 0 of 2, running MPI version 2.0 on xc3n13.
    Hi, here is process 1 of 2, running MPI version 2.0 on xc3n14.

When I try the same with OpenMPI (version 1.2.4),

    srun -N 2 -n 2 -b mpirun helloworld

I get

    Hi, here is process 1 of 8, running MPI version 2.0 on xc3n13.
    Hi, here is process 0 of 8, running MPI version 2.0 on xc3n13.
    Hi, here is process 5 of 8, running MPI version 2.0 on xc3n14.
    Hi, here is process 2 of 8, running MPI version 2.0 on xc3n13.
    Hi, here is process 4 of 8, running MPI version 2.0 on xc3n14.
    Hi, here is process 3 of 8, running MPI version 2.0 on xc3n13.
    Hi, here is process 6 of 8, running MPI version 2.0 on xc3n14.
    Hi, here is process 7 of 8, running MPI version 2.0 on xc3n14.

and with

    srun -N 2 -n 2 -b mpirun -np 2 helloworld

I get

    Hi, here is process 0 of 2, running MPI version 2.0 on xc3n13.
    Hi, here is process 1 of 2, running MPI version 2.0 on xc3n13.

which is still wrong, because it uses only one of the two allocated nodes.

OpenMPI reads the SLURM_NODELIST and SLURM_TASKS_PER_NODE environment variables, starts one orted per node via SLURM, and then launches tasks up to the maximum number of slots on every node. So it effectively does some 'resource management' of its own and interferes with SLURM.

I can work around that with an mpirun wrapper script which calls mpirun with the right -np and the right rmaps_base_n_pernode setting (a sketch is at the end of this mail), but it gets worse. We want to allocate computing power on a per-CPU basis instead of per node, i.e. different users might share a node. In addition, SLURM can schedule according to memory usage. It is therefore important that every node runs exactly the number of tasks that SLURM assigned to it. The only solution I have come up with is to generate a detailed hostfile for every job and call mpirun --hostfile (also sketched below). Any suggestions for improvement?

I found a discussion thread, "slurm and all-srun orterun", in the mailing list archive concerning the same problem, in which Ralph Castain announced that he was working on two new launch methods that would fix my problems. Unfortunately his email address has been deleted from the archive, so it would be really nice if the friendly elf mentioned there is still around and could forward my mail to him.

Thanks in advance,
Werner Augustin
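
P.S. For reference, here is a minimal sketch of the wrapper idea, assuming srun sets SLURM_NPROCS and SLURM_NNODES in the batch script's environment, that the tasks are distributed uniformly across the nodes, and that rmaps_base_n_pernode caps the tasks mapped per node; the install path of the real mpirun is made up:

    #!/bin/sh
    # Hypothetical mpirun wrapper (a sketch, not a tested script).
    # Start exactly the number of processes SLURM allocated instead
    # of letting OpenMPI fill every slot on every node.
    # Assumes a uniform task distribution; the path is site-specific.
    pernode=$((SLURM_NPROCS / SLURM_NNODES))
    exec /opt/openmpi/bin/mpirun \
        --mca rmaps_base_n_pernode "$pernode" \
        -np "$SLURM_NPROCS" \
        "$@"

Of course this breaks down as soon as SLURM hands out an uneven number of CPUs per node, which is exactly the situation described above.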
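
P.P.S. And a sketch of the hostfile approach, assuming SLURM_TASKS_PER_NODE uses the usual compressed "2(x3),1" syntax and that scontrol is available where mpirun runs:

    #!/bin/sh
    # Hypothetical launcher: build a per-job hostfile from SLURM's
    # environment so every node gets exactly its assigned task count.
    hostfile=/tmp/hostfile.$SLURM_JOBID

    # "scontrol show hostnames" expands e.g. xc3n[13-14] to one name
    # per line; the awk script pairs each name with its entry from
    # SLURM_TASKS_PER_NODE, expanding the (xN) repeat groups.
    scontrol show hostnames "$SLURM_NODELIST" |
    awk -v tasks="$SLURM_TASKS_PER_NODE" '
    BEGIN {
        n = split(tasks, groups, ",")
        m = 0
        for (i = 1; i <= n; i++) {
            if (match(groups[i], /\(x[0-9]+\)/)) {
                count = substr(groups[i], 1, RSTART - 1)
                rep   = substr(groups[i], RSTART + 2, RLENGTH - 3)
            } else {
                count = groups[i]
                rep   = 1
            }
            for (j = 0; j < rep; j++) counts[++m] = count
        }
    }
    { print $0, "slots=" counts[NR] }
    ' > "$hostfile"

    exec mpirun --hostfile "$hostfile" -np "$SLURM_NPROCS" "$@"

This works, but having every job write and parse its own hostfile just to undo OpenMPI's mapping feels like the wrong layer to solve the problem at.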