The problem here is that you have made an incorrect assumption. In the older OMPI versions, the -H option simply indicated that the specified hosts were available for use - it did not imply the number of slots on that host. Since you have specified 2 slots on each host, and you told mpirun to launch 2 procs of your second app_context (the “slave”), it filled the first node with the 2 procs.
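To make the difference concrete, here is a hedged sketch using the host names from the thread (the binary names are placeholders, and the flags are as suggested above, not verified against 1.6.5):

```shell
# Illustrative only; "master.x"/"slave.x" are placeholder program names.

# Old behavior: -H merely names usable hosts; the slot counts come from
# SLURM (2 per node), so both "slave" procs fill node02:
mpirun -H node01 -np 1 master.x : -H node02,node03 -np 2 slave.x

# Suggested workaround on 1.6.x: map exactly one proc per node:
mpirun --pernode -H node01 -np 1 master.x : -H node02,node03 -np 2 slave.x

# Newer OMPI versions let -H carry the slot count itself:
mpirun -H node01:1 -np 1 master.x : -H node02:1,node03:1 -np 2 slave.x
```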
I don’t recall the options for that old a version, but IIRC you should add --pernode to the cmd line to get exactly 1 proc/node. Or upgrade to a more recent OMPI version, where -H can also be used to specify the #slots on a node :-)

> On May 15, 2018, at 11:58 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
>
> You can try to disable SLURM:
>
> mpirun --mca ras ^slurm --mca plm ^slurm --mca ess ^slurm,slurmd ...
>
> That will require that you are able to SSH between compute nodes.
> Keep in mind this is far from ideal, since it might leave some MPI
> processes running on nodes if you cancel a job, and mess up SLURM
> accounting too.
>
> Cheers,
>
> Gilles
>
> On Wed, May 16, 2018 at 3:50 PM, Nicolas Deladerriere <nicolas.deladerri...@gmail.com> wrote:
>> Hi all,
>>
>> I am trying to run an MPI application through the SLURM job scheduler.
>> Here is my running sequence:
>>
>> sbatch --> my_env_script.sh --> my_run_script.sh --> mpirun
>>
>> In order to minimize modification of my production environment, I had
>> to set up the following host-list management in the different scripts:
>>
>> my_env_script.sh
>>
>> Builds the host list from SLURM resource manager information.
>> Example: node01 nslots=2 ; node02 nslots=2 ; node03 nslots=2
>>
>> my_run_script.sh
>>
>> Builds the host list according to the required job (process mapping
>> depends on job requirements).
>>
>> Nodes are always fully dedicated to my job, but I have to manage
>> different master-slave situations with the corresponding mpirun command:
>>
>> As many processes as slots:
>>
>> mpirun -H node01 -np 1 process_master.x : -H node02,node02,node03,node03 -np 4 process_slave.x
>>
>> Only one process per node (slots are usually used through OpenMP threading):
>>
>> mpirun -H node01 -np 1 other_process_master.x : -H node02,node03 -np 2 other_process_slave.x
>>
>> However, I realized that whatever I specify on the mpirun command line,
>> the process mapping is overridden at run time by SLURM according to the
>> SLURM settings (either the defaults or the sbatch command line). For
>> example, if I run with:
>>
>> sbatch -N 3 --exclusive my_env_script.sh myjob
>>
>> where the final mpirun command (depending on myjob) is:
>>
>> mpirun -H node01 -np 1 other_process_master.x : -H node02,node03 -np 2 other_process_slave.x
>>
>> it will run with a process mapping corresponding to:
>>
>> mpirun -H node01 -np 1 other_process_master.x : -H node02,node02 -np 2 other_process_slave.x
>>
>> So far I have not found a way to force mpirun to use the host mapping
>> from the command line instead of SLURM's. Is there a way to do this
>> (either via MCA parameters, SLURM configuration, or …)?
>>
>> openmpi version: 1.6.5
>> slurm version: 17.11.2
>>
>> Regards,
>> Nicolas
>>
>> Note 1: I know it would be better to let SLURM manage my process mapping
>> by using only SLURM parameters and not specifying the host mapping in my
>> mpirun command, but in order to minimize modification of my production
>> environment I had to use this solution.
>>
>> Note 2: I know I am using an old openmpi version!
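The host-list construction described above for my_env_script.sh can be sketched as follows. This is a minimal illustration only: the variable names, the fixed node list, and the uniform slot count are assumptions, since the actual script is not shown in the thread.

```shell
# Minimal sketch of building an mpirun -H host string from a node list.
# The node names and uniform slot count are assumptions from the example above.
nodes="node01 node02 node03"   # e.g. obtained via `scontrol show hostnames`
slots=2                        # e.g. from SLURM_CPUS_ON_NODE, assumed uniform

hostlist=""
for n in $nodes; do
  i=0
  while [ "$i" -lt "$slots" ]; do
    # Repeat each host once per slot, comma-separated, as -H expects
    hostlist="${hostlist:+$hostlist,}$n"
    i=$((i + 1))
  done
done
echo "$hostlist"   # node01,node01,node02,node02,node03,node03
```

The resulting string can then be passed directly as the -H argument of an app_context in my_run_script.sh.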
>>
>> _______________________________________________
>> users mailing list
>> users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/users