Yeah, I don't think the slurm bindings will work for you. The problem is that
the --cpus-per-task directive gets applied to the launch of our daemon, not to
the application procs, so what you've done is bind our daemon to 3 cpus. This
has nothing to do with the OMPI-Slurm integration - you told slurm to bind any
process it launches to 3 cpus, and the only "processes" slurm launches are our
daemons.

The only way to get what you want is to have slurm make the allocation without
specifying --cpus-per-task, and then have mpirun do the pe=N mapping.
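
Something like the following is what I have in mind - just a sketch, untested
on your setup. The --exclusive flag is my assumption (so the allocation, and
hence the cgroup, covers all 24 cores); adjust to taste:

# allocation only - note there is no --cpus-per-task; --exclusive is an assumption
sbatch -N2 -n 8 --ntasks-per-node=4 --exclusive -w node1,node2 program.sbatch

# inside program.sbatch, let mpirun handle the mapping/binding
mpirun -n $SLURM_NTASKS --map-by node:pe=3 --bind-to core --report-bindings program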


> On Jan 30, 2015, at 8:20 AM, Michael Di Domenico <mdidomeni...@gmail.com> 
> wrote:
> 
> I'm trying to get slurm and openmpi to cooperate when running
> multi-threaded jobs.  I'm sure I'm doing something wrong, but I can't
> figure out what.
> 
> My node configuration is:
> 
> 2 nodes
> 2 sockets
> 6 cores per socket
> 
> I want to run:
> 
> sbatch -N2 -n 8 --ntasks-per-node=4 --cpus-per-task=3 -w node1,node2
> program.sbatch
> 
> Inside program.sbatch I'm calling openmpi:
> 
> mpirun -n $SLURM_NTASKS --report-bindings program
> 
> When the bindings report comes out I get:
> 
> node1 rank 0 socket 0 core 0
> node1 rank 1 socket 1 core 6
> node1 rank 2 socket 0 core 1
> node1 rank 3 socket 1 core 7
> node2 rank 4 socket 0 core 0
> node2 rank 5 socket 1 core 6
> node2 rank 6 socket 0 core 1
> node2 rank 7 socket 1 core 7
> 
> That's semi-fine, but when the job runs, the resulting threads from
> the program are locked (according to top) to those eight cores rather
> than spreading themselves over the 24 cores available.
> 
> I tried a few incantations of map-by, bind-to, etc., but openmpi
> basically complained about everything I tried for one reason or
> another.
> 
> My understanding is that slurm should be passing the requested config
> to openmpi (or openmpi is pulling it from the environment somehow) and
> it should magically work.
> 
> If I skip slurm and run:
> 
> mpirun -n 8 --map-by node:pe=3 -bind-to core -host node1,node2
> --report-bindings program
> 
> node1 rank 0 socket 0 core 0
> node2 rank 1 socket 0 core 0
> node1 rank 2 socket 0 core 3
> node2 rank 3 socket 0 core 3
> node1 rank 4 socket 1 core 6
> node2 rank 5 socket 1 core 6
> node1 rank 6 socket 1 core 9
> node2 rank 7 socket 1 core 9
> 
> I do get the behavior I want (though I would prefer a -npernode switch
> in there, but openmpi complains).  The bindings look better and the
> threads are not locked to particular cores.
> 
> Therefore I'm pretty sure this is a problem between openmpi and slurm
> and not necessarily with either individually.
> 
> I did compile openmpi with the slurm support switch, and we're using
> the cgroups taskplugin within slurm.
> 
> I guess ancillary to this: is there a way to turn off the core
> binding/placement routines and control the placement manually?
