Sten,

Some thoughts which may or may not be of any help…


1. The slurmctld determines the number of tasks to allocate on each node.  You 
need the task/affinity plugin enabled for the slurmd to actually bind tasks to 
sockets, cores, or threads.

2. A good description of how this happens can be found here:
http://www.schedmd.com/slurmdocs/cpu_management.html

3. The code that handles the rich set of related options (some of which you 
explored below) is quite complicated and has undergone numerous improvements 
over time.

4. The way to tell whether v2.3.2 has a bug in the code you exercised is to 
compare its behavior against the latest v2.4.  If v2.4 behaves as expected, 
then v2.3.2 has a bug.
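
Regarding point 1, the relevant slurm.conf settings might look like the sketch 
below.  These parameter names are real slurm.conf options, but the combination 
shown is illustrative rather than taken from Sten's actual configuration:

```
# slurm.conf (sketch; values illustrative)
TaskPlugin=task/affinity            # lets slurmd bind tasks to sockets/cores/threads
SelectType=select/cons_res          # consumable-resource node selection
SelectTypeParameters=CR_Core_Memory # allocate by core and memory
```

Without task/affinity, the controller still computes a per-node task count, but 
the resulting tasks are not pinned to specific sockets or cores.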

Don

From: Sten Wolf [mailto:[email protected]]
Sent: Tuesday, March 13, 2012 11:41 PM
To: slurm-dev
Subject: [slurm-dev] Re: forcing tasks per socket constraint with openmpi

Got it working in the end -
sbatch -N 8 -B 2:4:1 --ntasks-per-socket=4 --ntasks-per-node=8 myapp.sh

It seems redundant to add --ntasks-per-node=8, if the node description already 
includes 2 sockets and the ntasks-per-socket is defined. Is that a bug?
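
The same options can also be embedded in the batch script as #SBATCH directives, 
which is equivalent to passing them on the sbatch command line (a sketch; the 
mpirun line is as in the original myapp.sh):

```
#!/bin/bash
#SBATCH -N 8                    # 8 nodes
#SBATCH -B 2:4:1                # per node: 2 sockets, 4 cores/socket, 1 thread/core
#SBATCH --ntasks-per-socket=4
#SBATCH --ntasks-per-node=8
mpirun --mca btl openib,self,sm $HOME/myapp
```

This keeps the resource request versioned alongside the job script instead of 
living only in shell history.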

On 14/03/2012 05:51, Sten Wolf wrote:
Hi,

I am using SLURM 2.3.2 with Open MPI 1.4.5 on dual-socket, 6-core, 
single-threaded Intel CPUs (defined in slurm.conf, along with 
SelectTypeParameters=CR_Core_Memory).

I am trying to run an MPI app on 64 logical CPUs across 8 nodes, using 4 tasks 
per socket, but everything I've tried from sbatch uses either 6 nodes (12 cores 
per node x 5, plus 4 cores on the sixth) or 8 nodes with 8 cores per node, but 
assigned 6+2 instead of 4+4.
I know I can solve this easily at the mpirun level by providing a correct 
machine file, etc., but I'm hoping I can use the scheduler to assign resources 
correctly.
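
As a sanity check on the target geometry (a small sketch, nothing 
SLURM-specific): 8 nodes with 4 tasks on each of 2 sockets should come out to 
exactly 64 tasks, matching -n 64:

```shell
#!/bin/sh
# Target geometry: 8 nodes x 2 sockets/node x 4 tasks/socket
nodes=8
sockets_per_node=2
tasks_per_socket=4
total=$((nodes * sockets_per_node * tasks_per_socket))
echo "$total"   # prints 64
```

So -N 8 with --ntasks-per-socket=4 on dual-socket nodes is internally 
consistent with the 64-task goal; the question is only how SLURM distributes 
them.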

I have created a simple batch file containing 2 lines:
$ cat myapp.sh
#!/bin/bash
mpirun --mca btl openib,self,sm $HOME/myapp

So far I have tried the following:

1. sbatch -n 64 --ntasks-per-socket=4 myapp.sh
2. sbatch -n 64 -B 2:4:1 myapp.sh
3. sbatch -N 8 -B 2:4:1 myapp.sh
4. sbatch -N 8 --ntasks-per-socket=4 myapp.sh
5. sbatch -N 8 -B 2:4:1 --ntasks-per-node=8 myapp.sh

1-4) allocate 12 tasks per node;
5) allocates 8 tasks per node (a 6+2 allocation).
What am I doing wrong?

For some reason, when using -B, I get better results (total runtime-wise), even 
though the allocation is the same as when -B is not used.
I assume -B is only a constraint (only allocate nodes that support at least the 
-B geometry), but I was hoping there was some way for SLURM to pass my 
preferences to Open MPI.

Thanks in advance

