We (the Dakota project at Sandia) have gotten a lot of mileage out of “tiling” 
multiple MPI executions within a SLURM allocation using the relative host 
indexing options (e.g., mpirun -host +n2,+n3). (Thanks for the feature!) 
However, it’s been almost exclusively with openmpi-1.x.

I’m attempting to use the relative host indexing feature on a CTS-1/TOSS3 
machine under SLURM with an openmpi-4.1.1 application. What I’m seeing is that 
(in contrast, I believe, to 3.x and earlier) the default number of slots per 
node is no longer taken from SLURM_TASKS_PER_NODE or the available cores when 
using relative host indexing, and I get the (helpful) error shown below.

From the message and from reading the docs, I understand the default is to 
assume one slot (N=1) per host if not specified. It seems I can work around 
this with any of:
  -host +n0,+n0,...,+n0               # repeated ${SLURM_TASKS_PER_NODE} times, or 
                                      # num_mpi_tasks times if that is fewer than cores per node
  -host mz52:${SLURM_TASKS_PER_NODE}  # or :num_mpi_tasks
  --oversubscribe

But -host +n0:${SLURM_TASKS_PER_NODE} does not work. Should it?

Our typical use case might look like 2 nodes with 16 cores each, where we want 
to start 4 MPI jobs of 8 cores each. For example, if we don’t worry about 
pinning tasks, our dispatch script is effectively doing:
  mpirun -n 8 --bind-to none -host +n0 mpi_hello.exe config0
  mpirun -n 8 --bind-to none -host +n0 mpi_hello.exe config1
  mpirun -n 8 --bind-to none -host +n1 mpi_hello.exe config2
  mpirun -n 8 --bind-to none -host +n1 mpi_hello.exe config3
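
For what it’s worth, applying the repeated-host workaround above to the first 
line would look roughly like this (repeating +n0 once per task to declare 8 
slots on the first node; this is just my understanding of how the repetition 
is counted):
  mpirun -n 8 --bind-to none -host +n0,+n0,+n0,+n0,+n0,+n0,+n0,+n0 mpi_hello.exe config0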

Is there a way to specify the slot count when using the relative node indexing? 
Or, ideally, is there another way to do this where I wouldn’t have to worry 
about slots at all and the count would default to cores per node or to the 
SLURM job configuration? (I’d rather not use hostfiles or hostnames if 
possible, since the relative-node command-line options integrate super 
smoothly.) If not on the command line, even an environment variable, e.g., 
OMPI_MCA_rmaps_*, wouldn’t be too hard to build into the workflow.
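
For instance (just to illustrate the kind of setting I mean; I haven’t 
verified that this particular parameter covers the relative-indexing case), 
something along the lines of:
  export OMPI_MCA_rmaps_base_oversubscribe=1
set once in the dispatch script before the mpirun calls would be easy for us 
to adopt.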

Thanks for any insights,
Brian



$ mpirun -n 2 -host +n0 /bin/hostname

--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 2
slots that were requested by the application:

  /bin/hostname

Either request fewer slots for your application, or make more slots
available for use.

A "slot" is the Open MPI term for an allocatable unit where we can
launch a process.  The number of slots available are defined by the
environment in which Open MPI processes are run:

  1. Hostfile, via "slots=N" clauses (N defaults to number of
     processor cores if not provided)
  2. The --host command line parameter, via a ":N" suffix on the
     hostname (N defaults to 1 if not provided)
  3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
  4. If none of a hostfile, the --host command line parameter, or an
     RM is present, Open MPI defaults to the number of processor cores

In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.

Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.
