We (the Dakota project at Sandia) have gotten a lot of mileage out of “tiling”
multiple MPI executions within a SLURM allocation using the relative host
indexing options, e.g., mpirun -host +n2,+n3. (Thanks for the feature!)
However, it’s been almost exclusively with openmpi-1.x.
I’m now attempting to use relative host indexing on a CTS-1/TOSS3 machine
under SLURM with an openmpi-4.1.1 application. What I’m seeing (in contrast,
I believe, to 3.x and earlier) is that when relative host indexing is used,
the default number of slots per node is no longer taken from
SLURM_TASKS_PER_NODE or the available cores, and I get the (helpful) error
appended below.
From the message and the docs I understand the default is to assume one slot
(N=1) per host if not specified. It seems I can work around this with any of:
-host +n0,+n0,...,+n0   # +n0 repeated ${SLURM_TASKS_PER_NODE} times, or
                        # num_mpi_tasks times if that is less than cores per
                        # node (sketch of scripting this below)
-host mz52:${SLURM_TASKS_PER_NODE}   # or mz52:num_mpi_tasks
--oversubscribe
But -host +n0:${SLURM_TASKS_PER_NODE} does not work. Should it?
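For reference, the repeated-+n0 workaround is easy to generate in the dispatch
script. A rough sketch of what ours does (SLOTS_PER_JOB=8, the host-list
variable name, and the config0 job are just illustrative values taken from the
example below):

  # Build "+n0,+n0,...,+n0" with one entry per rank of this job.
  SLOTS_PER_JOB=8
  HOSTLIST=$(printf '+n0,%.0s' $(seq 1 $SLOTS_PER_JOB))  # leaves a trailing comma
  HOSTLIST=${HOSTLIST%,}                                 # strip it
  mpirun -n $SLOTS_PER_JOB --bind-to none -host $HOSTLIST mpi_hello.exe config0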
Our typical use case might look like 2 nodes with 16 cores each, where we want
to start 4 MPI jobs of 8 cores each. For example, if we don’t worry about
pinning tasks, our dispatch script is effectively doing:
mpirun -n 8 --bind-to none -host +n0 mpi_hello.exe config0
mpirun -n 8 --bind-to none -host +n0 mpi_hello.exe config1
mpirun -n 8 --bind-to none -host +n1 mpi_hello.exe config2
mpirun -n 8 --bind-to none -host +n1 mpi_hello.exe config3
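In our workflow those launches run concurrently, so a minimal sketch of the
backgrounded form (job sizes and config names purely illustrative):

  # Tile four 8-way jobs across the 2 x 16-core allocation, two per node.
  mpirun -n 8 --bind-to none -host +n0 mpi_hello.exe config0 &
  mpirun -n 8 --bind-to none -host +n0 mpi_hello.exe config1 &
  mpirun -n 8 --bind-to none -host +n1 mpi_hello.exe config2 &
  mpirun -n 8 --bind-to none -host +n1 mpi_hello.exe config3 &
  wait   # block until all four tiled jobs have finished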
Is there a way I can specify the slots when using the relative node indexing?
Or, ideally, is there another way to do this where I wouldn’t have to worry
about slots at all and they would default to cores per node or to the SLURM
job configuration? (I’d rather not use hostfiles or hostnames if possible, as
the relative node command line options integrate super smoothly.) If not via
the command line, even an environment variable (e.g., OMPI_MCA_rmaps_*) would
not be too hard to build into the workflow; see the sketch below.
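For example, if an rmaps parameter covered this, exporting it once at the top
of the dispatch script would be easy. A sketch using the only one I know of,
rmaps_base_oversubscribe, which I believe is just the env-var form of
--oversubscribe rather than the slots default I’m really after:

  # Set once instead of adding --oversubscribe to every mpirun line
  # (assumes rmaps_base_oversubscribe behaves the same as the flag).
  export OMPI_MCA_rmaps_base_oversubscribe=1
  mpirun -n 8 --bind-to none -host +n0 mpi_hello.exe config0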
Thanks for any insights,
Brian
$ mpirun -n 2 -host +n0 /bin/hostname
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 2
slots that were requested by the application:
/bin/hostname
Either request fewer slots for your application, or make more slots
available for use.
A "slot" is the Open MPI term for an allocatable unit where we can
launch a process. The number of slots available are defined by the
environment in which Open MPI processes are run:
1. Hostfile, via "slots=N" clauses (N defaults to number of
processor cores if not provided)
2. The --host command line parameter, via a ":N" suffix on the
hostname (N defaults to 1 if not provided)
3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
4. If none of a hostfile, the --host command line parameter, or an
RM is present, Open MPI defaults to the number of processor cores
In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.
Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.