In the future, can you please mail just one of the lists?  This particular 
question is more of a users-list question (since we're not talking about the 
internals of Open MPI itself), so I'll reply only on the users list.

For what it's worth, I'm unable to replicate your error:


$ mpirun --version
mpirun (Open MPI) 5.0.0rc9

Report bugs to https://www.open-mpi.org/community/help/
$ cat hostfile
mpi002 slots=1
mpi005 slots=1
$ mpirun -n 2 --machinefile hostfile hostname
mpi002
mpi005

Can you try running with "--mca rmaps_base_verbose 100" so that we can get some 
debugging output and see why the slots aren't working for you?  Show the full 
output, like I did above (e.g., cat the hostfile, and then mpirun with the MCA 
param and all the output).  Thanks!
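
If it helps, here is a rough sketch of one way to collect all of that in a single log file. The variable names HOSTFILE and LOG and the helper functions count_slots and capture_debug are made up for illustration; only the mpirun options are the ones mentioned in this thread. The slot count only sums the explicit slots=N clauses in the hostfile text, so it is a quick sanity check and not a reflection of what the mapper itself computes.

```shell
#!/bin/sh
# Sketch only: HOSTFILE and LOG are placeholder names, not anything Open MPI requires.
HOSTFILE="${HOSTFILE:-hostfile}"
LOG="${LOG:-rmaps_verbose.log}"

# Sum the explicit slots=N clauses in a hostfile. Lines without a
# slots=N clause are skipped, so this only checks the file text.
count_slots() {
    awk '{
        for (i = 2; i <= NF; i++)
            if ($i ~ /^slots=[0-9]+$/) { split($i, kv, "="); total += kv[2] }
    } END { print total + 0 }' "$1"
}

# Run the same commands as the session above, plus the verbose MCA
# parameter, and keep stdout and stderr together in one log file.
capture_debug() {
    {
        echo "\$ mpirun --version"
        mpirun --version
        echo "\$ cat $1"
        cat "$1"
        echo "\$ mpirun -n 2 --machinefile $1 --mca rmaps_base_verbose 100 hostname"
        mpirun -n 2 --machinefile "$1" --mca rmaps_base_verbose 100 hostname
    } > "$2" 2>&1
}

# Only run when both the hostfile and mpirun are actually present.
if [ -f "$HOSTFILE" ] && command -v mpirun >/dev/null 2>&1; then
    echo "explicit slots in $HOSTFILE: $(count_slots "$HOSTFILE")"
    capture_debug "$HOSTFILE" "$LOG"
    echo "output saved to $LOG"
fi
```

Running it as, say, "HOSTFILE=hosts sh capture.sh" (the script name is arbitrary) should leave everything in rmaps_verbose.log, which can then be pasted into a reply.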

--
Jeff Squyres
jsquy...@cisco.com
________________________________
From: devel <devel-boun...@lists.open-mpi.org> on behalf of mrlong via devel 
<de...@lists.open-mpi.org>
Sent: Monday, November 7, 2022 3:37 AM
To: de...@lists.open-mpi.org <de...@lists.open-mpi.org>; Open MPI Users 
<users@lists.open-mpi.org>
Cc: mrlong <mrlong...@gmail.com>
Subject: [OMPI devel] There are not enough slots available in the system to 
satisfy the 2, slots that were requested by the application


Two machines, each with 64 cores. The contents of the hosts file are:

192.168.180.48 slots=1
192.168.60.203 slots=1

Why do I get the following error when running with Open MPI 5.0.0rc9?

(py3.9) [user@machine01 share]$  mpirun -n 2 --machinefile hosts hostname
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 2
slots that were requested by the application:

  hostname

Either request fewer procs for your application, or make more slots
available for use.

A "slot" is the PRRTE term for an allocatable unit where we can
launch a process.  The number of slots available are defined by the
environment in which PRRTE processes are run:

  1. Hostfile, via "slots=N" clauses (N defaults to number of
     processor cores if not provided)
  2. The --host command line parameter, via a ":N" suffix on the
     hostname (N defaults to 1 if not provided)
  3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
  4. If none of a hostfile, the --host command line parameter, or an
     RM is present, PRRTE defaults to the number of processor cores

In all the above cases, if you want PRRTE to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.

Alternatively, you can use the --map-by :OVERSUBSCRIBE option to ignore the
number of available slots when deciding the number of processes to
launch.
