[OMPI devel] Slurm integration and rankfiles....

2021-03-11 Thread Martyn Foster via devel
Hi all,

Running a rather trivial example on a Slurm system,
mpirun -np 1 -rf rankfile ./HelloWorld
produces the following message:
--
While trying to determine what resources are available, the SLURM
resource allocator expects to find the following environment variables:

SLURM_NODELIST
SLURM_TASKS_PER_NODE

However, it was unable to find the following environment variable:

SLURM_TASKS_PER_NODE

--

(This happens with both Open MPI 4.0 and 4.1.) It is correct that the
variable is not set, but why is SLURM_TASKS_PER_NODE expected or required
when using a rankfile, where one presumes it would not be constant across
the job anyway?

Martyn
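
For anyone wanting to reproduce the condition outside of mpirun, here is a
minimal sketch in Python that looks for the two variables named in the
message above. It is not Open MPI's actual check; the function name
missing_slurm_vars is made up for illustration, and "missing" is assumed to
mean "absent from the environment".

```python
import os

# The two names are taken from the error message quoted above.
REQUIRED_VARS = ("SLURM_NODELIST", "SLURM_TASKS_PER_NODE")


def missing_slurm_vars(env=None):
    """Return the required Slurm variables that are absent from env.

    Hypothetical helper: it only mirrors the wording of the message and
    is not how Open MPI itself implements the check.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if name not in env]


if __name__ == "__main__":
    missing = missing_slurm_vars()
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("both expected Slurm variables are set")
```

Run inside the allocation with SLURM_TASKS_PER_NODE unset, this should
report the same variable that the message complains about.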


Re: [OMPI devel] Slurm integration and rankfiles....

2021-03-12 Thread Martyn Foster via devel
Hi Ralph,

Slurm is 19.05.

To be clear - it's not unexpected that SLURM_TASKS_PER_NODE is unset in the
configuration.

Martyn

On Thu, 11 Mar 2021 at 16:09, Ralph Castain via devel <
devel@lists.open-mpi.org> wrote:

> What version of Slurm is this?
>


Re: [OMPI devel] Slurm integration and rankfiles....

2021-03-21 Thread Martyn Foster via devel
Sorry for the slow reply!

I didn't want to get fixated on why the variable was unset, though I can
understand the existence of a check if Slurm always sets this (I don't
recall that being the case for all configurations historically, but perhaps
it is now). The reason I'd unset it (!) is that I was trying to build an
environment to support completely arbitrary task placement/distribution
that works with various launchers (orterun/srun/hydra), and it seems that
tasks_per_node being set was upsetting one of the others.

Slurm's internal geometry parameters can't possibly describe an arbitrary
(rankfile) layout, so I was nervous about why they would be required if a
rankfile was provided...

Martyn
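
To illustrate the point about geometry: Slurm documents
SLURM_TASKS_PER_NODE as a comma-separated list of per-node task counts, with
a "(x#)" suffix for repeated counts, e.g. "2(x3),1". The sketch below (the
helper name expand_tasks_per_node is hypothetical, and this is not code from
Open MPI or Slurm) expands that form. All it can yield is a count per node;
it says nothing about which rank lands on which host or slot, which is what
a rankfile specifies.

```python
import re


def expand_tasks_per_node(value):
    """Expand a string such as '2(x3),1' into [2, 2, 2, 1].

    Hypothetical helper, assuming the documented 'count(xrepeat)' form.
    """
    counts = []
    for item in value.split(","):
        match = re.fullmatch(r"(\d+)(?:\(x(\d+)\))?", item.strip())
        if match is None:
            raise ValueError(f"unrecognised entry: {item!r}")
        count = int(match.group(1))
        # A missing '(x#)' suffix means the count applies to one node.
        repeat = int(match.group(2)) if match.group(2) else 1
        counts.extend([count] * repeat)
    return counts


if __name__ == "__main__":
    print(expand_tasks_per_node("2(x3),1"))
```

The result is one integer per node in nodelist order, so two different
rankfiles that place the same number of ranks on each node would be
indistinguishable from this variable alone.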

On Mon, 15 Mar 2021 at 19:57, Ralph Castain via devel <
devel@lists.open-mpi.org> wrote:

> Martyn? Why are you saying SLURM_TASKS_PER_NODE might not be present?
>
> It sounds to me like something is wrong in your Slurm environment - I
> really believe that this envar is always supposed to be there.
>
>
> > On Mar 15, 2021, at 4:20 AM, Peter Kjellström wrote:
> >
> > On Fri, 12 Mar 2021 22:19:09 +
> > Ralph Castain via devel wrote:
> >
> >> Why would it not be set? AFAICT, Slurm is supposed to always set that
> >> envar, or so we've been told.
> >
> > Maybe confusion on the exact name?
> >
> > AFAIK Slurm always sets SLURM_TASKS_PER_NODE but only sets
> > SLURM_NTASKS_PER_NODE (almost the same name) when --ntasks-per-node is
> > given.
> >
> > /Peter K
>
>
>