SLURM seems to be doing this in the case of a regular srun:

[brent@node1 mpi]$ srun -N 2 -n 4 env | egrep 'SLURM_NODEID|SLURM_PROCID|SLURM_LOCALID' | sort
SLURM_LOCALID=0
SLURM_LOCALID=0
SLURM_LOCALID=1
SLURM_LOCALID=1
SLURM_NODEID=0
SLURM_NODEID=0
SLURM_NODEID=1
SLURM_NODEID=1
SLURM_PROCID=0
SLURM_PROCID=1
SLURM_PROCID=2
SLURM_PROCID=3
[brent@node1 mpi]$

Since srun is not currently supported by OpenMPI, I have to use salloc - right?
In that case, it is up to OpenMPI to interpret the SLURM environment variables
it sees in the one process that is launched and 'do the right thing' - whatever
that means here.  How does OpenMPI start the processes on the remote nodes
under the covers?  (Does it use srun, or generate a hostfile and launch as it
would outside SLURM, ...?)  This may be the difference between HP-MPI and
OpenMPI.
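
(I suppose I could answer that last question myself by turning up the launcher
verbosity - assuming Open MPI's plm_base_verbose MCA parameter reports which
launch component gets used; that parameter is my guess here, not something
anyone confirmed in this thread:

    salloc -N 2 -n 4 mpirun --mca plm_base_verbose 5 ./printenv.openmpi

If the output mentions the slurm plm component, the remote daemons are being
started via srun under the covers rather than via a hostfile/ssh-style launch.)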

Thanks,

Brent


From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Ralph Castain
Sent: Wednesday, February 23, 2011 10:07 AM
To: Open MPI Users
Subject: Re: [OMPI users] SLURM environment variables at runtime

Resource managers generally frown on the idea of any program passing RM-managed 
envars from one node to another, and this is certainly true of slurm. The 
reason is that the RM reserves those values for its own use when managing 
remote nodes. For example, if you got an allocation and then used mpirun to 
launch a job across only a portion of that allocation, and then ran another 
mpirun instance in parallel on the remainder of the nodes, the slurm envars for 
those two mpirun instances -need- to be quite different. Having mpirun forward 
the values it sees would cause the system to become very confused.

We learned the hard way never to cross that line :-(

You have two options:

(a) you could get your sys admin to configure slurm correctly to provide your
desired envars on the remote nodes. This is the recommended (by slurm and other
RMs) way of getting what you requested. It is a simple configuration option -
if he needs help, he should contact the slurm mailing list.

(b) you can ask mpirun to do so, at your own risk. Specify each parameter with 
a "-x FOO" argument. See "man mpirun" for details. Keep an eye out for aberrant 
behavior.
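
For example, to forward a couple of job-wide values by hand (the variables
listed here are just illustrations - substitute whatever your application
needs):

    mpirun -x SLURM_NNODES -x SLURM_NPROCS ./a.out

Each -x copies the value mpirun sees on the launching node into every rank's
environment, so per-node values such as SLURM_LOCALID will not come out right
this way - hence the warning above.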

Ralph

On Wed, Feb 23, 2011 at 8:38 AM, Henderson, Brent
<brent.hender...@hp.com> wrote:
Hi Everyone, I have an OpenMPI/SLURM-specific question.

I'm using MPI as a launcher for another application I'm working on, and that
application depends on the SLURM environment variables making their way into
the a.out's environment.  This works as I need with HP-MPI/PMPI, but with
OpenMPI it appears that not all of them are set as I would like across all of
the ranks.

I have example output below from a simple a.out that just writes out the
environment it sees to a file whose name is based on the node name and rank
number.  Note that with OpenMPI, things like SLURM_NNODES and
SLURM_TASKS_PER_NODE are not set the same for ranks on the different nodes, and
things like SLURM_LOCALID are missing entirely.
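
For reference, printenv.openmpi/printenv.hpmpi are essentially the following
(a sketch of the idea rather than the exact source):

#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

extern char **environ;

int main(int argc, char **argv)
{
    int rank, size;
    char host[256], fname[512];
    FILE *fp;
    char **e;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    gethostname(host, sizeof(host));

    printf("Hello world! I'm %d of %d on %s\n", rank, size, host);

    /* one file per rank, named <node>.<rank>.of.<size>, holding the full
       environment that this rank actually sees */
    snprintf(fname, sizeof(fname), "%s.%d.of.%d", host, rank, size);
    fp = fopen(fname, "w");
    if (fp != NULL) {
        for (e = environ; *e != NULL; e++)
            fprintf(fp, "%s\n", *e);
        fclose(fp);
    }

    MPI_Finalize();
    return 0;
}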

So the question is: should the processes on the remote nodes (from the
perspective of where the job is launched) see the full set of SLURM environment
variables that is seen on the launching node?

Thanks,

Brent Henderson

[brent@node2 mpi]$ rm node*
[brent@node2 mpi]$ mkdir openmpi hpmpi
[brent@node2 mpi]$ salloc -N 2 -n 4 mpirun ./printenv.openmpi
salloc: Granted job allocation 23
Hello world! I'm 3 of 4 on node1
Hello world! I'm 2 of 4 on node1
Hello world! I'm 1 of 4 on node2
Hello world! I'm 0 of 4 on node2
salloc: Relinquishing job allocation 23
[brent@node2 mpi]$ mv node* openmpi/
[brent@node2 mpi]$ egrep 'NODEID|NNODES|LOCALID|NODELIST|NPROCS|PROCID|TASKS_PER' openmpi/node1.3.of.4
SLURM_JOB_NODELIST=node[1-2]
SLURM_NNODES=1
SLURM_NODELIST=node[1-2]
SLURM_TASKS_PER_NODE=1
SLURM_NPROCS=1
SLURM_STEP_NODELIST=node1
SLURM_STEP_TASKS_PER_NODE=1
SLURM_NODEID=0
SLURM_PROCID=0
SLURM_LOCALID=0
[brent@node2 mpi]$ egrep 'NODEID|NNODES|LOCALID|NODELIST|NPROCS|PROCID|TASKS_PER' openmpi/node2.1.of.4
SLURM_JOB_NODELIST=node[1-2]
SLURM_NNODES=2
SLURM_NODELIST=node[1-2]
SLURM_TASKS_PER_NODE=2(x2)
SLURM_NPROCS=4
[brent@node2 mpi]$


[brent@node2 mpi]$ /opt/hpmpi/bin/mpirun -srun -N 2 -n 4 ./printenv.hpmpi
Hello world! I'm 2 of 4 on node2
Hello world! I'm 3 of 4 on node2
Hello world! I'm 0 of 4 on node1
Hello world! I'm 1 of 4 on node1
[brent@node2 mpi]$ mv node* hpmpi/
[brent@node2 mpi]$ egrep 'NODEID|NNODES|LOCALID|NODELIST|NPROCS|PROCID|TASKS_PER' hpmpi/node1.1.of.4
SLURM_NODELIST=node[1-2]
SLURM_TASKS_PER_NODE=2(x2)
SLURM_STEP_NODELIST=node[1-2]
SLURM_STEP_TASKS_PER_NODE=2(x2)
SLURM_NNODES=2
SLURM_NPROCS=4
SLURM_NODEID=0
SLURM_PROCID=1
SLURM_LOCALID=1
[brent@node2 mpi]$ egrep 'NODEID|NNODES|LOCALID|NODELIST|NPROCS|PROCID|TASKS_PER' hpmpi/node2.3.of.4
SLURM_NODELIST=node[1-2]
SLURM_TASKS_PER_NODE=2(x2)
SLURM_STEP_NODELIST=node[1-2]
SLURM_STEP_TASKS_PER_NODE=2(x2)
SLURM_NNODES=2
SLURM_NPROCS=4
SLURM_NODEID=1
SLURM_PROCID=3
SLURM_LOCALID=1
[brent@node2 mpi]$

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
