Hi Christoph,

You need to use Slurm's PMI2 support to test the MPI_Comm_spawn primitive of mpich2.

In more detail, you have to rebuild your mpich2, adding the following flags to your configure line:

--enable-pmiport --with-pmi=pmi2 --with-slurm=$YOUR_SLURM


When you run jobs with Slurm, you need to pass the following option to srun:

--mpi=pmi2
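
Put together, the two steps might look like the sketch below. It only prints the commands (a dry run) so you can check them before running anything. The default value of YOUR_SLURM, the -N 2 node count and the ./spawn_tst path are assumptions taken from your message or made up for illustration; adjust them to your installation.

```shell
#!/bin/sh
# Dry-run sketch: assemble the configure and srun command lines and print them.

# Assumption: Slurm install prefix; replace with your actual location.
YOUR_SLURM="${YOUR_SLURM:-/usr}"

# Configure flags as given above (note the space before --with-slurm).
CONFIGURE_CMD="./configure --enable-pmiport --with-pmi=pmi2 --with-slurm=$YOUR_SLURM"

# Launch through srun with the PMI2 plugin selected.
# Assumption: 2 nodes and a test binary named spawn_tst in the current directory.
SRUN_CMD="srun --mpi=pmi2 -N 2 ./spawn_tst"

echo "$CONFIGURE_CMD"
echo "$SRUN_CMD"
```

Once the printed lines look right, run the configure line in your mpich2 source tree, rebuild and reinstall, and then launch the test with the srun line.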

By the way, since you are testing MPI_Comm_spawn, you may be interested in the following Slurm feature, which allows jobs to change their size while they are executing:

http://lists.schedmd.com/cgi-bin/dada/mail.cgi/r/slurmdev/974238636614/

Best Regards,
Yiannis

On 03/26/2013 09:18 AM, Christoph Sprenger wrote:
Hi,

i've been trying to test the MPI_Comm_spawn interface with slurm and
pm=srun, as opposed to hydra provided by mpich2.

i've rebuilt my mpich2 version with these flags:
./configure --with-pmi=slurm --with-pm=no --enable-shared

whatever i've tried so far i can't get it to spawn new commands for me:

int err = MPI_Comm_spawn(cmd, &char_argv[0], 1, NULL, 0, mpi_mgr.comm(),
&intracomm, MPI_ERRCODES_IGNORE);

results in:

Fatal error in MPI_Comm_spawn: Other MPI error, error stack:
MPI_Comm_spawn(144)...........: MPI_Comm_spawn(cmd="spawn_tst",
argv=0x7faea00276b0, maxprocs=1, info=0x9c000000, root=0,
MPI_COMM_WORLD, intercomm=0x7fff96511850, errors=(nil)) failed
MPIDI_Comm_spawn_multiple(240): PMI_Spawn_multiple returned -1

all the code works fine via hydra. i'm curious if people are using the srun
pm with MPI_Comm_spawn successfully, or if there are some caveats/known
issues i need to look out for? i can't seem to make even the basic
examples work, so i am sure i must be doing something wrong.

i'm using slurm-2.5.4 and mpich2-1.5rc1

any help would be highly appreciated.

srun -N 2 -B '*:*:*' --exclusive mycmd args


Kind Regards,
Christoph

