Hi Junjun,

On Mon, Jan 23, 2017 at 12:04:17AM -0800, liu junjun wrote:

> Hi all,
> 
> I have a small MPI test program that just prints the rank id of a parallel job.
> The output is like this:
> >mpirun -n 2 ./mpitest
> Hello world: rank 0 of 2 running on cddlogin
> Hello world: rank 1 of 2 running on cddlogin
> 
> I ran this test program with salloc. It produces similar output:
> >salloc -n 2
> salloc: Granted job allocation 3605
> >mpirun -n 2 ./mpitest
> Hello world: rank 0 of 2 running on cdd001
> Hello world: rank 1 of 2 running on cdd001
> 
> I put this one-line command into a bash script for running with sbatch. It
> also gets the same result, as expected. However, it is totally different if
> it is run with srun:
> >srun -n 2 mpirun -n 2 ./mpitest
> Hello world: rank 0 of 2 running on cdd001
> Hello world: rank 1 of 2 running on cdd001
> Hello world: rank 0 of 2 running on cdd001
> Hello world: rank 1 of 2 running on cdd001

That looks like the expected behaviour when you call both srun and mpirun: srun
starts 2 tasks, and each task runs its own 'mpirun -n 2', so you get 4 lines of
output. I have never tried it myself, though.

It's not recommended to run your code like that. Basically, don't call both srun
and mpirun! In your sbatch script, put either:

  #SBATCH -n 2
  ....
  mpirun ./mpitest


..or:


  #SBATCH -n 2
  ....
  srun ./mpitest


You don't need both. It's also simpler not to repeat the '-n 2' in the
mpirun/srun line, as that invites copy/paste errors when you change it in the
'#SBATCH' line but not below.
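
As a rough sketch of the second option, the snippet below writes out a complete
job script in which the task count appears only once. The file name job.sh is
just a placeholder, and it assumes that srun inside the batch job picks up the
task count from the '#SBATCH -n' line, which is its default behaviour as far as
I know.

```shell
# Write a minimal job script; 'job.sh' is a hypothetical file name.
cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH -n 2
# The task count is set once, in the directive above.
# srun is assumed to inherit it from the job allocation.
srun ./mpitest
EOF

# Submit it on a Slurm cluster with:  sbatch job.sh
```

Changing the number of tasks then means editing a single line.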

> The test program was invoked twice ($SLURM_NTASKS), each time asking for 2
> ($SLURM_NTASKS) CPUs for the MPI program!!

Yes.
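
To make the doubling concrete, here is a small simulation in plain shell. It
does not call Slurm or MPI at all: the outer loop stands in for 'srun -n 2'
starting one copy of the command per task, and the inner loop stands in for the
'mpirun -n 2' that each copy runs. The function name nested_launch is made up
for this illustration.

```shell
# Simulation only: plain shell loops stand in for srun and mpirun,
# so this runs without Slurm or MPI installed.
nested_launch() {
  ntasks=$1
  task=0
  while [ "$task" -lt "$ntasks" ]; do      # srun -n N: one copy of the command per task
    rank=0
    while [ "$rank" -lt "$ntasks" ]; do    # each copy runs its own mpirun -n N
      echo "Hello world: rank $rank of $ntasks"
      rank=$((rank + 1))
    done
    task=$((task + 1))
  done
}

nested_launch 2
```

With 2 tasks this prints 4 lines, ranks 0 and 1 twice over, which matches the
pattern in your srun + mpirun output.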

> The problem of srun is actually not about mpi:
> >srun -n 2 echo "Hello"
> Hello
> Hello
> 
> How can I resolve the problem of srun, and let it behave like sbatch or
> salloc, where the program is executed only once?
> 
> The version of slurm is 16.05.3, and

Thanks,
Paddy

-- 
Paddy Doyle
Trinity Centre for High Performance Computing,
Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
Phone: +353-1-896-3725
http://www.tchpc.tcd.ie/
