"Loris Bennett" <[email protected]> writes:

> Dear list,
>
> I came across a bug report relating to version 2.2.4 which mentions that
> the core binding method changes when the number of tasks is the same as
> the number of cores per socket:
>
> http://comments.gmane.org/gmane.comp.distributed.slurm.devel/558
>
> We are seeing incorrect binding with version 2.2.7, which also seems to
> occur when a socket-full of tasks is started.
>
> Could this be a similar issue?
>
> I am assuming that this problem has been fixed in later versions, but
> I'm asking this in order to have arguments to encourage our vendor to
> provide us with a new version or at least backport some fixes.
>
> Cheers,
>
> Loris

Just in case anyone else is stuck with an ancient version of SLURM, or
the problem also occurs in more recent versions: we solved it by
replacing

mpiexec.hydra -psm -bootstrap slurm -n $SLURM_NPROCS a.out

with

export I_MPI_PMI_LIBRARY=/cm/shared/apps/slurm/current/lib64/libpmi.so
srun -n $SLURM_NPROCS a.out
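
To check whether the tasks end up bound as expected, something like the
following might help. It is only a minimal sketch: report_binding is a
made-up helper name (not part of SLURM or Intel MPI), and it assumes a
Linux node where /proc/self/status contains a Cpus_allowed_list line.
Each task prints the set of CPUs it is allowed to run on.

```shell
#!/bin/sh
# Sketch only: print the CPU set this task may run on, to inspect binding.
# report_binding is a hypothetical helper, not part of SLURM or Intel MPI.
report_binding() {
    # srun sets SLURM_PROCID for each task; use "?" when run outside SLURM.
    rank="${SLURM_PROCID:-?}"
    # Cpus_allowed_list reflects the affinity mask inherited from the
    # launcher (Linux only); fall back to "unknown" if it cannot be read.
    cpus=$(awk '/^Cpus_allowed_list:/ {print $2}' /proc/self/status 2>/dev/null) || cpus=""
    echo "host=$(uname -n) rank=${rank} cpus=${cpus:-unknown}"
}

report_binding
```

Saved as, say, report_binding.sh (again a hypothetical name) and started
with srun in place of a.out, this prints one line per task, so the
output of the two launch methods can be compared.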

Cheers,

Loris

-- 
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin         Email [email protected]
