Hi,

I'm trying to deploy OpenMPI 4.0.5 on the university's supercomputer:

  * Debian GNU/Linux 9 (stretch)
  * Intel Corporation Omni-Path HFI Silicon 100 Series [discrete] (rev 11)

and for several days I have been hitting a bug (wrong results from
MPI_Alltoallw) on this server when using Omni-Path.

Running 4 threads on a single node with Open MPI 4.0.5 built without
Omni-Path support, the code works:

CC=$(which gcc) CXX=$(which g++) FC=$(which gfortran) ../configure \
--with-hwloc --enable-mpirun-prefix-by-default \
--prefix=/bettik/begou/OpenMPI405-noib --enable-mpi1-compatibility \
--enable-mpi-cxx --enable-cxx-exceptions --without-verbs --without-ofi \
--without-psm --without-psm2 --without-openib \
--without-slurm

With a build that uses Omni-Path, still with 4 threads on one node, the
test case fails (incorrect results):

CFLAGS="-O3 -march=native -mtune=native" \
CXXFLAGS="-O3 -march=native -mtune=native" \
FCFLAGS="-O3 -march=native -mtune=native" \
CC=$(which gcc) CXX=$(which g++) FC=$(which gfortran) ../configure \
--with-hwloc --enable-mpirun-prefix-by-default \
--prefix=/bettik/begou/OpenMPI405 --enable-mpi1-compatibility \
--enable-mpi-cxx --enable-cxx-exceptions --without-verbs

I do not understand what could be wrong, as the code runs on many
architectures with various interconnects and Open MPI versions.
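
One way to narrow this down might be to pin the transport at run time
with the Omni-Path build, rather than comparing two separate builds. The
sketch below only prints the mpirun command lines unless RUN=1 is set.
It assumes the second build picked up the psm2 and/or ofi MTL (not
verified here), and ./alltoallw_test is a placeholder name for the test
case, not the real binary.

```shell
#!/bin/sh
# Sketch: build mpirun command lines that force a given transport.
# NP and APP are illustrative; APP is a hypothetical test binary name.
NP=4
APP=./alltoallw_test

# Print the mpirun command line for one transport choice.
mpirun_cmd() {
    case "$1" in
        # Shared memory only: ob1 PML with the self and vader BTLs.
        shm)  echo "mpirun -np $NP --mca pml ob1 --mca btl self,vader $APP" ;;
        # Omni-Path through the PSM2 MTL (cm PML).
        psm2) echo "mpirun -np $NP --mca pml cm --mca mtl psm2 $APP" ;;
        # Omni-Path through libfabric, via the OFI MTL (cm PML).
        ofi)  echo "mpirun -np $NP --mca pml cm --mca mtl ofi $APP" ;;
        *)    echo "unknown transport: $1" >&2; return 1 ;;
    esac
}

for t in shm psm2 ofi; do
    cmd=$(mpirun_cmd "$t")
    echo "$cmd"
    # Execute only when explicitly requested with RUN=1.
    if [ "${RUN:-0}" = "1" ]; then
        $cmd
    fi
done
```

If the results are correct with "shm" and wrong with "psm2" or "ofi",
that would point at the MTL path rather than at the compiler flags,
which also differ between the two builds above.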

Thanks for your suggestions.

Patrick
