Posting this on UCX list.

On Thu, Oct 4, 2018 at 4:42 PM Charles A Taylor <chas...@ufl.edu> wrote:

>
> We are seeing a gaping memory leak when running OpenMPI 3.1.x (or 2.1.2,
> for that matter) built with UCX support. The leak shows up
> whether the “ucx” PML is specified for the run or not. The applications
> in question are arepo and gizmo, but I have no reason to believe
> that others are not affected as well.
>
> Basically the MPI processes grow without bound until SLURM kills the job
> or the host memory is exhausted.
> If I configure and build with “--without-ucx” the problem goes away.
>
> I didn’t see anything about this on the UCX GitHub site, so I thought I’d
> ask here. Is anyone else seeing the same or a similar problem?
>
> What version of UCX is OpenMPI 3.1.x tested against?
>
> Regards,
>
> Charlie Taylor
> UF Research Computing
>
> Details:
> —————————————
> RHEL7.5
> OpenMPI 3.1.2 (and any other version I’ve tried).
> ucx 1.2.2-1.el7 (RH native)
> RH native IB stack
> Mellanox FDR/EDR IB fabric
> Intel Parallel Studio 2018.1.163
>
> Configuration Options:
> —————————————————
> CFG_OPTS=""
> CFG_OPTS="$CFG_OPTS C=icc CXX=icpc FC=ifort FFLAGS=\"-O2 -g -warn -m64\"
> LDFLAGS=\"\" "
> CFG_OPTS="$CFG_OPTS --enable-static"
> CFG_OPTS="$CFG_OPTS --enable-orterun-prefix-by-default"
> CFG_OPTS="$CFG_OPTS --with-slurm=/opt/slurm"
> CFG_OPTS="$CFG_OPTS --with-pmix=/opt/pmix/2.1.1"
> CFG_OPTS="$CFG_OPTS --with-pmi=/opt/slurm"
> CFG_OPTS="$CFG_OPTS --with-libevent=external"
> CFG_OPTS="$CFG_OPTS --with-hwloc=external"
> CFG_OPTS="$CFG_OPTS --with-verbs=/usr"
> CFG_OPTS="$CFG_OPTS --with-libfabric=/usr"
> CFG_OPTS="$CFG_OPTS --with-ucx=/usr"
> CFG_OPTS="$CFG_OPTS --with-verbs-libdir=/usr/lib64"
> CFG_OPTS="$CFG_OPTS --with-mxm=no"
> CFG_OPTS="$CFG_OPTS --with-cuda=${HPC_CUDA_DIR}"
> CFG_OPTS="$CFG_OPTS --enable-openib-udcm"
> CFG_OPTS="$CFG_OPTS --enable-openib-rdmacm"
> CFG_OPTS="$CFG_OPTS --disable-pmix-dstore"
>
> rpmbuild --ba \
>          --define '_name openmpi' \
>          --define "_version $OMPI_VER" \
>          --define "_release ${RELEASE}" \
>          --define "_prefix $PREFIX" \
>          --define '_mandir %{_prefix}/share/man' \
>          --define '_defaultdocdir %{_prefix}' \
>          --define 'mflags -j 8' \
>          --define 'use_default_rpm_opt_flags 1' \
>          --define 'use_check_files 0' \
>          --define 'install_shell_scripts 1' \
>          --define 'shell_scripts_basename mpivars' \
>          --define "configure_options $CFG_OPTS " \
>          openmpi-${OMPI_VER}.spec 2>&1 | tee rpmbuild.log
>
>
>
>
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users