We are seeing a gaping memory leak when running OpenMPI 3.1.x (or 2.1.2, for that matter) built with UCX support. The leak shows up whether the “ucx” PML is specified for the run or not. The applications in question are arepo and gizmo, but I have no reason to believe that others are not affected as well.
Basically, the MPI processes grow without bound until SLURM kills the job or the host memory is exhausted. If I configure and build with “--without-ucx”, the problem goes away.

I didn’t see anything about this on the UCX GitHub site, so I thought I’d ask here. Anyone else seeing the same or similar? What version of UCX is OpenMPI 3.1.x tested against?

Regards,

Charlie Taylor
UF Research Computing

Details:
--------
RHEL7.5
OpenMPI 3.1.2 (and any other version I’ve tried)
ucx 1.2.2-1.el7 (RH native)
RH native IB stack
Mellanox FDR/EDR IB fabric
Intel Parallel Studio 2018.1.163

Configuration Options:
----------------------
CFG_OPTS=""
CFG_OPTS="$CFG_OPTS C=icc CXX=icpc FC=ifort FFLAGS=\"-O2 -g -warn -m64\" LDFLAGS=\"\" "
CFG_OPTS="$CFG_OPTS --enable-static"
CFG_OPTS="$CFG_OPTS --enable-orterun-prefix-by-default"
CFG_OPTS="$CFG_OPTS --with-slurm=/opt/slurm"
CFG_OPTS="$CFG_OPTS --with-pmix=/opt/pmix/2.1.1"
CFG_OPTS="$CFG_OPTS --with-pmi=/opt/slurm"
CFG_OPTS="$CFG_OPTS --with-libevent=external"
CFG_OPTS="$CFG_OPTS --with-hwloc=external"
CFG_OPTS="$CFG_OPTS --with-verbs=/usr"
CFG_OPTS="$CFG_OPTS --with-libfabric=/usr"
CFG_OPTS="$CFG_OPTS --with-ucx=/usr"
CFG_OPTS="$CFG_OPTS --with-verbs-libdir=/usr/lib64"
CFG_OPTS="$CFG_OPTS --with-mxm=no"
CFG_OPTS="$CFG_OPTS --with-cuda=${HPC_CUDA_DIR}"
CFG_OPTS="$CFG_OPTS --enable-openib-udcm"
CFG_OPTS="$CFG_OPTS --enable-openib-rdmacm"
CFG_OPTS="$CFG_OPTS --disable-pmix-dstore"

rpmbuild --ba \
    --define '_name openmpi' \
    --define "_version $OMPI_VER" \
    --define "_release ${RELEASE}" \
    --define "_prefix $PREFIX" \
    --define '_mandir %{_prefix}/share/man' \
    --define '_defaultdocdir %{_prefix}' \
    --define 'mflags -j 8' \
    --define 'use_default_rpm_opt_flags 1' \
    --define 'use_check_files 0' \
    --define 'install_shell_scripts 1' \
    --define 'shell_scripts_basename mpivars' \
    --define "configure_options $CFG_OPTS " \
    openmpi-${OMPI_VER}.spec 2>&1 | tee rpmbuild.log
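
For anyone wanting to compare numbers, here is a rough sketch of one way to record the per-process growth described above on a compute node. It is not part of our actual setup: the helper names `rss_kb` and `watch_rss` are made up for illustration, and it assumes Linux with /proc mounted and bash available.

```shell
#!/usr/bin/env bash
# Illustrative helpers (hypothetical names), Linux only: they read /proc/<pid>/status.

# Print the resident set size of a PID in kB (the VmRSS field).
rss_kb() {
    awk '/^VmRSS:/ {print $2}' "/proc/$1/status"
}

# Sample a PID's RSS: watch_rss PID [INTERVAL_SECONDS] [SAMPLE_COUNT]
# Prints one "epoch_seconds rss_kb" line per sample and stops early if the process exits.
watch_rss() {
    local pid=$1 interval=${2:-10} count=${3:-6}
    local i
    for ((i = 0; i < count; i++)); do
        [ -r "/proc/$pid/status" ] || break
        printf '%s %s\n' "$(date +%s)" "$(rss_kb "$pid")"
        sleep "$interval"
    done
}
```

For example, `watch_rss "$(pgrep -n arepo)" 10 60` would sample the most recently started arepo rank every 10 seconds for 10 minutes (the process name is only a guess at what the binary is called on a given system). An RSS column that keeps climbing across samples in the UCX-enabled build, but stays flat in the “--without-ucx” build, would match the behavior reported here.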