Dear Rolf, your suggestion works!
$ mpirun -np 4 --map-by ppr:1:socket -bind-to core --mca coll ^ml osu_alltoall
# OSU MPI All-to-All Personalized Exchange Latency Test v4.2
# Size       Avg Latency(us)
1                       8.02
2                       2.96
4                       2.91
8                       2.91
16                      2.96
32                      3.07
64                      3.25
128                     3.74
256                     3.85
512                     4.11
1024                    4.79
2048                    5.91
4096                   15.84
8192                   24.88
16384                  35.35
32768                  56.20
65536                  66.88
131072                114.89
262144                209.36
524288                396.12
1048576               765.65

Can you clarify exactly where the problem comes from?

Regards,
Filippo

On Mar 4, 2014, at 12:17 AM, Rolf vandeVaart <rvandeva...@nvidia.com> wrote:

> Can you try running with --mca coll ^ml and see if things work?
>
> Rolf
>
>> -----Original Message-----
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Filippo Spiga
>> Sent: Monday, March 03, 2014 7:14 PM
>> To: Open MPI Users
>> Subject: [OMPI users] 1.7.5rc1, error "COLL-ML ml_discover_hierarchy exited
>> with error."
>>
>> Dear Open MPI developers,
>>
>> I hit an expected error running OSU osu_alltoall benchmark using Open MPI
>> 1.7.5rc1. Here the error:
>>
>> $ mpirun -np 4 --map-by ppr:1:socket -bind-to core osu_alltoall
>> In bcol_comm_query hmca_bcol_basesmuma_allocate_sm_ctl_memory failed
>> In bcol_comm_query hmca_bcol_basesmuma_allocate_sm_ctl_memory failed
>> [tesla50][[6927,1],1][../../../../../ompi/mca/coll/ml/coll_ml_module.c:2996:mca_coll_ml_comm_query] COLL-ML ml_discover_hierarchy exited with error.
>>
>> [tesla50:42200] In base_bcol_masesmuma_setup_library_buffers and mpool
>> was not successfully setup!
>> [tesla50][[6927,1],0][../../../../../ompi/mca/coll/ml/coll_ml_module.c:2996:mca_coll_ml_comm_query] COLL-ML ml_discover_hierarchy exited with error.
>>
>> [tesla50:42201] In base_bcol_masesmuma_setup_library_buffers and mpool
>> was not successfully setup!
>> # OSU MPI All-to-All Personalized Exchange Latency Test v4.2
>> # Size Avg Latency(us)
>> --------------------------------------------------------------------------
>> mpirun noticed that process rank 3 with PID 4508 on node tesla51 exited on
>> signal 11 (Segmentation fault).
>> --------------------------------------------------------------------------
>> 2 total processes killed (some possibly by mpirun during cleanup)
>>
>> Any idea where this come from?
>>
>> I compiled Open MPI using Intel 12.1, latest Mellanox stack and CUDA 6.0RC.
>> Attached outputs grabbed from configure, make and run. The configure was
>>
>> export MXM_DIR=/opt/mellanox/mxm
>> export KNEM_DIR=$(find /opt -maxdepth 1 -type d -name "knem*" -print0)
>> export FCA_DIR=/opt/mellanox/fca
>> export HCOLL_DIR=/opt/mellanox/hcoll
>>
>> ../configure CC=icc CXX=icpc F77=ifort FC=ifort
>> FFLAGS="-xSSE4.2 -axAVX -ip -O3 -fno-fnalias"
>> FCFLAGS="-xSSE4.2 -axAVX -ip -O3 -fno-fnalias" --prefix=<...>
>> --enable-mpirun-prefix-by-default --with-fca=$FCA_DIR
>> --with-mxm=$MXM_DIR --with-knem=$KNEM_DIR
>> --with-cuda=$CUDA_INSTALL_PATH --enable-mpi-thread-multiple
>> --with-hwloc=internal --with-verbs 2>&1 | tee config.out
>>
>>
>> Thanks in advance,
>> Regards
>>
>> Filippo
>>
>> --
>> Mr. Filippo SPIGA, M.Sc.
>> http://www.linkedin.com/in/filippospiga ~ skype: filippo.spiga
>>
>> <Nobody will drive us out of Cantor's paradise.> ~ David Hilbert
>>
>> *****
>> Disclaimer: "Please note this message and any attachments are
>> CONFIDENTIAL and may be privileged or otherwise protected from disclosure.
>> The contents are not to be disclosed to anyone other than the addressee.
>> Unauthorized recipients are requested to preserve this confidentiality and to
>> advise the sender immediately of any error in transmission."

--
Mr. Filippo SPIGA, M.Sc.
http://www.linkedin.com/in/filippospiga ~ skype: filippo.spiga

«Nobody will drive us out of Cantor's paradise.» ~ David Hilbert