I tried last night's v1.8 tarball (openmpi-v1.8.3-272-g4e4f997.tar.bz2) with the Studio compilers (v12.3) on a Solaris/x86-64 system. Configure args (other than prefix) were:
   --enable-debug --with-verbs \
   CC=cc CXX=CC FC=f90 \
   CFLAGS=-m64 --with-wrapper-cflags=-m64 \
   FCFLAGS=-m64 --with-wrapper-fcflags=-m64 \
   CXXFLAGS='-m64 -library=stlport4' --with-wrapper-cxxflags='-m64 -library=stlport4'

When running ring_c I see the following:

$ mpirun -mca btl sm,self,openib -np 2 -host pcp-j-19,pcp-j-20 examples/ring_c'
[pcp-j-20:24250] mca_oob_tcp_accept: accept() failed: Error 0 (0).
[pcp-j-20:24250] *** Process received signal ***
[pcp-j-20:24250] Signal: Segmentation Fault (11)
[pcp-j-20:24250] Signal code: Address not mapped (1)
[pcp-j-20:24250] Failing at address: fffffd7fe45bf227
/shared/OMPI/openmpi-1.8-latest-solaris11-x64-ib-ss12u3-nightly/INST/lib/libopen-pal.so.6.2.1'opal_backtrace_print+0x2d [0xfffffd7fe450a91d]
/shared/OMPI/openmpi-1.8-latest-solaris11-x64-ib-ss12u3-nightly/INST/lib/libopen-pal.so.6.2.1'show_stackframe+0xafd [0xfffffd7fe450066d]
/lib/amd64/libc.so.1'__sighndlr+0x6 [0xfffffd7fff202cc6]
/lib/amd64/libc.so.1'call_user_handler+0x2aa [0xfffffd7fff1f648e]
/shared/OMPI/openmpi-1.8-latest-solaris11-x64-ib-ss12u3-nightly/INST/lib/libopen-pal.so.6.2.1'opal_hwloc172_hwloc_get_obj_by_depth+0x1d7 [0xfffffd7fe45bf227] [Signal 11 (SEGV)]
/shared/OMPI/openmpi-1.8-latest-solaris11-x64-ib-ss12u3-nightly/INST/lib/libopen-pal.so.6.2.1'opal_hwloc172_hwloc_get_root_obj+0x24 [0xfffffd7fe4560504]
/shared/OMPI/openmpi-1.8-latest-solaris11-x64-ib-ss12u3-nightly/INST/lib/libopen-pal.so.6.2.1'opal_hwloc_base_get_nbobjs_by_type+0xec [0xfffffd7fe45653ec]
/shared/OMPI/openmpi-1.8-latest-solaris11-x64-ib-ss12u3-nightly/INST/lib/openmpi/mca_rmaps_round_robin.so'orte_rmaps_rr_byobj+0x252 [0xfffffd7fe1c9ddd2]
/shared/OMPI/openmpi-1.8-latest-solaris11-x64-ib-ss12u3-nightly/INST/lib/openmpi/mca_rmaps_round_robin.so'orte_rmaps_rr_map+0x65e [0xfffffd7fe1c912be]
/shared/OMPI/openmpi-1.8-latest-solaris11-x64-ib-ss12u3-nightly/INST/lib/libopen-rte.so.7.0.5'orte_rmaps_base_map_job+0xdce [0xfffffd7fe276aace]
/shared/OMPI/openmpi-1.8-latest-solaris11-x64-ib-ss12u3-nightly/INST/lib/libopen-pal.so.6.2.1'event_process_active_single_queue+0x1dc [0xfffffd7fe453afbc]
/shared/OMPI/openmpi-1.8-latest-solaris11-x64-ib-ss12u3-nightly/INST/lib/libopen-pal.so.6.2.1'event_process_active+0xb1 [0xfffffd7fe453b361]
/shared/OMPI/openmpi-1.8-latest-solaris11-x64-ib-ss12u3-nightly/INST/lib/libopen-pal.so.6.2.1'opal_libevent2021_event_base_loop+0x339 [0xfffffd7fe453bc79]
/shared/OMPI/openmpi-1.8-latest-solaris11-x64-ib-ss12u3-nightly/INST/bin/orterun'orterun+0x1d0e [0x4101fe]
/shared/OMPI/openmpi-1.8-latest-solaris11-x64-ib-ss12u3-nightly/INST/bin/orterun'main+0x20 [0x408ca0]
/shared/OMPI/openmpi-1.8-latest-solaris11-x64-ib-ss12u3-nightly/INST/bin/orterun'0x8b0b [0x408b0b]
[pcp-j-20:24250] *** End of error message ***

Dbx gives me:

t@1 (l@1) terminated by signal SEGV (no mapping at the fault address)
Current function is opal_hwloc172_hwloc_get_obj_by_depth
   74     return topology->levels[depth][idx];
(dbx) where
current thread: t@1
=>[1] opal_hwloc172_hwloc_get_obj_by_depth(topology = 0x4d49e0, depth = 0, idx = 0), line 74 in "traversal.c"
  [2] opal_hwloc172_hwloc_get_root_obj(topology = 0x4d49e0), line 118 in "helper.h"
  [3] opal_hwloc_base_get_nbobjs_by_type(topo = 0x4d49e0, target = OPAL_HWLOC172_hwloc_OBJ_CORE, cache_level = 0, rtype = '\003'), line 833 in "hwloc_base_util.c"
  [4] orte_rmaps_rr_byobj(jdata = 0x43c940, app = 0x483fe0, node_list = 0xfffffd7fffdff4b0, num_slots = 2, num_procs = 2U, target = OPAL_HWLOC172_hwloc_OBJ_CORE, cache_level = 0), line 495 in "rmaps_rr_mappers.c"
  [5] orte_rmaps_rr_map(jdata = 0x43c940), line 165 in "rmaps_rr.c"
  [6] orte_rmaps_base_map_job(fd = -1, args = 4, cbdata = 0x4a3300), line 277 in "rmaps_base_map_job.c"
  [7] event_process_active_single_queue(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fe453afbc
  [8] event_process_active(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fe453b361
  [9] opal_libevent2021_event_base_loop(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fe453bc79
  [10] orterun(argc = 9, argv = 0xfffffd7fffdffa58), line 1081 in "orterun.c"
  [11] main(argc = 9, argv = 0xfffffd7fffdffa58), line 13 in "main.c"
(dbx) print depth
depth = 0
(dbx) print index
index = 0xfffffd7fff19c174

Pretty sure that index value is bogus.

-Paul

--
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900