With last night's master tarball (openmpi-dev-845-ga3275aa) on a
Linux/x86-64 system, I am seeing a crash (below) when ring_c is run on a
login node.  Other than the CC/CXX/FC settings, I configured with only
--prefix=... --enable-debug --with-tm=...

This occurs with at least the GNU, Intel, PathScale and Open64
compilers (PGI has other issues).
Only the Intel build gave me a usable backtrace, which appears below along
with the command line and output.

A fresh build of 1.8.4 on the same system with the same configure args does
NOT have this problem.

-Paul

$ mpirun -mca btl sm,self -np 2 examples/ring_c
libibverbs: Warning: no userspace device-specific driver found for
/sys/class/infiniband_verbs/uverbs2
libibverbs: Warning: no userspace device-specific driver found for
/sys/class/infiniband_verbs/uverbs1
libibverbs: Warning: no userspace device-specific driver found for
/sys/class/infiniband_verbs/uverbs2
libibverbs: Warning: no userspace device-specific driver found for
/sys/class/infiniband_verbs/uverbs1
libibverbs: Warning: no userspace device-specific driver found for
/sys/class/infiniband_verbs/uverbs2
libibverbs: Warning: no userspace device-specific driver found for
/sys/class/infiniband_verbs/uverbs1
Process 1 exiting
Process 0 sending 10 to 1, tag 201 (2 processes in ring)
Process 0 sent to 1
Process 0 decremented value: 9
Process 0 decremented value: 8
Process 0 decremented value: 7
Process 0 decremented value: 6
Process 0 decremented value: 5
Process 0 decremented value: 4
Process 0 decremented value: 3
Process 0 decremented value: 2
Process 0 decremented value: 1
Process 0 decremented value: 0
Process 0 exiting
[cvrsvc04:26446] *** Process received signal ***
[cvrsvc04:26446] Signal: Segmentation fault (11)
[cvrsvc04:26446] Signal code: Address not mapped (1)
[cvrsvc04:26446] Failing at address: 0x7fc828289aaf
[cvrsvc04:26446] [ 0] /lib64/libpthread.so.0[0x7fc8f943db10]
[cvrsvc04:26446] [ 1] /usr/lib64/libmlx4-m-rdmav2.so[0x7fc8f6491091]
[cvrsvc04:26446] [ 2]
/usr/lib64/libmlx4-m-rdmav2.so(__mlx4_cq_clean+0x69)[0x7fc8f64912b9]
[cvrsvc04:26446] [ 3]
/usr/lib64/libmlx4-m-rdmav2.so(mlx4_cq_clean+0x3e)[0x7fc8f649146e]
[cvrsvc04:26446] [ 4]
/usr/lib64/libmlx4-m-rdmav2.so(mlx4_modify_qp+0xb7)[0x7fc8f6494a87]
[cvrsvc04:26446] [ 5]
/usr/lib64/libibverbs.so.1(ibv_modify_qp+0x24)[0x7fc8f6abbae4]
[cvrsvc04:26446] [ 6]
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-master-linux-x86_64-icc-11.1/INST/lib/openmpi/mca_oob_ud.so(mca_oob_ud_qp_to_reset+0xc8)[0x7fc8f6cca228]
[cvrsvc04:26446] [ 7]
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-master-linux-x86_64-icc-11.1/INST/lib/openmpi/mca_oob_ud.so(mca_oob_ud_event_stop_monitor+0x92)[0x7fc8f6cc8d32]
[cvrsvc04:26446] [ 8]
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-master-linux-x86_64-icc-11.1/INST/lib/openmpi/mca_oob_ud.so[0x7fc8f6cc5baf]
[cvrsvc04:26446] [ 9]
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-master-linux-x86_64-icc-11.1/INST/lib/libopen-rte.so.0[0x7fc8fa4c3db0]
[cvrsvc04:26446] [10]
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-master-linux-x86_64-icc-11.1/INST/lib/libopen-pal.so.0(mca_base_framework_close+0x6e)[0x7fc8fa15ed6e]
[cvrsvc04:26446] [11]
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-master-linux-x86_64-icc-11.1/INST/lib/openmpi/mca_ess_hnp.so[0x7fc8f7f1987a]
[cvrsvc04:26446] [12]
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-master-linux-x86_64-icc-11.1/INST/lib/libopen-rte.so.0(orte_finalize+0x5d)[0x7fc8fa46812d]
[cvrsvc04:26446] [13]
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-master-linux-x86_64-icc-11.1/INST/bin/mpirun(orterun+0x1c5f)[0x4071ed]
[cvrsvc04:26446] [14]
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-master-linux-x86_64-icc-11.1/INST/bin/mpirun(main+0x20)[0x404d68]
[cvrsvc04:26446] [15]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x7fc8f90f5994]
[cvrsvc04:26446] [16]
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-master-linux-x86_64-icc-11.1/INST/bin/mpirun(orte_daemon_recv+0x2f9)[0x404c99]
[cvrsvc04:26446] *** End of error message ***


-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
