Hi,

With Open MPI 1.8.4rc1 we see the following assertion failure in the
osu_mbw_mr test when using the openib BTL.

When Open MPI is built in production mode (i.e. without --enable-debug),
the test simply hangs instead of aborting.

When using either the tcp BTL or the cm PML, the benchmark completes
without error.

The command line to reproduce this is:

$ mpirun --bind-to core -display-map -mca btl_openib_if_include mlx5_0:1 \
    -np 2 -mca pml ob1 -mca btl openib,self,sm ./osu_mbw_mr

# OSU MPI Multiple Bandwidth / Message Rate Test v4.4
# [ pairs: 1 ] [ window size: 64 ]
# Size                  MB/s        Messages/s
osu_mbw_mr: ../../../../opal/class/opal_list.h:547: _opal_list_append: Assertion `0 == item->opal_list_item_refcount' failed.
[vegas15:30395] *** Process received signal ***
[vegas15:30395] Signal: Aborted (6)
[vegas15:30395] Signal code:  (-6)
[vegas15:30395] [ 0] /lib64/libpthread.so.0[0x30bc40f500]
[vegas15:30395] [ 1] /lib64/libc.so.6(gsignal+0x35)[0x30bc0328a5]
[vegas15:30395] [ 2] /lib64/libc.so.6(abort+0x175)[0x30bc034085]
[vegas15:30395] [ 3] /lib64/libc.so.6[0x30bc02ba1e]
[vegas15:30395] [ 4] /lib64/libc.so.6(__assert_perror_fail+0x0)[0x30bc02bae0]
[vegas15:30395] [ 5] /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_btl_openib.so(+0x9087)[0x7ffff3f70087]
[vegas15:30395] [ 6] /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_btl_openib.so(mca_btl_openib_alloc+0x403)[0x7ffff3f754b3]
[vegas15:30395] [ 7] /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_btl_openib.so(mca_btl_openib_sendi+0xf9e)[0x7ffff3f785b4]
[vegas15:30395] [ 8] /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_pml_ob1.so(+0xed08)[0x7ffff3308d08]
[vegas15:30395] [ 9] /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_pml_ob1.so(+0xf8ba)[0x7ffff33098ba]
[vegas15:30395] [10] /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_isend+0x108)[0x7ffff3309a1f]
[vegas15:30395] [11] /labhome/alinas/workspace/tt/ompi_rc1/openmpi-1.8.4rc1/install/lib/libmpi.so.1(MPI_Isend+0x2ec)[0x7ffff7cff5e8]
[vegas15:30395] [12] /hpc/local/benchmarks/hpc-stack-gcc/install/ompi-mellanox-v1.8/tests/osu-micro-benchmarks-4.4/osu_mbw_mr[0x400fa4]
[vegas15:30395] [13] /hpc/local/benchmarks/hpc-stack-gcc/install/ompi-mellanox-v1.8/tests/osu-micro-benchmarks-4.4/osu_mbw_mr[0x40167d]
[vegas15:30395] [14] /lib64/libc.so.6(__libc_start_main+0xfd)[0x30bc01ecdd]
[vegas15:30395] [15] /hpc/local/benchmarks/hpc-stack-gcc/install/ompi-mellanox-v1.8/tests/osu-micro-benchmarks-4.4/osu_mbw_mr[0x400db9]
[vegas15:30395] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 30395 on node vegas15 exited on
signal 6 (Aborted).
--------------------------------------------------------------------------
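
For context, the failed assertion is a debug-only sanity check in the
list append path: as far as I understand it, a debug build keeps a
per-item counter of list membership and refuses to append an item whose
counter is not zero. The sketch below is a simplified illustration of
that kind of check, not the actual Open MPI source. All sketch_* names
are hypothetical, and the append returns false at the point where a
debug build would assert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a list item with a debug counter. */
typedef struct sketch_item {
    struct sketch_item *next;
    struct sketch_item *prev;
    int refcount;            /* number of lists this item is currently on */
} sketch_item_t;

/* Hypothetical, simplified stand-in for a circular doubly linked list. */
typedef struct sketch_list {
    sketch_item_t sentinel;  /* list head; never handed out as an item */
    size_t length;
} sketch_list_t;

static void sketch_item_init(sketch_item_t *item)
{
    item->next = NULL;
    item->prev = NULL;
    item->refcount = 0;
}

static void sketch_list_init(sketch_list_t *list)
{
    sketch_item_init(&list->sentinel);
    list->sentinel.next = &list->sentinel;
    list->sentinel.prev = &list->sentinel;
    list->length = 0;
}

/* Append item at the tail. Returns false if the item is already on a
 * list, which is the condition the debug assertion guards against. */
static bool sketch_list_append(sketch_list_t *list, sketch_item_t *item)
{
    if (0 != item->refcount) {
        return false;        /* a debug build would abort here */
    }
    item->prev = list->sentinel.prev;
    item->next = &list->sentinel;
    list->sentinel.prev->next = item;
    list->sentinel.prev = item;
    list->length++;
    item->refcount++;
    assert(1 == item->refcount);
    return true;
}

/* Remove and return the first item, or NULL if the list is empty. */
static sketch_item_t *sketch_list_remove_first(sketch_list_t *list)
{
    if (0 == list->length) {
        return NULL;
    }
    sketch_item_t *item = list->sentinel.next;
    list->sentinel.next = item->next;
    item->next->prev = &list->sentinel;
    item->next = NULL;
    item->prev = NULL;
    item->refcount--;
    list->length--;
    return item;
}
```

If this reading is right, the item being appended under
mca_btl_openib_alloc was still counted as a member of some list at that
point. That is only my interpretation of the assertion text; I have not
confirmed which item or list is involved.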


Thanks,
Alina.
