Hi Howard,

I had tried to send the config.log from my 2.1.0 build, but I guess it was
too big for the list, so I'm trying again with a compressed file.
The build is based on the OpenHPC package. Unfortunately, it still
crashes even with the vader btl disabled, using this command line:
mpirun --mca btl "^vader" IMB-MPI1
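
Since the backtrace below goes through mca_btl_sm rather than vader, I
suppose the next thing I could try is excluding the sm BTL as well, along
these lines (assuming another BTL such as tcp is available to carry the
intra-node traffic):
mpirun --mca btl "^vader,sm" IMB-MPI1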


[pax11-10:44753] *** Process received signal ***
[pax11-10:44753] Signal: Bus error (7)
[pax11-10:44753] Signal code: Non-existant physical address (2)
[pax11-10:44753] Failing at address: 0x2b3989e27a00
[pax11-10:44753] [ 0] /usr/lib64/libpthread.so.0(+0xf370)[0x2b3976f44370]
[pax11-10:44753] [ 1] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/openmpi/mca_btl_sm.so(+0x559a)[0x2b398545259a]
[pax11-10:44753] [ 2] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libopen-pal.so.20(opal_free_list_grow_st+0x1df)[0x2b39777bb78f]
[pax11-10:44753] [ 3] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/openmpi/mca_btl_sm.so(mca_btl_sm_sendi+0x272)[0x2b3985450562]
[pax11-10:44753] [ 4] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/openmpi/mca_pml_ob1.so(+0x8a3f)[0x2b3985d78a3f]
[pax11-10:44753] [ 5] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_send+0x4a7)[0x2b3985d79ad7]
[pax11-10:44753] [ 6] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libmpi.so.20(ompi_coll_base_sendrecv_nonzero_actual+0x110)[0x2b3976cda620]
[pax11-10:44753] [ 7] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libmpi.so.20(ompi_coll_base_allreduce_intra_ring+0x860)[0x2b3976cdb8f0]
[pax11-10:44753] [ 8] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libmpi.so.20(PMPI_Allreduce+0x17b)[0x2b3976ca36ab]
[pax11-10:44753] [ 9] IMB-MPI1[0x40b2ff]
[pax11-10:44753] [10] IMB-MPI1[0x402646]
[pax11-10:44753] [11] /usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x2b3977172b35]
[pax11-10:44753] [12] IMB-MPI1[0x401f79]
[pax11-10:44753] *** End of error message ***
[pax11-10:44752] *** Process received signal ***
[pax11-10:44752] Signal: Bus error (7)
[pax11-10:44752] Signal code: Non-existant physical address (2)
[pax11-10:44752] Failing at address: 0x2ab0d270d3e8
[pax11-10:44752] [ 0] /usr/lib64/libpthread.so.0(+0xf370)[0x2ab0bf7ec370]
[pax11-10:44752] [ 1] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/openmpi/mca_allocator_bucket.so(mca_allocator_bucket_alloc_align+0x89)[0x2ab0c2eed1c9]
[pax11-10:44752] [ 2] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libmca_common_sm.so.20(+0x1495)[0x2ab0cde8d495]
[pax11-10:44752] [ 3] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libopen-pal.so.20(opal_free_list_grow_st+0x277)[0x2ab0c0063827]
[pax11-10:44752] [ 4] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/openmpi/mca_btl_sm.so(mca_btl_sm_sendi+0x272)[0x2ab0cdc87562]
[pax11-10:44752] [ 5] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/openmpi/mca_pml_ob1.so(+0x8a3f)[0x2ab0ce630a3f]
[pax11-10:44752] [ 6] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_send+0x4a7)[0x2ab0ce631ad7]
[pax11-10:44752] [ 7] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libmpi.so.20(ompi_coll_base_sendrecv_nonzero_actual+0x110)[0x2ab0bf582620]
[pax11-10:44752] [ 8] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libmpi.so.20(ompi_coll_base_allreduce_intra_ring+0x860)[0x2ab0bf5838f0]
[pax11-10:44752] [ 9] /opt/ohpc/pub/mpi/openmpi-gnu/2.1.0/lib/libmpi.so.20(PMPI_Allreduce+0x17b)[0x2ab0bf54b6ab]
[pax11-10:44752] [10] IMB-MPI1[0x40b2ff]
[pax11-10:44752] [11] IMB-MPI1[0x402646]
[pax11-10:44752] [12] /usr/lib64/libc.so.6(__libc_start_main+0xf5)[0x2ab0bfa1ab35]
[pax11-10:44752] [13] IMB-MPI1[0x401f79]
[pax11-10:44752] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 340 with PID 44753 on node pax11-10

Attachment: config.log.xz
Description: application/xz

