Jumping in pretty late in this thread here...

I see that it's failing in opal_hwloc_base_close().  That's a little worrisome.

I do see an odd path through the hwloc initialization that *could* cause an 
error during finalization -- but it would involve you setting an invalid value 
for an MCA parameter.  Are you setting hwloc_base_mem_bind_failure_action or 
hwloc_base_mem_alloc_policy, perchance?
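
In case it helps, here is a rough sketch of one way to check.  The mpirun
command line in your report only passes "--mca btl openib,self", so a stray
value would most likely come from the environment or from a parameter file.
This assumes the usual Open MPI conventions (the OMPI_MCA_ environment
variable prefix and $HOME/.openmpi/mca-params.conf); the function name
check_hwloc_mca is just a made-up helper, not part of Open MPI.

```shell
# Sketch only: report hwloc_base_mem_* MCA parameters set via the environment
# or the per-user parameter file. check_hwloc_mca is a hypothetical helper.
check_hwloc_mca() {
    # MCA parameters can be supplied as OMPI_MCA_<name> environment variables.
    env | grep '^OMPI_MCA_hwloc_base_mem_' \
        || echo "no hwloc_base_mem_* environment variables set"

    # They can also be listed as "name = value" lines in the per-user file.
    conf="${HOME:-}/.openmpi/mca-params.conf"
    if [ -f "$conf" ]; then
        grep '^ *hwloc_base_mem_' "$conf" \
            || echo "no hwloc_base_mem_* entries in $conf"
    fi
}

check_hwloc_mca
```

A system-wide openmpi-mca-params.conf under the install prefix's etc/
directory may be worth a look as well; the sketch above does not cover it.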


On Jan 16, 2012, at 1:56 PM, Andrew Senin wrote:

> Hi,
> 
> I think I've found a bug in the head revision of the OpenMPI 1.5
> branch. If it is configured with --disable-debug it crashes in
> finalize on the hello_c.c example. Did I miss something?
> 
> Configure options:
> ./configure --with-pmi=/usr/ --with-slurm=/usr/ --without-psm
> --disable-debug --enable-mpirun-prefix-by-default
> --prefix=/hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install
> 
> Runtime command and output:
> LD_LIBRARY_PATH=$LD_LIBRARY_PATH:../lib ./mpirun --mca btl openib,self
> --npernode 1 --host mir1,mir2 ./hello
> 
> Hello, world, I am 0 of 2
> Hello, world, I am 1 of 2
> [mir1:05542] *** Process received signal ***
> [mir1:05542] Signal: Segmentation fault (11)
> [mir1:05542] Signal code: Address not mapped (1)
> [mir1:05542] Failing at address: 0xe8
> [mir2:10218] *** Process received signal ***
> [mir2:10218] Signal: Segmentation fault (11)
> [mir2:10218] Signal code: Address not mapped (1)
> [mir2:10218] Failing at address: 0xe8
> [mir1:05542] [ 0] /lib64/libpthread.so.0() [0x390d20f4c0]
> [mir1:05542] [ 1]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(+0x1346a8)
> [0x7f4588cee6a8]
> [mir1:05542] [ 2]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(opal_hwloc_base_close+0x32)
> [0x7f4588cee700]
> [mir1:05542] [ 3]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(opal_finalize+0x73)
> [0x7f4588d1beb2]
> [mir1:05542] [ 4]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(orte_finalize+0xfe)
> [0x7f4588c81eb5]
> [mir1:05542] [ 5]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(ompi_mpi_finalize+0x67a)
> [0x7f4588c217c3]
> [mir1:05542] [ 6]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(PMPI_Finalize+0x59)
> [0x7f4588c39959]
> [mir1:05542] [ 7] ./hello(main+0x69) [0x4008fd]
> [mir1:05542] [ 8] /lib64/libc.so.6(__libc_start_main+0xfd) [0x390ca1ec5d]
> [mir1:05542] [ 9] ./hello() [0x4007d9]
> [mir1:05542] *** End of error message ***
> [mir2:10218] [ 0] /lib64/libpthread.so.0() [0x3a6dc0f4c0]
> [mir2:10218] [ 1]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(+0x1346a8)
> [0x7f409f31d6a8]
> [mir2:10218] [ 2]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(opal_hwloc_base_close+0x32)
> [0x7f409f31d700]
> [mir2:10218] [ 3]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(opal_finalize+0x73)
> [0x7f409f34aeb2]
> [mir2:10218] [ 4]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(orte_finalize+0xfe)
> [0x7f409f2b0eb5]
> [mir2:10218] [ 5]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(ompi_mpi_finalize+0x67a)
> [0x7f409f2507c3]
> [mir2:10218] [ 6]
> /hpc/home/USERS/senina/projects/distribs/openmpi-svn_v1.5/install/lib/libmpi.so.1(PMPI_Finalize+0x59)
> [0x7f409f268959]
> [mir2:10218] [ 7] ./hello(main+0x69) [0x4008fd]
> [mir2:10218] [ 8] /lib64/libc.so.6(__libc_start_main+0xfd) [0x3a6d41ec5d]
> [mir2:10218] [ 9] ./hello() [0x4007d9]
> [mir2:10218] *** End of error message ***
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 5542 on node mir1 exited
> on signal 11 (Segmentation fault).
> ---------------------------------------------------------------------
> 
> Thanks,
> Andrew Senin
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

