I rebuilt without the memory manager; now ompi_info crashes with this output:

./configure --prefix=/usr/local/openmpi --disable-mpi-f90 --disable-mpi-f77 --without-memory-manager

localhost:~/openmpi> ompi_info
                Open MPI: 1.2.8
   Open MPI SVN revision: r19718
                Open RTE: 1.2.8
   Open RTE SVN revision: r19718
                    OPAL: 1.2.8
       OPAL SVN revision: r19718
                  Prefix: /usr/local/openmpi
 Configured architecture: x86_64-unknown-linux-gnu
           Configured by: root
           Configured on: Tue Nov 11 04:08:47 CET 2008
          Configure host: localhost
                Built by: root
                Built on: Tue Nov 11 04:13:01 CET 2008
              Built host: localhost
              C bindings: yes
            C++ bindings: yes
      Fortran77 bindings: no
      Fortran90 bindings: no
 Fortran90 bindings size: na
              C compiler: gcc
     C compiler absolute: /usr/bin/gcc
            C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
      Fortran77 compiler: gfortran
  Fortran77 compiler abs: /usr/bin/gfortran
      Fortran90 compiler: none
  Fortran90 compiler abs: none
             C profiling: yes
           C++ profiling: yes
     Fortran77 profiling: no
     Fortran90 profiling: no
          C++ exceptions: no
          Thread support: posix (mpi: no, progress: no)
  Internal debug support: no
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
         libltdl support: yes
   Heterogeneous support: yes
 mpirun default --prefix: no
*** glibc detected *** ompi_info: double free or corruption (fasttop): 0x00000000006279e0 ***
======= Backtrace: =========
======= Memory map: ========
00400000-0041f000 r-xp 00000000 08:01 68989625 /usr/local/openmpi/bin/ompi_info
0061e000-0061f000 r--p 0001e000 08:01 68989625 /usr/local/openmpi/bin/ompi_info
0061f000-00620000 rw-p 0001f000 08:01 68989625 /usr/local/openmpi/bin/ompi_info
00620000-00642000 rw-p 00620000 00:00 0 [heap]
2ae687174000-2ae687190000 r-xp 00000000 08:01 100681559 /lib64/ld-2.6.1.so
2ae687190000-2ae687192000 rw-p 2ae687190000 00:00 0
2ae68738f000-2ae687391000 rw-p 0001b000 08:01 100681559 /lib64/ld-2.6.1.so
2ae687391000-2ae687411000 r-xp 00000000 08:01 403422302 /usr/local/openmpi/lib/libmpi.so.0.0.0
2ae687411000-2ae687611000 ---p 00080000 08:01 403422302 /usr/local/openmpi/lib/libmpi.so.0.0.0
2ae687611000-2ae687612000 r--p 00080000 08:01 403422302 /usr/local/openmpi/lib/libmpi.so.0.0.0
2ae687612000-2ae68761b000 rw-p 00081000 08:01 403422302 /usr/local/openmpi/lib/libmpi.so.0.0.0
2ae68761b000-2ae687622000 rw-p 2ae68761b000 00:00 0
2ae687622000-2ae68767a000 r-xp 00000000 08:01 403422294 /usr/local/openmpi/lib/libopen-rte.so.0.0.0
2ae68767a000-2ae68787a000 ---p 00058000 08:01 403422294 /usr/local/openmpi/lib/libopen-rte.so.0.0.0
2ae68787a000-2ae68787b000 r--p 00058000 08:01 403422294 /usr/local/openmpi/lib/libopen-rte.so.0.0.0
2ae68787b000-2ae68787d000 rw-p 00059000 08:01 403422294 /usr/local/openmpi/lib/libopen-rte.so.0.0.0
2ae68787d000-2ae68787e000 rw-p 2ae68787d000 00:00 0
2ae68787e000-2ae6878b1000 r-xp 00000000 08:01 403422290 /usr/local/openmpi/lib/libopen-pal.so.0.0.0
2ae6878b1000-2ae687ab0000 ---p 00033000 08:01 403422290 /usr/local/openmpi/lib/libopen-pal.so.0.0.0
2ae687ab0000-2ae687ab1000 r--p 00032000 08:01 403422290 /usr/local/openmpi/lib/libopen-pal.so.0.0.0
2ae687ab1000-2ae687ab3000 rw-p 00033000 08:01 403422290 /usr/local/openmpi/lib/libopen-pal.so.0.0.0
2ae687ab3000-2ae687ad5000 rw-p 2ae687ab3000 00:00 0
2ae687af3000-2ae687af5000 r-xp 00000000 08:01 100681700 /lib64/libdl-2.6.1.so
2ae687af5000-2ae687cf5000 ---p 00002000 08:01 100681700 /lib64/libdl-2.6.1.so
2ae687cf5000-2ae687cf7000 rw-p 00002000 08:01 100681700 /lib64/libdl-2.6.1.so
2ae687cf7000-2ae687cf8000 rw-p 2ae687cf7000 00:00 0
2ae687cf8000-2ae687d0c000 r-xp 00000000 08:01 100681705 /lib64/libnsl-2.6.1.so
2ae687d0c000-2ae687f0b000 ---p 00014000 08:01 100681705 /lib64/libnsl-2.6.1.so
2ae687f0b000-2ae687f0d000 rw-p 00013000 08:01 100681705 /lib64/libnsl-2.6.1.so
2ae687f0d000-2ae687f0f000 rw-p 2ae687f0d000 00:00 0
2ae687f0f000-2ae687f11000 r-xp 00000000 08:01 100681728 /lib64/libutil-2.6.1.so
2ae687f11000-2ae688110000 ---p 00002000 08:01 100681728 /lib64/libutil-2.6.1.so
2ae688110000-2ae688112000 rw-p 00001000 08:01 100681728 /lib64/libutil-2.6.1.so
2ae688112000-2ae6881fe000 r-xp 00000000 08:01 67350662 /usr/lib64/libstdc++.so.6.0.9
2ae6881fe000-2ae6883fe000 ---p 000ec000 08:01 67350662 /usr/lib64/libstdc++.so.6.0.9
2ae6883fe000-2ae688404000 r--p 000ec000 08:01 67350662 /usr/lib64/libstdc++.so.6.0.9
2ae688404000-2ae688407000 rw-p 000f2000 08:01 67350662 /usr/lib64/libstdc++.so.6.0.9
2ae688407000-2ae68841b000 rw-p 2ae688407000 00:00 0
2ae68841b000-2ae68846d000 r-xp 00000000 08:01 100681702 /lib64/libm-2.6.1.so
2ae68846d000-2ae68866c000 ---p 00052000 08:01 100681702 /lib64/libm-2.6.1.so
2ae68866c000-2ae68866e000 rw-p 00051000 08:01 100681702 /lib64/libm-2.6.1.so
2ae68866e000-2ae68867b000 r-xp 00000000 08:01 100845329 /lib64/libgcc_s.so.1
2ae68867b000-2ae68887a000 ---p 0000d000 08:01 100845329 /lib64/libgcc_s.so.1
2ae68887a000-2ae68887c000 rw-p 0000c000 08:01 100845329 /lib64/libgcc_s.so.1
2ae68887c000-2ae688891000 r-xp 00000000 08:01 100681720 /lib64/libpthread-2.6.1.so
2ae688891000-2ae688a91000 ---p 00015000 08:01 100681720 /lib64/libpthread-2.6.1.so
2ae688a91000-2ae688a93000 rw-p 00015000 08:01 100681720 /lib64/libpthread-2.6.1.so
2ae688a93000-2ae688a98000 rw-p 2ae688a93000 00:00 0
2ae688a98000-2ae688bd4000 r-xp 00000000 08:01 100681566 /lib64/libc-2.6.1.so
2ae688bd4000-2ae688dd4000 ---p 0013c000 08:01 100681566 /lib64/libc-2.6.1.so
2ae688dd4000-2ae688dd7000 r--p 0013c000 08:01 100681566 /lib64/libc-2.6.1.so
2ae688dd7000-2ae688dd9000 rw-p 0013f000 08:01 100681566 /lib64/libc-2.6.1.so
2ae688dd9000-2ae688de0000 rw-p 2ae688dd9000 00:00 0
2ae68c000000-2ae68c021000 rw-p 2ae68c000000 00:00 0
2ae68c021000-2ae690000000 ---p 2ae68c021000 00:00 0
7fff23921000-7fff23936000 rw-p 7fff23921000 00:00 0 [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vdso]
[localhost] *** Process received signal ***
[localhost] Signal: Aborted (6)
[localhost] Signal code:  (-6)
[localhost] [ 0] /lib64/libpthread.so.0 [0x2ae688889fb0]
[localhost] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x2ae688ac8b45]
[localhost] [ 2] /lib64/libc.so.6(abort+0x110) [0x2ae688aca0e0]
[localhost] [ 3] /lib64/libc.so.6 [0x2ae688b00fbb]
[localhost] [ 4] /lib64/libc.so.6 [0x2ae688b0621d]
[localhost] [ 5] /lib64/libc.so.6(cfree+0x76) [0x2ae688b07f76]
[localhost] [ 6] /usr/lib64/libstdc++.so.6(_ZNSs6assignERKSs+0x9c) [0x2ae6881b44bc]
[localhost] [ 7] ompi_info(_ZN9ompi_info15open_componentsEv+0x100) [0x405670]
[localhost] [ 8] ompi_info(main+0x11e7) [0x40b837]
[localhost] [ 9] /lib64/libc.so.6(__libc_start_main+0xf4) [0x2ae688ab5b54]
[localhost] [10] ompi_info(__gxx_personality_v0+0x121) [0x405249]
[localhost] *** End of error message ***
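
The C++ symbols in frames 6 and 7 of the trace above are mangled, which makes it harder to see where the bad free() originates. As a rough sketch, and assuming GNU binutils' c++filt is installed on the machine, they can be demangled as shown below; the variable names frame6 and frame7 are only for illustration. Frame 7 should come out as ompi_info::open_components() and frame 6 as a std::string assign call.

```shell
# Demangle the two C++ frames copied from the backtrace.
# Assumption: c++filt (GNU binutils) is on the PATH.
frame6=$(c++filt _ZNSs6assignERKSs)
frame7=$(c++filt _ZN9ompi_info15open_componentsEv)

echo "frame 6: $frame6"
echo "frame 7: $frame7"
```

This only identifies the call site (a string assignment inside ompi_info's component-opening code); it does not by itself say why the heap was corrupted.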

localhost:~/archives/openmpi-1.2.8> g++ -v
Using built-in specs.
Target: x86_64-suse-linux
Configured with: ../configure --enable-threads=posix --prefix=/usr --with-local-prefix=/usr/local --infodir=/usr/share/info --mandir=/usr/share/man --libdir=/usr/lib64 --libexecdir=/usr/lib64 --enable-languages=c,c++,objc,fortran,obj-c++,java,ada --enable-checking=release --with-gxx-include-dir=/usr/include/c++/4.2.1 --enable-ssp --disable-libssp --disable-libgcj --with-slibdir=/lib64 --with-system-zlib --enable-shared --enable-__cxa_atexit --enable-libstdcxx-allocator=new --disable-libstdcxx-pch --program-suffix=-4.2 --enable-version-specific-runtime-libs --without-system-libunwind --with-cpu=generic --host=x86_64-suse-linux
Thread model: posix
gcc version 4.2.1 (SUSE Linux)

On Nov 10, 2008, at 1:44 PM, Jeff Squyres wrote:

If you're not using OpenFabrics-based networks, try configuring Open MPI --without-memory-manager and see if that fixes your problems.

On Nov 8, 2008, at 5:31 PM, Robert Kubrick wrote:

George, I get a warning when running under the debugger: 'Lowest section in system-supplied DSO at 0xffffe000 is .hash at ffffe0b4'
The program hangs in _int_malloc():

(gdb) run
Starting program: /opt/openmpi-1.2.7/bin/ompi_info
warning: Lowest section in system-supplied DSO at 0xffffe000 is .hash at ffffe0b4
[Thread debugging using libthread_db enabled]
[New Thread 0xf7b7d6d0 (LWP 16621)]

Program received signal SIGINT, Interrupt.
[Switching to Thread 0xf7b7d6d0 (LWP 16621)]
0xf7e5267e in _int_malloc () from /opt/openmpi/lib/libopen-pal.so.0
(gdb) where
#0  0xf7e5267e in _int_malloc () from /opt/openmpi/lib/libopen-pal.so.0
#1  0xf7e544e1 in malloc () from /opt/openmpi/lib/libopen-pal.so.0
#2  0xf7db46c7 in operator new () from /usr/lib/libstdc++.so.6
#3  0xf7d8e121 in std::string::_Rep::_S_create () from /usr/lib/libstdc++.so.6
#4  0xf7d8ee18 in std::string::_Rep::_M_clone () from /usr/lib/libstdc++.so.6
#5  0xf7d8fac8 in std::string::reserve () from /usr/lib/libstdc++.so.6
#6  0xf7d8ff6a in std::string::append () from /usr/lib/libstdc++.so.6
#7  0x08054f30 in ompi_info::out ()
#8  0x08062a33 in ompi_info::show_ompi_version ()
#9  0x080533a0 in main ()

On Nov 8, 2008, at 12:33 PM, George Bosilca wrote:

I think we had a similar problem in the past. It has something to do with the atomics on this architecture.

I don't have access to such an architecture. Can you provide us with a stack trace when this happens?


On Nov 8, 2008, at 12:14 PM, Robert Kubrick wrote:

I am having problems building OMPI 1.2.7 on an Intel Xeon quad-core 64-bit server. The compilation completes, but ompi_info hangs after printing the OMPI version:

# ompi_info

I tried running a few MPI applications on this same install and they work fine. What could cause ompi_info to hang?

users mailing list

Jeff Squyres
Cisco Systems
