With OpenBSD-5.9 on x86 (but *not* amd64) when I try to run an application
built with Open MPI 2.0.0rc2, I get a SEGV before any output:

$ mpirun -mca btl sm,self -np 2 examples/ring_c
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 0 on node openbsd5-i386 exited
on signal 11 (Segmentation fault).
--------------------------------------------------------------------------



The backtrace from gdb on the core file doesn't contain much info, but it
looks to me like some sort of dl-open related problem.

(gdb) where
#0  0x087f3b23 in _dl_find_symbol_obj (object=0x7d388800, name=0xf9ef8ac "fileno", hash=Variable "hash" is not available.
)
    at /usr/src/libexec/ld.so/resolve.c:543
#1  0x087f3dbf in _dl_find_symbol (name=0xf9ef8ac "fileno", this=0x7bec4f74, flags=Variable "flags" is not available.
)
    at /usr/src/libexec/ld.so/resolve.c:672
#2  0x087f65cf in _dl_bind (object=0x802a8200, index=4696) at /usr/src/libexec/ld.so/i386/rtld_machine.c:387
#3  0x087f25f7 in _dl_bind_start () at /usr/src/libexec/ld.so/i386/ldasm.S:153
#4  0x802a8200 in ?? ()
#5  0x00001258 in ?? ()
#6  0x00000033 in ?? ()
#7  0x00000033 in ?? ()
#8  0x00000000 in ?? ()


When I retry I get a similar backtrace, but with a different symbol (in
place of "fileno") on each try.
In the several trials I have made, the symbol has always been one that
should appear in libc.

On some trials I see additional output such as:

[openbsd5-i386:16620] *** Process received signal ***
[openbsd5-i386:16620] Signal: Segmentation fault (11)
[openbsd5-i386:16620] Signal code: Address not mapped (1)
[openbsd5-i386:16620] Failing at address: 0x308023c
Unable to print stack trace!
[openbsd5-i386:16620] *** End of error message ***

And in these cases gdb can't make sense of the core file either.

I can run a singleton w/o problems, fwiw:

{openbsd5-i386 examples}$ ./ring_c
Process 0 sending 10 to 0, tag 201 (1 processes in ring)
Process 0 sent to 0
Process 0 decremented value: 9
Process 0 decremented value: 8
Process 0 decremented value: 7
Process 0 decremented value: 6
Process 0 decremented value: 5
Process 0 decremented value: 4
Process 0 decremented value: 3
Process 0 decremented value: 2
Process 0 decremented value: 1
Process 0 decremented value: 0
Process 0 exiting
pthread_mutex_destroy on mutex with waiters!


[Somebody might want to look into that pthread_mutex_destroy() warning, but
I can't see it being relevant to the current problem.]

I am open to suggestions as to how to debug this issue.

-Paul

-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
