I just committed a fix to the trunk for your original segfault down in opal_show_help() - this is the same problem Ken posted. The fix should make it into the v1.0 branch eventually. Even so, you will still run into the real problem you were hitting - this fix only makes the error handling/output work properly.

The error below looks like a word size mismatch - one thing is compiled 64-bit, the other is compiled 32-bit. Make sure everything is compiled consistently, either all 32-bit or all 64-bit.
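
One quick way to check for such a mismatch is to read the ELF header of each binary: the byte at offset 4 (EI_CLASS) is 1 for 32-bit objects and 2 for 64-bit objects. The following is only a rough sketch, not part of Open MPI; the helper name elf_class is made up for illustration, and it assumes the files are ELF objects as on Linux.

```python
# Sketch: report whether an ELF file was built 32-bit or 64-bit.
# elf_class is a hypothetical helper, not an Open MPI or system tool.

ELF_MAGIC = b"\x7fELF"                       # first four bytes of any ELF file
EI_CLASS_NAMES = {1: "32-bit", 2: "64-bit"}  # values of the EI_CLASS byte


def elf_class(path):
    """Return '32-bit', '64-bit', or None if path is not a recognizable ELF file."""
    with open(path, "rb") as f:
        header = f.read(5)  # magic (4 bytes) + EI_CLASS (1 byte)
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return None
    return EI_CLASS_NAMES.get(header[4])
```

Calling elf_class() on /usr/local/bin/mpirun and on each shared library it loads (as listed by ldd) should give the same answer for every file; if the answers differ, the build is mixed.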

Andrew

On Oct 21, 2005, at 1:55 PM, Troy Benjegerdes wrote:

On Fri, Oct 21, 2005 at 08:04:56AM -0500, Andrew Friedley wrote:
I've managed to reproduce the segfault, but haven't yet figured out the problem. I've got some distractions to attend to this afternoon, so it
might be a while before I get a fix.

Andrew

I rebuilt without a vpath build (running ./configure) in the src dir,
and now I get:

troy@octeropt:/usr/src/ompi-buildtest$ /usr/local/bin/mpirun -np 2 hostname
/usr/local/bin/mpirun: Symbol `opal_event_lock' has different size in shared object, consider re-linking
Segmentation fault

(gdb) run -np 2 hostname
Starting program: /usr/local/bin/mpirun -np 2 hostname
/usr/local/bin/mpirun: Symbol `opal_event_lock' has different size in shared object, consider re-linking
[Thread debugging using libthread_db enabled]
[New Thread 46912509504224 (LWP 14767)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 46912509504224 (LWP 14767)]
0x00002aaaab31fc9b in getenv () from /lib/libc.so.6
(gdb) bt
#0  0x00002aaaab31fc9b in getenv () from /lib/libc.so.6
#1  0x00002aaaab53bd86 in poll_init () at poll.c:101
#2  0x00002aaaab539bbc in opal_event_init () at event.c:269
#3  0x00002aaaaabd73c2 in orte_init_stage1 (infrastructure=true)
    at orte_init_stage1.c:143
#4  0x00002aaaaabdbfbf in orte_system_init (infrastructure=true)
    at orte_system_init.c:38
#5  0x00002aaaaabd6d24 in orte_init (infrastructure=true) at orte_init.c:46
#6  0x0000000000402375 in orterun (argc=4, argv=0x7fffffa593e8)
    at orterun.c:294
#7  0x0000000000402013 in main (argc=4, argv=0x7fffffa593e8) at main.c:13

_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
