Hi All

I wonder if configuring Open MPI while
forcing the default Fortran types to non-default sizes
(-fdefault-integer-8 -fdefault-real-8) might have
something to do with the segmentation fault.
Would this be effective, i.e., actually make
the sizes of MPI_INTEGER/MPI_INT and MPI_REAL/MPI_FLOAT bigger,
or would the effect be illusory?
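
One quick way to settle that (a minimal sketch, not from DRAGON) is to ask
the library itself how big MPI_INTEGER is:

      program typechk
c     Sketch: query the size MPI reports for MPI_INTEGER.  With
c     -fdefault-integer-8 in effect on both the application and the
c     library, this must print 8; if it prints 4, the Fortran side
c     and the library disagree on every INTEGER argument.
      implicit none
      include 'mpif.h'
      integer ierr, tsize
      call MPI_INIT(ierr)
      call MPI_TYPE_SIZE(MPI_INTEGER, tsize, ierr)
      print *, 'MPI_TYPE_SIZE(MPI_INTEGER) = ', tsize, ' bytes'
      call MPI_FINALIZE(ierr)
      end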

There were some recent discussions here about MPI
limiting counts to the range of a default (32-bit) INTEGER.
Since Benjamin said he "had to raise the number of data structures",
which eventually led to the error,
I wonder if he is inadvertently crossing into the negative
side of the 32-bit integer universe (i.e. counts >= 2**31),
as other list subscribers have reported here a few times.
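
A cheap way to test that hypothesis is a guard like this hypothetical one
(CHKCOUNT is made up, not part of DRAGON): compute the count in an 8-byte
integer first and check it before each send.

      subroutine chkcount(nelem)
c     Hypothetical guard: the element count is computed in an 8-byte
c     integer first.  A default 32-bit INTEGER wraps negative at
c     2**31, and MPI_SEND/MPI_RECV then see a garbage count.
      implicit none
      integer*8 nelem
      if (nelem .gt. 2147483647) then
         print *, 'count overflows a 32-bit INTEGER: ', nelem
         stop 1
      end if
      return
      end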

Anyway, a segmentation fault can come from many different places;
this is just a guess.

Gus Correa

Jeff Squyres wrote:
Do you get a corefile?

It looks like you're calling MPI_RECV in Fortran and then it segv's.  This is 
*likely* because you're either passing a bad parameter or your buffer isn't big 
enough.  Can you double check all your parameters?
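
For what it's worth, a defensive receive along these lines (a sketch with
made-up names, not Benjamin's actual code) makes the buffer-size check
explicit:

      subroutine saferecv(buf, bufsz, src, tag)
c     Sketch: pass the true size of BUF as the count, so an oversized
c     message raises MPI_ERR_TRUNCATE instead of scribbling past the
c     end of the buffer; then MPI_GET_COUNT shows how many elements
c     actually arrived.
      implicit none
      include 'mpif.h'
      integer bufsz, src, tag, ierr, nrecv
      integer status(MPI_STATUS_SIZE)
      double precision buf(bufsz)
      call MPI_RECV(buf, bufsz, MPI_DOUBLE_PRECISION, src, tag,
     &              MPI_COMM_WORLD, status, ierr)
      call MPI_GET_COUNT(status, MPI_DOUBLE_PRECISION, nrecv, ierr)
      print *, 'received ', nrecv, ' of at most ', bufsz
      return
      end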

Unfortunately, there are no line numbers printed in the stack trace, so it's
not possible to tell exactly where in the ob1 PML it's dying (i.e., we can't
see exactly what it's doing to cause the segv).



On Dec 2, 2010, at 9:36 AM, Benjamin Toueg wrote:

Hi,

I am using DRAGON, a neutronics simulation code written in FORTRAN77 that has
its own data structures. I added a module to send these data structures via
MPI_SEND / MPI_RECV, and everything worked perfectly for a while.

Then I had to raise the number of data structures to be sent, up to the point
where my cluster fails with this error:
*** Process received signal ***
Signal: Segmentation fault (11)
Signal code: Address not mapped (1)
Failing at address: 0x2c2579fc0
[ 0] /lib/libpthread.so.0 [0x7f52d2930410]
[ 1] /home/toueg/openmpi/lib/openmpi/mca_pml_ob1.so [0x7f52d153fe03]
[ 2] /home/toueg/openmpi/lib/libmpi.so.0(PMPI_Recv+0x2d2) [0x7f52d3504a1e]
[ 3] /home/toueg/openmpi/lib/libmpi_f77.so.0(pmpi_recv_+0x10e) [0x7f52d36cf9c6]

How can I make this error more explicit?

I use the following configuration of openmpi-1.4.3:
./configure --enable-debug --prefix=/home/toueg/openmpi CXX=g++ CC=gcc F77=gfortran FC=gfortran 
FLAGS="-m64 -fdefault-integer-8 -fdefault-real-8 -fdefault-double-8" FCFLAGS="-m64 
-fdefault-integer-8 -fdefault-real-8 -fdefault-double-8" --disable-mpi-f90

Here is the output of mpif77 -v:
mpif77 for 1.2.7 (release) of : 2005/11/04 11:54:51
Driving: f77 -L/usr/lib/mpich-mpd/lib -v -lmpich-p4mpd -lpthread -lrt 
-lfrtbegin -lg2c -lm -shared-libgcc
Reading specs from /usr/lib/gcc/x86_64-linux-gnu/3.4.6/specs
Configured with: ../src/configure -v --enable-languages=c,c++,f77,pascal 
--prefix=/usr --libexecdir=/usr/lib --with-gxx-include-dir=/usr/include/c++/3.4 
--enable-shared --with-system-zlib --enable-nls --without-included-gettext 
--program-suffix=-3.4 --enable-__cxa_atexit --enable-clocale=gnu 
--enable-libstdcxx-debug x86_64-linux-gnu
Thread model: posix
gcc version 3.4.6 (Debian 3.4.6-5)
 /usr/lib/gcc/x86_64-linux-gnu/3.4.6/collect2 --eh-frame-hdr -m elf_x86_64 
-dynamic-linker /lib64/ld-linux-x86-64.so.2 
/usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib/crt1.o 
/usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib/crti.o 
/usr/lib/gcc/x86_64-linux-gnu/3.4.6/crtbegin.o -L/usr/lib/mpich-mpd/lib 
-L/usr/lib/gcc/x86_64-linux-gnu/3.4.6 -L/usr/lib/gcc/x86_64-linux-gnu/3.4.6 
-L/usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib 
-L/usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../.. -L/lib/../lib -L/usr/lib/../lib 
-lmpich-p4mpd -lpthread -lrt -lfrtbegin -lg2c -lm -lgcc_s -lgcc -lc -lgcc_s 
-lgcc /usr/lib/gcc/x86_64-linux-gnu/3.4.6/crtend.o 
/usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib/crtn.o
/usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib/libfrtbegin.a(frtbegin.o): 
in function `main':
(.text+0x1e): undefined reference to `MAIN__'
collect2: ld returned 1 exit status

Thanks,
Benjamin
