It is always a good idea to have your application's sizeof(INTEGER) match the
MPI library's sizeof(INTEGER). A mismatch is a recipe for trouble.
Meaning: if you're compiling your app with -make-integer-be-8-bytes, then you
should configure/build Open MPI with that same flag.
I'm thinking that this should *only* affect the back-end behavior of
MPI_INTEGER; the size of address pointers and whatnot should not be affected
(unless -make-integer-be-8-bytes also changes the sizes of some other types).
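
For example, something along these lines (the prefix and file name are only
illustrative, assuming gfortran) keeps the two in sync:

  ./configure --prefix=$HOME/openmpi-i8 CC=gcc CXX=g++ F77=gfortran FC=gfortran \
      FFLAGS="-fdefault-integer-8" FCFLAGS="-fdefault-integer-8"
  make all install
  # then compile the application through the wrapper with the same flag
  $HOME/openmpi-i8/bin/mpif77 -fdefault-integer-8 -c your_app.f

That way the INTEGERs your code hands to MPI_SEND are the same size as the
INTEGERs the library's Fortran bindings were built to expect.
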
On Dec 5, 2010, at 9:01 PM, Gustavo Correa wrote:
> Hi Benjamin
>
> I guess you could compile OpenMPI with standard integer and real sizes.
> Then compile your application (DRAGON) with the flags to change to 8-byte
> integers and 8-byte reals.
> We have some programs here that use real*8 and are compiled this way,
> and run without a problem.
> I guess this is what Tim Prince was also telling you in his comments.
>
> You can pass those flags to the MPI compiler wrappers (mpif77 etc),
> which will relay them to gfortran when you compile DRAGON.
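>
> For instance (just a sketch; the file names are placeholders):
>
>   mpif77 -fdefault-integer-8 -fdefault-real-8 -c dragon_module.f
>   mpif77 -fdefault-integer-8 -fdefault-real-8 -o dragon dragon_module.o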
>
> I am not even sure if those flags would be accepted or ignored by OpenMPI
> when you build it.
> I guess they will be ignored.
> You could check this out by looking at the MPI type sizes in your header
> files in the include directory and subdirectories.
>
> Maybe an OpenMPI developer could shed some light here.
>
> Moreover, if I remember right,
> the MPI address type follows the machine architecture,
> i.e., 32 bits if your machine is 32-bit, 64 bits if the machine is 64-bit,
> and you don't need to force it to be 8 bytes with compilation flags.
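>
> For example (a sketch; buf and mpierr are placeholders, with mpif.h included),
> the address kind already matches the native pointer size:
>
>   integer(kind=MPI_ADDRESS_KIND) addr
>   call MPI_Get_address(buf, addr, mpierr)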
>
> Unfortunately, mixing pointers ("Cray pointers", I suppose)
> with integers is a common source of headaches, if DRAGON does this.
> It is yet another situation where negative integers could crop up
> and lead to a segmentation fault.
> At least one ocean circulation model we run here had
> many problems because of this mix of integers and (Cray) pointers
> spread all across the code.
>
> Gus Correa
>
> On Dec 5, 2010, at 7:17 PM, Benjamin Toueg wrote:
>
>> Unfortunately DRAGON is old FORTRAN77. Integers have been used instead of
>> pointers. If I compile it in 64-bit mode without -fdefault-integer-8, the
>> so-called pointers will remain 32 bits. Problems could also arise from its
>> data structure handlers.
>>
>> Therefore -fdefault-integer-8 is absolutely necessary.
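>>
>> To illustrate what I mean (a hypothetical sketch, not actual DRAGON code;
>> LOC is a common compiler extension that returns an address):
>>
>>       INTEGER IADDR
>>       DOUBLE PRECISION WORK(1000)
>> C     store the address of WORK in a plain INTEGER, as old F77 codes do
>>       IADDR = LOC(WORK)
>>
>> With default 4-byte INTEGERs on a 64-bit machine the upper half of that
>> address is lost; with -fdefault-integer-8 the full address fits.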
>>
>> Furthermore, MPI_SEND and MPI_RECV are called a dozen times in only one
>> source file (used for passing a data structure from one node to another) and
>> it has proved to work in every situation.
>>
>> Not knowing which line is causing my segfault is annoying.
>>
>> Regards,
>> Benjamin
>>
>> 2010/12/6 Gustavo Correa <[email protected]>
>> Hi Benjamin
>>
>> I would just rebuild OpenMPI withOUT the compiler flags that change the
>> standard sizes of "int" and "float" (do a "make distclean" first!), then
>> recompile your program, and see how it goes.
>> I don't think you are gaining anything by trying to change the standard
>> "int/integer" and "real/float" sizes, and most likely it is inviting
>> trouble and making things more confusing.
>> Worst case, you will at least be sure that the bug is somewhere else,
>> not in a mismatch of basic type sizes.
>>
>> If you need to pass 8-byte real buffers, use MPI_DOUBLE_PRECISION, or
>> MPI_REAL8
>> in your (Fortran) MPI calls, and declare them in the Fortran code accordingly
>> (double precision or real(kind=8)).
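>>
>> A minimal sketch of what I mean (count, dest and tag are just placeholders):
>>
>>       include 'mpif.h'
>>       double precision buf(1000)
>>       integer count, dest, tag, mpierr
>>       count = 1000
>>       dest = 1
>>       tag = 0
>>       call MPI_Send(buf, count, MPI_DOUBLE_PRECISION, dest, tag, MPI_COMM_WORLD, mpierr)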
>>
>> If I remember right, there is no 8-byte integer support in the Fortran MPI
>> bindings,
>> only in the C bindings, but some OpenMPI expert could clarify this.
>> Hence, if you are passing 8-byte integers in your MPI calls this may also be
>> problematic.
>>
>> My two cents,
>> Gus Correa
>>
>> On Dec 5, 2010, at 3:04 PM, Benjamin Toueg wrote:
>>
>>> Hi,
>>>
>>> First of all thanks for your insight !
>>>
>>> Do you get a corefile?
>>> I don't get a core file, but I get a file called _FIL001. It doesn't
>>> contain any debugging symbols. It's most likely a digested version of the
>>> input file given to the executable: ./myexec < inputfile.
>>>
>>> there's no line numbers printed in the stack trace
>>> I would love to see those, but even if I configure openmpi with
>>> --enable-debug --enable-mem-debug --enable-mem-profile, they don't show up.
>>> I recompiled my sources to be sure to properly link them to the newly
>>> debugged version of openmpi. I assumed I didn't need to compile my own
>>> sources with the -g option since it crashes in openmpi itself? I didn't try
>>> to run mpiexec via gdb either; I guess it won't help since I already get the trace.
>>>
>>> the -fdefault-integer-8 option ought to be highly dangerous
>>> Thanks for pointing that out. Indeed I had some issues with this option. For
>>> instance I have to declare some arguments as INTEGER*4, like RANK, SIZE, IERR in:
>>> CALL MPI_COMM_RANK(MPI_COMM_WORLD,RANK,IERR)
>>> CALL MPI_COMM_SIZE(MPI_COMM_WORLD,SIZE,IERR)
>>> In your example "call MPI_Send(buf, count, MPI_INTEGER, dest, tag,
>>> MPI_COMM_WORLD, mpierr)" I checked that count is never bigger than 2000 (as
>>> you mentioned it could flip to negative). However I haven't declared it
>>> as INTEGER*4 and I think I should.
>>> When I said "I had to raise the number of data structures to be sent", I
>>> meant that I had to call MPI_SEND many more times, not that buffers were
>>> bigger than before.
>>>
>>> I'll get back to you with more info when I'm able to fix my connection
>>> problem to the cluster...
>>>
>>> Thanks,
>>> Benjamin
>>>
>>> 2010/12/3 Martin Siegert <[email protected]>
>>> Hi All,
>>>
>>> just to expand on this guess ...
>>>
>>> On Thu, Dec 02, 2010 at 05:40:53PM -0500, Gus Correa wrote:
>>>> Hi All
>>>>
>>>> I wonder if configuring OpenMPI while
>>>> forcing the default types to non-default values
>>>> (-fdefault-integer-8 -fdefault-real-8) might have
>>>> something to do with the segmentation fault.
>>>> Would this be effective, i.e., actually make the
>>>> sizes of MPI_INTEGER/MPI_INT and MPI_REAL/MPI_FLOAT bigger,
>>>> or just elusive?
>>>
>>> I believe what happens is that this mostly affects the Fortran
>>> wrapper routines and the way Fortran variables are mapped to C:
>>>
>>> MPI_INTEGER -> MPI_LONG
>>> MPI_REAL -> MPI_DOUBLE
>>> MPI_DOUBLE_PRECISION -> MPI_DOUBLE
>>>
>>> In that respect I believe that the -fdefault-real-8 option is harmless,
>>> i.e., it does the expected thing.
>>> But the -fdefault-integer-8 option ought to be highly dangerous:
>>> It works for integer variables that are used as "buffer" arguments
>>> in MPI statements, but I would assume that this does not work for
>>> "count" and similar arguments.
>>> Example:
>>>
>>> include 'mpif.h'
>>> integer, allocatable :: buf(:,:)
>>> integer i, i2, count, dest, tag, mpierr
>>>
>>> i = 32768
>>> i2 = 2*i
>>> allocate(buf(i,i2))
>>> count = i*i2              ! 32768 * 65536 = 2**31
>>> buf = 1
>>> dest = 1
>>> tag = 0
>>> call MPI_Send(buf, count, MPI_INTEGER, dest, tag, MPI_COMM_WORLD, mpierr)
>>>
>>> Now count is 2^31, which overflows a 32-bit integer.
>>> The MPI standard requires that count is a 32-bit integer, correct?
>>> Thus while buf gets the type MPI_LONG, count remains an int.
>>> Is this interpretation correct? If it is, then you are calling
>>> MPI_Send with a count argument of -2147483648,
>>> which could result in a segmentation fault.
>>>
>>> Cheers,
>>> Martin
>>>
>>> --
>>> Martin Siegert
>>> Head, Research Computing
>>> WestGrid/ComputeCanada Site Lead
>>> IT Services phone: 778 782-4691
>>> Simon Fraser University fax: 778 782-4242
>>> Burnaby, British Columbia email: [email protected]
>>> Canada V5A 1S6
>>>
>>>> There were some recent discussions here about MPI
>>>> limiting counts to MPI_INTEGER.
>>>> Since Benjamin said he "had to raise the number of data structures",
>>>> which eventually led to the error,
>>>> I wonder if he is inadvertently flipping to the negative integer
>>>> side of the 32-bit universe (i.e. >= 2**31), as was reported here by
>>>> other list subscribers a few times.
>>>>
>>>> Anyway, segmentation fault can come from many different places,
>>>> this is just a guess.
>>>>
>>>> Gus Correa
>>>>
>>>> Jeff Squyres wrote:
>>>>> Do you get a corefile?
>>>>>
>>>>> It looks like you're calling MPI_RECV in Fortran and then it segv's.
>>>>> This is *likely* because you're either passing a bad parameter or your
>>>>> buffer isn't big enough. Can you double check all your parameters?
>>>>>
>>>>> Unfortunately, there's no line numbers printed in the stack trace, so
>>>>> it's not possible to tell exactly where in the ob1 PML it's dying (i.e.,
>>>>> so we can't see exactly what it's doing to cause the segv).
>>>>>
>>>>>
>>>>>
>>>>> On Dec 2, 2010, at 9:36 AM, Benjamin Toueg wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am using DRAGON, a neutronics simulation code in FORTRAN77 that has its
>>>>>> own data structures. I added a module to send these data structures
>>>>>> via MPI_SEND / MPI_RECV, and everything worked perfectly for a while.
>>>>>>
>>>>>> Then I had to raise the number of data structures to be sent, up to a
>>>>>> point where my cluster hits this bug:
>>>>>> *** Process received signal ***
>>>>>> Signal: Segmentation fault (11)
>>>>>> Signal code: Address not mapped (1)
>>>>>> Failing at address: 0x2c2579fc0
>>>>>> [ 0] /lib/libpthread.so.0 [0x7f52d2930410]
>>>>>> [ 1] /home/toueg/openmpi/lib/openmpi/mca_pml_ob1.so [0x7f52d153fe03]
>>>>>> [ 2] /home/toueg/openmpi/lib/libmpi.so.0(PMPI_Recv+0x2d2)
>>>>>> [0x7f52d3504a1e]
>>>>>> [ 3] /home/toueg/openmpi/lib/libmpi_f77.so.0(pmpi_recv_+0x10e)
>>>>>> [0x7f52d36cf9c6]
>>>>>>
>>>>>> How can I make this error more explicit?
>>>>>>
>>>>>> I use the following configuration of openmpi-1.4.3:
>>>>>> ./configure --enable-debug --prefix=/home/toueg/openmpi CXX=g++ CC=gcc
>>>>>> F77=gfortran FC=gfortran FLAGS="-m64 -fdefault-integer-8
>>>>>> -fdefault-real-8 -fdefault-double-8" FCFLAGS="-m64 -fdefault-integer-8
>>>>>> -fdefault-real-8 -fdefault-double-8" --disable-mpi-f90
>>>>>>
>>>>>> Here is the output of mpif77 -v:
>>>>>> mpif77 for 1.2.7 (release) of : 2005/11/04 11:54:51
>>>>>> Driving: f77 -L/usr/lib/mpich-mpd/lib -v -lmpich-p4mpd -lpthread -lrt
>>>>>> -lfrtbegin -lg2c -lm -shared-libgcc
>>>>>> Reading specs from /usr/lib/gcc/x86_64-linux-gnu/3.4.6/specs
>>>>>> Configured with: ../src/configure -v --enable-languages=c,c++,f77,pascal
>>>>>> --prefix=/usr --libexecdir=/usr/lib
>>>>>> --with-gxx-include-dir=/usr/include/c++/3.4 --enable-shared
>>>>>> --with-system-zlib --enable-nls --without-included-gettext
>>>>>> --program-suffix=-3.4 --enable-__cxa_atexit --enable-clocale=gnu
>>>>>> --enable-libstdcxx-debug x86_64-linux-gnu
>>>>>> Thread model: posix
>>>>>> gcc version 3.4.6 (Debian 3.4.6-5)
>>>>>> /usr/lib/gcc/x86_64-linux-gnu/3.4.6/collect2 --eh-frame-hdr -m
>>>>>> elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2
>>>>>> /usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib/crt1.o
>>>>>> /usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib/crti.o
>>>>>> /usr/lib/gcc/x86_64-linux-gnu/3.4.6/crtbegin.o -L/usr/lib/mpich-mpd/lib
>>>>>> -L/usr/lib/gcc/x86_64-linux-gnu/3.4.6
>>>>>> -L/usr/lib/gcc/x86_64-linux-gnu/3.4.6
>>>>>> -L/usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib
>>>>>> -L/usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../.. -L/lib/../lib
>>>>>> -L/usr/lib/../lib -lmpich-p4mpd -lpthread -lrt -lfrtbegin -lg2c -lm
>>>>>> -lgcc_s -lgcc -lc -lgcc_s -lgcc
>>>>>> /usr/lib/gcc/x86_64-linux-gnu/3.4.6/crtend.o
>>>>>> /usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib/crtn.o
>>>>>> /usr/lib/gcc/x86_64-linux-gnu/3.4.6/../../../../lib/libfrtbegin.a(frtbegin.o):
>>>>>> In function `main':
>>>>>> (.text+0x1e): undefined reference to `MAIN__'
>>>>>> collect2: ld returned 1 exit status
>>>>>>
>>>>>> Thanks,
>>>>>> Benjamin
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>
>
--
Jeff Squyres
[email protected]
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/