Hi

Thanks for replying. Answering all the questions:

- This is a Debian box, native x86_64, so everything compiled here is naturally 64-bit;

- I compiled fftw-2.1.5 myself because fftw3 has only experimental MPI support, without Fortran bindings. I asked whether the project has stopped because the last release (fftw 3.2 alpha) is dated Nov. 13, 2007;

- I'm using openmpi from the Debian package. I've also compiled openmpi by hand and the same problem happens. I've compiled the latest LAM as well (although I had to explicitly select version 4.1 of the gcc suite because I hit a problem with 4.3: it says g++ isn't boolean capable). I can run other MPI codes on this machine (a pseudo-spectral DNS code I parallelized myself) with this openmpi installation;

- Using LAM it works for 1 processor. It blows up for more than 2. I can run my DNS code with LAM without problems;

- The only 64 bit caveat on the fftw notes relates to the declaration of the plan variables that should be integer(8). I've carefully done that. I even got to the extreme of placing -fdefault-integer-8 in the compilation flags of this code;

 - I can run this code as serial or threaded without problems;

- The 32-bit test was on my laptop, a 32-bit machine; the 64-bit test was on the 64-bit machine. No libraries were carried over from one to the other (svn co, make and so on, on each machine);

 - Yes, I've managed to run the tests (but they are C programs, alas!);

- The program only blows up when it gets to the r2c fft (my first transform). Before that it is able to call other MPI functions;

- Gus, Ode Triunfal by Alvaro de Campos is one of my favourite poems. The early XX century machine emotion fever of electricity. The furious hunger to be alive and eating the world full :)

- I've tried it on another debian box, X86_64, with openmpi from debian and the same problem happens...


 - if I compile with -fdefault-integer-8 this is the error message

5068.0 $ mpirun -np 2 ~/bin/spec2.mpi
Launching MPI program with     2 proc.
[tenorio:21099] *** Process received signal ***
[tenorio:21100] *** Process received signal ***
[tenorio:21099] Signal: Segmentation fault (11)
[tenorio:21099] Signal code:  (128)
[tenorio:21099] Failing at address: (nil)
[tenorio:21099] [ 0] /lib/libpthread.so.0 [0x7f13ca893a90]
[tenorio:21099] [ 1] /usr/lib/libopen-pal.so.0(_int_malloc+0x962) [0x7f13cb3057c2]
[tenorio:21099] [ 2] /usr/lib/libopen-pal.so.0(malloc+0x8f) [0x7f13cb3068ef]
[tenorio:21099] [ 3] /home/rreis/bin/spec2.mpi(MAIN__+0x79a) [0x40eb0a]
[tenorio:21099] [ 4] /home/rreis/bin/spec2.mpi(main+0x2c) [0x46d3cc]
[tenorio:21099] [ 5] /lib/libc.so.6(__libc_start_main+0xe6) [0x7f13ca5501a6]
[tenorio:21099] [ 6] /home/rreis/bin/spec2.mpi [0x407d59]
[tenorio:21099] *** End of error message ***
[tenorio:21100] Signal: Segmentation fault (11)
[tenorio:21100] Signal code:  (128)
[tenorio:21100] Failing at address: (nil)
[tenorio:21100] [ 0] /lib/libpthread.so.0 [0x7f858af35a90]
[tenorio:21100] [ 1] /usr/lib/libopen-pal.so.0(_int_malloc+0x962) [0x7f858b9a77c2]
[tenorio:21100] [ 2] /usr/lib/libopen-pal.so.0(malloc+0x8f) [0x7f858b9a88ef]
[tenorio:21100] [ 3] /home/rreis/bin/spec2.mpi(MAIN__+0x79a) [0x40eb0a]
[tenorio:21100] [ 4] /home/rreis/bin/spec2.mpi(main+0x2c) [0x46d3cc]
[tenorio:21100] [ 5] /lib/libc.so.6(__libc_start_main+0xe6) [0x7f858abf21a6]
[tenorio:21100] [ 6] /home/rreis/bin/spec2.mpi [0x407d59]
[tenorio:21100] *** End of error message ***
mpirun noticed that job rank 0 with PID 21099 on node tenorio exited on signal 11 (Segmentation fault).
1 additional process aborted (not shown)

 - if I take the flag out

5070.0 $ mpirun -np 2 ~/bin/spec2.mpi
Launching MPI program with     2 proc.
Read field (DONE)
[tenorio:21234] *** Process received signal ***
[tenorio:21234] Signal: Segmentation fault (11)
[tenorio:21234] Signal code: Address not mapped (1)
[tenorio:21234] Failing at address: 0x4840
[tenorio:21234] [ 0] /lib/libpthread.so.0 [0x7fd57da65a90]
[tenorio:21234] [ 1] /home/rreis/bin/spec2.mpi(rfftwnd_f77_mpi_+0x16) [0x40f676]
[tenorio:21234] [ 2] /home/rreis/bin/spec2.mpi(MAIN__+0xb69) [0x40f1fe]
[tenorio:21234] [ 3] /home/rreis/bin/spec2.mpi(main+0x2c) [0x46d6bc]
[tenorio:21234] [ 4] /lib/libc.so.6(__libc_start_main+0xe6) [0x7fd57d7221a6]
[tenorio:21234] [ 5] /home/rreis/bin/spec2.mpi [0x407d59]
[tenorio:21234] *** End of error message ***
mpirun noticed that job rank 0 with PID 21234 on node tenorio exited on signal 11 (Segmentation fault).
1 additional process aborted (not shown)
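
On the integer(8) point above: as far as I understand it, the caveat exists because the Fortran plan variable simply holds a C pointer, and on x86_64 a pointer does not fit in a default 4-byte integer. A minimal C sketch of that truncation follows. The function names are hypothetical (they are not part of fftw or MPI), and the two addresses are only sample values copied from the traces above, not a diagnosis of the crash.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 32 bits of an address, which is what happens
   when a 64-bit pointer is stored in a 4-byte integer. */
uint32_t truncate_to_32(uint64_t addr)
{
    return (uint32_t)(addr & 0xFFFFFFFFu);
}

/* Returns 1 if the address is unchanged after being stored in a
   4-byte integer and read back, 0 if the upper bits were lost. */
int survives_32bit_storage(uint64_t addr)
{
    return (uint64_t)truncate_to_32(addr) == addr;
}

/* Sanity check using two sample addresses from the traces. */
void check_examples(void)
{
    /* A small address fits in 32 bits. */
    assert(survives_32bit_storage(0x4840ULL));
    /* A typical x86_64 library address does not. */
    assert(!survives_32bit_storage(0x7f13cb3057c2ULL));
}
```

A plan handle that looks like the second address would come back mangled from a 4-byte integer, which is why the plan variables have to be integer(8) on this box.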


Maybe I should try mpich or compile the openmpi with all bells and whistles and give it another run...

 greets,

 Ricardo Reis

 'Non Serviam'

 PhD student @ Lasef
 Computational Fluid Dynamics, High Performance Computing, Turbulence
 http://www.lasef.ist.utl.pt

 &

 Cultural Instigator @ Rádio Zero
 http://www.radiozero.pt

 http://www.flickr.com/photos/rreis/
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
