Simone Pellegrini wrote:
Sorry for the delay, but I did some additional experiments to find out
whether the problem was openmpi or gcc!
The program just hangs... and never terminates! I am running on an SMP
machine with 32 cores; it is a Sun Fire X4600 X2 (8 quad-core AMD
Barcelona chips). The OS is CentOS 5 and the kernel is
2.6.18-92.el5.src-PAPI (patched with PAPI).
I use an N of 1024, and if I print out the value of the iterator i,
sometimes it stops around 165, other times around 520... and it
doesn't make any sense.
If I run the program with the mpirun from a different MPI version (and
it's important to note that I don't recompile it), the program works
fine. I did some experiments during the weekend, and if I use
openmpi-1.3.2 compiled with gcc433 everything works fine.
So I really think the problem is strictly related to the use of
gcc-4.4.0! ...and it doesn't depend on the OpenMPI version, as the
program hangs even when I use openmpi-1.3.1 compiled with gcc 4.4!
I finally got GCC 4.4, but was unable to reproduce the problem. How
small can you make np (number of MPI processes) and still see the
problem? How reproducible is the problem? When it hangs, can you get
stack traces of all the processes? We're trying to hunt down some
similar behavior, but I think yours is of a different flavor.
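
In case it helps with collecting the stack traces, here is a minimal
sketch of one way to do it, assuming gdb and pgrep are installed on the
node and that you have permission to attach to the processes. The
function names and the program name passed to them are hypothetical;
substitute the name of your own executable.

```python
import subprocess


def gdb_backtrace_cmd(pid):
    """Build a gdb command that attaches to `pid`, prints the stack of
    every thread, and then exits (which detaches from the process)."""
    return ["gdb", "-batch", "-p", str(pid), "-ex", "thread apply all bt"]


def find_pids(program_name):
    """Return the PIDs of processes whose name matches exactly (pgrep -x)."""
    result = subprocess.run(
        ["pgrep", "-x", program_name], capture_output=True, text=True
    )
    return [int(tok) for tok in result.stdout.split()]


def dump_stacks(program_name):
    """Print a backtrace for every running process named `program_name`."""
    for pid in find_pids(program_name):
        print(f"===== PID {pid} =====")
        out = subprocess.run(
            gdb_backtrace_cmd(pid), capture_output=True, text=True
        )
        print(out.stdout)
```

Once the job has hung, calling dump_stacks("my_mpi_prog") on the node
(with the placeholder replaced by the real executable name) should print
one backtrace per MPI process, which shows where each rank is stuck.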