Simone Pellegrini wrote:

Sorry for the delay, but I did some additional experiments to find out whether the problem was Open MPI or GCC.

The program just hangs and never terminates. I am running on an SMP machine with 32 cores, a Sun Fire X4600 X2 (8 quad-core AMD Barcelona chips). The OS is CentOS 5 and the kernel is 2.6.18-92.el5.src-PAPI (patched with PAPI). I use N = 1024, and if I print the value of the iterator i, it sometimes stops around 165 and other times around 520, which doesn't make any sense.

If I run the same binary under a different MPI version (note that I don't recompile it, I just use the mpirun from another MPI installation), the program works fine. I did some experiments over the weekend, and with openmpi-1.3.2 compiled with gcc433 everything works fine.

So I really think the problem is strictly related to the use of gcc-4.4.0. It doesn't depend on the Open MPI version, as the program hangs even when I use Open MPI 1.3.1 compiled with gcc 4.4.

I finally got GCC 4.4, but was unable to reproduce the problem. How small can you make np (number of MPI processes) and still see the problem? How reproducible is the problem? When it hangs, can you get stack traces of all the processes? We're trying to hunt down some similar behavior, but I think yours is of a different flavor.
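
For the stack traces, a small helper along the following lines might be one way to collect them. It is only a sketch under some assumptions: gdb and pgrep are installed on the node, you have permission to attach to the hung processes, and the Python on the machine is recent enough for subprocess.run. The function names (gdb_backtrace_command, find_pids, dump_stacks) are made up for illustration and are not part of Open MPI.

```python
import subprocess
import sys


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid`, prints the
    backtrace of every thread, and exits (which detaches again)."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def find_pids(program_name):
    """Return the PIDs of running processes whose name matches exactly.

    Relies on `pgrep -x`; note that Linux truncates process names to
    15 characters, so very long executable names may need adjusting.
    """
    result = subprocess.run(
        ["pgrep", "-x", program_name], capture_output=True, text=True
    )
    return [int(tok) for tok in result.stdout.split()]


def dump_stacks(program_name):
    """Print one gdb backtrace section per matching process."""
    for pid in find_pids(program_name):
        print("=== PID %d ===" % pid)
        out = subprocess.run(
            gdb_backtrace_command(pid), capture_output=True, text=True
        )
        print(out.stdout)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python dump_stacks.py <name of the hung executable>
    dump_stacks(sys.argv[1])
```

Run it on the node while the job is hung, passing the name of the MPI executable. Seeing where each rank is stuck (for example, inside which MPI call) is usually more informative than the value of the loop iterator.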
