Maybe this FAQ will help: http://www.open-mpi.org/faq/?category=openfabrics#v1.2-use-early-completion

Brock Palen wrote:
We have a code (arts) that locks up only when running over IB; it works fine on tcp and sm.

When we ran it in a debugger, it locked up in an MPI_Comm_split() that, as far as I could tell, was valid. Because the split was a hack they used to call MPI_File_open() on a single CPU, we reworked the code to remove the split. It then locked up again.
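For readers unfamiliar with the hack being described: MPI_Comm_split partitions a communicator by a "color" argument and orders each resulting communicator by (key, old rank), so giving one rank a unique color yields a single-rank communicator that can call MPI_File_open alone. A toy Python model of that split semantics (not real MPI; all names here are illustrative, and color None stands in for MPI_UNDEFINED):

```python
def comm_split(ranks_colors_keys):
    """Toy model of MPI_Comm_split: group ranks by color, then assign
    new ranks within each group ordered by (key, old_rank).
    Ranks passing color None (standing in for MPI_UNDEFINED) get no
    communicator. Returns {old_rank: (color, new_rank)}."""
    groups = {}
    for old_rank, color, key in ranks_colors_keys:
        if color is None:
            continue  # this rank opts out of the split
        groups.setdefault(color, []).append((key, old_rank))
    result = {}
    for color, members in groups.items():
        members.sort()  # order by (key, old_rank), as the standard specifies
        for new_rank, (_key, old_rank) in enumerate(members):
            result[old_rank] = (color, new_rank)
    return result

# The single-CPU MPI_File_open pattern: rank 0 takes color 0, the rest
# take color 1; only the color-0 communicator (containing just rank 0)
# would then call MPI_File_open.
mapping = comm_split([(0, 0, 0), (1, 1, 0), (2, 1, 0), (3, 1, 0)])
print(mapping)  # → {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (1, 2)}
```

In real MPI the simpler alternative is usually to open the file on MPI_COMM_SELF from one rank, which is presumably why removing the split was straightforward.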

This time it locked up in an MPI_Allreduce(), which was really strange: when running on 8 CPUs, only rank 4 would get stuck. The rest of the ranks are fine and get the right value. (We are using DDT as our debugger.)

It's very strange. Do you have any idea what could cause this? We are using openmpi-1.2.3/1.2.6 with the PGI compilers.


Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

