Re: [OMPI users] mpi job is blocked

2012-09-25 Thread Richard
5 16:30:01,Richard wrote: >> Hi Jody, >> thanks for your suggestion and you are right. If I use the ring example, the >> same happened. >> I have put in a printf statement, and it seems that all three processes have >> re

Re: [OMPI users] mpi job is blocked

2012-09-25 Thread Jeff Squyres
ght. If I use the ring example, the >> same happened. >> I have put in a printf statement, and it seems that all three processes have >> reached the line >> calling "PMPI_Allreduce". Any further suggestions? >> >> Thanks. >> Richard >> >

Re: [OMPI users] mpi job is blocked

2012-09-25 Thread Jeff Squyres
rote: > Hi Jody, > thanks for your suggestion and you are right. If I use the ring example, the > same happened. > I have put in a printf statement, and it seems that all three processes have > reached the line > calling "PMPI_Allreduce". Any further suggestions? > > Tha

Re: [OMPI users] mpi job is blocked

2012-09-25 Thread Richard
ge: 12 List-Post: users@lists.open-mpi.org Date: Tue, 25 Sep 2012 09:43:09 +0200 From: jody Subject: Re: [OMPI users] mpi job is blocked To: Open MPI Users Message-ID: Content-Type: text/plain; charset=ISO-8859-1 Hi Richard When a collective call hangs, this usually means that one (or

Re: [OMPI users] mpi job is blocked

2012-09-25 Thread Richard
e, 25 Sep 2012 09:43:09 +0200 From: jody Subject: Re: [OMPI users] mpi job is blocked To: Open MPI Users Message-ID: Content-Type: text/plain; charset=ISO-8859-1 Hi Richard When a collective call hangs, this usually means that one (or more) processes did not reach this command. Are you

Re: [OMPI users] mpi job is blocked

2012-09-25 Thread Richard
ost: users@lists.open-mpi.org Date: Tue, 25 Sep 2012 09:43:09 +0200 From: jody Subject: Re: [OMPI users] mpi job is blocked To: Open MPI Users Message-ID: Content-Type: text/plain; charset=ISO-8859-1 Hi Richard When a collective call hangs, this usually means that one (or more) process

Re: [OMPI users] mpi job is blocked

2012-09-25 Thread Jeff Squyres
+1 Additionally, if you're trying to debug your machines/network/setup, you might want to use something simpler, like the ring programs in the examples/ directory. On Sep 25, 2012, at 9:43 AM, jody wrote: > Hi Richard > > When a collective call hangs, this usually means that one (or more) >

Re: [OMPI users] mpi job is blocked

2012-09-25 Thread jody
Hi Richard, When a collective call hangs, this usually means that one (or more) processes did not reach this command. Are you sure that all processes reach the allreduce statement? If something like this happens to me, I insert print statements just before the MPI call so I can see which processes

[OMPI users] mpi job is blocked

2012-09-25 Thread Richard
I have 3 computers with the same Linux system. I have set up the MPI cluster over SSH connections. I have tested a very simple MPI program, and it works on the cluster. To make my story clear, I name the three computers A, B and C. 1) If I run the job with 2 processes on A and B, it works. 2)