At first glance, your code doesn't look problematic.  The first thing I'd check 
is that QRECS is large enough to hold the incoming data (i.e., that you aren't 
overrunning the buffer and causing memory corruption, which can lead to 
weird/unexplained faults like this).
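
A quick way to sanity-check that is to make sure the receive count can never 
exceed the allocated size of QRECS, and to look at the status returned by 
mpi_recv.  A minimal sketch -- nq and MPI_DOUBLE_PRECISION here are just 
placeholders for whatever count and datatype you actually use:

     integer :: status(MPI_STATUS_SIZE), rcount, ierr
     ! nq must not exceed the allocated size of QRECS, or the receive can
     ! write past the end of the buffer
     if (nq > size(QRECS)) stop 'QRECS is too small for the receive count'
     call mpi_recv(QRECS, nq, MPI_DOUBLE_PRECISION, i, itag, &
                   MPI_COMM_WORLD, status, ierr)
     ! ask MPI how many elements actually arrived from rank i
     call mpi_get_count(status, MPI_DOUBLE_PRECISION, rcount, ierr)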

Also, you might well be able to accomplish the same communication pattern with 
MPI_GATHER (or MPI_GATHERV, if each rank is sending a different amount of 
information).
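
For example, a minimal sketch with MPI_GATHERV followed by the broadcast -- 
nq, counts, displs, nprocs, and the datatype are placeholders for your actual 
per-rank count, count/displacement arrays, process count, and type; if rank 0 
should contribute nothing, set its send count and counts(0) to 0:

     integer :: ierr
     integer :: counts(0:nprocs-1), displs(0:nprocs-1)
     ! counts(i) = number of elements rank i contributes
     ! displs(i) = offset in YVAR where rank i's data is placed
     call mpi_gatherv(Q, nq, MPI_DOUBLE_PRECISION,                &
                      YVAR, counts, displs, MPI_DOUBLE_PRECISION, &
                      0, MPI_COMM_WORLD, ierr)
     call mpi_bcast(YVAR, sum(counts), MPI_DOUBLE_PRECISION, 0,   &
                    MPI_COMM_WORLD, ierr)

(MPI_ALLGATHERV would even fold the gather and the broadcast into a single 
call.)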


On Sep 14, 2013, at 12:27 AM, Huangwei <hz...@cam.ac.uk> wrote:

> The code I would like to post is like this:
> 
> if(myrank .ne. 0) then
>      itag = myrank
>      call mpi_send(Q.............., 0, itag, .................)
> else 
>      do i=1, N-1
>           itag = i
>          call mpi_recv(QRECS......., i, itag, .................)
>      enddo
>    
> endif
> 
> call mpi_bcast(YVAR............., 0, ..............)
> 
> best regards,
> Huangwei
> 
> 
> On 13 September 2013 23:25, Huangwei <hz...@cam.ac.uk> wrote:
> Dear All, 
> 
> I have a question about using MPI_send and MPI_recv. 
> 
> The objective is as follows:
> I would like to send an array Q from ranks 1 through N-1 to rank 0, and have 
> rank 0 receive Q from all the other processors. Each received Q is then 
> copied into a new array Y on rank 0 (of course this copy is not done by MPI), 
> and then MPI_bcast is used (from rank 0) to broadcast Y to all the processors. 
> 
> The Fortran code looks like:
> if(myrank .eq. 0) then
>      itag = myrank
>      call mpi_send(Q.............., 0, itag, .................)
> else 
>      do i=1, N-1
>           itag = i
>          call mpi_recv(QRECS......., i, itag, .................)
>      enddo
>    
> endif
> 
> call mpi_bcast(YVAR............., 0, ..............)
> 
> The problem I met is:
> My simulation performs time marching, and these mpi_send and mpi_recv calls 
> work fine for the first three time steps. However, at the fourth time step, 
> the receive loop only completes from i=1 to i=13 (there are 48 processors in 
> total). That means that from the 14th processor onward, mpi_recv does not 
> receive the data from them, so the code hangs there forever. Does a deadlock 
> occur in this situation? How can I track down this problem?
> 
> Thank you so much if anyone can give me some suggestions. 
> 
> best regards,
> Huangwei
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
