I think Bill is right. Here is the description for MPI_Finalize:

"This routine cleans up all MPI state. Once this routine is called, no MPI routine (not even MPI_Init) may be called, except for MPI_Get_version, MPI_Initialized, and MPI_Finalized. Unless there has been a call to MPI_Abort, you must ensure that all pending communications involving a process are complete before the process calls MPI_Finalize. If the call returns, each process may either continue local computations or exit without participating in further communication with other processes. *At the moment when the last process calls MPI_Finalize, all pending sends must be matched by a receive, and all pending receives must be matched by a send.*"
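To make that last sentence concrete: every nonblocking operation has to be completed (waited on or tested) before MPI_Finalize, and a barrier keeps a fast rank from tearing down while its peer still has communication outstanding. A minimal sketch of that shutdown discipline, with two ranks exchanging one message each (my own illustration, not taken from the code in this thread):

```c
#include <mpi.h>
#include <stdio.h>

/* Sketch: complete all pending requests, then synchronize, then
 * finalize -- the discipline the MPI_Finalize description requires.
 * Illustrative only; assumes exactly 2 ranks. */
int main(int argc, char **argv) {
    int myrank, buf = 0, mesg;
    MPI_Request reqs[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    int peer = 1 - myrank;
    mesg = myrank;

    /* One pending receive and one pending send per rank... */
    MPI_Irecv(&buf, 1, MPI_INT, peer, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&mesg, 1, MPI_INT, peer, 0, MPI_COMM_WORLD, &reqs[1]);

    /* ...both completed before shutdown, so nothing is pending... */
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    /* ...and no rank reaches MPI_Finalize until every rank is here. */
    MPI_Barrier(MPI_COMM_WORLD);

    printf("rank %d received %d\n", myrank, buf);
    MPI_Finalize();
    return 0;
}
```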
So I believe what Bill is alluding to is this: after you called the second Isend, your receive side hadn't yet posted the second Irecv, so when MPI_Finalize was called on the send side, that message was thrown out. When your receive side does get to the second Irecv, it waits for a message that will never arrive.

On Tue, Feb 22, 2011 at 8:06 AM, Bill Rankin <bill.ran...@sas.com> wrote:
> Try putting an "MPI_Barrier()" call before your MPI_Finalize() [*]. I
> suspect that one of the programs (the sending side) is calling Finalize
> before the receiving side has processed the messages.
>
> -bill
>
> [*] pet peeve of mine: this should almost always be standard practice.
>
> > -----Original Message-----
> > From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> > Behalf Of Xianglong Kong
> > Sent: Tuesday, February 22, 2011 10:27 AM
> > To: Open MPI Users
> > Subject: Re: [OMPI users] Beginner's question: why multiple sends or
> > receives don't work?
> >
> > Hi, thank you for the reply.
> >
> > However, using MPI_Waitall instead of MPI_Wait didn't solve the
> > problem. The code would hang at the MPI_Waitall. Also, I don't quite
> > understand why the code is inherently unsafe. Can the non-blocking
> > send or receive cause any deadlock?
> >
> > Thanks!
> >
> > Kong
> >
> > On Mon, Feb 21, 2011 at 2:32 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> > > It's because you're waiting on the receive request to complete before
> > > the send request. This likely works locally because the message
> > > transfer is through shared memory and is fast, but it's still an
> > > inherently unsafe way to block waiting for completion (i.e., the
> > > receive might not complete if the send does not complete).
> > >
> > > What you probably want to do is build an array of 2 requests and then
> > > issue a single MPI_Waitall() on both of them. This will allow MPI to
> > > progress both requests simultaneously.
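Inline note: applied to the hanging example quoted below, Jeff's array-of-requests suggestion would look roughly like this. This is my own untested sketch; it addresses the wait-ordering concern, though (as Kong reports above) it may not cure a hang whose real cause is the interconnect between the nodes:

```c
#include <mpi.h>
#include <stdio.h>

/* Sketch of Jeff's suggestion: put both requests in one array and
 * complete them with a single MPI_Waitall(), so the library can
 * progress the two transfers together instead of serializing on
 * the first MPI_Wait().  Illustrative, untested. */
int main(int argc, char **argv) {
    int myrank;
    int tag1 = 10, tag2 = 11;
    MPI_Request reqs[2];
    MPI_Status stats[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    if (myrank == 0) {                 /* receiving side */
        int buf1, buf2;
        MPI_Irecv(&buf1, 1, MPI_INT, 1, tag1, MPI_COMM_WORLD, &reqs[0]);
        MPI_Irecv(&buf2, 1, MPI_INT, 1, tag2, MPI_COMM_WORLD, &reqs[1]);
        MPI_Waitall(2, reqs, stats);   /* both receives complete here */
        printf("myrank=0, buf1=%d, buf2=%d\n", buf1, buf2);
    } else if (myrank == 1) {          /* sending side */
        int mesg1 = 1, mesg2 = 2;
        MPI_Isend(&mesg1, 1, MPI_INT, 0, tag1, MPI_COMM_WORLD, &reqs[0]);
        MPI_Isend(&mesg2, 1, MPI_INT, 0, tag2, MPI_COMM_WORLD, &reqs[1]);
        MPI_Waitall(2, reqs, stats);   /* both sends complete here */
    }

    MPI_Barrier(MPI_COMM_WORLD);       /* per Bill: sync before finalize */
    MPI_Finalize();
    return 0;
}
```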
> > >
> > > On Feb 18, 2011, at 11:58 AM, Xianglong Kong wrote:
> > >
> > >> Hi, all,
> > >>
> > >> I'm an MPI newbie. I'm trying to connect the two desktops in my office
> > >> with a crossover cable and run a parallel code on them using MPI.
> > >>
> > >> Now the two nodes can ssh to each other without a password, and can
> > >> successfully run the MPI "Hello world" code. However, when I tried to
> > >> use multiple MPI non-blocking sends or receives, the job would hang.
> > >> The problem only shows up when the two processes are launched on
> > >> different nodes; the code runs successfully if the two processes are
> > >> launched on the same node. The code also runs successfully if there is
> > >> only one send and/or one receive in each process.
> > >>
> > >> Here is the code that runs successfully:
> > >>
> > >> #include <stdlib.h>
> > >> #include <stdio.h>
> > >> #include <string.h>
> > >> #include <mpi.h>
> > >>
> > >> int main(int argc, char** argv) {
> > >>
> > >>     int myrank, nprocs;
> > >>
> > >>     MPI_Init(&argc, &argv);
> > >>     MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
> > >>     MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
> > >>
> > >>     printf("Hello from processor %d of %d\n", myrank, nprocs);
> > >>
> > >>     MPI_Request reqs1, reqs2;
> > >>     MPI_Status stats1, stats2;
> > >>
> > >>     int tag1=10;
> > >>     int tag2=11;
> > >>
> > >>     int buf;
> > >>     int mesg;
> > >>     int source=1-myrank;
> > >>     int dest=1-myrank;
> > >>
> > >>     if(myrank==0)
> > >>     {
> > >>         mesg=1;
> > >>
> > >>         MPI_Irecv(&buf, 1, MPI_INT, source, tag1, MPI_COMM_WORLD, &reqs1);
> > >>         MPI_Isend(&mesg, 1, MPI_INT, dest, tag2, MPI_COMM_WORLD, &reqs2);
> > >>     }
> > >>
> > >>     if(myrank==1)
> > >>     {
> > >>         mesg=2;
> > >>
> > >>         MPI_Irecv(&buf, 1, MPI_INT, source, tag2, MPI_COMM_WORLD, &reqs1);
> > >>         MPI_Isend(&mesg, 1, MPI_INT, dest, tag1, MPI_COMM_WORLD, &reqs2);
> > >>     }
> > >>
> > >>     MPI_Wait(&reqs1, &stats1);
> > >>     printf("myrank=%d,received the message\n",myrank);
> > >>
> > >>     MPI_Wait(&reqs2, &stats2);
> > >>     printf("myrank=%d,sent the messages\n",myrank);
> > >>
> > >>     printf("myrank=%d, buf=%d\n",myrank, buf);
> > >>
> > >>     MPI_Finalize();
> > >>     return 0;
> > >> }
> > >>
> > >> And here is the code that hangs:
> > >>
> > >> #include <stdlib.h>
> > >> #include <stdio.h>
> > >> #include <string.h>
> > >> #include <mpi.h>
> > >>
> > >> int main(int argc, char** argv) {
> > >>
> > >>     int myrank, nprocs;
> > >>
> > >>     MPI_Init(&argc, &argv);
> > >>     MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
> > >>     MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
> > >>
> > >>     printf("Hello from processor %d of %d\n", myrank, nprocs);
> > >>
> > >>     MPI_Request reqs1, reqs2;
> > >>     MPI_Status stats1, stats2;
> > >>
> > >>     int tag1=10;
> > >>     int tag2=11;
> > >>
> > >>     int source=1-myrank;
> > >>     int dest=1-myrank;
> > >>
> > >>     if(myrank==0)
> > >>     {
> > >>         int buf1, buf2;
> > >>
> > >>         MPI_Irecv(&buf1, 1, MPI_INT, source, tag1, MPI_COMM_WORLD, &reqs1);
> > >>         MPI_Irecv(&buf2, 1, MPI_INT, source, tag2, MPI_COMM_WORLD, &reqs2);
> > >>
> > >>         MPI_Wait(&reqs1, &stats1);
> > >>         printf("received one message\n");
> > >>
> > >>         MPI_Wait(&reqs2, &stats2);
> > >>         printf("received two messages\n");
> > >>
> > >>         printf("myrank=%d, buf1=%d, buf2=%d\n",myrank, buf1, buf2);
> > >>     }
> > >>
> > >>     if(myrank==1)
> > >>     {
> > >>         int mesg1=1;
> > >>         int mesg2=2;
> > >>
> > >>         MPI_Isend(&mesg1, 1, MPI_INT, dest, tag1, MPI_COMM_WORLD, &reqs1);
> > >>         MPI_Isend(&mesg2, 1, MPI_INT, dest, tag2, MPI_COMM_WORLD, &reqs2);
> > >>
> > >>         MPI_Wait(&reqs1, &stats1);
> > >>         printf("sent one message\n");
> > >>
> > >>         MPI_Wait(&reqs2, &stats2);
> > >>         printf("sent two messages\n");
> > >>     }
> > >>
> > >>     MPI_Finalize();
> > >>     return 0;
> > >> }
> > >>
> > >> And the output of the second (failing) code:
> > >> ***********************************************
> > >> Hello from processor 0 of 2
> > >>
> > >> Received one message
> > >>
> > >> Hello from processor 1 of 2
> > >>
> > >> Sent one message
> > >> *******************************************************
> > >>
> > >> Can anyone help to point out why the second code doesn't work?
> > >>
> > >> Thanks!
> > >>
> > >> Kong
> > >>
> > >> _______________________________________________
> > >> users mailing list
> > >> us...@open-mpi.org
> > >> http://www.open-mpi.org/mailman/listinfo.cgi/users
> > >
> > > --
> > > Jeff Squyres
> > > jsquy...@cisco.com
> > > For corporate legal information go to:
> > > http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> > --
> > Xianglong Kong
> > Department of Mechanical Engineering
> > University of Rochester
> > Phone: (585)520-4412
> > MSN: dinosaur8...@hotmail.com

--
David Zhang
University of California, San Diego