On Feb 22, 2011, at 11:06 AM, Bill Rankin wrote:

> Try putting an "MPI_Barrier()" call before your MPI_Finalize() [*]. I
> suspect that one of the programs (the sending side) is calling Finalize
> before the receiving side has processed the messages.
FWIW: I have rarely seen this be the issue. MPI does not guarantee
point-to-point progress while you are in a collective. Some
implementations do this anyway; others do not (e.g., some of OMPI's
transports will; others will not). In short, a program that does not
ensure all of its outstanding requests have completed before calling
MPI_Finalize is erroneous.

Also, I first read your email on a phone and did not notice that you had
*2* sets of source code. Sorry for the confusion. I just copied your 2nd
code to my test cluster and it runs fine for me across multiple nodes --
it does not hang. The order of waits looks correct to me.

> -bill
>
> [*] pet peeve of mine: this should almost always be standard practice.
>
>
>> -----Original Message-----
>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
>> Behalf Of Xianglong Kong
>> Sent: Tuesday, February 22, 2011 10:27 AM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] Beginner's question: why multiple sends or
>> receives don't work?
>>
>> Hi, thank you for the reply.
>>
>> However, using MPI_Waitall instead of MPI_Wait didn't solve the
>> problem. The code would hang at the MPI_Waitall. Also, I don't quite
>> understand why the code is inherently unsafe. Can non-blocking sends
>> or receives cause a deadlock?
>>
>> Thanks!
>>
>> Kong
>>
>> On Mon, Feb 21, 2011 at 2:32 PM, Jeff Squyres <jsquy...@cisco.com>
>> wrote:
>>> It's because you're waiting on the receive request to complete before
>>> the send request. This likely works locally because the message
>>> transfer is through shared memory and is fast, but it's still an
>>> inherently unsafe way to block waiting for completion (i.e., the
>>> receive might not complete if the send does not complete).
>>>
>>> What you probably want to do is build an array of 2 requests and then
>>> issue a single MPI_Waitall() on both of them. This will allow MPI to
>>> progress both requests simultaneously.
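[Editor's note: the array-of-requests pattern described above can be sketched as a minimal two-rank program. This is not code from the thread; it is an untested illustration assuming a standard MPI installation (compile with mpicc, run with mpirun -np 2).]

```c
/* Sketch of the MPI_Waitall pattern: post both non-blocking
 * operations, then complete them with a single call so the MPI
 * library can progress them in whatever order it can.
 * Assumes exactly 2 ranks, as in the thread's examples. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int myrank;
    int buf = 0, mesg;
    MPI_Request reqs[2];
    MPI_Status stats[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    int peer = 1 - myrank;
    mesg = myrank + 1;

    /* Post the receive and the send up front... */
    MPI_Irecv(&buf, 1, MPI_INT, peer, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&mesg, 1, MPI_INT, peer, 0, MPI_COMM_WORLD, &reqs[1]);

    /* ...then block on both at once, instead of waiting on one
     * request before the other. */
    MPI_Waitall(2, reqs, stats);

    printf("myrank=%d, buf=%d\n", myrank, buf);

    MPI_Finalize();
    return 0;
}
```

Because MPI_Waitall completes the requests in any order, neither rank can deadlock by blocking on a receive whose matching send has not yet been progressed.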
>>>
>>>
>>> On Feb 18, 2011, at 11:58 AM, Xianglong Kong wrote:
>>>
>>>> Hi, all,
>>>>
>>>> I'm an MPI newbie. I'm trying to connect the two desktops in my
>>>> office to each other with a crossover cable and run a parallel code
>>>> on them using MPI.
>>>>
>>>> Now, the two nodes can ssh to each other without a password, and can
>>>> successfully run the MPI "Hello world" code. However, when I tried to
>>>> use multiple MPI non-blocking sends or receives, the job would hang.
>>>> The problem only shows up when the two processes are launched on
>>>> different nodes; the code runs successfully when both processes are
>>>> launched on the same node. The code also runs successfully when
>>>> there is only one send and/or one receive in each process.
>>>>
>>>> Here is the code that runs successfully:
>>>>
>>>> #include <stdlib.h>
>>>> #include <stdio.h>
>>>> #include <string.h>
>>>> #include <mpi.h>
>>>>
>>>> int main(int argc, char** argv) {
>>>>
>>>>     int myrank, nprocs;
>>>>
>>>>     MPI_Init(&argc, &argv);
>>>>     MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
>>>>     MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
>>>>
>>>>     printf("Hello from processor %d of %d\n", myrank, nprocs);
>>>>
>>>>     MPI_Request reqs1, reqs2;
>>>>     MPI_Status stats1, stats2;
>>>>
>>>>     int tag1 = 10;
>>>>     int tag2 = 11;
>>>>
>>>>     int buf;
>>>>     int mesg;
>>>>     int source = 1 - myrank;
>>>>     int dest = 1 - myrank;
>>>>
>>>>     if (myrank == 0)
>>>>     {
>>>>         mesg = 1;
>>>>
>>>>         MPI_Irecv(&buf, 1, MPI_INT, source, tag1, MPI_COMM_WORLD, &reqs1);
>>>>         MPI_Isend(&mesg, 1, MPI_INT, dest, tag2, MPI_COMM_WORLD, &reqs2);
>>>>     }
>>>>
>>>>     if (myrank == 1)
>>>>     {
>>>>         mesg = 2;
>>>>
>>>>         MPI_Irecv(&buf, 1, MPI_INT, source, tag2, MPI_COMM_WORLD, &reqs1);
>>>>         MPI_Isend(&mesg, 1, MPI_INT, dest, tag1, MPI_COMM_WORLD, &reqs2);
>>>>     }
>>>>
>>>>     MPI_Wait(&reqs1, &stats1);
>>>>     printf("myrank=%d, received the message\n", myrank);
>>>>
>>>>     MPI_Wait(&reqs2, &stats2);
>>>>     printf("myrank=%d, sent the message\n", myrank);
>>>>
>>>>     printf("myrank=%d, buf=%d\n", myrank, buf);
>>>>
>>>>     MPI_Finalize();
>>>>     return 0;
>>>> }
>>>>
>>>> And here is the code that hangs:
>>>>
>>>> #include <stdlib.h>
>>>> #include <stdio.h>
>>>> #include <string.h>
>>>> #include <mpi.h>
>>>>
>>>> int main(int argc, char** argv) {
>>>>
>>>>     int myrank, nprocs;
>>>>
>>>>     MPI_Init(&argc, &argv);
>>>>     MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
>>>>     MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
>>>>
>>>>     printf("Hello from processor %d of %d\n", myrank, nprocs);
>>>>
>>>>     MPI_Request reqs1, reqs2;
>>>>     MPI_Status stats1, stats2;
>>>>
>>>>     int tag1 = 10;
>>>>     int tag2 = 11;
>>>>
>>>>     int source = 1 - myrank;
>>>>     int dest = 1 - myrank;
>>>>
>>>>     if (myrank == 0)
>>>>     {
>>>>         int buf1, buf2;
>>>>
>>>>         MPI_Irecv(&buf1, 1, MPI_INT, source, tag1, MPI_COMM_WORLD, &reqs1);
>>>>         MPI_Irecv(&buf2, 1, MPI_INT, source, tag2, MPI_COMM_WORLD, &reqs2);
>>>>
>>>>         MPI_Wait(&reqs1, &stats1);
>>>>         printf("received one message\n");
>>>>
>>>>         MPI_Wait(&reqs2, &stats2);
>>>>         printf("received two messages\n");
>>>>
>>>>         printf("myrank=%d, buf1=%d, buf2=%d\n", myrank, buf1, buf2);
>>>>     }
>>>>
>>>>     if (myrank == 1)
>>>>     {
>>>>         int mesg1 = 1;
>>>>         int mesg2 = 2;
>>>>
>>>>         MPI_Isend(&mesg1, 1, MPI_INT, dest, tag1, MPI_COMM_WORLD, &reqs1);
>>>>         MPI_Isend(&mesg2, 1, MPI_INT, dest, tag2, MPI_COMM_WORLD, &reqs2);
>>>>
>>>>         MPI_Wait(&reqs1, &stats1);
>>>>         printf("sent one message\n");
>>>>
>>>>         MPI_Wait(&reqs2, &stats2);
>>>>         printf("sent two messages\n");
>>>>     }
>>>>
>>>>     MPI_Finalize();
>>>>     return 0;
>>>> }
>>>>
>>>> And the output of the second, failing code:
>>>> ***********************************************
>>>> Hello from processor 0 of 2
>>>>
>>>> received one message
>>>>
>>>> Hello from processor 1 of 2
>>>>
>>>> sent one message
>>>> *******************************************************
>>>>
>>>> Can anyone help point out why the second code doesn't work?
>>>>
>>>> Thanks!
>>>>
>>>> Kong
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>
>> --
>> Xianglong Kong
>> Department of Mechanical Engineering
>> University of Rochester
>> Phone: (585)520-4412
>> MSN: dinosaur8...@hotmail.com

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
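[Editor's note: for completeness, here is the second (hanging) program restructured around a single MPI_Waitall per rank, the pattern recommended earlier in the thread. This is an untested sketch, not code from the thread; note that Jeff reports the original program already runs correctly for him, so this is the more defensive idiom rather than a guaranteed fix for the cross-node hang.]

```c
/* Kong's second program, restructured so each rank posts both
 * requests into an array and completes them with one MPI_Waitall.
 * Untested sketch; assumes exactly 2 ranks. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int myrank, nprocs;
    int tag1 = 10, tag2 = 11;
    MPI_Request reqs[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    if (myrank == 0) {
        int buf1, buf2;
        MPI_Irecv(&buf1, 1, MPI_INT, 1, tag1, MPI_COMM_WORLD, &reqs[0]);
        MPI_Irecv(&buf2, 1, MPI_INT, 1, tag2, MPI_COMM_WORLD, &reqs[1]);
        /* One call completes both receives, in whatever order the
         * messages actually arrive. */
        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
        printf("myrank=%d, buf1=%d, buf2=%d\n", myrank, buf1, buf2);
    } else if (myrank == 1) {
        int mesg1 = 1, mesg2 = 2;
        MPI_Isend(&mesg1, 1, MPI_INT, 0, tag1, MPI_COMM_WORLD, &reqs[0]);
        MPI_Isend(&mesg2, 1, MPI_INT, 0, tag2, MPI_COMM_WORLD, &reqs[1]);
        /* Both sends are guaranteed complete before MPI_Finalize,
         * satisfying the requirement discussed above. */
        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    }

    MPI_Finalize();
    return 0;
}
```

MPI_STATUSES_IGNORE is used because neither rank inspects the completion statuses; pass an MPI_Status array instead if you need the source, tag, or error fields.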