I am not sure whether MPI_Test will cover this, so you should check it.
I attach my example script, which shows two working cases and one that
does not work (you can watch the memory usage while it runs and see that
the first two work, while the last one goes ballistic in memory).
Just check it with MPI_Test to see if it works...

2014-09-18 13:20 GMT+02:00 XingFENG <xingf...@cse.unsw.edu.au>:

> Thanks very much for your reply!
>
> To Sir Jeff Squyres:
>
> I think it fails due to truncation errors. I am now logging information of
> each send and receive to find out the reason.
>
>
> To Sir Nick Papior Andersen:
>
> Oh, wait (mpi_wait) is never called in my codes.
>
> What I do is to call MPI_Irecv once. Then MPI_Test is called several
> times to check whether new messages are available. If new messages are
> available, some functions to process these messages are called.
>
> I will add the wait function and check the running results.
>
> On Thu, Sep 18, 2014 at 8:47 PM, Nick Papior Andersen <
> nickpap...@gmail.com> wrote:
>
>> In complement to Jeff, I would add that using asynchronous messages
>> REQUIRES that you wait (mpi_wait) for all messages at some point. Even
>> though this might not seem obvious it is due to memory allocation "behind
>> the scenes" which are only de-allocated upon completion through a wait
>> statement.
>>
>>
>> 2014-09-18 12:36 GMT+02:00 Jeff Squyres (jsquyres) <jsquy...@cisco.com>:
>>
>> On Sep 18, 2014, at 2:43 AM, XingFENG <xingf...@cse.unsw.edu.au> wrote:
>>>
>>> > a. How to get more information about errors? I got errors like below.
>>> > This says that program exited abnormally in function MPI_Test(). But is
>>> > there a way to know more about the error?
>>> >
>>> > *** An error occurred in MPI_Test
>>> > *** on communicator MPI_COMM_WORLD
>>> > *** MPI_ERR_TRUNCATE: message truncated
>>> > *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
>>>
>>> For the purpose of this discussion, let's take a simplification that you
>>> are sending and receiving the same datatypes (e.g., you're sending MPI_INT
>>> and you're receiving MPI_INT).
>>>
>>> This error means that you tried to receive message with too small a
>>> buffer.
>>>
>>> Specifically, MPI says that if you send a message that is X element long
>>> (e.g., 20 MPI_INTs), then the matching receive must be Y elements, where
>>> Y>=X (e.g., *at least* 20 MPI_INTs). If the receiver provides a Y where
>>> Y<X, this is a truncation error.
>>>
>>> Unfortunately, Open MPI doesn't report a whole lot more information
>>> about these kinds of errors than what you're seeing, sorry.
>>>
>>> > b. Are there anything to note about asynchronous communication? I use
>>> > MPI_Isend, MPI_Irecv, MPI_Test to implement asynchronous communication. My
>>> > program works well on small data sets(10K nodes graphs), but it exits
>>> > abnormally on large data set (1M nodes graphs).
>>>
>>> Is it failing due to truncation errors, or something else?
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to:
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/users/2014/09/25344.php
>>>
>>
>>
>>
>> --
>> Kind regards Nick
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2014/09/25345.php
>>
>
>
>
> --
> Best Regards.
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/09/25346.php
>

--
Kind regards Nick
irecv_issend.f90
Description: Binary data