I am trying to modify the communication routines in our code to use MPI_Put calls instead of sends and receives. This worked fine for Puts of several variables, but now I have one that is causing seg faults. Reading through the MPI documentation, it is not clear to me whether what I am doing is permissible. Basically, the question is this: if I have defined an entire array as a window on each processor, can I PUT data from that array to remote processes at the same time as the remote processes are PUTing into the local copy, assuming none of the PUTs overlap?
Here are the details if that doesn't make sense. I have a (Fortran) array QF(6,2,N) on each processor, where N can be very large (100,000). I create a window QFWIN over the entire array on all processors. I define MPI_Type_indexed "sending" datatypes (QFSND) with block lengths of 6 that send from QF(1,1,*), and MPI_Type_indexed "receiving" datatypes (QFREC) with block lengths of 6 that receive into QF(1,2,*). Here * is a non-repeating set of integers up to N.

I create groups of processors that communicate; the processors in each group all exchange QF data, PUTing local QF(1,1,*) to remote QF(1,2,*). So processor 1 is PUTing QF data to processors 2, 3, and 4 at the same time that 2, 3, and 4 are PUTing their QF data to 1, and so on. Processors 2, 3, and 4 PUT into non-overlapping regions of QF(1,2,*) on 1, while 1 PUTs from its QF(1,1,*) to 2, 3, and 4.

My calls look like this on each processor:

   assertion = 0
   call MPI_Win_post(group, assertion, QFWIN, ierr)
   call MPI_Win_start(group, assertion, QFWIN, ierr)
   do I = 1, neighbors
      call MPI_Put(QF, 1, QFSND(I), NEIGHBOR(I), 0, 1, QFREC(I), QFWIN, ierr)
   end do
   call MPI_Win_complete(QFWIN, ierr)
   call MPI_Win_wait(QFWIN, ierr)

Note that I defined QFREC locally on each processor so that it properly describes where the data goes on the remote processors. The error value ierr is 0 after MPI_Win_post, MPI_Win_start, MPI_Put, and MPI_Win_complete, and the code seg faults in MPI_Win_wait.

I'm using Open MPI 1.4.3 on Mac OS X 10.6.5, built with the Intel XE (12.0) compilers, and running on just 2 (internal) processors of my Mac Pro. The code ran normally with this configuration up until the point I put the above in. Several other communications with MPI_Put similar to the above work fine, though in those cases the window covers only a subset of the communicated array, and the origin data is PUT from a part of the array that is not within the window.

_____________________________________________________
Matt
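P.S. In case it helps, here is a reduced, self-contained sketch of the pattern with the same epoch structure as the loop above. The sizes are hypothetical, it runs on exactly 2 ranks, and it uses a single contiguous MPI_DOUBLE_PRECISION block in place of my MPI_Type_indexed datatypes; both ranks expose the whole of QF and PUT into each other during the same post/start epoch:

```fortran
! Reduced sketch: two ranks each expose all of QF(6,2,N) in a window
! and simultaneously PUT their QF(1:6,1,1) into the other rank's
! QF(1:6,2,1), using the post/start/complete/wait synchronization.
program pscw_sketch
  use mpi
  implicit none
  integer, parameter :: N = 4
  double precision :: QF(6,2,N)
  integer :: QFWIN, grp, wgrp, peer(1), ierr, me
  integer(kind=MPI_ADDRESS_KIND) :: winsize, disp

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, me, ierr)

  QF = dble(me)
  winsize = 6_MPI_ADDRESS_KIND * 2 * N * 8    ! whole array, in bytes
  call MPI_Win_create(QF, winsize, 8, MPI_INFO_NULL, &
                      MPI_COMM_WORLD, QFWIN, ierr)

  peer(1) = 1 - me                            ! the other rank
  call MPI_Comm_group(MPI_COMM_WORLD, wgrp, ierr)
  call MPI_Group_incl(wgrp, 1, peer, grp, ierr)

  call MPI_Win_post(grp, 0, QFWIN, ierr)      ! expose my window to the peer
  call MPI_Win_start(grp, 0, QFWIN, ierr)     ! begin my access to the peer

  ! PUT my QF(1:6,1,1) into the peer's QF(1:6,2,1); in column-major
  ! order element (1,2,1) sits at displacement 6 (in units of 8 bytes).
  disp = 6
  call MPI_Put(QF(1,1,1), 6, MPI_DOUBLE_PRECISION, peer(1), disp, &
               6, MPI_DOUBLE_PRECISION, QFWIN, ierr)

  call MPI_Win_complete(QFWIN, ierr)          ! my PUTs are done
  call MPI_Win_wait(QFWIN, ierr)              ! peer's PUTs into me are done

  call MPI_Win_free(QFWIN, ierr)
  call MPI_Finalize(ierr)
end program pscw_sketch
```

Run with "mpirun -np 2 ./pscw_sketch". This reduced version does not seg fault for me conceptually in the same way, which is why I suspect the issue is with the window covering the origin buffer as well as the target region.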