Hi Jeff,

Jeff Squyres wrote:
> My INBOX has been a disaster recently. Please ping me repeatedly if
> you need quicker replies (sorry! :-( ).
>
> (btw, should this really be on the devel list, not the user list?)

It's tending that way. I'll keep the thread here for now, for
continuity. If I start a new thread on this topic, I'll move it to devel.

> On Sep 8, 2009, at 1:06 PM, Shaun Jackman wrote:
>> I can see one sort of ugly scenario unfolding in my head. Consider two
>> processes running the following pseudocode:
>>
>> req = MPI_Irecv
>> while (!done) {
>>     while (MPI_Test(req)) {
>>         req = MPI_Irecv
>>     }
>>     MPI_Send(!me)
>>     MPI_Send(!me)
>> }

> Are the sends guaranteed to have matching receives elsewhere? If not,
> this has the potential to deadlock on the whole assuming-buffering
> issue...

You're right that this is an erroneous program, because only one Irecv
is posted for every two Sends. Change the two MPI_Send calls to
MPI_Bsend to prevent the deadlock, and the situation I describe below
still applies.
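
To make the scenario concrete, here is an untested, self-contained C
sketch of it with the Bsend change applied. The tag, payload, buffer
sizing, and fixed iteration count are placeholders of mine, not part of
the original pseudocode:

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int me, other, in, out = 0, flag, i, packsize, bufsize;
    void *bsendbuf;
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &me);
    other = !me;  /* assumes exactly two ranks */

    /* Attach enough buffer space for all of the MPI_Bsends below. */
    MPI_Pack_size(1, MPI_INT, MPI_COMM_WORLD, &packsize);
    bufsize = 2000 * (packsize + MPI_BSEND_OVERHEAD);
    bsendbuf = malloc(bufsize);
    MPI_Buffer_attach(bsendbuf, bufsize);

    MPI_Irecv(&in, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &req);
    for (i = 0; i < 1000; ++i) {  /* stands in for while (!done) */
        /* Drain every receive that has already completed. */
        MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
        while (flag) {
            MPI_Irecv(&in, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &req);
            MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
        }
        /* Two buffered sends per iteration. MPI_Bsend returns without
           waiting for a match, so the backlog of unmatched sends can
           grow without bound. */
        MPI_Bsend(&out, 1, MPI_INT, other, 0, MPI_COMM_WORLD);
        MPI_Bsend(&out, 1, MPI_INT, other, 0, MPI_COMM_WORLD);
    }
    /* Cleanup (cancelling the outstanding receive, draining the
       unmatched sends, detaching the buffer) is deliberately omitted:
       the program is erroneous by construction. */
    MPI_Finalize();
    return 0;
}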

> If you're expecting the sends to be matched by the Irecvs, this looks
> like an erroneous program to me (there will always be 2x as many sends
> outstanding as receives).

>> I'll describe one process here:
>> * MPI_Test checks req->req_complete, which is false, then calls
>>   opal_progress (which finds two packets from the other guy).
>> * Send two packets to the other guy.

> ...only if they're eager. The sends are *not* guaranteed to complete
> until the matching receives occur.

>> * MPI_Test checks req->req_complete, which is true, and returns
>>   immediately. No progress is made.
>> * MPI_Test checks req->req_complete, which is false because no
>>   progress has been made since the last call, so it calls
>>   opal_progress (which finds two packets from the other guy).
>> * Send two packets to the other guy.
>> * MPI_Test checks req->req_complete, which is true, and returns
>>   immediately. No progress is made.
>> * MPI_Test checks req->req_complete, which is false because no
>>   progress has been made since the last call, so it calls
>>   opal_progress (which finds two packets from the other guy).
>> * Send two packets to the other guy.
>> ... and loop.
>>
>> In each iteration through the loop, one packet is received but two
>> packets are sent, so the backlog of unmatched sends grows by one
>> packet per iteration. Eventually this has to end badly.

> Bad user behavior should be punished, yes. :-)
>
> I'm not quite sure that I see the problem you're identifying -- from
> what you describe, I think it's an erroneous program.

With buffered sends, two packets are sent but only one packet is
received in each iteration through the loop, because
MPI_Request_get_status does not check request->req_complete after
calling opal_progress.

>> Following is an untested fix to request_get_status.c. It checks
>> req->req_complete and returns immediately if it is true. If not, it
>> calls opal_progress() and checks req->req_complete again. If
>> OMPI_ENABLE_PROGRESS_THREADS is defined, it checks only once and
>> does not call opal_progress(). It would look better if the body of
>> the loop were factored out into its own function.

> Hmm. Do you mean this to be in request_get_status.c or req_test.c?
> (You mentioned MPI_TEST above, not MPI_REQUEST_GET_STATUS.)

I meant this code for MPI_Request_get_status. I have just read the code
for ompi_request_default_test in req_test.c. It contains code very
similar to what I suggested: it checks request->req_complete, calls
opal_progress, and then checks request->req_complete a second time,
except that it implements the loop using a goto.
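
For reference, the shape shared by my suggested fix and
ompi_request_default_test is roughly the following. This is a
paraphrase from memory, not the actual Open MPI source; error handling
and request cleanup are omitted:

int request_test_sketch(ompi_request_t *request, int *completed)
{
    if (request->req_complete) {  /* fast path: already complete */
        *completed = 1;
        return OMPI_SUCCESS;
    }
#if OMPI_ENABLE_PROGRESS_THREADS == 0
    opal_progress();              /* drive progress once */
    if (request->req_complete) {  /* re-check after making progress */
        *completed = 1;
        return OMPI_SUCCESS;
    }
#endif
    *completed = 0;
    return OMPI_SUCCESS;
}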

> Is this the optimization I mentioned in my previous reply (i.e., if
> req_complete is false, call opal_progress, and then check req_complete
> again)? If so, I think it would be better to do it without an extra
> if somehow (testing and branching, etc.).

Yes, and MPI_Request_get_status would then behave as MPI_Test does
currently. Would it be so crazy for MPI_Test to be implemented as a
call to MPI_Request_get_status followed by MPI_Request_free? It would
eliminate the duplicated code.
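
At the level of the public API, the composition I have in mind would
look something like this hypothetical illustration (the function name
is mine; this is not a proposed patch):

#include <mpi.h>

int test_via_get_status(MPI_Request *request, int *flag, MPI_Status *status)
{
    int rc = MPI_Request_get_status(*request, flag, status);
    if (rc == MPI_SUCCESS && *flag) {
        /* Freeing a completed request releases it and sets *request
           to MPI_REQUEST_NULL, matching what MPI_Test does. */
        rc = MPI_Request_free(request);
    }
    return rc;
}
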
Cheers,
Shaun