Yossi,
I think you raised an interesting corner case, and a possible bug in the MTL
implementation. As the request is marked as complete by the CM/PML, the cancel
should never succeed. Since the CM/PML forces completion of all bsend
requests, it should also enforce that completed requests cannot be
cancelled (instead of leaving this task to the MTL).
I think the cleanest approach would be to let the MTL itself handle the
completion, by moving the code you pinpointed
(MCA_PML_CM_HVY_SEND_REQUEST_START) from the CM/PML down into each MTL send path
(they can check for buffered send requests). This approach would possibly allow
an MTL to implement cancellation of sends.
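To make the idea concrete, here is a rough, self-contained sketch in plain C. It does not use the real Open MPI types or macros: every name in it (mock_request_t, mock_mtl_send, mock_mtl_cancel, mock_mtl_peer_matched, mock_pml_send_start, and the can_cancel flag) is hypothetical and only models the control flow. The assumption is that the PML no longer completes buffered sends, and each MTL decides for itself when a buffered send is complete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real Open MPI structures. */
typedef enum { SEND_STANDARD, SEND_BUFFERED } send_mode_t;

typedef struct {
    send_mode_t mode;
    bool complete;   /* what Wait()/Test() would observe */
    bool cancelled;  /* what a cancel query would report */
    bool matched;    /* the peer has already matched the message */
} mock_request_t;

/* MTL send path: the MTL, not the PML, decides when a buffered send completes. */
int mock_mtl_send(mock_request_t *req, bool can_cancel)
{
    assert(req != NULL);
    req->complete = false;
    req->cancelled = false;
    req->matched = false;
    if (SEND_BUFFERED == req->mode && !can_cancel) {
        /* An MTL without cancel support keeps the current behaviour:
         * the buffered send is completed immediately. */
        req->complete = true;
    }
    return 0;
}

/* MTL cancel: succeeds only while the request is neither complete nor matched. */
int mock_mtl_cancel(mock_request_t *req)
{
    assert(req != NULL);
    if (!req->complete && !req->matched) {
        req->cancelled = true;
        req->complete = true;
    }
    return 0;
}

/* Called when the transport learns that the peer matched the message. */
void mock_mtl_peer_matched(mock_request_t *req)
{
    assert(req != NULL);
    if (!req->complete) {
        req->matched = true;
        req->complete = true;
    }
}

/* PML start path: it only hands the request to the MTL and no longer
 * forces completion of buffered sends. */
int mock_pml_send_start(mock_request_t *req, bool mtl_can_cancel)
{
    return mock_mtl_send(req, mtl_can_cancel);
}
```

With this split, an MTL that cannot cancel behaves exactly as today (the request is complete on return and a later cancel is a no-op), while an MTL that can cancel leaves the request pending until the message is either cancelled or matched.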
George.
On Aug 4, 2014, at 09:49 , Yossi Etigin wrote:
> Hi,
>
> Seems like it’s impossible to cancel buffered sends with pml/cm.
>
> On the one hand, pml/cm completes the buffered send immediately
> (MCA_PML_CM_HVY_SEND_REQUEST_START):
> if(OMPI_SUCCESS == ret &&                                                  \
>    sendreq->req_send.req_send_mode == MCA_PML_BASE_SEND_BUFFERED) {        \
>     sendreq->req_send.req_base.req_ompi.req_status.MPI_ERROR = 0;          \
>     ompi_request_complete(&(sendreq)->req_send.req_base.req_ompi, true);   \
> }
>
> So, if the user is doing Bsend()/Cancel()/Wait()/Test_cancelled(), the Wait()
> would be a no-op.
> Therefore, when mtl_cancel() is called, it has to either cancel or guarantee
> completion *immediately*; otherwise the result of Test_cancelled() would be
> undefined.
> However, it’s not always possible to cancel immediately, because we need to
> make sure the peer has not matched it yet (for example, with mtl mxm).
>
> IMHO it’s wrong for pml_cm to complete a buffered send immediately.
> What do you think?
>
> --Yossi