Re: [OMPI devel] Flush CQ error on iWARP/Out-of-sync shutdown

Steve Wise Tue, 6 May 2008 11:45:53 -0400

Jeff Squyres wrote:

On May 5, 2008, at 6:27 PM, Steve Wise wrote:
I am seeing some unusual behavior during the shutdown phase of ompi at the end of my testcase. While running an IMB pingpong test over the rdmacm on openib, I get cq flush errors on my iWARP adapters.
This error is happening because the remote node is still polling the endpoint while the other one shuts down. This occurs because iWARP puts the qps in error state when the channel is disconnected (IB does not do this). Since the cq is still being polled when the event is received on the remote node, ompi thinks it hit an error and kills the run. Since this is expected behavior on iWARP, this is not really an error case.
The key here, I think, is that when an iWARP QP moves out of RTS, all the
RECVs and any pending SQ WRs get flushed.  Further, disconnecting the
iwarp connection forces the QP out of RTS.  This is probably different
than the way IB works.  I.e., "disconnecting" in IB is an out-of-band
exchange done by the IBCM.  For iWARP, "disconnecting" is an in-band
operation (a TCP close or abort) so the QP cannot remain in RTS during
this process.
Let me make sure I understand:

- proc A calls del_procs on proc B
- proc A calls ibv_destroy_qp() on QP to proc B


Actually proc A calls rdma_disconnect() on QP to proc B

- this causes a local (proc A) flush on all pending receives and SQ WRs
- this then causes a FLUSH event to show up *in proc B*
   --> I'm not clear on this point from Jon's/Steve's text

Yes. Once the connection is torn down the iwarp QPs will be flushed on both ends.

- OMPI [currently] treats the FLUSH in proc B as an error

Is that right?

What is the purpose of the FLUSH event?

In general, I think it is to allow the application to recover any resources that are allocated and cannot be touched until the WRs complete. For example, the buffers that were described in all the RECV WRs. If the app is going to exit, this isn't very interesting since everything will get cleaned up in the exit path. But if the process is long lived and setting up/tearing down connections, then these pending RECV buffers need to be reclaimed and put back into the buffer pool, as an example...
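
To make the buffer reclaim case concrete, here is a rough sketch in C. It is not Open MPI code. The types recv_buf, buf_pool and completion and the function reclaim_flushed() are hypothetical stand-ins so the example compiles without libibverbs; in a real verbs program the buffer pointer would typically be recovered from the wr_id field of the work completion, and the flushed flag would correspond to a status of IBV_WC_WR_FLUSH_ERR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical receive buffer on a singly linked free list. */
struct recv_buf {
    struct recv_buf *next;
    char data[64];
};

/* Hypothetical buffer pool: a LIFO free list plus a count. */
struct buf_pool {
    struct recv_buf *free_list;
    size_t free_count;
};

/* Stand-in for a work completion. Assumption: the application stored
 * the buffer pointer with the posted RECV so it can be found again. */
struct completion {
    struct recv_buf *buf;
    int flushed;            /* nonzero if the WR completed as flushed */
};

/* Push one buffer back onto the pool's free list. */
static void pool_put(struct buf_pool *pool, struct recv_buf *buf)
{
    assert(pool != NULL && buf != NULL);
    buf->next = pool->free_list;
    pool->free_list = buf;
    pool->free_count++;
}

/* Return every flushed RECV buffer to the pool.
 * Returns the number of buffers reclaimed. */
size_t reclaim_flushed(struct buf_pool *pool,
                       const struct completion *wcs, size_t n)
{
    size_t reclaimed = 0;
    for (size_t i = 0; i < n; i++) {
        if (wcs[i].flushed && wcs[i].buf != NULL) {
            pool_put(pool, wcs[i].buf);
            reclaimed++;
        }
    }
    return reclaimed;
}
```

A long-lived process would run something like reclaim_flushed() over the completions it drains after a disconnect, so the buffers can back new RECV WRs on the next connection.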

There is a larger question regarding why the remote node is still polling the hca and not shutting down, but my immediate question is if it is an acceptable fix to simply disregard this "error" if it is an iWARP adapter.
If proc B is still polling the hca, it is likely because it simply has not yet stopped doing it. I.e., a big problem in MPI implementations is that not all actions are exactly synchronous. MPI disconnects are *effectively* synchronous, but we probably didn't *guarantee* synchronicity in this case because we didn't need it (perhaps until now).


Yes.
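
For illustration only, the "disregard the flush if it is an iWARP adapter" idea discussed above might look roughly like the sketch below. This is not the openib BTL code. The enums and classify_completion() are hypothetical stand-ins so the example builds without libibverbs; a real implementation would compare against IBV_WC_SUCCESS and IBV_WC_WR_FLUSH_ERR from <infiniband/verbs.h> and take the transport type from the device.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the verbs completion status and transport type. */
enum wc_status { WC_SUCCESS, WC_WR_FLUSH_ERR, WC_OTHER_ERR };
enum transport { TRANSPORT_IB, TRANSPORT_IWARP };

/* Hypothetical per-endpoint state; not Open MPI's actual structure. */
struct endpoint {
    enum transport transport;
};

/* What the CQ polling loop should do with one completion. */
enum wc_action { WC_HANDLE, WC_IGNORE, WC_FATAL };

enum wc_action classify_completion(const struct endpoint *ep,
                                   enum wc_status status)
{
    assert(ep != NULL);

    if (status == WC_SUCCESS)
        return WC_HANDLE;

    /* On iWARP a disconnect forces the QP out of RTS and flushes the
     * pending WRs on both ends, so a flush here is expected. */
    if (status == WC_WR_FLUSH_ERR && ep->transport == TRANSPORT_IWARP)
        return WC_IGNORE;

    /* Everything else, including a flush on IB, stays an error. */
    return WC_FATAL;
}
```

Keying only on the transport type is the simplest form of the idea. A stricter variant could also require that the endpoint is known to be shutting down before ignoring the flush, which ties back to the synchronization question above.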


Steve.
