For our heart-beat we use a zero-byte rdma_write. The message goes to the peer QP only, so there is no need to post anything on the remote side and no need for pinned memory.
--CQ

> -----Original Message-----
> From: Jack Morgenstein [mailto:ja...@dev.mellanox.co.il]
> Sent: Friday, December 21, 2007 12:09 PM
> To: Tang, Changqing
> Cc: pa...@dev.mellanox.co.il;
> mvapich-disc...@cse.ohio-state.edu;
> gene...@lists.openfabrics.org; Open MPI Developers
> Subject: Re: [ofa-general] [RFC] XRC -- make receiving XRC QP
> independent of any one user process
>
> On Friday 21 December 2007 19:13, Tang, Changqing wrote:
> > This kernel QP is for receiving only, so when there is no activity on
> > this QP, can the kernel send a heart-beat message to check if the
> > remote sending QP is still there (still connected)? If not, the
> > kernel is safe to clean up this QP.
> >
> > So whenever the RC connection is broken, kernel can destroy this QP.
> >
> This increases the XRC complexity considerably:
>
> 1. Need to have a separate kernel thread which will scan ALL xrc
>    domains on this host for XRC receive QPs.
>    This thread will need to do some form of RDMA_READ/WRITE, because
>    otherwise it will interfere with the remote (sending side)
>    operation. Furthermore, the sending-side XRC QP may not have anyone
>    listening on an associated XRC SRQ qp -- it is not meant to be set
>    up to receive. We only need an operation that will yield a
>    RETRY_EXCEEDED error completion if the connection has broken.
>
> 2. This opens the door for all sorts of nasty race conditions, since
>    we will now have a bi-directional protocol. For example, what if
>    this feature is being combined with APM (valid for RC QPs), and we
>    are simply in the middle of a migration, and maybe communication is
>    temporarily interrupted.
>    We will be killing off the QP without allowing any error recovery
>    mechanism to work.
>
> 3. The application complexity goes up -- we now need the sending-side
>    QP to declare a memory region and send this region's address to the
>    receiving side so that the receiving side (the kernel thread
>    mentioned above) can periodically try to read from this region.
>
> Still, I'll give this some thought. For example, maybe we can
> rdma_read some random (illegal) address -- If the connection is alive,
> we'll get a "remote access error" completion, while if it's dead,
> we'll get retry exceeded (need to check that the bad rdma read request
> does not cause the QPs to enter an error state).
>
> - Jack
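
To make the completion-status idea in this thread concrete, here is a minimal C sketch of how a heart-beat result might be interpreted. It is only an illustration under assumptions: the enum values and the function hb_classify() are hypothetical names invented here, standing in for the status codes a real verbs completion would carry (libibverbs reports them as IBV_WC_SUCCESS, IBV_WC_REM_ACCESS_ERR and IBV_WC_RETRY_EXC_ERR). It does not post any work request, and it does not answer the open question of whether a bad rdma read pushes the QPs into an error state.

```c
/* Hypothetical stand-ins for completion status codes. A real verbs
 * program would read the status from the polled work completion. */
enum hb_status {
    HB_WC_SUCCESS,        /* heart-beat operation was acknowledged      */
    HB_WC_REM_ACCESS_ERR, /* peer answered with "remote access error"   */
    HB_WC_RETRY_EXC_ERR,  /* no answer from the peer after all retries  */
    HB_WC_OTHER           /* any other completion status                */
};

/* What the heart-beat logic concludes about the remote sending QP. */
enum hb_verdict {
    HB_PEER_ALIVE,
    HB_PEER_DEAD,
    HB_UNKNOWN
};

/* Map one heart-beat completion status to a verdict.
 * Assumption: only "retry exceeded" is treated as proof that the
 * connection is broken; anything unexpected is left undecided so the
 * caller does not destroy a QP on ambiguous evidence. */
enum hb_verdict hb_classify(enum hb_status status)
{
    switch (status) {
    case HB_WC_SUCCESS:        /* e.g. a zero-byte rdma_write completed */
    case HB_WC_REM_ACCESS_ERR: /* e.g. rdma_read of an illegal address  */
        return HB_PEER_ALIVE;  /* either way, the peer responded        */
    case HB_WC_RETRY_EXC_ERR:
        return HB_PEER_DEAD;   /* connection appears to be broken       */
    default:
        return HB_UNKNOWN;
    }
}
```

A caller would clean up the receiving QP only when hb_classify() returns HB_PEER_DEAD, and would leave it alone on HB_UNKNOWN, which matters for cases such as an APM migration in progress.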