At 06:29 PM 10/2/2008, Hal Rosenstock wrote:
>Tom,
>
>On Thu, Oct 2, 2008 at 1:39 PM, Talpey, Thomas
><[EMAIL PROTECTED]> wrote:
>> I'm debugging a reconnect problem in the NFS/RDMA client and
>> am seeing something rather odd. The context is that if a client
>> mount point goes idle for 5 minutes, the Linux RPC layer closes
>> the associated connection. When a new request needs to be
>> sent, the RPC layer then performs a reconnect.
>>
>> At this point, the NFS/RDMA client code will call rdma_create_id()
>> to create a new rdma_cm_id, then rdma_resolve_addr() and
>> finally rdma_resolve_route(). In the reconnect scenario, that
>> last step however returns -EINVAL.
>>
>> Looking at the code, I think the only reasons for this return are
>> 1) calling rdma_resolve_route() in the wrong state (which I'm not),
>> and 2) way down in the ib_post_send_mad() function, if there is
>> a timeout passed-in (which there is) and there's no receive handler
>> registered for the MAD (no clue but it worked the first time).
>
>Are you saying you're suspecting reason 2 above ? FWIW, my read
>relative to ib_post_send_mad is that CM does register a receive
>handler so I don't think -EINVAL comes from there. Are you actually
>seeing the lack of a receive handler or is it from reviewing the code
>looking from where -EINVAL could possibly come ?

Hi Hal, thanks for looking at it. As it turns out I've determined it's
actually 1) above, but for a new reason.

It turns out that the CM has a new upcall enum called
RDMA_CM_EVENT_TIMEWAIT_EXIT which is emitted shortly after any
disconnect. This upcall arrives either before or during my connection
recovery and signals a completion in my code that causes the re-binding
to skip a step.

What's the purpose of this new upcall, do you know? It's not used by
anything I see.

Tom.

>
>-- Hal
>
>> This is using the ib_mthca driver, and 2.6.27-rc7 btw. Any clues to
>> help figure out what might be wrong?
>>
>> Thanks,
>> Tom.
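
The race described in the reply can be illustrated with a small toy model. This is only a sketch and not the NFS/RDMA client code: `ToyCmId`, `reconnect`, and the state names are hypothetical, the `Event` enum is a local stand-in whose names mirror the kernel's event types, and the assumption that the upcall handler signals its completion on every event is inferred from the description above rather than taken from the source.

```python
import errno
from enum import Enum, auto


class Event(Enum):
    # Local stand-in; names mirror the kernel's RDMA_CM_EVENT_* values.
    ADDR_RESOLVED = auto()
    ROUTE_RESOLVED = auto()
    TIMEWAIT_EXIT = auto()


class ToyCmId:
    """Hypothetical stand-in for an rdma_cm_id; tracks only resolution state."""

    def __init__(self):
        self.state = "idle"

    def resolve_addr(self):
        self.state = "addr_query"  # address resolution is now in flight
        return 0

    def deliver(self, event):
        # Only the matching event advances the state; anything else is a no-op.
        if event is Event.ADDR_RESOLVED and self.state == "addr_query":
            self.state = "addr_resolved"
        elif event is Event.ROUTE_RESOLVED and self.state == "route_query":
            self.state = "route_resolved"

    def resolve_route(self):
        # Models reason 1) from the thread: wrong state yields -EINVAL.
        if self.state != "addr_resolved":
            return -errno.EINVAL
        self.state = "route_query"
        return 0


def reconnect(events, ignore_timewait_exit):
    """Replay `events` in order and return the result of resolve_route().

    The modelled handler wakes the waiter on the first event it sees,
    unless ignore_timewait_exit is set, in which case TIMEWAIT_EXIT
    is skipped and the waiter keeps waiting.
    """
    cm_id = ToyCmId()
    cm_id.resolve_addr()
    for event in events:
        cm_id.deliver(event)
        if event is Event.TIMEWAIT_EXIT and ignore_timewait_exit:
            continue  # stale event from the old connection; keep waiting
        break  # handler signalled completion; the waiter proceeds
    return cm_id.resolve_route()


# A TIMEWAIT_EXIT from the previous connection arrives before ADDR_RESOLVED.
racy_order = [Event.TIMEWAIT_EXIT, Event.ADDR_RESOLVED]
early_wakeup = reconnect(racy_order, ignore_timewait_exit=False)
filtered = reconnect(racy_order, ignore_timewait_exit=True)
```

In this model `early_wakeup` is `-errno.EINVAL`, because the waiter proceeds to route resolution while the address query is still outstanding, and `filtered` is `0`. Whether ignoring the event is the right fix for the real client depends on what the upcall is meant for, which is the open question in the reply.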