Tom,

On Thu, Oct 2, 2008 at 1:39 PM, Talpey, Thomas <[EMAIL PROTECTED]> wrote:
> I'm debugging a reconnect problem in the NFS/RDMA client and
> am seeing something rather odd. The context is that if a client
> mount point goes idle for 5 minutes, the Linux RPC layer closes
> the associated connection. When a new request needs to be
> sent, the RPC layer then performs a reconnect.
>
> At this point, the NFS/RDMA client code will call rdma_create_id()
> to create a new rdma_cm_id, then rdma_resolve_addr() and
> finally rdma_resolve_route(). In the reconnect scenario, that
> last step however returns -EINVAL.
>
> Looking at the code, I think the only reasons for this return are
> 1) calling rdma_resolve_route() in the wrong state (which I'm not),
> and 2) way down in the ib_post_send_mad() function, if there is
> a timeout passed-in (which there is) and there's no receive handler
> registered for the MAD (no clue but it worked the first time).
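
To make the two candidate paths concrete, here is a small user-space model of them. It is only a sketch under assumptions: the model_* types and functions are hypothetical stand-ins, not the kernel's rdma_cm or MAD code, and rolling the state back on failure is a choice made for the model.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the CM id states (not the kernel's). */
enum model_cm_state {
    MODEL_IDLE,
    MODEL_ADDR_QUERY,
    MODEL_ADDR_RESOLVED,
    MODEL_ROUTE_QUERY,
};

struct model_cm_id {
    enum model_cm_state state;
};

/* Hypothetical stand-in for a MAD agent; only the receive handler matters here. */
struct model_mad_agent {
    void (*recv_handler)(void);
};

/* Placeholder handler used to mark an agent as "has a receive handler". */
static void model_handler(void)
{
}

/* Reason 2: a send with a timeout expects a reply, so a receive handler is required. */
static int model_post_send_mad(const struct model_mad_agent *agent, int timeout_ms)
{
    if (timeout_ms != 0 && agent->recv_handler == NULL)
        return -EINVAL;
    return 0;
}

/* Reason 1: route resolution is only accepted once the address is resolved. */
static int model_resolve_route(struct model_cm_id *id,
                               const struct model_mad_agent *agent,
                               int timeout_ms)
{
    int ret;

    if (id->state != MODEL_ADDR_RESOLVED)
        return -EINVAL;

    id->state = MODEL_ROUTE_QUERY;
    ret = model_post_send_mad(agent, timeout_ms);
    if (ret != 0)
        id->state = MODEL_ADDR_RESOLVED; /* model choice: undo the transition */
    return ret;
}
```

Both paths return the same -EINVAL, so the return value alone cannot tell them apart; logging the id state and the handler pointer just before the call would.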

Are you saying you suspect reason 2 above? FWIW, my reading of ib_post_send_mad() is that the CM does register a receive handler, so I don't think the -EINVAL comes from there. Are you actually observing a missing receive handler, or did you arrive at that by reviewing the code for places where -EINVAL could come from?

-- Hal

> This is using the ib_mthca driver, and 2.6.27-rc7 btw. Any clues to
> help figure out what might be wrong?
>
> Thanks,
> Tom.

_______________________________________________
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general