In rpcrdma_ep_connect():

	write_lock(&ia->ri_qplock);
	old = ia->ri_id;
	ia->ri_id = id;
	write_unlock(&ia->ri_qplock);

	rdma_destroy_qp(old);
	rdma_destroy_id(old);	<============= the old cm-id is destroyed here.

If the following code, earlier in rpcrdma_ep_connect(), fails:

	id = rpcrdma_create_id(xprt, ia,
			(struct sockaddr *)&xprt->rx_data.addr);
	if (IS_ERR(id)) {
		rc = -EHOSTUNREACH;
		goto out;
	}

it leaves the old cm-id alive. This will always happen if the device is
removed abruptly.

cm_dev is referenced/de-referenced by rdma_resolve_addr()/rdma_destroy_id()
here (cma.c):

	static int cma_acquire_dev(struct rdma_id_private *id_priv,
				   struct rdma_id_private *listen_id_priv)
	{
		...
		if (!ret)
			cma_attach_to_dev(id_priv, cma_dev);
	}

	static void cma_release_dev(struct rdma_id_private *id_priv)
	{
		mutex_lock(&lock);
		list_del(&id_priv->list);
		cma_deref_dev(id_priv->cma_dev);
		...
	}

Since, by design, nfs-rdma always keeps at least the previously known good
cm-id alive until another good cm-id is created, cma_dev->refcount never
reaches 0 upon device removal, thus blocking "rmmod <vendor driver>" forever.

-Regards
Devesh

> -----Original Message-----
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
> ow...@vger.kernel.org] On Behalf Of Devesh Sharma
> Sent: Monday, July 21, 2014 11:42 AM
> To: Shirley Ma; Steve Wise; 'Chuck Lever'
> Cc: 'Hefty, Sean'; 'Roland Dreier'; linux-rdma@vger.kernel.org
> Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider
> module
>
> Shirley,
>
> Once rmmod is issued, the connection corresponding to the active mount is
> destroyed and all the associated resources are freed. As per the
> processing logic of the DEVICE-REMOVAL event, nfs-rdma wakes up all the
> waiters. This results in re-establishment attempts; since the device is
> not present any more, rdma_resolve_addr() fails with a CM resolution
> error. This loop continues forever.
>
> I am yet to find out which part of ocrdma is blocked. I am putting some
> debug messages to find it out. I will get back to the group with an
> update.
> -Regards
> Devesh
>
> > -----Original Message-----
> > From: Shirley Ma [mailto:shirley...@oracle.com]
> > Sent: Friday, July 18, 2014 9:18 PM
> > To: Steve Wise; Devesh Sharma; 'Chuck Lever'
> > Cc: 'Hefty, Sean'; 'Roland Dreier'; linux-rdma@vger.kernel.org
> > Subject: Re: [for-next 1/2] xprtrdma: take reference of rdma provider
> > module
> >
> > On 07/18/2014 06:27 AM, Steve Wise wrote:
> > >>>> We can't really deal with a CM_DEVICE_REMOVE event while there
> > >>>> are active NFS mounts.
> > >>>>
> > >>>> System shutdown ordering should guarantee (one would hope) that
> > >>>> NFS mount points are unmounted before the RDMA/IB core
> > >>>> infrastructure is torn down. Ordering shouldn't matter as long as
> > >>>> all NFS activity has ceased before the CM tries to remove the
> > >>>> device.
> > >>>>
> > >>>> So if something is hanging up the CM, there's something xprtrdma
> > >>>> is not cleaning up properly.
> > >>>
> > >>> Devesh, how are you reproducing this? Are you just rmmod'ing the
> > >>> ocrdma module while there are active mounts?
> > >>
> > >> Yes, I am issuing rmmod while there is an active mount. In my case
> > >> rmmod ocrdma remains blocked forever.
> >
> > Where is it blocked?
> >
> > >> Off the topic of this discussion: is there a reasoning behind not
> > >> using the ib_register_client()/ib_unregister_client() framework?
> > >
> > > I think the idea is that you don't need to use it if you are
> > > transport-independent and use the rdmacm...
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html