> -----Original Message-----
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Steve Wise
> Sent: Friday, July 18, 2014 1:39 AM
> To: 'Hefty, Sean'; 'Shirley Ma'; Devesh Sharma; 'Roland Dreier'
> Cc: linux-rdma@vger.kernel.org; chuck.le...@oracle.com
> Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider module
>
> > -----Original Message-----
> > From: Steve Wise [mailto:sw...@opengridcomputing.com]
> > Sent: Thursday, July 17, 2014 2:56 PM
> > To: 'Hefty, Sean'; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
> > Cc: 'linux-rdma@vger.kernel.org'; 'chuck.le...@oracle.com'
> > Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider module
> >
> > > -----Original Message-----
> > > From: Hefty, Sean [mailto:sean.he...@intel.com]
> > > Sent: Thursday, July 17, 2014 2:50 PM
> > > To: Steve Wise; 'Shirley Ma'; 'Devesh Sharma'; 'Roland Dreier'
> > > Cc: linux-rdma@vger.kernel.org; chuck.le...@oracle.com
> > > Subject: RE: [for-next 1/2] xprtrdma: take reference of rdma provider module
> > >
> > > > > So the rdma cm is expected to increase the driver reference count
> > > > > (try_module_get) for each new cm id, then decrement the count
> > > > > (module_put) when the cm id is destroyed?
> > > >
> > > > No, I think he's saying the rdma-cm posts a RDMA_CM_DEVICE_REMOVAL
> > > > event to each application with rdmacm objects allocated, and each
> > > > application is expected to destroy all the objects it has
> > > > allocated before returning from the event handler.
> > >
> > > This is almost correct. The applications do not have to destroy all
> > > the objects that it has allocated before returning from their event
> > > handler. E.g. an app can queue a work item that does the destruction.
> > > The rdmacm will block in its ib_client remove handler until all
> > > relevant rdma_cm_id's have been destroyed.
> >
> > Thanks for the clarification.
> >
>
> And looking at xprtrdma, it does handle the DEVICE_REMOVAL event in
> rpcrdma_conn_upcall(). It sets ep->rep_connected to -ENODEV, wakes
> everybody up, and calls rpcrdma_conn_func() for that endpoint, which
> schedules rep_connect_worker... and I gave up following the code path
> at this point... :)
>
> For this to all work correctly, it would need to destroy all the QPs,
> MRs, CQs, etc. for that device _before_ destroying the rdma cm ids.
> Otherwise the provider module could be unloaded too soon...
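
To make the ordering described above concrete, here is a minimal userspace
sketch. It is not xprtrdma code: struct mock_ep, mock_ep_init() and
mock_device_removal() are hypothetical stand-ins invented for illustration.
In the kernel the corresponding steps would presumably be calls such as
rdma_destroy_qp(), ib_dereg_mr(), ib_destroy_cq(), ib_dealloc_pd() and,
last of all, rdma_destroy_id(). The only point shown is that every verbs
object tied to the device is released before the cm id, because the rdmacm
stops blocking in its remove handler once the last cm id is destroyed.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for per-endpoint state. This is NOT the real
 * rpcrdma_ia/rpcrdma_ep layout; it only tracks which objects still exist.
 */
struct mock_ep {
    int has_qp, has_mr, has_cq, has_pd, has_cm_id;
    char order[64];            /* records the teardown order */
};

/* Append one step name to the recorded order, comma separated. */
static void note(struct mock_ep *ep, const char *what)
{
    if (ep->order[0] != '\0')
        strncat(ep->order, ",", sizeof(ep->order) - strlen(ep->order) - 1);
    strncat(ep->order, what, sizeof(ep->order) - strlen(ep->order) - 1);
}

/* Pretend every object has been allocated on the device. */
static void mock_ep_init(struct mock_ep *ep)
{
    assert(ep != NULL);
    memset(ep, 0, sizeof(*ep));
    ep->has_qp = ep->has_mr = ep->has_cq = ep->has_pd = ep->has_cm_id = 1;
}

/*
 * Assumed teardown order on device removal: verbs objects first, the
 * cm id last. Each step is skipped if the object is already gone, so
 * calling this twice is harmless.
 */
static void mock_device_removal(struct mock_ep *ep)
{
    assert(ep != NULL);
    if (ep->has_qp)    { ep->has_qp = 0;    note(ep, "qp"); }    /* cf. rdma_destroy_qp() */
    if (ep->has_mr)    { ep->has_mr = 0;    note(ep, "mr"); }    /* cf. ib_dereg_mr()     */
    if (ep->has_cq)    { ep->has_cq = 0;    note(ep, "cq"); }    /* cf. ib_destroy_cq()   */
    if (ep->has_pd)    { ep->has_pd = 0;    note(ep, "pd"); }    /* cf. ib_dealloc_pd()   */
    /* Only now may the cm id go away; after this the provider can unload. */
    if (ep->has_cm_id) { ep->has_cm_id = 0; note(ep, "cm_id"); } /* cf. rdma_destroy_id() */
}
```

Calling mock_ep_init() and then mock_device_removal() on a struct mock_ep
leaves its order field as "qp,mr,cq,pd,cm_id". Whether the real teardown
runs in the event handler or in a queued work item does not matter for
the ordering, as long as the cm id is destroyed last.
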
Okay. Should I try to handle device removal in this proposed fashion and post the v1?

> Steve.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the
> body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html