On Wed, 31 Aug 2005, Roland Dreier wrote:
> James> The device could still be used after it's gone. For
> James> example:
>
> James> - the user is configuring SRP via sysfs. The thread in
> James> srp_create_target() has just called ib_sa_path_rec_get()
> James> [srp.c line 1209] and is waiting for the path record query
> James> to complete in wait_for_completion() - the SA callback,
> James> srp_path_rec_completion(), is called. This callback thread
> James> will make several verb calls (ib_create_cq,
> James> ib_req_notify_cq, ib_create_qp, ...) without any
> James> coordination with the hotplug device removal callback,
> James> srp_remove_one
>
> I don't think this can happen. How could srp_remove_one get past
>
>     wait_for_completion(&host->released);
>
> if the sysfs file is still in use?

You're right. srp_remove_one will wait for the sysfs file to close.

What about SRP's interactions with the SCSI layer? When
scsi_remove_host() returns, are you guaranteed that there are no SCSI
calls into your code still in progress (e.g. in srp_queuecommand)?

> James> Notice that if the SA client's hotplug removal function,
> James> ib_sa_remove_one(), ensured that all callbacks had
> James> completed before returning the problem would be fixed. This
> James> would protect all ULPs from having to deal with hotplug
> James> races in their SA callback function. The fix belongs in the
> James> SA client (the core stack), not in SRP.
>
> All SA client callbacks are driven by the MAD layer. And
> ib_sa_remove_one() does ib_unregister_mad_agent(), which should wait
> for all callbacks to finish. So I think we already do the best we can
> here. Unfortunately the SA client code must clean up after all the
> ULPs that depend on it, because ULPs can use the SA up until they know
> the device is gone. But I don't see a way around that.
>
> James> All the ULPs are deficient with respect to their hotplug
> James> synchronization. Given that there is a common problem,
> James> doesn't it make sense to try and solve it in a generic way
> James> instead of in each ULP?
>
> Yes, but what is the generic way?

The generic way would be to handle this in a common layer. For the IB
verbs + RDMA connection API to be as easy to use as the sockets API, it
needs to make this issue transparent to the ULPs.

Take the current RPC code in net/sunrpc as an example. It uses
sock_create_kern(), kernel_sendmsg(), kernel_recvmsg(), etc. without
ever needing to worry about hotplug events. The layers between it and
the low-level drivers (Ethernet, IPoIB, etc.) take care of that.
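
To make the "common layer" idea a bit more concrete, here is a minimal user-space sketch of the kind of gate such a layer might provide: callbacks register on entry, and the removal path refuses new callbacks and then waits for the running ones to drain. This is only an illustration under stated assumptions, not code from the IB core or SRP. Every name in it (demo_device, demo_callback_enter, demo_remove_one, ...) is hypothetical, and pthread mutexes and condition variables stand in for the kernel's locking and completion primitives.

```c
#include <pthread.h>
#include <stdbool.h>

#define DEMO_MAX_THREADS 16

/* Hypothetical per-device state kept by the common layer. */
struct demo_device {
    pthread_mutex_t lock;
    pthread_cond_t  idle;      /* signalled when in_flight drops to zero */
    int             in_flight; /* callbacks currently running */
    bool            removing;  /* set once removal has started */
};

void demo_device_init(struct demo_device *dev)
{
    pthread_mutex_init(&dev->lock, NULL);
    pthread_cond_init(&dev->idle, NULL);
    dev->in_flight = 0;
    dev->removing = false;
}

/* Called before a callback touches the device.
 * Returns false once removal has begun, so the caller must back off. */
bool demo_callback_enter(struct demo_device *dev)
{
    bool ok;

    pthread_mutex_lock(&dev->lock);
    ok = !dev->removing;
    if (ok)
        dev->in_flight++;
    pthread_mutex_unlock(&dev->lock);
    return ok;
}

/* Called when a callback that entered successfully has finished. */
void demo_callback_exit(struct demo_device *dev)
{
    pthread_mutex_lock(&dev->lock);
    if (--dev->in_flight == 0)
        pthread_cond_broadcast(&dev->idle);
    pthread_mutex_unlock(&dev->lock);
}

/* Removal path: block new callbacks, then wait for running ones. */
void demo_remove_one(struct demo_device *dev)
{
    pthread_mutex_lock(&dev->lock);
    dev->removing = true;
    while (dev->in_flight > 0)
        pthread_cond_wait(&dev->idle, &dev->lock);
    pthread_mutex_unlock(&dev->lock);
}

/* Stand-in for a callback thread that races with removal. */
static void *demo_callback_thread(void *arg)
{
    struct demo_device *dev = arg;

    if (demo_callback_enter(dev)) {
        /* ... a real callback would use the device here ... */
        demo_callback_exit(dev);
    }
    return NULL;
}

/* Starts up to n callback threads, removes the device, and returns the
 * number of callbacks still in flight right after removal returned. */
int demo_run(int n)
{
    struct demo_device dev;
    pthread_t threads[DEMO_MAX_THREADS];
    int started = 0, left;

    if (n > DEMO_MAX_THREADS)
        n = DEMO_MAX_THREADS;
    demo_device_init(&dev);
    for (int i = 0; i < n; i++)
        if (pthread_create(&threads[started], NULL,
                           demo_callback_thread, &dev) == 0)
            started++;

    demo_remove_one(&dev);

    pthread_mutex_lock(&dev.lock);
    left = dev.in_flight;
    pthread_mutex_unlock(&dev.lock);

    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return left;
}
```

With a gate like this in the shared layer, a ULP callback would only need to check the return value of the enter call, and the removal function could not return while a callback is still using the device. Whether the IB core could offer this without changing its callback interfaces is a separate question that the sketch does not answer.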