On 06/30/2015 06:26 AM, Yishai Hadas wrote:
> Currently, if there is any user space application using an IB device,
> it is impossible to unload the HW device driver for this device.
> 
> Similarly, if the device is hot-unplugged or reset, the device driver
> hardware removal flow blocks until all user contexts are destroyed.
> 
> This patchset removes the above limitations from both uverbs and ucma.
> 
> The IB-core and uverbs layers are still required to remain loaded as
> long as there are user applications using the verbs API. However, the
> hardware device drivers are not blocked any more by the user space
> activity.
> 
> To support this, the hardware device needs to expose a new kernel API
> named 'disassociate_ucontext'. The device driver is given a ucontext
> to detach from, and it should block this user context from any future
> hardware access. In the IB-core level, we use this interface to
> deactivate all ucontext that address a specific device when handling a
> remove_one callback for it.
> 
> In the RDMA-CM layer, a similar change is needed in the ucma module,
> to prevent blocking of the remove_one operation. This allows for
> hot-removal of devices with RDMA-CM users in the user space.
> 
> The first two patches are preparation for this series.
> The first patch fixes a reference counting issue pointed by Jason Gunthorpe.
> The second patch is a preparation step for deploying RCU for the device
> removal flow.
> 
> The third patch introduces the new API between the HW device driver and
> the IB core. For devices which implement the functionality, IB core
> will use it in remove_one, disassociating any active ucontext from the
> hardware device. Other drivers that didn't implement it will behave as
> today, remove_one will block until all ucontexts referring the device
> are destroyed before returning.
> 
> The fourth patch provides implementation of this API for the mlx4
> driver.
> 
> The last patch extends ucma to avoid blocking remove_one operation in
> the cma module. When such device removal event is received, ucma is
> turning all user contexts to zombie contexts. This is done by
> releasing all underlying resources and preventing any further user
> operations on the context.
> 
> Changes from V5:
> Addressed Jason's comments for below patches:
> patch #1: Improve kref usage.
> patch #3: Use 2 different krefs for complete and memory, improve some 
> comments.
> 
> Changes from V4:
> patch #1,#3 - addressed Jason's comments.
> patch #2, #4 - rebased upon last stuff.
> 
> Changes from V3:
> Add 2 patches as a preparation for this series, details above.
> patch #3: Change the locking schema based on Jason's comments.
> 
> Changes from V2:
> patch #1: Rebase over ODP patches.
> 
> Changes from V1:
> patch #1: Use uverbs flags instead of disassociate support, drop 
> fatal_event_raised flag.
> patch #3: Add support in ucma for handling device removal.
> 
> Changes from V0:
> patch #1: ib_uverbs_close, reduced mutex scope to enable tasks run in 
> parallel.
> Yishai Hadas (5):
>   IB/uverbs: Fix reference counting usage of event files
>   IB/uverbs: Explicitly pass ib_dev to uverbs commands
>   IB/uverbs: Enable device removal when there are active user space
>     applications
>   IB/mlx4_ib: Disassociate support
>   IB/ucma: HW Device hot-removal support
> 
>  drivers/infiniband/core/ucma.c        |  130 +++++++++-
>  drivers/infiniband/core/uverbs.h      |   16 +-
>  drivers/infiniband/core/uverbs_cmd.c  |  114 ++++++----
>  drivers/infiniband/core/uverbs_main.c |  438 
> +++++++++++++++++++++++++++------
>  drivers/infiniband/hw/mlx4/main.c     |  139 ++++++++++-
>  drivers/infiniband/hw/mlx4/mlx4_ib.h  |   13 +
>  include/rdma/ib_verbs.h               |    1 +
>  7 files changed, 714 insertions(+), 137 deletions(-)
> 

I've pulled this series in for 4.3.  There were some additional items in
some of Jason's comments that ought to be looked into, but I think this
patch set has reached the point where it's no worse than existing in
terms of locking, there were just some existing issues that should be
addressed too.

-- 
Doug Ledford <dledf...@redhat.com>
              GPG KeyID: 0E572FDD


Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to