Re: [PATCH v10 4/6] RDMA/mlx5: Support dma-buf based userspace memory region

2020-11-15 Thread Jason Gunthorpe
On Fri, Nov 13, 2020 at 03:51:20AM +, Xiong, Jianxin wrote:

> > > +static void mlx5_ib_dmabuf_invalidate_cb(struct dma_buf_attachment *attach)
> > > +{
> > > + struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;
> > > + struct mlx5_ib_mr *mr = umem_dmabuf->private;
> > > +
> > > + dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);
> > > +
> > > + if (mr)
> > 
> > I don't think this 'if (mr)' test is needed anymore? I certainly prefer it 
> > gone as it is kind of messy. I expect unmapping the dma to ensure this
> > function is not running, and won't run again.
> 
> It is still needed. When the dma-buf moves, the callback function of every
> attached importer is invoked, regardless of whether the importer has mapped
> the dma or not.
> 
> > 
> > > +/**
> > > + * mlx5_ib_fence_dmabuf_mr - Stop all access to the dmabuf MR
> > > + * @mr: to fence
> > > + *
> > > + * On return no parallel threads will be touching this MR and no DMA will be
> > > + * active.
> > > + */
> > > +void mlx5_ib_fence_dmabuf_mr(struct mlx5_ib_mr *mr) {
> > > + struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(mr->umem);
> > > +
> > > + /* Prevent new page faults and prefetch requests from succeeding */
> > > + xa_erase(&mr->dev->odp_mkeys, mlx5_base_mkey(mr->mmkey.key));
> > > +
> > > + /* Wait for all running page-fault handlers to finish. */
> > > + synchronize_srcu(&mr->dev->odp_srcu);
> > > +
> > > + wait_event(mr->q_deferred_work, !atomic_read(&mr->num_deferred_work));
> > > +
> > > + dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL);
> > > + mlx5_mr_cache_invalidate(mr);
> > > + umem_dmabuf->private = NULL;
> > > + ib_umem_dmabuf_unmap_pages(umem_dmabuf);
> > > + dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
> > > +
> > > + if (!mr->cache_ent) {
> > > + mlx5_core_destroy_mkey(mr->dev->mdev, &mr->mmkey);
> > > + WARN_ON(mr->descs);
> > > + }
> > 
> > I didn't check carefully, but are you sure this destroy_mkey should be 
> > here??
> 
> To my understanding, yes. This is similar to what dma_fence_odp_mr() does,
> just inlined here since it's not called from other places.

I think you should put the calls to dma_buf_dynamic_attach() and
dma_buf_detach() into mlx5, it makes the whole thing a little cleaner,
then the umem->private isn't needed any more either.
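A rough, untested sketch of that idea (the attach-ops symbol, the helper name
and the error handling below are assumptions for illustration, not the actual
patch):

static int mlx5_ib_dmabuf_attach(struct mlx5_ib_mr *mr, int fd,
                                 struct device *dma_device)
{
        struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(mr->umem);
        struct dma_buf *dmabuf;
        struct dma_buf_attachment *attach;

        dmabuf = dma_buf_get(fd);
        if (IS_ERR(dmabuf))
                return PTR_ERR(dmabuf);

        /* importer_priv points straight at the mr, so the invalidate
         * callback would not need umem->private to find it. */
        attach = dma_buf_dynamic_attach(dmabuf, dma_device,
                                        &mlx5_ib_dmabuf_attach_ops, mr);
        if (IS_ERR(attach)) {
                dma_buf_put(dmabuf);
                return PTR_ERR(attach);
        }

        umem_dmabuf->attach = attach;
        return 0;
}

The matching teardown would then call dma_buf_detach() and dma_buf_put() from
the mlx5 dereg path instead of from the core umem code.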

Jason


RE: [PATCH v10 4/6] RDMA/mlx5: Support dma-buf based userspace memory region

2020-11-13 Thread Xiong, Jianxin
> -Original Message-
> From: Jason Gunthorpe 
> Sent: Friday, November 13, 2020 5:06 AM
> To: Xiong, Jianxin 
> Cc: linux-r...@vger.kernel.org; dri-devel@lists.freedesktop.org; Doug Ledford 
> ; Leon Romanovsky
> ; Sumit Semwal ; Christian Koenig 
> ; Vetter, Daniel
> 
> Subject: Re: [PATCH v10 4/6] RDMA/mlx5: Support dma-buf based userspace 
> memory region
> 
> On Fri, Nov 13, 2020 at 03:51:20AM +, Xiong, Jianxin wrote:
> 
> > > > +static void mlx5_ib_dmabuf_invalidate_cb(struct dma_buf_attachment *attach)
> > > > +{
> > > > +   struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;
> > > > +   struct mlx5_ib_mr *mr = umem_dmabuf->private;
> > > > +
> > > > +   dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);
> > > > +
> > > > +   if (mr)
> > >
> > > I don't think this 'if (mr)' test is needed anymore? I certainly
> > > prefer it gone as it is kind of messy. I expect unmapping the dma to 
> > > ensure this function is not running, and won't run again.
> >
> > It is still needed. When the dma-buf moves, the callback function of every
> > attached importer is invoked, regardless of whether the importer has mapped
> > the dma or not.
> >
> > >
> > > > +/**
> > > > + * mlx5_ib_fence_dmabuf_mr - Stop all access to the dmabuf MR
> > > > + * @mr: to fence
> > > > + *
> > > > + * On return no parallel threads will be touching this MR and no DMA will be
> > > > + * active.
> > > > + */
> > > > +void mlx5_ib_fence_dmabuf_mr(struct mlx5_ib_mr *mr) {
> > > > +   struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(mr->umem);
> > > > +
> > > > +   /* Prevent new page faults and prefetch requests from succeeding */
> > > > +   xa_erase(&mr->dev->odp_mkeys, mlx5_base_mkey(mr->mmkey.key));
> > > > +
> > > > +   /* Wait for all running page-fault handlers to finish. */
> > > > +   synchronize_srcu(&mr->dev->odp_srcu);
> > > > +
> > > > +   wait_event(mr->q_deferred_work, !atomic_read(&mr->num_deferred_work));
> > > > +
> > > > +   dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL);
> > > > +   mlx5_mr_cache_invalidate(mr);
> > > > +   umem_dmabuf->private = NULL;
> > > > +   ib_umem_dmabuf_unmap_pages(umem_dmabuf);
> > > > +   dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
> > > > +
> > > > +   if (!mr->cache_ent) {
> > > > +   mlx5_core_destroy_mkey(mr->dev->mdev, &mr->mmkey);
> > > > +   WARN_ON(mr->descs);
> > > > +   }
> > >
> > > I didn't check carefully, but are you sure this destroy_mkey should be 
> > > here??
> >
> > To my understanding, yes. This is similar to what dma_fence_odp_mr()
> > does, just inlined here since it's not called from other places.
> 
> I think you should put the calls to dma_buf_dynamic_attach() and
> dma_buf_detach() into mlx5, it makes the whole thing a little cleaner, then 
> the umem->private isn't needed any more either.

Putting these calls into mlx5 can remove the 'ops' parameter from the umem_get
call, but I don't see how umem->private can be eliminated. In addition, I feel
keeping these two calls in the core provides a better separation between the
core and the driver -- dma-buf API functions are only called from the core,
except for locking/unlocking.

The 'if (mr)' part in the callback can be removed by adding a line
'if (!umem_dmabuf->sgt) return;' before it, if that makes a difference.
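For reference, a minimal sketch of the callback with that guard (the actual
invalidation work is elided; this is not the final code):

static void mlx5_ib_dmabuf_invalidate_cb(struct dma_buf_attachment *attach)
{
        struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;

        dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);

        /* move_notify fires for every attachment; if this importer never
         * mapped the buffer there is nothing to invalidate yet. */
        if (!umem_dmabuf->sgt)
                return;

        /* ... zap the mkey via umem_dmabuf->private and unmap the pages,
         * as in the patch ... */
}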

> 
> Jason


Re: [PATCH v10 4/6] RDMA/mlx5: Support dma-buf based userspace memory region

2020-11-13 Thread Jason Gunthorpe
On Tue, Nov 10, 2020 at 01:41:15PM -0800, Jianxin Xiong wrote:

> -static int mlx5_ib_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags)
> +int mlx5_ib_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags)
>  {
>   struct mlx5_ib_dev *dev = mr->dev;
>   struct device *ddev = dev->ib_dev.dev.parent;
> @@ -1255,6 +1267,10 @@ static int mlx5_ib_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags)
>   cur_mtt->ptag =
>   cpu_to_be64(rdma_block_iter_dma_address(&biter) |
>   MLX5_IB_MTT_PRESENT);
> +
> + if (mr->umem->is_dmabuf && (flags & MLX5_IB_UPD_XLT_ZAP))
> + cur_mtt->ptag = 0;
> +
>   cur_mtt++;
>   }
>  
> @@ -1291,8 +1307,15 @@ static struct mlx5_ib_mr *reg_create(struct ib_mr *ibmr, struct ib_pd *pd,
>   int err;
>   bool pg_cap = !!(MLX5_CAP_GEN(dev->mdev, pg));
>  
> - page_size =
> - mlx5_umem_find_best_pgsz(umem, mkc, log_page_size, 0, iova);
> + if (umem->is_dmabuf) {
> + if ((iova ^ umem->address) & (PAGE_SIZE - 1))
> + return ERR_PTR(-EINVAL);
> + umem->iova = iova;
> + page_size = PAGE_SIZE;
> + } else {
> + page_size = mlx5_umem_find_best_pgsz(umem, mkc, log_page_size,
> +  0, iova);
> + }

Urk, maybe this duplicated sequence should be in a function..

This will also collide with a rereg_mr bugfixing series that should be
posted soon..
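One way to fold the two paths into a helper, as a rough sketch (the helper
name and the 0-means-invalid convention are invented here; the caller would
map a 0 result to -EINVAL):

static unsigned long mlx5_umem_mkc_pgsz(struct ib_umem *umem, u64 iova)
{
        /* dma-buf regions are mapped at PAGE_SIZE granularity and the
         * iova must share the page offset of the umem address. */
        if (umem->is_dmabuf) {
                if ((iova ^ umem->address) & (PAGE_SIZE - 1))
                        return 0;
                umem->iova = iova;
                return PAGE_SIZE;
        }

        return mlx5_umem_find_best_pgsz(umem, mkc, log_page_size, 0, iova);
}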

> +static void mlx5_ib_dmabuf_invalidate_cb(struct dma_buf_attachment *attach)
> +{
> + struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;
> + struct mlx5_ib_mr *mr = umem_dmabuf->private;
> +
> + dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);
> +
> + if (mr)

I don't think this 'if (mr)' test is needed anymore? I certainly
prefer it gone as it is kind of messy. I expect unmapping the dma to
ensure this function is not running, and won't run again.

> +/**
> + * mlx5_ib_fence_dmabuf_mr - Stop all access to the dmabuf MR
> + * @mr: to fence
> + *
> + * On return no parallel threads will be touching this MR and no DMA will be
> + * active.
> + */
> +void mlx5_ib_fence_dmabuf_mr(struct mlx5_ib_mr *mr)
> +{
> + struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(mr->umem);
> +
> + /* Prevent new page faults and prefetch requests from succeeding */
> + xa_erase(&mr->dev->odp_mkeys, mlx5_base_mkey(mr->mmkey.key));
> +
> + /* Wait for all running page-fault handlers to finish. */
> + synchronize_srcu(&mr->dev->odp_srcu);
> +
> + wait_event(mr->q_deferred_work, !atomic_read(&mr->num_deferred_work));
> +
> + dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL);
> + mlx5_mr_cache_invalidate(mr);
> + umem_dmabuf->private = NULL;
> + ib_umem_dmabuf_unmap_pages(umem_dmabuf);
> + dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
> +
> + if (!mr->cache_ent) {
> + mlx5_core_destroy_mkey(mr->dev->mdev, &mr->mmkey);
> + WARN_ON(mr->descs);
> + }

I didn't check carefully, but are you sure this destroy_mkey should be
here??

Jason


RE: [PATCH v10 4/6] RDMA/mlx5: Support dma-buf based userspace memory region

2020-11-12 Thread Xiong, Jianxin
> -Original Message-
> From: Jason Gunthorpe 
> Sent: Thursday, November 12, 2020 4:40 PM
> To: Xiong, Jianxin 
> Cc: linux-r...@vger.kernel.org; dri-devel@lists.freedesktop.org; Doug Ledford 
> ; Leon Romanovsky
> ; Sumit Semwal ; Christian Koenig 
> ; Vetter, Daniel
> 
> Subject: Re: [PATCH v10 4/6] RDMA/mlx5: Support dma-buf based userspace 
> memory region
> 
> On Tue, Nov 10, 2020 at 01:41:15PM -0800, Jianxin Xiong wrote:
> 
> > -static int mlx5_ib_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int
> > flags)
> > +int mlx5_ib_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags)
> >  {
> > struct mlx5_ib_dev *dev = mr->dev;
> > struct device *ddev = dev->ib_dev.dev.parent;
> > @@ -1255,6 +1267,10 @@ static int mlx5_ib_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags)
> > cur_mtt->ptag =
> > cpu_to_be64(rdma_block_iter_dma_address(&biter) |
> > MLX5_IB_MTT_PRESENT);
> > +
> > +   if (mr->umem->is_dmabuf && (flags & MLX5_IB_UPD_XLT_ZAP))
> > +   cur_mtt->ptag = 0;
> > +
> > cur_mtt++;
> > }
> >
> > @@ -1291,8 +1307,15 @@ static struct mlx5_ib_mr *reg_create(struct ib_mr *ibmr, struct ib_pd *pd,
> > int err;
> > bool pg_cap = !!(MLX5_CAP_GEN(dev->mdev, pg));
> >
> > -   page_size =
> > -   mlx5_umem_find_best_pgsz(umem, mkc, log_page_size, 0, iova);
> > +   if (umem->is_dmabuf) {
> > +   if ((iova ^ umem->address) & (PAGE_SIZE - 1))
> > +   return ERR_PTR(-EINVAL);
> > +   umem->iova = iova;
> > +   page_size = PAGE_SIZE;
> > +   } else {
> > +   page_size = mlx5_umem_find_best_pgsz(umem, mkc, log_page_size,
> > +0, iova);
> > +   }
> 
> Urk, maybe this duplicated sequence should be in a function..
> 
> This will also collide with a rereg_mr bugfixing series that should be posted 
> soon..
> 
> > +static void mlx5_ib_dmabuf_invalidate_cb(struct dma_buf_attachment *attach)
> > +{
> > +   struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;
> > +   struct mlx5_ib_mr *mr = umem_dmabuf->private;
> > +
> > +   dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);
> > +
> > +   if (mr)
> 
> I don't think this 'if (mr)' test is needed anymore? I certainly prefer it 
> gone as it is kind of messy. I expect unmapping the dma to ensure this
> function is not running, and won't run again.

It is still needed. When the dma-buf moves, the callback function of every
attached importer is invoked, regardless of whether the importer has mapped
the dma or not.
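That is roughly what the exporter-side helper does -- dma_buf_move_notify()
walks every attachment, mapped or not (simplified sketch of the code in
drivers/dma-buf/dma-buf.c, not a verbatim copy):

void dma_buf_move_notify(struct dma_buf *dmabuf)
{
        struct dma_buf_attachment *attach;

        dma_resv_assert_held(dmabuf->resv);

        list_for_each_entry(attach, &dmabuf->attachments, node)
                if (attach->importer_ops)
                        attach->importer_ops->move_notify(attach);
}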

> 
> > +/**
> > + * mlx5_ib_fence_dmabuf_mr - Stop all access to the dmabuf MR
> > + * @mr: to fence
> > + *
> > + * On return no parallel threads will be touching this MR and no DMA will be
> > + * active.
> > + */
> > +void mlx5_ib_fence_dmabuf_mr(struct mlx5_ib_mr *mr) {
> > +   struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(mr->umem);
> > +
> > +   /* Prevent new page faults and prefetch requests from succeeding */
> > +   xa_erase(&mr->dev->odp_mkeys, mlx5_base_mkey(mr->mmkey.key));
> > +
> > +   /* Wait for all running page-fault handlers to finish. */
> > +   synchronize_srcu(&mr->dev->odp_srcu);
> > +
> > +   wait_event(mr->q_deferred_work, !atomic_read(&mr->num_deferred_work));
> > +
> > +   dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL);
> > +   mlx5_mr_cache_invalidate(mr);
> > +   umem_dmabuf->private = NULL;
> > +   ib_umem_dmabuf_unmap_pages(umem_dmabuf);
> > +   dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
> > +
> > +   if (!mr->cache_ent) {
> > +   mlx5_core_destroy_mkey(mr->dev->mdev, &mr->mmkey);
> > +   WARN_ON(mr->descs);
> > +   }
> 
> I didn't check carefully, but are you sure this destroy_mkey should be here??

To my understanding, yes. This is similar to what dma_fence_odp_mr() does,
just inlined here since it's not called from other places.
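For comparison, the rough shape of dma_fence_odp_mr() (paraphrased from the
ODP code, not verbatim) -- the same invalidate/unmap/conditionally-destroy
sequence, just under the per-umem mutex instead of the dma-buf reservation
lock:

static void dma_fence_odp_mr(struct mlx5_ib_mr *mr)
{
        struct ib_umem_odp *odp = to_ib_umem_odp(mr->umem);

        /* Ensure the invalidation path will not touch the MR any more */
        mutex_lock(&odp->umem_mutex);
        mlx5_mr_cache_invalidate(mr);
        ib_umem_odp_unmap_dma_pages(odp, ib_umem_start(odp), ib_umem_end(odp));
        mutex_unlock(&odp->umem_mutex);

        if (!mr->cache_ent) {
                mlx5_core_destroy_mkey(mr->dev->mdev, &mr->mmkey);
                WARN_ON(mr->descs);
        }
}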
 
> 
> Jason


[PATCH v10 4/6] RDMA/mlx5: Support dma-buf based userspace memory region

2020-11-10 Thread Jianxin Xiong
Implement the new driver method 'reg_user_mr_dmabuf'. Utilize the core
functions to import dma-buf based memory regions and update the mappings.

Add code to handle dma-buf related page fault.

Signed-off-by: Jianxin Xiong 
Reviewed-by: Sean Hefty 
Acked-by: Michael J. Ruhl 
Acked-by: Christian Koenig 
Acked-by: Daniel Vetter 
---
 drivers/infiniband/hw/mlx5/main.c    |   2 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h |  18 +
 drivers/infiniband/hw/mlx5/mr.c      | 124 +--
 drivers/infiniband/hw/mlx5/odp.c     |  86 ++--
 4 files changed, 221 insertions(+), 9 deletions(-)
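From the application side, registration goes through the companion rdma-core
verb proposed alongside this work, ibv_reg_dmabuf_mr(). A hedged usage sketch
(the offset/iova choices here are only an example):

#include <infiniband/verbs.h>

/* dmabuf_fd comes from whatever exported the buffer (e.g. a GPU driver). */
static struct ibv_mr *register_dmabuf(struct ibv_pd *pd, int dmabuf_fd,
                                      size_t len)
{
        /* offset 0 into the dma-buf, iova 0 (it must keep the same page
         * offset as the start of the region), RDMA read/write enabled. */
        return ibv_reg_dmabuf_mr(pd, 0, len, 0, dmabuf_fd,
                                 IBV_ACCESS_LOCAL_WRITE |
                                 IBV_ACCESS_REMOTE_READ |
                                 IBV_ACCESS_REMOTE_WRITE);
}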

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 36b15a0..e647ea4 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
 /*
  * Copyright (c) 2013-2020, Mellanox Technologies inc. All rights reserved.
+ * Copyright (c) 2020, Intel Corporation. All rights reserved.
  */
 
 #include 
@@ -4055,6 +4056,7 @@ static int mlx5_ib_enable_driver(struct ib_device *dev)
.query_srq = mlx5_ib_query_srq,
.query_ucontext = mlx5_ib_query_ucontext,
.reg_user_mr = mlx5_ib_reg_user_mr,
+   .reg_user_mr_dmabuf = mlx5_ib_reg_user_mr_dmabuf,
.req_notify_cq = mlx5_ib_arm_cq,
.rereg_user_mr = mlx5_ib_rereg_user_mr,
.resize_cq = mlx5_ib_resize_cq,
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index bb44080..3ef6872 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
 /*
  * Copyright (c) 2013-2020, Mellanox Technologies inc. All rights reserved.
+ * Copyright (c) 2020, Intel Corporation. All rights reserved.
  */
 
 #ifndef MLX5_IB_H
@@ -665,6 +666,12 @@ static inline bool is_odp_mr(struct mlx5_ib_mr *mr)
   mr->umem->is_odp;
 }
 
+static inline bool is_dmabuf_mr(struct mlx5_ib_mr *mr)
+{
+   return IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) && mr->umem &&
+  mr->umem->is_dmabuf;
+}
+
 struct mlx5_ib_mw {
struct ib_mw ibmw;
struct mlx5_core_mkey mmkey;
@@ -1200,6 +1207,10 @@ int mlx5_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
  u64 virt_addr, int access_flags,
  struct ib_udata *udata);
+struct ib_mr *mlx5_ib_reg_user_mr_dmabuf(struct ib_pd *pd, u64 start,
+u64 length, u64 virt_addr,
+int fd, int access_flags,
+struct ib_udata *udata);
 int mlx5_ib_advise_mr(struct ib_pd *pd,
  enum ib_uverbs_advise_mr_advice advice,
  u32 flags,
@@ -1210,11 +1221,13 @@ int mlx5_ib_advise_mr(struct ib_pd *pd,
 int mlx5_ib_dealloc_mw(struct ib_mw *mw);
 int mlx5_ib_update_xlt(struct mlx5_ib_mr *mr, u64 idx, int npages,
   int page_shift, int flags);
+int mlx5_ib_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags);
 struct mlx5_ib_mr *mlx5_ib_alloc_implicit_mr(struct mlx5_ib_pd *pd,
 struct ib_udata *udata,
 int access_flags);
 void mlx5_ib_free_implicit_mr(struct mlx5_ib_mr *mr);
 void mlx5_ib_fence_odp_mr(struct mlx5_ib_mr *mr);
+void mlx5_ib_fence_dmabuf_mr(struct mlx5_ib_mr *mr);
 int mlx5_ib_rereg_user_mr(struct ib_mr *ib_mr, int flags, u64 start,
  u64 length, u64 virt_addr, int access_flags,
  struct ib_pd *pd, struct ib_udata *udata);
@@ -1306,6 +1319,7 @@ int mlx5_ib_advise_mr_prefetch(struct ib_pd *pd,
   enum ib_uverbs_advise_mr_advice advice,
   u32 flags, struct ib_sge *sg_list, u32 num_sge);
 int mlx5_ib_init_odp_mr(struct mlx5_ib_mr *mr, bool enable);
+int mlx5_ib_init_dmabuf_mr(struct mlx5_ib_mr *mr);
 #else /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */
 static inline void mlx5_ib_internal_fill_odp_caps(struct mlx5_ib_dev *dev)
 {
@@ -1331,6 +1345,10 @@ static inline int mlx5_ib_init_odp_mr(struct mlx5_ib_mr *mr, bool enable)
 {
return -EOPNOTSUPP;
 }
+static inline int mlx5_ib_init_dmabuf_mr(struct mlx5_ib_mr *mr)
+{
+   return -EOPNOTSUPP;
+}
 #endif /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */
 
 extern const struct mmu_interval_notifier_ops mlx5_mn_ops;
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 9f653b4..fdd2db2a 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1,5 +1,6 @@
 /*
  * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2020, Intel Corporation.