Re: [PATCH v10 4/6] RDMA/mlx5: Support dma-buf based userspace memory region
On Fri, Nov 13, 2020 at 03:51:20AM +, Xiong, Jianxin wrote: > > > +static void mlx5_ib_dmabuf_invalidate_cb(struct dma_buf_attachment > > > +*attach) { > > > + struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv; > > > + struct mlx5_ib_mr *mr = umem_dmabuf->private; > > > + > > > + dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv); > > > + > > > + if (mr) > > > > I don't think this 'if (mr)' test is needed anymore? I certainly prefer it > > gone as it is kind of messy. I expect unmapping the dma to ensure this > > function is not running, and won't run again. > > It is still needed. When the dma-buf moves, the callback function of every > attached importer is invoked, regardless if the importer has mapped the dma > or not. > > > > > > +/** > > > + * mlx5_ib_fence_dmabuf_mr - Stop all access to the dmabuf MR > > > + * @mr: to fence > > > + * > > > + * On return no parallel threads will be touching this MR and no DMA > > > +will be > > > + * active. > > > + */ > > > +void mlx5_ib_fence_dmabuf_mr(struct mlx5_ib_mr *mr) { > > > + struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(mr->umem); > > > + > > > + /* Prevent new page faults and prefetch requests from succeeding */ > > > + xa_erase(&mr->dev->odp_mkeys, mlx5_base_mkey(mr->mmkey.key)); > > > + > > > + /* Wait for all running page-fault handlers to finish. */ > > > + synchronize_srcu(&mr->dev->odp_srcu); > > > + > > > + wait_event(mr->q_deferred_work, > > > +!atomic_read(&mr->num_deferred_work)); > > > + > > > + dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL); > > > + mlx5_mr_cache_invalidate(mr); > > > + umem_dmabuf->private = NULL; > > > + ib_umem_dmabuf_unmap_pages(umem_dmabuf); > > > + dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv); > > > + > > > + if (!mr->cache_ent) { > > > + mlx5_core_destroy_mkey(mr->dev->mdev, &mr->mmkey); > > > + WARN_ON(mr->descs); > > > + } > > > > I didn't check carefully, but are you sure this destroy_mkey should be > > here?? > > To my understanding, yes. This is similar to what dma_fence_odp_mr() does, > just inlined here since it's not called from other places. I think you should put the calls to dma_buf_dynamic_attach() and dma_buf_detach() into mlx5, it makes the whole thing a little cleaner, then the umem->private isn't needed any more either. Jason ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: [PATCH v10 4/6] RDMA/mlx5: Support dma-buf based userspace memory region
> -Original Message- > From: Jason Gunthorpe > Sent: Friday, November 13, 2020 5:06 AM > To: Xiong, Jianxin > Cc: linux-r...@vger.kernel.org; dri-devel@lists.freedesktop.org; Doug Ledford > ; Leon Romanovsky > ; Sumit Semwal ; Christian Koenig > ; Vetter, Daniel > > Subject: Re: [PATCH v10 4/6] RDMA/mlx5: Support dma-buf based userspace > memory region > > On Fri, Nov 13, 2020 at 03:51:20AM +, Xiong, Jianxin wrote: > > > > > +static void mlx5_ib_dmabuf_invalidate_cb(struct > > > > +dma_buf_attachment > > > > +*attach) { > > > > + struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv; > > > > + struct mlx5_ib_mr *mr = umem_dmabuf->private; > > > > + > > > > + dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv); > > > > + > > > > + if (mr) > > > > > > I don't think this 'if (mr)' test is needed anymore? I certainly > > > prefer it gone as it is kind of messy. I expect unmapping the dma to > > > ensure this function is not running, and won't run again. > > > > It is still needed. When the dma-buf moves, the callback function of every > > attached importer is invoked, regardless if the importer has > mapped the dma or not. > > > > > > > > > +/** > > > > + * mlx5_ib_fence_dmabuf_mr - Stop all access to the dmabuf MR > > > > + * @mr: to fence > > > > + * > > > > + * On return no parallel threads will be touching this MR and no > > > > +DMA will be > > > > + * active. > > > > + */ > > > > +void mlx5_ib_fence_dmabuf_mr(struct mlx5_ib_mr *mr) { > > > > + struct ib_umem_dmabuf *umem_dmabuf = > > > > +to_ib_umem_dmabuf(mr->umem); > > > > + > > > > + /* Prevent new page faults and prefetch requests from > > > > succeeding */ > > > > + xa_erase(&mr->dev->odp_mkeys, mlx5_base_mkey(mr->mmkey.key)); > > > > + > > > > + /* Wait for all running page-fault handlers to finish. */ > > > > + synchronize_srcu(&mr->dev->odp_srcu); > > > > + > > > > + wait_event(mr->q_deferred_work, > > > > +!atomic_read(&mr->num_deferred_work)); > > > > + > > > > + dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL); > > > > + mlx5_mr_cache_invalidate(mr); > > > > + umem_dmabuf->private = NULL; > > > > + ib_umem_dmabuf_unmap_pages(umem_dmabuf); > > > > + dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv); > > > > + > > > > + if (!mr->cache_ent) { > > > > + mlx5_core_destroy_mkey(mr->dev->mdev, &mr->mmkey); > > > > + WARN_ON(mr->descs); > > > > + } > > > > > > I didn't check carefully, but are you sure this destroy_mkey should be > > > here?? > > > > To my understanding, yes. This is similar to what dma_fence_odp_mr() > > does, just inlined here since it's not called from other places. > > I think you should put the calls to dma_buf_dynamic_attach() and > dma_buf_detach() into mlx5, it makes the whole thing a little cleaner, then > the umem->private isn't needed any more either. Putting these calls into mlx5 can remove the 'ops' parameter from the umem_get call, but I don't see how umem->private can be eliminated. In addition, I feel keeping these two calls in the core provides a better separation between the core and the driver -- dma-buf API functions are only called from the core, except for locking/unlocking. The 'if (mr)' part in the callback can be removed by adding a line 'if (!umem_dmabutf->sgt) return;' before that if that makes a difference. > > Jason ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v10 4/6] RDMA/mlx5: Support dma-buf based userspace memory region
On Tue, Nov 10, 2020 at 01:41:15PM -0800, Jianxin Xiong wrote: > -static int mlx5_ib_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags) > +int mlx5_ib_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags) > { > struct mlx5_ib_dev *dev = mr->dev; > struct device *ddev = dev->ib_dev.dev.parent; > @@ -1255,6 +1267,10 @@ static int mlx5_ib_update_mr_pas(struct mlx5_ib_mr > *mr, unsigned int flags) > cur_mtt->ptag = > cpu_to_be64(rdma_block_iter_dma_address(&biter) | > MLX5_IB_MTT_PRESENT); > + > + if (mr->umem->is_dmabuf && (flags & MLX5_IB_UPD_XLT_ZAP)) > + cur_mtt->ptag = 0; > + > cur_mtt++; > } > > @@ -1291,8 +1307,15 @@ static struct mlx5_ib_mr *reg_create(struct ib_mr > *ibmr, struct ib_pd *pd, > int err; > bool pg_cap = !!(MLX5_CAP_GEN(dev->mdev, pg)); > > - page_size = > - mlx5_umem_find_best_pgsz(umem, mkc, log_page_size, 0, iova); > + if (umem->is_dmabuf) { > + if ((iova ^ umem->address) & (PAGE_SIZE - 1)) > + return ERR_PTR(-EINVAL); > + umem->iova = iova; > + page_size = PAGE_SIZE; > + } else { > + page_size = mlx5_umem_find_best_pgsz(umem, mkc, log_page_size, > + 0, iova); > + } Urk, maybe this duplicated sequence should be in a function.. This will also collide with a rereg_mr bugfixing series that should be posted soon.. > +static void mlx5_ib_dmabuf_invalidate_cb(struct dma_buf_attachment *attach) > +{ > + struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv; > + struct mlx5_ib_mr *mr = umem_dmabuf->private; > + > + dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv); > + > + if (mr) I don't think this 'if (mr)' test is needed anymore? I certainly prefer it gone as it is kind of messy. I expect unmapping the dma to ensure this function is not running, and won't run again. > +/** > + * mlx5_ib_fence_dmabuf_mr - Stop all access to the dmabuf MR > + * @mr: to fence > + * > + * On return no parallel threads will be touching this MR and no DMA will be > + * active. > + */ > +void mlx5_ib_fence_dmabuf_mr(struct mlx5_ib_mr *mr) > +{ > + struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(mr->umem); > + > + /* Prevent new page faults and prefetch requests from succeeding */ > + xa_erase(&mr->dev->odp_mkeys, mlx5_base_mkey(mr->mmkey.key)); > + > + /* Wait for all running page-fault handlers to finish. */ > + synchronize_srcu(&mr->dev->odp_srcu); > + > + wait_event(mr->q_deferred_work, !atomic_read(&mr->num_deferred_work)); > + > + dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL); > + mlx5_mr_cache_invalidate(mr); > + umem_dmabuf->private = NULL; > + ib_umem_dmabuf_unmap_pages(umem_dmabuf); > + dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv); > + > + if (!mr->cache_ent) { > + mlx5_core_destroy_mkey(mr->dev->mdev, &mr->mmkey); > + WARN_ON(mr->descs); > + } I didn't check carefully, but are you sure this destroy_mkey should be here?? Jason ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
RE: [PATCH v10 4/6] RDMA/mlx5: Support dma-buf based userspace memory region
> -Original Message- > From: Jason Gunthorpe > Sent: Thursday, November 12, 2020 4:40 PM > To: Xiong, Jianxin > Cc: linux-r...@vger.kernel.org; dri-devel@lists.freedesktop.org; Doug Ledford > ; Leon Romanovsky > ; Sumit Semwal ; Christian Koenig > ; Vetter, Daniel > > Subject: Re: [PATCH v10 4/6] RDMA/mlx5: Support dma-buf based userspace > memory region > > On Tue, Nov 10, 2020 at 01:41:15PM -0800, Jianxin Xiong wrote: > > > -static int mlx5_ib_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int > > flags) > > +int mlx5_ib_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags) > > { > > struct mlx5_ib_dev *dev = mr->dev; > > struct device *ddev = dev->ib_dev.dev.parent; @@ -1255,6 +1267,10 @@ > > static int mlx5_ib_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags) > > cur_mtt->ptag = > > cpu_to_be64(rdma_block_iter_dma_address(&biter) | > > MLX5_IB_MTT_PRESENT); > > + > > + if (mr->umem->is_dmabuf && (flags & MLX5_IB_UPD_XLT_ZAP)) > > + cur_mtt->ptag = 0; > > + > > cur_mtt++; > > } > > > > @@ -1291,8 +1307,15 @@ static struct mlx5_ib_mr *reg_create(struct ib_mr > > *ibmr, struct ib_pd *pd, > > int err; > > bool pg_cap = !!(MLX5_CAP_GEN(dev->mdev, pg)); > > > > - page_size = > > - mlx5_umem_find_best_pgsz(umem, mkc, log_page_size, 0, iova); > > + if (umem->is_dmabuf) { > > + if ((iova ^ umem->address) & (PAGE_SIZE - 1)) > > + return ERR_PTR(-EINVAL); > > + umem->iova = iova; > > + page_size = PAGE_SIZE; > > + } else { > > + page_size = mlx5_umem_find_best_pgsz(umem, mkc, log_page_size, > > +0, iova); > > + } > > Urk, maybe this duplicated sequence should be in a function.. > > This will also collide with a rereg_mr bugfixing series that should be posted > soon.. > > > +static void mlx5_ib_dmabuf_invalidate_cb(struct dma_buf_attachment > > +*attach) { > > + struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv; > > + struct mlx5_ib_mr *mr = umem_dmabuf->private; > > + > > + dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv); > > + > > + if (mr) > > I don't think this 'if (mr)' test is needed anymore? I certainly prefer it > gone as it is kind of messy. I expect unmapping the dma to ensure this > function is not running, and won't run again. It is still needed. When the dma-buf moves, the callback function of every attached importer is invoked, regardless if the importer has mapped the dma or not. > > > +/** > > + * mlx5_ib_fence_dmabuf_mr - Stop all access to the dmabuf MR > > + * @mr: to fence > > + * > > + * On return no parallel threads will be touching this MR and no DMA > > +will be > > + * active. > > + */ > > +void mlx5_ib_fence_dmabuf_mr(struct mlx5_ib_mr *mr) { > > + struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(mr->umem); > > + > > + /* Prevent new page faults and prefetch requests from succeeding */ > > + xa_erase(&mr->dev->odp_mkeys, mlx5_base_mkey(mr->mmkey.key)); > > + > > + /* Wait for all running page-fault handlers to finish. */ > > + synchronize_srcu(&mr->dev->odp_srcu); > > + > > + wait_event(mr->q_deferred_work, > > +!atomic_read(&mr->num_deferred_work)); > > + > > + dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL); > > + mlx5_mr_cache_invalidate(mr); > > + umem_dmabuf->private = NULL; > > + ib_umem_dmabuf_unmap_pages(umem_dmabuf); > > + dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv); > > + > > + if (!mr->cache_ent) { > > + mlx5_core_destroy_mkey(mr->dev->mdev, &mr->mmkey); > > + WARN_ON(mr->descs); > > + } > > I didn't check carefully, but are you sure this destroy_mkey should be here?? To my understanding, yes. This is similar to what dma_fence_odp_mr() does, just inlined here since it's not called from other places. > > Jason ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH v10 4/6] RDMA/mlx5: Support dma-buf based userspace memory region
Implement the new driver method 'reg_user_mr_dmabuf'. Utilize the core functions to import dma-buf based memory region and update the mappings. Add code to handle dma-buf related page fault. Signed-off-by: Jianxin Xiong Reviewed-by: Sean Hefty Acked-by: Michael J. Ruhl Acked-by: Christian Koenig Acked-by: Daniel Vetter --- drivers/infiniband/hw/mlx5/main.c| 2 + drivers/infiniband/hw/mlx5/mlx5_ib.h | 18 + drivers/infiniband/hw/mlx5/mr.c | 124 +-- drivers/infiniband/hw/mlx5/odp.c | 86 ++-- 4 files changed, 221 insertions(+), 9 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 36b15a0..e647ea4 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB /* * Copyright (c) 2013-2020, Mellanox Technologies inc. All rights reserved. + * Copyright (c) 2020, Intel Corporation. All rights reserved. */ #include @@ -4055,6 +4056,7 @@ static int mlx5_ib_enable_driver(struct ib_device *dev) .query_srq = mlx5_ib_query_srq, .query_ucontext = mlx5_ib_query_ucontext, .reg_user_mr = mlx5_ib_reg_user_mr, + .reg_user_mr_dmabuf = mlx5_ib_reg_user_mr_dmabuf, .req_notify_cq = mlx5_ib_arm_cq, .rereg_user_mr = mlx5_ib_rereg_user_mr, .resize_cq = mlx5_ib_resize_cq, diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index bb44080..3ef6872 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -1,6 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ /* * Copyright (c) 2013-2020, Mellanox Technologies inc. All rights reserved. + * Copyright (c) 2020, Intel Corporation. All rights reserved. */ #ifndef MLX5_IB_H @@ -665,6 +666,12 @@ static inline bool is_odp_mr(struct mlx5_ib_mr *mr) mr->umem->is_odp; } +static inline bool is_dmabuf_mr(struct mlx5_ib_mr *mr) +{ + return IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) && mr->umem && + mr->umem->is_dmabuf; +} + struct mlx5_ib_mw { struct ib_mwibmw; struct mlx5_core_mkey mmkey; @@ -1200,6 +1207,10 @@ int mlx5_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, u64 virt_addr, int access_flags, struct ib_udata *udata); +struct ib_mr *mlx5_ib_reg_user_mr_dmabuf(struct ib_pd *pd, u64 start, +u64 length, u64 virt_addr, +int fd, int access_flags, +struct ib_udata *udata); int mlx5_ib_advise_mr(struct ib_pd *pd, enum ib_uverbs_advise_mr_advice advice, u32 flags, @@ -1210,11 +1221,13 @@ int mlx5_ib_advise_mr(struct ib_pd *pd, int mlx5_ib_dealloc_mw(struct ib_mw *mw); int mlx5_ib_update_xlt(struct mlx5_ib_mr *mr, u64 idx, int npages, int page_shift, int flags); +int mlx5_ib_update_mr_pas(struct mlx5_ib_mr *mr, unsigned int flags); struct mlx5_ib_mr *mlx5_ib_alloc_implicit_mr(struct mlx5_ib_pd *pd, struct ib_udata *udata, int access_flags); void mlx5_ib_free_implicit_mr(struct mlx5_ib_mr *mr); void mlx5_ib_fence_odp_mr(struct mlx5_ib_mr *mr); +void mlx5_ib_fence_dmabuf_mr(struct mlx5_ib_mr *mr); int mlx5_ib_rereg_user_mr(struct ib_mr *ib_mr, int flags, u64 start, u64 length, u64 virt_addr, int access_flags, struct ib_pd *pd, struct ib_udata *udata); @@ -1306,6 +1319,7 @@ int mlx5_ib_advise_mr_prefetch(struct ib_pd *pd, enum ib_uverbs_advise_mr_advice advice, u32 flags, struct ib_sge *sg_list, u32 num_sge); int mlx5_ib_init_odp_mr(struct mlx5_ib_mr *mr, bool enable); +int mlx5_ib_init_dmabuf_mr(struct mlx5_ib_mr *mr); #else /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */ static inline void mlx5_ib_internal_fill_odp_caps(struct mlx5_ib_dev *dev) { @@ -1331,6 +1345,10 @@ static inline int mlx5_ib_init_odp_mr(struct mlx5_ib_mr *mr, bool enable) { return -EOPNOTSUPP; } +static inline int mlx5_ib_init_dmabuf_mr(struct mlx5_ib_mr *mr) +{ + return -EOPNOTSUPP; +} #endif /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */ extern const struct mmu_interval_notifier_ops mlx5_mn_ops; diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c index 9f653b4..fdd2db2a 100644 --- a/drivers/infiniband/hw/mlx5/mr.c +++ b/drivers/infiniband/hw/mlx5/mr.c @@ -1,5 +1,6 @@ /* * Copyright (c) 2013-2015, Mellanox Technologies. All rights reserved. + * Copyright (c) 2020, Intel Corporation.