Re: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-02-05 Thread John Hubbard

On 2/5/21 7:53 AM, Daniel Vetter wrote:

On Fri, Feb 05, 2021 at 11:43:19AM -0400, Jason Gunthorpe wrote:

On Fri, Feb 05, 2021 at 04:39:47PM +0100, Daniel Vetter wrote:


And again, for slightly older hardware, without pinning to VRAM there is
no way to use this solution here for peer-to-peer. So I'm glad to see that
so far you're not ruling out the pinning option.


Since HMM and ZONE_DEVICE came up, I'm kinda tempted to make ZONE_DEVICE
ZONE_MOVABLE (at least if you don't have a pinned vram contingent in your
cgroups) or something like that, so we could benefit from the work to make
sure pin_user_pages and all these never end up in there?


ZONE_DEVICE should already not be returned from GUP.

I've understood in the hmm case the idea was a CPU touch of some
ZONE_DEVICE pages would trigger a migration to CPU memory, GUP would
want to follow the same logic, presumably it comes for free with the
fault handler somehow


Oh I didn't know this, I thought the proposed p2p direct i/o patches would
just use the fact that underneath ZONE_DEVICE there's "normal" struct
pages. And so I got worried that maybe also pin_user_pages can creep in.
But I didn't read the patches in full detail:

https://lore.kernel.org/linux-block/20201106170036.18713-12-log...@deltatee.com/

But if you're saying that this all needs specific code and all the gup/pup
code we have is excluded, I think we can make sure that we're not ever
building features that require time-unlimited pinning of ZONE_DEVICE.
Which I think we want.



From an HMM perspective, the above sounds about right. HMM relies on the
GPU/device memory being ZONE_DEVICE, *and* on that memory *not* being pinned.
(HMM's mmu notifier callbacks act as a sort of virtual pin, but not a refcount
pin.)

It's a nice clean design point that we need to preserve, and fortunately it
doesn't conflict with anything I'm seeing here. But I want to say this out
loud because I see some doubt about it creeping into the discussion.
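
For readers less familiar with that model, here is a minimal sketch of the
mmu-interval-notifier / hmm_range_fault() retry loop being described (it
follows the example in the kernel's HMM documentation). The driver_* helpers
are placeholders for the device page-table update and its lock; none of this
is code from the series under discussion.

#include <linux/hmm.h>
#include <linux/mm.h>
#include <linux/mmu_notifier.h>

static int fault_and_map(struct mmu_interval_notifier *notifier,
			 struct mm_struct *mm, unsigned long start,
			 unsigned long npages, unsigned long *pfns)
{
	struct hmm_range range = {
		.notifier      = notifier,
		.start         = start,
		.end           = start + npages * PAGE_SIZE,
		.hmm_pfns      = pfns,
		.default_flags = HMM_PFN_REQ_FAULT,
	};
	int ret;

again:
	range.notifier_seq = mmu_interval_read_begin(notifier);
	mmap_read_lock(mm);
	ret = hmm_range_fault(&range);	/* no page references are taken */
	mmap_read_unlock(mm);
	if (ret) {
		if (ret == -EBUSY)
			goto again;
		return ret;
	}

	driver_lock_pagetable();		/* placeholder */
	if (mmu_interval_read_retry(notifier, range.notifier_seq)) {
		driver_unlock_pagetable();	/* raced with an invalidation */
		goto again;
	}
	driver_update_device_ptes(pfns, npages);	/* placeholder */
	driver_unlock_pagetable();
	return 0;
}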

thanks,
--
John Hubbard
NVIDIA


Re: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-02-05 Thread Daniel Vetter
On Fri, Feb 05, 2021 at 12:00:03PM -0400, Jason Gunthorpe wrote:
> On Fri, Feb 05, 2021 at 04:53:04PM +0100, Daniel Vetter wrote:
> > On Fri, Feb 05, 2021 at 11:43:19AM -0400, Jason Gunthorpe wrote:
> > > On Fri, Feb 05, 2021 at 04:39:47PM +0100, Daniel Vetter wrote:
> > > 
> > > > > And again, for slightly older hardware, without pinning to VRAM there is
> > > > > no way to use this solution here for peer-to-peer. So I'm glad to see that
> > > > > so far you're not ruling out the pinning option.
> > > > 
> > > > Since HMM and ZONE_DEVICE came up, I'm kinda tempted to make ZONE_DEVICE
> > > > ZONE_MOVABLE (at least if you don't have a pinned vram contingent in your
> > > > cgroups) or something like that, so we could benefit from the work to make
> > > > sure pin_user_pages and all these never end up in there?
> > > 
> > > ZONE_DEVICE should already not be returned from GUP.
> > > 
> > > I've understood in the hmm case the idea was a CPU touch of some
> > > ZONE_DEVICE pages would trigger a migration to CPU memory, GUP would
> > > want to follow the same logic, presumably it comes for free with the
> > > fault handler somehow
> > 
> > Oh I didn't know this, I thought the proposed p2p direct i/o patches would
> > just use the fact that underneath ZONE_DEVICE there's "normal" struct
> > pages. 
> 
> So, if that ever happens, it would be some special FOLL_ALLOW_P2P
> flag to get the behavior.
> 
> > And so I got worried that maybe also pin_user_pages can creep in.
> > But I didn't read the patches in full detail:
> 
> And yes, you might want to say that you can't longterm pin certain
> kinds of zone_device pages, but if that is the common operating mode
> then we'd probably never create a FOLL_ALLOW_P2P
> 
> > But if you're saying that this all needs specific code and all the gup/pup
> > code we have is excluded, I think we can make sure that we're not ever
> > building features that require time-unlimited pinning of
> > ZONE_DEVICE.
> 
> Well, it is certainly a useful idea for some uses of ZONE_DEVICE; GPU
> vram is not the whole world.

Yeah non-volatile RAM can probably pin whatever it wants :-)

From the other thread, I think if we can get some cgroups going for
accounting pinned memory, then pinning gpu memory should also not be any
real issue. Might be somewhat tricky to glue that into a FOLL_ALLOW_P2P
flag, maybe through zone-awareness or something like that. With the right
accounting in place I'm happy to let userspace pin whatever they want
really.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-02-05 Thread Jason Gunthorpe
On Fri, Feb 05, 2021 at 04:53:04PM +0100, Daniel Vetter wrote:
> On Fri, Feb 05, 2021 at 11:43:19AM -0400, Jason Gunthorpe wrote:
> > On Fri, Feb 05, 2021 at 04:39:47PM +0100, Daniel Vetter wrote:
> > 
> > > > And again, for slightly older hardware, without pinning to VRAM there is
> > > > no way to use this solution here for peer-to-peer. So I'm glad to see that
> > > > so far you're not ruling out the pinning option.
> > > 
> > > Since HMM and ZONE_DEVICE came up, I'm kinda tempted to make ZONE_DEVICE
> > > ZONE_MOVABLE (at least if you don't have a pinned vram contingent in your
> > > cgroups) or something like that, so we could benefit from the work to make
> > > sure pin_user_pages and all these never end up in there?
> > 
> > ZONE_DEVICE should already not be returned from GUP.
> > 
> > I've understood in the hmm case the idea was a CPU touch of some
> > ZONE_DEVICE pages would trigger a migration to CPU memory, GUP would
> > want to follow the same logic, presumably it comes for free with the
> > fault handler somehow
> 
> Oh I didn't know this, I thought the proposed p2p direct i/o patches would
> just use the fact that underneath ZONE_DEVICE there's "normal" struct
> pages. 

So, if that ever happens, it would be some special FOLL_ALLOW_P2P
flag to get the behavior.
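
(Purely illustrative, since FOLL_ALLOW_P2P does not exist: if such a flag were
ever added, the gate would conceptually be a check along these lines somewhere
in the GUP page-acceptance path. The flag value and the helper name are made
up for the sketch; only is_zone_device_page() and is_pci_p2pdma_page() are
real.)

#include <linux/mm.h>

#define FOLL_ALLOW_P2P 0x100000	/* hypothetical, not an upstream flag */

static bool gup_page_allowed(struct page *page, unsigned int gup_flags)
{
	/* ZONE_DEVICE pages are refused by GUP today; an opt-in flag would
	 * carve out only the P2P-capable subset. */
	if (is_zone_device_page(page))
		return (gup_flags & FOLL_ALLOW_P2P) &&
		       is_pci_p2pdma_page(page);
	return true;
}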

> And so I got worried that maybe also pin_user_pages can creep in.
> But I didn't read the patches in full detail:

And yes, you might want to say that you can't longterm pin certain
kinds of zone_device pages, but if that is the common operating mode
then we'd probably never create a FOLL_ALLOW_P2P

> But if you're saying that this all needs specific code and all the gup/pup
> code we have is excluded, I think we can make sure that we're not ever
> building features that require time-unlimited pinning of
> ZONE_DEVICE.

Well, it is certainly a useful idea for some uses of ZONE_DEVICE; GPU
vram is not the whole world.

Jason


Re: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-02-05 Thread Daniel Vetter
On Fri, Feb 05, 2021 at 11:43:19AM -0400, Jason Gunthorpe wrote:
> On Fri, Feb 05, 2021 at 04:39:47PM +0100, Daniel Vetter wrote:
> 
> > > And again, for slightly older hardware, without pinning to VRAM there is
> > > no way to use this solution here for peer-to-peer. So I'm glad to see that
> > > so far you're not ruling out the pinning option.
> > 
> > Since HMM and ZONE_DEVICE came up, I'm kinda tempted to make ZONE_DEVICE
> > ZONE_MOVABLE (at least if you don't have a pinned vram contingent in your
> > cgroups) or something like that, so we could benefit from the work to make
> > sure pin_user_pages and all these never end up in there?
> 
> ZONE_DEVICE should already not be returned from GUP.
> 
> I've understood in the hmm case the idea was a CPU touch of some
> ZONE_DEVICE pages would trigger a migration to CPU memory, GUP would
> want to follow the same logic, presumably it comes for free with the
> fault handler somehow

Oh I didn't know this, I thought the proposed p2p direct i/o patches would
just use the fact that underneath ZONE_DEVICE there's "normal" struct
pages. And so I got worried that maybe also pin_user_pages can creep in.
But I didn't read the patches in full detail:

https://lore.kernel.org/linux-block/20201106170036.18713-12-log...@deltatee.com/

But if you're saying that this all needs specific code and all the gup/pup
code we have is excluded, I think we can make sure that we're not ever
building features that require time-unlimited pinning of ZONE_DEVICE.
Which I think we want.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-02-05 Thread Jason Gunthorpe
On Fri, Feb 05, 2021 at 04:39:47PM +0100, Daniel Vetter wrote:

> > And again, for slightly older hardware, without pinning to VRAM there is
> > no way to use this solution here for peer-to-peer. So I'm glad to see that
> > so far you're not ruling out the pinning option.
> 
> Since HMM and ZONE_DEVICE came up, I'm kinda tempted to make ZONE_DEVICE
> ZONE_MOVABLE (at least if you don't have a pinned vram contingent in your
> cgroups) or something like that, so we could benefit from the work to make
> sure pin_user_pages and all these never end up in there?

ZONE_DEVICE should already not be returned from GUP.

I've understood in the hmm case the idea was a CPU touch of some
ZONE_DEVICE pages would trigger a migration to CPU memory, GUP would
want to follow the same logic, presumably it comes for free with the
fault handler somehow

Jason


Re: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-02-05 Thread Daniel Vetter
On Thu, Feb 04, 2021 at 11:00:32AM -0800, John Hubbard wrote:
> On 2/4/21 10:44 AM, Alex Deucher wrote:
> ...
> > > > The argument is that vram is a scarce resource, but I don't know if
> > > > that is really the case these days.  At this point, we often have as
> > > > much vram as system ram if not more.
> > > 
> > > I thought the main argument was that GPU memory could move at any time
> > > between the GPU and CPU and the DMA buf would always track its current
> > > location?
> > 
> > I think the reason for that is that VRAM is scarce so we have to be
> > able to move it around.  We don't enforce the same limitations for
> > buffers in system memory.  We could just support pinning dma-bufs in
> > vram like we do with system ram.  Maybe with some conditions, e.g.,
> > p2p is possible, and the device has a large BAR so you aren't tying up
> > the BAR window.

Minimally we need cgroups for that vram, so it can be managed. Which is a
bit stuck unfortunately. But if we have cgroups with some pin limit, I
think we can easily lift this.

> Excellent. And yes, we are already building systems in which VRAM is
> definitely not scarce, but on the other hand, those newer systems can
> also handle GPU (and NIC) page faults, so not really an issue. For that,
> we just need to enhance HMM so that it does peer to peer.
> 
> We also have some older hardware with large BAR1 apertures, specifically
> for this sort of thing.
> 
> And again, for slightly older hardware, without pinning to VRAM there is
> no way to use this solution here for peer-to-peer. So I'm glad to see that
> so far you're not ruling out the pinning option.

Since HMM and ZONE_DEVICE came up, I'm kinda tempted to make ZONE_DEVICE
ZONE_MOVABLE (at least if you don't have a pinned vram contingent in your
cgroups) or something like that, so we could benefit from the work to make
sure pin_user_pages and all these never end up in there?

https://lwn.net/Articles/843326/

Kinda inspired by the recent LWN article.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-02-04 Thread Jason Gunthorpe
On Thu, Feb 04, 2021 at 08:50:38AM -0500, Alex Deucher wrote:
> On Thu, Feb 4, 2021 at 2:48 AM John Hubbard  wrote:
> >
> > On 12/15/20 1:27 PM, Jianxin Xiong wrote:
> > > This patch series adds dma-buf importer role to the RDMA driver in
> > > attempt to support RDMA using device memory such as GPU VRAM. Dma-buf is
> > > chosen for a few reasons: first, the API is relatively simple and allows
> > > a lot of flexibility in implementing the buffer manipulation ops.
> > > Second, it doesn't require page structure. Third, dma-buf is already
> > > supported in many GPU drivers. However, we are aware that existing GPU
> > > drivers don't allow pinning device memory via the dma-buf interface.
> > > Pinning would simply cause the backing storage to migrate to system RAM.
> > > True peer-to-peer access is only possible using dynamic attach, which
> > > requires on-demand paging support from the NIC to work. For this reason,
> > > this series only works with ODP capable NICs.
> >
> > Hi,
> >
> > Looking ahead to after this patchset is merged...
> >
> > Are there design thoughts out there, about the future of pinning to vidmem,
> > for this? It would allow a huge group of older GPUs and NICs and such to
> > do p2p with this approach, and it seems like a natural next step, right?
> 
> The argument is that vram is a scarce resource, but I don't know if
> that is really the case these days.  At this point, we often have as
> much vram as system ram if not more.

I thought the main argument was that GPU memory could move at any time
between the GPU and CPU and the DMA buf would always track its current
location?

IMHO there is no reason not to have a special API to create small
amounts of GPU dedicated locked memory that cannot be moved off the
GPU.

For instance this paper:

http://www.ziti.uni-heidelberg.de/ziti/uploads/ce_group/2014-ASHESIPDPS.pdf

Considers using the GPU to directly drive the RDMA work
queues. Putting the queues themselves in GPU VRAM would make a lot of
sense.

But that is impossible without fixed non-invalidating dma bufs.
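
For concreteness, a hedged sketch of what a fixed, non-invalidating import
could look like with the existing dma-buf API, assuming an exporter whose
pin() keeps the buffer resident in VRAM instead of migrating it to system RAM
(today's GPU exporters do the latter). Not code from this series; error
handling is trimmed and import_pinned()/pinned_attach_ops are made-up names.

#include <linux/dma-buf.h>
#include <linux/dma-resv.h>
#include <linux/err.h>

static void pinned_move_notify(struct dma_buf_attachment *attach)
{
	WARN_ON_ONCE(1);	/* must never fire while the buffer is pinned */
}

static const struct dma_buf_attach_ops pinned_attach_ops = {
	.allow_peer2peer = true,
	.move_notify	 = pinned_move_notify,
};

static struct sg_table *import_pinned(struct dma_buf *dmabuf,
				      struct device *dev,
				      struct dma_buf_attachment **out)
{
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;
	int ret;

	/* dma_buf_pin() requires a dynamic attachment, even though
	 * move_notify() is not expected while pinned. */
	attach = dma_buf_dynamic_attach(dmabuf, dev, &pinned_attach_ops, NULL);
	if (IS_ERR(attach))
		return ERR_CAST(attach);

	dma_resv_lock(dmabuf->resv, NULL);
	ret = dma_buf_pin(attach);	/* today: exporters migrate to system RAM here */
	if (ret) {
		dma_resv_unlock(dmabuf->resv);
		dma_buf_detach(dmabuf, attach);
		return ERR_PTR(ret);
	}
	sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
	dma_resv_unlock(dmabuf->resv);

	*out = attach;
	return sgt;	/* stable until dma_buf_unpin() */
}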

Jason


Re: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-02-04 Thread John Hubbard

On 2/4/21 10:44 AM, Alex Deucher wrote:
...

The argument is that vram is a scarce resource, but I don't know if
that is really the case these days.  At this point, we often have as
much vram as system ram if not more.


I thought the main argument was that GPU memory could move at any time
between the GPU and CPU and the DMA buf would always track its current
location?


I think the reason for that is that VRAM is scarce so we have to be
able to move it around.  We don't enforce the same limitations for
buffers in system memory.  We could just support pinning dma-bufs in
vram like we do with system ram.  Maybe with some conditions, e.g.,
p2p is possible, and the device has a large BAR so you aren't tying up
the BAR window.



Excellent. And yes, we are already building systems in which VRAM is
definitely not scarce, but on the other hand, those newer systems can
also handle GPU (and NIC) page faults, so not really an issue. For that,
we just need to enhance HMM so that it does peer to peer.

We also have some older hardware with large BAR1 apertures, specifically
for this sort of thing.

And again, for slightly older hardware, without pinning to VRAM there is
no way to use this solution here for peer-to-peer. So I'm glad to see that
so far you're not ruling out the pinning option.



thanks,
--
John Hubbard
NVIDIA


Re: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-02-04 Thread Alex Deucher
On Thu, Feb 4, 2021 at 1:29 PM Jason Gunthorpe  wrote:
>
> On Thu, Feb 04, 2021 at 08:50:38AM -0500, Alex Deucher wrote:
> > On Thu, Feb 4, 2021 at 2:48 AM John Hubbard  wrote:
> > >
> > > On 12/15/20 1:27 PM, Jianxin Xiong wrote:
> > > > This patch series adds dma-buf importer role to the RDMA driver in
> > > > attempt to support RDMA using device memory such as GPU VRAM. Dma-buf is
> > > > chosen for a few reasons: first, the API is relatively simple and allows
> > > > a lot of flexibility in implementing the buffer manipulation ops.
> > > > Second, it doesn't require page structure. Third, dma-buf is already
> > > > supported in many GPU drivers. However, we are aware that existing GPU
> > > > drivers don't allow pinning device memory via the dma-buf interface.
> > > > Pinning would simply cause the backing storage to migrate to system RAM.
> > > > True peer-to-peer access is only possible using dynamic attach, which
> > > > requires on-demand paging support from the NIC to work. For this reason,
> > > > this series only works with ODP capable NICs.
> > >
> > > Hi,
> > >
> > > Looking ahead to after this patchset is merged...
> > >
> > > Are there design thoughts out there, about the future of pinning to vidmem,
> > > for this? It would allow a huge group of older GPUs and NICs and such to
> > > do p2p with this approach, and it seems like a natural next step, right?
> >
> > The argument is that vram is a scarce resource, but I don't know if
> > that is really the case these days.  At this point, we often have as
> > much vram as system ram if not more.
>
> I thought the main argument was that GPU memory could move at any time
> between the GPU and CPU and the DMA buf would always track its current
> location?

I think the reason for that is that VRAM is scarce so we have to be
able to move it around.  We don't enforce the same limitations for
buffers in system memory.  We could just support pinning dma-bufs in
vram like we do with system ram.  Maybe with some conditions, e.g.,
p2p is possible, and the device has a large BAR so you aren't tying up
the BAR window.
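
A hedged sketch of the kind of exporter-side policy check that suggests, just
to make the two conditions concrete. It is not from any driver; the helper
name and the "BAR must cover all of VRAM" rule are illustrative, and it
assumes both exporter and importer are PCI devices.

#include <linux/pci.h>
#include <linux/pci-p2pdma.h>

static bool vram_pinning_ok(struct pci_dev *gpu, struct device *importer,
			    resource_size_t vram_size, int vram_bar)
{
	struct device *clients[] = { importer };

	/* Condition 1: peer-to-peer DMA must actually reach the importer. */
	if (pci_p2pdma_distance_many(gpu, clients, 1, true) < 0)
		return false;

	/* Condition 2: a large (e.g. resizable) BAR, so pinned buffers do not
	 * monopolize a small BAR window. */
	if (pci_resource_len(gpu, vram_bar) < vram_size)
		return false;

	return true;
}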

Alex


>
> IMHO there is no reason not to have a special API to create small
> amounts of GPU dedicated locked memory that cannot be moved off the
> GPU.
>
> For instance this paper:
>
> http://www.ziti.uni-heidelberg.de/ziti/uploads/ce_group/2014-ASHESIPDPS.pdf
>
> Considers using the GPU to directly drive the RDMA work
> queues. Putting the queues themselves in GPU VRAM would make a lot of
> sense.
>
> But that is impossible without fixed non-invalidating dma bufs.
>
> Jason


Re: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-02-04 Thread Alex Deucher
On Thu, Feb 4, 2021 at 2:48 AM John Hubbard  wrote:
>
> On 12/15/20 1:27 PM, Jianxin Xiong wrote:
> > This patch series adds dma-buf importer role to the RDMA driver in
> > attempt to support RDMA using device memory such as GPU VRAM. Dma-buf is
> > chosen for a few reasons: first, the API is relatively simple and allows
> > a lot of flexibility in implementing the buffer manipulation ops.
> > Second, it doesn't require page structure. Third, dma-buf is already
> > supported in many GPU drivers. However, we are aware that existing GPU
> > drivers don't allow pinning device memory via the dma-buf interface.
> > Pinning would simply cause the backing storage to migrate to system RAM.
> > True peer-to-peer access is only possible using dynamic attach, which
> > requires on-demand paging support from the NIC to work. For this reason,
> > this series only works with ODP capable NICs.
>
> Hi,
>
> Looking ahead to after this patchset is merged...
>
> Are there design thoughts out there, about the future of pinning to vidmem,
> for this? It would allow a huge group of older GPUs and NICs and such to
> do p2p with this approach, and it seems like a natural next step, right?

The argument is that vram is a scarce resource, but I don't know if
that is really the case these days.  At this point, we often have as
much vram as system ram if not more.

Alex


Re: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-02-03 Thread John Hubbard

On 12/15/20 1:27 PM, Jianxin Xiong wrote:

This patch series adds dma-buf importer role to the RDMA driver in
attempt to support RDMA using device memory such as GPU VRAM. Dma-buf is
chosen for a few reasons: first, the API is relatively simple and allows
a lot of flexibility in implementing the buffer manipulation ops.
Second, it doesn't require page structure. Third, dma-buf is already
supported in many GPU drivers. However, we are aware that existing GPU
drivers don't allow pinning device memory via the dma-buf interface.
Pinning would simply cause the backing storage to migrate to system RAM.
True peer-to-peer access is only possible using dynamic attach, which
requires on-demand paging support from the NIC to work. For this reason,
this series only works with ODP capable NICs.


Hi,

Looking ahead to after this patchset is merged...

Are there design thoughts out there, about the future of pinning to vidmem,
for this? It would allow a huge group of older GPUs and NICs and such to
do p2p with this approach, and it seems like a natural next step, right?
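
For context on what "dynamic attach" means on the importer side, here is a
hedged sketch of the generic dma-buf pattern the cover letter refers to (not
code from this series). The invalidate_mappings() call and the my_* names are
placeholders for whatever driver-specific teardown the importer performs when
the exporter moves the buffer; for RDMA that is the ODP page-fault path.

#include <linux/dma-buf.h>
#include <linux/dma-resv.h>
#include <linux/err.h>

static void my_move_notify(struct dma_buf_attachment *attach)
{
	/* The exporter is about to move the buffer: tear down our mapping.
	 * The hardware must be able to fault and re-map later, hence the
	 * ODP requirement in this series. */
	invalidate_mappings(attach->importer_priv);	/* placeholder */
}

static const struct dma_buf_attach_ops my_attach_ops = {
	.allow_peer2peer = true,	/* request P2P addresses when possible */
	.move_notify	 = my_move_notify,
};

static struct sg_table *my_import(struct dma_buf *dmabuf,
				  struct device *dev, void *priv)
{
	struct dma_buf_attachment *attach;
	struct sg_table *sgt;

	attach = dma_buf_dynamic_attach(dmabuf, dev, &my_attach_ops, priv);
	if (IS_ERR(attach))
		return ERR_CAST(attach);

	/* Dynamic importers map under the reservation lock; the mapping
	 * stays valid only until move_notify() fires. */
	dma_resv_lock(dmabuf->resv, NULL);
	sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
	dma_resv_unlock(dmabuf->resv);
	return sgt;
}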


thanks,
--
John Hubbard
NVIDIA


Re: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-01-22 Thread Jason Gunthorpe
On Tue, Dec 15, 2020 at 01:27:12PM -0800, Jianxin Xiong wrote:
> Jianxin Xiong (4):
>   RDMA/umem: Support importing dma-buf as user memory region
>   RDMA/core: Add device method for registering dma-buf based memory
> region
>   RDMA/uverbs: Add uverbs command for dma-buf based MR registration
>   RDMA/mlx5: Support dma-buf based userspace memory region

I applied the below fix for rereg, but otherwise took this to rdma's
for-next

Thanks,
Jason

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index f9ca19fa531b45..a63ef7c66e383d 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1825,9 +1825,6 @@ struct ib_mr *mlx5_ib_rereg_user_mr(struct ib_mr *ib_mr, int flags, u64 start,
 	if (flags & ~(IB_MR_REREG_TRANS | IB_MR_REREG_PD | IB_MR_REREG_ACCESS))
 		return ERR_PTR(-EOPNOTSUPP);
 
-	if (is_dmabuf_mr(mr))
-		return ERR_PTR(-EOPNOTSUPP);
-
 	if (!(flags & IB_MR_REREG_ACCESS))
 		new_access_flags = mr->access_flags;
 	if (!(flags & IB_MR_REREG_PD))
@@ -1844,8 +1841,8 @@ struct ib_mr *mlx5_ib_rereg_user_mr(struct ib_mr *ib_mr, int flags, u64 start,
 			return ERR_PTR(err);
 		return NULL;
 	}
-	/* DM or ODP MR's don't have a umem so we can't re-use it */
-	if (!mr->umem || is_odp_mr(mr))
+	/* DM or ODP MR's don't have a normal umem so we can't re-use it */
+	if (!mr->umem || is_odp_mr(mr) || is_dmabuf_mr(mr))
 		goto recreate;
 
 	/*
@@ -1864,10 +1861,10 @@ struct ib_mr *mlx5_ib_rereg_user_mr(struct ib_mr *ib_mr, int flags, u64 start,
 	}
 
 	/*
-	 * DM doesn't have a PAS list so we can't re-use it, odp does but the
-	 * logic around releasing the umem is different
+	 * DM doesn't have a PAS list so we can't re-use it, odp/dmabuf does
+	 * but the logic around releasing the umem is different
 	 */
-	if (!mr->umem || is_odp_mr(mr))
+	if (!mr->umem || is_odp_mr(mr) || is_dmabuf_mr(mr))
 		goto recreate;
 
 	if (!(new_access_flags & IB_ACCESS_ON_DEMAND) &&


Re: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-01-13 Thread Yishai Hadas

On 1/11/2021 7:55 PM, Xiong, Jianxin wrote:

-Original Message-
From: Alex Deucher 
Sent: Monday, January 11, 2021 9:47 AM
To: Xiong, Jianxin 
Cc: Jason Gunthorpe ; Leon Romanovsky ; 
linux-r...@vger.kernel.org; dri-devel@lists.freedesktop.org;
Doug Ledford ; Vetter, Daniel ; 
Christian Koenig 
Subject: Re: [PATCH v16 0/4] RDMA: Add dma-buf support

On Mon, Jan 11, 2021 at 12:44 PM Xiong, Jianxin  wrote:

-Original Message-
From: Jason Gunthorpe 
Sent: Monday, January 11, 2021 7:43 AM
To: Xiong, Jianxin 
Cc: linux-r...@vger.kernel.org; dri-devel@lists.freedesktop.org;
Doug Ledford ; Leon Romanovsky
; Sumit Semwal ; Christian
Koenig ; Vetter, Daniel

Subject: Re: [PATCH v16 0/4] RDMA: Add dma-buf support

On Mon, Jan 11, 2021 at 03:24:18PM +, Xiong, Jianxin wrote:

Jason, will this series be able to get into 5.12?

I was going to ask you where things are after the break?

Did everyone agree the userspace stuff is OK now? Is Edward OK with
the pyverbs changes, etc


There is no new comment on both the kernel and userspace series. I
assume silence means no objection. I will ask for opinions on the userspace 
thread.

Do you have a link to the userspace thread?


https://www.spinics.net/lists/linux-rdma/msg98135.html

Any reason why the 'fork' comment that was given a few times wasn't
handled / answered?


Specifically,

ibv_reg_dmabuf_mr() doesn't call ibv_dontfork_range() but ibv_dereg_mr 
does call its opposite API (i.e. ibv_dofork_range())


Yishai



RE: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-01-12 Thread Xiong, Jianxin
 -Original Message-
> From: Yishai Hadas 
> Sent: Tuesday, January 12, 2021 4:49 AM
> To: Xiong, Jianxin ; Alex Deucher 
> 
> Cc: Jason Gunthorpe ; Leon Romanovsky ; 
> linux-r...@vger.kernel.org; dri-devel@lists.freedesktop.org;
> Doug Ledford ; Vetter, Daniel ; 
> Christian Koenig ; Yishai
> Hadas 
> Subject: Re: [PATCH v16 0/4] RDMA: Add dma-buf support
> 
> On 1/11/2021 7:55 PM, Xiong, Jianxin wrote:
> >> -Original Message-
> >> From: Alex Deucher 
> >> Sent: Monday, January 11, 2021 9:47 AM
> >> To: Xiong, Jianxin 
> >> Cc: Jason Gunthorpe ; Leon Romanovsky
> >> ; linux-r...@vger.kernel.org;
> >> dri-devel@lists.freedesktop.org; Doug Ledford ;
> >> Vetter, Daniel ; Christian Koenig
> >> 
> >> Subject: Re: [PATCH v16 0/4] RDMA: Add dma-buf support
> >>
> >> On Mon, Jan 11, 2021 at 12:44 PM Xiong, Jianxin  
> >> wrote:
> >>>> -Original Message-
> >>>> From: Jason Gunthorpe 
> >>>> Sent: Monday, January 11, 2021 7:43 AM
> >>>> To: Xiong, Jianxin 
> >>>> Cc: linux-r...@vger.kernel.org; dri-devel@lists.freedesktop.org;
> >>>> Doug Ledford ; Leon Romanovsky
> >>>> ; Sumit Semwal ;
> >>>> Christian Koenig ; Vetter, Daniel
> >>>> 
> >>>> Subject: Re: [PATCH v16 0/4] RDMA: Add dma-buf support
> >>>>
> >>>> On Mon, Jan 11, 2021 at 03:24:18PM +, Xiong, Jianxin wrote:
> >>>>> Jason, will this series be able to get into 5.12?
> >>>> I was going to ask you where things are after the break?
> >>>>
> >>>> Did everyone agree the userspace stuff is OK now? Is Edward OK with
> >>>> the pyverbs changes, etc
> >>>>
> >>> There is no new comment on both the kernel and userspace series.
> >>> I assume silence means no objection. I will ask for opinions on the 
> >>> userspace thread.
> >> Do you have a link to the userspace thread?
> >>
> > https://www.spinics.net/lists/linux-rdma/msg98135.html
> >
> Any reason why the 'fork' comment that was given a few times wasn't handled / answered?
> 
> Specifically,
> 
> ibv_reg_dmabuf_mr() doesn't call ibv_dontfork_range() but ibv_dereg_mr does 
> call its opposite API (i.e. ibv_dofork_range())
> 

Sorry, that part was missed. Strangely enough, a few of your replies didn't 
reach my inbox and I just found them in the web archives:  
https://www.spinics.net/lists/linux-rdma/msg97973.html, and 
https://www.spinics.net/lists/linux-rdma/msg98133.html

I will add a check to ibv_dereg_mr() to avoid calling ibv_dofork_range() for the
dmabuf case.
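
For reference, a hedged userspace sketch of the new verb, which also shows why
the fork bookkeeping is asymmetric: there is no CPU VA range to protect with
ibv_dontfork_range(), only a dma-buf fd plus offset/iova. The signature
follows the proposed rdma-core API in the linked userspace series;
gpu_export_dmabuf_fd() is a placeholder for however the GPU stack hands out
the fd.

#include <infiniband/verbs.h>

static struct ibv_mr *register_vram(struct ibv_pd *pd, size_t len)
{
	uint64_t offset = 0, iova = 0;
	int fd = gpu_export_dmabuf_fd(len);	/* placeholder */

	struct ibv_mr *mr = ibv_reg_dmabuf_mr(pd, offset, len, iova, fd,
					      IBV_ACCESS_LOCAL_WRITE |
					      IBV_ACCESS_REMOTE_READ |
					      IBV_ACCESS_REMOTE_WRITE);
	/* ... post work requests against mr ... */
	/* ibv_dereg_mr() must not undo a dontfork that was never done. */
	return mr;
}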

Thanks a lot for bringing this up again.

Jianxin
  




Re: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-01-11 Thread Jason Gunthorpe
On Mon, Jan 11, 2021 at 03:24:18PM +, Xiong, Jianxin wrote:
> Jason, will this series be able to get into 5.12?

I was going to ask you where things are after the break? 

Did everyone agree the userspace stuff is OK now? Is Edward OK with
the pyverbs changes, etc

Jason


RE: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-01-11 Thread Xiong, Jianxin
> -Original Message-
> From: Alex Deucher 
> Sent: Monday, January 11, 2021 9:47 AM
> To: Xiong, Jianxin 
> Cc: Jason Gunthorpe ; Leon Romanovsky ; 
> linux-r...@vger.kernel.org; dri-devel@lists.freedesktop.org;
> Doug Ledford ; Vetter, Daniel ; 
> Christian Koenig 
> Subject: Re: [PATCH v16 0/4] RDMA: Add dma-buf support
> 
> On Mon, Jan 11, 2021 at 12:44 PM Xiong, Jianxin  
> wrote:
> >
> > > -Original Message-
> > > From: Jason Gunthorpe 
> > > Sent: Monday, January 11, 2021 7:43 AM
> > > To: Xiong, Jianxin 
> > > Cc: linux-r...@vger.kernel.org; dri-devel@lists.freedesktop.org;
> > > Doug Ledford ; Leon Romanovsky
> > > ; Sumit Semwal ; Christian
> > > Koenig ; Vetter, Daniel
> > > 
> > > Subject: Re: [PATCH v16 0/4] RDMA: Add dma-buf support
> > >
> > > On Mon, Jan 11, 2021 at 03:24:18PM +, Xiong, Jianxin wrote:
> > > > Jason, will this series be able to get into 5.12?
> > >
> > > I was going to ask you where things are after the break?
> > >
> > > Did everyone agree the userspace stuff is OK now? Is Edward OK with
> > > the pyverbs changes, etc
> > >
> >
> > There is no new comment on both the kernel and userspace series. I
> > assume silence means no objection. I will ask for opinions on the userspace 
> > thread.
> 
> Do you have a link to the userspace thread?
> 
https://www.spinics.net/lists/linux-rdma/msg98135.html



Re: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-01-11 Thread Alex Deucher
On Mon, Jan 11, 2021 at 12:44 PM Xiong, Jianxin  wrote:
>
> > -Original Message-
> > From: Jason Gunthorpe 
> > Sent: Monday, January 11, 2021 7:43 AM
> > To: Xiong, Jianxin 
> > Cc: linux-r...@vger.kernel.org; dri-devel@lists.freedesktop.org; Doug 
> > Ledford ; Leon Romanovsky
> > ; Sumit Semwal ; Christian Koenig 
> > ; Vetter, Daniel
> > 
> > Subject: Re: [PATCH v16 0/4] RDMA: Add dma-buf support
> >
> > On Mon, Jan 11, 2021 at 03:24:18PM +, Xiong, Jianxin wrote:
> > > Jason, will this series be able to get into 5.12?
> >
> > I was going to ask you where things are after the break?
> >
> > Did everyone agree the userspace stuff is OK now? Is Edward OK with the 
> > pyverbs changes, etc
> >
>
> There is no new comment on both the kernel and userspace series. I assume
> silence means no objection. I will ask for opinions on the userspace thread.

Do you have a link to the userspace thread?

Thanks,

Alex


RE: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-01-11 Thread Xiong, Jianxin
> -Original Message-
> From: Jason Gunthorpe 
> Sent: Monday, January 11, 2021 7:43 AM
> To: Xiong, Jianxin 
> Cc: linux-r...@vger.kernel.org; dri-devel@lists.freedesktop.org; Doug Ledford 
> ; Leon Romanovsky
> ; Sumit Semwal ; Christian Koenig 
> ; Vetter, Daniel
> 
> Subject: Re: [PATCH v16 0/4] RDMA: Add dma-buf support
> 
> On Mon, Jan 11, 2021 at 03:24:18PM +, Xiong, Jianxin wrote:
> > Jason, will this series be able to get into 5.12?
> 
> I was going to ask you where things are after the break?
> 
> Did everyone agree the userspace stuff is OK now? Is Edward OK with the 
> pyverbs changes, etc
> 

There is no new comment on both the kernel and userspace series. I assume silence
means no objection. I will ask for opinions on the userspace thread.


RE: [PATCH v16 0/4] RDMA: Add dma-buf support

2021-01-11 Thread Xiong, Jianxin
Jason, will this series be able to get into 5.12?

> -Original Message-
> From: Xiong, Jianxin 
> Sent: Tuesday, December 15, 2020 1:27 PM
> To: linux-r...@vger.kernel.org; dri-devel@lists.freedesktop.org
> Cc: Xiong, Jianxin ; Doug Ledford 
> ; Jason Gunthorpe ; Leon Romanovsky
> ; Sumit Semwal ; Christian Koenig 
> ; Vetter, Daniel
> 
> Subject: [PATCH v16 0/4] RDMA: Add dma-buf support
> 
> This is the sixteenth version of the patch set. Changelog:
> 
> v16:
> * Add "select DMA_SHARED_BUFFER" to Kconfig when IB UMEM is enabled.
>   This fixes the auto build test error with a random config.
> 
> v15: https://www.spinics.net/lists/linux-rdma/msg98369.html
> * Rebase to the latest linux-rdma 'for-next' branch (commit 0583531bb9ef)
>   to pick up RDMA core and mlx5 updates
> * Let ib_umem_dmabuf_get() return 'struct ib_umem_dmabuf *' instead of
>   'struct ib_umem *'
> * Move the check of on demand paging support to mlx5_ib_reg_user_mr_dmabuf()
> * Check iova alignment at the entry point of the uverb command so that
>   mlx5_umem_dmabuf_default_pgsz() can always succeed
> 
> v14: https://www.spinics.net/lists/linux-rdma/msg98265.html
> * Check return value of dma_fence_wait()
> * Fix a dma-buf leak in ib_umem_dmabuf_get()
> * Fix return value type cast for ib_umem_dmabuf_get()
> * Return -EOPNOTSUPP instead of -EINVAL for unimplemented functions
> * Remove an unnecessary use of unlikely()
> * Remove left-over commit message resulting from rebase
> 
> v13: https://www.spinics.net/lists/linux-rdma/msg98227.html
> * Rebase to the latest linux-rdma 'for-next' branch (5.10.0-rc6+)
> * Check for device on-demand paging capability at the entry point of
>   the new verbs command to avoid calling device's reg_user_mr_dmabuf()
>   method when CONFIG_INFINIBAND_ON_DEMAND_PAGING is disabled.
> 
> v12: https://www.spinics.net/lists/linux-rdma/msg97943.html
> * Move the prototype of function ib_umem_dmabuf_release() to ib_umem.h
>   and remove umem_dmabuf.h
> * Break a line that is too long
> 
> v11: https://www.spinics.net/lists/linux-rdma/msg97860.html
> * Rework the parameter checking code inside ib_umem_dmabuf_get()
> * Fix incorrect error handling in the new verbs command handler
> * Put a duplicated code sequence for checking iova and setting page size
>   into a function
> * In the invalidation callback, check if the buffer has been mapped
>   and thus the presence of a valid driver mr is ensured
> * The patch that checks for dma_virt_ops is dropped because it is no
>   longer needed
> * The patch that documents that dma-buf size is fixed has landed at:
>   https://cgit.freedesktop.org/drm/drm-misc/commit/?id=476b485be03c
>   and thus is no longer included here
> * The matching user space patch set is sent separately
> 
> v10: https://www.spinics.net/lists/linux-rdma/msg97483.html
> * Don't map the pages in ib_umem_dmabuf_get(); use the size information
>   of the dma-buf object to validate the umem size instead
> * Use PAGE_SIZE directly instead of using ib_umem_find_best_pgsz() when
>   the MR is created since the pages have not been mapped yet and dma-buf
>   requires PAGE_SIZE anyway
> * Always call mlx5_umem_find_best_pgsz() after mapping the pages to
>   verify that the page size requirement is satisfied
> * Add a patch to document that dma-buf size is fixed
> 
> v9: https://www.spinics.net/lists/linux-rdma/msg97432.html
> * Clean up the code for sg list in-place modification
> * Prevent dma-buf pages from being mapped multiple times
> * Map the pages in ib_umem_dmabuf_get() so that improper values of
>   address/length/iova can be caught early
> * Check for unsupported flags in the new uverbs command
> * Add missing uverbs_finalize_uobj_create()
> * Sort uverbs objects by name
> * Fix formatting issue -- unnecessary alignment of '='
> * Unmap pages in mlx5_ib_fence_dmabuf_mr()
> * Remove address range checking from pagefault_dmabuf_mr()
> 
> v8: https://www.spinics.net/lists/linux-rdma/msg97370.html
> * Modify the dma-buf sg list in place to get a proper umem sg list and
>   restore it before calling dma_buf_unmap_attachment()
> * Validate the umem sg list with ib_umem_find_best_pgsz()
> * Remove the logic for slicing the sg list at runtime
> 
> v7: https://www.spinics.net/lists/linux-rdma/msg97297.html
> * Rebase on top of latest mlx5 MR patch series
> * Slice dma-buf sg list at runtime instead of creating a new list
> * Preload the buffer page mapping when the MR is created
> * Move the 'dma_virt_ops' check into dma_buf_dynamic_attach()
> 
> v6: https://www.spinics.net/lists/linux-rdma/msg96923.html
> * Move the dma-buf invalidation callback from the core to the device
>   driver
> * M

[PATCH v16 0/4] RDMA: Add dma-buf support

2020-12-15 Thread Jianxin Xiong
This is the sixteenth version of the patch set. Changelog:

v16:
* Add "select DMA_SHARED_BUFFER" to Kconfig when IB UMEM is enabled.
  This fixes the auto build test error with a random config.

v15: https://www.spinics.net/lists/linux-rdma/msg98369.html
* Rebase to the latest linux-rdma 'for-next' branch (commit 0583531bb9ef)
  to pick up RDMA core and mlx5 updates
* Let ib_umem_dmabuf_get() return 'struct ib_umem_dmabuf *' instead of
  'struct ib_umem *'
* Move the check of on demand paging support to mlx5_ib_reg_user_mr_dmabuf()
* Check iova alignment at the entry point of the uverb command so that
  mlx5_umem_dmabuf_default_pgsz() can always succeed

v14: https://www.spinics.net/lists/linux-rdma/msg98265.html
* Check return value of dma_fence_wait()
* Fix a dma-buf leak in ib_umem_dmabuf_get()
* Fix return value type cast for ib_umem_dmabuf_get()
* Return -EOPNOTSUPP instead of -EINVAL for unimplemented functions
* Remove an unnecessary use of unlikely()
* Remove left-over commit message resulting from rebase

v13: https://www.spinics.net/lists/linux-rdma/msg98227.html
* Rebase to the latest linux-rdma 'for-next' branch (5.10.0-rc6+)
* Check for device on-demand paging capability at the entry point of
  the new verbs command to avoid calling device's reg_user_mr_dmabuf()
  method when CONFIG_INFINIBAND_ON_DEMAND_PAGING is disabled.

v12: https://www.spinics.net/lists/linux-rdma/msg97943.html
* Move the prototype of function ib_umem_dmabuf_release() to ib_umem.h
  and remove umem_dmabuf.h
* Break a line that is too long

v11: https://www.spinics.net/lists/linux-rdma/msg97860.html
* Rework the parameter checking code inside ib_umem_dmabuf_get() 
* Fix incorrect error handling in the new verbs command handler
* Put a duplicated code sequence for checking iova and setting page size
  into a function
* In the invalidation callback, check if the buffer has been mapped
  and thus the presence of a valid driver mr is ensured
* The patch that checks for dma_virt_ops is dropped because it is no
  longer needed
* The patch that documents that dma-buf size is fixed has landed at:
  https://cgit.freedesktop.org/drm/drm-misc/commit/?id=476b485be03c
  and thus is no longer included here
* The matching user space patch set is sent separately

v10: https://www.spinics.net/lists/linux-rdma/msg97483.html
* Don't map the pages in ib_umem_dmabuf_get(); use the size information
  of the dma-buf object to validate the umem size instead
* Use PAGE_SIZE directly instead of using ib_umem_find_best_pgsz() when
  the MR is created since the pages have not been mapped yet and dma-buf
  requires PAGE_SIZE anyway
* Always call mlx5_umem_find_best_pgsz() after mapping the pages to
  verify that the page size requirement is satisfied
* Add a patch to document that dma-buf size is fixed

v9: https://www.spinics.net/lists/linux-rdma/msg97432.html
* Clean up the code for sg list in-place modification
* Prevent dma-buf pages from being mapped multiple times
* Map the pages in ib_umem_dmabuf_get() so that improper values of
  address/length/iova can be caught early
* Check for unsupported flags in the new uverbs command
* Add missing uverbs_finalize_uobj_create()
* Sort uverbs objects by name
* Fix formatting issue -- unnecessary alignment of '='
* Unmap pages in mlx5_ib_fence_dmabuf_mr()
* Remove address range checking from pagefault_dmabuf_mr()

v8: https://www.spinics.net/lists/linux-rdma/msg97370.html
* Modify the dma-buf sg list in place to get a proper umem sg list and
  restore it before calling dma_buf_unmap_attachment()
* Validate the umem sg list with ib_umem_find_best_pgsz()
* Remove the logic for slicing the sg list at runtime

v7: https://www.spinics.net/lists/linux-rdma/msg97297.html
* Rebase on top of latest mlx5 MR patch series
* Slice dma-buf sg list at runtime instead of creating a new list
* Preload the buffer page mapping when the MR is created
* Move the 'dma_virt_ops' check into dma_buf_dynamic_attach()

v6: https://www.spinics.net/lists/linux-rdma/msg96923.html
* Move the dma-buf invalidation callback from the core to the device
  driver
* Move mapping update from work queue to pagefault handler
* Add dma-buf based MRs to the xarray of mmkeys so that the pagefault
  handler can be reached
* Update the new driver method and uverbs command signature by changing
  the parameter 'addr' to 'offset'
* Modify the sg list returned from dma_buf_map_attachment() based on
  the parameters 'offset' and 'length'
* Don't import dma-buf if 'dma_virt_ops' is used by the dma device
* The patch that clarifies dma-buf sg lists alignment has landed at
  https://cgit.freedesktop.org/drm/drm-misc/commit/?id=ac80cd17a615
  and thus is no longer included with this set

v5: https://www.spinics.net/lists/linux-rdma/msg96786.html
* Fix a few warnings reported by kernel test robot:
- no previous prototype for function 'ib_umem_dmabuf_release' 
- no previous prototype for function 'ib_umem_dmabuf_map_pages'
- comparison of distinct