> -----Original Message-----
> From: Daniel Vetter <dan...@ffwll.ch>
> Sent: Monday, November 30, 2020 8:56 AM
> To: Jason Gunthorpe <j...@ziepe.ca>
> Cc: Daniel Vetter <dan...@ffwll.ch>; Xiong, Jianxin 
> <jianxin.xi...@intel.com>; linux-r...@vger.kernel.org; dri-
> de...@lists.freedesktop.org; Leon Romanovsky <l...@kernel.org>; Doug Ledford 
> <dledf...@redhat.com>; Vetter, Daniel
> <daniel.vet...@intel.com>; Christian Koenig <christian.koe...@amd.com>
> Subject: Re: [PATCH rdma-core v3 4/6] pyverbs: Add dma-buf based MR support
> 
> On Mon, Nov 30, 2020 at 12:36:42PM -0400, Jason Gunthorpe wrote:
> > On Mon, Nov 30, 2020 at 05:04:43PM +0100, Daniel Vetter wrote:
> > > On Mon, Nov 30, 2020 at 11:55:44AM -0400, Jason Gunthorpe wrote:
> > > > On Mon, Nov 30, 2020 at 03:57:41PM +0100, Daniel Vetter wrote:
> > > > > > +   err = ioctl(dri->fd, DRM_IOCTL_AMDGPU_GEM_CREATE, &gem_create);
> > > > > > +   if (err)
> > > > > > +           return err;
> > > > > > +
> > > > > > +   *handle = gem_create.out.handle;
> > > > > > +   return 0;
> > > > > > +}
> > > > > > +
> > > > > > +static int radeon_alloc(struct dri *dri, size_t size,
> > > > > > +                        uint32_t *handle)
> > > > >
> > > > > Tbh radeon chips are old enough I wouldn't care. Also doesn't
> > > > > support p2p dma-buf, so always going to be in system memory when
> > > > > you share. Plus you also need some more flags like I suggested above 
> > > > > I think.
> > > >
> > > > What about nouveau?
> > >
> > > Realistically, chances that someone wants to use rdma together with
> > > the upstream nouveau driver are roughly nil. Imo it also needs someone
> > > with the right hardware to make sure it works (since the flags are
> > > all kinda arcane driver-specific stuff, testing is really needed).
> >
> > Well, it would be helpful if we could test the mlx5 part of the
> > implementation, and I have a lab stocked with nouveau-compatible HW..
> >
> > But you are right that someone needs to test/etc, so this does not seem
> > like something Jianxin should worry about
> 
> Ah yes, sounds good. I can help with figuring out how to allocate vram with 
> nouveau if you don't find it. Caveat is that nouveau doesn't do
> dynamic dma-buf exports, and hence none of the interesting flows, and also no 
> p2p. Not sure how much work it would be to roll that out (iirc
> it wasn't that much amdgpu code really, just endless discussions on the 
> interface semantics and how to roll it out without breaking any of
> the existing dma-buf users).
> 
> Another thing that just crossed my mind: Do we have a testcase for forcing 
> the eviction? Should be fairly easy to provoke with something
> like this:
> 
> - register vram-only buffer with mlx5 and do something that binds it
> - allocate enough vram-only buffers to overfill vram (again figuring out
>   how much vram you have is driver specific)
> - touch each buffer with mmap. that should force the mlx5 buffer out. it
>   might be that eviction isn't LRU but prefers idle buffers (i.e. buffers
>   not used by hw, so anything registered with mlx5 won't qualify as a first
>   victim). so we might need to instead register a ton of buffers with
>   mlx5 and access them through ibverbs
> - do something with mlx5 again to force the rebinding and test it all
>   keeps working
> 
> That entire invalidate/buffer move flow is the most complex interaction I 
> think.

Right now on my side the eviction scenario is tested with the "timeout" feature
of the AMD GPU driver: it moves all VRAM allocations to system memory after a
certain period of "inactivity" (10s by default), and VRAM being accessed by peer
DMA does not count as activity from the GPU's point of view. So I can observe
the invalidation/remapping sequence simply by running an RDMA test for a long
enough time.

I agree that a more generic mechanism to force this scenario would be useful.
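
For reference, the kind of loop I would expect to use for forcing the eviction
explicitly is sketched below. It is an untested sketch against the amdgpu GEM
uAPI (amdgpu_drm.h); the buffer size and count would have to be derived from the
device's actual VRAM size, which is driver specific, and the helper name
amdgpu_overfill_vram is made up here:

/*
 * Untested sketch: overfill VRAM with CPU-touched GEM buffers so that TTM
 * has to evict other VRAM residents (e.g. a buffer that is currently
 * registered as a dma-buf MR with mlx5). size/count are assumed to be
 * chosen from the device's reported VRAM size.
 */
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <drm/amdgpu_drm.h>

static int amdgpu_overfill_vram(int drm_fd, size_t size, unsigned int count)
{
	for (unsigned int i = 0; i < count; i++) {
		union drm_amdgpu_gem_create create = {};
		union drm_amdgpu_gem_mmap gmap = {};
		void *cpu;

		create.in.bo_size = size;
		create.in.alignment = 4096;
		create.in.domains = AMDGPU_GEM_DOMAIN_VRAM;
		/* CPU access needed so the memset below touches VRAM pages */
		create.in.domain_flags = AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
		if (ioctl(drm_fd, DRM_IOCTL_AMDGPU_GEM_CREATE, &create))
			return -1;

		/* look up the fake mmap offset for this GEM handle */
		gmap.in.handle = create.out.handle;
		if (ioctl(drm_fd, DRM_IOCTL_AMDGPU_GEM_MMAP, &gmap))
			return -1;

		cpu = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
			   drm_fd, gmap.out.addr_ptr);
		if (cpu == MAP_FAILED)
			return -1;

		/* touch every page to make the buffer resident */
		memset(cpu, 0, size);
		munmap(cpu, size);
		/* handles are leaked here; a real test would close them
		 * with DRM_IOCTL_GEM_CLOSE once the check is done */
	}
	return 0;
}

After this loop the registered MR would be accessed again through ibverbs to
exercise the invalidate/rebind path you describe.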

> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch