Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-10-12 Thread Daniel Vetter
On Thu, Oct 12, 2023 at 02:35:15PM +0200, Christian König wrote:
> Am 12.10.23 um 12:33 schrieb Dave Airlie:
> > On Wed, 11 Oct 2023 at 17:07, Christian König  
> > wrote:
> > > Am 10.10.23 um 22:23 schrieb Dave Airlie:
> > > > > I think we're then optimizing for different scenarios. Our compute
> > > > > driver will use mostly external objects only, and if shared, I don't
> > > > > foresee them bound to many VMs. What saves us currently here is that in
> > > > > compute mode we only really traverse the extobj list after a preempt
> > > > > fence wait, or when a vm is using a new context for the first time. So
> > > > > vm's extobj list is pretty large. Each bo's vma list will typically be
> > > > > pretty small.
> > > > Can I ask why we are optimising for this userspace, this seems
> > > > incredibly broken.
> > > > 
> > > > We've had this sort of problem in the past with Intel letting the tail
> > > > wag the horse, does anyone remember optimising relocations for a
> > > > userspace that didn't actually need to use relocations?
> > > > 
> > > > We need to ask why this userspace is doing this, can we get some
> > > > pointers to it? compute driver should have no reason to use mostly
> > > > external objects, the OpenCL and level0 APIs should be good enough to
> > > > figure this out.
> > > Well, that is a pretty normal use case, AMD works the same way.
> > > 
> > > In a multi GPU compute stack you have mostly all the data shared between
> > > different hardware devices.
> > > 
> > > As I said before, looking at just the Vulkan use case is not a good idea
> > > at all.
> > > 
> > It's okay, I don't think anyone is doing that, some of these
> > use-cases are buried in server land and you guys don't communicate
> > them very well.
> 
> Yeah, well everybody is trying very hard to get away from those approaches
> :)
> 
> But so far there hasn't been any breakthrough.
> 
> > 
> > multi-gpu compute would, I'd hope, be moving towards HMM/SVM type
> > solutions though?
> 
> Unfortunately not in the foreseeable future. HMM seems more and more like a
> dead end, at least for AMD.
> 
> AMD still has hardware support in all of their MI* products, but for Navi
> the features necessary for implementing HMM have been dropped. And it looks
> more and more like they are not going to come back.
> 
> Additionally, from the software side, Felix summarized it quite well in the
> HMM peer2peer discussion thread recently. A buffer-object-based approach is
> not only simpler to handle, but also performance-wise multiple orders of
> magnitude faster.

This matches what I'm hearing from all over. Turns out that handling page
faults in full generality in a compute/accel device (not just gpu) is just
too damn hard. At least for anyone who isn't nvidia. Usually time-bound
preemption guarantees are the first to go, followed right after by a long
list of more fixed function hardware blocks that outright can't cope with
page faults.

There are so many corner cases where it breaks down that I feel like device
driver allocated memory of one flavor or another will stick around for a
very long time.

This isn't even counting the software challenges.
-Sima

> > I'm also not into looking at use-cases that used to be important but
> > might not be as important going forward.
> 
> Well multimedia applications and OpenGL are still around, but it's not the
> main focus any more.
> 
> Christian.
> 
> > 
> > Dave.
> > 
> > 
> > > Christian.
> > > 
> > > > Dave.
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-10-12 Thread Christian König

Am 12.10.23 um 12:33 schrieb Dave Airlie:

On Wed, 11 Oct 2023 at 17:07, Christian König  wrote:

Am 10.10.23 um 22:23 schrieb Dave Airlie:

I think we're then optimizing for different scenarios. Our compute
driver will use mostly external objects only, and if shared, I don't
foresee them bound to many VMs. What saves us currently here is that in
compute mode we only really traverse the extobj list after a preempt
fence wait, or when a vm is using a new context for the first time. So
vm's extobj list is pretty large. Each bo's vma list will typically be
pretty small.

Can I ask why we are optimising for this userspace, this seems
incredibly broken.

We've had this sort of problem in the past with Intel letting the tail
wag the horse, does anyone remember optimising relocations for a
userspace that didn't actually need to use relocations?

We need to ask why this userspace is doing this, can we get some
pointers to it? compute driver should have no reason to use mostly
external objects, the OpenCL and level0 APIs should be good enough to
figure this out.

Well, that is a pretty normal use case, AMD works the same way.

In a multi GPU compute stack you have mostly all the data shared between
different hardware devices.

As I said before, looking at just the Vulkan use case is not a good idea
at all.


It's okay, I don't think anyone is doing that, some of these
use-cases are buried in server land and you guys don't communicate
them very well.


Yeah, well everybody is trying very hard to get away from those 
approaches :)


But so far there hasn't been any breakthrough.



multi-gpu compute would, I'd hope, be moving towards HMM/SVM type
solutions though?


Unfortunately not in the foreseeable future. HMM seems more and more 
like a dead end, at least for AMD.


AMD still has hardware support in all of their MI* products, but for 
Navi the features necessary for implementing HMM have been dropped. And 
it looks more and more like they are not going to come back.


Additionally, from the software side, Felix summarized it quite well in the
HMM peer2peer discussion thread recently. A buffer-object-based approach is
not only simpler to handle, but also performance-wise multiple orders of
magnitude faster.



I'm also not into looking at use-cases that used to be important but
might not be as important going forward.


Well multimedia applications and OpenGL are still around, but it's not 
the main focus any more.


Christian.



Dave.



Christian.


Dave.




Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-10-12 Thread Dave Airlie
On Wed, 11 Oct 2023 at 17:07, Christian König  wrote:
>
> Am 10.10.23 um 22:23 schrieb Dave Airlie:
> >> I think we're then optimizing for different scenarios. Our compute
> >> driver will use mostly external objects only, and if shared, I don't
> >> foresee them bound to many VMs. What saves us currently here is that in
> >> compute mode we only really traverse the extobj list after a preempt
> >> fence wait, or when a vm is using a new context for the first time. So
> >> vm's extobj list is pretty large. Each bo's vma list will typically be
> >> pretty small.
> > Can I ask why we are optimising for this userspace, this seems
> > incredibly broken.
> >
> > We've had this sort of problem in the past with Intel letting the tail
> > wag the horse, does anyone remember optimising relocations for a
> > userspace that didn't actually need to use relocations?
> >
> > We need to ask why this userspace is doing this, can we get some
> > pointers to it? compute driver should have no reason to use mostly
> > external objects, the OpenCL and level0 APIs should be good enough to
> > figure this out.
>
> Well, that is a pretty normal use case, AMD works the same way.
>
> In a multi GPU compute stack you have mostly all the data shared between
> different hardware devices.
>
> As I said before, looking at just the Vulkan use case is not a good idea
> at all.
>

It's okay, I don't think anyone is doing that, some of these
use-cases are buried in server land and you guys don't communicate
them very well.

multi-gpu compute would, I'd hope, be moving towards HMM/SVM type
solutions though?

I'm also not into looking at use-cases that used to be important but
might not be as important going forward.

Dave.


> Christian.
>
> >
> > Dave.
>


Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-10-11 Thread Thomas Hellström
On Wed, 2023-10-11 at 06:23 +1000, Dave Airlie wrote:
> > I think we're then optimizing for different scenarios. Our compute
> > driver will use mostly external objects only, and if shared, I
> > don't
> > foresee them bound to many VMs. What saves us currently here is that
> > in
> > compute mode we only really traverse the extobj list after a
> > preempt
> > fence wait, or when a vm is using a new context for the first time.
> > So
> > vm's extobj list is pretty large. Each bo's vma list will typically
> > be
> > pretty small.
> 
> Can I ask why we are optimising for this userspace, this seems
> incredibly broken.

First, judging from the discussion with Christian, this is not really
uncommon. There *are* tricks of assorted cleverness we can play in the KMD
to reduce the extobj list size, but doing that in the KMD wouldn't be much
different from accepting a large extobj list size and doing what we can to
reduce the overhead of iterating over it.

Second, the discussion here was really about whether we should be using
a lower-level lock to allow for async state updates, with a rather
complex mechanism involving weak reference counting and a requirement to
drop the locks within the loop to avoid lock inversion. If that were
a simplification with little or no overhead, all fine, but IMO it's not
a simplification?

> 
> We've had this sort of problem in the past with Intel letting the
> tail
> wag the horse, does anyone remember optimising relocations for a
> userspace that didn't actually need to use relocations?

> 
> We need to ask why this userspace is doing this, can we get some
> pointers to it? compute driver should have no reason to use mostly
> external objects, the OpenCL and level0 APIs should be good enough to
> figure this out.

TBH, for the compute UMD case I'd be prepared to drop the *performance*
argument for fine-grained locking of the extobj list, since it's really only
traversed on new contexts and preemption. But as Christian mentions,
there might be other cases. We should perhaps figure those out and
document them?

/Thomas


> 
> Dave.



Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-10-11 Thread Christian König

Am 10.10.23 um 22:23 schrieb Dave Airlie:

I think we're then optimizing for different scenarios. Our compute
driver will use mostly external objects only, and if shared, I don't
foresee them bound to many VMs. What saves us currently here is that in
compute mode we only really traverse the extobj list after a preempt
fence wait, or when a vm is using a new context for the first time. So
vm's extobj list is pretty large. Each bo's vma list will typically be
pretty small.

Can I ask why we are optimising for this userspace, this seems
incredibly broken.

We've had this sort of problem in the past with Intel letting the tail
wag the horse, does anyone remember optimising relocations for a
userspace that didn't actually need to use relocations?

We need to ask why this userspace is doing this, can we get some
pointers to it? compute driver should have no reason to use mostly
external objects, the OpenCL and level0 APIs should be good enough to
figure this out.


Well, that is a pretty normal use case, AMD works the same way.

In a multi GPU compute stack you have mostly all the data shared between 
different hardware devices.


As I said before, looking at just the Vulkan use case is not a good idea
at all.


Christian.



Dave.




Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-10-10 Thread Dave Airlie
> I think we're then optimizing for different scenarios. Our compute
> driver will use mostly external objects only, and if shared, I don't
> foresee them bound to many VMs. What saves us currently here is that in
> compute mode we only really traverse the extobj list after a preempt
> fence wait, or when a vm is using a new context for the first time. So
> vm's extobj list is pretty large. Each bo's vma list will typically be
> pretty small.

Can I ask why we are optimising for this userspace, this seems
incredibly broken.

We've had this sort of problem in the past with Intel letting the tail
wag the horse, does anyone remember optimising relocations for a
userspace that didn't actually need to use relocations?

We need to ask why this userspace is doing this, can we get some
pointers to it? compute driver should have no reason to use mostly
external objects, the OpenCL and level0 APIs should be good enough to
figure this out.

Dave.


Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-09-06 Thread Danilo Krummrich

On 9/1/23 14:10, Danilo Krummrich wrote:

On Fri, Sep 01, 2023 at 07:59:21AM +0200, Thomas Hellström (Intel) wrote:


On 8/31/23 21:07, Danilo Krummrich wrote:

On Thu, Aug 31, 2023 at 06:53:01PM +0200, Thomas Hellström (Intel) wrote:

Hi,

On 8/31/23 13:18, Danilo Krummrich wrote:

On Thu, Aug 31, 2023 at 11:04:06AM +0200, Thomas Hellström (Intel) wrote:

Hi!

On 8/30/23 17:00, Danilo Krummrich wrote:

On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) wrote:

On 8/30/23 14:49, Danilo Krummrich wrote:

Hi Thomas,

thanks for having a look!

On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) wrote:

Hi, Danilo.

Some quick comments since I'm doing some Xe work in this area. Will probably
get back with more.

On 8/20/23 23:53, Danilo Krummrich wrote:




diff --git a/include/drm/drm_gpuva_mgr.h b/include/drm/drm_gpuva_mgr.h
index ed8d50200cc3..693e2da3f425 100644
--- a/include/drm/drm_gpuva_mgr.h
+++ b/include/drm/drm_gpuva_mgr.h
@@ -26,12 +26,16 @@
*/
   #include 
+#include 
+#include 
   #include 
   #include 
   #include 
+#include 
   struct drm_gpuva_manager;
+struct drm_gpuva_gem;
   struct drm_gpuva_fn_ops;
   /**
@@ -140,7 +144,7 @@ struct drm_gpuva {
   int drm_gpuva_insert(struct drm_gpuva_manager *mgr, struct drm_gpuva 
*va);
   void drm_gpuva_remove(struct drm_gpuva *va);
-void drm_gpuva_link(struct drm_gpuva *va);
+void drm_gpuva_link(struct drm_gpuva *va, struct drm_gpuva_gem *vm_bo);
   void drm_gpuva_unlink(struct drm_gpuva *va);
   struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager *mgr,
@@ -240,15 +244,137 @@ struct drm_gpuva_manager {
 * @ops: &drm_gpuva_fn_ops providing the split/merge steps to drivers
 */
const struct drm_gpuva_fn_ops *ops;
+
+   /**
+* @d_obj: Dummy GEM object; used internally to pass the GPU VMs
+* dma-resv to &drm_exec.
+*/
+   struct drm_gem_object d_obj;
+
+   /**
+* @resv: the &dma_resv for &drm_gem_objects mapped in this GPU VA
+* space
+*/
+   struct dma_resv *resv;
+
+   /**
+* @exec: the &drm_exec helper to lock external &drm_gem_objects
+*/
+   struct drm_exec exec;
+
+   /**
+* @mt_ext: &maple_tree storing external &drm_gem_objects
+*/
+   struct maple_tree mt_ext;

Why are you using a maple tree here? Insertion and removal is O(log(n))
instead of O(1) for a list?


Having a list of drm_gem_objects directly wouldn't work, as multiple GPU-VMs
could have mappings of the same extobj.

I considered using the VM_BO abstraction (struct drm_gpuva_gem) as list entry
instead, which also seems to be the obvious choice. However, there is a locking
conflict.

A drm_gem_object keeps a list of drm_gpuva_gems, while each drm_gpuva_gem keeps
a list of drm_gpuvas. Both lists are either protected with the dma-resv lock of
the corresponding drm_gem_object, or with an external lock provided by the
driver (see drm_gem_gpuva_set_lock()). The latter is used by drivers performing
changes on the GPUVA space directly from the fence signalling path.

Now, similar to what drm_gpuva_link() and drm_gpuva_unlink() are doing already,
we'd want to add a drm_gpuva_gem to the extobj list for the first mapping being
linked and we'd want to remove it for the last one being unlinked.

(Actually we'd want to add the drm_gpuva_gem object to the extobj list even
before, because otherwise we'd not acquire its dma-resv lock of this GEM object
through drm_gpuva_manager_lock(). But that's trivial, we could do that when we
create the drm_gpuva_gem, which we need to do anyways.)

Anyway, we'd probably want to keep removing the drm_gpuva_gem from the extobj
list from drm_gpuva_unlink() when the last mapping of this BO is unlinked. In
order to do so, we'd (as discussed above) either need to hold the outer GPU-VM
lock or the GPU-VMs dma-resv lock. Both would be illegal in the case
drm_gpuva_unlink() is called from within the fence signalling path. For drivers
like XE or Nouveau, we'd at least need to make sure to not mess up the locking
hierarchy of GPU-VM lock and dma-resv lock of the corresponding BO.

Considering all that, I thought it's probably better to track extobjs separate
from the drm_gpuva_gem, hence the maple tree choice.

Hm. OK, in Xe we're having a list of the xe_vmas (drm_gpuvas) that point to
external objects, or in the case of multiple mappings to the same gem
object, only one of the drm_gpuvas is in the list. These are protected by
the GPU-VM lock. I don't see a problem with removing those from the fence
signalling path, though?

I intentionally tried to avoid keeping a list of drm_gpuvas to track extobjs,
since this is generic code, I don't know how many mappings of an external object
the corresponding driver potentially creates. This could become a pretty large
list to iterate. Another reason was that I want to keep the drm_gpuva structure
as small as possible, hence avoiding another list_head.
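
As a rough, hedged sketch of what the maple-tree-based extobj tracking
described above could look like (all names here are illustrative, not the
actual patch code): one entry per external GEM object, keyed by the object
pointer, refcounted per mapping so that multiple mappings of the same extobj
share a single entry.

#include <linux/gfp.h>
#include <linux/maple_tree.h>
#include <drm/drm_gem.h>

/* Hypothetical per-VM entry for one external GEM object. */
struct extobj_entry {
	struct drm_gem_object *obj;
	unsigned int mapping_count;	/* mappings of @obj in this GPU-VM */
};

/* Track @entry->obj on its first mapping; O(log n) maple tree insertion. */
static int extobj_track(struct maple_tree *mt_ext, struct extobj_entry *entry)
{
	return mtree_insert(mt_ext, (unsigned long)entry->obj, entry,
			    GFP_KERNEL);
}

/* Untrack the object once its last mapping is gone; O(log n) removal. */
static struct extobj_entry *extobj_untrack(struct maple_tree *mt_ext,
					   struct drm_gem_object *obj)
{
	return mtree_erase(mt_ext, (unsigned long)obj);
}

The point, as argued above, is that insert/remove is keyed by the GEM object
and does not touch the drm_gpuva_gem or gpuva lists at all, so it avoids the
locking conflict and keeps struct drm_gpuva small, at the cost of O(log(n))
instead of O(1) updates.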


Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-09-01 Thread Danilo Krummrich
On Fri, Sep 01, 2023 at 07:59:21AM +0200, Thomas Hellström (Intel) wrote:
> 
> On 8/31/23 21:07, Danilo Krummrich wrote:
> > On Thu, Aug 31, 2023 at 06:53:01PM +0200, Thomas Hellström (Intel) wrote:
> > > Hi,
> > > 
> > > On 8/31/23 13:18, Danilo Krummrich wrote:
> > > > On Thu, Aug 31, 2023 at 11:04:06AM +0200, Thomas Hellström (Intel) 
> > > > wrote:
> > > > > Hi!
> > > > > 
> > > > > On 8/30/23 17:00, Danilo Krummrich wrote:
> > > > > > On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) 
> > > > > > wrote:
> > > > > > > On 8/30/23 14:49, Danilo Krummrich wrote:
> > > > > > > > Hi Thomas,
> > > > > > > > 
> > > > > > > > thanks for having a look!
> > > > > > > > 
> > > > > > > > On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström 
> > > > > > > > (Intel) wrote:
> > > > > > > > > Hi, Danilo.
> > > > > > > > > 
> > > > > > > > > Some quick comments since I'm doing some Xe work in this 
> > > > > > > > > area. Will probably
> > > > > > > > > get back with more.
> > > > > > > > > 
> > > > > > > > > On 8/20/23 23:53, Danilo Krummrich wrote:
> > > > 
> > > > 
> > > > > > > > > > diff --git a/include/drm/drm_gpuva_mgr.h 
> > > > > > > > > > b/include/drm/drm_gpuva_mgr.h
> > > > > > > > > > index ed8d50200cc3..693e2da3f425 100644
> > > > > > > > > > --- a/include/drm/drm_gpuva_mgr.h
> > > > > > > > > > +++ b/include/drm/drm_gpuva_mgr.h
> > > > > > > > > > @@ -26,12 +26,16 @@
> > > > > > > > > >*/
> > > > > > > > > >   #include 
> > > > > > > > > > +#include 
> > > > > > > > > > +#include 
> > > > > > > > > >   #include 
> > > > > > > > > >   #include 
> > > > > > > > > >   #include 
> > > > > > > > > > +#include 
> > > > > > > > > >   struct drm_gpuva_manager;
> > > > > > > > > > +struct drm_gpuva_gem;
> > > > > > > > > >   struct drm_gpuva_fn_ops;
> > > > > > > > > >   /**
> > > > > > > > > > @@ -140,7 +144,7 @@ struct drm_gpuva {
> > > > > > > > > >   int drm_gpuva_insert(struct drm_gpuva_manager *mgr, 
> > > > > > > > > > struct drm_gpuva *va);
> > > > > > > > > >   void drm_gpuva_remove(struct drm_gpuva *va);
> > > > > > > > > > -void drm_gpuva_link(struct drm_gpuva *va);
> > > > > > > > > > +void drm_gpuva_link(struct drm_gpuva *va, struct 
> > > > > > > > > > drm_gpuva_gem *vm_bo);
> > > > > > > > > >   void drm_gpuva_unlink(struct drm_gpuva *va);
> > > > > > > > > >   struct drm_gpuva *drm_gpuva_find(struct 
> > > > > > > > > > drm_gpuva_manager *mgr,
> > > > > > > > > > @@ -240,15 +244,137 @@ struct drm_gpuva_manager {
> > > > > > > > > >  * @ops: _gpuva_fn_ops providing the 
> > > > > > > > > > split/merge steps to drivers
> > > > > > > > > >  */
> > > > > > > > > > const struct drm_gpuva_fn_ops *ops;
> > > > > > > > > > +
> > > > > > > > > > +   /**
> > > > > > > > > > +* @d_obj: Dummy GEM object; used internally to pass 
> > > > > > > > > > the GPU VMs
> > > > > > > > > > +* dma-resv to _exec.
> > > > > > > > > > +*/
> > > > > > > > > > +   struct drm_gem_object d_obj;
> > > > > > > > > > +
> > > > > > > > > > +   /**
> > > > > > > > > > +* @resv: the _resv for _gem_objects mapped in 
> > > > > > > > > > this GPU VA
> > > > > > > > > > +* space
> > > > > > > > > > +*/
> > > > > > > > > > +   struct dma_resv *resv;
> > > > > > > > > > +
> > > > > > > > > > +   /**
> > > > > > > > > > +* @exec: the _exec helper to lock external 
> > > > > > > > > > _gem_objects
> > > > > > > > > > +*/
> > > > > > > > > > +   struct drm_exec exec;
> > > > > > > > > > +
> > > > > > > > > > +   /**
> > > > > > > > > > +* @mt_ext: _tree storing external 
> > > > > > > > > > _gem_objects
> > > > > > > > > > +*/
> > > > > > > > > > +   struct maple_tree mt_ext;
> > > > > > > > > Why are you using a maple tree here? Insertion and removal is 
> > > > > > > > > O(log(n))
> > > > > > > > > instead of O(1) for a list?
> > > > > > > > > 
> > > > > > > > Having a list of drm_gem_objects directly wouldn't work, as 
> > > > > > > > multiple GPU-VMs
> > > > > > > > could have mappings of the same extobj.
> > > > > > > > 
> > > > > > > > I considered using the VM_BO abstraction (struct drm_gpuva_gem) 
> > > > > > > > as list entry
> > > > > > > > instead, which also seems to be the obvious choice. However, 
> > > > > > > > there is a locking
> > > > > > > > conflict.
> > > > > > > > 
> > > > > > > > A drm_gem_object keeps a list of drm_gpuva_gems, while each 
> > > > > > > > drm_gpuva_gem keeps
> > > > > > > > a list of drm_gpuvas. Both lists are either protected with the 
> > > > > > > > dma-resv lock of
> > > > > > > > the corresponding drm_gem_object, or with an external lock 
> > > > > > > > provided by the
> > > > > > > > driver (see drm_gem_gpuva_set_lock()). The latter is used by 
> > > > > > > > drivers performing
> > > > > > > > changes on the GPUVA space directly from the fence signalling 
> > > > > > > > path.
> > > > > > > > 
> > > > > > > > Now, similar to what 

Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-09-01 Thread Intel



On 8/31/23 21:07, Danilo Krummrich wrote:

On Thu, Aug 31, 2023 at 06:53:01PM +0200, Thomas Hellström (Intel) wrote:

Hi,

On 8/31/23 13:18, Danilo Krummrich wrote:

On Thu, Aug 31, 2023 at 11:04:06AM +0200, Thomas Hellström (Intel) wrote:

Hi!

On 8/30/23 17:00, Danilo Krummrich wrote:

On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) wrote:

On 8/30/23 14:49, Danilo Krummrich wrote:

Hi Thomas,

thanks for having a look!

On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) wrote:

Hi, Danilo.

Some quick comments since I'm doing some Xe work in this area. Will probably
get back with more.

On 8/20/23 23:53, Danilo Krummrich wrote:




diff --git a/include/drm/drm_gpuva_mgr.h b/include/drm/drm_gpuva_mgr.h
index ed8d50200cc3..693e2da3f425 100644
--- a/include/drm/drm_gpuva_mgr.h
+++ b/include/drm/drm_gpuva_mgr.h
@@ -26,12 +26,16 @@
   */
  #include 
+#include 
+#include 
  #include 
  #include 
  #include 
+#include 
  struct drm_gpuva_manager;
+struct drm_gpuva_gem;
  struct drm_gpuva_fn_ops;
  /**
@@ -140,7 +144,7 @@ struct drm_gpuva {
  int drm_gpuva_insert(struct drm_gpuva_manager *mgr, struct drm_gpuva *va);
  void drm_gpuva_remove(struct drm_gpuva *va);
-void drm_gpuva_link(struct drm_gpuva *va);
+void drm_gpuva_link(struct drm_gpuva *va, struct drm_gpuva_gem *vm_bo);
  void drm_gpuva_unlink(struct drm_gpuva *va);
  struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager *mgr,
@@ -240,15 +244,137 @@ struct drm_gpuva_manager {
 * @ops: &drm_gpuva_fn_ops providing the split/merge steps to drivers
 */
const struct drm_gpuva_fn_ops *ops;
+
+   /**
+* @d_obj: Dummy GEM object; used internally to pass the GPU VMs
+* dma-resv to &drm_exec.
+*/
+   struct drm_gem_object d_obj;
+
+   /**
+* @resv: the &dma_resv for &drm_gem_objects mapped in this GPU VA
+* space
+*/
+   struct dma_resv *resv;
+
+   /**
+* @exec: the &drm_exec helper to lock external &drm_gem_objects
+*/
+   struct drm_exec exec;
+
+   /**
+* @mt_ext: &maple_tree storing external &drm_gem_objects
+*/
+   struct maple_tree mt_ext;

Why are you using a maple tree here? Insertion and removal is O(log(n))
instead of O(1) for a list?


Having a list of drm_gem_objects directly wouldn't work, as multiple GPU-VMs
could have mappings of the same extobj.

I considered using the VM_BO abstraction (struct drm_gpuva_gem) as list entry
instead, which also seems to be the obvious choice. However, there is a locking
conflict.

A drm_gem_object keeps a list of drm_gpuva_gems, while each drm_gpuva_gem keeps
a list of drm_gpuvas. Both lists are either protected with the dma-resv lock of
the corresponding drm_gem_object, or with an external lock provided by the
driver (see drm_gem_gpuva_set_lock()). The latter is used by drivers performing
changes on the GPUVA space directly from the fence signalling path.

Now, similar to what drm_gpuva_link() and drm_gpuva_unlink() are doing already,
we'd want to add a drm_gpuva_gem to the extobj list for the first mapping being
linked and we'd want to remove it for the last one being unlinked.

(Actually we'd want to add the drm_gpuva_gem object to the extobj list even
before, because otherwise we'd not acquire its dma-resv lock of this GEM object
through drm_gpuva_manager_lock(). But that's trivial, we could do that when we
create the drm_gpuva_gem, which we need to do anyways.)

Anyway, we'd probably want to keep removing the drm_gpuva_gem from the extobj
list from drm_gpuva_unlink() when the last mapping of this BO is unlinked. In
order to do so, we'd (as discussed above) either need to hold the outer GPU-VM
lock or the GPU-VMs dma-resv lock. Both would be illegal in the case
drm_gpuva_unlink() is called from within the fence signalling path. For drivers
like XE or Nouveau, we'd at least need to make sure to not mess up the locking
hierarchy of GPU-VM lock and dma-resv lock of the corresponding BO.

Considering all that, I thought it's probably better to track extobjs separate
from the drm_gpuva_gem, hence the maple tree choice.

Hm. OK, in Xe we're having a list of the xe_vmas (drm_gpuvas) that point to
external objects, or in the case of multiple mappings to the same gem
object, only one of the drm_gpuvas is in the list. These are protected by
the GPU-VM lock. I don't see a problem with removing those from the fence
signalling path, though?

I intentionally tried to avoid keeping a list of drm_gpuvas to track extobjs,
since this is generic code, I don't know how many mappings of an external object
the corresponding driver potentially creates. This could become a pretty large
list to iterate. Another reason was that I want to keep the drm_gpuva structure
as small as possible, hence avoiding another list_head.

Yes, the list might be pretty large, but OTOH you never iterate to access a
single list element. When you need to iterate the 

Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-08-31 Thread Danilo Krummrich
On Thu, Aug 31, 2023 at 06:53:01PM +0200, Thomas Hellström (Intel) wrote:
> Hi,
> 
> On 8/31/23 13:18, Danilo Krummrich wrote:
> > On Thu, Aug 31, 2023 at 11:04:06AM +0200, Thomas Hellström (Intel) wrote:
> > > Hi!
> > > 
> > > On 8/30/23 17:00, Danilo Krummrich wrote:
> > > > On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) 
> > > > wrote:
> > > > > On 8/30/23 14:49, Danilo Krummrich wrote:
> > > > > > Hi Thomas,
> > > > > > 
> > > > > > thanks for having a look!
> > > > > > 
> > > > > > On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) 
> > > > > > wrote:
> > > > > > > Hi, Danilo.
> > > > > > > 
> > > > > > > Some quick comments since I'm doing some Xe work in this area. 
> > > > > > > Will probably
> > > > > > > get back with more.
> > > > > > > 
> > > > > > > On 8/20/23 23:53, Danilo Krummrich wrote:
> > 
> > 
> > > > > > > > diff --git a/include/drm/drm_gpuva_mgr.h 
> > > > > > > > b/include/drm/drm_gpuva_mgr.h
> > > > > > > > index ed8d50200cc3..693e2da3f425 100644
> > > > > > > > --- a/include/drm/drm_gpuva_mgr.h
> > > > > > > > +++ b/include/drm/drm_gpuva_mgr.h
> > > > > > > > @@ -26,12 +26,16 @@
> > > > > > > >   */
> > > > > > > >  #include 
> > > > > > > > +#include 
> > > > > > > > +#include 
> > > > > > > >  #include 
> > > > > > > >  #include 
> > > > > > > >  #include 
> > > > > > > > +#include 
> > > > > > > >  struct drm_gpuva_manager;
> > > > > > > > +struct drm_gpuva_gem;
> > > > > > > >  struct drm_gpuva_fn_ops;
> > > > > > > >  /**
> > > > > > > > @@ -140,7 +144,7 @@ struct drm_gpuva {
> > > > > > > >  int drm_gpuva_insert(struct drm_gpuva_manager *mgr, struct 
> > > > > > > > drm_gpuva *va);
> > > > > > > >  void drm_gpuva_remove(struct drm_gpuva *va);
> > > > > > > > -void drm_gpuva_link(struct drm_gpuva *va);
> > > > > > > > +void drm_gpuva_link(struct drm_gpuva *va, struct drm_gpuva_gem 
> > > > > > > > *vm_bo);
> > > > > > > >  void drm_gpuva_unlink(struct drm_gpuva *va);
> > > > > > > >  struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager 
> > > > > > > > *mgr,
> > > > > > > > @@ -240,15 +244,137 @@ struct drm_gpuva_manager {
> > > > > > > >  * @ops: _gpuva_fn_ops providing the split/merge 
> > > > > > > > steps to drivers
> > > > > > > >  */
> > > > > > > > const struct drm_gpuva_fn_ops *ops;
> > > > > > > > +
> > > > > > > > +   /**
> > > > > > > > +* @d_obj: Dummy GEM object; used internally to pass 
> > > > > > > > the GPU VMs
> > > > > > > > +* dma-resv to _exec.
> > > > > > > > +*/
> > > > > > > > +   struct drm_gem_object d_obj;
> > > > > > > > +
> > > > > > > > +   /**
> > > > > > > > +* @resv: the _resv for _gem_objects mapped in 
> > > > > > > > this GPU VA
> > > > > > > > +* space
> > > > > > > > +*/
> > > > > > > > +   struct dma_resv *resv;
> > > > > > > > +
> > > > > > > > +   /**
> > > > > > > > +* @exec: the _exec helper to lock external 
> > > > > > > > _gem_objects
> > > > > > > > +*/
> > > > > > > > +   struct drm_exec exec;
> > > > > > > > +
> > > > > > > > +   /**
> > > > > > > > +* @mt_ext: _tree storing external 
> > > > > > > > _gem_objects
> > > > > > > > +*/
> > > > > > > > +   struct maple_tree mt_ext;
> > > > > > > Why are you using a maple tree here? Insertion and removal is 
> > > > > > > O(log(n))
> > > > > > > instead of O(1) for a list?
> > > > > > > 
> > > > > > Having a list of drm_gem_objects directly wouldn't work, as 
> > > > > > multiple GPU-VMs
> > > > > > could have mappings of the same extobj.
> > > > > > 
> > > > > > I considered using the VM_BO abstraction (struct drm_gpuva_gem) as 
> > > > > > list entry
> > > > > > instead, which also seems to be the obvious choice. However, there 
> > > > > > is a locking
> > > > > > conflict.
> > > > > > 
> > > > > > A drm_gem_object keeps a list of drm_gpuva_gems, while each 
> > > > > > drm_gpuva_gem keeps
> > > > > > a list of drm_gpuvas. Both lists are either protected with the 
> > > > > > dma-resv lock of
> > > > > > the corresponding drm_gem_object, or with an external lock provided 
> > > > > > by the
> > > > > > driver (see drm_gem_gpuva_set_lock()). The latter is used by 
> > > > > > drivers performing
> > > > > > changes on the GPUVA space directly from the fence signalling path.
> > > > > > 
> > > > > > Now, similar to what drm_gpuva_link() and drm_gpuva_unlink() are 
> > > > > > doing already,
> > > > > > we'd want to add a drm_gpuva_gem to the extobj list for the first 
> > > > > > mapping being
> > > > > > linked and we'd want to remove it for the last one being unlinked.
> > > > > > 
> > > > > > (Actually we'd want to add the drm_gpuva_gem object to the extobj 
> > > > > > list even
> > > > > > before, because otherwise we'd not acquire it's dma-resv lock of 
> > > > > > this GEM object
> > > > > > through drm_gpuva_manager_lock(). But that's trival, we could do 

Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-08-31 Thread Thomas Hellström



On 8/31/23 18:53, Thomas Hellström (Intel) wrote:

Hi,

On 8/31/23 13:18, Danilo Krummrich wrote:
On Thu, Aug 31, 2023 at 11:04:06AM +0200, Thomas Hellström (Intel) 
wrote:

Hi!

On 8/30/23 17:00, Danilo Krummrich wrote:
On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) 
wrote:

On 8/30/23 14:49, Danilo Krummrich wrote:

Hi Thomas,

thanks for having a look!

On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström 
(Intel) wrote:

Hi, Danilo.

Some quick comments since I'm doing some Xe work in this area. 
Will probably

get back with more.

On 8/20/23 23:53, Danilo Krummrich wrote:



diff --git a/include/drm/drm_gpuva_mgr.h 
b/include/drm/drm_gpuva_mgr.h

index ed8d50200cc3..693e2da3f425 100644
--- a/include/drm/drm_gpuva_mgr.h
+++ b/include/drm/drm_gpuva_mgr.h
@@ -26,12 +26,16 @@
  */
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
+#include 
 struct drm_gpuva_manager;
+struct drm_gpuva_gem;
 struct drm_gpuva_fn_ops;
 /**
@@ -140,7 +144,7 @@ struct drm_gpuva {
 int drm_gpuva_insert(struct drm_gpuva_manager *mgr, struct 
drm_gpuva *va);

 void drm_gpuva_remove(struct drm_gpuva *va);
-void drm_gpuva_link(struct drm_gpuva *va);
+void drm_gpuva_link(struct drm_gpuva *va, struct drm_gpuva_gem 
*vm_bo);

 void drm_gpuva_unlink(struct drm_gpuva *va);
 struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager 
*mgr,

@@ -240,15 +244,137 @@ struct drm_gpuva_manager {
  * @ops: _gpuva_fn_ops providing the split/merge 
steps to drivers

  */
 const struct drm_gpuva_fn_ops *ops;
+
+    /**
+ * @d_obj: Dummy GEM object; used internally to pass the 
GPU VMs

+ * dma-resv to _exec.
+ */
+    struct drm_gem_object d_obj;
+
+    /**
+ * @resv: the _resv for _gem_objects mapped in 
this GPU VA

+ * space
+ */
+    struct dma_resv *resv;
+
+    /**
+ * @exec: the _exec helper to lock external 
_gem_objects

+ */
+    struct drm_exec exec;
+
+    /**
+ * @mt_ext: _tree storing external _gem_objects
+ */
+    struct maple_tree mt_ext;
Why are you using a maple tree here? Insertion and removal is 
O(log(n))

instead of O(1) for a list?

Having a list of drm_gem_objects directly wouldn't work, as 
multiple GPU-VMs

could have mappings of the same extobj.

I considered using the VM_BO abstraction (struct drm_gpuva_gem) 
as list entry
instead, which also seems to be the obvious choice. However, 
there is a locking

conflict.

A drm_gem_object keeps a list of drm_gpuva_gems, while each 
drm_gpuva_gem keeps
a list of drm_gpuvas. Both lists are either protected with the 
dma-resv lock of
the corresponding drm_gem_object, or with an external lock 
provided by the
driver (see drm_gem_gpuva_set_lock()). The latter is used by 
drivers performing

changes on the GPUVA space directly from the fence signalling path.

Now, similar to what drm_gpuva_link() and drm_gpuva_unlink() are 
doing already,
we'd want to add a drm_gpuva_gem to the extobj list for the first 
mapping being

linked and we'd want to remove it for the last one being unlinked.

(Actually we'd want to add the drm_gpuva_gem object to the extobj 
list even
before, because otherwise we'd not acquire it's dma-resv lock of 
this GEM object
through drm_gpuva_manager_lock(). But that's trival, we could do 
that when we

create the drm_gpuva_gem, which we need to do anyways.)

Anyway, we'd probably want to keep removing the drm_gpuva_gem 
from the extobj
list from drm_gpuva_unlink() when the last mapping of this BO is 
unlinked. In
order to do so, we'd (as discussed above) either need to hold the 
outer GPU-VM

lock or the GPU-VMs dma-resv lock. Both would be illegal in the case
drm_gpuva_unlink() is called from within the fence signalling 
path. For drivers
like XE or Nouveau, we'd at least need to make sure to not mess 
up the locking

hierarchy of GPU-VM lock and dma-resv lock of the corresponding BO.

Considering all that, I thought it's probably better to track 
extobjs separate

from the drm_gpuva_gem, hence the maple tree choice.
Hm. OK, in Xe we're having a list of the xe_vmas (drm_gpuvas) that 
point to

external objects, or in the case of multiple mappings to the same gem
object, only one of the drm_gpuvas is in the list. These are 
protected by
the GPU-VM lock. I don't see a problem with removing those from 
the fence

signalling path, though?
I intentionally tried to avoid keeping a list of drm_gpuvas to 
track extobjs,
since this is generic code I don't know how much mappings of an 
external object
the corresponding driver potentially creates. This could become a 
pretty large
list to iterate. Another reason was, that I want to keep the 
drm_gpuva structure

as small as possible, hence avoiding another list_head.
Yes, the list might be pretty large, but OTOH you never iterate to 
access a
single list element. When you need to iterate the whole list you 
need to do
that regardless of the data structure used. As for 

Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-08-31 Thread Intel

Hi,

On 8/31/23 13:18, Danilo Krummrich wrote:

On Thu, Aug 31, 2023 at 11:04:06AM +0200, Thomas Hellström (Intel) wrote:

Hi!

On 8/30/23 17:00, Danilo Krummrich wrote:

On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) wrote:

On 8/30/23 14:49, Danilo Krummrich wrote:

Hi Thomas,

thanks for having a look!

On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) wrote:

Hi, Danilo.

Some quick comments since I'm doing some Xe work in this area. Will probably
get back with more.

On 8/20/23 23:53, Danilo Krummrich wrote:




diff --git a/include/drm/drm_gpuva_mgr.h b/include/drm/drm_gpuva_mgr.h
index ed8d50200cc3..693e2da3f425 100644
--- a/include/drm/drm_gpuva_mgr.h
+++ b/include/drm/drm_gpuva_mgr.h
@@ -26,12 +26,16 @@
  */
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
+#include 
 struct drm_gpuva_manager;
+struct drm_gpuva_gem;
 struct drm_gpuva_fn_ops;
 /**
@@ -140,7 +144,7 @@ struct drm_gpuva {
 int drm_gpuva_insert(struct drm_gpuva_manager *mgr, struct drm_gpuva *va);
 void drm_gpuva_remove(struct drm_gpuva *va);
-void drm_gpuva_link(struct drm_gpuva *va);
+void drm_gpuva_link(struct drm_gpuva *va, struct drm_gpuva_gem *vm_bo);
 void drm_gpuva_unlink(struct drm_gpuva *va);
 struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager *mgr,
@@ -240,15 +244,137 @@ struct drm_gpuva_manager {
 * @ops: &drm_gpuva_fn_ops providing the split/merge steps to drivers
 */
const struct drm_gpuva_fn_ops *ops;
+
+   /**
+* @d_obj: Dummy GEM object; used internally to pass the GPU VMs
+* dma-resv to &drm_exec.
+*/
+   struct drm_gem_object d_obj;
+
+   /**
+* @resv: the &dma_resv for &drm_gem_objects mapped in this GPU VA
+* space
+*/
+   struct dma_resv *resv;
+
+   /**
+* @exec: the &drm_exec helper to lock external &drm_gem_objects
+*/
+   struct drm_exec exec;
+
+   /**
+* @mt_ext: &maple_tree storing external &drm_gem_objects
+*/
+   struct maple_tree mt_ext;

Why are you using a maple tree here? Insertion and removal is O(log(n))
instead of O(1) for a list?


Having a list of drm_gem_objects directly wouldn't work, as multiple GPU-VMs
could have mappings of the same extobj.

I considered using the VM_BO abstraction (struct drm_gpuva_gem) as list entry
instead, which also seems to be the obvious choice. However, there is a locking
conflict.

A drm_gem_object keeps a list of drm_gpuva_gems, while each drm_gpuva_gem keeps
a list of drm_gpuvas. Both lists are either protected with the dma-resv lock of
the corresponding drm_gem_object, or with an external lock provided by the
driver (see drm_gem_gpuva_set_lock()). The latter is used by drivers performing
changes on the GPUVA space directly from the fence signalling path.

Now, similar to what drm_gpuva_link() and drm_gpuva_unlink() are doing already,
we'd want to add a drm_gpuva_gem to the extobj list for the first mapping being
linked and we'd want to remove it for the last one being unlinked.

(Actually we'd want to add the drm_gpuva_gem object to the extobj list even
before, because otherwise we'd not acquire its dma-resv lock of this GEM object
through drm_gpuva_manager_lock(). But that's trivial, we could do that when we
create the drm_gpuva_gem, which we need to do anyways.)

Anyway, we'd probably want to keep removing the drm_gpuva_gem from the extobj
list from drm_gpuva_unlink() when the last mapping of this BO is unlinked. In
order to do so, we'd (as discussed above) either need to hold the outer GPU-VM
lock or the GPU-VMs dma-resv lock. Both would be illegal in the case
drm_gpuva_unlink() is called from within the fence signalling path. For drivers
like XE or Nouveau, we'd at least need to make sure to not mess up the locking
hierarchy of GPU-VM lock and dma-resv lock of the corresponding BO.

Considering all that, I thought it's probably better to track extobjs separate
from the drm_gpuva_gem, hence the maple tree choice.

Hm. OK, in Xe we're having a list of the xe_vmas (drm_gpuvas) that point to
external objects, or in the case of multiple mappings to the same gem
object, only one of the drm_gpuvas is in the list. These are protected by
the GPU-VM lock. I don't see a problem with removing those from the fence
signalling path, though?

I intentionally tried to avoid keeping a list of drm_gpuvas to track extobjs,
since this is generic code, I don't know how many mappings of an external object
the corresponding driver potentially creates. This could become a pretty large
list to iterate. Another reason was that I want to keep the drm_gpuva structure
as small as possible, hence avoiding another list_head.

Yes, the list might be pretty large, but OTOH you never iterate to access a
single list element. When you need to iterate the whole list you need to do
that regardless of the data structure used. As for the list head, it might
perhaps be aliased (union) with 
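
For comparison, here is a hedged sketch of the list-based alternative being
discussed (Xe keeps drm_gpuvas on such a list; the vm_bo structure and helper
names below are purely illustrative): one entry per (GPU-VM, BO) pair on the
VM's extobj list, inserted and removed in O(1) under the outer GPU-VM lock,
which is exactly the lock that is problematic to take from the fence
signalling path.

#include <linux/list.h>
#include <drm/drm_gem.h>

/* Hypothetical per-(VM, BO) entry; one node on the VM's extobj list. */
struct vm_bo {
	struct drm_gem_object *obj;
	struct list_head extobj_link;	/* protected by the GPU-VM lock */
};

/* First mapping of an external BO in this VM; GPU-VM lock must be held. */
static void vm_extobj_add(struct list_head *vm_extobj_list, struct vm_bo *vm_bo)
{
	list_add_tail(&vm_bo->extobj_link, vm_extobj_list);
}

/* Last mapping of the BO removed; GPU-VM lock must be held here as well. */
static void vm_extobj_del(struct vm_bo *vm_bo)
{
	list_del_init(&vm_bo->extobj_link);
}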

Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-08-31 Thread Danilo Krummrich
On Thu, Aug 31, 2023 at 11:04:06AM +0200, Thomas Hellström (Intel) wrote:
> Hi!
> 
> On 8/30/23 17:00, Danilo Krummrich wrote:
> > On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) wrote:
> > > On 8/30/23 14:49, Danilo Krummrich wrote:
> > > > Hi Thomas,
> > > > 
> > > > thanks for having a look!
> > > > 
> > > > On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) 
> > > > wrote:
> > > > > Hi, Danilo.
> > > > > 
> > > > > Some quick comments since I'm doing some Xe work in this area. Will 
> > > > > probably
> > > > > get back with more.
> > > > > 
> > > > > On 8/20/23 23:53, Danilo Krummrich wrote:



> > > > > > diff --git a/include/drm/drm_gpuva_mgr.h 
> > > > > > b/include/drm/drm_gpuva_mgr.h
> > > > > > index ed8d50200cc3..693e2da3f425 100644
> > > > > > --- a/include/drm/drm_gpuva_mgr.h
> > > > > > +++ b/include/drm/drm_gpuva_mgr.h
> > > > > > @@ -26,12 +26,16 @@
> > > > > >  */
> > > > > > #include 
> > > > > > +#include 
> > > > > > +#include 
> > > > > > #include 
> > > > > > #include 
> > > > > > #include 
> > > > > > +#include 
> > > > > > struct drm_gpuva_manager;
> > > > > > +struct drm_gpuva_gem;
> > > > > > struct drm_gpuva_fn_ops;
> > > > > > /**
> > > > > > @@ -140,7 +144,7 @@ struct drm_gpuva {
> > > > > > int drm_gpuva_insert(struct drm_gpuva_manager *mgr, struct 
> > > > > > drm_gpuva *va);
> > > > > > void drm_gpuva_remove(struct drm_gpuva *va);
> > > > > > -void drm_gpuva_link(struct drm_gpuva *va);
> > > > > > +void drm_gpuva_link(struct drm_gpuva *va, struct drm_gpuva_gem 
> > > > > > *vm_bo);
> > > > > > void drm_gpuva_unlink(struct drm_gpuva *va);
> > > > > > struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager *mgr,
> > > > > > @@ -240,15 +244,137 @@ struct drm_gpuva_manager {
> > > > > >  * @ops: _gpuva_fn_ops providing the split/merge 
> > > > > > steps to drivers
> > > > > >  */
> > > > > > const struct drm_gpuva_fn_ops *ops;
> > > > > > +
> > > > > > +   /**
> > > > > > +* @d_obj: Dummy GEM object; used internally to pass the GPU VMs
> > > > > > +* dma-resv to _exec.
> > > > > > +*/
> > > > > > +   struct drm_gem_object d_obj;
> > > > > > +
> > > > > > +   /**
> > > > > > +* @resv: the _resv for _gem_objects mapped in this GPU 
> > > > > > VA
> > > > > > +* space
> > > > > > +*/
> > > > > > +   struct dma_resv *resv;
> > > > > > +
> > > > > > +   /**
> > > > > > +* @exec: the _exec helper to lock external _gem_objects
> > > > > > +*/
> > > > > > +   struct drm_exec exec;
> > > > > > +
> > > > > > +   /**
> > > > > > +* @mt_ext: _tree storing external _gem_objects
> > > > > > +*/
> > > > > > +   struct maple_tree mt_ext;
> > > > > Why are you using a maple tree here? Insertion and removal is 
> > > > > O(log(n))
> > > > > instead of O(1) for a list?
> > > > > 
> > > > Having a list of drm_gem_objects directly wouldn't work, as multiple 
> > > > GPU-VMs
> > > > could have mappings of the same extobj.
> > > > 
> > > > I considered using the VM_BO abstraction (struct drm_gpuva_gem) as list 
> > > > entry
> > > > instead, which also seems to be the obvious choice. However, there is a 
> > > > locking
> > > > conflict.
> > > > 
> > > > A drm_gem_object keeps a list of drm_gpuva_gems, while each 
> > > > drm_gpuva_gem keeps
> > > > a list of drm_gpuvas. Both lists are either protected with the dma-resv 
> > > > lock of
> > > > the corresponding drm_gem_object, or with an external lock provided by 
> > > > the
> > > > driver (see drm_gem_gpuva_set_lock()). The latter is used by drivers 
> > > > performing
> > > > changes on the GPUVA space directly from the fence signalling path.
> > > > 
> > > > Now, similar to what drm_gpuva_link() and drm_gpuva_unlink() are doing 
> > > > already,
> > > > we'd want to add a drm_gpuva_gem to the extobj list for the first 
> > > > mapping being
> > > > linked and we'd want to remove it for the last one being unlinked.
> > > > 
> > > > (Actually we'd want to add the drm_gpuva_gem object to the extobj list 
> > > > even
> > > > before, because otherwise we'd not acquire it's dma-resv lock of this 
> > > > GEM object
> > > > through drm_gpuva_manager_lock(). But that's trival, we could do that 
> > > > when we
> > > > create the drm_gpuva_gem, which we need to do anyways.)
> > > > 
> > > > Anyway, we'd probably want to keep removing the drm_gpuva_gem from the 
> > > > extobj
> > > > list from drm_gpuva_unlink() when the last mapping of this BO is 
> > > > unlinked. In
> > > > order to do so, we'd (as discussed above) either need to hold the outer 
> > > > GPU-VM
> > > > lock or the GPU-VMs dma-resv lock. Both would be illegal in the case
> > > > drm_gpuva_unlink() is called from within the fence signalling path. For 
> > > > drivers
> > > > like XE or Nouveau, we'd at least need to make sure to not mess up the 
> > > > locking
> > > > hierarchy of GPU-VM lock and dma-resv lock of the 

Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-08-31 Thread Intel

Hi!

On 8/30/23 17:00, Danilo Krummrich wrote:

On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) wrote:

On 8/30/23 14:49, Danilo Krummrich wrote:

Hi Thomas,

thanks for having a look!

On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) wrote:

Hi, Danilo.

Some quick comments since I'm doing some Xe work in this area. Will probably
get back with more.

On 8/20/23 23:53, Danilo Krummrich wrote:

So far the DRM GPUVA manager offers common infrastructure to track GPU VA
allocations and mappings, generically connect GPU VA mappings to their
backing buffers and perform more complex mapping operations on the GPU VA
space.

However, there are more design patterns commonly used by drivers, which
can potentially be generalized in order to make the DRM GPUVA manager
represent a basic GPU-VM implementation. In this context, this patch aims
at generalizing the following elements.

1) Provide a common dma-resv for GEM objects not being used outside of
  this GPU-VM.

2) Provide tracking of external GEM objects (GEM objects which are
  shared with other GPU-VMs).

3) Provide functions to efficiently lock all GEM objects dma-resv the
  GPU-VM contains mappings of.

4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
  of, such that validation of evicted GEM objects is accelerated.

5) Provide some convenience functions for common patterns.

Rather than being designed as a "framework", the target is to make all
features appear as a collection of optional helper functions, such that
drivers are free to make use of the DRM GPUVA manager's basic
functionality and opt-in for other features without setting any feature
flags, just by making use of the corresponding functions.

Signed-off-by: Danilo Krummrich 
---
drivers/gpu/drm/drm_gpuva_mgr.c | 688 +++-
include/drm/drm_gem.h   |  48 ++-
include/drm/drm_gpuva_mgr.h | 302 +-
3 files changed, 1010 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c
index f86bfad74ff8..69872b205961 100644
--- a/drivers/gpu/drm/drm_gpuva_mgr.c
+++ b/drivers/gpu/drm/drm_gpuva_mgr.c
@@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
/**
 * drm_gpuva_manager_init() - initialize a &drm_gpuva_manager
 * @mgr: pointer to the &drm_gpuva_manager to initialize
+ * @drm: the driver's &drm_device
 * @name: the name of the GPU VA space
 * @start_offset: the start offset of the GPU VA space
 * @range: the size of the GPU VA space
@@ -669,6 +670,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
 */
void
drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
+  struct drm_device *drm,
   const char *name,
   u64 start_offset, u64 range,
   u64 reserve_offset, u64 reserve_range,
@@ -677,6 +679,11 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
mgr->rb.tree = RB_ROOT_CACHED;
INIT_LIST_HEAD(&mgr->rb.list);
+   mt_init(&mgr->mt_ext);
+
+   INIT_LIST_HEAD(&mgr->evict.list);
+   spin_lock_init(&mgr->evict.lock);
+
drm_gpuva_check_overflow(start_offset, range);
mgr->mm_start = start_offset;
mgr->mm_range = range;
@@ -694,6 +701,9 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
 reserve_range)))
__drm_gpuva_insert(mgr, &mgr->kernel_alloc_node);
}
+
+   drm_gem_private_object_init(drm, &mgr->d_obj, 0);
+   mgr->resv = mgr->d_obj.resv;
}
EXPORT_SYMBOL_GPL(drm_gpuva_manager_init);
@@ -713,10 +723,575 @@ drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr)
__drm_gpuva_remove(&mgr->kernel_alloc_node);
WARN(!RB_EMPTY_ROOT(&mgr->rb.tree.rb_root),
-"GPUVA tree is not empty, potentially leaking memory.");
+"GPUVA tree is not empty, potentially leaking memory.\n");
+
+   mtree_destroy(&mgr->mt_ext);
+   WARN(!list_empty(&mgr->evict.list), "Evict list should be empty.\n");
+
+   drm_gem_private_object_fini(&mgr->d_obj);
}
EXPORT_SYMBOL_GPL(drm_gpuva_manager_destroy);
+/**
+ * drm_gpuva_manager_prepare_objects() - prepare all associated BOs
+ * @mgr: the &drm_gpuva_manager
+ * @num_fences: the number of &dma_fences to reserve
+ *
+ * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given
+ * &drm_gpuva_manager contains mappings of.
+ *
+ * Drivers can obtain the corresponding &drm_exec instance through
+ * DRM_GPUVA_EXEC(). It is the driver's responsibility to call drm_exec_init()
+ * and drm_exec_fini() accordingly.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr,
+ unsigned int num_fences)
+{
+   struct drm_exec *exec = DRM_GPUVA_EXEC(mgr);
+   MA_STATE(mas, &mgr->mt_ext, 0, 0);
+   union {
+  
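
Based on the kernel-doc for drm_gpuva_manager_prepare_objects() above, the
intended driver-side usage would look roughly like the hedged sketch below;
DRM_GPUVA_EXEC() and drm_gpuva_manager_prepare_objects() are names from this
patch proposal, and the drm_exec calls use the helper API as it existed at
the time of this thread, so details may differ.

#include <drm/drm_exec.h>
#include <drm/drm_gpuva_mgr.h>	/* as modified by this series */

static int driver_lock_vm(struct drm_gpuva_manager *mgr,
			  unsigned int num_fences)
{
	struct drm_exec *exec = DRM_GPUVA_EXEC(mgr);
	int ret;

	drm_exec_init(exec, DRM_EXEC_INTERRUPTIBLE_WAIT);
	drm_exec_until_all_locked(exec) {
		/* Locks the dma-resv of all BOs the VM has mappings of. */
		ret = drm_gpuva_manager_prepare_objects(mgr, num_fences);
		drm_exec_retry_on_contention(exec);
		if (ret) {
			drm_exec_fini(exec);
			return ret;
		}
	}

	/* The driver calls drm_exec_fini() later, once its fences are added. */
	return 0;
}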

Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-08-30 Thread Danilo Krummrich
On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) wrote:
> 
> On 8/30/23 14:49, Danilo Krummrich wrote:
> > Hi Thomas,
> > 
> > thanks for having a look!
> > 
> > On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) wrote:
> > > Hi, Danilo.
> > > 
> > > Some quick comments since I'm doing some Xe work in this area. Will 
> > > probably
> > > get back with more.
> > > 
> > > On 8/20/23 23:53, Danilo Krummrich wrote:
> > > > So far the DRM GPUVA manager offers common infrastructure to track GPU 
> > > > VA
> > > > allocations and mappings, generically connect GPU VA mappings to their
> > > > backing buffers and perform more complex mapping operations on the GPU 
> > > > VA
> > > > space.
> > > > 
> > > > However, there are more design patterns commonly used by drivers, which
> > > > can potentially be generalized in order to make the DRM GPUVA manager
> > > > represent a basic GPU-VM implementation. In this context, this patch 
> > > > aims
> > > > at generalizing the following elements.
> > > > 
> > > > 1) Provide a common dma-resv for GEM objects not being used outside of
> > > >  this GPU-VM.
> > > > 
> > > > 2) Provide tracking of external GEM objects (GEM objects which are
> > > >  shared with other GPU-VMs).
> > > > 
> > > > 3) Provide functions to efficiently lock all GEM objects dma-resv the
> > > >  GPU-VM contains mappings of.
> > > > 
> > > > 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
> > > >  of, such that validation of evicted GEM objects is accelerated.
> > > > 
> > > > > 5) Provide some convenience functions for common patterns.
> > > > 
> > > > Rather than being designed as a "framework", the target is to make all
> > > > features appear as a collection of optional helper functions, such that
> > > > > drivers are free to make use of the DRM GPUVA manager's basic
> > > > functionality and opt-in for other features without setting any feature
> > > > flags, just by making use of the corresponding functions.
> > > > 
> > > > Signed-off-by: Danilo Krummrich 
> > > > ---
> > > >drivers/gpu/drm/drm_gpuva_mgr.c | 688 
> > > > +++-
> > > >include/drm/drm_gem.h   |  48 ++-
> > > >include/drm/drm_gpuva_mgr.h | 302 +-
> > > >3 files changed, 1010 insertions(+), 28 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c 
> > > > b/drivers/gpu/drm/drm_gpuva_mgr.c
> > > > index f86bfad74ff8..69872b205961 100644
> > > > --- a/drivers/gpu/drm/drm_gpuva_mgr.c
> > > > +++ b/drivers/gpu/drm/drm_gpuva_mgr.c
> > > > @@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
> > > >/**
> > > > * drm_gpuva_manager_init() - initialize a _gpuva_manager
> > > > * @mgr: pointer to the _gpuva_manager to initialize
> > > > + * @drm: the drivers _device
> > > > * @name: the name of the GPU VA space
> > > > * @start_offset: the start offset of the GPU VA space
> > > > * @range: the size of the GPU VA space
> > > > @@ -669,6 +670,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
> > > > */
> > > >void
> > > >drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
> > > > +  struct drm_device *drm,
> > > >const char *name,
> > > >u64 start_offset, u64 range,
> > > >u64 reserve_offset, u64 reserve_range,
> > > > @@ -677,6 +679,11 @@ drm_gpuva_manager_init(struct drm_gpuva_manager 
> > > > *mgr,
> > > > mgr->rb.tree = RB_ROOT_CACHED;
> > > > INIT_LIST_HEAD(>rb.list);
> > > > +   mt_init(>mt_ext);
> > > > +
> > > > +   INIT_LIST_HEAD(>evict.list);
> > > > +   spin_lock_init(>evict.lock);
> > > > +
> > > > drm_gpuva_check_overflow(start_offset, range);
> > > > mgr->mm_start = start_offset;
> > > > mgr->mm_range = range;
> > > > @@ -694,6 +701,9 @@ drm_gpuva_manager_init(struct drm_gpuva_manager 
> > > > *mgr,
> > > >  reserve_range)))
> > > > __drm_gpuva_insert(mgr, 
> > > > >kernel_alloc_node);
> > > > }
> > > > +
> > > > +   drm_gem_private_object_init(drm, >d_obj, 0);
> > > > +   mgr->resv = mgr->d_obj.resv;
> > > >}
> > > >EXPORT_SYMBOL_GPL(drm_gpuva_manager_init);
> > > > @@ -713,10 +723,575 @@ drm_gpuva_manager_destroy(struct 
> > > > drm_gpuva_manager *mgr)
> > > > __drm_gpuva_remove(&mgr->kernel_alloc_node);
> > > > WARN(!RB_EMPTY_ROOT(&mgr->rb.tree.rb_root),
> > > > -"GPUVA tree is not empty, potentially leaking memory.");
> > > > +"GPUVA tree is not empty, potentially leaking memory.\n");
> > > > +
> > > > +   mtree_destroy(&mgr->mt_ext);
> > > > +   WARN(!list_empty(&mgr->evict.list), "Evict list should be empty.\n");
> > > > +
> > > > +   drm_gem_private_object_fini(&mgr->d_obj);
> > > >}
> > > >

Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-08-30 Thread Intel



On 8/30/23 14:49, Danilo Krummrich wrote:

Hi Thomas,

thanks for having a look!

On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) wrote:

Hi, Danilo.

Some quick comments since I'm doing some Xe work in this area. Will probably
get back with more.

On 8/20/23 23:53, Danilo Krummrich wrote:

So far the DRM GPUVA manager offers common infrastructure to track GPU VA
allocations and mappings, generically connect GPU VA mappings to their
backing buffers and perform more complex mapping operations on the GPU VA
space.

However, there are more design patterns commonly used by drivers, which
can potentially be generalized in order to make the DRM GPUVA manager
represent a basic GPU-VM implementation. In this context, this patch aims
at generalizing the following elements.

1) Provide a common dma-resv for GEM objects not being used outside of
 this GPU-VM.

2) Provide tracking of external GEM objects (GEM objects which are
 shared with other GPU-VMs).

3) Provide functions to efficiently lock all GEM objects dma-resv the
 GPU-VM contains mappings of.

4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
 of, such that validation of evicted GEM objects is accelerated.

5) Provide some convenience functions for common patterns.

Rather than being designed as a "framework", the target is to make all
features appear as a collection of optional helper functions, such that
drivers are free to make use of the DRM GPUVA manager's basic
functionality and opt-in for other features without setting any feature
flags, just by making use of the corresponding functions.

Signed-off-by: Danilo Krummrich 
---
   drivers/gpu/drm/drm_gpuva_mgr.c | 688 +++-
   include/drm/drm_gem.h   |  48 ++-
   include/drm/drm_gpuva_mgr.h | 302 +-
   3 files changed, 1010 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c
index f86bfad74ff8..69872b205961 100644
--- a/drivers/gpu/drm/drm_gpuva_mgr.c
+++ b/drivers/gpu/drm/drm_gpuva_mgr.c
@@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
   /**
* drm_gpuva_manager_init() - initialize a &drm_gpuva_manager
* @mgr: pointer to the &drm_gpuva_manager to initialize
+ * @drm: the driver's &drm_device
* @name: the name of the GPU VA space
* @start_offset: the start offset of the GPU VA space
* @range: the size of the GPU VA space
@@ -669,6 +670,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
*/
   void
   drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
+  struct drm_device *drm,
   const char *name,
   u64 start_offset, u64 range,
   u64 reserve_offset, u64 reserve_range,
@@ -677,6 +679,11 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
mgr->rb.tree = RB_ROOT_CACHED;
INIT_LIST_HEAD(&mgr->rb.list);
+   mt_init(&mgr->mt_ext);
+
+   INIT_LIST_HEAD(&mgr->evict.list);
+   spin_lock_init(&mgr->evict.lock);
+
drm_gpuva_check_overflow(start_offset, range);
mgr->mm_start = start_offset;
mgr->mm_range = range;
@@ -694,6 +701,9 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
 reserve_range)))
__drm_gpuva_insert(mgr, &mgr->kernel_alloc_node);
}
+
+   drm_gem_private_object_init(drm, &mgr->d_obj, 0);
+   mgr->resv = mgr->d_obj.resv;
   }
   EXPORT_SYMBOL_GPL(drm_gpuva_manager_init);
@@ -713,10 +723,575 @@ drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr)
__drm_gpuva_remove(&mgr->kernel_alloc_node);
WARN(!RB_EMPTY_ROOT(&mgr->rb.tree.rb_root),
-"GPUVA tree is not empty, potentially leaking memory.");
+"GPUVA tree is not empty, potentially leaking memory.\n");
+
+   mtree_destroy(&mgr->mt_ext);
+   WARN(!list_empty(&mgr->evict.list), "Evict list should be empty.\n");
+
+   drm_gem_private_object_fini(&mgr->d_obj);
   }
   EXPORT_SYMBOL_GPL(drm_gpuva_manager_destroy);
+/**
+ * drm_gpuva_manager_prepare_objects() - prepare all associated BOs
+ * @mgr: the &drm_gpuva_manager
+ * @num_fences: the number of &dma_fences to reserve
+ *
+ * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given
+ * &drm_gpuva_manager contains mappings of.
+ *
+ * Drivers can obtain the corresponding &drm_exec instance through
+ * DRM_GPUVA_EXEC(). It is the driver's responsibility to call drm_exec_init()
+ * and drm_exec_fini() accordingly.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr,
+ unsigned int num_fences)
+{
+   struct drm_exec *exec = DRM_GPUVA_EXEC(mgr);
+   MA_STATE(mas, &mgr->mt_ext, 0, 0);
+   union {
+   void *ptr;
+   uintptr_t cnt;
+   } ref;
+   int ret;
+
+   ret = drm_exec_prepare_obj(exec, &mgr->d_obj, num_fences);
+  

Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-08-30 Thread Danilo Krummrich
On Wed, Aug 30, 2023 at 09:48:02AM +0200, Christian König wrote:
> 
> 
> Am 20.08.23 um 23:53 schrieb Danilo Krummrich:
> > So far the DRM GPUVA manager offers common infrastructure to track GPU VA
> > allocations and mappings, generically connect GPU VA mappings to their
> > backing buffers and perform more complex mapping operations on the GPU VA
> > space.
> > 
> > However, there are more design patterns commonly used by drivers, which
> > can potentially be generalized in order to make the DRM GPUVA manager
> > represent a basic GPU-VM implementation. In this context, this patch aims
> > at generalizing the following elements.
> > 
> > 1) Provide a common dma-resv for GEM objects not being used outside of
> > this GPU-VM.
> > 
> > 2) Provide tracking of external GEM objects (GEM objects which are
> > shared with other GPU-VMs).
> > 
> > 3) Provide functions to efficiently lock all GEM objects dma-resv the
> > GPU-VM contains mappings of.
> > 
> > 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
> > of, such that validation of evicted GEM objects is accelerated.
> > 
> > 5) Provide some convenience functions for common patterns.
> 
> Interesting work.
> 
> You basically implement a bunch of the ideas I came up with to improve the
> amdgpu performance in the common manager now. That was one of the remaining
> blockers I had for using this in amdgpu.
> 
> The question is, for example, how do you track evictions? E.g. we don't have a
> common concept of eviction in GEM as far as I know. Or is the driver
> responsible for giving those notifications to the GPUVA manager?

Right, it is the driver's responsibility to add a drm_gpuva_gem (or VM_BO)
to the manager's evict list.

The idea was that drivers have control over the state of a drm_gpuva_gem, such
that a driver can move it to driver-specific lists as well, like all the ones
you have in amdgpu.
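
For illustration, a rough sketch of what such a driver-side notification could
look like. The helper and the drm_gpuva_gem member used here (evict_entry) are
made-up names for this example; only mgr->evict.list and mgr->evict.lock are
taken from the patch.

	/* Hypothetical driver hook, called once the driver notices that a BO
	 * backing this GPU-VM has been evicted. */
	static void driver_vm_bo_set_evicted(struct drm_gpuva_manager *mgr,
					     struct drm_gpuva_gem *vm_bo)
	{
		spin_lock(&mgr->evict.lock);
		/* evict_entry is an assumed list_head member of drm_gpuva_gem. */
		if (list_empty(&vm_bo->evict_entry))
			list_add_tail(&vm_bo->evict_entry, &mgr->evict.list);
		spin_unlock(&mgr->evict.lock);
	}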

> 
> And would it be possible to lock only a specific area of the VM, e.g. every
> BO mapped in the interval X..Y?

Currently, the drm_gpuva_manager_lock() functions always lock the GPU-VM's
dma-resv lock, plus all the dma-resv locks of the external objects the manager
keeps track of.

But surely, we could also add something like drm_gpuva_manager_lock_range()
where we just iterate all drm_gpuvas between X and Y and lock the dma-resv
locks of each drm_gpuva's backing BO.
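
For illustration, a rough sketch of how such a range variant could look. The
iterator name (drm_gpuva_for_each_va_range()) and the va->gem.obj member are
assumptions for this example; only DRM_GPUVA_EXEC(), drm_exec_prepare_obj() and
mgr->d_obj come from the patch as posted.

	/* Hypothetical: prepare only the BOs mapped in [start, end). */
	static int drm_gpuva_manager_prepare_range(struct drm_gpuva_manager *mgr,
						   u64 start, u64 end,
						   unsigned int num_fences)
	{
		struct drm_exec *exec = DRM_GPUVA_EXEC(mgr);
		struct drm_gpuva *va;
		int ret;

		/* Lock the GPU-VM's common dma-resv first, like the existing
		 * helper does. */
		ret = drm_exec_prepare_obj(exec, &mgr->d_obj, num_fences);
		if (ret)
			return ret;

		/* Then only the backing BOs of mappings in the given range. */
		drm_gpuva_for_each_va_range(va, mgr, start, end) {
			ret = drm_exec_prepare_obj(exec, va->gem.obj, num_fences);
			if (ret)
				return ret;
		}

		return 0;
	}

As with the existing helpers, the driver would drive this from its own
drm_exec_init()/drm_exec_fini() loop and retry on contention.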

> 
> Regards,
> Christian.
> 
> > 
> > Rather than being designed as a "framework", the target is to make all
> > features appear as a collection of optional helper functions, such that
> > drivers are free to make use of the DRM GPUVA manager's basic
> > functionality and opt-in for other features without setting any feature
> > flags, just by making use of the corresponding functions.
> > 
> > Signed-off-by: Danilo Krummrich 
> > ---
> >   drivers/gpu/drm/drm_gpuva_mgr.c | 688 +++-
> >   include/drm/drm_gem.h   |  48 ++-
> >   include/drm/drm_gpuva_mgr.h | 302 +-
> >   3 files changed, 1010 insertions(+), 28 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c 
> > b/drivers/gpu/drm/drm_gpuva_mgr.c
> > index f86bfad74ff8..69872b205961 100644
> > --- a/drivers/gpu/drm/drm_gpuva_mgr.c
> > +++ b/drivers/gpu/drm/drm_gpuva_mgr.c
> > @@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
> >   /**
> >* drm_gpuva_manager_init() - initialize a &drm_gpuva_manager
> >* @mgr: pointer to the &drm_gpuva_manager to initialize
> > + * @drm: the driver's &drm_device
> >* @name: the name of the GPU VA space
> >* @start_offset: the start offset of the GPU VA space
> >* @range: the size of the GPU VA space
> > @@ -669,6 +670,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
> >*/
> >   void
> >   drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
> > +  struct drm_device *drm,
> >const char *name,
> >u64 start_offset, u64 range,
> >u64 reserve_offset, u64 reserve_range,
> > @@ -677,6 +679,11 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
> > mgr->rb.tree = RB_ROOT_CACHED;
> > INIT_LIST_HEAD(&mgr->rb.list);
> > +   mt_init(&mgr->mt_ext);
> > +
> > +   INIT_LIST_HEAD(&mgr->evict.list);
> > +   spin_lock_init(&mgr->evict.lock);
> > +
> > drm_gpuva_check_overflow(start_offset, range);
> > mgr->mm_start = start_offset;
> > mgr->mm_range = range;
> > @@ -694,6 +701,9 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
> >  reserve_range)))
> > __drm_gpuva_insert(mgr, &mgr->kernel_alloc_node);
> > }
> > +
> > +   drm_gem_private_object_init(drm, &mgr->d_obj, 0);
> > +   mgr->resv = mgr->d_obj.resv;
> >   }
> >   EXPORT_SYMBOL_GPL(drm_gpuva_manager_init);
> > @@ -713,10 +723,575 @@ drm_gpuva_manager_destroy(struct drm_gpuva_manager 
> > *mgr)
> > __drm_gpuva_remove(&mgr->kernel_alloc_node);
> > 

Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-08-30 Thread Danilo Krummrich
Hi Thomas,

thanks for having a look!

On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) wrote:
> Hi, Danilo.
> 
> Some quick comments since I'm doing some Xe work in this area. Will probably
> get back with more.
> 
> On 8/20/23 23:53, Danilo Krummrich wrote:
> > So far the DRM GPUVA manager offers common infrastructure to track GPU VA
> > allocations and mappings, generically connect GPU VA mappings to their
> > backing buffers and perform more complex mapping operations on the GPU VA
> > space.
> > 
> > However, there are more design patterns commonly used by drivers, which
> > can potentially be generalized in order to make the DRM GPUVA manager
> > represent a basic GPU-VM implementation. In this context, this patch aims
> > at generalizing the following elements.
> > 
> > 1) Provide a common dma-resv for GEM objects not being used outside of
> > this GPU-VM.
> > 
> > 2) Provide tracking of external GEM objects (GEM objects which are
> > shared with other GPU-VMs).
> > 
> > 3) Provide functions to efficiently lock all GEM objects dma-resv the
> > GPU-VM contains mappings of.
> > 
> > 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
> > of, such that validation of evicted GEM objects is accelerated.
> > 
> > 5) Provide some convenience functions for common patterns.
> > 
> > Rather than being designed as a "framework", the target is to make all
> > features appear as a collection of optional helper functions, such that
> > drivers are free to make use of the DRM GPUVA manager's basic
> > functionality and opt-in for other features without setting any feature
> > flags, just by making use of the corresponding functions.
> > 
> > Signed-off-by: Danilo Krummrich 
> > ---
> >   drivers/gpu/drm/drm_gpuva_mgr.c | 688 +++-
> >   include/drm/drm_gem.h   |  48 ++-
> >   include/drm/drm_gpuva_mgr.h | 302 +-
> >   3 files changed, 1010 insertions(+), 28 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c 
> > b/drivers/gpu/drm/drm_gpuva_mgr.c
> > index f86bfad74ff8..69872b205961 100644
> > --- a/drivers/gpu/drm/drm_gpuva_mgr.c
> > +++ b/drivers/gpu/drm/drm_gpuva_mgr.c
> > @@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
> >   /**
> >* drm_gpuva_manager_init() - initialize a &drm_gpuva_manager
> >* @mgr: pointer to the &drm_gpuva_manager to initialize
> > + * @drm: the driver's &drm_device
> >* @name: the name of the GPU VA space
> >* @start_offset: the start offset of the GPU VA space
> >* @range: the size of the GPU VA space
> > @@ -669,6 +670,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
> >*/
> >   void
> >   drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
> > +  struct drm_device *drm,
> >const char *name,
> >u64 start_offset, u64 range,
> >u64 reserve_offset, u64 reserve_range,
> > @@ -677,6 +679,11 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
> > mgr->rb.tree = RB_ROOT_CACHED;
> > INIT_LIST_HEAD(&mgr->rb.list);
> > +   mt_init(&mgr->mt_ext);
> > +
> > +   INIT_LIST_HEAD(&mgr->evict.list);
> > +   spin_lock_init(&mgr->evict.lock);
> > +
> > drm_gpuva_check_overflow(start_offset, range);
> > mgr->mm_start = start_offset;
> > mgr->mm_range = range;
> > @@ -694,6 +701,9 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
> >  reserve_range)))
> > __drm_gpuva_insert(mgr, &mgr->kernel_alloc_node);
> > }
> > +
> > +   drm_gem_private_object_init(drm, &mgr->d_obj, 0);
> > +   mgr->resv = mgr->d_obj.resv;
> >   }
> >   EXPORT_SYMBOL_GPL(drm_gpuva_manager_init);
> > @@ -713,10 +723,575 @@ drm_gpuva_manager_destroy(struct drm_gpuva_manager 
> > *mgr)
> > __drm_gpuva_remove(&mgr->kernel_alloc_node);
> > WARN(!RB_EMPTY_ROOT(&mgr->rb.tree.rb_root),
> > -"GPUVA tree is not empty, potentially leaking memory.");
> > +"GPUVA tree is not empty, potentially leaking memory.\n");
> > +
> > +   mtree_destroy(&mgr->mt_ext);
> > +   WARN(!list_empty(&mgr->evict.list), "Evict list should be empty.\n");
> > +
> > +   drm_gem_private_object_fini(&mgr->d_obj);
> >   }
> >   EXPORT_SYMBOL_GPL(drm_gpuva_manager_destroy);
> > +/**
> > + * drm_gpuva_manager_prepare_objects() - prepare all associated BOs
> > + * @mgr: the &drm_gpuva_manager
> > + * @num_fences: the number of &dma_fences to reserve
> > + *
> > + * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given
> > + * &drm_gpuva_manager contains mappings of.
> > + *
> > + * Drivers can obtain the corresponding &drm_exec instance through
> > + * DRM_GPUVA_EXEC(). It is the driver's responsibility to call drm_exec_init()
> > + * and drm_exec_fini() accordingly.
> > + *
> > + * Returns: 0 on success, negative error code on failure.
> > + */
> > +int
> > +drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr,
> > + unsigned 

Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-08-30 Thread Christian König




Am 20.08.23 um 23:53 schrieb Danilo Krummrich:

So far the DRM GPUVA manager offers common infrastructure to track GPU VA
allocations and mappings, generically connect GPU VA mappings to their
backing buffers and perform more complex mapping operations on the GPU VA
space.

However, there are more design patterns commonly used by drivers, which
can potentially be generalized in order to make the DRM GPUVA manager
represent a basic GPU-VM implementation. In this context, this patch aims
at generalizing the following elements.

1) Provide a common dma-resv for GEM objects not being used outside of
this GPU-VM.

2) Provide tracking of external GEM objects (GEM objects which are
shared with other GPU-VMs).

3) Provide functions to efficiently lock all GEM objects dma-resv the
GPU-VM contains mappings of.

4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
of, such that validation of evicted GEM objects is accelerated.

5) Provide some convenience functions for common patterns.


Interesting work.

You basically implement a bunch of the ideas I came up with to improve the
amdgpu performance in the common manager now. That was one of the
remaining blockers I had for using this in amdgpu.


The question is, for example, how do you track evictions? E.g. we don't have a
common concept of eviction in GEM as far as I know. Or is the driver
responsible for giving those notifications to the GPUVA manager?


And would it be possible to lock only a specific area of the VM, e.g. 
every BO mapped in the interval X..Y?


Regards,
Christian.



Rather than being designed as a "framework", the target is to make all
features appear as a collection of optional helper functions, such that
drivers are free to make use of the DRM GPUVA manager's basic
functionality and opt-in for other features without setting any feature
flags, just by making use of the corresponding functions.

Signed-off-by: Danilo Krummrich 
---
  drivers/gpu/drm/drm_gpuva_mgr.c | 688 +++-
  include/drm/drm_gem.h   |  48 ++-
  include/drm/drm_gpuva_mgr.h | 302 +-
  3 files changed, 1010 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c
index f86bfad74ff8..69872b205961 100644
--- a/drivers/gpu/drm/drm_gpuva_mgr.c
+++ b/drivers/gpu/drm/drm_gpuva_mgr.c
@@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
  /**
   * drm_gpuva_manager_init() - initialize a &drm_gpuva_manager
   * @mgr: pointer to the &drm_gpuva_manager to initialize
+ * @drm: the driver's &drm_device
   * @name: the name of the GPU VA space
   * @start_offset: the start offset of the GPU VA space
   * @range: the size of the GPU VA space
@@ -669,6 +670,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
   */
  void
  drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
+  struct drm_device *drm,
   const char *name,
   u64 start_offset, u64 range,
   u64 reserve_offset, u64 reserve_range,
@@ -677,6 +679,11 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
mgr->rb.tree = RB_ROOT_CACHED;
INIT_LIST_HEAD(&mgr->rb.list);

+   mt_init(&mgr->mt_ext);
+
+   INIT_LIST_HEAD(&mgr->evict.list);
+   spin_lock_init(&mgr->evict.lock);
+
drm_gpuva_check_overflow(start_offset, range);
mgr->mm_start = start_offset;
mgr->mm_range = range;
@@ -694,6 +701,9 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
 reserve_range)))
__drm_gpuva_insert(mgr, &mgr->kernel_alloc_node);
}
+
+   drm_gem_private_object_init(drm, &mgr->d_obj, 0);
+   mgr->resv = mgr->d_obj.resv;
  }
  EXPORT_SYMBOL_GPL(drm_gpuva_manager_init);
  
@@ -713,10 +723,575 @@ drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr)

__drm_gpuva_remove(&mgr->kernel_alloc_node);

WARN(!RB_EMPTY_ROOT(&mgr->rb.tree.rb_root),
-"GPUVA tree is not empty, potentially leaking memory.");
+"GPUVA tree is not empty, potentially leaking memory.\n");
+
+   mtree_destroy(&mgr->mt_ext);
+   WARN(!list_empty(&mgr->evict.list), "Evict list should be empty.\n");
+
+   drm_gem_private_object_fini(&mgr->d_obj);
  }
  EXPORT_SYMBOL_GPL(drm_gpuva_manager_destroy);
  
+/**

+ * drm_gpuva_manager_prepare_objects() - prepare all associated BOs
+ * @mgr: the &drm_gpuva_manager
+ * @num_fences: the number of &dma_fences to reserve
+ *
+ * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given
+ * &drm_gpuva_manager contains mappings of.
+ *
+ * Drivers can obtain the corresponding &drm_exec instance through
+ * DRM_GPUVA_EXEC(). It is the driver's responsibility to call drm_exec_init()
+ * and drm_exec_fini() accordingly.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr,
+ unsigned int 

Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-08-30 Thread Intel

Hi, Danilo.

Some quick comments since I'm doing some Xe work in this area. Will 
probably get back with more.


On 8/20/23 23:53, Danilo Krummrich wrote:

So far the DRM GPUVA manager offers common infrastructure to track GPU VA
allocations and mappings, generically connect GPU VA mappings to their
backing buffers and perform more complex mapping operations on the GPU VA
space.

However, there are more design patterns commonly used by drivers, which
can potentially be generalized in order to make the DRM GPUVA manager
represent a basic GPU-VM implementation. In this context, this patch aims
at generalizing the following elements.

1) Provide a common dma-resv for GEM objects not being used outside of
this GPU-VM.

2) Provide tracking of external GEM objects (GEM objects which are
shared with other GPU-VMs).

3) Provide functions to efficiently lock all GEM objects dma-resv the
GPU-VM contains mappings of.

4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
of, such that validation of evicted GEM objects is accelerated.

5) Provide some convenience functions for common patterns.

Rather than being designed as a "framework", the target is to make all
features appear as a collection of optional helper functions, such that
drivers are free to make use of the DRM GPUVA manager's basic
functionality and opt-in for other features without setting any feature
flags, just by making use of the corresponding functions.

Signed-off-by: Danilo Krummrich 
---
  drivers/gpu/drm/drm_gpuva_mgr.c | 688 +++-
  include/drm/drm_gem.h   |  48 ++-
  include/drm/drm_gpuva_mgr.h | 302 +-
  3 files changed, 1010 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c
index f86bfad74ff8..69872b205961 100644
--- a/drivers/gpu/drm/drm_gpuva_mgr.c
+++ b/drivers/gpu/drm/drm_gpuva_mgr.c
@@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
  /**
   * drm_gpuva_manager_init() - initialize a &drm_gpuva_manager
   * @mgr: pointer to the &drm_gpuva_manager to initialize
+ * @drm: the driver's &drm_device
   * @name: the name of the GPU VA space
   * @start_offset: the start offset of the GPU VA space
   * @range: the size of the GPU VA space
@@ -669,6 +670,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
   */
  void
  drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
+  struct drm_device *drm,
   const char *name,
   u64 start_offset, u64 range,
   u64 reserve_offset, u64 reserve_range,
@@ -677,6 +679,11 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
mgr->rb.tree = RB_ROOT_CACHED;
INIT_LIST_HEAD(&mgr->rb.list);

+   mt_init(&mgr->mt_ext);
+
+   INIT_LIST_HEAD(&mgr->evict.list);
+   spin_lock_init(&mgr->evict.lock);
+
drm_gpuva_check_overflow(start_offset, range);
mgr->mm_start = start_offset;
mgr->mm_range = range;
@@ -694,6 +701,9 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
 reserve_range)))
__drm_gpuva_insert(mgr, &mgr->kernel_alloc_node);
}
+
+   drm_gem_private_object_init(drm, &mgr->d_obj, 0);
+   mgr->resv = mgr->d_obj.resv;
  }
  EXPORT_SYMBOL_GPL(drm_gpuva_manager_init);
  
@@ -713,10 +723,575 @@ drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr)

__drm_gpuva_remove(&mgr->kernel_alloc_node);

WARN(!RB_EMPTY_ROOT(&mgr->rb.tree.rb_root),
-"GPUVA tree is not empty, potentially leaking memory.");
+"GPUVA tree is not empty, potentially leaking memory.\n");
+
+   mtree_destroy(&mgr->mt_ext);
+   WARN(!list_empty(&mgr->evict.list), "Evict list should be empty.\n");
+
+   drm_gem_private_object_fini(&mgr->d_obj);
  }
  EXPORT_SYMBOL_GPL(drm_gpuva_manager_destroy);
  
+/**

+ * drm_gpuva_manager_prepare_objects() - prepare all associated BOs
+ * @mgr: the &drm_gpuva_manager
+ * @num_fences: the number of &dma_fences to reserve
+ *
+ * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given
+ * &drm_gpuva_manager contains mappings of.
+ *
+ * Drivers can obtain the corresponding &drm_exec instance through
+ * DRM_GPUVA_EXEC(). It is the driver's responsibility to call drm_exec_init()
+ * and drm_exec_fini() accordingly.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr,
+ unsigned int num_fences)
+{
+   struct drm_exec *exec = DRM_GPUVA_EXEC(mgr);
+   MA_STATE(mas, &mgr->mt_ext, 0, 0);
+   union {
+   void *ptr;
+   uintptr_t cnt;
+   } ref;
+   int ret;
+
+   ret = drm_exec_prepare_obj(exec, &mgr->d_obj, num_fences);
+   if (ret)
+   goto out;
+
+   rcu_read_lock();
In xe we're protecting the external object list with an outer lock
(the same lock that protects the mgr itself).
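
For illustration, a rough sketch of that alternative: a plain list of external
objects protected by the GPU-VM's own lock instead of the maple tree / RCU walk
above. All structure and field names here are made up for the example.

	struct driver_vm_bo {
		struct drm_gem_object *obj;
		struct list_head extobj_entry;
	};

	struct driver_gpuvm {
		struct drm_gpuva_manager mgr;
		struct mutex lock;              /* outer lock, also protects extobj_list */
		struct list_head extobj_list;   /* list of struct driver_vm_bo */
	};

	static int driver_prepare_extobjs(struct driver_gpuvm *gpuvm,
					  struct drm_exec *exec,
					  unsigned int num_fences)
	{
		struct driver_vm_bo *vm_bo;
		int ret;

		lockdep_assert_held(&gpuvm->lock);

		/* No RCU or mas_pause() dance needed; the list cannot change
		 * while the outer lock is held. */
		list_for_each_entry(vm_bo, &gpuvm->extobj_list, extobj_entry) {
			ret = drm_exec_prepare_obj(exec, vm_bo->obj, num_fences);
			if (ret)
				return ret;
		}

		return 0;
	}

The trade-off is that every update of the external object list must take the
outer lock, whereas the maple-tree walk shown above can run under RCU with
only short locked sections.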

Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-08-21 Thread kernel test robot
Hi Danilo,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 25205087df1ffe06ccea9302944ed1f77dc68c6f]

url:
https://github.com/intel-lab-lkp/linux/commits/Danilo-Krummrich/drm-drm_exec-build-always-builtin/20230821-123143
base:   25205087df1ffe06ccea9302944ed1f77dc68c6f
patch link:https://lore.kernel.org/r/20230820215320.4187-3-dakr%40redhat.com
patch subject: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize 
dma_resv/extobj handling and GEM validation
config: arm-randconfig-r014-20230822 
(https://download.01.org/0day-ci/archive/20230822/202308221050.ktj8ufma-...@intel.com/config)
compiler: arm-linux-gnueabi-gcc (GCC) 12.3.0
reproduce: 
(https://download.01.org/0day-ci/archive/20230822/202308221050.ktj8ufma-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202308221050.ktj8ufma-...@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/drm_gpuva_mgr.c:750:1: warning: no previous prototype for 
>> 'drm_gpuva_manager_prepare_objects' [-Wmissing-prototypes]
 750 | drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr,
 | ^
   drivers/gpu/drm/drm_gpuva_mgr.c: In function '__drm_gpuva_sm_map':
   drivers/gpu/drm/drm_gpuva_mgr.c:1744:39: warning: variable 'prev' set but 
not used [-Wunused-but-set-variable]
1744 | struct drm_gpuva *va, *next, *prev = NULL;
 |   ^~~~
--
>> drivers/gpu/drm/drm_gpuva_mgr.c:1091: warning: Function parameter or member 
>> '__vm_bo' not described in 'drm_gpuva_gem_obtain_prealloc'


vim +/drm_gpuva_manager_prepare_objects +750 drivers/gpu/drm/drm_gpuva_mgr.c

   734  
   735  /**
   736   * drm_gpuva_manager_prepare_objects() - prepare all associated BOs
   737   * @mgr: the &drm_gpuva_manager
   738   * @num_fences: the number of &dma_fences to reserve
   739   *
   740   * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given
   741   * &drm_gpuva_manager contains mappings of.
   742   *
   743   * Drivers can obtain the corresponding &drm_exec instance through
   744   * DRM_GPUVA_EXEC(). It is the driver's responsibility to call drm_exec_init()
   745   * and drm_exec_fini() accordingly.
   746   *
   747   * Returns: 0 on success, negative error code on failure.
   748   */
   749  int
 > 750  drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr,
   751unsigned int num_fences)
   752  {
   753  struct drm_exec *exec = DRM_GPUVA_EXEC(mgr);
    754  MA_STATE(mas, &mgr->mt_ext, 0, 0);
   755  union {
   756  void *ptr;
   757  uintptr_t cnt;
   758  } ref;
   759  int ret;
   760  
    761  ret = drm_exec_prepare_obj(exec, &mgr->d_obj, num_fences);
   762  if (ret)
   763  goto out;
   764  
   765  rcu_read_lock();
    766  mas_for_each(&mas, ref.ptr, ULONG_MAX) {
   767  struct drm_gem_object *obj;
   768  
   769  mas_pause();
   770  rcu_read_unlock();
   771  
   772  obj = (struct drm_gem_object *)(uintptr_t)mas.index;
   773  ret = drm_exec_prepare_obj(exec, obj, num_fences);
   774  if (ret)
   775  goto out;
   776  
   777  rcu_read_lock();
   778  }
   779  rcu_read_unlock();
   780  
   781  out:
   782  return ret;
   783  }
   784  EXPORT_SYMBOL_GPL(drm_gpuva_manager_prepare_objects);
   785  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-08-21 Thread kernel test robot
Hi Danilo,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 25205087df1ffe06ccea9302944ed1f77dc68c6f]

url:
https://github.com/intel-lab-lkp/linux/commits/Danilo-Krummrich/drm-drm_exec-build-always-builtin/20230821-123143
base:   25205087df1ffe06ccea9302944ed1f77dc68c6f
patch link:https://lore.kernel.org/r/20230820215320.4187-3-dakr%40redhat.com
patch subject: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize 
dma_resv/extobj handling and GEM validation
config: sparc-randconfig-r022-20230822 
(https://download.01.org/0day-ci/archive/20230822/202308221021.jczejwoy-...@intel.com/config)
compiler: sparc64-linux-gcc (GCC) 12.3.0
reproduce: 
(https://download.01.org/0day-ci/archive/20230822/202308221021.jczejwoy-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202308221021.jczejwoy-...@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/drm_gpuva_mgr.c:750:1: warning: no previous prototype for 
>> 'drm_gpuva_manager_prepare_objects' [-Wmissing-prototypes]
 750 | drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr,
 | ^
   drivers/gpu/drm/drm_gpuva_mgr.c: In function '__drm_gpuva_sm_map':
   drivers/gpu/drm/drm_gpuva_mgr.c:1744:39: warning: variable 'prev' set but 
not used [-Wunused-but-set-variable]
1744 | struct drm_gpuva *va, *next, *prev = NULL;
 |   ^~~~


vim +/drm_gpuva_manager_prepare_objects +750 drivers/gpu/drm/drm_gpuva_mgr.c

   734  
   735  /**
   736   * drm_gpuva_manager_prepare_objects() - prepare all associated BOs
   737   * @mgr: the &drm_gpuva_manager
   738   * @num_fences: the number of &dma_fences to reserve
   739   *
   740   * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given
   741   * &drm_gpuva_manager contains mappings of.
   742   *
   743   * Drivers can obtain the corresponding &drm_exec instance through
   744   * DRM_GPUVA_EXEC(). It is the driver's responsibility to call drm_exec_init()
   745   * and drm_exec_fini() accordingly.
   746   *
   747   * Returns: 0 on success, negative error code on failure.
   748   */
   749  int
 > 750  drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr,
   751unsigned int num_fences)
   752  {
   753  struct drm_exec *exec = DRM_GPUVA_EXEC(mgr);
    754  MA_STATE(mas, &mgr->mt_ext, 0, 0);
   755  union {
   756  void *ptr;
   757  uintptr_t cnt;
   758  } ref;
   759  int ret;
   760  
    761  ret = drm_exec_prepare_obj(exec, &mgr->d_obj, num_fences);
   762  if (ret)
   763  goto out;
   764  
   765  rcu_read_lock();
    766  mas_for_each(&mas, ref.ptr, ULONG_MAX) {
   767  struct drm_gem_object *obj;
   768  
   769  mas_pause();
   770  rcu_read_unlock();
   771  
   772  obj = (struct drm_gem_object *)(uintptr_t)mas.index;
   773  ret = drm_exec_prepare_obj(exec, obj, num_fences);
   774  if (ret)
   775  goto out;
   776  
   777  rcu_read_lock();
   778  }
   779  rcu_read_unlock();
   780  
   781  out:
   782  return ret;
   783  }
   784  EXPORT_SYMBOL_GPL(drm_gpuva_manager_prepare_objects);
   785  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-08-21 Thread kernel test robot
Hi Danilo,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 25205087df1ffe06ccea9302944ed1f77dc68c6f]

url:
https://github.com/intel-lab-lkp/linux/commits/Danilo-Krummrich/drm-drm_exec-build-always-builtin/20230821-123143
base:   25205087df1ffe06ccea9302944ed1f77dc68c6f
patch link:https://lore.kernel.org/r/20230820215320.4187-3-dakr%40redhat.com
patch subject: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize 
dma_resv/extobj handling and GEM validation
config: i386-randconfig-r024-20230822 
(https://download.01.org/0day-ci/archive/20230822/202308220935.ik8qpkf4-...@intel.com/config)
compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git 
ae42196bc493ffe877a7e3dff8be32035dea4d07)
reproduce: 
(https://download.01.org/0day-ci/archive/20230822/202308220935.ik8qpkf4-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202308220935.ik8qpkf4-...@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/drm_gpuva_mgr.c:750:1: warning: no previous prototype for 
>> function 'drm_gpuva_manager_prepare_objects' [-Wmissing-prototypes]
   drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr,
   ^
   drivers/gpu/drm/drm_gpuva_mgr.c:749:1: note: declare 'static' if the 
function is not intended to be used outside of this translation unit
   int
   ^
   static 
   drivers/gpu/drm/drm_gpuva_mgr.c:1744:32: warning: variable 'prev' set but 
not used [-Wunused-but-set-variable]
   struct drm_gpuva *va, *next, *prev = NULL;
 ^
   2 warnings generated.
--
>> drivers/gpu/drm/drm_gpuva_mgr.c:1091: warning: Function parameter or member 
>> '__vm_bo' not described in 'drm_gpuva_gem_obtain_prealloc'


vim +/drm_gpuva_manager_prepare_objects +750 drivers/gpu/drm/drm_gpuva_mgr.c

   734  
   735  /**
   736   * drm_gpuva_manager_prepare_objects() - prepare all associated BOs
   737   * @mgr: the &drm_gpuva_manager
   738   * @num_fences: the number of &dma_fences to reserve
   739   *
   740   * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given
   741   * &drm_gpuva_manager contains mappings of.
   742   *
   743   * Drivers can obtain the corresponding &drm_exec instance through
   744   * DRM_GPUVA_EXEC(). It is the driver's responsibility to call drm_exec_init()
   745   * and drm_exec_fini() accordingly.
   746   *
   747   * Returns: 0 on success, negative error code on failure.
   748   */
   749  int
 > 750  drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr,
   751unsigned int num_fences)
   752  {
   753  struct drm_exec *exec = DRM_GPUVA_EXEC(mgr);
    754  MA_STATE(mas, &mgr->mt_ext, 0, 0);
   755  union {
   756  void *ptr;
   757  uintptr_t cnt;
   758  } ref;
   759  int ret;
   760  
    761  ret = drm_exec_prepare_obj(exec, &mgr->d_obj, num_fences);
   762  if (ret)
   763  goto out;
   764  
   765  rcu_read_lock();
    766  mas_for_each(&mas, ref.ptr, ULONG_MAX) {
   767  struct drm_gem_object *obj;
   768  
   769  mas_pause();
   770  rcu_read_unlock();
   771  
   772  obj = (struct drm_gem_object *)(uintptr_t)mas.index;
   773  ret = drm_exec_prepare_obj(exec, obj, num_fences);
   774  if (ret)
   775  goto out;
   776  
   777  rcu_read_lock();
   778  }
   779  rcu_read_unlock();
   780  
   781  out:
   782  return ret;
   783  }
   784  EXPORT_SYMBOL_GPL(drm_gpuva_manager_prepare_objects);
   785  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


[PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation

2023-08-20 Thread Danilo Krummrich
So far the DRM GPUVA manager offers common infrastructure to track GPU VA
allocations and mappings, generically connect GPU VA mappings to their
backing buffers and perform more complex mapping operations on the GPU VA
space.

However, there are more design patterns commonly used by drivers, which
can potentially be generalized in order to make the DRM GPUVA manager
represent a basic GPU-VM implementation. In this context, this patch aims
at generalizing the following elements.

1) Provide a common dma-resv for GEM objects not being used outside of
   this GPU-VM.

2) Provide tracking of external GEM objects (GEM objects which are
   shared with other GPU-VMs).

3) Provide functions to efficiently lock all GEM objects dma-resv the
   GPU-VM contains mappings of.

4) Provide tracking of evicted GEM objects the GPU-VM contains mappings
   of, such that validation of evicted GEM objects is accelerated.

5) Provide some convenience functions for common patterns.

Rather than being designed as a "framework", the target is to make all
features appear as a collection of optional helper functions, such that
drivers are free to make use of the DRM GPUVA manager's basic
functionality and opt-in for other features without setting any feature
flags, just by making use of the corresponding functions.

Signed-off-by: Danilo Krummrich 
---
 drivers/gpu/drm/drm_gpuva_mgr.c | 688 +++-
 include/drm/drm_gem.h   |  48 ++-
 include/drm/drm_gpuva_mgr.h | 302 +-
 3 files changed, 1010 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c
index f86bfad74ff8..69872b205961 100644
--- a/drivers/gpu/drm/drm_gpuva_mgr.c
+++ b/drivers/gpu/drm/drm_gpuva_mgr.c
@@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
 /**
  * drm_gpuva_manager_init() - initialize a &drm_gpuva_manager
  * @mgr: pointer to the &drm_gpuva_manager to initialize
+ * @drm: the driver's &drm_device
  * @name: the name of the GPU VA space
  * @start_offset: the start offset of the GPU VA space
  * @range: the size of the GPU VA space
@@ -669,6 +670,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr,
  */
 void
 drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
+  struct drm_device *drm,
   const char *name,
   u64 start_offset, u64 range,
   u64 reserve_offset, u64 reserve_range,
@@ -677,6 +679,11 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
mgr->rb.tree = RB_ROOT_CACHED;
INIT_LIST_HEAD(&mgr->rb.list);

+   mt_init(&mgr->mt_ext);
+
+   INIT_LIST_HEAD(&mgr->evict.list);
+   spin_lock_init(&mgr->evict.lock);
+
drm_gpuva_check_overflow(start_offset, range);
mgr->mm_start = start_offset;
mgr->mm_range = range;
@@ -694,6 +701,9 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr,
 reserve_range)))
__drm_gpuva_insert(mgr, &mgr->kernel_alloc_node);
}
+
+   drm_gem_private_object_init(drm, &mgr->d_obj, 0);
+   mgr->resv = mgr->d_obj.resv;
 }
 EXPORT_SYMBOL_GPL(drm_gpuva_manager_init);
 
@@ -713,10 +723,575 @@ drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr)
__drm_gpuva_remove(&mgr->kernel_alloc_node);

WARN(!RB_EMPTY_ROOT(&mgr->rb.tree.rb_root),
-"GPUVA tree is not empty, potentially leaking memory.");
+"GPUVA tree is not empty, potentially leaking memory.\n");
+
+   mtree_destroy(&mgr->mt_ext);
+   WARN(!list_empty(&mgr->evict.list), "Evict list should be empty.\n");
+
+   drm_gem_private_object_fini(&mgr->d_obj);
 }
 EXPORT_SYMBOL_GPL(drm_gpuva_manager_destroy);
 
+/**
+ * drm_gpuva_manager_prepare_objects() - prepare all associated BOs
+ * @mgr: the &drm_gpuva_manager
+ * @num_fences: the number of &dma_fences to reserve
+ *
+ * Calls drm_exec_prepare_obj() for all &drm_gem_objects the given
+ * &drm_gpuva_manager contains mappings of.
+ *
+ * Drivers can obtain the corresponding &drm_exec instance through
+ * DRM_GPUVA_EXEC(). It is the driver's responsibility to call drm_exec_init()
+ * and drm_exec_fini() accordingly.
+ *
+ * Returns: 0 on success, negative error code on failure.
+ */
+int
+drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr,
+ unsigned int num_fences)
+{
+   struct drm_exec *exec = DRM_GPUVA_EXEC(mgr);
+   MA_STATE(mas, &mgr->mt_ext, 0, 0);
+   union {
+   void *ptr;
+   uintptr_t cnt;
+   } ref;
+   int ret;
+
+   ret = drm_exec_prepare_obj(exec, &mgr->d_obj, num_fences);
+   if (ret)
+   goto out;
+
+   rcu_read_lock();
+   mas_for_each(&mas, ref.ptr, ULONG_MAX) {
+   struct drm_gem_object *obj;
+
+   mas_pause();
+   rcu_read_unlock();
+
+   obj = (struct drm_gem_object *)(uintptr_t)mas.index;
+   ret = drm_exec_prepare_obj(exec, obj,