Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
On Thu, Oct 12, 2023 at 02:35:15PM +0200, Christian König wrote: > Am 12.10.23 um 12:33 schrieb Dave Airlie: > > On Wed, 11 Oct 2023 at 17:07, Christian König > > wrote: > > > Am 10.10.23 um 22:23 schrieb Dave Airlie: > > > > > I think we're then optimizing for different scenarios. Our compute > > > > > driver will use mostly external objects only, and if shared, I don't > > > > > forsee them bound to many VMs. What saves us currently here is that in > > > > > compute mode we only really traverse the extobj list after a preempt > > > > > fence wait, or when a vm is using a new context for the first time. So > > > > > vm's extobj list is pretty large. Each bo's vma list will typically be > > > > > pretty small. > > > > Can I ask why we are optimising for this userspace, this seems > > > > incredibly broken. > > > > > > > > We've has this sort of problem in the past with Intel letting the tail > > > > wag the horse, does anyone remember optimising relocations for a > > > > userspace that didn't actually need to use relocations? > > > > > > > > We need to ask why this userspace is doing this, can we get some > > > > pointers to it? compute driver should have no reason to use mostly > > > > external objects, the OpenCL and level0 APIs should be good enough to > > > > figure this out. > > > Well that is pretty normal use case, AMD works the same way. > > > > > > In a multi GPU compute stack you have mostly all the data shared between > > > different hardware devices. > > > > > > As I said before looking at just the Vulcan use case is not a good idea > > > at all. > > > > > It's okay, I don't think anyone is doing that, some of the these > > use-cases are buried in server land and you guys don't communicate > > them very well. > > Yeah, well everybody is trying very hard to get away from those approaches > :) > > But so far there hasn't been any breakthrough. > > > > > multi-gpu compute would I'd hope be moving towards HMM/SVM type > > solutions though? > > Unfortunately not in the foreseeable future. HMM seems more and more like a > dead end, at least for AMD. > > AMD still has hardware support in all of their MI* products, but for Navi > the features necessary for implementing HMM have been dropped. And it looks > more and more like their are not going to come back. > > Additional to that from the software side Felix summarized it in the HMM > peer2peer discussion thread recently quite well. A buffer object based > approach is not only simpler to handle, but also performant vise multiple > magnitudes faster. This matches what I'm hearing from all over. Turns out that handling page faults in full generality in a compute/accel device (not just gpu) is just too damn hard. At least for anyone who isn't nvidia. Usually time bound preemption guarantees are the first to go, followed right after by a long list of more fixed function hardware blocks that outright can't cope with page faults. There's so many corner cases where it breaks down that I feel like device driver allocated memory of one flavor or another will stick around for a very long time. This isn't even counting the software challenges. -Sima > > I'm also not into looking at use-cases that used to be important but > > might not as important going forward. > > Well multimedia applications and OpenGL are still around, but it's not the > main focus any more. > > Christian. > > > > > Dave. > > > > > > > Christian. > > > > > > > Dave. > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
Am 12.10.23 um 12:33 schrieb Dave Airlie: On Wed, 11 Oct 2023 at 17:07, Christian König wrote: Am 10.10.23 um 22:23 schrieb Dave Airlie: I think we're then optimizing for different scenarios. Our compute driver will use mostly external objects only, and if shared, I don't forsee them bound to many VMs. What saves us currently here is that in compute mode we only really traverse the extobj list after a preempt fence wait, or when a vm is using a new context for the first time. So vm's extobj list is pretty large. Each bo's vma list will typically be pretty small. Can I ask why we are optimising for this userspace, this seems incredibly broken. We've has this sort of problem in the past with Intel letting the tail wag the horse, does anyone remember optimising relocations for a userspace that didn't actually need to use relocations? We need to ask why this userspace is doing this, can we get some pointers to it? compute driver should have no reason to use mostly external objects, the OpenCL and level0 APIs should be good enough to figure this out. Well that is pretty normal use case, AMD works the same way. In a multi GPU compute stack you have mostly all the data shared between different hardware devices. As I said before looking at just the Vulcan use case is not a good idea at all. It's okay, I don't think anyone is doing that, some of the these use-cases are buried in server land and you guys don't communicate them very well. Yeah, well everybody is trying very hard to get away from those approaches :) But so far there hasn't been any breakthrough. multi-gpu compute would I'd hope be moving towards HMM/SVM type solutions though? Unfortunately not in the foreseeable future. HMM seems more and more like a dead end, at least for AMD. AMD still has hardware support in all of their MI* products, but for Navi the features necessary for implementing HMM have been dropped. And it looks more and more like their are not going to come back. Additional to that from the software side Felix summarized it in the HMM peer2peer discussion thread recently quite well. A buffer object based approach is not only simpler to handle, but also performant vise multiple magnitudes faster. I'm also not into looking at use-cases that used to be important but might not as important going forward. Well multimedia applications and OpenGL are still around, but it's not the main focus any more. Christian. Dave. Christian. Dave.
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
On Wed, 11 Oct 2023 at 17:07, Christian König wrote: > > Am 10.10.23 um 22:23 schrieb Dave Airlie: > >> I think we're then optimizing for different scenarios. Our compute > >> driver will use mostly external objects only, and if shared, I don't > >> forsee them bound to many VMs. What saves us currently here is that in > >> compute mode we only really traverse the extobj list after a preempt > >> fence wait, or when a vm is using a new context for the first time. So > >> vm's extobj list is pretty large. Each bo's vma list will typically be > >> pretty small. > > Can I ask why we are optimising for this userspace, this seems > > incredibly broken. > > > > We've has this sort of problem in the past with Intel letting the tail > > wag the horse, does anyone remember optimising relocations for a > > userspace that didn't actually need to use relocations? > > > > We need to ask why this userspace is doing this, can we get some > > pointers to it? compute driver should have no reason to use mostly > > external objects, the OpenCL and level0 APIs should be good enough to > > figure this out. > > Well that is pretty normal use case, AMD works the same way. > > In a multi GPU compute stack you have mostly all the data shared between > different hardware devices. > > As I said before looking at just the Vulcan use case is not a good idea > at all. > It's okay, I don't think anyone is doing that, some of the these use-cases are buried in server land and you guys don't communicate them very well. multi-gpu compute would I'd hope be moving towards HMM/SVM type solutions though? I'm also not into looking at use-cases that used to be important but might not as important going forward. Dave. > Christian. > > > > > Dave. >
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
On Wed, 2023-10-11 at 06:23 +1000, Dave Airlie wrote: > > I think we're then optimizing for different scenarios. Our compute > > driver will use mostly external objects only, and if shared, I > > don't > > forsee them bound to many VMs. What saves us currently here is that > > in > > compute mode we only really traverse the extobj list after a > > preempt > > fence wait, or when a vm is using a new context for the first time. > > So > > vm's extobj list is pretty large. Each bo's vma list will typically > > be > > pretty small. > > Can I ask why we are optimising for this userspace, this seems > incredibly broken. First Judging from the discussion with Christian this is not really uncommon. There *are* ways that we can play tricks in KMD of assorted cleverness to reduce the extobj list size, but doing that in KMD that wouldn't be much different than accepting a large extobj list size and do what we can to reduce overhead of iterating over it. Second the discussion here really was about whether we should be using a lower level lock to allow for async state updates, with a rather complex mechanism with weak reference counting and a requirement to drop the locks within the loop to avoid locking inversion. If that were a simplification with little or no overhead all fine, but IMO it's not a simplification? > > We've has this sort of problem in the past with Intel letting the > tail > wag the horse, does anyone remember optimising relocations for a > userspace that didn't actually need to use relocations? > > We need to ask why this userspace is doing this, can we get some > pointers to it? compute driver should have no reason to use mostly > external objects, the OpenCL and level0 APIs should be good enough to > figure this out. TBH for the compute UMD case, I'd be prepared to drop the *performance* argument of fine-grained locking the extobj list since it's really only traversed on new contexts and preemption. But as Christian mentions there might be other cases. We should perhaps figure those out and document? /Thoams > > Dave.
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
Am 10.10.23 um 22:23 schrieb Dave Airlie: I think we're then optimizing for different scenarios. Our compute driver will use mostly external objects only, and if shared, I don't forsee them bound to many VMs. What saves us currently here is that in compute mode we only really traverse the extobj list after a preempt fence wait, or when a vm is using a new context for the first time. So vm's extobj list is pretty large. Each bo's vma list will typically be pretty small. Can I ask why we are optimising for this userspace, this seems incredibly broken. We've has this sort of problem in the past with Intel letting the tail wag the horse, does anyone remember optimising relocations for a userspace that didn't actually need to use relocations? We need to ask why this userspace is doing this, can we get some pointers to it? compute driver should have no reason to use mostly external objects, the OpenCL and level0 APIs should be good enough to figure this out. Well that is pretty normal use case, AMD works the same way. In a multi GPU compute stack you have mostly all the data shared between different hardware devices. As I said before looking at just the Vulcan use case is not a good idea at all. Christian. Dave.
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
> I think we're then optimizing for different scenarios. Our compute > driver will use mostly external objects only, and if shared, I don't > forsee them bound to many VMs. What saves us currently here is that in > compute mode we only really traverse the extobj list after a preempt > fence wait, or when a vm is using a new context for the first time. So > vm's extobj list is pretty large. Each bo's vma list will typically be > pretty small. Can I ask why we are optimising for this userspace, this seems incredibly broken. We've has this sort of problem in the past with Intel letting the tail wag the horse, does anyone remember optimising relocations for a userspace that didn't actually need to use relocations? We need to ask why this userspace is doing this, can we get some pointers to it? compute driver should have no reason to use mostly external objects, the OpenCL and level0 APIs should be good enough to figure this out. Dave.
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
On 9/1/23 14:10, Danilo Krummrich wrote: On Fri, Sep 01, 2023 at 07:59:21AM +0200, Thomas Hellström (Intel) wrote: On 8/31/23 21:07, Danilo Krummrich wrote: On Thu, Aug 31, 2023 at 06:53:01PM +0200, Thomas Hellström (Intel) wrote: Hi, On 8/31/23 13:18, Danilo Krummrich wrote: On Thu, Aug 31, 2023 at 11:04:06AM +0200, Thomas Hellström (Intel) wrote: Hi! On 8/30/23 17:00, Danilo Krummrich wrote: On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) wrote: On 8/30/23 14:49, Danilo Krummrich wrote: Hi Thomas, thanks for having a look! On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) wrote: Hi, Danilo. Some quick comments since I'm doing some Xe work in this area. Will probably get back with more. On 8/20/23 23:53, Danilo Krummrich wrote: diff --git a/include/drm/drm_gpuva_mgr.h b/include/drm/drm_gpuva_mgr.h index ed8d50200cc3..693e2da3f425 100644 --- a/include/drm/drm_gpuva_mgr.h +++ b/include/drm/drm_gpuva_mgr.h @@ -26,12 +26,16 @@ */ #include +#include +#include #include #include #include +#include struct drm_gpuva_manager; +struct drm_gpuva_gem; struct drm_gpuva_fn_ops; /** @@ -140,7 +144,7 @@ struct drm_gpuva { int drm_gpuva_insert(struct drm_gpuva_manager *mgr, struct drm_gpuva *va); void drm_gpuva_remove(struct drm_gpuva *va); -void drm_gpuva_link(struct drm_gpuva *va); +void drm_gpuva_link(struct drm_gpuva *va, struct drm_gpuva_gem *vm_bo); void drm_gpuva_unlink(struct drm_gpuva *va); struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager *mgr, @@ -240,15 +244,137 @@ struct drm_gpuva_manager { * @ops: _gpuva_fn_ops providing the split/merge steps to drivers */ const struct drm_gpuva_fn_ops *ops; + + /** +* @d_obj: Dummy GEM object; used internally to pass the GPU VMs +* dma-resv to _exec. +*/ + struct drm_gem_object d_obj; + + /** +* @resv: the _resv for _gem_objects mapped in this GPU VA +* space +*/ + struct dma_resv *resv; + + /** +* @exec: the _exec helper to lock external _gem_objects +*/ + struct drm_exec exec; + + /** +* @mt_ext: _tree storing external _gem_objects +*/ + struct maple_tree mt_ext; Why are you using a maple tree here? Insertion and removal is O(log(n)) instead of O(1) for a list? Having a list of drm_gem_objects directly wouldn't work, as multiple GPU-VMs could have mappings of the same extobj. I considered using the VM_BO abstraction (struct drm_gpuva_gem) as list entry instead, which also seems to be the obvious choice. However, there is a locking conflict. A drm_gem_object keeps a list of drm_gpuva_gems, while each drm_gpuva_gem keeps a list of drm_gpuvas. Both lists are either protected with the dma-resv lock of the corresponding drm_gem_object, or with an external lock provided by the driver (see drm_gem_gpuva_set_lock()). The latter is used by drivers performing changes on the GPUVA space directly from the fence signalling path. Now, similar to what drm_gpuva_link() and drm_gpuva_unlink() are doing already, we'd want to add a drm_gpuva_gem to the extobj list for the first mapping being linked and we'd want to remove it for the last one being unlinked. (Actually we'd want to add the drm_gpuva_gem object to the extobj list even before, because otherwise we'd not acquire it's dma-resv lock of this GEM object through drm_gpuva_manager_lock(). But that's trival, we could do that when we create the drm_gpuva_gem, which we need to do anyways.) Anyway, we'd probably want to keep removing the drm_gpuva_gem from the extobj list from drm_gpuva_unlink() when the last mapping of this BO is unlinked. In order to do so, we'd (as discussed above) either need to hold the outer GPU-VM lock or the GPU-VMs dma-resv lock. Both would be illegal in the case drm_gpuva_unlink() is called from within the fence signalling path. For drivers like XE or Nouveau, we'd at least need to make sure to not mess up the locking hierarchy of GPU-VM lock and dma-resv lock of the corresponding BO. Considering all that, I thought it's probably better to track extobjs separate from the drm_gpuva_gem, hence the maple tree choice. Hm. OK, in Xe we're having a list of the xe_vmas (drm_gpuvas) that point to external objects, or in the case of multiple mappings to the same gem object, only one of the drm_gpuvas is in the list. These are protected by the GPU-VM lock. I don't see a problem with removing those from the fence signalling path, though? I intentionally tried to avoid keeping a list of drm_gpuvas to track extobjs, since this is generic code I don't know how much mappings of an external object the corresponding driver potentially creates. This could become a pretty large list to iterate. Another reason was, that I want to keep the drm_gpuva structure as small as possible, hence avoiding another list_head.
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
On Fri, Sep 01, 2023 at 07:59:21AM +0200, Thomas Hellström (Intel) wrote: > > On 8/31/23 21:07, Danilo Krummrich wrote: > > On Thu, Aug 31, 2023 at 06:53:01PM +0200, Thomas Hellström (Intel) wrote: > > > Hi, > > > > > > On 8/31/23 13:18, Danilo Krummrich wrote: > > > > On Thu, Aug 31, 2023 at 11:04:06AM +0200, Thomas Hellström (Intel) > > > > wrote: > > > > > Hi! > > > > > > > > > > On 8/30/23 17:00, Danilo Krummrich wrote: > > > > > > On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) > > > > > > wrote: > > > > > > > On 8/30/23 14:49, Danilo Krummrich wrote: > > > > > > > > Hi Thomas, > > > > > > > > > > > > > > > > thanks for having a look! > > > > > > > > > > > > > > > > On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström > > > > > > > > (Intel) wrote: > > > > > > > > > Hi, Danilo. > > > > > > > > > > > > > > > > > > Some quick comments since I'm doing some Xe work in this > > > > > > > > > area. Will probably > > > > > > > > > get back with more. > > > > > > > > > > > > > > > > > > On 8/20/23 23:53, Danilo Krummrich wrote: > > > > > > > > > > > > > > > > > > diff --git a/include/drm/drm_gpuva_mgr.h > > > > > > > > > > b/include/drm/drm_gpuva_mgr.h > > > > > > > > > > index ed8d50200cc3..693e2da3f425 100644 > > > > > > > > > > --- a/include/drm/drm_gpuva_mgr.h > > > > > > > > > > +++ b/include/drm/drm_gpuva_mgr.h > > > > > > > > > > @@ -26,12 +26,16 @@ > > > > > > > > > >*/ > > > > > > > > > > #include > > > > > > > > > > +#include > > > > > > > > > > +#include > > > > > > > > > > #include > > > > > > > > > > #include > > > > > > > > > > #include > > > > > > > > > > +#include > > > > > > > > > > struct drm_gpuva_manager; > > > > > > > > > > +struct drm_gpuva_gem; > > > > > > > > > > struct drm_gpuva_fn_ops; > > > > > > > > > > /** > > > > > > > > > > @@ -140,7 +144,7 @@ struct drm_gpuva { > > > > > > > > > > int drm_gpuva_insert(struct drm_gpuva_manager *mgr, > > > > > > > > > > struct drm_gpuva *va); > > > > > > > > > > void drm_gpuva_remove(struct drm_gpuva *va); > > > > > > > > > > -void drm_gpuva_link(struct drm_gpuva *va); > > > > > > > > > > +void drm_gpuva_link(struct drm_gpuva *va, struct > > > > > > > > > > drm_gpuva_gem *vm_bo); > > > > > > > > > > void drm_gpuva_unlink(struct drm_gpuva *va); > > > > > > > > > > struct drm_gpuva *drm_gpuva_find(struct > > > > > > > > > > drm_gpuva_manager *mgr, > > > > > > > > > > @@ -240,15 +244,137 @@ struct drm_gpuva_manager { > > > > > > > > > > * @ops: _gpuva_fn_ops providing the > > > > > > > > > > split/merge steps to drivers > > > > > > > > > > */ > > > > > > > > > > const struct drm_gpuva_fn_ops *ops; > > > > > > > > > > + > > > > > > > > > > + /** > > > > > > > > > > +* @d_obj: Dummy GEM object; used internally to pass > > > > > > > > > > the GPU VMs > > > > > > > > > > +* dma-resv to _exec. > > > > > > > > > > +*/ > > > > > > > > > > + struct drm_gem_object d_obj; > > > > > > > > > > + > > > > > > > > > > + /** > > > > > > > > > > +* @resv: the _resv for _gem_objects mapped in > > > > > > > > > > this GPU VA > > > > > > > > > > +* space > > > > > > > > > > +*/ > > > > > > > > > > + struct dma_resv *resv; > > > > > > > > > > + > > > > > > > > > > + /** > > > > > > > > > > +* @exec: the _exec helper to lock external > > > > > > > > > > _gem_objects > > > > > > > > > > +*/ > > > > > > > > > > + struct drm_exec exec; > > > > > > > > > > + > > > > > > > > > > + /** > > > > > > > > > > +* @mt_ext: _tree storing external > > > > > > > > > > _gem_objects > > > > > > > > > > +*/ > > > > > > > > > > + struct maple_tree mt_ext; > > > > > > > > > Why are you using a maple tree here? Insertion and removal is > > > > > > > > > O(log(n)) > > > > > > > > > instead of O(1) for a list? > > > > > > > > > > > > > > > > > Having a list of drm_gem_objects directly wouldn't work, as > > > > > > > > multiple GPU-VMs > > > > > > > > could have mappings of the same extobj. > > > > > > > > > > > > > > > > I considered using the VM_BO abstraction (struct drm_gpuva_gem) > > > > > > > > as list entry > > > > > > > > instead, which also seems to be the obvious choice. However, > > > > > > > > there is a locking > > > > > > > > conflict. > > > > > > > > > > > > > > > > A drm_gem_object keeps a list of drm_gpuva_gems, while each > > > > > > > > drm_gpuva_gem keeps > > > > > > > > a list of drm_gpuvas. Both lists are either protected with the > > > > > > > > dma-resv lock of > > > > > > > > the corresponding drm_gem_object, or with an external lock > > > > > > > > provided by the > > > > > > > > driver (see drm_gem_gpuva_set_lock()). The latter is used by > > > > > > > > drivers performing > > > > > > > > changes on the GPUVA space directly from the fence signalling > > > > > > > > path. > > > > > > > > > > > > > > > > Now, similar to what
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
On 8/31/23 21:07, Danilo Krummrich wrote: On Thu, Aug 31, 2023 at 06:53:01PM +0200, Thomas Hellström (Intel) wrote: Hi, On 8/31/23 13:18, Danilo Krummrich wrote: On Thu, Aug 31, 2023 at 11:04:06AM +0200, Thomas Hellström (Intel) wrote: Hi! On 8/30/23 17:00, Danilo Krummrich wrote: On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) wrote: On 8/30/23 14:49, Danilo Krummrich wrote: Hi Thomas, thanks for having a look! On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) wrote: Hi, Danilo. Some quick comments since I'm doing some Xe work in this area. Will probably get back with more. On 8/20/23 23:53, Danilo Krummrich wrote: diff --git a/include/drm/drm_gpuva_mgr.h b/include/drm/drm_gpuva_mgr.h index ed8d50200cc3..693e2da3f425 100644 --- a/include/drm/drm_gpuva_mgr.h +++ b/include/drm/drm_gpuva_mgr.h @@ -26,12 +26,16 @@ */ #include +#include +#include #include #include #include +#include struct drm_gpuva_manager; +struct drm_gpuva_gem; struct drm_gpuva_fn_ops; /** @@ -140,7 +144,7 @@ struct drm_gpuva { int drm_gpuva_insert(struct drm_gpuva_manager *mgr, struct drm_gpuva *va); void drm_gpuva_remove(struct drm_gpuva *va); -void drm_gpuva_link(struct drm_gpuva *va); +void drm_gpuva_link(struct drm_gpuva *va, struct drm_gpuva_gem *vm_bo); void drm_gpuva_unlink(struct drm_gpuva *va); struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager *mgr, @@ -240,15 +244,137 @@ struct drm_gpuva_manager { * @ops: _gpuva_fn_ops providing the split/merge steps to drivers */ const struct drm_gpuva_fn_ops *ops; + + /** +* @d_obj: Dummy GEM object; used internally to pass the GPU VMs +* dma-resv to _exec. +*/ + struct drm_gem_object d_obj; + + /** +* @resv: the _resv for _gem_objects mapped in this GPU VA +* space +*/ + struct dma_resv *resv; + + /** +* @exec: the _exec helper to lock external _gem_objects +*/ + struct drm_exec exec; + + /** +* @mt_ext: _tree storing external _gem_objects +*/ + struct maple_tree mt_ext; Why are you using a maple tree here? Insertion and removal is O(log(n)) instead of O(1) for a list? Having a list of drm_gem_objects directly wouldn't work, as multiple GPU-VMs could have mappings of the same extobj. I considered using the VM_BO abstraction (struct drm_gpuva_gem) as list entry instead, which also seems to be the obvious choice. However, there is a locking conflict. A drm_gem_object keeps a list of drm_gpuva_gems, while each drm_gpuva_gem keeps a list of drm_gpuvas. Both lists are either protected with the dma-resv lock of the corresponding drm_gem_object, or with an external lock provided by the driver (see drm_gem_gpuva_set_lock()). The latter is used by drivers performing changes on the GPUVA space directly from the fence signalling path. Now, similar to what drm_gpuva_link() and drm_gpuva_unlink() are doing already, we'd want to add a drm_gpuva_gem to the extobj list for the first mapping being linked and we'd want to remove it for the last one being unlinked. (Actually we'd want to add the drm_gpuva_gem object to the extobj list even before, because otherwise we'd not acquire it's dma-resv lock of this GEM object through drm_gpuva_manager_lock(). But that's trival, we could do that when we create the drm_gpuva_gem, which we need to do anyways.) Anyway, we'd probably want to keep removing the drm_gpuva_gem from the extobj list from drm_gpuva_unlink() when the last mapping of this BO is unlinked. In order to do so, we'd (as discussed above) either need to hold the outer GPU-VM lock or the GPU-VMs dma-resv lock. Both would be illegal in the case drm_gpuva_unlink() is called from within the fence signalling path. For drivers like XE or Nouveau, we'd at least need to make sure to not mess up the locking hierarchy of GPU-VM lock and dma-resv lock of the corresponding BO. Considering all that, I thought it's probably better to track extobjs separate from the drm_gpuva_gem, hence the maple tree choice. Hm. OK, in Xe we're having a list of the xe_vmas (drm_gpuvas) that point to external objects, or in the case of multiple mappings to the same gem object, only one of the drm_gpuvas is in the list. These are protected by the GPU-VM lock. I don't see a problem with removing those from the fence signalling path, though? I intentionally tried to avoid keeping a list of drm_gpuvas to track extobjs, since this is generic code I don't know how much mappings of an external object the corresponding driver potentially creates. This could become a pretty large list to iterate. Another reason was, that I want to keep the drm_gpuva structure as small as possible, hence avoiding another list_head. Yes, the list might be pretty large, but OTOH you never iterate to access a single list element. When you need to iterate the
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
On Thu, Aug 31, 2023 at 06:53:01PM +0200, Thomas Hellström (Intel) wrote: > Hi, > > On 8/31/23 13:18, Danilo Krummrich wrote: > > On Thu, Aug 31, 2023 at 11:04:06AM +0200, Thomas Hellström (Intel) wrote: > > > Hi! > > > > > > On 8/30/23 17:00, Danilo Krummrich wrote: > > > > On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) > > > > wrote: > > > > > On 8/30/23 14:49, Danilo Krummrich wrote: > > > > > > Hi Thomas, > > > > > > > > > > > > thanks for having a look! > > > > > > > > > > > > On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) > > > > > > wrote: > > > > > > > Hi, Danilo. > > > > > > > > > > > > > > Some quick comments since I'm doing some Xe work in this area. > > > > > > > Will probably > > > > > > > get back with more. > > > > > > > > > > > > > > On 8/20/23 23:53, Danilo Krummrich wrote: > > > > > > > > > > > > diff --git a/include/drm/drm_gpuva_mgr.h > > > > > > > > b/include/drm/drm_gpuva_mgr.h > > > > > > > > index ed8d50200cc3..693e2da3f425 100644 > > > > > > > > --- a/include/drm/drm_gpuva_mgr.h > > > > > > > > +++ b/include/drm/drm_gpuva_mgr.h > > > > > > > > @@ -26,12 +26,16 @@ > > > > > > > > */ > > > > > > > > #include > > > > > > > > +#include > > > > > > > > +#include > > > > > > > > #include > > > > > > > > #include > > > > > > > > #include > > > > > > > > +#include > > > > > > > > struct drm_gpuva_manager; > > > > > > > > +struct drm_gpuva_gem; > > > > > > > > struct drm_gpuva_fn_ops; > > > > > > > > /** > > > > > > > > @@ -140,7 +144,7 @@ struct drm_gpuva { > > > > > > > > int drm_gpuva_insert(struct drm_gpuva_manager *mgr, struct > > > > > > > > drm_gpuva *va); > > > > > > > > void drm_gpuva_remove(struct drm_gpuva *va); > > > > > > > > -void drm_gpuva_link(struct drm_gpuva *va); > > > > > > > > +void drm_gpuva_link(struct drm_gpuva *va, struct drm_gpuva_gem > > > > > > > > *vm_bo); > > > > > > > > void drm_gpuva_unlink(struct drm_gpuva *va); > > > > > > > > struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager > > > > > > > > *mgr, > > > > > > > > @@ -240,15 +244,137 @@ struct drm_gpuva_manager { > > > > > > > > * @ops: _gpuva_fn_ops providing the split/merge > > > > > > > > steps to drivers > > > > > > > > */ > > > > > > > > const struct drm_gpuva_fn_ops *ops; > > > > > > > > + > > > > > > > > + /** > > > > > > > > +* @d_obj: Dummy GEM object; used internally to pass > > > > > > > > the GPU VMs > > > > > > > > +* dma-resv to _exec. > > > > > > > > +*/ > > > > > > > > + struct drm_gem_object d_obj; > > > > > > > > + > > > > > > > > + /** > > > > > > > > +* @resv: the _resv for _gem_objects mapped in > > > > > > > > this GPU VA > > > > > > > > +* space > > > > > > > > +*/ > > > > > > > > + struct dma_resv *resv; > > > > > > > > + > > > > > > > > + /** > > > > > > > > +* @exec: the _exec helper to lock external > > > > > > > > _gem_objects > > > > > > > > +*/ > > > > > > > > + struct drm_exec exec; > > > > > > > > + > > > > > > > > + /** > > > > > > > > +* @mt_ext: _tree storing external > > > > > > > > _gem_objects > > > > > > > > +*/ > > > > > > > > + struct maple_tree mt_ext; > > > > > > > Why are you using a maple tree here? Insertion and removal is > > > > > > > O(log(n)) > > > > > > > instead of O(1) for a list? > > > > > > > > > > > > > Having a list of drm_gem_objects directly wouldn't work, as > > > > > > multiple GPU-VMs > > > > > > could have mappings of the same extobj. > > > > > > > > > > > > I considered using the VM_BO abstraction (struct drm_gpuva_gem) as > > > > > > list entry > > > > > > instead, which also seems to be the obvious choice. However, there > > > > > > is a locking > > > > > > conflict. > > > > > > > > > > > > A drm_gem_object keeps a list of drm_gpuva_gems, while each > > > > > > drm_gpuva_gem keeps > > > > > > a list of drm_gpuvas. Both lists are either protected with the > > > > > > dma-resv lock of > > > > > > the corresponding drm_gem_object, or with an external lock provided > > > > > > by the > > > > > > driver (see drm_gem_gpuva_set_lock()). The latter is used by > > > > > > drivers performing > > > > > > changes on the GPUVA space directly from the fence signalling path. > > > > > > > > > > > > Now, similar to what drm_gpuva_link() and drm_gpuva_unlink() are > > > > > > doing already, > > > > > > we'd want to add a drm_gpuva_gem to the extobj list for the first > > > > > > mapping being > > > > > > linked and we'd want to remove it for the last one being unlinked. > > > > > > > > > > > > (Actually we'd want to add the drm_gpuva_gem object to the extobj > > > > > > list even > > > > > > before, because otherwise we'd not acquire it's dma-resv lock of > > > > > > this GEM object > > > > > > through drm_gpuva_manager_lock(). But that's trival, we could do
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
On 8/31/23 18:53, Thomas Hellström (Intel) wrote: Hi, On 8/31/23 13:18, Danilo Krummrich wrote: On Thu, Aug 31, 2023 at 11:04:06AM +0200, Thomas Hellström (Intel) wrote: Hi! On 8/30/23 17:00, Danilo Krummrich wrote: On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) wrote: On 8/30/23 14:49, Danilo Krummrich wrote: Hi Thomas, thanks for having a look! On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) wrote: Hi, Danilo. Some quick comments since I'm doing some Xe work in this area. Will probably get back with more. On 8/20/23 23:53, Danilo Krummrich wrote: diff --git a/include/drm/drm_gpuva_mgr.h b/include/drm/drm_gpuva_mgr.h index ed8d50200cc3..693e2da3f425 100644 --- a/include/drm/drm_gpuva_mgr.h +++ b/include/drm/drm_gpuva_mgr.h @@ -26,12 +26,16 @@ */ #include +#include +#include #include #include #include +#include struct drm_gpuva_manager; +struct drm_gpuva_gem; struct drm_gpuva_fn_ops; /** @@ -140,7 +144,7 @@ struct drm_gpuva { int drm_gpuva_insert(struct drm_gpuva_manager *mgr, struct drm_gpuva *va); void drm_gpuva_remove(struct drm_gpuva *va); -void drm_gpuva_link(struct drm_gpuva *va); +void drm_gpuva_link(struct drm_gpuva *va, struct drm_gpuva_gem *vm_bo); void drm_gpuva_unlink(struct drm_gpuva *va); struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager *mgr, @@ -240,15 +244,137 @@ struct drm_gpuva_manager { * @ops: _gpuva_fn_ops providing the split/merge steps to drivers */ const struct drm_gpuva_fn_ops *ops; + + /** + * @d_obj: Dummy GEM object; used internally to pass the GPU VMs + * dma-resv to _exec. + */ + struct drm_gem_object d_obj; + + /** + * @resv: the _resv for _gem_objects mapped in this GPU VA + * space + */ + struct dma_resv *resv; + + /** + * @exec: the _exec helper to lock external _gem_objects + */ + struct drm_exec exec; + + /** + * @mt_ext: _tree storing external _gem_objects + */ + struct maple_tree mt_ext; Why are you using a maple tree here? Insertion and removal is O(log(n)) instead of O(1) for a list? Having a list of drm_gem_objects directly wouldn't work, as multiple GPU-VMs could have mappings of the same extobj. I considered using the VM_BO abstraction (struct drm_gpuva_gem) as list entry instead, which also seems to be the obvious choice. However, there is a locking conflict. A drm_gem_object keeps a list of drm_gpuva_gems, while each drm_gpuva_gem keeps a list of drm_gpuvas. Both lists are either protected with the dma-resv lock of the corresponding drm_gem_object, or with an external lock provided by the driver (see drm_gem_gpuva_set_lock()). The latter is used by drivers performing changes on the GPUVA space directly from the fence signalling path. Now, similar to what drm_gpuva_link() and drm_gpuva_unlink() are doing already, we'd want to add a drm_gpuva_gem to the extobj list for the first mapping being linked and we'd want to remove it for the last one being unlinked. (Actually we'd want to add the drm_gpuva_gem object to the extobj list even before, because otherwise we'd not acquire it's dma-resv lock of this GEM object through drm_gpuva_manager_lock(). But that's trival, we could do that when we create the drm_gpuva_gem, which we need to do anyways.) Anyway, we'd probably want to keep removing the drm_gpuva_gem from the extobj list from drm_gpuva_unlink() when the last mapping of this BO is unlinked. In order to do so, we'd (as discussed above) either need to hold the outer GPU-VM lock or the GPU-VMs dma-resv lock. Both would be illegal in the case drm_gpuva_unlink() is called from within the fence signalling path. For drivers like XE or Nouveau, we'd at least need to make sure to not mess up the locking hierarchy of GPU-VM lock and dma-resv lock of the corresponding BO. Considering all that, I thought it's probably better to track extobjs separate from the drm_gpuva_gem, hence the maple tree choice. Hm. OK, in Xe we're having a list of the xe_vmas (drm_gpuvas) that point to external objects, or in the case of multiple mappings to the same gem object, only one of the drm_gpuvas is in the list. These are protected by the GPU-VM lock. I don't see a problem with removing those from the fence signalling path, though? I intentionally tried to avoid keeping a list of drm_gpuvas to track extobjs, since this is generic code I don't know how much mappings of an external object the corresponding driver potentially creates. This could become a pretty large list to iterate. Another reason was, that I want to keep the drm_gpuva structure as small as possible, hence avoiding another list_head. Yes, the list might be pretty large, but OTOH you never iterate to access a single list element. When you need to iterate the whole list you need to do that regardless of the data structure used. As for
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
Hi, On 8/31/23 13:18, Danilo Krummrich wrote: On Thu, Aug 31, 2023 at 11:04:06AM +0200, Thomas Hellström (Intel) wrote: Hi! On 8/30/23 17:00, Danilo Krummrich wrote: On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) wrote: On 8/30/23 14:49, Danilo Krummrich wrote: Hi Thomas, thanks for having a look! On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) wrote: Hi, Danilo. Some quick comments since I'm doing some Xe work in this area. Will probably get back with more. On 8/20/23 23:53, Danilo Krummrich wrote: diff --git a/include/drm/drm_gpuva_mgr.h b/include/drm/drm_gpuva_mgr.h index ed8d50200cc3..693e2da3f425 100644 --- a/include/drm/drm_gpuva_mgr.h +++ b/include/drm/drm_gpuva_mgr.h @@ -26,12 +26,16 @@ */ #include +#include +#include #include #include #include +#include struct drm_gpuva_manager; +struct drm_gpuva_gem; struct drm_gpuva_fn_ops; /** @@ -140,7 +144,7 @@ struct drm_gpuva { int drm_gpuva_insert(struct drm_gpuva_manager *mgr, struct drm_gpuva *va); void drm_gpuva_remove(struct drm_gpuva *va); -void drm_gpuva_link(struct drm_gpuva *va); +void drm_gpuva_link(struct drm_gpuva *va, struct drm_gpuva_gem *vm_bo); void drm_gpuva_unlink(struct drm_gpuva *va); struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager *mgr, @@ -240,15 +244,137 @@ struct drm_gpuva_manager { * @ops: _gpuva_fn_ops providing the split/merge steps to drivers */ const struct drm_gpuva_fn_ops *ops; + + /** +* @d_obj: Dummy GEM object; used internally to pass the GPU VMs +* dma-resv to _exec. +*/ + struct drm_gem_object d_obj; + + /** +* @resv: the _resv for _gem_objects mapped in this GPU VA +* space +*/ + struct dma_resv *resv; + + /** +* @exec: the _exec helper to lock external _gem_objects +*/ + struct drm_exec exec; + + /** +* @mt_ext: _tree storing external _gem_objects +*/ + struct maple_tree mt_ext; Why are you using a maple tree here? Insertion and removal is O(log(n)) instead of O(1) for a list? Having a list of drm_gem_objects directly wouldn't work, as multiple GPU-VMs could have mappings of the same extobj. I considered using the VM_BO abstraction (struct drm_gpuva_gem) as list entry instead, which also seems to be the obvious choice. However, there is a locking conflict. A drm_gem_object keeps a list of drm_gpuva_gems, while each drm_gpuva_gem keeps a list of drm_gpuvas. Both lists are either protected with the dma-resv lock of the corresponding drm_gem_object, or with an external lock provided by the driver (see drm_gem_gpuva_set_lock()). The latter is used by drivers performing changes on the GPUVA space directly from the fence signalling path. Now, similar to what drm_gpuva_link() and drm_gpuva_unlink() are doing already, we'd want to add a drm_gpuva_gem to the extobj list for the first mapping being linked and we'd want to remove it for the last one being unlinked. (Actually we'd want to add the drm_gpuva_gem object to the extobj list even before, because otherwise we'd not acquire it's dma-resv lock of this GEM object through drm_gpuva_manager_lock(). But that's trival, we could do that when we create the drm_gpuva_gem, which we need to do anyways.) Anyway, we'd probably want to keep removing the drm_gpuva_gem from the extobj list from drm_gpuva_unlink() when the last mapping of this BO is unlinked. In order to do so, we'd (as discussed above) either need to hold the outer GPU-VM lock or the GPU-VMs dma-resv lock. Both would be illegal in the case drm_gpuva_unlink() is called from within the fence signalling path. For drivers like XE or Nouveau, we'd at least need to make sure to not mess up the locking hierarchy of GPU-VM lock and dma-resv lock of the corresponding BO. Considering all that, I thought it's probably better to track extobjs separate from the drm_gpuva_gem, hence the maple tree choice. Hm. OK, in Xe we're having a list of the xe_vmas (drm_gpuvas) that point to external objects, or in the case of multiple mappings to the same gem object, only one of the drm_gpuvas is in the list. These are protected by the GPU-VM lock. I don't see a problem with removing those from the fence signalling path, though? I intentionally tried to avoid keeping a list of drm_gpuvas to track extobjs, since this is generic code I don't know how much mappings of an external object the corresponding driver potentially creates. This could become a pretty large list to iterate. Another reason was, that I want to keep the drm_gpuva structure as small as possible, hence avoiding another list_head. Yes, the list might be pretty large, but OTOH you never iterate to access a single list element. When you need to iterate the whole list you need to do that regardless of the data structure used. As for the list head, it might perhaps be aliased (union) with
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
On Thu, Aug 31, 2023 at 11:04:06AM +0200, Thomas Hellström (Intel) wrote: > Hi! > > On 8/30/23 17:00, Danilo Krummrich wrote: > > On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) wrote: > > > On 8/30/23 14:49, Danilo Krummrich wrote: > > > > Hi Thomas, > > > > > > > > thanks for having a look! > > > > > > > > On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) > > > > wrote: > > > > > Hi, Danilo. > > > > > > > > > > Some quick comments since I'm doing some Xe work in this area. Will > > > > > probably > > > > > get back with more. > > > > > > > > > > On 8/20/23 23:53, Danilo Krummrich wrote: > > > > > > diff --git a/include/drm/drm_gpuva_mgr.h > > > > > > b/include/drm/drm_gpuva_mgr.h > > > > > > index ed8d50200cc3..693e2da3f425 100644 > > > > > > --- a/include/drm/drm_gpuva_mgr.h > > > > > > +++ b/include/drm/drm_gpuva_mgr.h > > > > > > @@ -26,12 +26,16 @@ > > > > > > */ > > > > > > #include > > > > > > +#include > > > > > > +#include > > > > > > #include > > > > > > #include > > > > > > #include > > > > > > +#include > > > > > > struct drm_gpuva_manager; > > > > > > +struct drm_gpuva_gem; > > > > > > struct drm_gpuva_fn_ops; > > > > > > /** > > > > > > @@ -140,7 +144,7 @@ struct drm_gpuva { > > > > > > int drm_gpuva_insert(struct drm_gpuva_manager *mgr, struct > > > > > > drm_gpuva *va); > > > > > > void drm_gpuva_remove(struct drm_gpuva *va); > > > > > > -void drm_gpuva_link(struct drm_gpuva *va); > > > > > > +void drm_gpuva_link(struct drm_gpuva *va, struct drm_gpuva_gem > > > > > > *vm_bo); > > > > > > void drm_gpuva_unlink(struct drm_gpuva *va); > > > > > > struct drm_gpuva *drm_gpuva_find(struct drm_gpuva_manager *mgr, > > > > > > @@ -240,15 +244,137 @@ struct drm_gpuva_manager { > > > > > > * @ops: _gpuva_fn_ops providing the split/merge > > > > > > steps to drivers > > > > > > */ > > > > > > const struct drm_gpuva_fn_ops *ops; > > > > > > + > > > > > > + /** > > > > > > +* @d_obj: Dummy GEM object; used internally to pass the GPU VMs > > > > > > +* dma-resv to _exec. > > > > > > +*/ > > > > > > + struct drm_gem_object d_obj; > > > > > > + > > > > > > + /** > > > > > > +* @resv: the _resv for _gem_objects mapped in this GPU > > > > > > VA > > > > > > +* space > > > > > > +*/ > > > > > > + struct dma_resv *resv; > > > > > > + > > > > > > + /** > > > > > > +* @exec: the _exec helper to lock external _gem_objects > > > > > > +*/ > > > > > > + struct drm_exec exec; > > > > > > + > > > > > > + /** > > > > > > +* @mt_ext: _tree storing external _gem_objects > > > > > > +*/ > > > > > > + struct maple_tree mt_ext; > > > > > Why are you using a maple tree here? Insertion and removal is > > > > > O(log(n)) > > > > > instead of O(1) for a list? > > > > > > > > > Having a list of drm_gem_objects directly wouldn't work, as multiple > > > > GPU-VMs > > > > could have mappings of the same extobj. > > > > > > > > I considered using the VM_BO abstraction (struct drm_gpuva_gem) as list > > > > entry > > > > instead, which also seems to be the obvious choice. However, there is a > > > > locking > > > > conflict. > > > > > > > > A drm_gem_object keeps a list of drm_gpuva_gems, while each > > > > drm_gpuva_gem keeps > > > > a list of drm_gpuvas. Both lists are either protected with the dma-resv > > > > lock of > > > > the corresponding drm_gem_object, or with an external lock provided by > > > > the > > > > driver (see drm_gem_gpuva_set_lock()). The latter is used by drivers > > > > performing > > > > changes on the GPUVA space directly from the fence signalling path. > > > > > > > > Now, similar to what drm_gpuva_link() and drm_gpuva_unlink() are doing > > > > already, > > > > we'd want to add a drm_gpuva_gem to the extobj list for the first > > > > mapping being > > > > linked and we'd want to remove it for the last one being unlinked. > > > > > > > > (Actually we'd want to add the drm_gpuva_gem object to the extobj list > > > > even > > > > before, because otherwise we'd not acquire it's dma-resv lock of this > > > > GEM object > > > > through drm_gpuva_manager_lock(). But that's trival, we could do that > > > > when we > > > > create the drm_gpuva_gem, which we need to do anyways.) > > > > > > > > Anyway, we'd probably want to keep removing the drm_gpuva_gem from the > > > > extobj > > > > list from drm_gpuva_unlink() when the last mapping of this BO is > > > > unlinked. In > > > > order to do so, we'd (as discussed above) either need to hold the outer > > > > GPU-VM > > > > lock or the GPU-VMs dma-resv lock. Both would be illegal in the case > > > > drm_gpuva_unlink() is called from within the fence signalling path. For > > > > drivers > > > > like XE or Nouveau, we'd at least need to make sure to not mess up the > > > > locking > > > > hierarchy of GPU-VM lock and dma-resv lock of the
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
Hi! On 8/30/23 17:00, Danilo Krummrich wrote: On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) wrote: On 8/30/23 14:49, Danilo Krummrich wrote: Hi Thomas, thanks for having a look! On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) wrote: Hi, Danilo. Some quick comments since I'm doing some Xe work in this area. Will probably get back with more. On 8/20/23 23:53, Danilo Krummrich wrote: So far the DRM GPUVA manager offers common infrastructure to track GPU VA allocations and mappings, generically connect GPU VA mappings to their backing buffers and perform more complex mapping operations on the GPU VA space. However, there are more design patterns commonly used by drivers, which can potentially be generalized in order to make the DRM GPUVA manager represent a basic GPU-VM implementation. In this context, this patch aims at generalizing the following elements. 1) Provide a common dma-resv for GEM objects not being used outside of this GPU-VM. 2) Provide tracking of external GEM objects (GEM objects which are shared with other GPU-VMs). 3) Provide functions to efficiently lock all GEM objects dma-resv the GPU-VM contains mappings of. 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings of, such that validation of evicted GEM objects is accelerated. 5) Provide some convinience functions for common patterns. Rather than being designed as a "framework", the target is to make all features appear as a collection of optional helper functions, such that drivers are free to make use of the DRM GPUVA managers basic functionality and opt-in for other features without setting any feature flags, just by making use of the corresponding functions. Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/drm_gpuva_mgr.c | 688 +++- include/drm/drm_gem.h | 48 ++- include/drm/drm_gpuva_mgr.h | 302 +- 3 files changed, 1010 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c index f86bfad74ff8..69872b205961 100644 --- a/drivers/gpu/drm/drm_gpuva_mgr.c +++ b/drivers/gpu/drm/drm_gpuva_mgr.c @@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr, /** * drm_gpuva_manager_init() - initialize a _gpuva_manager * @mgr: pointer to the _gpuva_manager to initialize + * @drm: the drivers _device * @name: the name of the GPU VA space * @start_offset: the start offset of the GPU VA space * @range: the size of the GPU VA space @@ -669,6 +670,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr, */ void drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, + struct drm_device *drm, const char *name, u64 start_offset, u64 range, u64 reserve_offset, u64 reserve_range, @@ -677,6 +679,11 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, mgr->rb.tree = RB_ROOT_CACHED; INIT_LIST_HEAD(>rb.list); + mt_init(>mt_ext); + + INIT_LIST_HEAD(>evict.list); + spin_lock_init(>evict.lock); + drm_gpuva_check_overflow(start_offset, range); mgr->mm_start = start_offset; mgr->mm_range = range; @@ -694,6 +701,9 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, reserve_range))) __drm_gpuva_insert(mgr, >kernel_alloc_node); } + + drm_gem_private_object_init(drm, >d_obj, 0); + mgr->resv = mgr->d_obj.resv; } EXPORT_SYMBOL_GPL(drm_gpuva_manager_init); @@ -713,10 +723,575 @@ drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr) __drm_gpuva_remove(>kernel_alloc_node); WARN(!RB_EMPTY_ROOT(>rb.tree.rb_root), -"GPUVA tree is not empty, potentially leaking memory."); +"GPUVA tree is not empty, potentially leaking memory.\n"); + + mtree_destroy(>mt_ext); + WARN(!list_empty(>evict.list), "Evict list should be empty.\n"); + + drm_gem_private_object_fini(>d_obj); } EXPORT_SYMBOL_GPL(drm_gpuva_manager_destroy); +/** + * drm_gpuva_manager_prepare_objects() - prepare all assoiciated BOs + * @mgr: the _gpuva_manager + * @num_fences: the amount of _fences to reserve + * + * Calls drm_exec_prepare_obj() for all _gem_objects the given + * _gpuva_manager contains mappings of. + * + * Drivers can obtain the corresponding _exec instance through + * DRM_GPUVA_EXEC(). It is the drivers responsibility to call drm_exec_init() + * and drm_exec_fini() accordingly. + * + * Returns: 0 on success, negative error code on failure. + */ +int +drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr, + unsigned int num_fences) +{ + struct drm_exec *exec = DRM_GPUVA_EXEC(mgr); + MA_STATE(mas, >mt_ext, 0, 0); + union { +
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
On Wed, Aug 30, 2023 at 03:42:08PM +0200, Thomas Hellström (Intel) wrote: > > On 8/30/23 14:49, Danilo Krummrich wrote: > > Hi Thomas, > > > > thanks for having a look! > > > > On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) wrote: > > > Hi, Danilo. > > > > > > Some quick comments since I'm doing some Xe work in this area. Will > > > probably > > > get back with more. > > > > > > On 8/20/23 23:53, Danilo Krummrich wrote: > > > > So far the DRM GPUVA manager offers common infrastructure to track GPU > > > > VA > > > > allocations and mappings, generically connect GPU VA mappings to their > > > > backing buffers and perform more complex mapping operations on the GPU > > > > VA > > > > space. > > > > > > > > However, there are more design patterns commonly used by drivers, which > > > > can potentially be generalized in order to make the DRM GPUVA manager > > > > represent a basic GPU-VM implementation. In this context, this patch > > > > aims > > > > at generalizing the following elements. > > > > > > > > 1) Provide a common dma-resv for GEM objects not being used outside of > > > > this GPU-VM. > > > > > > > > 2) Provide tracking of external GEM objects (GEM objects which are > > > > shared with other GPU-VMs). > > > > > > > > 3) Provide functions to efficiently lock all GEM objects dma-resv the > > > > GPU-VM contains mappings of. > > > > > > > > 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings > > > > of, such that validation of evicted GEM objects is accelerated. > > > > > > > > 5) Provide some convinience functions for common patterns. > > > > > > > > Rather than being designed as a "framework", the target is to make all > > > > features appear as a collection of optional helper functions, such that > > > > drivers are free to make use of the DRM GPUVA managers basic > > > > functionality and opt-in for other features without setting any feature > > > > flags, just by making use of the corresponding functions. > > > > > > > > Signed-off-by: Danilo Krummrich > > > > --- > > > >drivers/gpu/drm/drm_gpuva_mgr.c | 688 > > > > +++- > > > >include/drm/drm_gem.h | 48 ++- > > > >include/drm/drm_gpuva_mgr.h | 302 +- > > > >3 files changed, 1010 insertions(+), 28 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c > > > > b/drivers/gpu/drm/drm_gpuva_mgr.c > > > > index f86bfad74ff8..69872b205961 100644 > > > > --- a/drivers/gpu/drm/drm_gpuva_mgr.c > > > > +++ b/drivers/gpu/drm/drm_gpuva_mgr.c > > > > @@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr, > > > >/** > > > > * drm_gpuva_manager_init() - initialize a _gpuva_manager > > > > * @mgr: pointer to the _gpuva_manager to initialize > > > > + * @drm: the drivers _device > > > > * @name: the name of the GPU VA space > > > > * @start_offset: the start offset of the GPU VA space > > > > * @range: the size of the GPU VA space > > > > @@ -669,6 +670,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr, > > > > */ > > > >void > > > >drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, > > > > + struct drm_device *drm, > > > >const char *name, > > > >u64 start_offset, u64 range, > > > >u64 reserve_offset, u64 reserve_range, > > > > @@ -677,6 +679,11 @@ drm_gpuva_manager_init(struct drm_gpuva_manager > > > > *mgr, > > > > mgr->rb.tree = RB_ROOT_CACHED; > > > > INIT_LIST_HEAD(>rb.list); > > > > + mt_init(>mt_ext); > > > > + > > > > + INIT_LIST_HEAD(>evict.list); > > > > + spin_lock_init(>evict.lock); > > > > + > > > > drm_gpuva_check_overflow(start_offset, range); > > > > mgr->mm_start = start_offset; > > > > mgr->mm_range = range; > > > > @@ -694,6 +701,9 @@ drm_gpuva_manager_init(struct drm_gpuva_manager > > > > *mgr, > > > > reserve_range))) > > > > __drm_gpuva_insert(mgr, > > > > >kernel_alloc_node); > > > > } > > > > + > > > > + drm_gem_private_object_init(drm, >d_obj, 0); > > > > + mgr->resv = mgr->d_obj.resv; > > > >} > > > >EXPORT_SYMBOL_GPL(drm_gpuva_manager_init); > > > > @@ -713,10 +723,575 @@ drm_gpuva_manager_destroy(struct > > > > drm_gpuva_manager *mgr) > > > > __drm_gpuva_remove(>kernel_alloc_node); > > > > WARN(!RB_EMPTY_ROOT(>rb.tree.rb_root), > > > > -"GPUVA tree is not empty, potentially leaking memory."); > > > > +"GPUVA tree is not empty, potentially leaking memory.\n"); > > > > + > > > > + mtree_destroy(>mt_ext); > > > > + WARN(!list_empty(>evict.list), "Evict list should be > > > > empty.\n"); > > > > + > > > > + drm_gem_private_object_fini(>d_obj); > > > >} > > > >
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
On 8/30/23 14:49, Danilo Krummrich wrote: Hi Thomas, thanks for having a look! On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) wrote: Hi, Danilo. Some quick comments since I'm doing some Xe work in this area. Will probably get back with more. On 8/20/23 23:53, Danilo Krummrich wrote: So far the DRM GPUVA manager offers common infrastructure to track GPU VA allocations and mappings, generically connect GPU VA mappings to their backing buffers and perform more complex mapping operations on the GPU VA space. However, there are more design patterns commonly used by drivers, which can potentially be generalized in order to make the DRM GPUVA manager represent a basic GPU-VM implementation. In this context, this patch aims at generalizing the following elements. 1) Provide a common dma-resv for GEM objects not being used outside of this GPU-VM. 2) Provide tracking of external GEM objects (GEM objects which are shared with other GPU-VMs). 3) Provide functions to efficiently lock all GEM objects dma-resv the GPU-VM contains mappings of. 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings of, such that validation of evicted GEM objects is accelerated. 5) Provide some convinience functions for common patterns. Rather than being designed as a "framework", the target is to make all features appear as a collection of optional helper functions, such that drivers are free to make use of the DRM GPUVA managers basic functionality and opt-in for other features without setting any feature flags, just by making use of the corresponding functions. Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/drm_gpuva_mgr.c | 688 +++- include/drm/drm_gem.h | 48 ++- include/drm/drm_gpuva_mgr.h | 302 +- 3 files changed, 1010 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c index f86bfad74ff8..69872b205961 100644 --- a/drivers/gpu/drm/drm_gpuva_mgr.c +++ b/drivers/gpu/drm/drm_gpuva_mgr.c @@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr, /** * drm_gpuva_manager_init() - initialize a _gpuva_manager * @mgr: pointer to the _gpuva_manager to initialize + * @drm: the drivers _device * @name: the name of the GPU VA space * @start_offset: the start offset of the GPU VA space * @range: the size of the GPU VA space @@ -669,6 +670,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr, */ void drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, + struct drm_device *drm, const char *name, u64 start_offset, u64 range, u64 reserve_offset, u64 reserve_range, @@ -677,6 +679,11 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, mgr->rb.tree = RB_ROOT_CACHED; INIT_LIST_HEAD(>rb.list); + mt_init(>mt_ext); + + INIT_LIST_HEAD(>evict.list); + spin_lock_init(>evict.lock); + drm_gpuva_check_overflow(start_offset, range); mgr->mm_start = start_offset; mgr->mm_range = range; @@ -694,6 +701,9 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, reserve_range))) __drm_gpuva_insert(mgr, >kernel_alloc_node); } + + drm_gem_private_object_init(drm, >d_obj, 0); + mgr->resv = mgr->d_obj.resv; } EXPORT_SYMBOL_GPL(drm_gpuva_manager_init); @@ -713,10 +723,575 @@ drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr) __drm_gpuva_remove(>kernel_alloc_node); WARN(!RB_EMPTY_ROOT(>rb.tree.rb_root), -"GPUVA tree is not empty, potentially leaking memory."); +"GPUVA tree is not empty, potentially leaking memory.\n"); + + mtree_destroy(>mt_ext); + WARN(!list_empty(>evict.list), "Evict list should be empty.\n"); + + drm_gem_private_object_fini(>d_obj); } EXPORT_SYMBOL_GPL(drm_gpuva_manager_destroy); +/** + * drm_gpuva_manager_prepare_objects() - prepare all assoiciated BOs + * @mgr: the _gpuva_manager + * @num_fences: the amount of _fences to reserve + * + * Calls drm_exec_prepare_obj() for all _gem_objects the given + * _gpuva_manager contains mappings of. + * + * Drivers can obtain the corresponding _exec instance through + * DRM_GPUVA_EXEC(). It is the drivers responsibility to call drm_exec_init() + * and drm_exec_fini() accordingly. + * + * Returns: 0 on success, negative error code on failure. + */ +int +drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr, + unsigned int num_fences) +{ + struct drm_exec *exec = DRM_GPUVA_EXEC(mgr); + MA_STATE(mas, >mt_ext, 0, 0); + union { + void *ptr; + uintptr_t cnt; + } ref; + int ret; + + ret = drm_exec_prepare_obj(exec, >d_obj, num_fences); +
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
On Wed, Aug 30, 2023 at 09:48:02AM +0200, Christian König wrote: > > > Am 20.08.23 um 23:53 schrieb Danilo Krummrich: > > So far the DRM GPUVA manager offers common infrastructure to track GPU VA > > allocations and mappings, generically connect GPU VA mappings to their > > backing buffers and perform more complex mapping operations on the GPU VA > > space. > > > > However, there are more design patterns commonly used by drivers, which > > can potentially be generalized in order to make the DRM GPUVA manager > > represent a basic GPU-VM implementation. In this context, this patch aims > > at generalizing the following elements. > > > > 1) Provide a common dma-resv for GEM objects not being used outside of > > this GPU-VM. > > > > 2) Provide tracking of external GEM objects (GEM objects which are > > shared with other GPU-VMs). > > > > 3) Provide functions to efficiently lock all GEM objects dma-resv the > > GPU-VM contains mappings of. > > > > 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings > > of, such that validation of evicted GEM objects is accelerated. > > > > 5) Provide some convinience functions for common patterns. > > Interesting work. > > You basically implement a bunch of the ideas I came up to improve the amdgpu > performance in the common manager now. The was one of the remaining blockers > I had for using this in amdgpu. > > Question is for example how do you track evictions? E.g. we don't have a > common concept of eviction in GEM as far as I know. Or is the driver > responsible for giving those notifications to the GPUVA manager? Right, it is the driver being responsible to adding a drm_gpuva_gem (or VM_BO) to the managers evict list. The idea was that drivers have control about the state of a drm_gpuva_gem, such that a driver can move it to driver specific lists as well, like all the ones you have in amdgpu. > > And would it be possible to lock only a specific area of the VM, e.g. every > BO mapped in the interval X..Y? Currently, the drm_gpuva_manager_lock() functions always lock the GPU-VMs dma-resv lock, plus all the dma-resv locks of the external objects the manager keeps track of. But surely, we could also add something like drm_gpuva_manager_lock_range() where we just iterate all drm_gpuvas between X and Y and lock the dma-resv locks of each drm_gpuva's backing BO. > > Regards, > Christian. > > > > > Rather than being designed as a "framework", the target is to make all > > features appear as a collection of optional helper functions, such that > > drivers are free to make use of the DRM GPUVA managers basic > > functionality and opt-in for other features without setting any feature > > flags, just by making use of the corresponding functions. > > > > Signed-off-by: Danilo Krummrich > > --- > > drivers/gpu/drm/drm_gpuva_mgr.c | 688 +++- > > include/drm/drm_gem.h | 48 ++- > > include/drm/drm_gpuva_mgr.h | 302 +- > > 3 files changed, 1010 insertions(+), 28 deletions(-) > > > > diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c > > b/drivers/gpu/drm/drm_gpuva_mgr.c > > index f86bfad74ff8..69872b205961 100644 > > --- a/drivers/gpu/drm/drm_gpuva_mgr.c > > +++ b/drivers/gpu/drm/drm_gpuva_mgr.c > > @@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr, > > /** > >* drm_gpuva_manager_init() - initialize a _gpuva_manager > >* @mgr: pointer to the _gpuva_manager to initialize > > + * @drm: the drivers _device > >* @name: the name of the GPU VA space > >* @start_offset: the start offset of the GPU VA space > >* @range: the size of the GPU VA space > > @@ -669,6 +670,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr, > >*/ > > void > > drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, > > + struct drm_device *drm, > >const char *name, > >u64 start_offset, u64 range, > >u64 reserve_offset, u64 reserve_range, > > @@ -677,6 +679,11 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, > > mgr->rb.tree = RB_ROOT_CACHED; > > INIT_LIST_HEAD(>rb.list); > > + mt_init(>mt_ext); > > + > > + INIT_LIST_HEAD(>evict.list); > > + spin_lock_init(>evict.lock); > > + > > drm_gpuva_check_overflow(start_offset, range); > > mgr->mm_start = start_offset; > > mgr->mm_range = range; > > @@ -694,6 +701,9 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, > > reserve_range))) > > __drm_gpuva_insert(mgr, >kernel_alloc_node); > > } > > + > > + drm_gem_private_object_init(drm, >d_obj, 0); > > + mgr->resv = mgr->d_obj.resv; > > } > > EXPORT_SYMBOL_GPL(drm_gpuva_manager_init); > > @@ -713,10 +723,575 @@ drm_gpuva_manager_destroy(struct drm_gpuva_manager > > *mgr) > > __drm_gpuva_remove(>kernel_alloc_node); > >
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
Hi Thomas, thanks for having a look! On Wed, Aug 30, 2023 at 09:27:45AM +0200, Thomas Hellström (Intel) wrote: > Hi, Danilo. > > Some quick comments since I'm doing some Xe work in this area. Will probably > get back with more. > > On 8/20/23 23:53, Danilo Krummrich wrote: > > So far the DRM GPUVA manager offers common infrastructure to track GPU VA > > allocations and mappings, generically connect GPU VA mappings to their > > backing buffers and perform more complex mapping operations on the GPU VA > > space. > > > > However, there are more design patterns commonly used by drivers, which > > can potentially be generalized in order to make the DRM GPUVA manager > > represent a basic GPU-VM implementation. In this context, this patch aims > > at generalizing the following elements. > > > > 1) Provide a common dma-resv for GEM objects not being used outside of > > this GPU-VM. > > > > 2) Provide tracking of external GEM objects (GEM objects which are > > shared with other GPU-VMs). > > > > 3) Provide functions to efficiently lock all GEM objects dma-resv the > > GPU-VM contains mappings of. > > > > 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings > > of, such that validation of evicted GEM objects is accelerated. > > > > 5) Provide some convinience functions for common patterns. > > > > Rather than being designed as a "framework", the target is to make all > > features appear as a collection of optional helper functions, such that > > drivers are free to make use of the DRM GPUVA managers basic > > functionality and opt-in for other features without setting any feature > > flags, just by making use of the corresponding functions. > > > > Signed-off-by: Danilo Krummrich > > --- > > drivers/gpu/drm/drm_gpuva_mgr.c | 688 +++- > > include/drm/drm_gem.h | 48 ++- > > include/drm/drm_gpuva_mgr.h | 302 +- > > 3 files changed, 1010 insertions(+), 28 deletions(-) > > > > diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c > > b/drivers/gpu/drm/drm_gpuva_mgr.c > > index f86bfad74ff8..69872b205961 100644 > > --- a/drivers/gpu/drm/drm_gpuva_mgr.c > > +++ b/drivers/gpu/drm/drm_gpuva_mgr.c > > @@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr, > > /** > >* drm_gpuva_manager_init() - initialize a _gpuva_manager > >* @mgr: pointer to the _gpuva_manager to initialize > > + * @drm: the drivers _device > >* @name: the name of the GPU VA space > >* @start_offset: the start offset of the GPU VA space > >* @range: the size of the GPU VA space > > @@ -669,6 +670,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr, > >*/ > > void > > drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, > > + struct drm_device *drm, > >const char *name, > >u64 start_offset, u64 range, > >u64 reserve_offset, u64 reserve_range, > > @@ -677,6 +679,11 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, > > mgr->rb.tree = RB_ROOT_CACHED; > > INIT_LIST_HEAD(>rb.list); > > + mt_init(>mt_ext); > > + > > + INIT_LIST_HEAD(>evict.list); > > + spin_lock_init(>evict.lock); > > + > > drm_gpuva_check_overflow(start_offset, range); > > mgr->mm_start = start_offset; > > mgr->mm_range = range; > > @@ -694,6 +701,9 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, > > reserve_range))) > > __drm_gpuva_insert(mgr, >kernel_alloc_node); > > } > > + > > + drm_gem_private_object_init(drm, >d_obj, 0); > > + mgr->resv = mgr->d_obj.resv; > > } > > EXPORT_SYMBOL_GPL(drm_gpuva_manager_init); > > @@ -713,10 +723,575 @@ drm_gpuva_manager_destroy(struct drm_gpuva_manager > > *mgr) > > __drm_gpuva_remove(>kernel_alloc_node); > > WARN(!RB_EMPTY_ROOT(>rb.tree.rb_root), > > -"GPUVA tree is not empty, potentially leaking memory."); > > +"GPUVA tree is not empty, potentially leaking memory.\n"); > > + > > + mtree_destroy(>mt_ext); > > + WARN(!list_empty(>evict.list), "Evict list should be empty.\n"); > > + > > + drm_gem_private_object_fini(>d_obj); > > } > > EXPORT_SYMBOL_GPL(drm_gpuva_manager_destroy); > > +/** > > + * drm_gpuva_manager_prepare_objects() - prepare all assoiciated BOs > > + * @mgr: the _gpuva_manager > > + * @num_fences: the amount of _fences to reserve > > + * > > + * Calls drm_exec_prepare_obj() for all _gem_objects the given > > + * _gpuva_manager contains mappings of. > > + * > > + * Drivers can obtain the corresponding _exec instance through > > + * DRM_GPUVA_EXEC(). It is the drivers responsibility to call > > drm_exec_init() > > + * and drm_exec_fini() accordingly. > > + * > > + * Returns: 0 on success, negative error code on failure. > > + */ > > +int > > +drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr, > > + unsigned
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
Am 20.08.23 um 23:53 schrieb Danilo Krummrich: So far the DRM GPUVA manager offers common infrastructure to track GPU VA allocations and mappings, generically connect GPU VA mappings to their backing buffers and perform more complex mapping operations on the GPU VA space. However, there are more design patterns commonly used by drivers, which can potentially be generalized in order to make the DRM GPUVA manager represent a basic GPU-VM implementation. In this context, this patch aims at generalizing the following elements. 1) Provide a common dma-resv for GEM objects not being used outside of this GPU-VM. 2) Provide tracking of external GEM objects (GEM objects which are shared with other GPU-VMs). 3) Provide functions to efficiently lock all GEM objects dma-resv the GPU-VM contains mappings of. 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings of, such that validation of evicted GEM objects is accelerated. 5) Provide some convinience functions for common patterns. Interesting work. You basically implement a bunch of the ideas I came up to improve the amdgpu performance in the common manager now. The was one of the remaining blockers I had for using this in amdgpu. Question is for example how do you track evictions? E.g. we don't have a common concept of eviction in GEM as far as I know. Or is the driver responsible for giving those notifications to the GPUVA manager? And would it be possible to lock only a specific area of the VM, e.g. every BO mapped in the interval X..Y? Regards, Christian. Rather than being designed as a "framework", the target is to make all features appear as a collection of optional helper functions, such that drivers are free to make use of the DRM GPUVA managers basic functionality and opt-in for other features without setting any feature flags, just by making use of the corresponding functions. Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/drm_gpuva_mgr.c | 688 +++- include/drm/drm_gem.h | 48 ++- include/drm/drm_gpuva_mgr.h | 302 +- 3 files changed, 1010 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c index f86bfad74ff8..69872b205961 100644 --- a/drivers/gpu/drm/drm_gpuva_mgr.c +++ b/drivers/gpu/drm/drm_gpuva_mgr.c @@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr, /** * drm_gpuva_manager_init() - initialize a _gpuva_manager * @mgr: pointer to the _gpuva_manager to initialize + * @drm: the drivers _device * @name: the name of the GPU VA space * @start_offset: the start offset of the GPU VA space * @range: the size of the GPU VA space @@ -669,6 +670,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr, */ void drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, + struct drm_device *drm, const char *name, u64 start_offset, u64 range, u64 reserve_offset, u64 reserve_range, @@ -677,6 +679,11 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, mgr->rb.tree = RB_ROOT_CACHED; INIT_LIST_HEAD(>rb.list); + mt_init(>mt_ext); + + INIT_LIST_HEAD(>evict.list); + spin_lock_init(>evict.lock); + drm_gpuva_check_overflow(start_offset, range); mgr->mm_start = start_offset; mgr->mm_range = range; @@ -694,6 +701,9 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, reserve_range))) __drm_gpuva_insert(mgr, >kernel_alloc_node); } + + drm_gem_private_object_init(drm, >d_obj, 0); + mgr->resv = mgr->d_obj.resv; } EXPORT_SYMBOL_GPL(drm_gpuva_manager_init); @@ -713,10 +723,575 @@ drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr) __drm_gpuva_remove(>kernel_alloc_node); WARN(!RB_EMPTY_ROOT(>rb.tree.rb_root), -"GPUVA tree is not empty, potentially leaking memory."); +"GPUVA tree is not empty, potentially leaking memory.\n"); + + mtree_destroy(>mt_ext); + WARN(!list_empty(>evict.list), "Evict list should be empty.\n"); + + drm_gem_private_object_fini(>d_obj); } EXPORT_SYMBOL_GPL(drm_gpuva_manager_destroy); +/** + * drm_gpuva_manager_prepare_objects() - prepare all assoiciated BOs + * @mgr: the _gpuva_manager + * @num_fences: the amount of _fences to reserve + * + * Calls drm_exec_prepare_obj() for all _gem_objects the given + * _gpuva_manager contains mappings of. + * + * Drivers can obtain the corresponding _exec instance through + * DRM_GPUVA_EXEC(). It is the drivers responsibility to call drm_exec_init() + * and drm_exec_fini() accordingly. + * + * Returns: 0 on success, negative error code on failure. + */ +int +drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr, + unsigned int
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
Hi, Danilo. Some quick comments since I'm doing some Xe work in this area. Will probably get back with more. On 8/20/23 23:53, Danilo Krummrich wrote: So far the DRM GPUVA manager offers common infrastructure to track GPU VA allocations and mappings, generically connect GPU VA mappings to their backing buffers and perform more complex mapping operations on the GPU VA space. However, there are more design patterns commonly used by drivers, which can potentially be generalized in order to make the DRM GPUVA manager represent a basic GPU-VM implementation. In this context, this patch aims at generalizing the following elements. 1) Provide a common dma-resv for GEM objects not being used outside of this GPU-VM. 2) Provide tracking of external GEM objects (GEM objects which are shared with other GPU-VMs). 3) Provide functions to efficiently lock all GEM objects dma-resv the GPU-VM contains mappings of. 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings of, such that validation of evicted GEM objects is accelerated. 5) Provide some convinience functions for common patterns. Rather than being designed as a "framework", the target is to make all features appear as a collection of optional helper functions, such that drivers are free to make use of the DRM GPUVA managers basic functionality and opt-in for other features without setting any feature flags, just by making use of the corresponding functions. Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/drm_gpuva_mgr.c | 688 +++- include/drm/drm_gem.h | 48 ++- include/drm/drm_gpuva_mgr.h | 302 +- 3 files changed, 1010 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c index f86bfad74ff8..69872b205961 100644 --- a/drivers/gpu/drm/drm_gpuva_mgr.c +++ b/drivers/gpu/drm/drm_gpuva_mgr.c @@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr, /** * drm_gpuva_manager_init() - initialize a _gpuva_manager * @mgr: pointer to the _gpuva_manager to initialize + * @drm: the drivers _device * @name: the name of the GPU VA space * @start_offset: the start offset of the GPU VA space * @range: the size of the GPU VA space @@ -669,6 +670,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr, */ void drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, + struct drm_device *drm, const char *name, u64 start_offset, u64 range, u64 reserve_offset, u64 reserve_range, @@ -677,6 +679,11 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, mgr->rb.tree = RB_ROOT_CACHED; INIT_LIST_HEAD(>rb.list); + mt_init(>mt_ext); + + INIT_LIST_HEAD(>evict.list); + spin_lock_init(>evict.lock); + drm_gpuva_check_overflow(start_offset, range); mgr->mm_start = start_offset; mgr->mm_range = range; @@ -694,6 +701,9 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, reserve_range))) __drm_gpuva_insert(mgr, >kernel_alloc_node); } + + drm_gem_private_object_init(drm, >d_obj, 0); + mgr->resv = mgr->d_obj.resv; } EXPORT_SYMBOL_GPL(drm_gpuva_manager_init); @@ -713,10 +723,575 @@ drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr) __drm_gpuva_remove(>kernel_alloc_node); WARN(!RB_EMPTY_ROOT(>rb.tree.rb_root), -"GPUVA tree is not empty, potentially leaking memory."); +"GPUVA tree is not empty, potentially leaking memory.\n"); + + mtree_destroy(>mt_ext); + WARN(!list_empty(>evict.list), "Evict list should be empty.\n"); + + drm_gem_private_object_fini(>d_obj); } EXPORT_SYMBOL_GPL(drm_gpuva_manager_destroy); +/** + * drm_gpuva_manager_prepare_objects() - prepare all assoiciated BOs + * @mgr: the _gpuva_manager + * @num_fences: the amount of _fences to reserve + * + * Calls drm_exec_prepare_obj() for all _gem_objects the given + * _gpuva_manager contains mappings of. + * + * Drivers can obtain the corresponding _exec instance through + * DRM_GPUVA_EXEC(). It is the drivers responsibility to call drm_exec_init() + * and drm_exec_fini() accordingly. + * + * Returns: 0 on success, negative error code on failure. + */ +int +drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr, + unsigned int num_fences) +{ + struct drm_exec *exec = DRM_GPUVA_EXEC(mgr); + MA_STATE(mas, >mt_ext, 0, 0); + union { + void *ptr; + uintptr_t cnt; + } ref; + int ret; + + ret = drm_exec_prepare_obj(exec, >d_obj, num_fences); + if (ret) + goto out; + + rcu_read_lock(); In xe we're protecting the external object list with an outer lock, (same as protecting the mgr itself).
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
Hi Danilo, kernel test robot noticed the following build warnings: [auto build test WARNING on 25205087df1ffe06ccea9302944ed1f77dc68c6f] url: https://github.com/intel-lab-lkp/linux/commits/Danilo-Krummrich/drm-drm_exec-build-always-builtin/20230821-123143 base: 25205087df1ffe06ccea9302944ed1f77dc68c6f patch link:https://lore.kernel.org/r/20230820215320.4187-3-dakr%40redhat.com patch subject: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation config: arm-randconfig-r014-20230822 (https://download.01.org/0day-ci/archive/20230822/202308221050.ktj8ufma-...@intel.com/config) compiler: arm-linux-gnueabi-gcc (GCC) 12.3.0 reproduce: (https://download.01.org/0day-ci/archive/20230822/202308221050.ktj8ufma-...@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202308221050.ktj8ufma-...@intel.com/ All warnings (new ones prefixed by >>): >> drivers/gpu/drm/drm_gpuva_mgr.c:750:1: warning: no previous prototype for >> 'drm_gpuva_manager_prepare_objects' [-Wmissing-prototypes] 750 | drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr, | ^ drivers/gpu/drm/drm_gpuva_mgr.c: In function '__drm_gpuva_sm_map': drivers/gpu/drm/drm_gpuva_mgr.c:1744:39: warning: variable 'prev' set but not used [-Wunused-but-set-variable] 1744 | struct drm_gpuva *va, *next, *prev = NULL; | ^~~~ -- >> drivers/gpu/drm/drm_gpuva_mgr.c:1091: warning: Function parameter or member >> '__vm_bo' not described in 'drm_gpuva_gem_obtain_prealloc' vim +/drm_gpuva_manager_prepare_objects +750 drivers/gpu/drm/drm_gpuva_mgr.c 734 735 /** 736 * drm_gpuva_manager_prepare_objects() - prepare all assoiciated BOs 737 * @mgr: the _gpuva_manager 738 * @num_fences: the amount of _fences to reserve 739 * 740 * Calls drm_exec_prepare_obj() for all _gem_objects the given 741 * _gpuva_manager contains mappings of. 742 * 743 * Drivers can obtain the corresponding _exec instance through 744 * DRM_GPUVA_EXEC(). It is the drivers responsibility to call drm_exec_init() 745 * and drm_exec_fini() accordingly. 746 * 747 * Returns: 0 on success, negative error code on failure. 748 */ 749 int > 750 drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr, 751unsigned int num_fences) 752 { 753 struct drm_exec *exec = DRM_GPUVA_EXEC(mgr); 754 MA_STATE(mas, >mt_ext, 0, 0); 755 union { 756 void *ptr; 757 uintptr_t cnt; 758 } ref; 759 int ret; 760 761 ret = drm_exec_prepare_obj(exec, >d_obj, num_fences); 762 if (ret) 763 goto out; 764 765 rcu_read_lock(); 766 mas_for_each(, ref.ptr, ULONG_MAX) { 767 struct drm_gem_object *obj; 768 769 mas_pause(); 770 rcu_read_unlock(); 771 772 obj = (struct drm_gem_object *)(uintptr_t)mas.index; 773 ret = drm_exec_prepare_obj(exec, obj, num_fences); 774 if (ret) 775 goto out; 776 777 rcu_read_lock(); 778 } 779 rcu_read_unlock(); 780 781 out: 782 return ret; 783 } 784 EXPORT_SYMBOL_GPL(drm_gpuva_manager_prepare_objects); 785 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
Hi Danilo, kernel test robot noticed the following build warnings: [auto build test WARNING on 25205087df1ffe06ccea9302944ed1f77dc68c6f] url: https://github.com/intel-lab-lkp/linux/commits/Danilo-Krummrich/drm-drm_exec-build-always-builtin/20230821-123143 base: 25205087df1ffe06ccea9302944ed1f77dc68c6f patch link:https://lore.kernel.org/r/20230820215320.4187-3-dakr%40redhat.com patch subject: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation config: sparc-randconfig-r022-20230822 (https://download.01.org/0day-ci/archive/20230822/202308221021.jczejwoy-...@intel.com/config) compiler: sparc64-linux-gcc (GCC) 12.3.0 reproduce: (https://download.01.org/0day-ci/archive/20230822/202308221021.jczejwoy-...@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202308221021.jczejwoy-...@intel.com/ All warnings (new ones prefixed by >>): >> drivers/gpu/drm/drm_gpuva_mgr.c:750:1: warning: no previous prototype for >> 'drm_gpuva_manager_prepare_objects' [-Wmissing-prototypes] 750 | drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr, | ^ drivers/gpu/drm/drm_gpuva_mgr.c: In function '__drm_gpuva_sm_map': drivers/gpu/drm/drm_gpuva_mgr.c:1744:39: warning: variable 'prev' set but not used [-Wunused-but-set-variable] 1744 | struct drm_gpuva *va, *next, *prev = NULL; | ^~~~ vim +/drm_gpuva_manager_prepare_objects +750 drivers/gpu/drm/drm_gpuva_mgr.c 734 735 /** 736 * drm_gpuva_manager_prepare_objects() - prepare all assoiciated BOs 737 * @mgr: the _gpuva_manager 738 * @num_fences: the amount of _fences to reserve 739 * 740 * Calls drm_exec_prepare_obj() for all _gem_objects the given 741 * _gpuva_manager contains mappings of. 742 * 743 * Drivers can obtain the corresponding _exec instance through 744 * DRM_GPUVA_EXEC(). It is the drivers responsibility to call drm_exec_init() 745 * and drm_exec_fini() accordingly. 746 * 747 * Returns: 0 on success, negative error code on failure. 748 */ 749 int > 750 drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr, 751unsigned int num_fences) 752 { 753 struct drm_exec *exec = DRM_GPUVA_EXEC(mgr); 754 MA_STATE(mas, >mt_ext, 0, 0); 755 union { 756 void *ptr; 757 uintptr_t cnt; 758 } ref; 759 int ret; 760 761 ret = drm_exec_prepare_obj(exec, >d_obj, num_fences); 762 if (ret) 763 goto out; 764 765 rcu_read_lock(); 766 mas_for_each(, ref.ptr, ULONG_MAX) { 767 struct drm_gem_object *obj; 768 769 mas_pause(); 770 rcu_read_unlock(); 771 772 obj = (struct drm_gem_object *)(uintptr_t)mas.index; 773 ret = drm_exec_prepare_obj(exec, obj, num_fences); 774 if (ret) 775 goto out; 776 777 rcu_read_lock(); 778 } 779 rcu_read_unlock(); 780 781 out: 782 return ret; 783 } 784 EXPORT_SYMBOL_GPL(drm_gpuva_manager_prepare_objects); 785 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
Re: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
Hi Danilo, kernel test robot noticed the following build warnings: [auto build test WARNING on 25205087df1ffe06ccea9302944ed1f77dc68c6f] url: https://github.com/intel-lab-lkp/linux/commits/Danilo-Krummrich/drm-drm_exec-build-always-builtin/20230821-123143 base: 25205087df1ffe06ccea9302944ed1f77dc68c6f patch link:https://lore.kernel.org/r/20230820215320.4187-3-dakr%40redhat.com patch subject: [PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation config: i386-randconfig-r024-20230822 (https://download.01.org/0day-ci/archive/20230822/202308220935.ik8qpkf4-...@intel.com/config) compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07) reproduce: (https://download.01.org/0day-ci/archive/20230822/202308220935.ik8qpkf4-...@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot | Closes: https://lore.kernel.org/oe-kbuild-all/202308220935.ik8qpkf4-...@intel.com/ All warnings (new ones prefixed by >>): >> drivers/gpu/drm/drm_gpuva_mgr.c:750:1: warning: no previous prototype for >> function 'drm_gpuva_manager_prepare_objects' [-Wmissing-prototypes] drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr, ^ drivers/gpu/drm/drm_gpuva_mgr.c:749:1: note: declare 'static' if the function is not intended to be used outside of this translation unit int ^ static drivers/gpu/drm/drm_gpuva_mgr.c:1744:32: warning: variable 'prev' set but not used [-Wunused-but-set-variable] struct drm_gpuva *va, *next, *prev = NULL; ^ 2 warnings generated. -- >> drivers/gpu/drm/drm_gpuva_mgr.c:1091: warning: Function parameter or member >> '__vm_bo' not described in 'drm_gpuva_gem_obtain_prealloc' vim +/drm_gpuva_manager_prepare_objects +750 drivers/gpu/drm/drm_gpuva_mgr.c 734 735 /** 736 * drm_gpuva_manager_prepare_objects() - prepare all assoiciated BOs 737 * @mgr: the _gpuva_manager 738 * @num_fences: the amount of _fences to reserve 739 * 740 * Calls drm_exec_prepare_obj() for all _gem_objects the given 741 * _gpuva_manager contains mappings of. 742 * 743 * Drivers can obtain the corresponding _exec instance through 744 * DRM_GPUVA_EXEC(). It is the drivers responsibility to call drm_exec_init() 745 * and drm_exec_fini() accordingly. 746 * 747 * Returns: 0 on success, negative error code on failure. 748 */ 749 int > 750 drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr, 751unsigned int num_fences) 752 { 753 struct drm_exec *exec = DRM_GPUVA_EXEC(mgr); 754 MA_STATE(mas, >mt_ext, 0, 0); 755 union { 756 void *ptr; 757 uintptr_t cnt; 758 } ref; 759 int ret; 760 761 ret = drm_exec_prepare_obj(exec, >d_obj, num_fences); 762 if (ret) 763 goto out; 764 765 rcu_read_lock(); 766 mas_for_each(, ref.ptr, ULONG_MAX) { 767 struct drm_gem_object *obj; 768 769 mas_pause(); 770 rcu_read_unlock(); 771 772 obj = (struct drm_gem_object *)(uintptr_t)mas.index; 773 ret = drm_exec_prepare_obj(exec, obj, num_fences); 774 if (ret) 775 goto out; 776 777 rcu_read_lock(); 778 } 779 rcu_read_unlock(); 780 781 out: 782 return ret; 783 } 784 EXPORT_SYMBOL_GPL(drm_gpuva_manager_prepare_objects); 785 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
[PATCH drm-misc-next 2/3] drm/gpuva_mgr: generalize dma_resv/extobj handling and GEM validation
So far the DRM GPUVA manager offers common infrastructure to track GPU VA allocations and mappings, generically connect GPU VA mappings to their backing buffers and perform more complex mapping operations on the GPU VA space. However, there are more design patterns commonly used by drivers, which can potentially be generalized in order to make the DRM GPUVA manager represent a basic GPU-VM implementation. In this context, this patch aims at generalizing the following elements. 1) Provide a common dma-resv for GEM objects not being used outside of this GPU-VM. 2) Provide tracking of external GEM objects (GEM objects which are shared with other GPU-VMs). 3) Provide functions to efficiently lock all GEM objects dma-resv the GPU-VM contains mappings of. 4) Provide tracking of evicted GEM objects the GPU-VM contains mappings of, such that validation of evicted GEM objects is accelerated. 5) Provide some convinience functions for common patterns. Rather than being designed as a "framework", the target is to make all features appear as a collection of optional helper functions, such that drivers are free to make use of the DRM GPUVA managers basic functionality and opt-in for other features without setting any feature flags, just by making use of the corresponding functions. Signed-off-by: Danilo Krummrich --- drivers/gpu/drm/drm_gpuva_mgr.c | 688 +++- include/drm/drm_gem.h | 48 ++- include/drm/drm_gpuva_mgr.h | 302 +- 3 files changed, 1010 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c index f86bfad74ff8..69872b205961 100644 --- a/drivers/gpu/drm/drm_gpuva_mgr.c +++ b/drivers/gpu/drm/drm_gpuva_mgr.c @@ -655,6 +655,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr, /** * drm_gpuva_manager_init() - initialize a _gpuva_manager * @mgr: pointer to the _gpuva_manager to initialize + * @drm: the drivers _device * @name: the name of the GPU VA space * @start_offset: the start offset of the GPU VA space * @range: the size of the GPU VA space @@ -669,6 +670,7 @@ drm_gpuva_range_valid(struct drm_gpuva_manager *mgr, */ void drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, + struct drm_device *drm, const char *name, u64 start_offset, u64 range, u64 reserve_offset, u64 reserve_range, @@ -677,6 +679,11 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, mgr->rb.tree = RB_ROOT_CACHED; INIT_LIST_HEAD(>rb.list); + mt_init(>mt_ext); + + INIT_LIST_HEAD(>evict.list); + spin_lock_init(>evict.lock); + drm_gpuva_check_overflow(start_offset, range); mgr->mm_start = start_offset; mgr->mm_range = range; @@ -694,6 +701,9 @@ drm_gpuva_manager_init(struct drm_gpuva_manager *mgr, reserve_range))) __drm_gpuva_insert(mgr, >kernel_alloc_node); } + + drm_gem_private_object_init(drm, >d_obj, 0); + mgr->resv = mgr->d_obj.resv; } EXPORT_SYMBOL_GPL(drm_gpuva_manager_init); @@ -713,10 +723,575 @@ drm_gpuva_manager_destroy(struct drm_gpuva_manager *mgr) __drm_gpuva_remove(>kernel_alloc_node); WARN(!RB_EMPTY_ROOT(>rb.tree.rb_root), -"GPUVA tree is not empty, potentially leaking memory."); +"GPUVA tree is not empty, potentially leaking memory.\n"); + + mtree_destroy(>mt_ext); + WARN(!list_empty(>evict.list), "Evict list should be empty.\n"); + + drm_gem_private_object_fini(>d_obj); } EXPORT_SYMBOL_GPL(drm_gpuva_manager_destroy); +/** + * drm_gpuva_manager_prepare_objects() - prepare all assoiciated BOs + * @mgr: the _gpuva_manager + * @num_fences: the amount of _fences to reserve + * + * Calls drm_exec_prepare_obj() for all _gem_objects the given + * _gpuva_manager contains mappings of. + * + * Drivers can obtain the corresponding _exec instance through + * DRM_GPUVA_EXEC(). It is the drivers responsibility to call drm_exec_init() + * and drm_exec_fini() accordingly. + * + * Returns: 0 on success, negative error code on failure. + */ +int +drm_gpuva_manager_prepare_objects(struct drm_gpuva_manager *mgr, + unsigned int num_fences) +{ + struct drm_exec *exec = DRM_GPUVA_EXEC(mgr); + MA_STATE(mas, >mt_ext, 0, 0); + union { + void *ptr; + uintptr_t cnt; + } ref; + int ret; + + ret = drm_exec_prepare_obj(exec, >d_obj, num_fences); + if (ret) + goto out; + + rcu_read_lock(); + mas_for_each(, ref.ptr, ULONG_MAX) { + struct drm_gem_object *obj; + + mas_pause(); + rcu_read_unlock(); + + obj = (struct drm_gem_object *)(uintptr_t)mas.index; + ret = drm_exec_prepare_obj(exec, obj,