Re: [PATCH 1/9] drm: execution context for GEM buffers v3
Hi Christian, hi Thomas, On Fri, Mar 10, 2023 at 11:42:56AM +0100, Thomas Hellström (Intel) wrote: > Nice. This seems to have all we need for now for Xe as well, although not > for i915 ATM. A series to use drm_exec in Xe has been sent for review: https://patchwork.freedesktop.org/series/116477/ Thanks, Francois
Re: [PATCH 1/9] drm: execution context for GEM buffers v3
Hi Christian On 2/28/23 09:33, Christian König wrote: This adds the infrastructure for an execution context for GEM buffers which is similar to the existinc TTMs execbuf util and intended to replace it in the long term. The basic functionality is that we abstracts the necessary loop to lock many different GEM buffers with automated deadlock and duplicate handling. v2: drop xarray and use dynamic resized array instead, the locking overhead is unecessary and measureable. v3: drop duplicate tracking, radeon is really the only one needing that. Signed-off-by: Christian König Nice. This seems to have all we need for now for Xe as well, although not for i915 ATM. I see future exension of this for things like sleeping lock evictions (where locks may go out-of-scope during the locking loop), including other resv containers, passing through dma-buf move_notify() etc, what are your thoughts around that? Extend this or build a different functionality and make this a specialization of that? What about the drm_gem_lock_reservations() interface? While a bit less capable it's confusing having two apis doing essentially the same thing, could we deprecate that? Finally a comment below: --- Documentation/gpu/drm-mm.rst | 12 ++ drivers/gpu/drm/Kconfig | 6 + drivers/gpu/drm/Makefile | 2 + drivers/gpu/drm/drm_exec.c | 249 +++ include/drm/drm_exec.h | 115 5 files changed, 384 insertions(+) create mode 100644 drivers/gpu/drm/drm_exec.c create mode 100644 include/drm/drm_exec.h diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst index a79fd3549ff8..a52e6f4117d6 100644 --- a/Documentation/gpu/drm-mm.rst +++ b/Documentation/gpu/drm-mm.rst @@ -493,6 +493,18 @@ DRM Sync Objects .. kernel-doc:: drivers/gpu/drm/drm_syncobj.c :export: +DRM Execution context += + +.. kernel-doc:: drivers/gpu/drm/drm_exec.c + :doc: Overview + +.. kernel-doc:: include/drm/drm_exec.h + :internal: + +.. kernel-doc:: drivers/gpu/drm/drm_exec.c + :export: + GPU Scheduler = diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 17d252dc25e2..84a5fc28c48d 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -200,6 +200,12 @@ config DRM_TTM GPU memory types. Will be enabled automatically if a device driver uses it. +config DRM_EXEC + tristate + depends on DRM + help + Execution context for command submissions + config DRM_BUDDY tristate depends on DRM diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index ab4460fcd63f..d40defbb0347 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -78,6 +78,8 @@ obj-$(CONFIG_DRM_PANEL_ORIENTATION_QUIRKS) += drm_panel_orientation_quirks.o # # Memory-management helpers # +# +obj-$(CONFIG_DRM_EXEC) += drm_exec.o obj-$(CONFIG_DRM_BUDDY) += drm_buddy.o diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c new file mode 100644 index ..df546cc5a227 --- /dev/null +++ b/drivers/gpu/drm/drm_exec.c @@ -0,0 +1,249 @@ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ + +#include +#include +#include + +/** + * DOC: Overview + * + * This component mainly abstracts the retry loop necessary for locking + * multiple GEM objects while preparing hardware operations (e.g. command + * submissions, page table updates etc..). + * + * If a contention is detected while locking a GEM object the cleanup procedure + * unlocks all previously locked GEM objects and locks the contended one first + * before locking any further objects. + * + * After an object is locked fences slots can optionally be reserved on the + * dma_resv object inside the GEM object. + * + * A typical usage pattern should look like this:: + * + * struct drm_gem_object *obj; + * struct drm_exec exec; + * unsigned long index; + * int ret; + * + * drm_exec_init(&exec, true); + * drm_exec_while_not_all_locked(&exec) { + * ret = drm_exec_prepare_obj(&exec, boA, 1); + * drm_exec_continue_on_contention(&exec); + * if (ret) + * goto error; + * + * ret = drm_exec_lock(&exec, boB, 1); Should be drm_exec_prepare_obj()? + * drm_exec_continue_on_contention(&exec); + * if (ret) + * goto error; + * } + * + * drm_exec_for_each_locked_object(&exec, index, obj) { + * dma_resv_add_fence(obj->resv, fence, DMA_RESV_USAGE_READ); + * ... + * } + * drm_exec_fini(&exec); + * + * See struct dma_exec for more details. + */ + +/* Dummy value used to initially enter the retry loop */ +#define DRM_EXEC_DUMMY (void*)~0 + +/* Unlock all objects and drop references */ +static void drm_exec_unlock_all(struct drm_exec *exec) +{ + struct drm_gem_object *obj; + unsign
Re: [PATCH 1/9] drm: execution context for GEM buffers v3
On 2023-02-28 03:33, Christian König wrote: > This adds the infrastructure for an execution context for GEM buffers > which is similar to the existinc TTMs execbuf util and intended to replace > it in the long term. > > The basic functionality is that we abstracts the necessary loop to lock > many different GEM buffers with automated deadlock and duplicate handling. > > v2: drop xarray and use dynamic resized array instead, the locking > overhead is unecessary and measureable. > v3: drop duplicate tracking, radeon is really the only one needing that. > > Signed-off-by: Christian König > --- > Documentation/gpu/drm-mm.rst | 12 ++ > drivers/gpu/drm/Kconfig | 6 + > drivers/gpu/drm/Makefile | 2 + > drivers/gpu/drm/drm_exec.c | 249 +++ > include/drm/drm_exec.h | 115 > 5 files changed, 384 insertions(+) > create mode 100644 drivers/gpu/drm/drm_exec.c > create mode 100644 include/drm/drm_exec.h > > diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst > index a79fd3549ff8..a52e6f4117d6 100644 > --- a/Documentation/gpu/drm-mm.rst > +++ b/Documentation/gpu/drm-mm.rst > @@ -493,6 +493,18 @@ DRM Sync Objects > .. kernel-doc:: drivers/gpu/drm/drm_syncobj.c > :export: > > +DRM Execution context > += > + > +.. kernel-doc:: drivers/gpu/drm/drm_exec.c > + :doc: Overview > + > +.. kernel-doc:: include/drm/drm_exec.h > + :internal: > + > +.. kernel-doc:: drivers/gpu/drm/drm_exec.c > + :export: > + > GPU Scheduler > = > > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig > index 17d252dc25e2..84a5fc28c48d 100644 > --- a/drivers/gpu/drm/Kconfig > +++ b/drivers/gpu/drm/Kconfig > @@ -200,6 +200,12 @@ config DRM_TTM > GPU memory types. Will be enabled automatically if a device driver > uses it. > > +config DRM_EXEC > + tristate > + depends on DRM > + help > + Execution context for command submissions > + > config DRM_BUDDY > tristate > depends on DRM > diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile > index ab4460fcd63f..d40defbb0347 100644 > --- a/drivers/gpu/drm/Makefile > +++ b/drivers/gpu/drm/Makefile > @@ -78,6 +78,8 @@ obj-$(CONFIG_DRM_PANEL_ORIENTATION_QUIRKS) += > drm_panel_orientation_quirks.o > # > # Memory-management helpers > # > +# > +obj-$(CONFIG_DRM_EXEC) += drm_exec.o > > obj-$(CONFIG_DRM_BUDDY) += drm_buddy.o > > diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c > new file mode 100644 > index ..df546cc5a227 > --- /dev/null > +++ b/drivers/gpu/drm/drm_exec.c > @@ -0,0 +1,249 @@ > +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ > + > +#include > +#include > +#include > + > +/** > + * DOC: Overview > + * > + * This component mainly abstracts the retry loop necessary for locking > + * multiple GEM objects while preparing hardware operations (e.g. command > + * submissions, page table updates etc..). > + * > + * If a contention is detected while locking a GEM object the cleanup > procedure > + * unlocks all previously locked GEM objects and locks the contended one > first > + * before locking any further objects. > + * > + * After an object is locked fences slots can optionally be reserved on the > + * dma_resv object inside the GEM object. > + * > + * A typical usage pattern should look like this:: > + * > + * struct drm_gem_object *obj; > + * struct drm_exec exec; > + * unsigned long index; > + * int ret; > + * > + * drm_exec_init(&exec, true); > + * drm_exec_while_not_all_locked(&exec) { > + * ret = drm_exec_prepare_obj(&exec, boA, 1); > + * drm_exec_continue_on_contention(&exec); > + * if (ret) > + * goto error; > + * > + * ret = drm_exec_lock(&exec, boB, 1); > + * drm_exec_continue_on_contention(&exec); > + * if (ret) > + * goto error; > + * } > + * > + * drm_exec_for_each_locked_object(&exec, index, obj) { > + * dma_resv_add_fence(obj->resv, fence, DMA_RESV_USAGE_READ); > + * ... > + * } > + * drm_exec_fini(&exec); Maybe add the error: label here to show how recovery is to be had. > + * > + * See struct dma_exec for more details. > + */ > + > +/* Dummy value used to initially enter the retry loop */ > +#define DRM_EXEC_DUMMY (void*)~0 > + > +/* Unlock all objects and drop references */ > +static void drm_exec_unlock_all(struct drm_exec *exec) > +{ > + struct drm_gem_object *obj; > + unsigned long index; > + > + drm_exec_for_each_locked_object(exec, index, obj) { > + dma_resv_unlock(obj->resv); > + drm_gem_object_put(obj); > + } > + > + if (exec->prelocked) { > + dma_resv_unlock(exec->prelocked->resv); > + drm_gem_object_put(exec->prelocked); > + exec->prelocked = NULL; > + } > +} > + > +/** > + * drm_exec_init - initial
Re: [PATCH 1/9] drm: execution context for GEM buffers v3
On 2023-02-28 14:13, Danilo Krummrich wrote: > On 2/28/23 09:33, Christian König wrote: >> This adds the infrastructure for an execution context for GEM buffers >> which is similar to the existinc TTMs execbuf util and intended to replace > > "existing" > >> it in the long term. >> >> The basic functionality is that we abstracts the necessary loop to lock >> many different GEM buffers with automated deadlock and duplicate handling. >> >> v2: drop xarray and use dynamic resized array instead, the locking >> overhead is unecessary and measureable. > > "unecessary", "measurable" "unnecessary". -- Regards, Luben
Re: [PATCH 1/9] drm: execution context for GEM buffers v3
Am 28.02.23 um 20:13 schrieb Danilo Krummrich: [SNIP] + if (exec->prelocked) { + dma_resv_unlock(exec->prelocked->resv); + drm_gem_object_put(exec->prelocked); + exec->prelocked = NULL; + } Let's say we try to lock 3 objects A, B and C in chronological order and in the first "drm_exec_cleanup() iteration" C is contended. Firstly, we lock C in the next iteration. If now A or B is contended, we never set exec->prelocked to NULL in drm_exec_prepare_obj(), since we did not yet reach C. Hence, this causes a double unlock, since the prelocked object is also unlocked in the above loop. Maybe I miss a detail, but to me it looks like setting exec->prelocked to NULL and dropping the reference should be enough. Ah, yes of course. That wasn't correct and my test cases didn't covered it. Going to fix this and all the comments you pointed out and update the test cases, should be done by next week. Thanks, Christian.
Re: [PATCH 1/9] drm: execution context for GEM buffers v3
On 2/28/23 09:33, Christian König wrote: This adds the infrastructure for an execution context for GEM buffers which is similar to the existinc TTMs execbuf util and intended to replace "existing" it in the long term. The basic functionality is that we abstracts the necessary loop to lock many different GEM buffers with automated deadlock and duplicate handling. v2: drop xarray and use dynamic resized array instead, the locking overhead is unecessary and measureable. "unecessary", "measurable" v3: drop duplicate tracking, radeon is really the only one needing that. Signed-off-by: Christian König --- Documentation/gpu/drm-mm.rst | 12 ++ drivers/gpu/drm/Kconfig | 6 + drivers/gpu/drm/Makefile | 2 + drivers/gpu/drm/drm_exec.c | 249 +++ include/drm/drm_exec.h | 115 5 files changed, 384 insertions(+) create mode 100644 drivers/gpu/drm/drm_exec.c create mode 100644 include/drm/drm_exec.h diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst index a79fd3549ff8..a52e6f4117d6 100644 --- a/Documentation/gpu/drm-mm.rst +++ b/Documentation/gpu/drm-mm.rst @@ -493,6 +493,18 @@ DRM Sync Objects .. kernel-doc:: drivers/gpu/drm/drm_syncobj.c :export: +DRM Execution context += + +.. kernel-doc:: drivers/gpu/drm/drm_exec.c + :doc: Overview + +.. kernel-doc:: include/drm/drm_exec.h + :internal: + +.. kernel-doc:: drivers/gpu/drm/drm_exec.c + :export: + GPU Scheduler = diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 17d252dc25e2..84a5fc28c48d 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -200,6 +200,12 @@ config DRM_TTM GPU memory types. Will be enabled automatically if a device driver uses it. +config DRM_EXEC + tristate + depends on DRM + help + Execution context for command submissions + config DRM_BUDDY tristate depends on DRM diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index ab4460fcd63f..d40defbb0347 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -78,6 +78,8 @@ obj-$(CONFIG_DRM_PANEL_ORIENTATION_QUIRKS) += drm_panel_orientation_quirks.o # # Memory-management helpers # +# +obj-$(CONFIG_DRM_EXEC) += drm_exec.o obj-$(CONFIG_DRM_BUDDY) += drm_buddy.o diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c new file mode 100644 index ..df546cc5a227 --- /dev/null +++ b/drivers/gpu/drm/drm_exec.c @@ -0,0 +1,249 @@ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ + +#include +#include +#include + +/** + * DOC: Overview + * + * This component mainly abstracts the retry loop necessary for locking + * multiple GEM objects while preparing hardware operations (e.g. command + * submissions, page table updates etc..). + * + * If a contention is detected while locking a GEM object the cleanup procedure + * unlocks all previously locked GEM objects and locks the contended one first + * before locking any further objects. + * + * After an object is locked fences slots can optionally be reserved on the + * dma_resv object inside the GEM object. + * + * A typical usage pattern should look like this:: + * + * struct drm_gem_object *obj; + * struct drm_exec exec; + * unsigned long index; + * int ret; + * + * drm_exec_init(&exec, true); + * drm_exec_while_not_all_locked(&exec) { + * ret = drm_exec_prepare_obj(&exec, boA, 1); + * drm_exec_continue_on_contention(&exec); + * if (ret) + * goto error; + * + * ret = drm_exec_lock(&exec, boB, 1); This function doesn't seem to exist (anymore). + * drm_exec_continue_on_contention(&exec); + * if (ret) + * goto error; + * } + * + * drm_exec_for_each_locked_object(&exec, index, obj) { + * dma_resv_add_fence(obj->resv, fence, DMA_RESV_USAGE_READ); + * ... + * } + * drm_exec_fini(&exec); + * + * See struct dma_exec for more details. + */ + +/* Dummy value used to initially enter the retry loop */ +#define DRM_EXEC_DUMMY (void*)~0 + +/* Unlock all objects and drop references */ +static void drm_exec_unlock_all(struct drm_exec *exec) +{ + struct drm_gem_object *obj; + unsigned long index; + + drm_exec_for_each_locked_object(exec, index, obj) { + dma_resv_unlock(obj->resv); + drm_gem_object_put(obj); + } + + if (exec->prelocked) { + dma_resv_unlock(exec->prelocked->resv); + drm_gem_object_put(exec->prelocked); + exec->prelocked = NULL; + } Let's say we try to lock 3 objects A, B and C in chronological order and in the first "drm_exec_cleanup() iteration" C is contended. Firstly, we lock C in the next iteration. If now A or B is contended, we never set
[PATCH 1/9] drm: execution context for GEM buffers v3
This adds the infrastructure for an execution context for GEM buffers which is similar to the existinc TTMs execbuf util and intended to replace it in the long term. The basic functionality is that we abstracts the necessary loop to lock many different GEM buffers with automated deadlock and duplicate handling. v2: drop xarray and use dynamic resized array instead, the locking overhead is unecessary and measureable. v3: drop duplicate tracking, radeon is really the only one needing that. Signed-off-by: Christian König --- Documentation/gpu/drm-mm.rst | 12 ++ drivers/gpu/drm/Kconfig | 6 + drivers/gpu/drm/Makefile | 2 + drivers/gpu/drm/drm_exec.c | 249 +++ include/drm/drm_exec.h | 115 5 files changed, 384 insertions(+) create mode 100644 drivers/gpu/drm/drm_exec.c create mode 100644 include/drm/drm_exec.h diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst index a79fd3549ff8..a52e6f4117d6 100644 --- a/Documentation/gpu/drm-mm.rst +++ b/Documentation/gpu/drm-mm.rst @@ -493,6 +493,18 @@ DRM Sync Objects .. kernel-doc:: drivers/gpu/drm/drm_syncobj.c :export: +DRM Execution context += + +.. kernel-doc:: drivers/gpu/drm/drm_exec.c + :doc: Overview + +.. kernel-doc:: include/drm/drm_exec.h + :internal: + +.. kernel-doc:: drivers/gpu/drm/drm_exec.c + :export: + GPU Scheduler = diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 17d252dc25e2..84a5fc28c48d 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -200,6 +200,12 @@ config DRM_TTM GPU memory types. Will be enabled automatically if a device driver uses it. +config DRM_EXEC + tristate + depends on DRM + help + Execution context for command submissions + config DRM_BUDDY tristate depends on DRM diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile index ab4460fcd63f..d40defbb0347 100644 --- a/drivers/gpu/drm/Makefile +++ b/drivers/gpu/drm/Makefile @@ -78,6 +78,8 @@ obj-$(CONFIG_DRM_PANEL_ORIENTATION_QUIRKS) += drm_panel_orientation_quirks.o # # Memory-management helpers # +# +obj-$(CONFIG_DRM_EXEC) += drm_exec.o obj-$(CONFIG_DRM_BUDDY) += drm_buddy.o diff --git a/drivers/gpu/drm/drm_exec.c b/drivers/gpu/drm/drm_exec.c new file mode 100644 index ..df546cc5a227 --- /dev/null +++ b/drivers/gpu/drm/drm_exec.c @@ -0,0 +1,249 @@ +/* SPDX-License-Identifier: GPL-2.0 OR MIT */ + +#include +#include +#include + +/** + * DOC: Overview + * + * This component mainly abstracts the retry loop necessary for locking + * multiple GEM objects while preparing hardware operations (e.g. command + * submissions, page table updates etc..). + * + * If a contention is detected while locking a GEM object the cleanup procedure + * unlocks all previously locked GEM objects and locks the contended one first + * before locking any further objects. + * + * After an object is locked fences slots can optionally be reserved on the + * dma_resv object inside the GEM object. + * + * A typical usage pattern should look like this:: + * + * struct drm_gem_object *obj; + * struct drm_exec exec; + * unsigned long index; + * int ret; + * + * drm_exec_init(&exec, true); + * drm_exec_while_not_all_locked(&exec) { + * ret = drm_exec_prepare_obj(&exec, boA, 1); + * drm_exec_continue_on_contention(&exec); + * if (ret) + * goto error; + * + * ret = drm_exec_lock(&exec, boB, 1); + * drm_exec_continue_on_contention(&exec); + * if (ret) + * goto error; + * } + * + * drm_exec_for_each_locked_object(&exec, index, obj) { + * dma_resv_add_fence(obj->resv, fence, DMA_RESV_USAGE_READ); + * ... + * } + * drm_exec_fini(&exec); + * + * See struct dma_exec for more details. + */ + +/* Dummy value used to initially enter the retry loop */ +#define DRM_EXEC_DUMMY (void*)~0 + +/* Unlock all objects and drop references */ +static void drm_exec_unlock_all(struct drm_exec *exec) +{ + struct drm_gem_object *obj; + unsigned long index; + + drm_exec_for_each_locked_object(exec, index, obj) { + dma_resv_unlock(obj->resv); + drm_gem_object_put(obj); + } + + if (exec->prelocked) { + dma_resv_unlock(exec->prelocked->resv); + drm_gem_object_put(exec->prelocked); + exec->prelocked = NULL; + } +} + +/** + * drm_exec_init - initialize a drm_exec object + * @exec: the drm_exec object to initialize + * @interruptible: if locks should be acquired interruptible + * + * Initialize the object and make sure that we can track locked and duplicate + * objects. + */ +void drm_exec_init(struct drm_exec *exec, bool interruptible) +{ + exec->interruptible = interruptible; +