Re: [Intel-gfx] [RFC PATCH 4/5] drm/i915: Introduce 'set parallel submit' extension
On Mon, May 17, 2021 at 03:55:59PM +0200, Daniel Vetter wrote: > On Fri, May 14, 2021 at 01:05:33PM -0700, Matthew Brost wrote: > > On Wed, May 12, 2021 at 10:34:59AM +0200, Daniel Vetter wrote: > > > On Tue, May 11, 2021 at 11:44:28AM -0700, Matthew Brost wrote: > > > > On Tue, May 11, 2021 at 05:11:44PM +0200, Daniel Vetter wrote: > > > > > On Thu, May 06, 2021 at 10:30:48AM -0700, Matthew Brost wrote: > > > > > > i915_drm.h updates for 'set parallel submit' extension. > > > > > > > > > > > > Cc: Tvrtko Ursulin > > > > > > Cc: Tony Ye > > > > > > CC: Carl Zhang > > > > > > Cc: Daniel Vetter > > > > > > Cc: Jason Ekstrand > > > > > > Signed-off-by: Matthew Brost > > > > > > --- > > > > > > include/uapi/drm/i915_drm.h | 126 > > > > > > > > > > > > 1 file changed, 126 insertions(+) > > > > > > > > > > > > diff --git a/include/uapi/drm/i915_drm.h > > > > > > b/include/uapi/drm/i915_drm.h > > > > > > index 26d2e135aa31..0175b12b33b8 100644 > > > > > > --- a/include/uapi/drm/i915_drm.h > > > > > > +++ b/include/uapi/drm/i915_drm.h > > > > > > @@ -1712,6 +1712,7 @@ struct drm_i915_gem_context_param { > > > > > > * Extensions: > > > > > > * i915_context_engines_load_balance > > > > > > (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE) > > > > > > * i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND) > > > > > > + * i915_context_engines_parallel_submit > > > > > > (I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT) > > > > > > > > > > Hm just relalized, but I don't think this hyperlinsk correctly, and > > > > > I'm > > > > > also not sure this formats very well as a nice list. Using item lists > > > > > should look pretty nice like we're doing for the various kms > > > > > properties, > > > > > e.g. > > > > > > > > > > FOO: > > > > > Explain what FOO does > > > > > > > > > > BAR: > > > > > Explain what BAR does. struct bar also automatically generates a > > > > > link > > > > > > > > > > Please check with make htmldocs and polish this a bit (might need a > > > > > small > > > > > prep patch). > > > > > > > > > > > > > I agree the doc should look nice. To get there I might need to chat > > > > with you on > > > > IRC as I'm new to this. > > > > > > > > > > */ > > > > > > #define I915_CONTEXT_PARAM_ENGINES 0xa > > > > > > > > > > > > @@ -1894,9 +1895,134 @@ struct i915_context_param_engines { > > > > > > __u64 extensions; /* linked chain of extension blocks, 0 > > > > > > terminates */ > > > > > > #define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see > > > > > > i915_context_engines_load_balance */ > > > > > > #define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see > > > > > > i915_context_engines_bond */ > > > > > > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see > > > > > > i915_context_engines_parallel_submit */ > > > > > > struct i915_engine_class_instance engines[0]; > > > > > > } __attribute__((packed)); > > > > > > > > > > > > +/* > > > > > > + * i915_context_engines_parallel_submit: > > > > > > + * > > > > > > + * Setup a gem context to allow multiple BBs to be submitted in a > > > > > > single execbuf > > > > > > + * IOCTL. Those BBs will then be scheduled to run on the GPU in > > > > > > parallel. > > > > > > + * > > > > > > + * All hardware contexts in the engine set are configured for > > > > > > parallel > > > > > > + * submission (i.e. once this gem context is configured for > > > > > > parallel submission, > > > > > > + * all the hardware contexts, regardless if a BB is available on > > > > > > each individual > > > > > > + * context, will be submitted to the GPU in parallel). A user can > > > > > > submit BBs to > > > > > > + * subset of the hardware contexts, in a single execbuf IOCTL, but > > > > > > it is not > > > > > > + * recommended as it may reserve physical engines with nothing to > > > > > > run on them. > > > > > > + * Highly recommended to configure the gem context with N hardware > > > > > > contexts then > > > > > > + * always submit N BBs in a single IOCTL. > > > > > > + * > > > > > > + * Their are two currently defined ways to control the placement > > > > > > of the > > > > > > + * hardware contexts on physical engines: default behavior (no > > > > > > flags) and > > > > > > + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added > > > > > > the in the > > > > > > + * future as new hardware / use cases arise. Details of how to use > > > > > > this > > > > > > + * interface below above the flags. > > > > > > + * > > > > > > + * Returns -EINVAL if hardware context placement configuration > > > > > > invalid or if the > > > > > > + * placement configuration isn't supported on the platform / > > > > > > submission > > > > > > + * interface. > > > > > > + * Returns -ENODEV if extension isn't supported on the platform / > > > > > > submission > > > > > > + * inteface. > > > > > > + */ > > > > > > +struct i915_context_engines_parallel_submit { > > > > > > + struct
Re: [Intel-gfx] [RFC PATCH 4/5] drm/i915: Introduce 'set parallel submit' extension
On Fri, May 14, 2021 at 01:05:33PM -0700, Matthew Brost wrote: > On Wed, May 12, 2021 at 10:34:59AM +0200, Daniel Vetter wrote: > > On Tue, May 11, 2021 at 11:44:28AM -0700, Matthew Brost wrote: > > > On Tue, May 11, 2021 at 05:11:44PM +0200, Daniel Vetter wrote: > > > > On Thu, May 06, 2021 at 10:30:48AM -0700, Matthew Brost wrote: > > > > > i915_drm.h updates for 'set parallel submit' extension. > > > > > > > > > > Cc: Tvrtko Ursulin > > > > > Cc: Tony Ye > > > > > CC: Carl Zhang > > > > > Cc: Daniel Vetter > > > > > Cc: Jason Ekstrand > > > > > Signed-off-by: Matthew Brost > > > > > --- > > > > > include/uapi/drm/i915_drm.h | 126 > > > > > > > > > > 1 file changed, 126 insertions(+) > > > > > > > > > > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h > > > > > index 26d2e135aa31..0175b12b33b8 100644 > > > > > --- a/include/uapi/drm/i915_drm.h > > > > > +++ b/include/uapi/drm/i915_drm.h > > > > > @@ -1712,6 +1712,7 @@ struct drm_i915_gem_context_param { > > > > > * Extensions: > > > > > * i915_context_engines_load_balance > > > > > (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE) > > > > > * i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND) > > > > > + * i915_context_engines_parallel_submit > > > > > (I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT) > > > > > > > > Hm just relalized, but I don't think this hyperlinsk correctly, and I'm > > > > also not sure this formats very well as a nice list. Using item lists > > > > should look pretty nice like we're doing for the various kms properties, > > > > e.g. > > > > > > > > FOO: > > > > Explain what FOO does > > > > > > > > BAR: > > > > Explain what BAR does. struct bar also automatically generates a link > > > > > > > > Please check with make htmldocs and polish this a bit (might need a > > > > small > > > > prep patch). > > > > > > > > > > I agree the doc should look nice. To get there I might need to chat with > > > you on > > > IRC as I'm new to this. > > > > > > > > */ > > > > > #define I915_CONTEXT_PARAM_ENGINES 0xa > > > > > > > > > > @@ -1894,9 +1895,134 @@ struct i915_context_param_engines { > > > > > __u64 extensions; /* linked chain of extension blocks, 0 > > > > > terminates */ > > > > > #define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see > > > > > i915_context_engines_load_balance */ > > > > > #define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see > > > > > i915_context_engines_bond */ > > > > > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see > > > > > i915_context_engines_parallel_submit */ > > > > > struct i915_engine_class_instance engines[0]; > > > > > } __attribute__((packed)); > > > > > > > > > > +/* > > > > > + * i915_context_engines_parallel_submit: > > > > > + * > > > > > + * Setup a gem context to allow multiple BBs to be submitted in a > > > > > single execbuf > > > > > + * IOCTL. Those BBs will then be scheduled to run on the GPU in > > > > > parallel. > > > > > + * > > > > > + * All hardware contexts in the engine set are configured for > > > > > parallel > > > > > + * submission (i.e. once this gem context is configured for parallel > > > > > submission, > > > > > + * all the hardware contexts, regardless if a BB is available on > > > > > each individual > > > > > + * context, will be submitted to the GPU in parallel). A user can > > > > > submit BBs to > > > > > + * subset of the hardware contexts, in a single execbuf IOCTL, but > > > > > it is not > > > > > + * recommended as it may reserve physical engines with nothing to > > > > > run on them. > > > > > + * Highly recommended to configure the gem context with N hardware > > > > > contexts then > > > > > + * always submit N BBs in a single IOCTL. > > > > > + * > > > > > + * Their are two currently defined ways to control the placement of > > > > > the > > > > > + * hardware contexts on physical engines: default behavior (no > > > > > flags) and > > > > > + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added the > > > > > in the > > > > > + * future as new hardware / use cases arise. Details of how to use > > > > > this > > > > > + * interface below above the flags. > > > > > + * > > > > > + * Returns -EINVAL if hardware context placement configuration > > > > > invalid or if the > > > > > + * placement configuration isn't supported on the platform / > > > > > submission > > > > > + * interface. > > > > > + * Returns -ENODEV if extension isn't supported on the platform / > > > > > submission > > > > > + * inteface. > > > > > + */ > > > > > +struct i915_context_engines_parallel_submit { > > > > > + struct i915_user_extension base; > > > > > > > > Ok this is good, since it makes sure we can't possible use this in > > > > CTX_SETPARAM. > > > > > > > > > > Yep, this is at context creation time. Technically you still can call > > > this over > > > and over on the same gem context but Jason is taking that ability away I > > >
Re: [Intel-gfx] [RFC PATCH 4/5] drm/i915: Introduce 'set parallel submit' extension
On Wed, May 12, 2021 at 10:34:59AM +0200, Daniel Vetter wrote: > On Tue, May 11, 2021 at 11:44:28AM -0700, Matthew Brost wrote: > > On Tue, May 11, 2021 at 05:11:44PM +0200, Daniel Vetter wrote: > > > On Thu, May 06, 2021 at 10:30:48AM -0700, Matthew Brost wrote: > > > > i915_drm.h updates for 'set parallel submit' extension. > > > > > > > > Cc: Tvrtko Ursulin > > > > Cc: Tony Ye > > > > CC: Carl Zhang > > > > Cc: Daniel Vetter > > > > Cc: Jason Ekstrand > > > > Signed-off-by: Matthew Brost > > > > --- > > > > include/uapi/drm/i915_drm.h | 126 > > > > 1 file changed, 126 insertions(+) > > > > > > > > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h > > > > index 26d2e135aa31..0175b12b33b8 100644 > > > > --- a/include/uapi/drm/i915_drm.h > > > > +++ b/include/uapi/drm/i915_drm.h > > > > @@ -1712,6 +1712,7 @@ struct drm_i915_gem_context_param { > > > > * Extensions: > > > > * i915_context_engines_load_balance > > > > (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE) > > > > * i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND) > > > > + * i915_context_engines_parallel_submit > > > > (I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT) > > > > > > Hm just relalized, but I don't think this hyperlinsk correctly, and I'm > > > also not sure this formats very well as a nice list. Using item lists > > > should look pretty nice like we're doing for the various kms properties, > > > e.g. > > > > > > FOO: > > > Explain what FOO does > > > > > > BAR: > > > Explain what BAR does. struct bar also automatically generates a link > > > > > > Please check with make htmldocs and polish this a bit (might need a small > > > prep patch). > > > > > > > I agree the doc should look nice. To get there I might need to chat with > > you on > > IRC as I'm new to this. > > > > > > */ > > > > #define I915_CONTEXT_PARAM_ENGINES 0xa > > > > > > > > @@ -1894,9 +1895,134 @@ struct i915_context_param_engines { > > > > __u64 extensions; /* linked chain of extension blocks, 0 > > > > terminates */ > > > > #define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see > > > > i915_context_engines_load_balance */ > > > > #define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see > > > > i915_context_engines_bond */ > > > > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see > > > > i915_context_engines_parallel_submit */ > > > > struct i915_engine_class_instance engines[0]; > > > > } __attribute__((packed)); > > > > > > > > +/* > > > > + * i915_context_engines_parallel_submit: > > > > + * > > > > + * Setup a gem context to allow multiple BBs to be submitted in a > > > > single execbuf > > > > + * IOCTL. Those BBs will then be scheduled to run on the GPU in > > > > parallel. > > > > + * > > > > + * All hardware contexts in the engine set are configured for parallel > > > > + * submission (i.e. once this gem context is configured for parallel > > > > submission, > > > > + * all the hardware contexts, regardless if a BB is available on each > > > > individual > > > > + * context, will be submitted to the GPU in parallel). A user can > > > > submit BBs to > > > > + * subset of the hardware contexts, in a single execbuf IOCTL, but it > > > > is not > > > > + * recommended as it may reserve physical engines with nothing to run > > > > on them. > > > > + * Highly recommended to configure the gem context with N hardware > > > > contexts then > > > > + * always submit N BBs in a single IOCTL. > > > > + * > > > > + * Their are two currently defined ways to control the placement of the > > > > + * hardware contexts on physical engines: default behavior (no flags) > > > > and > > > > + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added the > > > > in the > > > > + * future as new hardware / use cases arise. Details of how to use this > > > > + * interface below above the flags. > > > > + * > > > > + * Returns -EINVAL if hardware context placement configuration invalid > > > > or if the > > > > + * placement configuration isn't supported on the platform / submission > > > > + * interface. > > > > + * Returns -ENODEV if extension isn't supported on the platform / > > > > submission > > > > + * inteface. > > > > + */ > > > > +struct i915_context_engines_parallel_submit { > > > > + struct i915_user_extension base; > > > > > > Ok this is good, since it makes sure we can't possible use this in > > > CTX_SETPARAM. > > > > > > > Yep, this is at context creation time. Technically you still can call this > > over > > and over on the same gem context but Jason is taking that ability away I > > believe. I've also told the media team to setup the context once and don't > > touch > > it again. > > Only if you base your context param on drm_i915_gem_context_param, which > can be used both at create time with > drm_i915_gem_context_create_ext_setparam and with the CTX_SETPARAM ioctl. > But you don't, so this issue is fixed at the uapi
Re: [Intel-gfx] [RFC PATCH 4/5] drm/i915: Introduce 'set parallel submit' extension
On Tue, May 11, 2021 at 11:44:28AM -0700, Matthew Brost wrote: > On Tue, May 11, 2021 at 05:11:44PM +0200, Daniel Vetter wrote: > > On Thu, May 06, 2021 at 10:30:48AM -0700, Matthew Brost wrote: > > > i915_drm.h updates for 'set parallel submit' extension. > > > > > > Cc: Tvrtko Ursulin > > > Cc: Tony Ye > > > CC: Carl Zhang > > > Cc: Daniel Vetter > > > Cc: Jason Ekstrand > > > Signed-off-by: Matthew Brost > > > --- > > > include/uapi/drm/i915_drm.h | 126 > > > 1 file changed, 126 insertions(+) > > > > > > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h > > > index 26d2e135aa31..0175b12b33b8 100644 > > > --- a/include/uapi/drm/i915_drm.h > > > +++ b/include/uapi/drm/i915_drm.h > > > @@ -1712,6 +1712,7 @@ struct drm_i915_gem_context_param { > > > * Extensions: > > > * i915_context_engines_load_balance > > > (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE) > > > * i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND) > > > + * i915_context_engines_parallel_submit > > > (I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT) > > > > Hm just relalized, but I don't think this hyperlinsk correctly, and I'm > > also not sure this formats very well as a nice list. Using item lists > > should look pretty nice like we're doing for the various kms properties, > > e.g. > > > > FOO: > > Explain what FOO does > > > > BAR: > > Explain what BAR does. struct bar also automatically generates a link > > > > Please check with make htmldocs and polish this a bit (might need a small > > prep patch). > > > > I agree the doc should look nice. To get there I might need to chat with you > on > IRC as I'm new to this. > > > > */ > > > #define I915_CONTEXT_PARAM_ENGINES 0xa > > > > > > @@ -1894,9 +1895,134 @@ struct i915_context_param_engines { > > > __u64 extensions; /* linked chain of extension blocks, 0 terminates */ > > > #define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see > > > i915_context_engines_load_balance */ > > > #define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see i915_context_engines_bond > > > */ > > > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see > > > i915_context_engines_parallel_submit */ > > > struct i915_engine_class_instance engines[0]; > > > } __attribute__((packed)); > > > > > > +/* > > > + * i915_context_engines_parallel_submit: > > > + * > > > + * Setup a gem context to allow multiple BBs to be submitted in a single > > > execbuf > > > + * IOCTL. Those BBs will then be scheduled to run on the GPU in parallel. > > > + * > > > + * All hardware contexts in the engine set are configured for parallel > > > + * submission (i.e. once this gem context is configured for parallel > > > submission, > > > + * all the hardware contexts, regardless if a BB is available on each > > > individual > > > + * context, will be submitted to the GPU in parallel). A user can submit > > > BBs to > > > + * subset of the hardware contexts, in a single execbuf IOCTL, but it is > > > not > > > + * recommended as it may reserve physical engines with nothing to run on > > > them. > > > + * Highly recommended to configure the gem context with N hardware > > > contexts then > > > + * always submit N BBs in a single IOCTL. > > > + * > > > + * Their are two currently defined ways to control the placement of the > > > + * hardware contexts on physical engines: default behavior (no flags) and > > > + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added the in > > > the > > > + * future as new hardware / use cases arise. Details of how to use this > > > + * interface below above the flags. > > > + * > > > + * Returns -EINVAL if hardware context placement configuration invalid > > > or if the > > > + * placement configuration isn't supported on the platform / submission > > > + * interface. > > > + * Returns -ENODEV if extension isn't supported on the platform / > > > submission > > > + * inteface. > > > + */ > > > +struct i915_context_engines_parallel_submit { > > > + struct i915_user_extension base; > > > > Ok this is good, since it makes sure we can't possible use this in > > CTX_SETPARAM. > > > > Yep, this is at context creation time. Technically you still can call this > over > and over on the same gem context but Jason is taking that ability away I > believe. I've also told the media team to setup the context once and don't > touch > it again. Only if you base your context param on drm_i915_gem_context_param, which can be used both at create time with drm_i915_gem_context_create_ext_setparam and with the CTX_SETPARAM ioctl. But you don't, so this issue is fixed at the uapi design and doesn't need to interface with Jason's prot-ctx rework much. There's still going to be some conflicts, so maybe ask Jason for a branch and rebase GuC on top of that for the next round. > > > > + > > > +/* > > > + * Default placement behvavior (currently unsupported): > > > + * > > > + * Rather than restricting parallel
Re: [Intel-gfx] [RFC PATCH 4/5] drm/i915: Introduce 'set parallel submit' extension
On Tue, May 11, 2021 at 05:11:44PM +0200, Daniel Vetter wrote: > On Thu, May 06, 2021 at 10:30:48AM -0700, Matthew Brost wrote: > > i915_drm.h updates for 'set parallel submit' extension. > > > > Cc: Tvrtko Ursulin > > Cc: Tony Ye > > CC: Carl Zhang > > Cc: Daniel Vetter > > Cc: Jason Ekstrand > > Signed-off-by: Matthew Brost > > --- > > include/uapi/drm/i915_drm.h | 126 > > 1 file changed, 126 insertions(+) > > > > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h > > index 26d2e135aa31..0175b12b33b8 100644 > > --- a/include/uapi/drm/i915_drm.h > > +++ b/include/uapi/drm/i915_drm.h > > @@ -1712,6 +1712,7 @@ struct drm_i915_gem_context_param { > > * Extensions: > > * i915_context_engines_load_balance > > (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE) > > * i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND) > > + * i915_context_engines_parallel_submit > > (I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT) > > Hm just relalized, but I don't think this hyperlinsk correctly, and I'm > also not sure this formats very well as a nice list. Using item lists > should look pretty nice like we're doing for the various kms properties, > e.g. > > FOO: > Explain what FOO does > > BAR: > Explain what BAR does. struct bar also automatically generates a link > > Please check with make htmldocs and polish this a bit (might need a small > prep patch). > I agree the doc should look nice. To get there I might need to chat with you on IRC as I'm new to this. > > */ > > #define I915_CONTEXT_PARAM_ENGINES 0xa > > > > @@ -1894,9 +1895,134 @@ struct i915_context_param_engines { > > __u64 extensions; /* linked chain of extension blocks, 0 terminates */ > > #define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see > > i915_context_engines_load_balance */ > > #define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see i915_context_engines_bond */ > > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see > > i915_context_engines_parallel_submit */ > > struct i915_engine_class_instance engines[0]; > > } __attribute__((packed)); > > > > +/* > > + * i915_context_engines_parallel_submit: > > + * > > + * Setup a gem context to allow multiple BBs to be submitted in a single > > execbuf > > + * IOCTL. Those BBs will then be scheduled to run on the GPU in parallel. > > + * > > + * All hardware contexts in the engine set are configured for parallel > > + * submission (i.e. once this gem context is configured for parallel > > submission, > > + * all the hardware contexts, regardless if a BB is available on each > > individual > > + * context, will be submitted to the GPU in parallel). A user can submit > > BBs to > > + * subset of the hardware contexts, in a single execbuf IOCTL, but it is > > not > > + * recommended as it may reserve physical engines with nothing to run on > > them. > > + * Highly recommended to configure the gem context with N hardware > > contexts then > > + * always submit N BBs in a single IOCTL. > > + * > > + * Their are two currently defined ways to control the placement of the > > + * hardware contexts on physical engines: default behavior (no flags) and > > + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added the in the > > + * future as new hardware / use cases arise. Details of how to use this > > + * interface below above the flags. > > + * > > + * Returns -EINVAL if hardware context placement configuration invalid or > > if the > > + * placement configuration isn't supported on the platform / submission > > + * interface. > > + * Returns -ENODEV if extension isn't supported on the platform / > > submission > > + * inteface. > > + */ > > +struct i915_context_engines_parallel_submit { > > + struct i915_user_extension base; > > Ok this is good, since it makes sure we can't possible use this in > CTX_SETPARAM. > Yep, this is at context creation time. Technically you still can call this over and over on the same gem context but Jason is taking that ability away I believe. I've also told the media team to setup the context once and don't touch it again. > > + > > +/* > > + * Default placement behvavior (currently unsupported): > > + * > > + * Rather than restricting parallel submission to a single class with a > > + * logically contiguous placement (I915_PARALLEL_IMPLICT_BONDS), add a > > mode that > > + * enables parallel submission across multiple engine classes. In this > > case each > > + * context's logical engine mask indicates where that context can placed. > > It is > > + * implied in this mode that all contexts have mutual exclusive placement > > (e.g. > > + * if one context is running CS0 no other contexts can run on CS0). > > + * > > + * Example 1 pseudo code: > > + * CSX[Y] = engine class X, logical instance Y > > + * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE > > + * set_engines(INVALID, INVALID) > > + * set_load_balance(engine_index=0, num_siblings=2,
Re: [Intel-gfx] [RFC PATCH 4/5] drm/i915: Introduce 'set parallel submit' extension
On Thu, May 06, 2021 at 10:30:48AM -0700, Matthew Brost wrote: > i915_drm.h updates for 'set parallel submit' extension. > > Cc: Tvrtko Ursulin > Cc: Tony Ye > CC: Carl Zhang > Cc: Daniel Vetter > Cc: Jason Ekstrand > Signed-off-by: Matthew Brost > --- > include/uapi/drm/i915_drm.h | 126 > 1 file changed, 126 insertions(+) > > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h > index 26d2e135aa31..0175b12b33b8 100644 > --- a/include/uapi/drm/i915_drm.h > +++ b/include/uapi/drm/i915_drm.h > @@ -1712,6 +1712,7 @@ struct drm_i915_gem_context_param { > * Extensions: > * i915_context_engines_load_balance > (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE) > * i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND) > + * i915_context_engines_parallel_submit > (I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT) Hm just relalized, but I don't think this hyperlinsk correctly, and I'm also not sure this formats very well as a nice list. Using item lists should look pretty nice like we're doing for the various kms properties, e.g. FOO: Explain what FOO does BAR: Explain what BAR does. struct bar also automatically generates a link Please check with make htmldocs and polish this a bit (might need a small prep patch). > */ > #define I915_CONTEXT_PARAM_ENGINES 0xa > > @@ -1894,9 +1895,134 @@ struct i915_context_param_engines { > __u64 extensions; /* linked chain of extension blocks, 0 terminates */ > #define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see > i915_context_engines_load_balance */ > #define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see i915_context_engines_bond */ > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see > i915_context_engines_parallel_submit */ > struct i915_engine_class_instance engines[0]; > } __attribute__((packed)); > > +/* > + * i915_context_engines_parallel_submit: > + * > + * Setup a gem context to allow multiple BBs to be submitted in a single > execbuf > + * IOCTL. Those BBs will then be scheduled to run on the GPU in parallel. > + * > + * All hardware contexts in the engine set are configured for parallel > + * submission (i.e. once this gem context is configured for parallel > submission, > + * all the hardware contexts, regardless if a BB is available on each > individual > + * context, will be submitted to the GPU in parallel). A user can submit BBs > to > + * subset of the hardware contexts, in a single execbuf IOCTL, but it is not > + * recommended as it may reserve physical engines with nothing to run on > them. > + * Highly recommended to configure the gem context with N hardware contexts > then > + * always submit N BBs in a single IOCTL. > + * > + * Their are two currently defined ways to control the placement of the > + * hardware contexts on physical engines: default behavior (no flags) and > + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added the in the > + * future as new hardware / use cases arise. Details of how to use this > + * interface below above the flags. > + * > + * Returns -EINVAL if hardware context placement configuration invalid or if > the > + * placement configuration isn't supported on the platform / submission > + * interface. > + * Returns -ENODEV if extension isn't supported on the platform / submission > + * inteface. > + */ > +struct i915_context_engines_parallel_submit { > + struct i915_user_extension base; Ok this is good, since it makes sure we can't possible use this in CTX_SETPARAM. > + > +/* > + * Default placement behvavior (currently unsupported): > + * > + * Rather than restricting parallel submission to a single class with a > + * logically contiguous placement (I915_PARALLEL_IMPLICT_BONDS), add a mode > that > + * enables parallel submission across multiple engine classes. In this case > each > + * context's logical engine mask indicates where that context can placed. It > is > + * implied in this mode that all contexts have mutual exclusive placement > (e.g. > + * if one context is running CS0 no other contexts can run on CS0). > + * > + * Example 1 pseudo code: > + * CSX[Y] = engine class X, logical instance Y > + * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE > + * set_engines(INVALID, INVALID) > + * set_load_balance(engine_index=0, num_siblings=2, engines=CS0[0],CS0[1]) > + * set_load_balance(engine_index=1, num_siblings=2, engines=CS1[0],CS1[1]) > + * set_parallel() > + * > + * Results in the following valid placements: > + * CS0[0], CS1[0] > + * CS0[0], CS1[1] > + * CS0[1], CS1[0] > + * CS0[1], CS1[1] > + * > + * Example 2 pseudo code: > + * CS[X] = generic engine of same class, logical instance X > + * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE > + * set_engines(INVALID, INVALID) > + * set_load_balance(engine_index=0, num_siblings=3, > engines=CS[0],CS[1],CS[2]) > + * set_load_balance(engine_index=1, num_siblings=3, > engines=CS[0],CS[1],CS[2]) > + *