Re: [Intel-gfx] [RFC PATCH 4/5] drm/i915: Introduce 'set parallel submit' extension

2021-05-17 Thread Matthew Brost
On Mon, May 17, 2021 at 03:55:59PM +0200, Daniel Vetter wrote:
> On Fri, May 14, 2021 at 01:05:33PM -0700, Matthew Brost wrote:
> > On Wed, May 12, 2021 at 10:34:59AM +0200, Daniel Vetter wrote:
> > > On Tue, May 11, 2021 at 11:44:28AM -0700, Matthew Brost wrote:
> > > > On Tue, May 11, 2021 at 05:11:44PM +0200, Daniel Vetter wrote:
> > > > > On Thu, May 06, 2021 at 10:30:48AM -0700, Matthew Brost wrote:
> > > > > > i915_drm.h updates for 'set parallel submit' extension.
> > > > > > 
> > > > > > Cc: Tvrtko Ursulin 
> > > > > > Cc: Tony Ye 
> > > > > > CC: Carl Zhang 
> > > > > > Cc: Daniel Vetter 
> > > > > > Cc: Jason Ekstrand 
> > > > > > Signed-off-by: Matthew Brost 
> > > > > > ---
> > > > > >  include/uapi/drm/i915_drm.h | 126 
> > > > > > 
> > > > > >  1 file changed, 126 insertions(+)
> > > > > > 
> > > > > > diff --git a/include/uapi/drm/i915_drm.h 
> > > > > > b/include/uapi/drm/i915_drm.h
> > > > > > index 26d2e135aa31..0175b12b33b8 100644
> > > > > > --- a/include/uapi/drm/i915_drm.h
> > > > > > +++ b/include/uapi/drm/i915_drm.h
> > > > > > @@ -1712,6 +1712,7 @@ struct drm_i915_gem_context_param {
> > > > > >   * Extensions:
> > > > > >   *   i915_context_engines_load_balance 
> > > > > > (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE)
> > > > > >   *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
> > > > > > + *   i915_context_engines_parallel_submit 
> > > > > > (I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT)
> > > > > 
> > > > > Hm just relalized, but I don't think this hyperlinsk correctly, and 
> > > > > I'm
> > > > > also not sure this formats very well as a nice list. Using item lists
> > > > > should look pretty nice like we're doing for the various kms 
> > > > > properties,
> > > > > e.g.
> > > > > 
> > > > > FOO:
> > > > >   Explain what FOO does
> > > > > 
> > > > > BAR:
> > > > >   Explain what BAR does. struct bar also automatically generates a 
> > > > > link
> > > > > 
> > > > > Please check with make htmldocs and polish this a bit (might need a 
> > > > > small
> > > > > prep patch).
> > > > > 
> > > > 
> > > > I agree the doc should look nice. To get there I might need to chat 
> > > > with you on
> > > > IRC as I'm new to this. 
> > > > 
> > > > > >   */
> > > > > >  #define I915_CONTEXT_PARAM_ENGINES 0xa
> > > > > >  
> > > > > > @@ -1894,9 +1895,134 @@ struct i915_context_param_engines {
> > > > > > __u64 extensions; /* linked chain of extension blocks, 0 
> > > > > > terminates */
> > > > > >  #define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see 
> > > > > > i915_context_engines_load_balance */
> > > > > >  #define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see 
> > > > > > i915_context_engines_bond */
> > > > > > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
> > > > > > i915_context_engines_parallel_submit */
> > > > > > struct i915_engine_class_instance engines[0];
> > > > > >  } __attribute__((packed));
> > > > > >  
> > > > > > +/*
> > > > > > + * i915_context_engines_parallel_submit:
> > > > > > + *
> > > > > > + * Setup a gem context to allow multiple BBs to be submitted in a 
> > > > > > single execbuf
> > > > > > + * IOCTL. Those BBs will then be scheduled to run on the GPU in 
> > > > > > parallel.
> > > > > > + *
> > > > > > + * All hardware contexts in the engine set are configured for 
> > > > > > parallel
> > > > > > + * submission (i.e. once this gem context is configured for 
> > > > > > parallel submission,
> > > > > > + * all the hardware contexts, regardless if a BB is available on 
> > > > > > each individual
> > > > > > + * context, will be submitted to the GPU in parallel). A user can 
> > > > > > submit BBs to
> > > > > > + * subset of the hardware contexts, in a single execbuf IOCTL, but 
> > > > > > it is not
> > > > > > + * recommended as it may reserve physical engines with nothing to 
> > > > > > run on them.
> > > > > > + * Highly recommended to configure the gem context with N hardware 
> > > > > > contexts then
> > > > > > + * always submit N BBs in a single IOCTL.
> > > > > > + *
> > > > > > + * Their are two currently defined ways to control the placement 
> > > > > > of the
> > > > > > + * hardware contexts on physical engines: default behavior (no 
> > > > > > flags) and
> > > > > > + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added 
> > > > > > the in the
> > > > > > + * future as new hardware / use cases arise. Details of how to use 
> > > > > > this
> > > > > > + * interface below above the flags.
> > > > > > + *
> > > > > > + * Returns -EINVAL if hardware context placement configuration 
> > > > > > invalid or if the
> > > > > > + * placement configuration isn't supported on the platform / 
> > > > > > submission
> > > > > > + * interface.
> > > > > > + * Returns -ENODEV if extension isn't supported on the platform / 
> > > > > > submission
> > > > > > + * inteface.
> > > > > > + */
> > > > > > +struct i915_context_engines_parallel_submit {
> > > > > > +   struct 

Re: [Intel-gfx] [RFC PATCH 4/5] drm/i915: Introduce 'set parallel submit' extension

2021-05-17 Thread Daniel Vetter
On Fri, May 14, 2021 at 01:05:33PM -0700, Matthew Brost wrote:
> On Wed, May 12, 2021 at 10:34:59AM +0200, Daniel Vetter wrote:
> > On Tue, May 11, 2021 at 11:44:28AM -0700, Matthew Brost wrote:
> > > On Tue, May 11, 2021 at 05:11:44PM +0200, Daniel Vetter wrote:
> > > > On Thu, May 06, 2021 at 10:30:48AM -0700, Matthew Brost wrote:
> > > > > i915_drm.h updates for 'set parallel submit' extension.
> > > > > 
> > > > > Cc: Tvrtko Ursulin 
> > > > > Cc: Tony Ye 
> > > > > CC: Carl Zhang 
> > > > > Cc: Daniel Vetter 
> > > > > Cc: Jason Ekstrand 
> > > > > Signed-off-by: Matthew Brost 
> > > > > ---
> > > > >  include/uapi/drm/i915_drm.h | 126 
> > > > > 
> > > > >  1 file changed, 126 insertions(+)
> > > > > 
> > > > > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > > > > index 26d2e135aa31..0175b12b33b8 100644
> > > > > --- a/include/uapi/drm/i915_drm.h
> > > > > +++ b/include/uapi/drm/i915_drm.h
> > > > > @@ -1712,6 +1712,7 @@ struct drm_i915_gem_context_param {
> > > > >   * Extensions:
> > > > >   *   i915_context_engines_load_balance 
> > > > > (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE)
> > > > >   *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
> > > > > + *   i915_context_engines_parallel_submit 
> > > > > (I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT)
> > > > 
> > > > Hm just relalized, but I don't think this hyperlinsk correctly, and I'm
> > > > also not sure this formats very well as a nice list. Using item lists
> > > > should look pretty nice like we're doing for the various kms properties,
> > > > e.g.
> > > > 
> > > > FOO:
> > > >   Explain what FOO does
> > > > 
> > > > BAR:
> > > >   Explain what BAR does. struct bar also automatically generates a link
> > > > 
> > > > Please check with make htmldocs and polish this a bit (might need a 
> > > > small
> > > > prep patch).
> > > > 
> > > 
> > > I agree the doc should look nice. To get there I might need to chat with 
> > > you on
> > > IRC as I'm new to this. 
> > > 
> > > > >   */
> > > > >  #define I915_CONTEXT_PARAM_ENGINES   0xa
> > > > >  
> > > > > @@ -1894,9 +1895,134 @@ struct i915_context_param_engines {
> > > > >   __u64 extensions; /* linked chain of extension blocks, 0 
> > > > > terminates */
> > > > >  #define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see 
> > > > > i915_context_engines_load_balance */
> > > > >  #define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see 
> > > > > i915_context_engines_bond */
> > > > > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
> > > > > i915_context_engines_parallel_submit */
> > > > >   struct i915_engine_class_instance engines[0];
> > > > >  } __attribute__((packed));
> > > > >  
> > > > > +/*
> > > > > + * i915_context_engines_parallel_submit:
> > > > > + *
> > > > > + * Setup a gem context to allow multiple BBs to be submitted in a 
> > > > > single execbuf
> > > > > + * IOCTL. Those BBs will then be scheduled to run on the GPU in 
> > > > > parallel.
> > > > > + *
> > > > > + * All hardware contexts in the engine set are configured for 
> > > > > parallel
> > > > > + * submission (i.e. once this gem context is configured for parallel 
> > > > > submission,
> > > > > + * all the hardware contexts, regardless if a BB is available on 
> > > > > each individual
> > > > > + * context, will be submitted to the GPU in parallel). A user can 
> > > > > submit BBs to
> > > > > + * subset of the hardware contexts, in a single execbuf IOCTL, but 
> > > > > it is not
> > > > > + * recommended as it may reserve physical engines with nothing to 
> > > > > run on them.
> > > > > + * Highly recommended to configure the gem context with N hardware 
> > > > > contexts then
> > > > > + * always submit N BBs in a single IOCTL.
> > > > > + *
> > > > > + * Their are two currently defined ways to control the placement of 
> > > > > the
> > > > > + * hardware contexts on physical engines: default behavior (no 
> > > > > flags) and
> > > > > + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added the 
> > > > > in the
> > > > > + * future as new hardware / use cases arise. Details of how to use 
> > > > > this
> > > > > + * interface below above the flags.
> > > > > + *
> > > > > + * Returns -EINVAL if hardware context placement configuration 
> > > > > invalid or if the
> > > > > + * placement configuration isn't supported on the platform / 
> > > > > submission
> > > > > + * interface.
> > > > > + * Returns -ENODEV if extension isn't supported on the platform / 
> > > > > submission
> > > > > + * inteface.
> > > > > + */
> > > > > +struct i915_context_engines_parallel_submit {
> > > > > + struct i915_user_extension base;
> > > > 
> > > > Ok this is good, since it makes sure we can't possible use this in
> > > > CTX_SETPARAM.
> > > > 
> > > 
> > > Yep, this is at context creation time. Technically you still can call 
> > > this over
> > > and over on the same gem context but Jason is taking that ability away I
> > > 

Re: [Intel-gfx] [RFC PATCH 4/5] drm/i915: Introduce 'set parallel submit' extension

2021-05-14 Thread Matthew Brost
On Wed, May 12, 2021 at 10:34:59AM +0200, Daniel Vetter wrote:
> On Tue, May 11, 2021 at 11:44:28AM -0700, Matthew Brost wrote:
> > On Tue, May 11, 2021 at 05:11:44PM +0200, Daniel Vetter wrote:
> > > On Thu, May 06, 2021 at 10:30:48AM -0700, Matthew Brost wrote:
> > > > i915_drm.h updates for 'set parallel submit' extension.
> > > > 
> > > > Cc: Tvrtko Ursulin 
> > > > Cc: Tony Ye 
> > > > CC: Carl Zhang 
> > > > Cc: Daniel Vetter 
> > > > Cc: Jason Ekstrand 
> > > > Signed-off-by: Matthew Brost 
> > > > ---
> > > >  include/uapi/drm/i915_drm.h | 126 
> > > >  1 file changed, 126 insertions(+)
> > > > 
> > > > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > > > index 26d2e135aa31..0175b12b33b8 100644
> > > > --- a/include/uapi/drm/i915_drm.h
> > > > +++ b/include/uapi/drm/i915_drm.h
> > > > @@ -1712,6 +1712,7 @@ struct drm_i915_gem_context_param {
> > > >   * Extensions:
> > > >   *   i915_context_engines_load_balance 
> > > > (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE)
> > > >   *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
> > > > + *   i915_context_engines_parallel_submit 
> > > > (I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT)
> > > 
> > > Hm just relalized, but I don't think this hyperlinsk correctly, and I'm
> > > also not sure this formats very well as a nice list. Using item lists
> > > should look pretty nice like we're doing for the various kms properties,
> > > e.g.
> > > 
> > > FOO:
> > >   Explain what FOO does
> > > 
> > > BAR:
> > >   Explain what BAR does. struct bar also automatically generates a link
> > > 
> > > Please check with make htmldocs and polish this a bit (might need a small
> > > prep patch).
> > > 
> > 
> > I agree the doc should look nice. To get there I might need to chat with 
> > you on
> > IRC as I'm new to this. 
> > 
> > > >   */
> > > >  #define I915_CONTEXT_PARAM_ENGINES 0xa
> > > >  
> > > > @@ -1894,9 +1895,134 @@ struct i915_context_param_engines {
> > > > __u64 extensions; /* linked chain of extension blocks, 0 
> > > > terminates */
> > > >  #define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see 
> > > > i915_context_engines_load_balance */
> > > >  #define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see 
> > > > i915_context_engines_bond */
> > > > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
> > > > i915_context_engines_parallel_submit */
> > > > struct i915_engine_class_instance engines[0];
> > > >  } __attribute__((packed));
> > > >  
> > > > +/*
> > > > + * i915_context_engines_parallel_submit:
> > > > + *
> > > > + * Setup a gem context to allow multiple BBs to be submitted in a 
> > > > single execbuf
> > > > + * IOCTL. Those BBs will then be scheduled to run on the GPU in 
> > > > parallel.
> > > > + *
> > > > + * All hardware contexts in the engine set are configured for parallel
> > > > + * submission (i.e. once this gem context is configured for parallel 
> > > > submission,
> > > > + * all the hardware contexts, regardless if a BB is available on each 
> > > > individual
> > > > + * context, will be submitted to the GPU in parallel). A user can 
> > > > submit BBs to
> > > > + * subset of the hardware contexts, in a single execbuf IOCTL, but it 
> > > > is not
> > > > + * recommended as it may reserve physical engines with nothing to run 
> > > > on them.
> > > > + * Highly recommended to configure the gem context with N hardware 
> > > > contexts then
> > > > + * always submit N BBs in a single IOCTL.
> > > > + *
> > > > + * Their are two currently defined ways to control the placement of the
> > > > + * hardware contexts on physical engines: default behavior (no flags) 
> > > > and
> > > > + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added the 
> > > > in the
> > > > + * future as new hardware / use cases arise. Details of how to use this
> > > > + * interface below above the flags.
> > > > + *
> > > > + * Returns -EINVAL if hardware context placement configuration invalid 
> > > > or if the
> > > > + * placement configuration isn't supported on the platform / submission
> > > > + * interface.
> > > > + * Returns -ENODEV if extension isn't supported on the platform / 
> > > > submission
> > > > + * inteface.
> > > > + */
> > > > +struct i915_context_engines_parallel_submit {
> > > > +   struct i915_user_extension base;
> > > 
> > > Ok this is good, since it makes sure we can't possible use this in
> > > CTX_SETPARAM.
> > > 
> > 
> > Yep, this is at context creation time. Technically you still can call this 
> > over
> > and over on the same gem context but Jason is taking that ability away I
> > believe. I've also told the media team to setup the context once and don't 
> > touch
> > it again.
> 
> Only if you base your context param on drm_i915_gem_context_param, which
> can be used both at create time with
> drm_i915_gem_context_create_ext_setparam and with the CTX_SETPARAM ioctl.
> But you don't, so this issue is fixed at the uapi 

Re: [Intel-gfx] [RFC PATCH 4/5] drm/i915: Introduce 'set parallel submit' extension

2021-05-12 Thread Daniel Vetter
On Tue, May 11, 2021 at 11:44:28AM -0700, Matthew Brost wrote:
> On Tue, May 11, 2021 at 05:11:44PM +0200, Daniel Vetter wrote:
> > On Thu, May 06, 2021 at 10:30:48AM -0700, Matthew Brost wrote:
> > > i915_drm.h updates for 'set parallel submit' extension.
> > > 
> > > Cc: Tvrtko Ursulin 
> > > Cc: Tony Ye 
> > > CC: Carl Zhang 
> > > Cc: Daniel Vetter 
> > > Cc: Jason Ekstrand 
> > > Signed-off-by: Matthew Brost 
> > > ---
> > >  include/uapi/drm/i915_drm.h | 126 
> > >  1 file changed, 126 insertions(+)
> > > 
> > > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > > index 26d2e135aa31..0175b12b33b8 100644
> > > --- a/include/uapi/drm/i915_drm.h
> > > +++ b/include/uapi/drm/i915_drm.h
> > > @@ -1712,6 +1712,7 @@ struct drm_i915_gem_context_param {
> > >   * Extensions:
> > >   *   i915_context_engines_load_balance 
> > > (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE)
> > >   *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
> > > + *   i915_context_engines_parallel_submit 
> > > (I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT)
> > 
> > Hm just relalized, but I don't think this hyperlinsk correctly, and I'm
> > also not sure this formats very well as a nice list. Using item lists
> > should look pretty nice like we're doing for the various kms properties,
> > e.g.
> > 
> > FOO:
> >   Explain what FOO does
> > 
> > BAR:
> >   Explain what BAR does. struct bar also automatically generates a link
> > 
> > Please check with make htmldocs and polish this a bit (might need a small
> > prep patch).
> > 
> 
> I agree the doc should look nice. To get there I might need to chat with you 
> on
> IRC as I'm new to this. 
> 
> > >   */
> > >  #define I915_CONTEXT_PARAM_ENGINES   0xa
> > >  
> > > @@ -1894,9 +1895,134 @@ struct i915_context_param_engines {
> > >   __u64 extensions; /* linked chain of extension blocks, 0 terminates */
> > >  #define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see 
> > > i915_context_engines_load_balance */
> > >  #define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see i915_context_engines_bond 
> > > */
> > > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
> > > i915_context_engines_parallel_submit */
> > >   struct i915_engine_class_instance engines[0];
> > >  } __attribute__((packed));
> > >  
> > > +/*
> > > + * i915_context_engines_parallel_submit:
> > > + *
> > > + * Setup a gem context to allow multiple BBs to be submitted in a single 
> > > execbuf
> > > + * IOCTL. Those BBs will then be scheduled to run on the GPU in parallel.
> > > + *
> > > + * All hardware contexts in the engine set are configured for parallel
> > > + * submission (i.e. once this gem context is configured for parallel 
> > > submission,
> > > + * all the hardware contexts, regardless if a BB is available on each 
> > > individual
> > > + * context, will be submitted to the GPU in parallel). A user can submit 
> > > BBs to
> > > + * subset of the hardware contexts, in a single execbuf IOCTL, but it is 
> > > not
> > > + * recommended as it may reserve physical engines with nothing to run on 
> > > them.
> > > + * Highly recommended to configure the gem context with N hardware 
> > > contexts then
> > > + * always submit N BBs in a single IOCTL.
> > > + *
> > > + * Their are two currently defined ways to control the placement of the
> > > + * hardware contexts on physical engines: default behavior (no flags) and
> > > + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added the in 
> > > the
> > > + * future as new hardware / use cases arise. Details of how to use this
> > > + * interface below above the flags.
> > > + *
> > > + * Returns -EINVAL if hardware context placement configuration invalid 
> > > or if the
> > > + * placement configuration isn't supported on the platform / submission
> > > + * interface.
> > > + * Returns -ENODEV if extension isn't supported on the platform / 
> > > submission
> > > + * inteface.
> > > + */
> > > +struct i915_context_engines_parallel_submit {
> > > + struct i915_user_extension base;
> > 
> > Ok this is good, since it makes sure we can't possible use this in
> > CTX_SETPARAM.
> > 
> 
> Yep, this is at context creation time. Technically you still can call this 
> over
> and over on the same gem context but Jason is taking that ability away I
> believe. I've also told the media team to setup the context once and don't 
> touch
> it again.

Only if you base your context param on drm_i915_gem_context_param, which
can be used both at create time with
drm_i915_gem_context_create_ext_setparam and with the CTX_SETPARAM ioctl.
But you don't, so this issue is fixed at the uapi design and doesn't need
to interface with Jason's prot-ctx rework much.

There's still going to be some conflicts, so maybe ask Jason for a branch
and rebase GuC on top of that for the next round.

> 
> > > +
> > > +/*
> > > + * Default placement behvavior (currently unsupported):
> > > + *
> > > + * Rather than restricting parallel 

Re: [Intel-gfx] [RFC PATCH 4/5] drm/i915: Introduce 'set parallel submit' extension

2021-05-11 Thread Matthew Brost
On Tue, May 11, 2021 at 05:11:44PM +0200, Daniel Vetter wrote:
> On Thu, May 06, 2021 at 10:30:48AM -0700, Matthew Brost wrote:
> > i915_drm.h updates for 'set parallel submit' extension.
> > 
> > Cc: Tvrtko Ursulin 
> > Cc: Tony Ye 
> > CC: Carl Zhang 
> > Cc: Daniel Vetter 
> > Cc: Jason Ekstrand 
> > Signed-off-by: Matthew Brost 
> > ---
> >  include/uapi/drm/i915_drm.h | 126 
> >  1 file changed, 126 insertions(+)
> > 
> > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > index 26d2e135aa31..0175b12b33b8 100644
> > --- a/include/uapi/drm/i915_drm.h
> > +++ b/include/uapi/drm/i915_drm.h
> > @@ -1712,6 +1712,7 @@ struct drm_i915_gem_context_param {
> >   * Extensions:
> >   *   i915_context_engines_load_balance 
> > (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE)
> >   *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
> > + *   i915_context_engines_parallel_submit 
> > (I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT)
> 
> Hm just relalized, but I don't think this hyperlinsk correctly, and I'm
> also not sure this formats very well as a nice list. Using item lists
> should look pretty nice like we're doing for the various kms properties,
> e.g.
> 
> FOO:
>   Explain what FOO does
> 
> BAR:
>   Explain what BAR does. struct bar also automatically generates a link
> 
> Please check with make htmldocs and polish this a bit (might need a small
> prep patch).
> 

I agree the doc should look nice. To get there I might need to chat with you on
IRC as I'm new to this. 

> >   */
> >  #define I915_CONTEXT_PARAM_ENGINES 0xa
> >  
> > @@ -1894,9 +1895,134 @@ struct i915_context_param_engines {
> > __u64 extensions; /* linked chain of extension blocks, 0 terminates */
> >  #define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see 
> > i915_context_engines_load_balance */
> >  #define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see i915_context_engines_bond */
> > +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
> > i915_context_engines_parallel_submit */
> > struct i915_engine_class_instance engines[0];
> >  } __attribute__((packed));
> >  
> > +/*
> > + * i915_context_engines_parallel_submit:
> > + *
> > + * Setup a gem context to allow multiple BBs to be submitted in a single 
> > execbuf
> > + * IOCTL. Those BBs will then be scheduled to run on the GPU in parallel.
> > + *
> > + * All hardware contexts in the engine set are configured for parallel
> > + * submission (i.e. once this gem context is configured for parallel 
> > submission,
> > + * all the hardware contexts, regardless if a BB is available on each 
> > individual
> > + * context, will be submitted to the GPU in parallel). A user can submit 
> > BBs to
> > + * subset of the hardware contexts, in a single execbuf IOCTL, but it is 
> > not
> > + * recommended as it may reserve physical engines with nothing to run on 
> > them.
> > + * Highly recommended to configure the gem context with N hardware 
> > contexts then
> > + * always submit N BBs in a single IOCTL.
> > + *
> > + * Their are two currently defined ways to control the placement of the
> > + * hardware contexts on physical engines: default behavior (no flags) and
> > + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added the in the
> > + * future as new hardware / use cases arise. Details of how to use this
> > + * interface below above the flags.
> > + *
> > + * Returns -EINVAL if hardware context placement configuration invalid or 
> > if the
> > + * placement configuration isn't supported on the platform / submission
> > + * interface.
> > + * Returns -ENODEV if extension isn't supported on the platform / 
> > submission
> > + * inteface.
> > + */
> > +struct i915_context_engines_parallel_submit {
> > +   struct i915_user_extension base;
> 
> Ok this is good, since it makes sure we can't possible use this in
> CTX_SETPARAM.
> 

Yep, this is at context creation time. Technically you still can call this over
and over on the same gem context but Jason is taking that ability away I
believe. I've also told the media team to setup the context once and don't touch
it again.

> > +
> > +/*
> > + * Default placement behvavior (currently unsupported):
> > + *
> > + * Rather than restricting parallel submission to a single class with a
> > + * logically contiguous placement (I915_PARALLEL_IMPLICT_BONDS), add a 
> > mode that
> > + * enables parallel submission across multiple engine classes. In this 
> > case each
> > + * context's logical engine mask indicates where that context can placed. 
> > It is
> > + * implied in this mode that all contexts have mutual exclusive placement 
> > (e.g.
> > + * if one context is running CS0 no other contexts can run on CS0).
> > + *
> > + * Example 1 pseudo code:
> > + * CSX[Y] = engine class X, logical instance Y
> > + * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
> > + * set_engines(INVALID, INVALID)
> > + * set_load_balance(engine_index=0, num_siblings=2, 

Re: [Intel-gfx] [RFC PATCH 4/5] drm/i915: Introduce 'set parallel submit' extension

2021-05-11 Thread Daniel Vetter
On Thu, May 06, 2021 at 10:30:48AM -0700, Matthew Brost wrote:
> i915_drm.h updates for 'set parallel submit' extension.
> 
> Cc: Tvrtko Ursulin 
> Cc: Tony Ye 
> CC: Carl Zhang 
> Cc: Daniel Vetter 
> Cc: Jason Ekstrand 
> Signed-off-by: Matthew Brost 
> ---
>  include/uapi/drm/i915_drm.h | 126 
>  1 file changed, 126 insertions(+)
> 
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 26d2e135aa31..0175b12b33b8 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1712,6 +1712,7 @@ struct drm_i915_gem_context_param {
>   * Extensions:
>   *   i915_context_engines_load_balance 
> (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE)
>   *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
> + *   i915_context_engines_parallel_submit 
> (I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT)

Hm just relalized, but I don't think this hyperlinsk correctly, and I'm
also not sure this formats very well as a nice list. Using item lists
should look pretty nice like we're doing for the various kms properties,
e.g.

FOO:
  Explain what FOO does

BAR:
  Explain what BAR does. struct bar also automatically generates a link

Please check with make htmldocs and polish this a bit (might need a small
prep patch).

>   */
>  #define I915_CONTEXT_PARAM_ENGINES   0xa
>  
> @@ -1894,9 +1895,134 @@ struct i915_context_param_engines {
>   __u64 extensions; /* linked chain of extension blocks, 0 terminates */
>  #define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see 
> i915_context_engines_load_balance */
>  #define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see i915_context_engines_bond */
> +#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see 
> i915_context_engines_parallel_submit */
>   struct i915_engine_class_instance engines[0];
>  } __attribute__((packed));
>  
> +/*
> + * i915_context_engines_parallel_submit:
> + *
> + * Setup a gem context to allow multiple BBs to be submitted in a single 
> execbuf
> + * IOCTL. Those BBs will then be scheduled to run on the GPU in parallel.
> + *
> + * All hardware contexts in the engine set are configured for parallel
> + * submission (i.e. once this gem context is configured for parallel 
> submission,
> + * all the hardware contexts, regardless if a BB is available on each 
> individual
> + * context, will be submitted to the GPU in parallel). A user can submit BBs 
> to
> + * subset of the hardware contexts, in a single execbuf IOCTL, but it is not
> + * recommended as it may reserve physical engines with nothing to run on 
> them.
> + * Highly recommended to configure the gem context with N hardware contexts 
> then
> + * always submit N BBs in a single IOCTL.
> + *
> + * Their are two currently defined ways to control the placement of the
> + * hardware contexts on physical engines: default behavior (no flags) and
> + * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added the in the
> + * future as new hardware / use cases arise. Details of how to use this
> + * interface below above the flags.
> + *
> + * Returns -EINVAL if hardware context placement configuration invalid or if 
> the
> + * placement configuration isn't supported on the platform / submission
> + * interface.
> + * Returns -ENODEV if extension isn't supported on the platform / submission
> + * inteface.
> + */
> +struct i915_context_engines_parallel_submit {
> + struct i915_user_extension base;

Ok this is good, since it makes sure we can't possible use this in
CTX_SETPARAM.

> +
> +/*
> + * Default placement behvavior (currently unsupported):
> + *
> + * Rather than restricting parallel submission to a single class with a
> + * logically contiguous placement (I915_PARALLEL_IMPLICT_BONDS), add a mode 
> that
> + * enables parallel submission across multiple engine classes. In this case 
> each
> + * context's logical engine mask indicates where that context can placed. It 
> is
> + * implied in this mode that all contexts have mutual exclusive placement 
> (e.g.
> + * if one context is running CS0 no other contexts can run on CS0).
> + *
> + * Example 1 pseudo code:
> + * CSX[Y] = engine class X, logical instance Y
> + * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
> + * set_engines(INVALID, INVALID)
> + * set_load_balance(engine_index=0, num_siblings=2, engines=CS0[0],CS0[1])
> + * set_load_balance(engine_index=1, num_siblings=2, engines=CS1[0],CS1[1])
> + * set_parallel()
> + *
> + * Results in the following valid placements:
> + * CS0[0], CS1[0]
> + * CS0[0], CS1[1]
> + * CS0[1], CS1[0]
> + * CS0[1], CS1[1]
> + *
> + * Example 2 pseudo code:
> + * CS[X] = generic engine of same class, logical instance X
> + * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
> + * set_engines(INVALID, INVALID)
> + * set_load_balance(engine_index=0, num_siblings=3, 
> engines=CS[0],CS[1],CS[2])
> + * set_load_balance(engine_index=1, num_siblings=3, 
> engines=CS[0],CS[1],CS[2])
> + *