Re: [Intel-gfx] [PATCH 08/27] drm/i915: Add logical engine mapping

2021-09-16 Thread Tvrtko Ursulin



On 15/09/2021 17:58, Matthew Brost wrote:

On Wed, Sep 15, 2021 at 09:24:15AM +0100, Tvrtko Ursulin wrote:


On 14/09/2021 19:04, Matthew Brost wrote:

On Tue, Sep 14, 2021 at 09:34:08AM +0100, Tvrtko Ursulin wrote:




8<


Today we have:

for_each intel_engines: // intel_engines is a flat list of all engines
intel_engine_setup()

You propose to change it to:

for_each engine_class:
 for 0..max_global_engine_instance:
for_each intel_engines:
   skip engine not present
   skip class not matching

   count logical instance

 for_each intel_engines:
skip engine not present
skip wrong class

intel_engine_setup()


I propose:

// Leave as is:

for_each intel_engines:
 intel_engine_setup()

// Add:

for_each engine_class:
 logical = 0
 for_each gt->engine_class[class]:
skip engine not present

engine->logical_instance = logical++


When code which actually needs a preturbed "map" arrives you add that in to
this second loop.



See above, why introduce an algorithm that doesn't work for future parts
+ future patches are land imminently? It makes zero sense whatsoever.
With your proposal we would literally land code to just throw it away a
couple of months from now + break patches we intend to land soon. This


It sure works, it just walks the per class list instead of walking the flat
list skipping one class at the time.

Just add the map based transformation to the second pass later, when it
becomes required.



I can flatten the algorithm if that helps alleviate your concerns but
with that being said, I've played around this locally and IMO makes the
code way more ugly. Sure it eliminates some iterations of the loop but
who really cares about that in a one time setup function?


algorithm works and has no reason whatsoever to be optimal as it a one
time setup call. I really don't understand why we are still talking
about this paint color.


I don't think bike shedding is not an appropriate term when complaint is how
proposed algorithm is needlessly complicated.



Are you just ignoring the fact that the algorithm (map) is needed in
pending patches? IMO it is more complicated to write throw away code
when the proper algorithm is already written. If the logical mapping was
straight forward on all platforms as the ones currently upstream I would
100% agree with your suggestion, but it isn't on unembargoed platforms
eminently going upstream. The algorithm I have works for the current
platforms + the pending platforms. IMO is 100% acceptable to merge
something looking towards a known future.


FWIW my 2c is that unused bits detract from review. And my gut feeling 
still is that code can be written in a simpler way and that the map 
transform can still plug in easily on top in a later series.


I said FWIW since even if I am right you can still view my comments as 
external/community inputs at this point and proceed however you wish.


Regards,

Tvrtko


Re: [Intel-gfx] [PATCH 08/27] drm/i915: Add logical engine mapping

2021-09-15 Thread Matthew Brost
On Wed, Sep 15, 2021 at 09:24:15AM +0100, Tvrtko Ursulin wrote:
> 
> On 14/09/2021 19:04, Matthew Brost wrote:
> > On Tue, Sep 14, 2021 at 09:34:08AM +0100, Tvrtko Ursulin wrote:
> > > 
> 
> 8<
> 
> > > Today we have:
> > > 
> > > for_each intel_engines: // intel_engines is a flat list of all engines
> > >   intel_engine_setup()
> > > 
> > > You propose to change it to:
> > > 
> > > for_each engine_class:
> > > for 0..max_global_engine_instance:
> > >for_each intel_engines:
> > >   skip engine not present
> > >   skip class not matching
> > > 
> > >   count logical instance
> > > 
> > > for_each intel_engines:
> > >skip engine not present
> > >skip wrong class
> > > 
> > >intel_engine_setup()
> > > 
> > > 
> > > I propose:
> > > 
> > > // Leave as is:
> > > 
> > > for_each intel_engines:
> > > intel_engine_setup()
> > > 
> > > // Add:
> > > 
> > > for_each engine_class:
> > > logical = 0
> > > for_each gt->engine_class[class]:
> > >skip engine not present
> > > 
> > >engine->logical_instance = logical++
> > > 
> > > 
> > > When code which actually needs a preturbed "map" arrives you add that in 
> > > to
> > > this second loop.
> > > 
> > 
> > See above, why introduce an algorithm that doesn't work for future parts
> > + future patches are land imminently? It makes zero sense whatsoever.
> > With your proposal we would literally land code to just throw it away a
> > couple of months from now + break patches we intend to land soon. This
> 
> It sure works, it just walks the per class list instead of walking the flat
> list skipping one class at the time.
> 
> Just add the map based transformation to the second pass later, when it
> becomes required.
> 

I can flatten the algorithm if that helps alleviate your concerns but
with that being said, I've played around this locally and IMO makes the
code way more ugly. Sure it eliminates some iterations of the loop but
who really cares about that in a one time setup function?

> > algorithm works and has no reason whatsoever to be optimal as it a one
> > time setup call. I really don't understand why we are still talking
> > about this paint color.
> 
> I don't think bike shedding is not an appropriate term when complaint is how
> proposed algorithm is needlessly complicated.
>

Are you just ignoring the fact that the algorithm (map) is needed in
pending patches? IMO it is more complicated to write throw away code
when the proper algorithm is already written. If the logical mapping was
straight forward on all platforms as the ones currently upstream I would
100% agree with your suggestion, but it isn't on unembargoed platforms
eminently going upstream. The algorithm I have works for the current
platforms + the pending platforms. IMO is 100% acceptable to merge
something looking towards a known future.

Matt

> Regards,
> 
> Tvrtko


Re: [Intel-gfx] [PATCH 08/27] drm/i915: Add logical engine mapping

2021-09-15 Thread Tvrtko Ursulin



On 14/09/2021 19:04, Matthew Brost wrote:

On Tue, Sep 14, 2021 at 09:34:08AM +0100, Tvrtko Ursulin wrote:




8<


Today we have:

for_each intel_engines: // intel_engines is a flat list of all engines
intel_engine_setup()

You propose to change it to:

for_each engine_class:
for 0..max_global_engine_instance:
   for_each intel_engines:
  skip engine not present
  skip class not matching

  count logical instance

for_each intel_engines:
   skip engine not present
   skip wrong class

   intel_engine_setup()


I propose:

// Leave as is:

for_each intel_engines:
intel_engine_setup()

// Add:

for_each engine_class:
logical = 0
for_each gt->engine_class[class]:
   skip engine not present

   engine->logical_instance = logical++


When code which actually needs a preturbed "map" arrives you add that in to
this second loop.



See above, why introduce an algorithm that doesn't work for future parts
+ future patches are land imminently? It makes zero sense whatsoever.
With your proposal we would literally land code to just throw it away a
couple of months from now + break patches we intend to land soon. This


It sure works, it just walks the per class list instead of walking the 
flat list skipping one class at the time.


Just add the map based transformation to the second pass later, when it 
becomes required.



algorithm works and has no reason whatsoever to be optimal as it a one
time setup call. I really don't understand why we are still talking
about this paint color.


I don't think bike shedding is not an appropriate term when complaint is 
how proposed algorithm is needlessly complicated.


Regards,

Tvrtko


Re: [Intel-gfx] [PATCH 08/27] drm/i915: Add logical engine mapping

2021-09-14 Thread Matthew Brost
On Tue, Sep 14, 2021 at 09:34:08AM +0100, Tvrtko Ursulin wrote:
> 
> On 13/09/2021 17:50, Matthew Brost wrote:
> > On Mon, Sep 13, 2021 at 10:24:43AM +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 10/09/2021 20:49, Matthew Brost wrote:
> > > > On Fri, Sep 10, 2021 at 12:12:42PM +0100, Tvrtko Ursulin wrote:
> > > > > 
> > > > > On 20/08/2021 23:44, Matthew Brost wrote:
> > > > > > Add logical engine mapping. This is required for split-frame, as
> > > > > > workloads need to be placed on engines in a logically contiguous 
> > > > > > manner.
> > > > > > 
> > > > > > v2:
> > > > > > (Daniel Vetter)
> > > > > >  - Add kernel doc for new fields
> > > > > > 
> > > > > > Signed-off-by: Matthew Brost 
> > > > > > ---
> > > > > > drivers/gpu/drm/i915/gt/intel_engine_cs.c | 60 
> > > > > > ---
> > > > > > drivers/gpu/drm/i915/gt/intel_engine_types.h  |  5 ++
> > > > > > .../drm/i915/gt/intel_execlists_submission.c  |  1 +
> > > > > > drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  2 +-
> > > > > > .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +--
> > > > > > 5 files changed, 60 insertions(+), 29 deletions(-)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
> > > > > > b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > > > > index 0d9105a31d84..4d790f9a65dd 100644
> > > > > > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > > > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > > > > @@ -290,7 +290,8 @@ static void nop_irq_handler(struct 
> > > > > > intel_engine_cs *engine, u16 iir)
> > > > > > GEM_DEBUG_WARN_ON(iir);
> > > > > > }
> > > > > > -static int intel_engine_setup(struct intel_gt *gt, enum 
> > > > > > intel_engine_id id)
> > > > > > +static int intel_engine_setup(struct intel_gt *gt, enum 
> > > > > > intel_engine_id id,
> > > > > > + u8 logical_instance)
> > > > > > {
> > > > > > const struct engine_info *info = &intel_engines[id];
> > > > > > struct drm_i915_private *i915 = gt->i915;
> > > > > > @@ -334,6 +335,7 @@ static int intel_engine_setup(struct intel_gt 
> > > > > > *gt, enum intel_engine_id id)
> > > > > > engine->class = info->class;
> > > > > > engine->instance = info->instance;
> > > > > > +   engine->logical_mask = BIT(logical_instance);
> > > > > > __sprint_engine_name(engine);
> > > > > > engine->props.heartbeat_interval_ms =
> > > > > > @@ -572,6 +574,37 @@ static intel_engine_mask_t 
> > > > > > init_engine_mask(struct intel_gt *gt)
> > > > > > return info->engine_mask;
> > > > > > }
> > > > > > +static void populate_logical_ids(struct intel_gt *gt, u8 
> > > > > > *logical_ids,
> > > > > > +u8 class, const u8 *map, u8 
> > > > > > num_instances)
> > > > > > +{
> > > > > > +   int i, j;
> > > > > > +   u8 current_logical_id = 0;
> > > > > > +
> > > > > > +   for (j = 0; j < num_instances; ++j) {
> > > > > > +   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
> > > > > > +   if (!HAS_ENGINE(gt, i) ||
> > > > > > +   intel_engines[i].class != class)
> > > > > > +   continue;
> > > > > > +
> > > > > > +   if (intel_engines[i].instance == map[j]) {
> > > > > > +   logical_ids[intel_engines[i].instance] =
> > > > > > +   current_logical_id++;
> > > > > > +   break;
> > > > > > +   }
> > > > > > +   }
> > > > > > +   }
> > > > > > +}
> > > > > > +
> > > > > > +static void setup_logical_ids(struct intel_gt *gt, u8 
> > > > > > *logical_ids, u8 class)
> > > > > > +{
> > > > > > +   int i;
> > > > > > +   u8 map[MAX_ENGINE_INSTANCE + 1];
> > > > > > +
> > > > > > +   for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
> > > > > > +   map[i] = i;
> > > > > 
> > > > > What's the point of the map array since it is 1:1 with instance?
> > > > > 
> > > > 
> > > > Future products do not have a 1 to 1 mapping and that mapping can change
> > > > based on fusing, e.g. XeHP SDV.
> > > > 
> > > > Also technically ICL / TGL / ADL physical instance 2 maps to logical
> > > > instance 1.
> > > 
> > > I don't follow the argument. All I can see is that "map[i] = i" always in
> > > the proposed code, which is then used to check "instance == 
> > > map[instance]".
> > > So I'd suggest to remove this array from the code until there is a need 
> > > for
> > > it.
> > > 
> > 
> > Ok, this logic is slightly confusing and makes more sense once we have
> > non-standard mappings. Yes, map is setup in a 1 to 1 mapping by default
> > with the value in map[i] being a physical instance. Populate_logical_ids
> > searches the map finding all physical instances present in the map
> > assigning each found instance a new logical id increasing by 1 each
> > time.
> > 
> > e.g. If the map 

Re: [Intel-gfx] [PATCH 08/27] drm/i915: Add logical engine mapping

2021-09-14 Thread Tvrtko Ursulin



On 13/09/2021 17:50, Matthew Brost wrote:

On Mon, Sep 13, 2021 at 10:24:43AM +0100, Tvrtko Ursulin wrote:


On 10/09/2021 20:49, Matthew Brost wrote:

On Fri, Sep 10, 2021 at 12:12:42PM +0100, Tvrtko Ursulin wrote:


On 20/08/2021 23:44, Matthew Brost wrote:

Add logical engine mapping. This is required for split-frame, as
workloads need to be placed on engines in a logically contiguous manner.

v2:
(Daniel Vetter)
 - Add kernel doc for new fields

Signed-off-by: Matthew Brost 
---
drivers/gpu/drm/i915/gt/intel_engine_cs.c | 60 ---
drivers/gpu/drm/i915/gt/intel_engine_types.h  |  5 ++
.../drm/i915/gt/intel_execlists_submission.c  |  1 +
drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  2 +-
.../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +--
5 files changed, 60 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 0d9105a31d84..4d790f9a65dd 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -290,7 +290,8 @@ static void nop_irq_handler(struct intel_engine_cs *engine, 
u16 iir)
GEM_DEBUG_WARN_ON(iir);
}
-static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
+static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
+ u8 logical_instance)
{
const struct engine_info *info = &intel_engines[id];
struct drm_i915_private *i915 = gt->i915;
@@ -334,6 +335,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
engine->class = info->class;
engine->instance = info->instance;
+   engine->logical_mask = BIT(logical_instance);
__sprint_engine_name(engine);
engine->props.heartbeat_interval_ms =
@@ -572,6 +574,37 @@ static intel_engine_mask_t init_engine_mask(struct 
intel_gt *gt)
return info->engine_mask;
}
+static void populate_logical_ids(struct intel_gt *gt, u8 *logical_ids,
+u8 class, const u8 *map, u8 num_instances)
+{
+   int i, j;
+   u8 current_logical_id = 0;
+
+   for (j = 0; j < num_instances; ++j) {
+   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
+   if (!HAS_ENGINE(gt, i) ||
+   intel_engines[i].class != class)
+   continue;
+
+   if (intel_engines[i].instance == map[j]) {
+   logical_ids[intel_engines[i].instance] =
+   current_logical_id++;
+   break;
+   }
+   }
+   }
+}
+
+static void setup_logical_ids(struct intel_gt *gt, u8 *logical_ids, u8 class)
+{
+   int i;
+   u8 map[MAX_ENGINE_INSTANCE + 1];
+
+   for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
+   map[i] = i;


What's the point of the map array since it is 1:1 with instance?



Future products do not have a 1 to 1 mapping and that mapping can change
based on fusing, e.g. XeHP SDV.

Also technically ICL / TGL / ADL physical instance 2 maps to logical
instance 1.


I don't follow the argument. All I can see is that "map[i] = i" always in
the proposed code, which is then used to check "instance == map[instance]".
So I'd suggest to remove this array from the code until there is a need for
it.



Ok, this logic is slightly confusing and makes more sense once we have
non-standard mappings. Yes, map is setup in a 1 to 1 mapping by default
with the value in map[i] being a physical instance. Populate_logical_ids
searches the map finding all physical instances present in the map
assigning each found instance a new logical id increasing by 1 each
time.

e.g. If the map is setup 0-N and only physical instance 0 / 2 are
present they will get logical mapping 0 / 1 respectfully.

This algorithm works for non-standard mappings too /w fused parts. e.g.
on XeHP SDV the map is: { 0, 2, 4, 6, 1, 3, 5, 7 } and if any of the
physical instances can't be found due to fusing the logical mapping is
still correct per the bspec.

This array is absolutely needed for multi-lrc submission to work, even
on ICL / TGL / ADL as the GuC only supports logically contiguous engine
instances.


No idea how can an array fixed at "map[i] = i" be absolutely needed when 
you can just write it like "i". Sometimes it is okay to lay some ground 
work for future platforms but in this case to me it's just obfuscation 
which should be added later, when it is required.



+   populate_logical_ids(gt, logical_ids, class, map, ARRAY_SIZE(map));
+}
+
/**
 * intel_engines_init_mmio() - allocate and prepare the Engine Command 
Streamers
 * @gt: pointer to struct intel_gt
@@ -583,7 +616,8 @@ int intel_engines_init_mmio(struct intel_gt *gt)
struct drm_i915_private *i915 = gt->i915;
const unsigned int engin

Re: [Intel-gfx] [PATCH 08/27] drm/i915: Add logical engine mapping

2021-09-13 Thread Matthew Brost
On Mon, Sep 13, 2021 at 10:24:43AM +0100, Tvrtko Ursulin wrote:
> 
> On 10/09/2021 20:49, Matthew Brost wrote:
> > On Fri, Sep 10, 2021 at 12:12:42PM +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 20/08/2021 23:44, Matthew Brost wrote:
> > > > Add logical engine mapping. This is required for split-frame, as
> > > > workloads need to be placed on engines in a logically contiguous manner.
> > > > 
> > > > v2:
> > > >(Daniel Vetter)
> > > > - Add kernel doc for new fields
> > > > 
> > > > Signed-off-by: Matthew Brost 
> > > > ---
> > > >drivers/gpu/drm/i915/gt/intel_engine_cs.c | 60 
> > > > ---
> > > >drivers/gpu/drm/i915/gt/intel_engine_types.h  |  5 ++
> > > >.../drm/i915/gt/intel_execlists_submission.c  |  1 +
> > > >drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  2 +-
> > > >.../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +--
> > > >5 files changed, 60 insertions(+), 29 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
> > > > b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > > index 0d9105a31d84..4d790f9a65dd 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > > @@ -290,7 +290,8 @@ static void nop_irq_handler(struct intel_engine_cs 
> > > > *engine, u16 iir)
> > > > GEM_DEBUG_WARN_ON(iir);
> > > >}
> > > > -static int intel_engine_setup(struct intel_gt *gt, enum 
> > > > intel_engine_id id)
> > > > +static int intel_engine_setup(struct intel_gt *gt, enum 
> > > > intel_engine_id id,
> > > > + u8 logical_instance)
> > > >{
> > > > const struct engine_info *info = &intel_engines[id];
> > > > struct drm_i915_private *i915 = gt->i915;
> > > > @@ -334,6 +335,7 @@ static int intel_engine_setup(struct intel_gt *gt, 
> > > > enum intel_engine_id id)
> > > > engine->class = info->class;
> > > > engine->instance = info->instance;
> > > > +   engine->logical_mask = BIT(logical_instance);
> > > > __sprint_engine_name(engine);
> > > > engine->props.heartbeat_interval_ms =
> > > > @@ -572,6 +574,37 @@ static intel_engine_mask_t init_engine_mask(struct 
> > > > intel_gt *gt)
> > > > return info->engine_mask;
> > > >}
> > > > +static void populate_logical_ids(struct intel_gt *gt, u8 *logical_ids,
> > > > +u8 class, const u8 *map, u8 
> > > > num_instances)
> > > > +{
> > > > +   int i, j;
> > > > +   u8 current_logical_id = 0;
> > > > +
> > > > +   for (j = 0; j < num_instances; ++j) {
> > > > +   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
> > > > +   if (!HAS_ENGINE(gt, i) ||
> > > > +   intel_engines[i].class != class)
> > > > +   continue;
> > > > +
> > > > +   if (intel_engines[i].instance == map[j]) {
> > > > +   logical_ids[intel_engines[i].instance] =
> > > > +   current_logical_id++;
> > > > +   break;
> > > > +   }
> > > > +   }
> > > > +   }
> > > > +}
> > > > +
> > > > +static void setup_logical_ids(struct intel_gt *gt, u8 *logical_ids, u8 
> > > > class)
> > > > +{
> > > > +   int i;
> > > > +   u8 map[MAX_ENGINE_INSTANCE + 1];
> > > > +
> > > > +   for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
> > > > +   map[i] = i;
> > > 
> > > What's the point of the map array since it is 1:1 with instance?
> > > 
> > 
> > Future products do not have a 1 to 1 mapping and that mapping can change
> > based on fusing, e.g. XeHP SDV.
> > 
> > Also technically ICL / TGL / ADL physical instance 2 maps to logical
> > instance 1.
> 
> I don't follow the argument. All I can see is that "map[i] = i" always in
> the proposed code, which is then used to check "instance == map[instance]".
> So I'd suggest to remove this array from the code until there is a need for
> it.
> 

Ok, this logic is slightly confusing and makes more sense once we have
non-standard mappings. Yes, map is setup in a 1 to 1 mapping by default
with the value in map[i] being a physical instance. Populate_logical_ids
searches the map finding all physical instances present in the map
assigning each found instance a new logical id increasing by 1 each
time.

e.g. If the map is setup 0-N and only physical instance 0 / 2 are
present they will get logical mapping 0 / 1 respectfully.

This algorithm works for non-standard mappings too /w fused parts. e.g.
on XeHP SDV the map is: { 0, 2, 4, 6, 1, 3, 5, 7 } and if any of the
physical instances can't be found due to fusing the logical mapping is
still correct per the bspec.

This array is absolutely needed for multi-lrc submission to work, even
on ICL / TGL / ADL as the GuC only supports logically contiguous engine
instances.

> > > > +   populate

Re: [Intel-gfx] [PATCH 08/27] drm/i915: Add logical engine mapping

2021-09-13 Thread Tvrtko Ursulin



On 10/09/2021 20:49, Matthew Brost wrote:

On Fri, Sep 10, 2021 at 12:12:42PM +0100, Tvrtko Ursulin wrote:


On 20/08/2021 23:44, Matthew Brost wrote:

Add logical engine mapping. This is required for split-frame, as
workloads need to be placed on engines in a logically contiguous manner.

v2:
   (Daniel Vetter)
- Add kernel doc for new fields

Signed-off-by: Matthew Brost 
---
   drivers/gpu/drm/i915/gt/intel_engine_cs.c | 60 ---
   drivers/gpu/drm/i915/gt/intel_engine_types.h  |  5 ++
   .../drm/i915/gt/intel_execlists_submission.c  |  1 +
   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  2 +-
   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +--
   5 files changed, 60 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 0d9105a31d84..4d790f9a65dd 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -290,7 +290,8 @@ static void nop_irq_handler(struct intel_engine_cs *engine, 
u16 iir)
GEM_DEBUG_WARN_ON(iir);
   }
-static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
+static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
+ u8 logical_instance)
   {
const struct engine_info *info = &intel_engines[id];
struct drm_i915_private *i915 = gt->i915;
@@ -334,6 +335,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
engine->class = info->class;
engine->instance = info->instance;
+   engine->logical_mask = BIT(logical_instance);
__sprint_engine_name(engine);
engine->props.heartbeat_interval_ms =
@@ -572,6 +574,37 @@ static intel_engine_mask_t init_engine_mask(struct 
intel_gt *gt)
return info->engine_mask;
   }
+static void populate_logical_ids(struct intel_gt *gt, u8 *logical_ids,
+u8 class, const u8 *map, u8 num_instances)
+{
+   int i, j;
+   u8 current_logical_id = 0;
+
+   for (j = 0; j < num_instances; ++j) {
+   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
+   if (!HAS_ENGINE(gt, i) ||
+   intel_engines[i].class != class)
+   continue;
+
+   if (intel_engines[i].instance == map[j]) {
+   logical_ids[intel_engines[i].instance] =
+   current_logical_id++;
+   break;
+   }
+   }
+   }
+}
+
+static void setup_logical_ids(struct intel_gt *gt, u8 *logical_ids, u8 class)
+{
+   int i;
+   u8 map[MAX_ENGINE_INSTANCE + 1];
+
+   for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
+   map[i] = i;


What's the point of the map array since it is 1:1 with instance?



Future products do not have a 1 to 1 mapping and that mapping can change
based on fusing, e.g. XeHP SDV.

Also technically ICL / TGL / ADL physical instance 2 maps to logical
instance 1.


I don't follow the argument. All I can see is that "map[i] = i" always 
in the proposed code, which is then used to check "instance == 
map[instance]". So I'd suggest to remove this array from the code until 
there is a need for it.



+   populate_logical_ids(gt, logical_ids, class, map, ARRAY_SIZE(map));
+}
+
   /**
* intel_engines_init_mmio() - allocate and prepare the Engine Command 
Streamers
* @gt: pointer to struct intel_gt
@@ -583,7 +616,8 @@ int intel_engines_init_mmio(struct intel_gt *gt)
struct drm_i915_private *i915 = gt->i915;
const unsigned int engine_mask = init_engine_mask(gt);
unsigned int mask = 0;
-   unsigned int i;
+   unsigned int i, class;
+   u8 logical_ids[MAX_ENGINE_INSTANCE + 1];
int err;
drm_WARN_ON(&i915->drm, engine_mask == 0);
@@ -593,15 +627,23 @@ int intel_engines_init_mmio(struct intel_gt *gt)
if (i915_inject_probe_failure(i915))
return -ENODEV;
-   for (i = 0; i < ARRAY_SIZE(intel_engines); i++) {
-   if (!HAS_ENGINE(gt, i))
-   continue;
+   for (class = 0; class < MAX_ENGINE_CLASS + 1; ++class) {
+   setup_logical_ids(gt, logical_ids, class);
-   err = intel_engine_setup(gt, i);
-   if (err)
-   goto cleanup;
+   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
+   u8 instance = intel_engines[i].instance;
+
+   if (intel_engines[i].class != class ||
+   !HAS_ENGINE(gt, i))
+   continue;
-   mask |= BIT(i);
+   err = intel_engine_setup(gt, i,
+logical_ids[instance]);
+   if (err)
+   goto cleanu

Re: [Intel-gfx] [PATCH 08/27] drm/i915: Add logical engine mapping

2021-09-10 Thread Matthew Brost
On Fri, Sep 10, 2021 at 12:12:42PM +0100, Tvrtko Ursulin wrote:
> 
> On 20/08/2021 23:44, Matthew Brost wrote:
> > Add logical engine mapping. This is required for split-frame, as
> > workloads need to be placed on engines in a logically contiguous manner.
> > 
> > v2:
> >   (Daniel Vetter)
> >- Add kernel doc for new fields
> > 
> > Signed-off-by: Matthew Brost 
> > ---
> >   drivers/gpu/drm/i915/gt/intel_engine_cs.c | 60 ---
> >   drivers/gpu/drm/i915/gt/intel_engine_types.h  |  5 ++
> >   .../drm/i915/gt/intel_execlists_submission.c  |  1 +
> >   drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  2 +-
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +--
> >   5 files changed, 60 insertions(+), 29 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
> > b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > index 0d9105a31d84..4d790f9a65dd 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > @@ -290,7 +290,8 @@ static void nop_irq_handler(struct intel_engine_cs 
> > *engine, u16 iir)
> > GEM_DEBUG_WARN_ON(iir);
> >   }
> > -static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
> > +static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
> > + u8 logical_instance)
> >   {
> > const struct engine_info *info = &intel_engines[id];
> > struct drm_i915_private *i915 = gt->i915;
> > @@ -334,6 +335,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
> > intel_engine_id id)
> > engine->class = info->class;
> > engine->instance = info->instance;
> > +   engine->logical_mask = BIT(logical_instance);
> > __sprint_engine_name(engine);
> > engine->props.heartbeat_interval_ms =
> > @@ -572,6 +574,37 @@ static intel_engine_mask_t init_engine_mask(struct 
> > intel_gt *gt)
> > return info->engine_mask;
> >   }
> > +static void populate_logical_ids(struct intel_gt *gt, u8 *logical_ids,
> > +u8 class, const u8 *map, u8 num_instances)
> > +{
> > +   int i, j;
> > +   u8 current_logical_id = 0;
> > +
> > +   for (j = 0; j < num_instances; ++j) {
> > +   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
> > +   if (!HAS_ENGINE(gt, i) ||
> > +   intel_engines[i].class != class)
> > +   continue;
> > +
> > +   if (intel_engines[i].instance == map[j]) {
> > +   logical_ids[intel_engines[i].instance] =
> > +   current_logical_id++;
> > +   break;
> > +   }
> > +   }
> > +   }
> > +}
> > +
> > +static void setup_logical_ids(struct intel_gt *gt, u8 *logical_ids, u8 
> > class)
> > +{
> > +   int i;
> > +   u8 map[MAX_ENGINE_INSTANCE + 1];
> > +
> > +   for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
> > +   map[i] = i;
> 
> What's the point of the map array since it is 1:1 with instance?
> 

Future products do not have a 1 to 1 mapping and that mapping can change
based on fusing, e.g. XeHP SDV.

Also technically ICL / TGL / ADL physical instance 2 maps to logical
instance 1.

> > +   populate_logical_ids(gt, logical_ids, class, map, ARRAY_SIZE(map));
> > +}
> > +
> >   /**
> >* intel_engines_init_mmio() - allocate and prepare the Engine Command 
> > Streamers
> >* @gt: pointer to struct intel_gt
> > @@ -583,7 +616,8 @@ int intel_engines_init_mmio(struct intel_gt *gt)
> > struct drm_i915_private *i915 = gt->i915;
> > const unsigned int engine_mask = init_engine_mask(gt);
> > unsigned int mask = 0;
> > -   unsigned int i;
> > +   unsigned int i, class;
> > +   u8 logical_ids[MAX_ENGINE_INSTANCE + 1];
> > int err;
> > drm_WARN_ON(&i915->drm, engine_mask == 0);
> > @@ -593,15 +627,23 @@ int intel_engines_init_mmio(struct intel_gt *gt)
> > if (i915_inject_probe_failure(i915))
> > return -ENODEV;
> > -   for (i = 0; i < ARRAY_SIZE(intel_engines); i++) {
> > -   if (!HAS_ENGINE(gt, i))
> > -   continue;
> > +   for (class = 0; class < MAX_ENGINE_CLASS + 1; ++class) {
> > +   setup_logical_ids(gt, logical_ids, class);
> > -   err = intel_engine_setup(gt, i);
> > -   if (err)
> > -   goto cleanup;
> > +   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
> > +   u8 instance = intel_engines[i].instance;
> > +
> > +   if (intel_engines[i].class != class ||
> > +   !HAS_ENGINE(gt, i))
> > +   continue;
> > -   mask |= BIT(i);
> > +   err = intel_engine_setup(gt, i,
> > +logical_ids[instance]);
> > +   if (err)
> > +   goto cleanup;
> > +
> > +   mask |= BIT(i);
> 
> I still this there is a less clu

Re: [Intel-gfx] [PATCH 08/27] drm/i915: Add logical engine mapping

2021-09-10 Thread Tvrtko Ursulin



On 20/08/2021 23:44, Matthew Brost wrote:

Add logical engine mapping. This is required for split-frame, as
workloads need to be placed on engines in a logically contiguous manner.

v2:
  (Daniel Vetter)
   - Add kernel doc for new fields

Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/gt/intel_engine_cs.c | 60 ---
  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  5 ++
  .../drm/i915/gt/intel_execlists_submission.c  |  1 +
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  2 +-
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +--
  5 files changed, 60 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 0d9105a31d84..4d790f9a65dd 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -290,7 +290,8 @@ static void nop_irq_handler(struct intel_engine_cs *engine, 
u16 iir)
GEM_DEBUG_WARN_ON(iir);
  }
  
-static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)

+static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
+ u8 logical_instance)
  {
const struct engine_info *info = &intel_engines[id];
struct drm_i915_private *i915 = gt->i915;
@@ -334,6 +335,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
  
  	engine->class = info->class;

engine->instance = info->instance;
+   engine->logical_mask = BIT(logical_instance);
__sprint_engine_name(engine);
  
  	engine->props.heartbeat_interval_ms =

@@ -572,6 +574,37 @@ static intel_engine_mask_t init_engine_mask(struct 
intel_gt *gt)
return info->engine_mask;
  }
  
+static void populate_logical_ids(struct intel_gt *gt, u8 *logical_ids,

+u8 class, const u8 *map, u8 num_instances)
+{
+   int i, j;
+   u8 current_logical_id = 0;
+
+   for (j = 0; j < num_instances; ++j) {
+   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
+   if (!HAS_ENGINE(gt, i) ||
+   intel_engines[i].class != class)
+   continue;
+
+   if (intel_engines[i].instance == map[j]) {
+   logical_ids[intel_engines[i].instance] =
+   current_logical_id++;
+   break;
+   }
+   }
+   }
+}
+
+static void setup_logical_ids(struct intel_gt *gt, u8 *logical_ids, u8 class)
+{
+   int i;
+   u8 map[MAX_ENGINE_INSTANCE + 1];
+
+   for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
+   map[i] = i;


What's the point of the map array since it is 1:1 with instance?


+   populate_logical_ids(gt, logical_ids, class, map, ARRAY_SIZE(map));
+}
+
  /**
   * intel_engines_init_mmio() - allocate and prepare the Engine Command 
Streamers
   * @gt: pointer to struct intel_gt
@@ -583,7 +616,8 @@ int intel_engines_init_mmio(struct intel_gt *gt)
struct drm_i915_private *i915 = gt->i915;
const unsigned int engine_mask = init_engine_mask(gt);
unsigned int mask = 0;
-   unsigned int i;
+   unsigned int i, class;
+   u8 logical_ids[MAX_ENGINE_INSTANCE + 1];
int err;
  
  	drm_WARN_ON(&i915->drm, engine_mask == 0);

@@ -593,15 +627,23 @@ int intel_engines_init_mmio(struct intel_gt *gt)
if (i915_inject_probe_failure(i915))
return -ENODEV;
  
-	for (i = 0; i < ARRAY_SIZE(intel_engines); i++) {

-   if (!HAS_ENGINE(gt, i))
-   continue;
+   for (class = 0; class < MAX_ENGINE_CLASS + 1; ++class) {
+   setup_logical_ids(gt, logical_ids, class);
  
-		err = intel_engine_setup(gt, i);

-   if (err)
-   goto cleanup;
+   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
+   u8 instance = intel_engines[i].instance;
+
+   if (intel_engines[i].class != class ||
+   !HAS_ENGINE(gt, i))
+   continue;
  
-		mask |= BIT(i);

+   err = intel_engine_setup(gt, i,
+logical_ids[instance]);
+   if (err)
+   goto cleanup;
+
+   mask |= BIT(i);


I still this there is a less clunky way to set this up in less code and 
more readable at the same time. Like do it in two passes so you can 
iterate gt->engine_class[] array instead of having to implement a skip 
condition (both on class and HAS_ENGINE at two places) and also avoid 
walking the flat intel_engines array recursively.



+   }
}
  
  	/*

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index ed91bcff20eb..fddf35546b58 10

[PATCH 08/27] drm/i915: Add logical engine mapping

2021-08-20 Thread Matthew Brost
Add logical engine mapping. This is required for split-frame, as
workloads need to be placed on engines in a logically contiguous manner.

v2:
 (Daniel Vetter)
  - Add kernel doc for new fields

Signed-off-by: Matthew Brost 
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c | 60 ---
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  5 ++
 .../drm/i915/gt/intel_execlists_submission.c  |  1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c|  2 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 21 +--
 5 files changed, 60 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 0d9105a31d84..4d790f9a65dd 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -290,7 +290,8 @@ static void nop_irq_handler(struct intel_engine_cs *engine, 
u16 iir)
GEM_DEBUG_WARN_ON(iir);
 }
 
-static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
+static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id,
+ u8 logical_instance)
 {
const struct engine_info *info = &intel_engines[id];
struct drm_i915_private *i915 = gt->i915;
@@ -334,6 +335,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
 
engine->class = info->class;
engine->instance = info->instance;
+   engine->logical_mask = BIT(logical_instance);
__sprint_engine_name(engine);
 
engine->props.heartbeat_interval_ms =
@@ -572,6 +574,37 @@ static intel_engine_mask_t init_engine_mask(struct 
intel_gt *gt)
return info->engine_mask;
 }
 
+static void populate_logical_ids(struct intel_gt *gt, u8 *logical_ids,
+u8 class, const u8 *map, u8 num_instances)
+{
+   int i, j;
+   u8 current_logical_id = 0;
+
+   for (j = 0; j < num_instances; ++j) {
+   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
+   if (!HAS_ENGINE(gt, i) ||
+   intel_engines[i].class != class)
+   continue;
+
+   if (intel_engines[i].instance == map[j]) {
+   logical_ids[intel_engines[i].instance] =
+   current_logical_id++;
+   break;
+   }
+   }
+   }
+}
+
+static void setup_logical_ids(struct intel_gt *gt, u8 *logical_ids, u8 class)
+{
+   int i;
+   u8 map[MAX_ENGINE_INSTANCE + 1];
+
+   for (i = 0; i < MAX_ENGINE_INSTANCE + 1; ++i)
+   map[i] = i;
+   populate_logical_ids(gt, logical_ids, class, map, ARRAY_SIZE(map));
+}
+
 /**
  * intel_engines_init_mmio() - allocate and prepare the Engine Command 
Streamers
  * @gt: pointer to struct intel_gt
@@ -583,7 +616,8 @@ int intel_engines_init_mmio(struct intel_gt *gt)
struct drm_i915_private *i915 = gt->i915;
const unsigned int engine_mask = init_engine_mask(gt);
unsigned int mask = 0;
-   unsigned int i;
+   unsigned int i, class;
+   u8 logical_ids[MAX_ENGINE_INSTANCE + 1];
int err;
 
drm_WARN_ON(&i915->drm, engine_mask == 0);
@@ -593,15 +627,23 @@ int intel_engines_init_mmio(struct intel_gt *gt)
if (i915_inject_probe_failure(i915))
return -ENODEV;
 
-   for (i = 0; i < ARRAY_SIZE(intel_engines); i++) {
-   if (!HAS_ENGINE(gt, i))
-   continue;
+   for (class = 0; class < MAX_ENGINE_CLASS + 1; ++class) {
+   setup_logical_ids(gt, logical_ids, class);
 
-   err = intel_engine_setup(gt, i);
-   if (err)
-   goto cleanup;
+   for (i = 0; i < ARRAY_SIZE(intel_engines); ++i) {
+   u8 instance = intel_engines[i].instance;
+
+   if (intel_engines[i].class != class ||
+   !HAS_ENGINE(gt, i))
+   continue;
 
-   mask |= BIT(i);
+   err = intel_engine_setup(gt, i,
+logical_ids[instance]);
+   if (err)
+   goto cleanup;
+
+   mask |= BIT(i);
+   }
}
 
/*
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index ed91bcff20eb..fddf35546b58 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -266,6 +266,11 @@ struct intel_engine_cs {
unsigned int guc_id;
 
intel_engine_mask_t mask;
+   /**
+* @logical_mask: logical mask of engine, reported to user space via
+* query IOCTL and used to communicate with the GuC in logical space
+*/
+   intel_engine_mask_t logical_mas