Re: [Intel-gfx] [PATCH v3] drm/i915: Consolidate TLB invalidation flow
On 02/02/2023 09:39, Andrzej Hajda wrote: On 02.02.2023 09:33, Tvrtko Ursulin wrote: On 02/02/2023 07:43, Andrzej Hajda wrote: On 01.02.2023 17:51, Tvrtko Ursulin wrote: [snip] Btw - do you have any idea why the test is suppressed already?! CI told me BAT was a success... Except this patch, igt@i915_selftest@live@gt_tlb always succeeds[1][2]. So I guess this is just CI logic which do not trust new tests, sounds reasonable. Lets wait few days to see if it changes. Did another run and it is all still green. Also have another r-b from Matt now. Okay to merge from your point of view? Regards, Tvrtko [1]: http://gfx-ci.igk.intel.com/cibuglog-ng/results/all?query_key=d3cc1f04e52acd0f911cd54fd855a3f085a40e14 [2]: https://lore.kernel.org/intel-gfx/?q=igt%40i915_selftest%40live%40gt_tlb Regards Andrzej Regards, Tvrtko
Re: [Intel-gfx] [PATCH v3] drm/i915: Consolidate TLB invalidation flow
On Wed, Feb 01, 2023 at 04:51:46PM +, Tvrtko Ursulin wrote: > From: Tvrtko Ursulin > > As the logic for selecting the register and corresponsing values grew, the > code become a bit unsightly. Consolidate by storing the required values at > engine init time in the engine itself, and by doing so minimise the amount > of invariant platform and engine checks during each and every TLB > invalidation. > > v2: > * Fail engine probe if TLB invlidations registers are unknown. > > v3: > * Rebase. > > Signed-off-by: Tvrtko Ursulin > Cc: Andrzej Hajda > Cc: Matt Roper > Reviewed-by: Andrzej Hajda # v1 Reviewed-by: Matt Roper > --- > drivers/gpu/drm/i915/gt/intel_engine_cs.c| 96 + > drivers/gpu/drm/i915/gt/intel_engine_types.h | 15 ++ > drivers/gpu/drm/i915/gt/intel_gt.c | 138 +++ > 3 files changed, 133 insertions(+), 116 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c > b/drivers/gpu/drm/i915/gt/intel_engine_cs.c > index d4e29da74612..e430945743ec 100644 > --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c > +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c > @@ -9,6 +9,7 @@ > > #include "gem/i915_gem_context.h" > #include "gem/i915_gem_internal.h" > +#include "gt/intel_gt_print.h" > #include "gt/intel_gt_regs.h" > > #include "i915_cmd_parser.h" > @@ -1143,12 +1144,107 @@ static int init_status_page(struct intel_engine_cs > *engine) > return ret; > } > > +static int intel_engine_init_tlb_invalidation(struct intel_engine_cs *engine) > +{ > + static const union intel_engine_tlb_inv_reg gen8_regs[] = { > + [RENDER_CLASS].reg = GEN8_RTCR, > + [VIDEO_DECODE_CLASS].reg= GEN8_M1TCR, /* , GEN8_M2TCR */ > + [VIDEO_ENHANCEMENT_CLASS].reg = GEN8_VTCR, > + [COPY_ENGINE_CLASS].reg = GEN8_BTCR, > + }; > + static const union intel_engine_tlb_inv_reg gen12_regs[] = { > + [RENDER_CLASS].reg = GEN12_GFX_TLB_INV_CR, > + [VIDEO_DECODE_CLASS].reg= GEN12_VD_TLB_INV_CR, > + [VIDEO_ENHANCEMENT_CLASS].reg = GEN12_VE_TLB_INV_CR, > + [COPY_ENGINE_CLASS].reg = GEN12_BLT_TLB_INV_CR, > + [COMPUTE_CLASS].reg = GEN12_COMPCTX_TLB_INV_CR, > + }; > + static const union intel_engine_tlb_inv_reg xehp_regs[] = { > + [RENDER_CLASS].mcr_reg= XEHP_GFX_TLB_INV_CR, > + [VIDEO_DECODE_CLASS].mcr_reg = XEHP_VD_TLB_INV_CR, > + [VIDEO_ENHANCEMENT_CLASS].mcr_reg = XEHP_VE_TLB_INV_CR, > + [COPY_ENGINE_CLASS].mcr_reg = XEHP_BLT_TLB_INV_CR, > + [COMPUTE_CLASS].mcr_reg = XEHP_COMPCTX_TLB_INV_CR, > + }; > + struct drm_i915_private *i915 = engine->i915; > + const union intel_engine_tlb_inv_reg *regs; > + union intel_engine_tlb_inv_reg reg; > + unsigned int class = engine->class; > + unsigned int num = 0; > + u32 val; > + > + /* > + * New platforms should not be added with catch-all-newer (>=) > + * condition so that any later platform added triggers the below warning > + * and in turn mandates a human cross-check of whether the invalidation > + * flows have compatible semantics. > + * > + * For instance with the 11.00 -> 12.00 transition three out of five > + * respective engine registers were moved to masked type. Then after the > + * 12.00 -> 12.50 transition multi cast handling is required too. > + */ > + > + if (GRAPHICS_VER_FULL(i915) == IP_VER(12, 50) || > + GRAPHICS_VER_FULL(i915) == IP_VER(12, 55)) { > + regs = xehp_regs; > + num = ARRAY_SIZE(xehp_regs); > + } else if (GRAPHICS_VER_FULL(i915) == IP_VER(12, 0) || > +GRAPHICS_VER_FULL(i915) == IP_VER(12, 10)) { > + regs = gen12_regs; > + num = ARRAY_SIZE(gen12_regs); > + } else if (GRAPHICS_VER(i915) >= 8 && GRAPHICS_VER(i915) <= 11) { > + regs = gen8_regs; > + num = ARRAY_SIZE(gen8_regs); > + } else if (GRAPHICS_VER(i915) < 8) { > + return 0; > + } > + > + if (gt_WARN_ONCE(engine->gt, !num, > + "Platform does not implement TLB invalidation!")) > + return -ENODEV; > + > + if (gt_WARN_ON_ONCE(engine->gt, > + class >= num || > + (!regs[class].reg.reg && > + !regs[class].mcr_reg.reg))) > + return -ERANGE; > + > + reg = regs[class]; > + > + if (GRAPHICS_VER(i915) == 8 && class == VIDEO_DECODE_CLASS) { > + reg.reg.reg += 4 * engine->instance; /* GEN8_M2TCR */ > + val = 0; > + } else { > + val = engine->instance; > + } > + > + val = BIT(val); > + > + engine->tlb_inv.mcr = regs == xehp_regs; > + engine->tlb_inv.reg = reg; > +
Re: [Intel-gfx] [PATCH v3] drm/i915: Consolidate TLB invalidation flow
On 02.02.2023 09:33, Tvrtko Ursulin wrote: On 02/02/2023 07:43, Andrzej Hajda wrote: On 01.02.2023 17:51, Tvrtko Ursulin wrote: [snip] Btw - do you have any idea why the test is suppressed already?! CI told me BAT was a success... Except this patch, igt@i915_selftest@live@gt_tlb always succeeds[1][2]. So I guess this is just CI logic which do not trust new tests, sounds reasonable. Lets wait few days to see if it changes. [1]: http://gfx-ci.igk.intel.com/cibuglog-ng/results/all?query_key=d3cc1f04e52acd0f911cd54fd855a3f085a40e14 [2]: https://lore.kernel.org/intel-gfx/?q=igt%40i915_selftest%40live%40gt_tlb Regards Andrzej Regards, Tvrtko
Re: [Intel-gfx] [PATCH v3] drm/i915: Consolidate TLB invalidation flow
On 02/02/2023 07:43, Andrzej Hajda wrote: On 01.02.2023 17:51, Tvrtko Ursulin wrote: [snip] +static int intel_engine_init_tlb_invalidation(struct intel_engine_cs *engine) +{ + static const union intel_engine_tlb_inv_reg gen8_regs[] = { + [RENDER_CLASS].reg = GEN8_RTCR, + [VIDEO_DECODE_CLASS].reg = GEN8_M1TCR, /* , GEN8_M2TCR */ + [VIDEO_ENHANCEMENT_CLASS].reg = GEN8_VTCR, + [COPY_ENGINE_CLASS].reg = GEN8_BTCR, + }; + static const union intel_engine_tlb_inv_reg gen12_regs[] = { + [RENDER_CLASS].reg = GEN12_GFX_TLB_INV_CR, + [VIDEO_DECODE_CLASS].reg = GEN12_VD_TLB_INV_CR, + [VIDEO_ENHANCEMENT_CLASS].reg = GEN12_VE_TLB_INV_CR, + [COPY_ENGINE_CLASS].reg = GEN12_BLT_TLB_INV_CR, + [COMPUTE_CLASS].reg = GEN12_COMPCTX_TLB_INV_CR, + }; + static const union intel_engine_tlb_inv_reg xehp_regs[] = { + [RENDER_CLASS].mcr_reg = XEHP_GFX_TLB_INV_CR, + [VIDEO_DECODE_CLASS].mcr_reg = XEHP_VD_TLB_INV_CR, + [VIDEO_ENHANCEMENT_CLASS].mcr_reg = XEHP_VE_TLB_INV_CR, + [COPY_ENGINE_CLASS].mcr_reg = XEHP_BLT_TLB_INV_CR, + [COMPUTE_CLASS].mcr_reg = XEHP_COMPCTX_TLB_INV_CR, + }; + struct drm_i915_private *i915 = engine->i915; + const union intel_engine_tlb_inv_reg *regs; + union intel_engine_tlb_inv_reg reg; + unsigned int class = engine->class; + unsigned int num = 0; + u32 val; + + /* + * New platforms should not be added with catch-all-newer (>=) + * condition so that any later platform added triggers the below warning + * and in turn mandates a human cross-check of whether the invalidation + * flows have compatible semantics. + * + * For instance with the 11.00 -> 12.00 transition three out of five + * respective engine registers were moved to masked type. Then after the + * 12.00 -> 12.50 transition multi cast handling is required too. + */ + + if (GRAPHICS_VER_FULL(i915) == IP_VER(12, 50) || + GRAPHICS_VER_FULL(i915) == IP_VER(12, 55)) { + regs = xehp_regs; + num = ARRAY_SIZE(xehp_regs); + } else if (GRAPHICS_VER_FULL(i915) == IP_VER(12, 0) || + GRAPHICS_VER_FULL(i915) == IP_VER(12, 10)) { + regs = gen12_regs; + num = ARRAY_SIZE(gen12_regs); + } else if (GRAPHICS_VER(i915) >= 8 && GRAPHICS_VER(i915) <= 11) { + regs = gen8_regs; + num = ARRAY_SIZE(gen8_regs); + } else if (GRAPHICS_VER(i915) < 8) { + return 0; + } + + if (gt_WARN_ONCE(engine->gt, !num, + "Platform does not implement TLB invalidation!")) + return -ENODEV; + + if (gt_WARN_ON_ONCE(engine->gt, + class >= num || + (!regs[class].reg.reg && + !regs[class].mcr_reg.reg))) + return -ERANGE; + + reg = regs[class]; + + if (GRAPHICS_VER(i915) == 8 && class == VIDEO_DECODE_CLASS) { As selftest pointed out it should cover also gen 9-11. Btw maybe it is worth to convert this pseudo array indexing to direct assignment: if ((GRAPHICS_VER(i915) <= 11 && class == VIDEO_DECODE_CLASS && engine->instance == 1) { reg.reg = GEN8_M2TCR; val = 0; } Yes good call, v4 sent. Btw - do you have any idea why the test is suppressed already?! CI told me BAT was a success... Regards, Tvrtko
Re: [Intel-gfx] [PATCH v3] drm/i915: Consolidate TLB invalidation flow
On 01.02.2023 17:51, Tvrtko Ursulin wrote: From: Tvrtko Ursulin As the logic for selecting the register and corresponsing values grew, the code become a bit unsightly. Consolidate by storing the required values at engine init time in the engine itself, and by doing so minimise the amount of invariant platform and engine checks during each and every TLB invalidation. v2: * Fail engine probe if TLB invlidations registers are unknown. v3: * Rebase. Signed-off-by: Tvrtko Ursulin Cc: Andrzej Hajda Cc: Matt Roper Reviewed-by: Andrzej Hajda # v1 --- drivers/gpu/drm/i915/gt/intel_engine_cs.c| 96 + drivers/gpu/drm/i915/gt/intel_engine_types.h | 15 ++ drivers/gpu/drm/i915/gt/intel_gt.c | 138 +++ 3 files changed, 133 insertions(+), 116 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index d4e29da74612..e430945743ec 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -9,6 +9,7 @@ #include "gem/i915_gem_context.h" #include "gem/i915_gem_internal.h" +#include "gt/intel_gt_print.h" #include "gt/intel_gt_regs.h" #include "i915_cmd_parser.h" @@ -1143,12 +1144,107 @@ static int init_status_page(struct intel_engine_cs *engine) return ret; } +static int intel_engine_init_tlb_invalidation(struct intel_engine_cs *engine) +{ + static const union intel_engine_tlb_inv_reg gen8_regs[] = { + [RENDER_CLASS].reg = GEN8_RTCR, + [VIDEO_DECODE_CLASS].reg= GEN8_M1TCR, /* , GEN8_M2TCR */ + [VIDEO_ENHANCEMENT_CLASS].reg = GEN8_VTCR, + [COPY_ENGINE_CLASS].reg = GEN8_BTCR, + }; + static const union intel_engine_tlb_inv_reg gen12_regs[] = { + [RENDER_CLASS].reg = GEN12_GFX_TLB_INV_CR, + [VIDEO_DECODE_CLASS].reg= GEN12_VD_TLB_INV_CR, + [VIDEO_ENHANCEMENT_CLASS].reg = GEN12_VE_TLB_INV_CR, + [COPY_ENGINE_CLASS].reg = GEN12_BLT_TLB_INV_CR, + [COMPUTE_CLASS].reg = GEN12_COMPCTX_TLB_INV_CR, + }; + static const union intel_engine_tlb_inv_reg xehp_regs[] = { + [RENDER_CLASS].mcr_reg= XEHP_GFX_TLB_INV_CR, + [VIDEO_DECODE_CLASS].mcr_reg = XEHP_VD_TLB_INV_CR, + [VIDEO_ENHANCEMENT_CLASS].mcr_reg = XEHP_VE_TLB_INV_CR, + [COPY_ENGINE_CLASS].mcr_reg = XEHP_BLT_TLB_INV_CR, + [COMPUTE_CLASS].mcr_reg = XEHP_COMPCTX_TLB_INV_CR, + }; + struct drm_i915_private *i915 = engine->i915; + const union intel_engine_tlb_inv_reg *regs; + union intel_engine_tlb_inv_reg reg; + unsigned int class = engine->class; + unsigned int num = 0; + u32 val; + + /* +* New platforms should not be added with catch-all-newer (>=) +* condition so that any later platform added triggers the below warning +* and in turn mandates a human cross-check of whether the invalidation +* flows have compatible semantics. +* +* For instance with the 11.00 -> 12.00 transition three out of five +* respective engine registers were moved to masked type. Then after the +* 12.00 -> 12.50 transition multi cast handling is required too. +*/ + + if (GRAPHICS_VER_FULL(i915) == IP_VER(12, 50) || + GRAPHICS_VER_FULL(i915) == IP_VER(12, 55)) { + regs = xehp_regs; + num = ARRAY_SIZE(xehp_regs); + } else if (GRAPHICS_VER_FULL(i915) == IP_VER(12, 0) || + GRAPHICS_VER_FULL(i915) == IP_VER(12, 10)) { + regs = gen12_regs; + num = ARRAY_SIZE(gen12_regs); + } else if (GRAPHICS_VER(i915) >= 8 && GRAPHICS_VER(i915) <= 11) { + regs = gen8_regs; + num = ARRAY_SIZE(gen8_regs); + } else if (GRAPHICS_VER(i915) < 8) { + return 0; + } + + if (gt_WARN_ONCE(engine->gt, !num, +"Platform does not implement TLB invalidation!")) + return -ENODEV; + + if (gt_WARN_ON_ONCE(engine->gt, +class >= num || +(!regs[class].reg.reg && + !regs[class].mcr_reg.reg))) + return -ERANGE; + + reg = regs[class]; + + if (GRAPHICS_VER(i915) == 8 && class == VIDEO_DECODE_CLASS) { As selftest pointed out it should cover also gen 9-11. Btw maybe it is worth to convert this pseudo array indexing to direct assignment: if ((GRAPHICS_VER(i915) <= 11 && class == VIDEO_DECODE_CLASS && engine->instance == 1) { reg.reg = GEN8_M2TCR; val = 0; } Regards Andrzej + reg.reg.reg += 4 * engine->instance; /* GEN8_M2TCR */ + val = 0; + } else { +
[Intel-gfx] [PATCH v3] drm/i915: Consolidate TLB invalidation flow
From: Tvrtko Ursulin As the logic for selecting the register and corresponsing values grew, the code become a bit unsightly. Consolidate by storing the required values at engine init time in the engine itself, and by doing so minimise the amount of invariant platform and engine checks during each and every TLB invalidation. v2: * Fail engine probe if TLB invlidations registers are unknown. v3: * Rebase. Signed-off-by: Tvrtko Ursulin Cc: Andrzej Hajda Cc: Matt Roper Reviewed-by: Andrzej Hajda # v1 --- drivers/gpu/drm/i915/gt/intel_engine_cs.c| 96 + drivers/gpu/drm/i915/gt/intel_engine_types.h | 15 ++ drivers/gpu/drm/i915/gt/intel_gt.c | 138 +++ 3 files changed, 133 insertions(+), 116 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index d4e29da74612..e430945743ec 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -9,6 +9,7 @@ #include "gem/i915_gem_context.h" #include "gem/i915_gem_internal.h" +#include "gt/intel_gt_print.h" #include "gt/intel_gt_regs.h" #include "i915_cmd_parser.h" @@ -1143,12 +1144,107 @@ static int init_status_page(struct intel_engine_cs *engine) return ret; } +static int intel_engine_init_tlb_invalidation(struct intel_engine_cs *engine) +{ + static const union intel_engine_tlb_inv_reg gen8_regs[] = { + [RENDER_CLASS].reg = GEN8_RTCR, + [VIDEO_DECODE_CLASS].reg= GEN8_M1TCR, /* , GEN8_M2TCR */ + [VIDEO_ENHANCEMENT_CLASS].reg = GEN8_VTCR, + [COPY_ENGINE_CLASS].reg = GEN8_BTCR, + }; + static const union intel_engine_tlb_inv_reg gen12_regs[] = { + [RENDER_CLASS].reg = GEN12_GFX_TLB_INV_CR, + [VIDEO_DECODE_CLASS].reg= GEN12_VD_TLB_INV_CR, + [VIDEO_ENHANCEMENT_CLASS].reg = GEN12_VE_TLB_INV_CR, + [COPY_ENGINE_CLASS].reg = GEN12_BLT_TLB_INV_CR, + [COMPUTE_CLASS].reg = GEN12_COMPCTX_TLB_INV_CR, + }; + static const union intel_engine_tlb_inv_reg xehp_regs[] = { + [RENDER_CLASS].mcr_reg= XEHP_GFX_TLB_INV_CR, + [VIDEO_DECODE_CLASS].mcr_reg = XEHP_VD_TLB_INV_CR, + [VIDEO_ENHANCEMENT_CLASS].mcr_reg = XEHP_VE_TLB_INV_CR, + [COPY_ENGINE_CLASS].mcr_reg = XEHP_BLT_TLB_INV_CR, + [COMPUTE_CLASS].mcr_reg = XEHP_COMPCTX_TLB_INV_CR, + }; + struct drm_i915_private *i915 = engine->i915; + const union intel_engine_tlb_inv_reg *regs; + union intel_engine_tlb_inv_reg reg; + unsigned int class = engine->class; + unsigned int num = 0; + u32 val; + + /* +* New platforms should not be added with catch-all-newer (>=) +* condition so that any later platform added triggers the below warning +* and in turn mandates a human cross-check of whether the invalidation +* flows have compatible semantics. +* +* For instance with the 11.00 -> 12.00 transition three out of five +* respective engine registers were moved to masked type. Then after the +* 12.00 -> 12.50 transition multi cast handling is required too. +*/ + + if (GRAPHICS_VER_FULL(i915) == IP_VER(12, 50) || + GRAPHICS_VER_FULL(i915) == IP_VER(12, 55)) { + regs = xehp_regs; + num = ARRAY_SIZE(xehp_regs); + } else if (GRAPHICS_VER_FULL(i915) == IP_VER(12, 0) || + GRAPHICS_VER_FULL(i915) == IP_VER(12, 10)) { + regs = gen12_regs; + num = ARRAY_SIZE(gen12_regs); + } else if (GRAPHICS_VER(i915) >= 8 && GRAPHICS_VER(i915) <= 11) { + regs = gen8_regs; + num = ARRAY_SIZE(gen8_regs); + } else if (GRAPHICS_VER(i915) < 8) { + return 0; + } + + if (gt_WARN_ONCE(engine->gt, !num, +"Platform does not implement TLB invalidation!")) + return -ENODEV; + + if (gt_WARN_ON_ONCE(engine->gt, +class >= num || +(!regs[class].reg.reg && + !regs[class].mcr_reg.reg))) + return -ERANGE; + + reg = regs[class]; + + if (GRAPHICS_VER(i915) == 8 && class == VIDEO_DECODE_CLASS) { + reg.reg.reg += 4 * engine->instance; /* GEN8_M2TCR */ + val = 0; + } else { + val = engine->instance; + } + + val = BIT(val); + + engine->tlb_inv.mcr = regs == xehp_regs; + engine->tlb_inv.reg = reg; + engine->tlb_inv.done = val; + + if (GRAPHICS_VER(i915) >= 12 && + (engine->class == VIDEO_DECODE_CLASS || +engine->class == VIDEO_ENHANCEMENT_CLASS || +