Re: [Intel-gfx] [PATCH] drm/i915/display/xelpd: Fix incorrect color capability reporting
> -Original Message- > From: Sharma, Swati2 > Sent: Thursday, July 8, 2021 1:07 AM > To: Shankar, Uma ; intel-gfx@lists.freedesktop.org > Subject: Re: [PATCH] drm/i915/display/xelpd: Fix incorrect color capability > reporting > > Reviewed-by: Swati Sharma Merged the change to drm-intel-next. Thanks for the review Regards, Uma Shankar > > On 07-Jul-21 3:22 PM, Uma Shankar wrote: > > On XELPD platforms, color management support is not yet enabled. > > Fix wrongly reporting the same through platform info, which was > > resulting in incorrect initialization and usage. > > > > Cc: Swati Sharma > > Signed-off-by: Uma Shankar > > --- > > drivers/gpu/drm/i915/i915_pci.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/gpu/drm/i915/i915_pci.c > > b/drivers/gpu/drm/i915/i915_pci.c index a7bfdd827bc8..8ff1990528d1 > > 100644 > > --- a/drivers/gpu/drm/i915/i915_pci.c > > +++ b/drivers/gpu/drm/i915/i915_pci.c > > @@ -947,7 +947,7 @@ static const struct intel_device_info adl_s_info = > > { > > > > #define XE_LPD_FEATURES \ > > .abox_mask = GENMASK(1, 0), > > \ > > - .color = { .degamma_lut_size = 33, .gamma_lut_size = 262145 }, > > \ > > + .color = { .degamma_lut_size = 0, .gamma_lut_size = 0 }, > > \ > > .cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | > \ > > BIT(TRANSCODER_C) | BIT(TRANSCODER_D), > \ > > .dbuf.size = 4096, > > \ > > > > -- > ~Swati Sharma ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 09/10] drm/i915/step: Add intel_step_name() helper
On Thu, Jul 08, 2021 at 04:18:20PM -0700, Anusha Srivatsa wrote: > Add a helper to convert the step info to string. > This is specifically useful when we want to load a specific > firmware for a given stepping/substepping combination. What if we use macros to generate the per-stepping code here as well as the stepping values in the enum? In intel_step.h: #define STEPPING_NAME_LIST(func) \ func(A0) func(A1) func(A2) func(B0) ... #define STEPPING_ENUM_VAL(name) STEP_##name, enum intel_step { STEP_NONE = 0, STEPPING_NAME_LIST(STEPPING_ENUM_VAL) STEP_FUTURE, STEP_FOREVER, }; and in intel_step.c: #define STEPPING_NAME_CASE(name)\ case STEP_##name: \ return #name; \ break; const char *intel_step_name(enum intel_step step) { switch(step) { STEPPING_NAME_LIST(STEPPING_NAME_CASE) default: return "**"; } } This has the advantage that anytime a new stepping is added (in STEPPING_NAME_LIST) it will generate a new "STEP_XX" enum value and a new case statement to return "XX" as the name; we won't have to remember to update two separate places in the code. Matt > > Suggested-by: Jani Nikula > Signed-off-by: Anusha Srivatsa > --- > drivers/gpu/drm/i915/intel_step.c | 58 +++ > drivers/gpu/drm/i915/intel_step.h | 1 + > 2 files changed, 59 insertions(+) > > diff --git a/drivers/gpu/drm/i915/intel_step.c > b/drivers/gpu/drm/i915/intel_step.c > index 99c0d3df001b..9af7f30b777e 100644 > --- a/drivers/gpu/drm/i915/intel_step.c > +++ b/drivers/gpu/drm/i915/intel_step.c > @@ -182,3 +182,61 @@ void intel_step_init(struct drm_i915_private *i915) > > RUNTIME_INFO(i915)->step = step; > } > + > +const char *intel_step_name(enum intel_step step) { > + switch (step) { > + case STEP_A0: > + return "A0"; > + break; > + case STEP_A1: > + return "A1"; > + break; > + case STEP_A2: > + return "A2"; > + break; > + case STEP_B0: > + return "B0"; > + break; > + case STEP_B1: > + return "B1"; > + break; > + case STEP_B2: > + return "B2"; > + break; > + case STEP_C0: > + return "C0"; > + break; > + case STEP_C1: > + return "C1"; > + break; > + case STEP_D0: > + return "D0"; > + break; > + case STEP_D1: > + return "D1"; > + break; > + case STEP_E0: > + return "E0"; > + break; > + case STEP_F0: > + return "F0"; > + break; > + case STEP_G0: > + return "G0"; > + break; > + case STEP_H0: > + return "H0"; > + break; > + case STEP_I0: > + return "I0"; > + break; > + case STEP_I1: > + return "I1"; > + break; > + case STEP_J0: > + return "J0"; > + break; > + default: > + return "**"; > + } > +} > diff --git a/drivers/gpu/drm/i915/intel_step.h > b/drivers/gpu/drm/i915/intel_step.h > index 3e8b2babd9da..2fbe51483472 100644 > --- a/drivers/gpu/drm/i915/intel_step.h > +++ b/drivers/gpu/drm/i915/intel_step.h > @@ -43,5 +43,6 @@ enum intel_step { > }; > > void intel_step_init(struct drm_i915_private *i915); > +const char *intel_step_name(enum intel_step step); > > #endif /* __INTEL_STEP_H__ */ > -- > 2.32.0 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Matt Roper Graphics Software Engineer VTT-OSGC Platform Enablement Intel Corporation (916) 356-2795 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 08/10] drm/i915/bxt: Use revid->stepping tables
On Thu, Jul 08, 2021 at 04:18:19PM -0700, Anusha Srivatsa wrote: > Switch BXT to use a revid->stepping table as we're trying to do on all > platforms going forward. > > Signed-off-by: Anusha Srivatsa > --- > drivers/gpu/drm/i915/intel_step.c | 12 > 1 file changed, 12 insertions(+) > > diff --git a/drivers/gpu/drm/i915/intel_step.c > b/drivers/gpu/drm/i915/intel_step.c > index c4ce02d22828..99c0d3df001b 100644 > --- a/drivers/gpu/drm/i915/intel_step.c > +++ b/drivers/gpu/drm/i915/intel_step.c > @@ -31,6 +31,15 @@ static const struct intel_step_info skl_revid_step_tbl[] = > { > [0xA] = { .gt_step = STEP_I1, .display_step = STEP_I1 }, > }; > > +static const struct intel_step_info bxt_revids[] = { > + [0] = { .gt_step = STEP_A0 }, > + [1] = { .gt_step = STEP_A1 }, > + [2] = { .gt_step = STEP_A2 }, > + [6] = { .gt_step = STEP_B0 }, > + [7] = { .gt_step = STEP_B1 }, > + [8] = { .gt_step = STEP_B2 }, I realize the mistake originates from the #define's that you're replacing with these tables, but the values in this table aren't the correct GT/display steppings, but rather the SoC stepping; that's the wrong thing for us to be matching on for workarounds, DMC versions, etc. You want to use column #4 of the bspec table, not column #2. Also we need to update this to use the proper revisions from the bspec; most of the ones you have here were temporary placeholders before the platform was released and the actual revisions that showed up in real hardware are higher than any of your table entries. If we take into account the right-most column of the bspec we'd actually want: static const struct intel_step_info bxt_revids[] = { [0xA] = { .gt_step = STEP_C0 }, [0xB] = { .gt_step = STEP_C0 }, [0xC] = { .gt_step = STEP_D0 }, [0xD] = { .gt_step = STEP_E0 }, }; Matt > +}; > + > static const struct intel_step_info kbl_revid_step_tbl[] = { > [0] = { .gt_step = STEP_A0, .display_step = STEP_A0 }, > [1] = { .gt_step = STEP_B0, .display_step = STEP_B0 }, > @@ -129,6 +138,9 @@ void intel_step_init(struct drm_i915_private *i915) > } else if (IS_KABYLAKE(i915)) { > revids = kbl_revid_step_tbl; > size = ARRAY_SIZE(kbl_revid_step_tbl); > + } else if (IS_BROXTON(i915)) { > + revids = bxt_revids; > + size = ARRAY_SIZE(bxt_revids); > } else if (IS_SKYLAKE(i915)) { > revids = skl_revid_step_tbl; > size = ARRAY_SIZE(skl_revid_step_tbl); > -- > 2.32.0 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx -- Matt Roper Graphics Software Engineer VTT-OSGC Platform Enablement Intel Corporation (916) 356-2795 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.BAT: failure for Get stepping info from RUNTIME_INFO->step
== Series Details == Series: Get stepping info from RUNTIME_INFO->step URL : https://patchwork.freedesktop.org/series/92346/ State : failure == Summary == CI Bug Log - changes from CI_DRM_10320 -> Patchwork_20560 Summary --- **FAILURE** Serious unknown changes coming with Patchwork_20560 absolutely need to be verified manually. If you think the reported changes have nothing to do with the changes introduced in Patchwork_20560, please notify your bug team to allow them to document this new failure mode, which will reduce false positives in CI. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20560/index.html Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_20560: ### IGT changes ### Possible regressions * igt@core_hotunplug@unbind-rebind: - fi-bxt-dsi: [PASS][1] -> [DMESG-WARN][2] [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10320/fi-bxt-dsi/igt@core_hotunp...@unbind-rebind.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20560/fi-bxt-dsi/igt@core_hotunp...@unbind-rebind.html Known issues Here are the changes found in Patchwork_20560 that come from known issues: ### IGT changes ### Issues hit * igt@gem_sync@basic-each: - fi-bdw-5557u: [PASS][3] -> [INCOMPLETE][4] ([i915#2944]) [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10320/fi-bdw-5557u/igt@gem_s...@basic-each.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20560/fi-bdw-5557u/igt@gem_s...@basic-each.html Warnings * igt@runner@aborted: - fi-bdw-5557u: [FAIL][5] ([i915#2722] / [i915#3744]) -> [FAIL][6] ([i915#2722]) [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10320/fi-bdw-5557u/igt@run...@aborted.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20560/fi-bdw-5557u/igt@run...@aborted.html [i915#2722]: https://gitlab.freedesktop.org/drm/intel/issues/2722 [i915#2944]: https://gitlab.freedesktop.org/drm/intel/issues/2944 [i915#3744]: https://gitlab.freedesktop.org/drm/intel/issues/3744 Participating hosts (40 -> 39) -- Missing(1): fi-bsw-cyan Build changes - * Linux: CI_DRM_10320 -> Patchwork_20560 CI-20190529: 20190529 CI_DRM_10320: 7d61ab4a59bcbb206324b6a430748b4c15dd8adb @ git://anongit.freedesktop.org/gfx-ci/linux IGT_6132: 61fb9cdf2a9132e3618c8b08b9d20fec0c347831 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git Patchwork_20560: 2ecb4107afb2f9dfa4c87599cdb00c099b3de1e3 @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == 2ecb4107afb2 drm/i915/dmc: Modify intel_get_stepping_info() de45975f5a75 drm/i915/step: Add intel_step_name() helper 5b936277263e drm/i915/bxt: Use revid->stepping tables 2bc809b73f67 drm/i915/cnl: Drop all workarounds da4af914b48b drm/i915/dg1: Use revid->stepping tables 95363c8b9dc9 drm/i915/rkl: Use revid->stepping tables 2f61797e047b drm/i915/jsl_ehl: Use revid->stepping tables daee55a59631 drm/i915/icl: Use revid->stepping tables 8ad43b5ade6f drm/i915/skl: Use revid->stepping tables c109c3b789aa drm/i915: Make pre-production detection use direct revid comparison == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20560/index.html ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Get stepping info from RUNTIME_INFO->step
== Series Details == Series: Get stepping info from RUNTIME_INFO->step URL : https://patchwork.freedesktop.org/series/92346/ State : warning == Summary == $ dim sparse --fast origin/drm-tip Sparse version: v0.6.2 Fast mode used, each commit won't be checked separately. - +drivers/gpu/drm/i915/display/intel_display.c:1896:21:expected struct i915_vma *[assigned] vma +drivers/gpu/drm/i915/display/intel_display.c:1896:21:got void [noderef] __iomem *[assigned] iomem +drivers/gpu/drm/i915/display/intel_display.c:1896:21: warning: incorrect type in assignment (different address spaces) +drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34:expected struct i915_address_space *vm +drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34:got struct i915_address_space [noderef] __rcu *vm +drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34: warning: incorrect type in argument 1 (different address spaces) +drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct i915_address_space [noderef] __rcu *vm +drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct i915_address_space * +drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect type in assignment (different address spaces) +drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct i915_address_space *vm +drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct i915_address_space [noderef] __rcu *vm +drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect type in argument 1 (different address spaces) +drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_reset.c:1396:5: warning: context imbalance in 'intel_gt_reset_trylock' - different lock contexts for basic block +drivers/gpu/drm/i915/gt/intel_ring_submission.c:1210:24: warning: Using plain integer as NULL pointer +drivers/gpu/drm/i915/i915_perf.c:1434:15: warning: memset with byte count of 16777216 +drivers/gpu/drm/i915/i915_perf.c:1488:15: warning: memset with byte count of 16777216 +./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative (-262080) +./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative (-262080) +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read16' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read32' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read64' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read8' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write16' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write32' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write8' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read16' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read32' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read64' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read8' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write16' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write32' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write8' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning:
[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Get stepping info from RUNTIME_INFO->step
== Series Details == Series: Get stepping info from RUNTIME_INFO->step URL : https://patchwork.freedesktop.org/series/92346/ State : warning == Summary == $ dim checkpatch origin/drm-tip c109c3b789aa drm/i915: Make pre-production detection use direct revid comparison 8ad43b5ade6f drm/i915/skl: Use revid->stepping tables -:54: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects? #54: FILE: drivers/gpu/drm/i915/i915_drv.h:1512: +#define IS_SKL_GT_STEP(p, since, until) (IS_SKYLAKE(p) && IS_GT_STEP(p, since, until)) total: 0 errors, 0 warnings, 1 checks, 85 lines checked daee55a59631 drm/i915/icl: Use revid->stepping tables -:93: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects? #93: FILE: drivers/gpu/drm/i915/i915_drv.h:1519: +#define IS_ICL_GT_STEP(p, since, until) \ + (IS_ICELAKE(p) && IS_GT_STEP(p, since, until)) total: 0 errors, 0 warnings, 1 checks, 94 lines checked 2f61797e047b drm/i915/jsl_ehl: Use revid->stepping tables -:51: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects? #51: FILE: drivers/gpu/drm/i915/i915_drv.h:1522: +#define IS_JSL_EHL_GT_STEP(p, since, until) \ + (IS_JSL_EHL(p) && IS_GT_STEP(p, since, until)) -:53: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects? #53: FILE: drivers/gpu/drm/i915/i915_drv.h:1524: +#define IS_JSL_EHL_DISPLAY_STEP(p, since, until) \ + (IS_JSL_EHL(p) && IS_DISPLAY_STEP(p, since, until)) total: 0 errors, 0 warnings, 2 checks, 51 lines checked 95363c8b9dc9 drm/i915/rkl: Use revid->stepping tables -:48: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects? #48: FILE: drivers/gpu/drm/i915/i915_drv.h:1539: +#define IS_RKL_DISPLAY_STEP(p, since, until) \ + (IS_ROCKETLAKE(p) && IS_DISPLAY_STEP(p, since, until)) total: 0 errors, 0 warnings, 1 checks, 51 lines checked da4af914b48b drm/i915/dg1: Use revid->stepping tables -:124: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects? #124: FILE: drivers/gpu/drm/i915/i915_drv.h:1533: +#define IS_DG1_GT_STEP(p, since, until) \ + (IS_DG1(p) && IS_GT_STEP(p, since, until)) -:126: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects? #126: FILE: drivers/gpu/drm/i915/i915_drv.h:1535: +#define IS_DG1_DISPLAY_STEP(p, since, until) \ + (IS_DG1(p) && IS_DISPLAY_STEP(p, since, until)) total: 0 errors, 0 warnings, 2 checks, 118 lines checked 2bc809b73f67 drm/i915/cnl: Drop all workarounds 5b936277263e drm/i915/bxt: Use revid->stepping tables de45975f5a75 drm/i915/step: Add intel_step_name() helper -:22: ERROR:OPEN_BRACE: open brace '{' following function definitions go on the next line #22: FILE: drivers/gpu/drm/i915/intel_step.c:186: +const char *intel_step_name(enum intel_step step) { -:26: WARNING:UNNECESSARY_BREAK: break is not useful after a return #26: FILE: drivers/gpu/drm/i915/intel_step.c:190: + return "A0"; + break; -:29: WARNING:UNNECESSARY_BREAK: break is not useful after a return #29: FILE: drivers/gpu/drm/i915/intel_step.c:193: + return "A1"; + break; -:32: WARNING:UNNECESSARY_BREAK: break is not useful after a return #32: FILE: drivers/gpu/drm/i915/intel_step.c:196: + return "A2"; + break; -:35: WARNING:UNNECESSARY_BREAK: break is not useful after a return #35: FILE: drivers/gpu/drm/i915/intel_step.c:199: + return "B0"; + break; -:38: WARNING:UNNECESSARY_BREAK: break is not useful after a return #38: FILE: drivers/gpu/drm/i915/intel_step.c:202: + return "B1"; + break; -:41: WARNING:UNNECESSARY_BREAK: break is not useful after a return #41: FILE: drivers/gpu/drm/i915/intel_step.c:205: + return "B2"; + break; -:44: WARNING:UNNECESSARY_BREAK: break is not useful after a return #44: FILE: drivers/gpu/drm/i915/intel_step.c:208: + return "C0"; + break; -:47: WARNING:UNNECESSARY_BREAK: break is not useful after a return #47: FILE: drivers/gpu/drm/i915/intel_step.c:211: + return "C1"; + break; -:50: WARNING:UNNECESSARY_BREAK: break is not useful after a return #50: FILE: drivers/gpu/drm/i915/intel_step.c:214: + return "D0"; + break; -:53: WARNING:UNNECESSARY_BREAK: break is not useful after a return #53: FILE: drivers/gpu/drm/i915/intel_step.c:217: + return "D1"; + break; -:56: WARNING:UNNECESSARY_BREAK: break is not useful after a return #56: FILE: drivers/gpu/drm/i915/intel_step.c:220: + return "E0"; + break; -:59: WARNING:UNNECESSARY_BREAK: break is not useful after a return #59: FILE: drivers/gpu/drm/i915/intel_step.c:223: + return "F0"; + break; -:62: WARNING:UNNECESSARY_BREAK: break is not useful after a return #62: FILE: drivers/gpu/drm/i915/intel_step.c:226: + return "G0"; +
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/sched dependency tracking and dma-resv fixes (rev2)
== Series Details == Series: drm/sched dependency tracking and dma-resv fixes (rev2) URL : https://patchwork.freedesktop.org/series/92333/ State : success == Summary == CI Bug Log - changes from CI_DRM_10320 -> Patchwork_20559 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20559/index.html Changes --- No changes found Participating hosts (40 -> 39) -- Missing(1): fi-bsw-cyan Build changes - * Linux: CI_DRM_10320 -> Patchwork_20559 CI-20190529: 20190529 CI_DRM_10320: 7d61ab4a59bcbb206324b6a430748b4c15dd8adb @ git://anongit.freedesktop.org/gfx-ci/linux IGT_6132: 61fb9cdf2a9132e3618c8b08b9d20fec0c347831 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git Patchwork_20559: ca32cb2b56ef01204baff47cde1058e14ddf23ca @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == ca32cb2b56ef dma-resv: Give the docs a do-over 10b231f895fe drm/i915: Don't break exclusive fence ordering 56fc78cd5073 drm/i915: delete exclude argument from i915_sw_fence_await_reservation d022184c169b drm/etnaviv: Don't break exclusive fence ordering 9e395399b983 drm/msm: always wait for the exclusive fence b149f2ca5c3f drm/msm: Don't break exclusive fence ordering 6219e79695d6 drm/sched: Check locking in drm_sched_job_await_implicit 588331541d89 drm/sched: Don't store self-dependencies 28feb868173e drm/gem: Delete gem array fencing helpers 5e34a7c752a3 drm/etnaviv: Use scheduler dependency handling d4fa82c9c97f drm/v3d: Use scheduler dependency handling 34e837ac1d7b drm/v3d: Move drm_sched_job_init to v3d_job_init a51bbbac956b drm/lima: use scheduler dependency tracking dca5c513ac3d drm/panfrost: use scheduler dependency tracking f46108672694 drm/sched: improve docs around drm_sched_entity 6f8a6dac8300 drm/sched: drop entity parameter from drm_sched_push_job 2554ae9762d1 drm/sched: Add dependency tracking 998b636c6f3c drm/sched: Barriers are needed for entity->last_scheduled 3afe40cffa3e drm/sched: Split drm_sched_job_init aa63f4780928 drm/sched: entity->rq selection cannot fail == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20559/index.html ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/sched dependency tracking and dma-resv fixes (rev2)
== Series Details == Series: drm/sched dependency tracking and dma-resv fixes (rev2) URL : https://patchwork.freedesktop.org/series/92333/ State : warning == Summary == $ dim checkpatch origin/drm-tip aa63f4780928 drm/sched: entity->rq selection cannot fail -:52: WARNING:AVOID_BUG: Avoid crashing the kernel - try using WARN_ON & recovery code rather than BUG() or BUG_ON() #52: FILE: drivers/gpu/drm/scheduler/sched_main.c:584: + BUG_ON(!entity->rq); -:55: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: Daniel Vetter ' total: 0 errors, 2 warnings, 0 checks, 17 lines checked 3afe40cffa3e drm/sched: Split drm_sched_job_init -:205: WARNING:UNSPECIFIED_INT: Prefer 'unsigned int' to bare use of 'unsigned' #205: FILE: drivers/gpu/drm/scheduler/sched_fence.c:173: + unsigned seq; -:284: WARNING:AVOID_BUG: Avoid crashing the kernel - try using WARN_ON & recovery code rather than BUG() or BUG_ON() #284: FILE: drivers/gpu/drm/scheduler/sched_main.c:614: + BUG_ON(!entity); -:365: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #365: FILE: include/drm/gpu_scheduler.h:391: +struct drm_sched_fence *drm_sched_fence_alloc( -:373: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: Daniel Vetter ' total: 0 errors, 3 warnings, 1 checks, 235 lines checked 998b636c6f3c drm/sched: Barriers are needed for entity->last_scheduled -:86: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: Daniel Vetter ' total: 0 errors, 1 warnings, 0 checks, 43 lines checked 2554ae9762d1 drm/sched: Add dependency tracking -:193: CHECK:LINE_SPACING: Please don't use multiple blank lines #193: FILE: drivers/gpu/drm/scheduler/sched_main.c:722: + + -:268: WARNING:TYPO_SPELLING: 'ommitted' may be misspelled - perhaps 'omitted'? #268: FILE: include/drm/gpu_scheduler.h:243: +* drm_sched_job_await_implicit() this can be ommitted and left as NULL. -:282: CHECK:LINE_SPACING: Please don't use multiple blank lines #282: FILE: include/drm/gpu_scheduler.h:376: + + -:285: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: Daniel Vetter ' total: 0 errors, 2 warnings, 2 checks, 227 lines checked 6f8a6dac8300 drm/sched: drop entity parameter from drm_sched_push_job -:203: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: Daniel Vetter ' total: 0 errors, 1 warnings, 0 checks, 102 lines checked f46108672694 drm/sched: improve docs around drm_sched_entity -:14: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("")' - ie: 'commit 620e762f9a98 ("drm/scheduler: move entity handling into separate file")' #14: move here: 620e762f9a98 ("drm/scheduler: move entity handling into -:396: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: Daniel Vetter ' total: 1 errors, 1 warnings, 0 checks, 346 lines checked dca5c513ac3d drm/panfrost: use scheduler dependency tracking -:209: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: Daniel Vetter ' total: 0 errors, 1 warnings, 0 checks, 157 lines checked a51bbbac956b drm/lima: use scheduler dependency tracking -:114: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: Daniel Vetter ' total: 0 errors, 1 warnings, 0 checks, 73 lines checked 34e837ac1d7b drm/v3d: Move drm_sched_job_init to v3d_job_init -:356: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: Daniel Vetter ' total: 0 errors, 1 warnings, 0 checks, 302 lines checked d4fa82c9c97f drm/v3d: Use scheduler dependency handling -:202: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: Daniel Vetter ' total: 0 errors, 1 warnings, 0 checks, 161 lines checked 5e34a7c752a3 drm/etnaviv: Use scheduler dependency handling -:13: WARNING:REPEATED_WORD: Possible repeated word: 'to' #13: I wanted to to in the previous round (and did, for all other drivers). -:119: WARNING:LINE_SPACING: Missing a blank line after declarations #119: FILE: drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:551: + struct dma_fence *in_fence = sync_file_get_fence(args->fence_fd); + if (!in_fence) { -:293: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: Daniel Vetter ' total: 0 errors, 3 warnings, 0 checks, 241 lines checked 28feb868173e drm/gem: Delete gem array fenc
Re: [Intel-gfx] [PATCH 52/53] drm/i915/dg2: Update to bigjoiner path
On Thu, Jul 01, 2021 at 01:24:26PM -0700, Matt Roper wrote: > From: Animesh Manna > > In verify_mpllb_state() encoder is retrieved from best_encoder > of connector_state. As there will be only one connector_state > for bigjoiner and checking encoder may not be needed for > bigjoiner-slave. This code path related to mpll is done on dg2 > and need this fix to avoid null pointer dereference issue. > > Cc: Manasi Navare > Signed-off-by: Animesh Manna > Signed-off-by: Matt Roper Reviewed-by: Manasi Navare Manasi > --- > drivers/gpu/drm/i915/display/intel_display.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c > b/drivers/gpu/drm/i915/display/intel_display.c > index 9655f1b1b41b..3f4e811145b6 100644 > --- a/drivers/gpu/drm/i915/display/intel_display.c > +++ b/drivers/gpu/drm/i915/display/intel_display.c > @@ -9153,6 +9153,9 @@ verify_mpllb_state(struct intel_atomic_state *state, > if (!new_crtc_state->hw.active) > return; > > + if (new_crtc_state->bigjoiner_slave) > + return; > + > encoder = intel_get_crtc_new_encoder(state, new_crtc_state); > intel_mpllb_readout_hw_state(encoder, &mpllb_hw_state); > > -- > 2.25.4 > ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/7] drm/i915: Settle on "adl-x" in WA comments
== Series Details == Series: series starting with [1/7] drm/i915: Settle on "adl-x" in WA comments URL : https://patchwork.freedesktop.org/series/92342/ State : success == Summary == CI Bug Log - changes from CI_DRM_10320 -> Patchwork_20558 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20558/index.html Known issues Here are the changes found in Patchwork_20558 that come from known issues: ### IGT changes ### {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE). [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888 Participating hosts (40 -> 39) -- Missing(1): fi-bsw-cyan Build changes - * Linux: CI_DRM_10320 -> Patchwork_20558 CI-20190529: 20190529 CI_DRM_10320: 7d61ab4a59bcbb206324b6a430748b4c15dd8adb @ git://anongit.freedesktop.org/gfx-ci/linux IGT_6132: 61fb9cdf2a9132e3618c8b08b9d20fec0c347831 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git Patchwork_20558: 0be142e780934cd09e46b4699fe3f7cd9b7adde0 @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == 0be142e78093 drm/i915/display/xelpd: Exetend Wa_14011508470 8d9f68df0f34 drm/i915/display/adl_p: Correctly program MBUS DBOX A credits ccd65a2b8785 drm/i915: Limit Wa_22010178259 to affected platforms 4818897f7082 drm/i915: Limit maximum number of memory channels 1a869323ea02 drm/i915/adl_s: Extend Wa_1406941453 6cea025795d1 drm/i915: Implement Wa_1508744258 78692bfbbdb4 drm/i915: Settle on "adl-x" in WA comments == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20558/index.html ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [1/7] drm/i915: Settle on "adl-x" in WA comments
== Series Details == Series: series starting with [1/7] drm/i915: Settle on "adl-x" in WA comments URL : https://patchwork.freedesktop.org/series/92342/ State : warning == Summary == $ dim sparse --fast origin/drm-tip Sparse version: v0.6.2 Fast mode used, each commit won't be checked separately. - +drivers/gpu/drm/i915/display/intel_display.c:1896:21:expected struct i915_vma *[assigned] vma +drivers/gpu/drm/i915/display/intel_display.c:1896:21:got void [noderef] __iomem *[assigned] iomem +drivers/gpu/drm/i915/display/intel_display.c:1896:21: warning: incorrect type in assignment (different address spaces) ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/dg1: Compute MEM Bandwidth using MCHBAR (rev2)
== Series Details == Series: drm/i915/dg1: Compute MEM Bandwidth using MCHBAR (rev2) URL : https://patchwork.freedesktop.org/series/92094/ State : success == Summary == CI Bug Log - changes from CI_DRM_10320 -> Patchwork_20557 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20557/index.html Known issues Here are the changes found in Patchwork_20557 that come from known issues: ### IGT changes ### Issues hit * igt@gem_exec_suspend@basic-s0: - fi-cfl-8109u: [PASS][1] -> [INCOMPLETE][2] ([i915#155]) [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10320/fi-cfl-8109u/igt@gem_exec_susp...@basic-s0.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20557/fi-cfl-8109u/igt@gem_exec_susp...@basic-s0.html {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE). [i915#155]: https://gitlab.freedesktop.org/drm/intel/issues/155 [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888 Participating hosts (40 -> 39) -- Missing(1): fi-bsw-cyan Build changes - * Linux: CI_DRM_10320 -> Patchwork_20557 CI-20190529: 20190529 CI_DRM_10320: 7d61ab4a59bcbb206324b6a430748b4c15dd8adb @ git://anongit.freedesktop.org/gfx-ci/linux IGT_6132: 61fb9cdf2a9132e3618c8b08b9d20fec0c347831 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git Patchwork_20557: 11350b4a8bc8324518472387c20aebe74ec93069 @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == 11350b4a8bc8 drm/i915/dg1: Compute MEM Bandwidth using MCHBAR == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20557/index.html ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 09/10] drm/i915/step: Add intel_step_name() helper
Add a helper to convert the step info to string. This is specifically useful when we want to load a specific firmware for a given stepping/substepping combination. Suggested-by: Jani Nikula Signed-off-by: Anusha Srivatsa --- drivers/gpu/drm/i915/intel_step.c | 58 +++ drivers/gpu/drm/i915/intel_step.h | 1 + 2 files changed, 59 insertions(+) diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c index 99c0d3df001b..9af7f30b777e 100644 --- a/drivers/gpu/drm/i915/intel_step.c +++ b/drivers/gpu/drm/i915/intel_step.c @@ -182,3 +182,61 @@ void intel_step_init(struct drm_i915_private *i915) RUNTIME_INFO(i915)->step = step; } + +const char *intel_step_name(enum intel_step step) { + switch (step) { + case STEP_A0: + return "A0"; + break; + case STEP_A1: + return "A1"; + break; + case STEP_A2: + return "A2"; + break; + case STEP_B0: + return "B0"; + break; + case STEP_B1: + return "B1"; + break; + case STEP_B2: + return "B2"; + break; + case STEP_C0: + return "C0"; + break; + case STEP_C1: + return "C1"; + break; + case STEP_D0: + return "D0"; + break; + case STEP_D1: + return "D1"; + break; + case STEP_E0: + return "E0"; + break; + case STEP_F0: + return "F0"; + break; + case STEP_G0: + return "G0"; + break; + case STEP_H0: + return "H0"; + break; + case STEP_I0: + return "I0"; + break; + case STEP_I1: + return "I1"; + break; + case STEP_J0: + return "J0"; + break; + default: + return "**"; + } +} diff --git a/drivers/gpu/drm/i915/intel_step.h b/drivers/gpu/drm/i915/intel_step.h index 3e8b2babd9da..2fbe51483472 100644 --- a/drivers/gpu/drm/i915/intel_step.h +++ b/drivers/gpu/drm/i915/intel_step.h @@ -43,5 +43,6 @@ enum intel_step { }; void intel_step_init(struct drm_i915_private *i915); +const char *intel_step_name(enum intel_step step); #endif /* __INTEL_STEP_H__ */ -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 07/10] drm/i915/cnl: Drop all workarounds
From: Matt Roper All of the Cannon Lake hardware that came out had graphics fused off, and our userspace drivers have already dropped their support for the platform; CNL-specific code in i915 that isn't inherited by subsequent platforms is effectively dead code. Let's remove all of the CNL-specific workarounds as a quick and easy first step. References: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6899 Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 55 - 1 file changed, 55 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 62321e9149db..9b257a394305 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -514,35 +514,6 @@ static void cfl_ctx_workarounds_init(struct intel_engine_cs *engine, GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE); } -static void cnl_ctx_workarounds_init(struct intel_engine_cs *engine, -struct i915_wa_list *wal) -{ - /* WaForceContextSaveRestoreNonCoherent:cnl */ - wa_masked_en(wal, CNL_HDC_CHICKEN0, -HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT); - - /* WaDisableReplayBufferBankArbitrationOptimization:cnl */ - wa_masked_en(wal, COMMON_SLICE_CHICKEN2, -GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION); - - /* WaPushConstantDereferenceHoldDisable:cnl */ - wa_masked_en(wal, GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE); - - /* FtrEnableFastAnisoL1BankingFix:cnl */ - wa_masked_en(wal, HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX); - - /* WaDisable3DMidCmdPreemption:cnl */ - wa_masked_dis(wal, GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL); - - /* WaDisableGPGPUMidCmdPreemption:cnl */ - wa_masked_field_set(wal, GEN8_CS_CHICKEN1, - GEN9_PREEMPT_GPGPU_LEVEL_MASK, - GEN9_PREEMPT_GPGPU_COMMAND_LEVEL); - - /* WaDisableEarlyEOT:cnl */ - wa_masked_en(wal, GEN8_ROW_CHICKEN, DISABLE_EARLY_EOT); -} - static void icl_ctx_workarounds_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) { @@ -704,8 +675,6 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine, gen12_ctx_workarounds_init(engine, wal); else if (GRAPHICS_VER(i915) == 11) icl_ctx_workarounds_init(engine, wal); - else if (IS_CANNONLAKE(i915)) - cnl_ctx_workarounds_init(engine, wal); else if (IS_COFFEELAKE(i915) || IS_COMETLAKE(i915)) cfl_ctx_workarounds_init(engine, wal); else if (IS_GEMINILAKE(i915)) @@ -982,15 +951,6 @@ icl_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal) wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr); } -static void -cnl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) -{ - /* WaInPlaceDecompressionHang:cnl */ - wa_write_or(wal, - GEN9_GAMT_ECO_REG_RW_IA, - GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS); -} - static void icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) { @@ -1140,8 +1100,6 @@ gt_init_workarounds(struct drm_i915_private *i915, struct i915_wa_list *wal) gen12_gt_workarounds_init(i915, wal); else if (GRAPHICS_VER(i915) == 11) icl_gt_workarounds_init(i915, wal); - else if (IS_CANNONLAKE(i915)) - cnl_gt_workarounds_init(i915, wal); else if (IS_COFFEELAKE(i915) || IS_COMETLAKE(i915)) cfl_gt_workarounds_init(i915, wal); else if (IS_GEMINILAKE(i915)) @@ -1418,17 +1376,6 @@ static void cml_whitelist_build(struct intel_engine_cs *engine) cfl_whitelist_build(engine); } -static void cnl_whitelist_build(struct intel_engine_cs *engine) -{ - struct i915_wa_list *w = &engine->whitelist; - - if (engine->class != RENDER_CLASS) - return; - - /* WaEnablePreemptionGranularityControlByUMD:cnl */ - whitelist_reg(w, GEN8_CS_CHICKEN1); -} - static void icl_whitelist_build(struct intel_engine_cs *engine) { struct i915_wa_list *w = &engine->whitelist; @@ -1542,8 +1489,6 @@ void intel_engine_init_whitelist(struct intel_engine_cs *engine) tgl_whitelist_build(engine); else if (GRAPHICS_VER(i915) == 11) icl_whitelist_build(engine); - else if (IS_CANNONLAKE(i915)) - cnl_whitelist_build(engine); else if (IS_COMETLAKE(i915)) cml_whitelist_build(engine); else if (IS_COFFEELAKE(i915)) -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 05/10] drm/i915/rkl: Use revid->stepping tables
From: Matt Roper Switch RKL to use a revid->stepping table as we're trying to do on all platforms going forward. Bspec: 44501 Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/display/intel_psr.c | 4 ++-- drivers/gpu/drm/i915/i915_drv.h | 8 ++-- drivers/gpu/drm/i915/intel_step.c| 9 + 3 files changed, 13 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_psr.c b/drivers/gpu/drm/i915/display/intel_psr.c index 9643624fe160..74b2aa3c2946 100644 --- a/drivers/gpu/drm/i915/display/intel_psr.c +++ b/drivers/gpu/drm/i915/display/intel_psr.c @@ -594,7 +594,7 @@ static void hsw_activate_psr2(struct intel_dp *intel_dp) if (intel_dp->psr.psr2_sel_fetch_enabled) { /* WA 1408330847 */ if (IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) || - IS_RKL_REVID(dev_priv, RKL_REVID_A0, RKL_REVID_A0)) + IS_RKL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0)) intel_de_rmw(dev_priv, CHICKEN_PAR1_1, DIS_RAM_BYPASS_PSR2_MAN_TRACK, DIS_RAM_BYPASS_PSR2_MAN_TRACK); @@ -1342,7 +1342,7 @@ static void intel_psr_disable_locked(struct intel_dp *intel_dp) /* WA 1408330847 */ if (intel_dp->psr.psr2_sel_fetch_enabled && (IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) || -IS_RKL_REVID(dev_priv, RKL_REVID_A0, RKL_REVID_A0))) +IS_RKL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0))) intel_de_rmw(dev_priv, CHICKEN_PAR1_1, DIS_RAM_BYPASS_PSR2_MAN_TRACK, 0); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 78db92bbb1c6..592e7177202e 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1536,12 +1536,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, (IS_TIGERLAKE(__i915) && !(IS_TGL_U(__i915) || IS_TGL_Y(__i915)) && \ IS_GT_STEP(__i915, since, until)) -#define RKL_REVID_A0 0x0 -#define RKL_REVID_B0 0x1 -#define RKL_REVID_C0 0x4 - -#define IS_RKL_REVID(p, since, until) \ - (IS_ROCKETLAKE(p) && IS_REVID(p, since, until)) +#define IS_RKL_DISPLAY_STEP(p, since, until) \ + (IS_ROCKETLAKE(p) && IS_DISPLAY_STEP(p, since, until)) #define DG1_REVID_A0 0x0 #define DG1_REVID_B0 0x1 diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c index 61666a3dd672..1593ab25f41a 100644 --- a/drivers/gpu/drm/i915/intel_step.c +++ b/drivers/gpu/drm/i915/intel_step.c @@ -69,6 +69,12 @@ static const struct intel_step_info tgl_revid_step_tbl[] = { [1] = { .gt_step = STEP_B0, .display_step = STEP_D0 }, }; +static const struct intel_step_info rkl_revid_step_tbl[] = { + [0] = { .gt_step = STEP_A0, .display_step = STEP_A0 }, + [1] = { .gt_step = STEP_B0, .display_step = STEP_B0 }, + [4] = { .gt_step = STEP_C0, .display_step = STEP_C0 }, +}; + static const struct intel_step_info adls_revid_step_tbl[] = { [0x0] = { .gt_step = STEP_A0, .display_step = STEP_A0 }, [0x1] = { .gt_step = STEP_A0, .display_step = STEP_A2 }, @@ -97,6 +103,9 @@ void intel_step_init(struct drm_i915_private *i915) } else if (IS_ALDERLAKE_S(i915)) { revids = adls_revid_step_tbl; size = ARRAY_SIZE(adls_revid_step_tbl); + } else if (IS_ROCKETLAKE(i915)) { + revids = rkl_revid_step_tbl; + size = ARRAY_SIZE(rkl_revid_step_tbl); } else if (IS_TGL_U(i915) || IS_TGL_Y(i915)) { revids = tgl_uy_revid_step_tbl; size = ARRAY_SIZE(tgl_uy_revid_step_tbl); -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 10/10] drm/i915/dmc: Modify intel_get_stepping_info()
With all platforms having the tepping info in intel_step.c, it makes no sense to maintain a separate lookup table in intel_dmc.c Let modify intel_Get_stepping_info() to grab stepping info from the central location towards which everything is moving. Cc: Jani Nikula Signed-off-by: Anusha Srivatsa --- drivers/gpu/drm/i915/display/intel_dmc.c | 51 +--- 1 file changed, 9 insertions(+), 42 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_dmc.c b/drivers/gpu/drm/i915/display/intel_dmc.c index f8789d4543bf..895bee8f9782 100644 --- a/drivers/gpu/drm/i915/display/intel_dmc.c +++ b/drivers/gpu/drm/i915/display/intel_dmc.c @@ -247,50 +247,16 @@ bool intel_dmc_has_payload(struct drm_i915_private *i915) return i915->dmc.dmc_info[DMC_FW_MAIN].payload; } -static const struct stepping_info skl_stepping_info[] = { - {'A', '0'}, {'B', '0'}, {'C', '0'}, - {'D', '0'}, {'E', '0'}, {'F', '0'}, - {'G', '0'}, {'H', '0'}, {'I', '0'}, - {'J', '0'}, {'K', '0'} -}; - -static const struct stepping_info bxt_stepping_info[] = { - {'A', '0'}, {'A', '1'}, {'A', '2'}, - {'B', '0'}, {'B', '1'}, {'B', '2'} -}; - -static const struct stepping_info icl_stepping_info[] = { - {'A', '0'}, {'A', '1'}, {'A', '2'}, - {'B', '0'}, {'B', '2'}, - {'C', '0'} -}; - -static const struct stepping_info no_stepping_info = { '*', '*' }; - static const struct stepping_info * -intel_get_stepping_info(struct drm_i915_private *dev_priv) +intel_get_stepping_info(struct drm_i915_private *dev_priv, + struct stepping_info *si) { - const struct stepping_info *si; - unsigned int size; - - if (IS_ICELAKE(dev_priv)) { - size = ARRAY_SIZE(icl_stepping_info); - si = icl_stepping_info; - } else if (IS_SKYLAKE(dev_priv)) { - size = ARRAY_SIZE(skl_stepping_info); - si = skl_stepping_info; - } else if (IS_BROXTON(dev_priv)) { - size = ARRAY_SIZE(bxt_stepping_info); - si = bxt_stepping_info; - } else { - size = 0; - si = NULL; - } - - if (INTEL_REVID(dev_priv) < size) - return si + INTEL_REVID(dev_priv); + struct intel_step_info step = RUNTIME_INFO(dev_priv)->step; + const char *step_name = intel_step_name(step.display_step); - return &no_stepping_info; + si->stepping = step_name[0]; +si->substepping = step_name[1]; + return si; } static void gen9_set_dc_state_debugmask(struct drm_i915_private *dev_priv) @@ -616,7 +582,8 @@ static void parse_dmc_fw(struct drm_i915_private *dev_priv, struct intel_package_header *package_header; struct intel_dmc_header_base *dmc_header; struct intel_dmc *dmc = &dev_priv->dmc; - const struct stepping_info *si = intel_get_stepping_info(dev_priv); + struct stepping_info display_info = { '*', '*'}; + const struct stepping_info *si = intel_get_stepping_info(dev_priv, &display_info); u32 readcount = 0; u32 r, offset; int id; -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 00/10] Get stepping info from RUNTIME_INFO->step
The changes are added on top of Matt's series: https://patchwork.freedesktop.org/series/92299/ This series modifies the way we get stepping indo for DMC to load the right firmware for the right stepping/substepping combinations. Since we have a lookup table for BXT in intel_dmc.c and BXT stepping changes were missing from Matt's series, I have added a patch for it. Anusha Srivatsa (3): drm/i915/bxt: Use revid->stepping tables drm/i915/step: Add intel_step_name() helper drm/i915/dmc: Modify intel_get_stepping_info() Matt Roper (7): drm/i915: Make pre-production detection use direct revid comparison drm/i915/skl: Use revid->stepping tables drm/i915/icl: Use revid->stepping tables drm/i915/jsl_ehl: Use revid->stepping tables drm/i915/rkl: Use revid->stepping tables drm/i915/dg1: Use revid->stepping tables drm/i915/cnl: Drop all workarounds .../drm/i915/display/intel_display_power.c| 2 +- drivers/gpu/drm/i915/display/intel_dmc.c | 51 ++- drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 2 +- drivers/gpu/drm/i915/display/intel_psr.c | 4 +- drivers/gpu/drm/i915/gt/intel_region_lmem.c | 2 +- drivers/gpu/drm/i915/gt/intel_workarounds.c | 81 ++ drivers/gpu/drm/i915/i915_drv.c | 8 +- drivers/gpu/drm/i915/i915_drv.h | 80 ++ drivers/gpu/drm/i915/intel_pm.c | 2 +- drivers/gpu/drm/i915/intel_step.c | 142 +- drivers/gpu/drm/i915/intel_step.h | 8 + 11 files changed, 187 insertions(+), 195 deletions(-) -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 04/10] drm/i915/jsl_ehl: Use revid->stepping tables
From: Matt Roper Switch JSL/EHL to use a revid->stepping table as we're trying to do on all platforms going forward. Bspec: 29153 Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 2 +- drivers/gpu/drm/i915/gt/intel_workarounds.c | 2 +- drivers/gpu/drm/i915/i915_drv.h | 9 - drivers/gpu/drm/i915/intel_step.c | 8 4 files changed, 14 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c index 882bfd499e55..dfc31b682848 100644 --- a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c +++ b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c @@ -2674,7 +2674,7 @@ static bool ehl_combo_pll_div_frac_wa_needed(struct drm_i915_private *i915) { return ((IS_PLATFORM(i915, INTEL_ELKHARTLAKE) && -IS_JSL_EHL_REVID(i915, EHL_REVID_B0, REVID_FOREVER)) || +IS_JSL_EHL_DISPLAY_STEP(i915, STEP_B0, STEP_FOREVER)) || IS_TIGERLAKE(i915) || IS_ALDERLAKE_P(i915)) && i915->dpll.ref_clks.nssc == 38400; } diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index e2d8acb8c1c9..4c0c15bbdac2 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -1043,7 +1043,7 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) /* Wa_1607087056:icl,ehl,jsl */ if (IS_ICELAKE(i915) || - IS_JSL_EHL_REVID(i915, EHL_REVID_A0, EHL_REVID_A0)) + IS_JSL_EHL_GT_STEP(i915, STEP_A0, STEP_A0)) wa_write_or(wal, SLICE_UNIT_LEVEL_CLKGATE, L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index e26ff8624945..78db92bbb1c6 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1519,11 +1519,10 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define IS_ICL_GT_STEP(p, since, until) \ (IS_ICELAKE(p) && IS_GT_STEP(p, since, until)) -#define EHL_REVID_A00x0 -#define EHL_REVID_B00x1 - -#define IS_JSL_EHL_REVID(p, since, until) \ - (IS_JSL_EHL(p) && IS_REVID(p, since, until)) +#define IS_JSL_EHL_GT_STEP(p, since, until) \ + (IS_JSL_EHL(p) && IS_GT_STEP(p, since, until)) +#define IS_JSL_EHL_DISPLAY_STEP(p, since, until) \ + (IS_JSL_EHL(p) && IS_DISPLAY_STEP(p, since, until)) #define IS_TGL_DISPLAY_STEP(__i915, since, until) \ (IS_TIGERLAKE(__i915) && \ diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c index 4d8248cf67d3..61666a3dd672 100644 --- a/drivers/gpu/drm/i915/intel_step.c +++ b/drivers/gpu/drm/i915/intel_step.c @@ -51,6 +51,11 @@ static const struct intel_step_info icl_revid_step_tbl[] = { [7] = { .gt_step = STEP_D0, .display_step = STEP_D0 }, }; +static const struct intel_step_info jsl_ehl_revid_step_tbl[] = { + [0] = { .gt_step = STEP_A0, .display_step = STEP_A0 }, + [1] = { .gt_step = STEP_B0, .display_step = STEP_B0 }, +}; + static const struct intel_step_info tgl_uy_revid_step_tbl[] = { [0] = { .gt_step = STEP_A0, .display_step = STEP_A0 }, [1] = { .gt_step = STEP_B0, .display_step = STEP_C0 }, @@ -98,6 +103,9 @@ void intel_step_init(struct drm_i915_private *i915) } else if (IS_TIGERLAKE(i915)) { revids = tgl_revid_step_tbl; size = ARRAY_SIZE(tgl_revid_step_tbl); + } else if (IS_JSL_EHL(i915)) { + revids = jsl_ehl_revid_step_tbl; + size = ARRAY_SIZE(jsl_ehl_revid_step_tbl); } else if (IS_ICELAKE(i915)) { revids = icl_revid_step_tbl; size = ARRAY_SIZE(icl_revid_step_tbl); -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 08/10] drm/i915/bxt: Use revid->stepping tables
Switch BXT to use a revid->stepping table as we're trying to do on all platforms going forward. Signed-off-by: Anusha Srivatsa --- drivers/gpu/drm/i915/intel_step.c | 12 1 file changed, 12 insertions(+) diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c index c4ce02d22828..99c0d3df001b 100644 --- a/drivers/gpu/drm/i915/intel_step.c +++ b/drivers/gpu/drm/i915/intel_step.c @@ -31,6 +31,15 @@ static const struct intel_step_info skl_revid_step_tbl[] = { [0xA] = { .gt_step = STEP_I1, .display_step = STEP_I1 }, }; +static const struct intel_step_info bxt_revids[] = { + [0] = { .gt_step = STEP_A0 }, + [1] = { .gt_step = STEP_A1 }, + [2] = { .gt_step = STEP_A2 }, + [6] = { .gt_step = STEP_B0 }, + [7] = { .gt_step = STEP_B1 }, + [8] = { .gt_step = STEP_B2 }, +}; + static const struct intel_step_info kbl_revid_step_tbl[] = { [0] = { .gt_step = STEP_A0, .display_step = STEP_A0 }, [1] = { .gt_step = STEP_B0, .display_step = STEP_B0 }, @@ -129,6 +138,9 @@ void intel_step_init(struct drm_i915_private *i915) } else if (IS_KABYLAKE(i915)) { revids = kbl_revid_step_tbl; size = ARRAY_SIZE(kbl_revid_step_tbl); + } else if (IS_BROXTON(i915)) { + revids = bxt_revids; + size = ARRAY_SIZE(bxt_revids); } else if (IS_SKYLAKE(i915)) { revids = skl_revid_step_tbl; size = ARRAY_SIZE(skl_revid_step_tbl); -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 06/10] drm/i915/dg1: Use revid->stepping tables
From: Matt Roper Switch DG1 to use a revid->stepping table as we're trying to do on all platforms going forward. This removes the last use of IS_REVID() and REVID_FOREVER, so remove those now-unused macros as well to prevent their accidental use on future platforms. Bspec: 44463 Signed-off-by: Matt Roper --- .../gpu/drm/i915/display/intel_display_power.c | 2 +- drivers/gpu/drm/i915/gt/intel_region_lmem.c| 2 +- drivers/gpu/drm/i915/gt/intel_workarounds.c| 10 +- drivers/gpu/drm/i915/i915_drv.h| 18 -- drivers/gpu/drm/i915/intel_pm.c| 2 +- drivers/gpu/drm/i915/intel_step.c | 8 6 files changed, 20 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c index 285380079aab..975a7e25cea5 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.c +++ b/drivers/gpu/drm/i915/display/intel_display_power.c @@ -5799,7 +5799,7 @@ static void tgl_bw_buddy_init(struct drm_i915_private *dev_priv) int config, i; if (IS_ALDERLAKE_S(dev_priv) || - IS_DG1_REVID(dev_priv, DG1_REVID_A0, DG1_REVID_A0) || + IS_DG1_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) || IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_B0)) /* Wa_1409767108:tgl,dg1,adl-s */ table = wa_1409767108_buddy_page_masks; diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c b/drivers/gpu/drm/i915/gt/intel_region_lmem.c index 1f43aba2e9e2..50d11a84e7a9 100644 --- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c +++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c @@ -157,7 +157,7 @@ intel_gt_setup_fake_lmem(struct intel_gt *gt) static bool get_legacy_lowmem_region(struct intel_uncore *uncore, u64 *start, u32 *size) { - if (!IS_DG1_REVID(uncore->i915, DG1_REVID_A0, DG1_REVID_B0)) + if (!IS_DG1_GT_STEP(uncore->i915, STEP_A0, STEP_B0)) return false; *start = 0; diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 4c0c15bbdac2..62321e9149db 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -,7 +,7 @@ dg1_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) gen12_gt_workarounds_init(i915, wal); /* Wa_1607087056:dg1 */ - if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0)) + if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0)) wa_write_or(wal, SLICE_UNIT_LEVEL_CLKGATE, L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS); @@ -1522,7 +1522,7 @@ static void dg1_whitelist_build(struct intel_engine_cs *engine) tgl_whitelist_build(engine); /* GEN:BUG:1409280441:dg1 */ - if (IS_DG1_REVID(engine->i915, DG1_REVID_A0, DG1_REVID_A0) && + if (IS_DG1_GT_STEP(engine->i915, STEP_A0, STEP_A0) && (engine->class == RENDER_CLASS || engine->class == COPY_ENGINE_CLASS)) whitelist_reg_ext(w, RING_ID(engine->mmio_base), @@ -1592,7 +1592,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) { struct drm_i915_private *i915 = engine->i915; - if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) || + if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) || IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_A0)) { /* * Wa_1607138336:tgl[a0],dg1[a0] @@ -1638,7 +1638,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) } if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) || - IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) || + IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) { /* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */ wa_masked_en(wal, GEN7_ROW_CHICKEN2, @@ -1652,7 +1652,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) } - if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) || + if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) { /* * Wa_1607030317:tgl diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 592e7177202e..496c468229fc 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1317,19 +1317,10 @@ static inline struct drm_i915_private *pdev_to_i915(struct pci_dev *pdev) #define IS_DISPLAY_VER(i915, from, until) \ (DISPLAY_VER(i915) >= (from) && DISPLAY_VER(i915) <= (until)) -#define REVID_FOREVER 0xff #define INTEL_REVID(dev_priv) (to_pci_dev((dev_priv)->drm.dev)->revision) #define HAS_DSB(dev_priv) (INTEL_IN
[Intel-gfx] [PATCH 02/10] drm/i915/skl: Use revid->stepping tables
From: Matt Roper Switch SKL to use a revid->stepping table as we're trying to do on all platforms going forward. Also add some additional stepping definitions for completeness, even if we don't have any workarounds tied to them. Note that SKL has a case where a newer revision ID corresponds to an older GT/disp stepping (0x9 -> STEP_J0, 0xA -> STEP_I1). Also, the lack of a revision ID 0x8 in the table is intentional and not an oversight. We'll re-write the KBL-specific comment to make it clear that these kind of quirks are expected. Finally, since we're already touching the KBL area too, let's rename the KBL table to match the naming convention used by all of the other platforms. Bspec: 13626 Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 2 +- drivers/gpu/drm/i915/i915_drv.h | 11 +-- drivers/gpu/drm/i915/intel_step.c | 35 - drivers/gpu/drm/i915/intel_step.h | 4 +++ 4 files changed, 33 insertions(+), 19 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index d9a5a445ceec..6dfd564e078f 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -883,7 +883,7 @@ skl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE); /* WaInPlaceDecompressionHang:skl */ - if (IS_SKL_REVID(i915, SKL_REVID_H0, REVID_FOREVER)) + if (IS_SKL_GT_STEP(i915, STEP_H0, STEP_FOREVER)) wa_write_or(wal, GEN9_GAMT_ECO_REG_RW_IA, GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 4f2a61cb024a..775057626ee6 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1509,16 +1509,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define IS_TGL_Y(dev_priv) \ IS_SUBPLATFORM(dev_priv, INTEL_TIGERLAKE, INTEL_SUBPLATFORM_ULX) -#define SKL_REVID_A0 0x0 -#define SKL_REVID_B0 0x1 -#define SKL_REVID_C0 0x2 -#define SKL_REVID_D0 0x3 -#define SKL_REVID_E0 0x4 -#define SKL_REVID_F0 0x5 -#define SKL_REVID_G0 0x6 -#define SKL_REVID_H0 0x7 - -#define IS_SKL_REVID(p, since, until) (IS_SKYLAKE(p) && IS_REVID(p, since, until)) +#define IS_SKL_GT_STEP(p, since, until) (IS_SKYLAKE(p) && IS_GT_STEP(p, since, until)) #define IS_KBL_GT_STEP(dev_priv, since, until) \ (IS_KABYLAKE(dev_priv) && IS_GT_STEP(dev_priv, since, until)) diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c index ba9479a67521..bfd63f56c200 100644 --- a/drivers/gpu/drm/i915/intel_step.c +++ b/drivers/gpu/drm/i915/intel_step.c @@ -7,15 +7,31 @@ #include "intel_step.h" /* - * KBL revision ID ordering is bizarre; higher revision ID's map to lower - * steppings in some cases. So rather than test against the revision ID - * directly, let's map that into our own range of increasing ID's that we - * can test against in a regular manner. + * Some platforms have unusual ways of mapping PCI revision ID to GT/display + * steppings. E.g., in some cases a higher PCI revision may translate to a + * lower stepping of the GT and/or display IP. This file provides lookup + * tables to map the PCI revision into a standard set of stepping values that + * can be compared numerically. + * + * Also note that some revisions/steppings may have been set aside as + * placeholders but never materialized in real hardware; in those cases there + * may be jumps in the revision IDs or stepping values in the tables below. */ +static const struct intel_step_info skl_revid_step_tbl[] = { + [0x0] = { .gt_step = STEP_A0, .display_step = STEP_A0 }, + [0x1] = { .gt_step = STEP_B0, .display_step = STEP_B0 }, + [0x2] = { .gt_step = STEP_C0, .display_step = STEP_C0 }, + [0x3] = { .gt_step = STEP_D0, .display_step = STEP_D0 }, + [0x4] = { .gt_step = STEP_E0, .display_step = STEP_E0 }, + [0x5] = { .gt_step = STEP_F0, .display_step = STEP_F0 }, + [0x6] = { .gt_step = STEP_G0, .display_step = STEP_G0 }, + [0x7] = { .gt_step = STEP_H0, .display_step = STEP_H0 }, + [0x9] = { .gt_step = STEP_J0, .display_step = STEP_J0 }, + [0xA] = { .gt_step = STEP_I1, .display_step = STEP_I1 }, +}; -/* FIXME: what about REVID_E0 */ -static const struct intel_step_info kbl_revids[] = { +static const struct intel_step_info kbl_revid_step_tbl[] = { [0] = { .gt_step = STEP_A0, .display_step = STEP_A0 }, [1] = { .gt_step = STEP_B0, .display_step = STEP_B0 }, [2] = { .gt_step = STEP_C0, .display_step = STEP_B0 }, @@ -74,8 +90,11 @@ void intel_step_init(struct drm_i915_private *i915) revids = tgl_revid_step_tbl; si
[Intel-gfx] [PATCH 03/10] drm/i915/icl: Use revid->stepping tables
From: Matt Roper Switch ICL to use a revid->stepping table as we're trying to do on all platforms going forward. While we're at it, let's include some additional steppings that have popped up, even if we don't yet have any workarounds tied to those steppings (we probably need to audit our workaround list soon to see if any of the bounds have moved or if new workarounds have appeared). Note that the current bspec table is missing information about how to map PCI revision ID to GT/display steppings; it only provides an SoC stepping. The mapping to GT/display steppings (which aren't always the same as the SoC stepping) used to be in the bspec, but was apparently dropped during an update in Nov 2019; I've made my changes here based on an older bspec snapshot that still had the necessary information. We've requested that the missing information be restored. Bspec: 21441 # pre-Nov 2019 snapshot Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 12 ++-- drivers/gpu/drm/i915/i915_drv.h | 10 ++ drivers/gpu/drm/i915/intel_step.c | 12 drivers/gpu/drm/i915/intel_step.h | 2 ++ 4 files changed, 22 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index 6dfd564e078f..e2d8acb8c1c9 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -557,7 +557,7 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine, /* Wa_1604370585:icl (pre-prod) * Formerly known as WaPushConstantDereferenceHoldDisable */ - if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0)) + if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0)) wa_masked_en(wal, GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE); @@ -573,12 +573,12 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine, /* Wa_2006611047:icl (pre-prod) * Formerly known as WaDisableImprovedTdlClkGating */ - if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0)) + if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0)) wa_masked_en(wal, GEN7_ROW_CHICKEN2, GEN11_TDL_CLOCK_GATING_FIX_DISABLE); /* Wa_2006665173:icl (pre-prod) */ - if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0)) + if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0)) wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3, GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC); @@ -1023,13 +1023,13 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal) GAMW_ECO_DEV_CTX_RELOAD_DISABLE); /* Wa_1405779004:icl (pre-prod) */ - if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0)) + if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0)) wa_write_or(wal, SLICE_UNIT_LEVEL_CLKGATE, MSCUNIT_CLKGATE_DIS); /* Wa_1406838659:icl (pre-prod) */ - if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0)) + if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0)) wa_write_or(wal, INF_UNIT_LEVEL_CLKGATE, CGPSF_CLKGATE_DIS); @@ -1725,7 +1725,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) PMFLUSHDONE_LNEBLK); /* Wa_1406609255:icl (pre-prod) */ - if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0)) + if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0)) wa_write_or(wal, GEN7_SARCHKMD, GEN7_DISABLE_DEMAND_PREFETCH); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 775057626ee6..e26ff8624945 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1516,14 +1516,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define IS_KBL_DISPLAY_STEP(dev_priv, since, until) \ (IS_KABYLAKE(dev_priv) && IS_DISPLAY_STEP(dev_priv, since, until)) -#define ICL_REVID_A0 0x0 -#define ICL_REVID_A2 0x1 -#define ICL_REVID_B0 0x3 -#define ICL_REVID_B2 0x4 -#define ICL_REVID_C0 0x5 - -#define IS_ICL_REVID(p, since, until) \ - (IS_ICELAKE(p) && IS_REVID(p, since, until)) +#define IS_ICL_GT_STEP(p, since, until) \ + (IS_ICELAKE(p) && IS_GT_STEP(p, since, until)) #define EHL_REVID_A00x0 #define EHL_REVID_B00x1 diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c index bfd63f56c200..4d8248cf67d3 100644 --- a/drivers/gpu/drm/i915/intel_step.c +++ b/drivers/gpu/drm/i915/intel_step.c @@ -42,6 +42,15 @@ static const struct intel_step_info kbl_revid_step_tbl[]
[Intel-gfx] [PATCH 01/10] drm/i915: Make pre-production detection use direct revid comparison
From: Matt Roper Although we're converting our workarounds to use a revid->stepping lookup table, the function that detects pre-production hardware should continue to compare against PCI revision ID values directly. These are listed in the bspec as integers, so it's easier to confirm their correctness if we just use an integer literal rather than a symbolic name anyway. Since the BXT, GLK, and CNL revid macros were never used in any workaround code, just remove them completely. Bspec: 13620, 19131, 13626, 18329 Signed-off-by: Matt Roper --- drivers/gpu/drm/i915/i915_drv.c | 8 drivers/gpu/drm/i915/i915_drv.h | 24 drivers/gpu/drm/i915/intel_step.h | 1 + 3 files changed, 5 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 30d8cd8c69b1..90136995f5eb 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -271,10 +271,10 @@ static void intel_detect_preproduction_hw(struct drm_i915_private *dev_priv) bool pre = false; pre |= IS_HSW_EARLY_SDV(dev_priv); - pre |= IS_SKL_REVID(dev_priv, 0, SKL_REVID_F0); - pre |= IS_BXT_REVID(dev_priv, 0, BXT_REVID_B_LAST); - pre |= IS_KBL_GT_STEP(dev_priv, 0, STEP_A0); - pre |= IS_GLK_REVID(dev_priv, 0, GLK_REVID_A2); + pre |= IS_SKYLAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x6; + pre |= IS_BROXTON(dev_priv) && INTEL_REVID(dev_priv) < 0xA; + pre |= IS_KABYLAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x1; + pre |= IS_GEMINILAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x3; if (pre) { drm_err(&dev_priv->drm, "This is a pre-production stepping. " diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index d14cda2ff923..4f2a61cb024a 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1520,35 +1520,11 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define IS_SKL_REVID(p, since, until) (IS_SKYLAKE(p) && IS_REVID(p, since, until)) -#define BXT_REVID_A0 0x0 -#define BXT_REVID_A1 0x1 -#define BXT_REVID_B0 0x3 -#define BXT_REVID_B_LAST 0x8 -#define BXT_REVID_C0 0x9 - -#define IS_BXT_REVID(dev_priv, since, until) \ - (IS_BROXTON(dev_priv) && IS_REVID(dev_priv, since, until)) - #define IS_KBL_GT_STEP(dev_priv, since, until) \ (IS_KABYLAKE(dev_priv) && IS_GT_STEP(dev_priv, since, until)) #define IS_KBL_DISPLAY_STEP(dev_priv, since, until) \ (IS_KABYLAKE(dev_priv) && IS_DISPLAY_STEP(dev_priv, since, until)) -#define GLK_REVID_A0 0x0 -#define GLK_REVID_A1 0x1 -#define GLK_REVID_A2 0x2 -#define GLK_REVID_B0 0x3 - -#define IS_GLK_REVID(dev_priv, since, until) \ - (IS_GEMINILAKE(dev_priv) && IS_REVID(dev_priv, since, until)) - -#define CNL_REVID_A0 0x0 -#define CNL_REVID_B0 0x1 -#define CNL_REVID_C0 0x2 - -#define IS_CNL_REVID(p, since, until) \ - (IS_CANNONLAKE(p) && IS_REVID(p, since, until)) - #define ICL_REVID_A0 0x0 #define ICL_REVID_A2 0x1 #define ICL_REVID_B0 0x3 diff --git a/drivers/gpu/drm/i915/intel_step.h b/drivers/gpu/drm/i915/intel_step.h index 958a8bb5d677..8efacef6ab31 100644 --- a/drivers/gpu/drm/i915/intel_step.h +++ b/drivers/gpu/drm/i915/intel_step.h @@ -22,6 +22,7 @@ struct intel_step_info { enum intel_step { STEP_NONE = 0, STEP_A0, + STEP_A1, STEP_A2, STEP_B0, STEP_B1, -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 0/7] Minor revid/stepping and workaround cleanup
> -Original Message- > From: Roper, Matthew D > Sent: Thursday, July 8, 2021 4:05 PM > To: Srivatsa, Anusha > Cc: Jani Nikula ; intel-gfx@lists.freedesktop.org > Subject: Re: [PATCH 0/7] Minor revid/stepping and workaround cleanup > > On Thu, Jul 08, 2021 at 11:37:50AM -0700, Srivatsa, Anusha wrote: > > > > > > > -Original Message- > > > From: Jani Nikula > > > Sent: Thursday, July 8, 2021 12:33 AM > > > To: Roper, Matthew D ; intel- > > > g...@lists.freedesktop.org > > > Cc: Srivatsa, Anusha > > > Subject: Re: [PATCH 0/7] Minor revid/stepping and workaround cleanup > > > > > > On Wed, 07 Jul 2021, Matt Roper wrote: > > > > PCI revision IDs don't always map to GT and display IP steppings > > > > in an intuitive/sensible way. On many of our recent platforms > > > > we've switched to using revid->stepping lookup tables with the > > > > infrastructure in intel_step.c to handle stepping lookups and > > > > comparisons. Since it's confusing to have some of our platforms > > > > using the new lookup tables and some still using old revid > > > > comparisons, let's migrate all the old platforms over to the table > > > > approach since that's what we want to standardize on going > > > > forward. The only place that revision ID's should really get used > > > > directly now is when checking to see if we're running on pre-production > hardware. > > > > > > Anusha, Matt, please sort this out between the two of you. :) > > > > > > https://patchwork.freedesktop.org/series/92257/ > > > > > @Roper, Matthew D the series doesn't add the steeping table for BXT and > GLK. > > Right, that was intentional because we don't use the steppings for those > platforms anywhere in the code. But if that's changing with your DMC series, > I can add the tables for those two as well. > Yes, will need GLK and BXT Thanks Anusha > Matt > > > > > Anusha > > > BR, > > > Jani. > > > > > > > > > > > > > > Let's also take the opportunity to drop a bit of effectively dead > > > > code in the workarounds file too. > > > > > > > > Cc: Jani Nikula > > > > > > > > Matt Roper (7): > > > > drm/i915: Make pre-production detection use direct revid comparison > > > > drm/i915/skl: Use revid->stepping tables > > > > drm/i915/icl: Use revid->stepping tables > > > > drm/i915/jsl_ehl: Use revid->stepping tables > > > > drm/i915/rkl: Use revid->stepping tables > > > > drm/i915/dg1: Use revid->stepping tables > > > > drm/i915/cnl: Drop all workarounds > > > > > > > > .../drm/i915/display/intel_display_power.c| 2 +- > > > > drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 2 +- > > > > drivers/gpu/drm/i915/display/intel_psr.c | 4 +- > > > > drivers/gpu/drm/i915/gt/intel_region_lmem.c | 2 +- > > > > drivers/gpu/drm/i915/gt/intel_workarounds.c | 81 +++ > > > > drivers/gpu/drm/i915/i915_drv.c | 8 +- > > > > drivers/gpu/drm/i915/i915_drv.h | 80 +++--- > > > > drivers/gpu/drm/i915/intel_pm.c | 2 +- > > > > drivers/gpu/drm/i915/intel_step.c | 72 +++-- > > > > drivers/gpu/drm/i915/intel_step.h | 7 ++ > > > > 10 files changed, 107 insertions(+), 153 deletions(-) > > > > > > -- > > > Jani Nikula, Intel Open Source Graphics Center > > -- > Matt Roper > Graphics Software Engineer > VTT-OSGC Platform Enablement > Intel Corporation > (916) 356-2795 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 0/7] Minor revid/stepping and workaround cleanup
On Thu, Jul 08, 2021 at 11:37:50AM -0700, Srivatsa, Anusha wrote: > > > > -Original Message- > > From: Jani Nikula > > Sent: Thursday, July 8, 2021 12:33 AM > > To: Roper, Matthew D ; intel- > > g...@lists.freedesktop.org > > Cc: Srivatsa, Anusha > > Subject: Re: [PATCH 0/7] Minor revid/stepping and workaround cleanup > > > > On Wed, 07 Jul 2021, Matt Roper wrote: > > > PCI revision IDs don't always map to GT and display IP steppings in an > > > intuitive/sensible way. On many of our recent platforms we've > > > switched to using revid->stepping lookup tables with the > > > infrastructure in intel_step.c to handle stepping lookups and > > > comparisons. Since it's confusing to have some of our platforms using > > > the new lookup tables and some still using old revid comparisons, > > > let's migrate all the old platforms over to the table approach since > > > that's what we want to standardize on going forward. The only place > > > that revision ID's should really get used directly now is when > > > checking to see if we're running on pre-production hardware. > > > > Anusha, Matt, please sort this out between the two of you. :) > > > > https://patchwork.freedesktop.org/series/92257/ > > > @Roper, Matthew D the series doesn't add the steeping table for BXT and GLK. Right, that was intentional because we don't use the steppings for those platforms anywhere in the code. But if that's changing with your DMC series, I can add the tables for those two as well. Matt > > Anusha > > BR, > > Jani. > > > > > > > > > > Let's also take the opportunity to drop a bit of effectively dead code > > > in the workarounds file too. > > > > > > Cc: Jani Nikula > > > > > > Matt Roper (7): > > > drm/i915: Make pre-production detection use direct revid comparison > > > drm/i915/skl: Use revid->stepping tables > > > drm/i915/icl: Use revid->stepping tables > > > drm/i915/jsl_ehl: Use revid->stepping tables > > > drm/i915/rkl: Use revid->stepping tables > > > drm/i915/dg1: Use revid->stepping tables > > > drm/i915/cnl: Drop all workarounds > > > > > > .../drm/i915/display/intel_display_power.c| 2 +- > > > drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 2 +- > > > drivers/gpu/drm/i915/display/intel_psr.c | 4 +- > > > drivers/gpu/drm/i915/gt/intel_region_lmem.c | 2 +- > > > drivers/gpu/drm/i915/gt/intel_workarounds.c | 81 +++ > > > drivers/gpu/drm/i915/i915_drv.c | 8 +- > > > drivers/gpu/drm/i915/i915_drv.h | 80 +++--- > > > drivers/gpu/drm/i915/intel_pm.c | 2 +- > > > drivers/gpu/drm/i915/intel_step.c | 72 +++-- > > > drivers/gpu/drm/i915/intel_step.h | 7 ++ > > > 10 files changed, 107 insertions(+), 153 deletions(-) > > > > -- > > Jani Nikula, Intel Open Source Graphics Center -- Matt Roper Graphics Software Engineer VTT-OSGC Platform Enablement Intel Corporation (916) 356-2795 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v3 03/20] drm/sched: Barriers are needed for entity->last_scheduled
On 2021-07-08 1:37 p.m., Daniel Vetter wrote: It might be good enough on x86 with just READ_ONCE, but the write side should then at least be WRITE_ONCE because x86 has total store order. It's definitely not enough on arm. Fix this proplery, which means - explain the need for the barrier in both places - point at the other side in each comment Also pull out the !sched_list case as the first check, so that the code flow is clearer. While at it sprinkle some comments around because it was very non-obvious to me what's actually going on here and why. Note that we really need full barriers here, at first I thought store-release and load-acquire on ->last_scheduled would be enough, but we actually requiring ordering between that and the queue state. Signed-off-by: Daniel Vetter Cc: "Christian König" Cc: Steven Price Cc: Daniel Vetter Cc: Andrey Grodzovsky Cc: Lee Jones Cc: Boris Brezillon --- drivers/gpu/drm/scheduler/sched_entity.c | 27 ++-- 1 file changed, 25 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 64d398166644..4e1124ed80e0 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity) dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED); dma_fence_put(entity->last_scheduled); + entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished); + /* +* if the queue is empty we allow drm_sched_job_arm() to locklessly Probably meant drm_sched_entity_select_rq here +* access ->last_scheduled. This only works if we set the pointer before +* we dequeue and if we a write barrier here. +*/ + smp_wmb(); + spsc_queue_pop(&entity->job_queue); return sched_job; } @@ -459,10 +467,25 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity) struct drm_gpu_scheduler *sched; struct drm_sched_rq *rq; - if (spsc_queue_count(&entity->job_queue) || !entity->sched_list) + /* single possible engine and already selected */ + if (!entity->sched_list) + return; + + /* queue non-empty, stay on the same engine */ + if (spsc_queue_count(&entity->job_queue)) return; Shouldn't smp_rmb be here in between ? Given the queue is empty we want to be certain we are reading the most recent value of entity->last_scheduled Andrey - fence = READ_ONCE(entity->last_scheduled); + fence = entity->last_scheduled; + + /* +* Only when the queue is empty are we guaranteed the the scheduler +* thread cannot change ->last_scheduled. To enforce ordering we need +* a read barrier here. See drm_sched_entity_pop_job() for the other +* side. +*/ + smp_rmb(); + + /* stay on the same engine if the previous job hasn't finished */ if (fence && !dma_fence_is_signaled(fence)) return; ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✓ Fi.CI.BAT: success for CT changes required for GuC submission
== Series Details == Series: CT changes required for GuC submission URL : https://patchwork.freedesktop.org/series/92330/ State : success == Summary == CI Bug Log - changes from CI_DRM_10320 -> Patchwork_20556 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20556/index.html Changes --- No changes found Participating hosts (40 -> 39) -- Missing(1): fi-bsw-cyan Build changes - * Linux: CI_DRM_10320 -> Patchwork_20556 CI-20190529: 20190529 CI_DRM_10320: 7d61ab4a59bcbb206324b6a430748b4c15dd8adb @ git://anongit.freedesktop.org/gfx-ci/linux IGT_6132: 61fb9cdf2a9132e3618c8b08b9d20fec0c347831 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git Patchwork_20556: 2d17849eda02a4072eb8ab2ba74f5bb44dc8a027 @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == 2d17849eda02 drm/i915/guc: Module load failure test for CT buffer creation f6019e334f58 drm/i915/guc: Optimize CTB writes and reads 8ca5983abea9 drm/i915/guc: Add stall timer to non blocking CTB send function d8d221ac4b12 drm/i915/guc: Add non blocking CTB send function df6020220043 drm/i915/guc: Increase size of CTB buffers 6d0520a4e3e0 drm/i915/guc: Improve error message for unsolicited CT response c5e508db6c3f drm/i915/guc: Relax CTB response timeout == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20556/index.html ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.SPARSE: warning for CT changes required for GuC submission
== Series Details == Series: CT changes required for GuC submission URL : https://patchwork.freedesktop.org/series/92330/ State : warning == Summary == $ dim sparse --fast origin/drm-tip Sparse version: v0.6.2 Fast mode used, each commit won't be checked separately. - +drivers/gpu/drm/i915/display/intel_display.c:1896:21:expected struct i915_vma *[assigned] vma +drivers/gpu/drm/i915/display/intel_display.c:1896:21:got void [noderef] __iomem *[assigned] iomem +drivers/gpu/drm/i915/display/intel_display.c:1896:21: warning: incorrect type in assignment (different address spaces) +drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34:expected struct i915_address_space *vm +drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34:got struct i915_address_space [noderef] __rcu *vm +drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34: warning: incorrect type in argument 1 (different address spaces) +drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct i915_address_space [noderef] __rcu *vm +drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct i915_address_space * +drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect type in assignment (different address spaces) +drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct i915_address_space *vm +drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct i915_address_space [noderef] __rcu *vm +drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect type in argument 1 (different address spaces) +drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_reset.c:1396:5: warning: context imbalance in 'intel_gt_reset_trylock' - different lock contexts for basic block +drivers/gpu/drm/i915/gt/intel_ring_submission.c:1210:24: warning: Using plain integer as NULL pointer +drivers/gpu/drm/i915/i915_perf.c:1434:15: warning: memset with byte count of 16777216 +drivers/gpu/drm/i915/i915_perf.c:1488:15: warning: memset with byte count of 16777216 +./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative (-262080) +./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative (-262080) +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read16' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read32' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read64' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read8' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write16' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write32' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write8' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read16' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read32' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read64' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read8' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write16' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write32' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write8' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: con
[Intel-gfx] ✓ Fi.CI.BAT: success for Set BPP in the kernel (rev2)
== Series Details == Series: Set BPP in the kernel (rev2) URL : https://patchwork.freedesktop.org/series/92312/ State : success == Summary == CI Bug Log - changes from CI_DRM_10320 -> Patchwork_20554 Summary --- **SUCCESS** No regressions found. External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20554/index.html Known issues Here are the changes found in Patchwork_20554 that come from known issues: ### IGT changes ### Issues hit * igt@gem_exec_suspend@basic-s0: - fi-cfl-8109u: [PASS][1] -> [INCOMPLETE][2] ([i915#155]) [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10320/fi-cfl-8109u/igt@gem_exec_susp...@basic-s0.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20554/fi-cfl-8109u/igt@gem_exec_susp...@basic-s0.html Warnings * igt@runner@aborted: - fi-bdw-5557u: [FAIL][3] ([i915#2722] / [i915#3744]) -> [FAIL][4] ([i915#1602] / [i915#2029]) [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10320/fi-bdw-5557u/igt@run...@aborted.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20554/fi-bdw-5557u/igt@run...@aborted.html [i915#155]: https://gitlab.freedesktop.org/drm/intel/issues/155 [i915#1602]: https://gitlab.freedesktop.org/drm/intel/issues/1602 [i915#2029]: https://gitlab.freedesktop.org/drm/intel/issues/2029 [i915#2722]: https://gitlab.freedesktop.org/drm/intel/issues/2722 [i915#3744]: https://gitlab.freedesktop.org/drm/intel/issues/3744 Participating hosts (40 -> 39) -- Missing(1): fi-bsw-cyan Build changes - * Linux: CI_DRM_10320 -> Patchwork_20554 CI-20190529: 20190529 CI_DRM_10320: 7d61ab4a59bcbb206324b6a430748b4c15dd8adb @ git://anongit.freedesktop.org/gfx-ci/linux IGT_6132: 61fb9cdf2a9132e3618c8b08b9d20fec0c347831 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git Patchwork_20554: ed4b3f15937bc71ae8a2db7c8c6a998ef31d3067 @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == ed4b3f15937b drm/i915/display/dsc: Force dsc BPP c4f9e04d46c8 drm/i915/display/dsc: Add Per connector debugfs node for DSC BPP enable ea416f2dba91 drm/i915/display: Add write permissions for fec support == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20554/index.html ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/i915/gem: ioctl clean-ups (rev8)
== Series Details == Series: drm/i915/gem: ioctl clean-ups (rev8) URL : https://patchwork.freedesktop.org/series/89443/ State : failure == Summary == Applying: drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZE Using index info to reconstruct a base tree... M drivers/gpu/drm/i915/Makefile M drivers/gpu/drm/i915/gem/i915_gem_context.c A drivers/gpu/drm/i915/gt/intel_context_param.c M drivers/gpu/drm/i915/gt/intel_context_param.h M include/uapi/drm/i915_drm.h Falling back to patching base and 3-way merge... Auto-merging include/uapi/drm/i915_drm.h Auto-merging drivers/gpu/drm/i915/gt/intel_context_param.h CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/gt/intel_context_param.h Auto-merging drivers/gpu/drm/i915/gem/i915_gem_context.c CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/gem/i915_gem_context.c error: Failed to merge in the changes. hint: Use 'git am --show-current-patch=diff' to see the failed patch Patch failed at 0001 drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZE When you have resolved this problem, run "git am --continue". If you prefer to skip this patch, run "git am --skip" instead. To restore the original branch and stop patching, run "git am --abort". ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Set BPP in the kernel (rev2)
== Series Details == Series: Set BPP in the kernel (rev2) URL : https://patchwork.freedesktop.org/series/92312/ State : warning == Summary == $ dim sparse --fast origin/drm-tip Sparse version: v0.6.2 Fast mode used, each commit won't be checked separately. - +drivers/gpu/drm/i915/display/intel_display.c:1896:21:expected struct i915_vma *[assigned] vma +drivers/gpu/drm/i915/display/intel_display.c:1896:21:got void [noderef] __iomem *[assigned] iomem +drivers/gpu/drm/i915/display/intel_display.c:1896:21: warning: incorrect type in assignment (different address spaces) +drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34:expected struct i915_address_space *vm +drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34:got struct i915_address_space [noderef] __rcu *vm +drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34: warning: incorrect type in argument 1 (different address spaces) +drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct i915_address_space [noderef] __rcu *vm +drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct i915_address_space * +drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect type in assignment (different address spaces) +drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct i915_address_space *vm +drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct i915_address_space [noderef] __rcu *vm +drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect type in argument 1 (different address spaces) +drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy expression type 31 +drivers/gpu/drm/i915/gt/intel_reset.c:1396:5: warning: context imbalance in 'intel_gt_reset_trylock' - different lock contexts for basic block +./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative (-262080) +./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative (-262080) +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read16' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read32' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read64' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read8' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write16' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write32' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write8' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read16' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read32' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read64' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read8' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write16' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write32' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write8' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read16' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read32' - different lock contexts for basic block +./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fw
[Intel-gfx] [PATCH] drm/sched: Barriers are needed for entity->last_scheduled
It might be good enough on x86 with just READ_ONCE, but the write side should then at least be WRITE_ONCE because x86 has total store order. It's definitely not enough on arm. Fix this proplery, which means - explain the need for the barrier in both places - point at the other side in each comment Also pull out the !sched_list case as the first check, so that the code flow is clearer. While at it sprinkle some comments around because it was very non-obvious to me what's actually going on here and why. Note that we really need full barriers here, at first I thought store-release and load-acquire on ->last_scheduled would be enough, but we actually requiring ordering between that and the queue state. v2: Put smp_rmp() in the right place and fix up comment (Andrey) Signed-off-by: Daniel Vetter Cc: "Christian König" Cc: Steven Price Cc: Daniel Vetter Cc: Andrey Grodzovsky Cc: Lee Jones Cc: Boris Brezillon --- drivers/gpu/drm/scheduler/sched_entity.c | 27 ++-- 1 file changed, 25 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 64d398166644..6366006c0fcf 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity) dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED); dma_fence_put(entity->last_scheduled); + entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished); + /* +* If the queue is empty we allow drm_sched_entity_select_rq() to +* locklessly access ->last_scheduled. This only works if we set the +* pointer before we dequeue and if we a write barrier here. +*/ + smp_wmb(); + spsc_queue_pop(&entity->job_queue); return sched_job; } @@ -459,10 +467,25 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity) struct drm_gpu_scheduler *sched; struct drm_sched_rq *rq; - if (spsc_queue_count(&entity->job_queue) || !entity->sched_list) + /* single possible engine and already selected */ + if (!entity->sched_list) + return; + + /* queue non-empty, stay on the same engine */ + if (spsc_queue_count(&entity->job_queue)) return; - fence = READ_ONCE(entity->last_scheduled); + /* +* Only when the queue is empty are we guaranteed that the scheduler +* thread cannot change ->last_scheduled. To enforce ordering we need +* a read barrier here. See drm_sched_entity_pop_job() for the other +* side. +*/ + smp_rmb(); + + fence = entity->last_scheduled; + + /* stay on the same engine if the previous job hasn't finished */ if (fence && !dma_fence_is_signaled(fence)) return; -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 3/7] drm/i915/adl_s: Extend Wa_1406941453
BSpec: 54370 Cc: Gwan-gyeong Mun Signed-off-by: José Roberto de Souza --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index c346229e2be00..72562c233ad20 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -1677,8 +1677,9 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) GEN8_RC_SEMA_IDLE_MSG_DISABLE); } - if (IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) { - /* Wa_1406941453:tgl,rkl,dg1 */ + if (IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915) || + IS_ALDERLAKE_S(i915)) { + /* Wa_1406941453:tgl,rkl,dg1,adl-s */ wa_masked_en(wal, GEN10_SAMPLER_MODE, ENABLE_SMALLPL); -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 4/7] drm/i915: Limit maximum number of memory channels
Alderlake-P PCODE is returning 4 memory channels while it has a maximum of 2. So adding this limit and printing a debug message but the real fix will need to come from PCODE. HSDES: 22013272110 Signed-off-by: José Roberto de Souza --- drivers/gpu/drm/i915/intel_dram.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/i915/intel_dram.c b/drivers/gpu/drm/i915/intel_dram.c index 879b0f007be31..de1d426627ef1 100644 --- a/drivers/gpu/drm/i915/intel_dram.c +++ b/drivers/gpu/drm/i915/intel_dram.c @@ -467,6 +467,10 @@ static int icl_pcode_read_mem_global_info(struct drm_i915_private *dev_priv) } dram_info->num_channels = (val & 0xf0) >> 4; + if (dram_info->num_channels > 2) { + drm_info(&dev_priv->drm, "More DRAM channels than expected, setting to max.\n"); + dram_info->num_channels = 2; + } dram_info->num_qgv_points = (val & 0xf00) >> 8; return 0; -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/7] drm/i915: Settle on "adl-x" in WA comments
From: Lucas De Marchi Most of the places are using this format so lets consolidate it. Signed-off-by: José Roberto de Souza Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/display/intel_cdclk.c | 2 +- drivers/gpu/drm/i915/display/intel_cursor.c| 2 +- drivers/gpu/drm/i915/display/intel_display.c | 2 +- drivers/gpu/drm/i915/display/intel_psr.c | 10 +- drivers/gpu/drm/i915/display/skl_universal_plane.c | 2 +- drivers/gpu/drm/i915/gt/intel_workarounds.c| 2 +- drivers/gpu/drm/i915/intel_pm.c| 4 ++-- 7 files changed, 12 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c index df2d8ce4a12f6..71067a62264de 100644 --- a/drivers/gpu/drm/i915/display/intel_cdclk.c +++ b/drivers/gpu/drm/i915/display/intel_cdclk.c @@ -2878,7 +2878,7 @@ void intel_init_cdclk_hooks(struct drm_i915_private *dev_priv) dev_priv->display.bw_calc_min_cdclk = skl_bw_calc_min_cdclk; dev_priv->display.modeset_calc_cdclk = bxt_modeset_calc_cdclk; dev_priv->display.calc_voltage_level = tgl_calc_voltage_level; - /* Wa_22011320316:adlp[a0] */ + /* Wa_22011320316:adl-p[a0] */ if (IS_ADLP_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0)) dev_priv->cdclk.table = adlp_a_step_cdclk_table; else diff --git a/drivers/gpu/drm/i915/display/intel_cursor.c b/drivers/gpu/drm/i915/display/intel_cursor.c index bb61e736de911..f61a25fb87e90 100644 --- a/drivers/gpu/drm/i915/display/intel_cursor.c +++ b/drivers/gpu/drm/i915/display/intel_cursor.c @@ -383,7 +383,7 @@ static u32 i9xx_cursor_ctl(const struct intel_crtc_state *crtc_state, if (plane_state->hw.rotation & DRM_MODE_ROTATE_180) cntl |= MCURSOR_ROTATE_180; - /* Wa_22012358565:adlp */ + /* Wa_22012358565:adl-p */ if (DISPLAY_VER(dev_priv) == 13) cntl |= MCURSOR_ARB_SLOTS(1); diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 026c28c612f07..65ddb6ca16e67 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -975,7 +975,7 @@ void intel_enable_pipe(const struct intel_crtc_state *new_crtc_state) /* FIXME: assert CPU port conditions for SNB+ */ } - /* Wa_22012358565:adlp */ + /* Wa_22012358565:adl-p */ if (DISPLAY_VER(dev_priv) == 13) intel_de_rmw(dev_priv, PIPE_ARB_CTL(pipe), 0, PIPE_ARB_USE_PROG_SLOTS); diff --git a/drivers/gpu/drm/i915/display/intel_psr.c b/drivers/gpu/drm/i915/display/intel_psr.c index 9643624fe160d..4dfe1dceb8635 100644 --- a/drivers/gpu/drm/i915/display/intel_psr.c +++ b/drivers/gpu/drm/i915/display/intel_psr.c @@ -545,7 +545,7 @@ static void hsw_activate_psr2(struct intel_dp *intel_dp) val |= EDP_PSR2_FRAME_BEFORE_SU(intel_dp->psr.sink_sync_latency + 1); val |= intel_psr2_get_tp_time(intel_dp); - /* Wa_22012278275:adlp */ + /* Wa_22012278275:adl-p */ if (IS_ADLP_DISPLAY_STEP(dev_priv, STEP_A0, STEP_D1)) { static const u8 map[] = { 2, /* 5 lines */ @@ -733,7 +733,7 @@ tgl_dc3co_exitline_compute_config(struct intel_dp *intel_dp, if (!dc3co_is_pipe_port_compatible(intel_dp, crtc_state)) return; - /* Wa_16011303918:adlp */ + /* Wa_16011303918:adl-p */ if (IS_ADLP_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0)) return; @@ -965,7 +965,7 @@ static bool intel_psr2_config_valid(struct intel_dp *intel_dp, return false; } - /* Wa_16011303918:adlp */ + /* Wa_16011303918:adl-p */ if (crtc_state->vrr.enable && IS_ADLP_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0)) { drm_dbg_kms(&dev_priv->drm, @@ -1160,7 +1160,7 @@ static void intel_psr_enable_source(struct intel_dp *intel_dp) intel_dp->psr.psr2_sel_fetch_enabled ? IGNORE_PSR2_HW_TRACKING : 0); - /* Wa_16011168373:adlp */ + /* Wa_16011168373:adl-p */ if (IS_ADLP_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) && intel_dp->psr.psr2_enabled) intel_de_rmw(dev_priv, @@ -1346,7 +1346,7 @@ static void intel_psr_disable_locked(struct intel_dp *intel_dp) intel_de_rmw(dev_priv, CHICKEN_PAR1_1, DIS_RAM_BYPASS_PSR2_MAN_TRACK, 0); - /* Wa_16011168373:adlp */ + /* Wa_16011168373:adl-p */ if (IS_ADLP_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) && intel_dp->psr.psr2_enabled) intel_de_rmw(dev_priv, diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c b/drivers/gpu/drm/i915/display/skl_universal_pl
[Intel-gfx] [PATCH 6/7] drm/i915/display/adl_p: Correctly program MBUS DBOX A credits
Alderlake-P have different values for MBUS DBOX A credits depending if MBUS join is enabled or not. BSpec: 50343 BSpec: 54369 Cc: Matt Atwood Signed-off-by: José Roberto de Souza --- drivers/gpu/drm/i915/display/intel_display.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 65ddb6ca16e67..fe380896eb99e 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -3400,13 +3400,17 @@ static void glk_pipe_scaler_clock_gating_wa(struct drm_i915_private *dev_priv, intel_de_write(dev_priv, CLKGATE_DIS_PSL(pipe), val); } -static void icl_pipe_mbus_enable(struct intel_crtc *crtc) +static void icl_pipe_mbus_enable(struct intel_crtc *crtc, bool joined_mbus) { struct drm_i915_private *dev_priv = to_i915(crtc->base.dev); enum pipe pipe = crtc->pipe; u32 val; - val = MBUS_DBOX_A_CREDIT(2); + /* Wa_22010947358:adl-p */ + if (IS_ALDERLAKE_P(dev_priv)) + val = joined_mbus ? MBUS_DBOX_A_CREDIT(6) : MBUS_DBOX_A_CREDIT(4); + else + val = MBUS_DBOX_A_CREDIT(2); if (DISPLAY_VER(dev_priv) >= 12) { val |= MBUS_DBOX_BW_CREDIT(2); @@ -3561,8 +3565,12 @@ static void hsw_crtc_enable(struct intel_atomic_state *state, if (dev_priv->display.initial_watermarks) dev_priv->display.initial_watermarks(state, crtc); - if (DISPLAY_VER(dev_priv) >= 11) - icl_pipe_mbus_enable(crtc); + if (DISPLAY_VER(dev_priv) >= 11) { + const struct intel_dbuf_state *dbuf_state = + intel_atomic_get_new_dbuf_state(state); + + icl_pipe_mbus_enable(crtc, dbuf_state->joined_mbus); + } if (new_crtc_state->bigjoiner_slave) intel_crtc_vblank_on(new_crtc_state); -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 5/7] drm/i915: Limit Wa_22010178259 to affected platforms
This workaround is not needed for platforms with display 13. Cc: Gwan-gyeong Mun Signed-off-by: José Roberto de Souza --- drivers/gpu/drm/i915/display/intel_display_power.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c index 285380079aab2..6fc766da66054 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.c +++ b/drivers/gpu/drm/i915/display/intel_display_power.c @@ -5822,10 +5822,11 @@ static void tgl_bw_buddy_init(struct drm_i915_private *dev_priv) intel_de_write(dev_priv, BW_BUDDY_PAGE_MASK(i), table[config].page_mask); - /* Wa_22010178259:tgl,rkl */ - intel_de_rmw(dev_priv, BW_BUDDY_CTL(i), -BW_BUDDY_TLB_REQ_TIMER_MASK, -BW_BUDDY_TLB_REQ_TIMER(0x8)); + /* Wa_22010178259:tgl,dg1,rkl,adl-s */ + if (DISPLAY_VER(dev_priv) == 12) + intel_de_rmw(dev_priv, BW_BUDDY_CTL(i), +BW_BUDDY_TLB_REQ_TIMER_MASK, +BW_BUDDY_TLB_REQ_TIMER(0x8)); } } } -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 7/7] drm/i915/display/xelpd: Exetend Wa_14011508470
This workaround is also applicable to xelpd display so extending it. Cc: Gwan-gyeong Mun Signed-off-by: José Roberto de Souza --- drivers/gpu/drm/i915/display/intel_display_power.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c b/drivers/gpu/drm/i915/display/intel_display_power.c index 6fc766da66054..d92db471411e5 100644 --- a/drivers/gpu/drm/i915/display/intel_display_power.c +++ b/drivers/gpu/drm/i915/display/intel_display_power.c @@ -5883,8 +5883,8 @@ static void icl_display_core_init(struct drm_i915_private *dev_priv, if (resume && intel_dmc_has_payload(dev_priv)) intel_dmc_load_program(dev_priv); - /* Wa_14011508470 */ - if (DISPLAY_VER(dev_priv) == 12) { + /* Wa_14011508470:tgl,dg1,rkl,adl-s,adl-p */ + if (DISPLAY_VER(dev_priv) >= 12) { val = DCPR_CLEAR_MEMSTAT_DIS | DCPR_SEND_RESP_IMM | DCPR_MASK_LPMODE | DCPR_MASK_MAXLATENCY_MEMUP_CLR; intel_uncore_rmw(&dev_priv->uncore, GEN11_CHICKEN_DCPR_2, 0, val); -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/7] drm/i915: Implement Wa_1508744258
Same bit was required for Wa_14012131227 in DG1 now it is also required as Wa_1508744258 to TGL, RKL, DG1, ADL-S and ADL-P. Cc: Gwan-gyeong Mun Signed-off-by: José Roberto de Souza --- drivers/gpu/drm/i915/gt/intel_workarounds.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c index e5e3f820074a9..c346229e2be00 100644 --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c @@ -670,6 +670,13 @@ static void gen12_ctx_workarounds_init(struct intel_engine_cs *engine, FF_MODE2_GS_TIMER_MASK, FF_MODE2_GS_TIMER_224, 0); + + /* +* Wa_14012131227:dg1 +* Wa_1508744258:tgl,rkl,dg1,adl-s,adl-p +*/ + wa_masked_en(wal, GEN7_COMMON_SLICE_CHICKEN1, +GEN9_RHWO_OPTIMIZATION_DISABLE); } static void dg1_ctx_workarounds_init(struct intel_engine_cs *engine, -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/display/xelpd: Fix incorrect color capability reporting
== Series Details == Series: drm/i915/display/xelpd: Fix incorrect color capability reporting URL : https://patchwork.freedesktop.org/series/92266/ State : success == Summary == CI Bug Log - changes from CI_DRM_10308_full -> Patchwork_20542_full Summary --- **SUCCESS** No regressions found. Possible new issues --- Here are the unknown changes that may have been introduced in Patchwork_20542_full: ### IGT changes ### Suppressed The following results come from untrusted machines, tests, or statuses. They do not affect the overall result. * igt@kms_ccs@pipe-a-bad-aux-stride-yf_tiled_ccs: - {shard-rkl}:[FAIL][1] ([i915#3678]) -> [SKIP][2] +2 similar issues [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-rkl-2/igt@kms_ccs@pipe-a-bad-aux-stride-yf_tiled_ccs.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-rkl-6/igt@kms_ccs@pipe-a-bad-aux-stride-yf_tiled_ccs.html * igt@sysfs_timeslice_duration@timeout@vcs0: - {shard-rkl}:[PASS][3] -> [FAIL][4] [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-rkl-6/igt@sysfs_timeslice_duration@time...@vcs0.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-rkl-6/igt@sysfs_timeslice_duration@time...@vcs0.html Known issues Here are the changes found in Patchwork_20542_full that come from known issues: ### IGT changes ### Issues hit * igt@gem_create@create-massive: - shard-snb: NOTRUN -> [DMESG-WARN][5] ([i915#3002]) [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-snb2/igt@gem_cre...@create-massive.html - shard-kbl: NOTRUN -> [DMESG-WARN][6] ([i915#3002]) [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-kbl1/igt@gem_cre...@create-massive.html * igt@gem_ctx_persistence@legacy-engines-mixed: - shard-snb: NOTRUN -> [SKIP][7] ([fdo#109271] / [i915#1099]) +5 similar issues [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-snb6/igt@gem_ctx_persiste...@legacy-engines-mixed.html * igt@gem_eio@in-flight-contexts-1us: - shard-tglb: [PASS][8] -> [TIMEOUT][9] ([i915#3063]) [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-tglb7/igt@gem_...@in-flight-contexts-1us.html [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-tglb3/igt@gem_...@in-flight-contexts-1us.html * igt@gem_exec_fair@basic-flow@rcs0: - shard-tglb: [PASS][10] -> [FAIL][11] ([i915#2842]) +2 similar issues [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-tglb5/igt@gem_exec_fair@basic-f...@rcs0.html [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-tglb1/igt@gem_exec_fair@basic-f...@rcs0.html * igt@gem_exec_fair@basic-none@rcs0: - shard-kbl: [PASS][12] -> [FAIL][13] ([i915#2842]) +1 similar issue [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-kbl7/igt@gem_exec_fair@basic-n...@rcs0.html [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-kbl7/igt@gem_exec_fair@basic-n...@rcs0.html * igt@gem_exec_fair@basic-pace@vecs0: - shard-kbl: [PASS][14] -> [SKIP][15] ([fdo#109271]) +1 similar issue [14]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-kbl3/igt@gem_exec_fair@basic-p...@vecs0.html [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-kbl4/igt@gem_exec_fair@basic-p...@vecs0.html * igt@gem_exec_fair@basic-throttle@rcs0: - shard-glk: [PASS][16] -> [FAIL][17] ([i915#2842]) +2 similar issues [16]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-glk8/igt@gem_exec_fair@basic-throt...@rcs0.html [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-glk8/igt@gem_exec_fair@basic-throt...@rcs0.html * igt@gem_exec_reloc@basic-wide-active@vcs1: - shard-iclb: NOTRUN -> [FAIL][18] ([i915#3633]) [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-iclb1/igt@gem_exec_reloc@basic-wide-act...@vcs1.html * igt@gem_exec_suspend@basic-s3-devices: - shard-iclb: [PASS][19] -> [INCOMPLETE][20] ([i915#1185]) [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-iclb4/igt@gem_exec_susp...@basic-s3-devices.html [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-iclb3/igt@gem_exec_susp...@basic-s3-devices.html * igt@gem_mmap_gtt@big-copy-xy: - shard-skl: [PASS][21] -> [FAIL][22] ([i915#307]) [21]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-skl4/igt@gem_mmap_...@big-copy-xy.html [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-skl1/igt@gem_mmap_...@big-copy-xy.html * igt@gem_ppgtt@flink-and-close-vma-leak: - shard-glk: [PASS][23] -> [FAIL][24] ([i915#
Re: [Intel-gfx] [PATCH v3 03/20] drm/sched: Barriers are needed for entity->last_scheduled
On Thu, Jul 8, 2021 at 8:56 PM Andrey Grodzovsky wrote: > On 2021-07-08 1:37 p.m., Daniel Vetter wrote: > > It might be good enough on x86 with just READ_ONCE, but the write side > > should then at least be WRITE_ONCE because x86 has total store order. > > > > It's definitely not enough on arm. > > > > Fix this proplery, which means > > - explain the need for the barrier in both places > > - point at the other side in each comment > > > > Also pull out the !sched_list case as the first check, so that the > > code flow is clearer. > > > > While at it sprinkle some comments around because it was very > > non-obvious to me what's actually going on here and why. > > > > Note that we really need full barriers here, at first I thought > > store-release and load-acquire on ->last_scheduled would be enough, > > but we actually requiring ordering between that and the queue state. > > > > Signed-off-by: Daniel Vetter > > Cc: "Christian König" > > Cc: Steven Price > > Cc: Daniel Vetter > > Cc: Andrey Grodzovsky > > Cc: Lee Jones > > Cc: Boris Brezillon > > --- > > drivers/gpu/drm/scheduler/sched_entity.c | 27 ++-- > > 1 file changed, 25 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c > > b/drivers/gpu/drm/scheduler/sched_entity.c > > index 64d398166644..4e1124ed80e0 100644 > > --- a/drivers/gpu/drm/scheduler/sched_entity.c > > +++ b/drivers/gpu/drm/scheduler/sched_entity.c > > @@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct > > drm_sched_entity *entity) > > dma_fence_set_error(&sched_job->s_fence->finished, > > -ECANCELED); > > > > dma_fence_put(entity->last_scheduled); > > + > > entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished); > > > > + /* > > + * if the queue is empty we allow drm_sched_job_arm() to locklessly > > > Probably meant drm_sched_entity_select_rq here Which is called from drm_sched_job_arm but yes. I'll switch it around. > > + * access ->last_scheduled. This only works if we set the pointer > > before > > + * we dequeue and if we a write barrier here. > > + */ > > + smp_wmb(); > > + > > spsc_queue_pop(&entity->job_queue); > > return sched_job; > > } > > @@ -459,10 +467,25 @@ void drm_sched_entity_select_rq(struct > > drm_sched_entity *entity) > > struct drm_gpu_scheduler *sched; > > struct drm_sched_rq *rq; > > > > - if (spsc_queue_count(&entity->job_queue) || !entity->sched_list) > > + /* single possible engine and already selected */ > > + if (!entity->sched_list) > > + return; > > + > > + /* queue non-empty, stay on the same engine */ > > + if (spsc_queue_count(&entity->job_queue)) > > return; > > > Shouldn't smp_rmb be here in between ? Given the queue is empty we want to > be certain we are reading the most recent value of entity->last_scheduled Yeah I had a load_acquire barrier here earlier and then put the smp_rmb() on the wrong side. Will fix. > > Andrey > > > > > > > - fence = READ_ONCE(entity->last_scheduled); > > + fence = entity->last_scheduled; > > + > > + /* > > + * Only when the queue is empty are we guaranteed the the scheduler > > + * thread cannot change ->last_scheduled. To enforce ordering we need > > + * a read barrier here. See drm_sched_entity_pop_job() for the other > > + * side. > > + */ > > + smp_rmb(); > > + > > + /* stay on the same engine if the previous job hasn't finished */ > > if (fence && !dma_fence_is_signaled(fence)) > > return; > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 0/7] Minor revid/stepping and workaround cleanup
> -Original Message- > From: Jani Nikula > Sent: Thursday, July 8, 2021 12:33 AM > To: Roper, Matthew D ; intel- > g...@lists.freedesktop.org > Cc: Srivatsa, Anusha > Subject: Re: [PATCH 0/7] Minor revid/stepping and workaround cleanup > > On Wed, 07 Jul 2021, Matt Roper wrote: > > PCI revision IDs don't always map to GT and display IP steppings in an > > intuitive/sensible way. On many of our recent platforms we've > > switched to using revid->stepping lookup tables with the > > infrastructure in intel_step.c to handle stepping lookups and > > comparisons. Since it's confusing to have some of our platforms using > > the new lookup tables and some still using old revid comparisons, > > let's migrate all the old platforms over to the table approach since > > that's what we want to standardize on going forward. The only place > > that revision ID's should really get used directly now is when > > checking to see if we're running on pre-production hardware. > > Anusha, Matt, please sort this out between the two of you. :) > > https://patchwork.freedesktop.org/series/92257/ > @Roper, Matthew D the series doesn't add the steeping table for BXT and GLK. Anusha > BR, > Jani. > > > > > > Let's also take the opportunity to drop a bit of effectively dead code > > in the workarounds file too. > > > > Cc: Jani Nikula > > > > Matt Roper (7): > > drm/i915: Make pre-production detection use direct revid comparison > > drm/i915/skl: Use revid->stepping tables > > drm/i915/icl: Use revid->stepping tables > > drm/i915/jsl_ehl: Use revid->stepping tables > > drm/i915/rkl: Use revid->stepping tables > > drm/i915/dg1: Use revid->stepping tables > > drm/i915/cnl: Drop all workarounds > > > > .../drm/i915/display/intel_display_power.c| 2 +- > > drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 2 +- > > drivers/gpu/drm/i915/display/intel_psr.c | 4 +- > > drivers/gpu/drm/i915/gt/intel_region_lmem.c | 2 +- > > drivers/gpu/drm/i915/gt/intel_workarounds.c | 81 +++ > > drivers/gpu/drm/i915/i915_drv.c | 8 +- > > drivers/gpu/drm/i915/i915_drv.h | 80 +++--- > > drivers/gpu/drm/i915/intel_pm.c | 2 +- > > drivers/gpu/drm/i915/intel_step.c | 72 +++-- > > drivers/gpu/drm/i915/intel_step.h | 7 ++ > > 10 files changed, 107 insertions(+), 153 deletions(-) > > -- > Jani Nikula, Intel Open Source Graphics Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/7] drm/i915/skl: Use revid->stepping tables
> -Original Message- > From: Intel-gfx On Behalf Of > Matt Roper > Sent: Wednesday, July 7, 2021 10:38 PM > To: intel-gfx@lists.freedesktop.org > Subject: [Intel-gfx] [PATCH 2/7] drm/i915/skl: Use revid->stepping tables > > Switch SKL to use a revid->stepping table as we're trying to do on all > platforms going forward. Also add some additional stepping definitions for > completeness, even if we don't have any workarounds tied to them. > > Note that SKL has a case where a newer revision ID corresponds to an older > GT/disp stepping (0x9 -> STEP_J0, 0xA -> STEP_I1). Also, the lack of a > revision > ID 0x8 in the table is intentional and not an oversight. > We'll re-write the KBL-specific comment to make it clear that these kind of > quirks are expected. > > Finally, since we're already touching the KBL area too, let's rename the KBL > table to match the naming convention used by all of the other platforms. > > Bspec: 13626 > Signed-off-by: Matt Roper > --- > drivers/gpu/drm/i915/gt/intel_workarounds.c | 2 +- > drivers/gpu/drm/i915/i915_drv.h | 11 +-- > drivers/gpu/drm/i915/intel_step.c | 35 - > drivers/gpu/drm/i915/intel_step.h | 4 +++ > 4 files changed, 33 insertions(+), 19 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c > b/drivers/gpu/drm/i915/gt/intel_workarounds.c > index d9a5a445ceec..6dfd564e078f 100644 > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c > @@ -883,7 +883,7 @@ skl_gt_workarounds_init(struct drm_i915_private > *i915, struct i915_wa_list *wal) > GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE); > > /* WaInPlaceDecompressionHang:skl */ > - if (IS_SKL_REVID(i915, SKL_REVID_H0, REVID_FOREVER)) > + if (IS_SKL_GT_STEP(i915, STEP_H0, STEP_FOREVER)) > wa_write_or(wal, > GEN9_GAMT_ECO_REG_RW_IA, > GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS); > diff --git a/drivers/gpu/drm/i915/i915_drv.h > b/drivers/gpu/drm/i915/i915_drv.h index 796e6838bc79..300575f64ca6 > 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1462,16 +1462,7 @@ IS_SUBPLATFORM(const struct drm_i915_private > *i915, #define IS_TGL_Y(dev_priv) \ > IS_SUBPLATFORM(dev_priv, INTEL_TIGERLAKE, > INTEL_SUBPLATFORM_ULX) > > -#define SKL_REVID_A0 0x0 > -#define SKL_REVID_B0 0x1 > -#define SKL_REVID_C0 0x2 > -#define SKL_REVID_D0 0x3 > -#define SKL_REVID_E0 0x4 > -#define SKL_REVID_F0 0x5 > -#define SKL_REVID_G0 0x6 > -#define SKL_REVID_H0 0x7 > - > -#define IS_SKL_REVID(p, since, until) (IS_SKYLAKE(p) && IS_REVID(p, since, > until)) > +#define IS_SKL_GT_STEP(p, since, until) (IS_SKYLAKE(p) && IS_GT_STEP(p, > +since, until)) > > #define IS_KBL_GT_STEP(dev_priv, since, until) \ > (IS_KABYLAKE(dev_priv) && IS_GT_STEP(dev_priv, since, until)) diff - > -git a/drivers/gpu/drm/i915/intel_step.c > b/drivers/gpu/drm/i915/intel_step.c > index ba9479a67521..bfd63f56c200 100644 > --- a/drivers/gpu/drm/i915/intel_step.c > +++ b/drivers/gpu/drm/i915/intel_step.c > @@ -7,15 +7,31 @@ > #include "intel_step.h" > > /* > - * KBL revision ID ordering is bizarre; higher revision ID's map to lower > - * steppings in some cases. So rather than test against the revision ID > - * directly, let's map that into our own range of increasing ID's that we > - * can test against in a regular manner. > + * Some platforms have unusual ways of mapping PCI revision ID to > + GT/display > + * steppings. E.g., in some cases a higher PCI revision may translate > + to a > + * lower stepping of the GT and/or display IP. This file provides > + lookup > + * tables to map the PCI revision into a standard set of stepping > + values that > + * can be compared numerically. > + * > + * Also note that some revisions/steppings may have been set aside as > + * placeholders but never materialized in real hardware; in those cases > + there > + * may be jumps in the revision IDs or stepping values in the tables below. > */ > > +static const struct intel_step_info skl_revid_step_tbl[] = { > + [0x0] = { .gt_step = STEP_A0, .display_step = STEP_A0 }, > + [0x1] = { .gt_step = STEP_B0, .display_step = STEP_B0 }, > + [0x2] = { .gt_step = STEP_C0, .display_step = STEP_C0 }, > + [0x3] = { .gt_step = STEP_D0, .display_step = STEP_D0 }, > + [0x4] = { .gt_step = STEP_E0, .display_step = STEP_E0 }, > + [0x5] = { .gt_step = STEP_F0, .display_step = STEP_F0 }, > + [0x6] = { .gt_step = STEP_G0, .display_step = STEP_G0 }, > + [0x7] = { .gt_step = STEP_H0, .display_step = STEP_H0 }, > + [0x9] = { .gt_step = STEP_J0, .display_step = STEP_J0 }, > + [0xA] = { .gt_step = STEP_I1, .display_step = STEP_I1 }, }; Feedback I received was to avoid adding .display_step if it is same as .gt_step and have something li
Re: [Intel-gfx] [PATCH 1/7] drm/i915: Make pre-production detection use direct revid comparison
> -Original Message- > From: Intel-gfx On Behalf Of > Matt Roper > Sent: Wednesday, July 7, 2021 10:38 PM > To: intel-gfx@lists.freedesktop.org > Subject: [Intel-gfx] [PATCH 1/7] drm/i915: Make pre-production detection > use direct revid comparison > > Although we're converting our workarounds to use a revid->stepping lookup > table, the function that detects pre-production hardware should continue to > compare against PCI revision ID values directly. These are listed in the > bspec > as integers, so it's easier to confirm their correctness if we just use an > integer > literal rather than a symbolic name anyway. > > Since the BXT, GLK, and CNL revid macros were never used in any > workaround code, just remove them completely. > > Bspec: 13620, 19131, 13626, 18329 > Signed-off-by: Matt Roper > --- > drivers/gpu/drm/i915/i915_drv.c | 8 > drivers/gpu/drm/i915/i915_drv.h | 24 > drivers/gpu/drm/i915/intel_step.h | 1 + > 3 files changed, 5 insertions(+), 28 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_drv.c > b/drivers/gpu/drm/i915/i915_drv.c index 30d8cd8c69b1..90136995f5eb > 100644 > --- a/drivers/gpu/drm/i915/i915_drv.c > +++ b/drivers/gpu/drm/i915/i915_drv.c > @@ -271,10 +271,10 @@ static void intel_detect_preproduction_hw(struct > drm_i915_private *dev_priv) > bool pre = false; > > pre |= IS_HSW_EARLY_SDV(dev_priv); > - pre |= IS_SKL_REVID(dev_priv, 0, SKL_REVID_F0); > - pre |= IS_BXT_REVID(dev_priv, 0, BXT_REVID_B_LAST); > - pre |= IS_KBL_GT_STEP(dev_priv, 0, STEP_A0); > - pre |= IS_GLK_REVID(dev_priv, 0, GLK_REVID_A2); > + pre |= IS_SKYLAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x6; > + pre |= IS_BROXTON(dev_priv) && INTEL_REVID(dev_priv) < 0xA; > + pre |= IS_KABYLAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x1; > + pre |= IS_GEMINILAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x3; > > if (pre) { > drm_err(&dev_priv->drm, "This is a pre-production stepping. > " > diff --git a/drivers/gpu/drm/i915/i915_drv.h > b/drivers/gpu/drm/i915/i915_drv.h index 6dff4ca01241..796e6838bc79 > 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1473,35 +1473,11 @@ IS_SUBPLATFORM(const struct drm_i915_private > *i915, > > #define IS_SKL_REVID(p, since, until) (IS_SKYLAKE(p) && IS_REVID(p, since, > until)) > > -#define BXT_REVID_A0 0x0 > -#define BXT_REVID_A1 0x1 > -#define BXT_REVID_B0 0x3 > -#define BXT_REVID_B_LAST 0x8 > -#define BXT_REVID_C0 0x9 > - > -#define IS_BXT_REVID(dev_priv, since, until) \ > - (IS_BROXTON(dev_priv) && IS_REVID(dev_priv, since, until)) Here, we can have IS_BXT_GT_STEP, similar to other platform and use in intel_detect_preproduction_hw() above. Same for other platforms - SKL and GLK. KBL already uses IS_KBL_GT_STEP. Anusha > #define IS_KBL_GT_STEP(dev_priv, since, until) \ > (IS_KABYLAKE(dev_priv) && IS_GT_STEP(dev_priv, since, until)) > #define IS_KBL_DISPLAY_STEP(dev_priv, since, until) \ > (IS_KABYLAKE(dev_priv) && IS_DISPLAY_STEP(dev_priv, since, > until)) > > -#define GLK_REVID_A0 0x0 > -#define GLK_REVID_A1 0x1 > -#define GLK_REVID_A2 0x2 > -#define GLK_REVID_B0 0x3 > - > -#define IS_GLK_REVID(dev_priv, since, until) \ > - (IS_GEMINILAKE(dev_priv) && IS_REVID(dev_priv, since, until)) > - > -#define CNL_REVID_A0 0x0 > -#define CNL_REVID_B0 0x1 > -#define CNL_REVID_C0 0x2 > - > -#define IS_CNL_REVID(p, since, until) \ > - (IS_CANNONLAKE(p) && IS_REVID(p, since, until)) > - > #define ICL_REVID_A0 0x0 > #define ICL_REVID_A2 0x1 > #define ICL_REVID_B0 0x3 > diff --git a/drivers/gpu/drm/i915/intel_step.h > b/drivers/gpu/drm/i915/intel_step.h > index 958a8bb5d677..8efacef6ab31 100644 > --- a/drivers/gpu/drm/i915/intel_step.h > +++ b/drivers/gpu/drm/i915/intel_step.h > @@ -22,6 +22,7 @@ struct intel_step_info { enum intel_step { > STEP_NONE = 0, > STEP_A0, > + STEP_A1, > STEP_A2, > STEP_B0, > STEP_B1, > -- > 2.25.4 > > ___ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/intel-gfx ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 00/30] drm/i915/gem: ioctl clean-ups (v9)
On Thu, Jul 08, 2021 at 10:48:05AM -0500, Jason Ekstrand wrote: > Overview: > - > > This patch series attempts to clean up some of the IOCTL mess we've created > over the last few years. The most egregious bit being context mutability. > In summary, this series: > > 1. Drops two never-used context params: RINGSIZE and NO_ZEROMAP > 2. Drops the entire CONTEXT_CLONE API > 3. Implements SINGLE_TIMELINE with a syncobj instead of actually sharing > intel_timeline between engines. > 4. Adds a few sanity restrictions to the balancing/bonding API. > 5. Implements a proto-ctx mechanism so that the engine set and VM can only > be set early on in the lifetime of a context, before anything ever > executes on it. This effectively makes the VM and engine set > immutable. > > This series has been tested with IGT as well as the Iris, ANV, and the > Intel media driver doing an 8K decode (this uses bonding/balancing). I've > also done quite a bit of git archeology to ensure that nothing in here will > break anything that's already shipped at some point in history. It's > possible I've missed something, but I've dug quite a bit. > > > Details and motivation: > --- > > In very broad strokes, there's an effort going on right now within Intel to > try and clean up and simplify i915 anywhere we can. We obviously don't > want to break any shipping userspace but, as can be seen by this series, > there's a lot i915 theoretically supports which userspace doesn't actually > need. Some of this, like the two context params used here, were simply > oversights where we went through the usual API review process and merged > the i915 bits but the userspace bits never landed for some reason. > > Not all are so innocent, however. For instance, there's an entire context > cloning API which allows one to create a context with certain parameters > "cloned" from some other context. This entire API has never been used by > any userspace except IGT and there were never patches to any other > userspace to use it. It never should have landed. Also, when we added > support for setting explicit engine sets and sharing VMs across contexts, > people decided to do so via SET_CONTEXT_PARAM. While this allowed them to > re-use existing API, it did so at the cost of making those states mutable > which leads to a plethora of potential race conditions. There were even > IGT tests merged to cover some of theses: > > - gem_vm_create@async-destroy and gem_vm_create@destroy-race which test >swapping out the VM on a running context. > > - gem_ctx_persistence@replace* which test whether a client can escape a >non-persistent context by submitting a hanging batch and then swapping >out the engine set before the hang is detected. > > - api_intel_bb@bb-with-vm which tests the that intel_bb_assign_vm works >properly. This API is never used by any other IGT test. > > There is also an entire deferred flush and set state framework in > i915_gem_cotnext.c which exists for safely swapping out the VM while there > is work in-flight on a context. > > So, clearly people knew that this API was inherently racy and difficult to > implement but they landed it anyway. Why? The best explanation I've been > given is because it makes the API more "unified" or "symmetric" for this > stuff to go through SET_CONTEXT_PARAM. It's not because any userspace > actually wants to be able to swap out the VM or the set of engines on a > running context. That would be utterly insane. > > This patch series cleans up this particular mess by introducing the concept > of a i915_gem_proto_context data structure which contains context creation > information. When you initially call GEM_CONTEXT_CREATE, a proto-context > in created instead of an actual context. Then, the first time something is > done on the context besides SET_CONTEXT_PARAM, an actual context is > created. This allows us to keep the old drivers which use > SET_CONTEXT_PARAM to set up the engine set (see also media) while ensuring > that, once you have an i915_gem_context, the VM and the engine set are > immutable state. > > Eventually, there are more clean-ups I'd like to do on top of this which > should make working with contexts inside i915 simpler and safer: > > 1. Move the GEM handle -> vma LUT from i915_gem_context into either > i915_ppgtt or drm_i915_file_private depending on whether or not the > hardware has a full PPGTT. > > 2. Move the delayed context destruction code into intel_context or a > per-engine wrapper struct rather than i915_gem_context. > > 3. Get rid of the separation between context close and context destroy > > 4. Get rid of the RCU on i915_gem_context > > However, these should probably be done as a separate patch series as this > one is already starting to get longish, especially if you consider the 89 > IGT patches that go along with it. > > Test-with: 20210707210215.351483-1-ja...@jlekstran
[Intel-gfx] [PATCH] drm/i915/dg1: Compute MEM Bandwidth using MCHBAR
From: Clint Taylor The PUNIT FW is currently returning 0 for all memory bandwidth parameters. Read the values directly from MCHBAR offsets 0x5918 and 0x4000(4). v2 (Lucas): tidy up checking for ret slightly v3 (Lucas): - Squash change to double the memory bandwidth based on MCHBAR Gear_type - Move ICL_GEAR_TYPE_MASK to the appropriate place and change prefix to DG1 - Move register definitions to i915_reg.h - Make the MCHBAR path permanent for DG1 - Convert to REG_BIT()/REG_GENMASK() v4: Drop unneeded initializations Cc: Ville Syrjälä Cc: Matt Roper Cc: Jani Saarinen Signed-off-by: Clint Taylor Signed-off-by: Jani Nikula Signed-off-by: Matthew Auld Reviewed-by: Lucas De Marchi Signed-off-by: Lucas De Marchi Reviewed-by: José Roberto de Souza --- drivers/gpu/drm/i915/display/intel_bw.c | 41 - drivers/gpu/drm/i915/i915_reg.h | 12 2 files changed, 52 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/display/intel_bw.c b/drivers/gpu/drm/i915/display/intel_bw.c index bfb398f0432e..0d5d52548925 100644 --- a/drivers/gpu/drm/i915/display/intel_bw.c +++ b/drivers/gpu/drm/i915/display/intel_bw.c @@ -23,6 +23,41 @@ struct intel_qgv_info { u8 t_bl; }; +static int dg1_mchbar_read_qgv_point_info(struct drm_i915_private *dev_priv, + struct intel_qgv_point *sp, + int point) +{ + u32 dclk_ratio, dclk_reference; + u32 val; + + val = intel_uncore_read(&dev_priv->uncore, SA_PERF_STATUS_0_0_0_MCHBAR_PC); + dclk_ratio = REG_FIELD_GET(DG1_QCLK_RATIO_MASK, val); + if (val & DG1_QCLK_REFERENCE) + dclk_reference = 6; /* 6 * 16.666 MHz = 100 MHz */ + else + dclk_reference = 8; /* 8 * 16.666 MHz = 133 MHz */ + sp->dclk = dclk_ratio * dclk_reference; + + val = intel_uncore_read(&dev_priv->uncore, SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU); + if (val & DG1_GEAR_TYPE) + sp->dclk *= 2; + + if (sp->dclk == 0) + return -EINVAL; + + val = intel_uncore_read(&dev_priv->uncore, MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR); + sp->t_rp = REG_FIELD_GET(DG1_DRAM_T_RP_MASK, val); + sp->t_rdpre = REG_FIELD_GET(DG1_DRAM_T_RDPRE_MASK, val); + + val = intel_uncore_read(&dev_priv->uncore, MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR_HIGH); + sp->t_rcd = REG_FIELD_GET(DG1_DRAM_T_RCD_MASK, val); + sp->t_ras = REG_FIELD_GET(DG1_DRAM_T_RAS_MASK, val); + + sp->t_rc = sp->t_rp + sp->t_ras; + + return 0; +} + static int icl_pcode_read_qgv_point_info(struct drm_i915_private *dev_priv, struct intel_qgv_point *sp, int point) @@ -99,7 +134,11 @@ static int icl_get_qgv_points(struct drm_i915_private *dev_priv, for (i = 0; i < qi->num_points; i++) { struct intel_qgv_point *sp = &qi->points[i]; - ret = icl_pcode_read_qgv_point_info(dev_priv, sp, i); + if (IS_DG1(dev_priv)) + ret = dg1_mchbar_read_qgv_point_info(dev_priv, sp, i); + else + ret = icl_pcode_read_qgv_point_info(dev_priv, sp, i); + if (ret) return ret; diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 16a19239d86d..943fe485c662 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -11060,6 +11060,7 @@ enum skl_power_gate { #define SKL_MEMORY_FREQ_MULTIPLIER_HZ 2 #define SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5E04) #define SKL_REQ_DATA_MASK (0xF << 0) +#define DG1_GEAR_TYPE REG_BIT(16) #define SKL_MAD_INTER_CHANNEL_0_0_0_MCHBAR_MCMAIN _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5000) #define SKL_DRAM_DDR_TYPE_MASK(0x3 << 0) @@ -11095,6 +11096,17 @@ enum skl_power_gate { #define CNL_DRAM_RANK_3 (0x2 << 9) #define CNL_DRAM_RANK_4 (0x3 << 9) +#define SA_PERF_STATUS_0_0_0_MCHBAR_PC _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x5918) +#define DG1_QCLK_RATIO_MASK REG_GENMASK(9, 2) +#define DG1_QCLK_REFERENCEREG_BIT(10) + +#define MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x4000) +#define DG1_DRAM_T_RDPRE_MASKREG_GENMASK(16, 11) +#define DG1_DRAM_T_RP_MASK REG_GENMASK(6, 0) +#define MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR_HIGH _MMIO(MCHBAR_MIRROR_BASE_SNB + 0x4004) +#define DG1_DRAM_T_RCD_MASK REG_GENMASK(15, 9) +#define DG1_DRAM_T_RAS_MASK REG_GENMASK(8, 1) + /* * Please see hsw_read_dcomp() and hsw_write_dcomp() before using this register, * since on HSW we can't write to it using intel_
[Intel-gfx] PR for new GuC v62.0.3 and HuC v7.9.3 binaries
The following changes since commit d79c26779d459063b8052b7fe0a48bce4e08d0d9: amdgpu: update vcn firmware for green sardine for 21.20 (2021-06-29 07:26:03 -0400) are available in the Git repository at: git://anongit.freedesktop.org/drm/drm-firmware guc_62.0_huc_7.9 for you to fetch changes up to f4d897acd200190350a5f2148316c51c6c57bc9b: firmware/i915/guc: Add HuC v7.9.3 for TGL & DG1 (2021-06-29 14:20:03 -0700) John Harrison (3): firmware/i915/guc: Add GuC v62.0.0 for all platforms firmware/i915/guc: Add GuC v62.0.3 for ADL-P firmware/i915/guc: Add HuC v7.9.3 for TGL & DG1 WHENCE | 38 +- i915/adlp_guc_62.0.3.bin | Bin 0 -> 336704 bytes i915/bxt_guc_62.0.0.bin | Bin 0 -> 199616 bytes i915/cml_guc_62.0.0.bin | Bin 0 -> 200448 bytes i915/dg1_guc_62.0.0.bin | Bin 0 -> 315648 bytes i915/dg1_huc_7.9.3.bin | Bin 0 -> 589888 bytes i915/ehl_guc_62.0.0.bin | Bin 0 -> 327488 bytes i915/glk_guc_62.0.0.bin | Bin 0 -> 20 bytes i915/icl_guc_62.0.0.bin | Bin 0 -> 327488 bytes i915/kbl_guc_62.0.0.bin | Bin 0 -> 200448 bytes i915/skl_guc_62.0.0.bin | Bin 0 -> 199552 bytes i915/tgl_guc_62.0.0.bin | Bin 0 -> 326016 bytes i915/tgl_huc_7.9.3.bin | Bin 0 -> 589888 bytes 13 files changed, 37 insertions(+), 1 deletion(-) create mode 100644 i915/adlp_guc_62.0.3.bin create mode 100644 i915/bxt_guc_62.0.0.bin create mode 100644 i915/cml_guc_62.0.0.bin create mode 100644 i915/dg1_guc_62.0.0.bin create mode 100644 i915/dg1_huc_7.9.3.bin create mode 100644 i915/ehl_guc_62.0.0.bin create mode 100644 i915/glk_guc_62.0.0.bin create mode 100644 i915/icl_guc_62.0.0.bin create mode 100644 i915/kbl_guc_62.0.0.bin create mode 100644 i915/skl_guc_62.0.0.bin create mode 100644 i915/tgl_guc_62.0.0.bin create mode 100644 i915/tgl_huc_7.9.3.bin ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 20/20] dma-resv: Give the docs a do-over
Specifically document the new/clarified rules around how the shared fences do not have any ordering requirements against the exclusive fence. But also document all the things a bit better, given how central struct dma_resv to dynamic buffer management the docs have been very inadequat. - Lots more links to other pieces of the puzzle. Unfortunately ttm_buffer_object has no docs, so no links :-( - Explain/complain a bit about dma_resv_locking_ctx(). I still don't like that one, but fixing the ttm call chains is going to be horrible. Plus we want to plug in real slowpath locking when we do that anyway. - Main part of the patch is some actual docs for struct dma_resv. Overall I think we still have a lot of bad naming in this area (e.g. dma_resv.fence is singular, but contains the multiple shared fences), but I think that's more indicative of how the semantics and rules are just not great. Another thing that's real awkard is how chaining exclusive fences right now means direct dma_resv.exclusive_fence pointer access with an rcu_assign_pointer. Not so great either. v2: - Fix a pile of typos (Matt, Jason) - Hammer it in that breaking the rules leads to use-after-free issues around dma-buf sharing (Christian) Reviewed-by: Christian König Cc: Jason Ekstrand Cc: Matthew Auld Reviewed-by: Matthew Auld Signed-off-by: Daniel Vetter Cc: Sumit Semwal Cc: "Christian König" Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org --- drivers/dma-buf/dma-resv.c | 24 ++--- include/linux/dma-buf.h| 7 +++ include/linux/dma-resv.h | 104 +++-- 3 files changed, 124 insertions(+), 11 deletions(-) diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index f26c71747d43..a3acb6479ddf 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -48,6 +48,8 @@ * write operations) or N shared fences (read operations). The RCU * mechanism is used to protect read access to fences from locked * write-side updates. + * + * See struct dma_resv for more details. */ DEFINE_WD_CLASS(reservation_ww_class); @@ -137,7 +139,11 @@ EXPORT_SYMBOL(dma_resv_fini); * @num_fences: number of fences we want to add * * Should be called before dma_resv_add_shared_fence(). Must - * be called with obj->lock held. + * be called with @obj locked through dma_resv_lock(). + * + * Note that the preallocated slots need to be re-reserved if @obj is unlocked + * at any time before calling dma_resv_add_shared_fence(). This is validated + * when CONFIG_DEBUG_MUTEXES is enabled. * * RETURNS * Zero for success, or -errno @@ -234,8 +240,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max); * @obj: the reservation object * @fence: the shared fence to add * - * Add a fence to a shared slot, obj->lock must be held, and + * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and * dma_resv_reserve_shared() has been called. + * + * See also &dma_resv.fence for a discussion of the semantics. */ void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence) { @@ -278,9 +286,11 @@ EXPORT_SYMBOL(dma_resv_add_shared_fence); /** * dma_resv_add_excl_fence - Add an exclusive fence. * @obj: the reservation object - * @fence: the shared fence to add + * @fence: the exclusive fence to add * - * Add a fence to the exclusive slot. The obj->lock must be held. + * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock(). + * Note that this function replaces all fences attached to @obj, see also + * &dma_resv.fence_excl for a discussion of the semantics. */ void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence) { @@ -609,9 +619,11 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence) * fence * * Callers are not required to hold specific locks, but maybe hold - * dma_resv_lock() already + * dma_resv_lock() already. + * * RETURNS - * true if all fences signaled, else false + * + * True if all fences signaled, else false. */ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all) { diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index 2b814fde0d11..8cc0c55877a6 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -420,6 +420,13 @@ struct dma_buf { * - Dynamic importers should set fences for any access that they can't * disable immediately from their &dma_buf_attach_ops.move_notify * callback. +* +* IMPORTANT: +* +* All drivers must obey the struct dma_resv rules, specifically the +* rules for updating fences, see &dma_resv.fence_excl and +* &dma_resv.fence. If these dependency rules are broken access tracking +* can be lost resulting in use after free issues. */ struct dma_resv *resv; diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h index e1ca2080a1ff..9100dd3dc21f 1
[Intel-gfx] [PATCH v3 19/20] drm/i915: Don't break exclusive fence ordering
There's only one exclusive slot, and we must not break the ordering. Adding a new exclusive fence drops all previous fences from the dma_resv. To avoid violating the signalling order we err on the side of over-synchronizing by waiting for the existing fences, even if userspace asked us to ignore them. A better fix would be to us a dma_fence_chain or _array like e.g. amdgpu now uses, but it probably makes sense to lift this into dma-resv.c code as a proper concept, so that drivers don't have to hack up their own solution each on their own. Hence go with the simple fix for now. Another option is the fence import ioctl from Jason: https://lore.kernel.org/dri-devel/20210610210925.642582-7-ja...@jlekstrand.net/ v2: Improve commit message per Lucas' suggestion. Cc: Lucas Stach Signed-off-by: Daniel Vetter Cc: Maarten Lankhorst Cc: "Thomas Hellström" Cc: Jason Ekstrand --- drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 47e07179347a..9d717c8842e2 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -1775,6 +1775,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb) struct i915_vma *vma = ev->vma; unsigned int flags = ev->flags; struct drm_i915_gem_object *obj = vma->obj; + bool async, write; assert_vma_held(vma); @@ -1806,7 +1807,10 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb) flags &= ~EXEC_OBJECT_ASYNC; } - if (err == 0 && !(flags & EXEC_OBJECT_ASYNC)) { + async = flags & EXEC_OBJECT_ASYNC; + write = flags & EXEC_OBJECT_WRITE; + + if (err == 0 && (!async || write)) { err = i915_request_await_object (eb->request, obj, flags & EXEC_OBJECT_WRITE); } -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 18/20] drm/i915: delete exclude argument from i915_sw_fence_await_reservation
No longer used, the last user disappeared with commit d07f0e59b2c762584478920cd2d11fba2980a94a Author: Chris Wilson Date: Fri Oct 28 13:58:44 2016 +0100 drm/i915: Move GEM activity tracking into a common struct reservation_object Signed-off-by: Daniel Vetter Cc: Maarten Lankhorst Cc: "Thomas Hellström" Cc: Jason Ekstrand --- drivers/gpu/drm/i915/display/intel_display.c | 4 ++-- drivers/gpu/drm/i915/gem/i915_gem_clflush.c| 2 +- drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 +- drivers/gpu/drm/i915/i915_sw_fence.c | 6 +- drivers/gpu/drm/i915/i915_sw_fence.h | 1 - 5 files changed, 5 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 98e0f4ed7e4a..678c7839034e 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -9,7 +9,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane, */ if (intel_crtc_needs_modeset(crtc_state)) { ret = i915_sw_fence_await_reservation(&state->commit_ready, - old_obj->base.resv, NULL, + old_obj->base.resv, false, 0, GFP_KERNEL); if (ret < 0) @@ -11153,7 +11153,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane, struct dma_fence *fence; ret = i915_sw_fence_await_reservation(&state->commit_ready, - obj->base.resv, NULL, + obj->base.resv, false, i915_fence_timeout(dev_priv), GFP_KERNEL); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c index daf9284ef1f5..93439d2c7a58 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c @@ -106,7 +106,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj, clflush = clflush_work_create(obj); if (clflush) { i915_sw_fence_await_reservation(&clflush->base.chain, - obj->base.resv, NULL, true, + obj->base.resv, true, i915_fence_timeout(to_i915(obj->base.dev)), I915_FENCE_GFP); dma_resv_add_excl_fence(obj->base.resv, &clflush->base.dma); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 68ce366f46cf..47e07179347a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -2095,7 +2095,7 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb, /* Wait for all writes (and relocs) into the batch to complete */ err = i915_sw_fence_await_reservation(&pw->base.chain, - pw->batch->resv, NULL, false, + pw->batch->resv, false, 0, I915_FENCE_GFP); if (err < 0) goto err_commit; diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c index c589a681da77..91711a46b1c7 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence.c +++ b/drivers/gpu/drm/i915/i915_sw_fence.c @@ -567,7 +567,6 @@ int __i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence, int i915_sw_fence_await_reservation(struct i915_sw_fence *fence, struct dma_resv *resv, - const struct dma_fence_ops *exclude, bool write, unsigned long timeout, gfp_t gfp) @@ -587,9 +586,6 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence, return ret; for (i = 0; i < count; i++) { - if (shared[i]->ops == exclude) - continue; - pending = i915_sw_fence_await_dma_fence(fence, shared[i], timeout, @@ -609,7 +605,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence, excl = dma_resv_get_excl_unlocked(resv); } - if (ret >= 0 && excl && excl->ops
[Intel-gfx] [PATCH v3 17/20] drm/etnaviv: Don't break exclusive fence ordering
There's only one exclusive slot, and we must not break the ordering. Adding a new exclusive fence drops all previous fences from the dma_resv. To avoid violating the signalling order we err on the side of over-synchronizing by waiting for the existing fences, even if userspace asked us to ignore them. A better fix would be to us a dma_fence_chain or _array like e.g. amdgpu now uses, but it probably makes sense to lift this into dma-resv.c code as a proper concept, so that drivers don't have to hack up their own solution each on their own. Hence go with the simple fix for now. Another option is the fence import ioctl from Jason: https://lore.kernel.org/dri-devel/20210610210925.642582-7-ja...@jlekstrand.net/ v2: Improve commit message per Lucas' suggestion. Signed-off-by: Daniel Vetter Cc: Lucas Stach Cc: Russell King Cc: Christian Gmeiner Cc: etna...@lists.freedesktop.org --- drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c index 5b97ce1299ad..07454db4b150 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c @@ -178,18 +178,20 @@ static int submit_fence_sync(struct etnaviv_gem_submit *submit) for (i = 0; i < submit->nr_bos; i++) { struct etnaviv_gem_submit_bo *bo = &submit->bos[i]; struct dma_resv *robj = bo->obj->base.resv; + bool write = bo->flags & ETNA_SUBMIT_BO_WRITE; - if (!(bo->flags & ETNA_SUBMIT_BO_WRITE)) { + if (!(write)) { ret = dma_resv_reserve_shared(robj, 1); if (ret) return ret; } - if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT) + /* exclusive fences must be ordered */ + if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT && !write) continue; ret = drm_sched_job_await_implicit(&submit->sched_job, &bo->obj->base, - bo->flags & ETNA_SUBMIT_BO_WRITE); + write); if (ret) return ret; } -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 16/20] drm/msm: always wait for the exclusive fence
From: Christian König Drivers also need to to sync to the exclusive fence when a shared one is present. Signed-off-by: Christian König [danvet: Not that hard to compile-test on arm ...] Signed-off-by: Daniel Vetter Cc: Rob Clark Cc: Sean Paul Cc: linux-arm-...@vger.kernel.org Cc: freedr...@lists.freedesktop.org --- drivers/gpu/drm/msm/msm_gem.c | 16 +++- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index 141178754231..d9c4f1deeafb 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -812,17 +812,15 @@ int msm_gem_sync_object(struct drm_gem_object *obj, struct dma_fence *fence; int i, ret; - fobj = dma_resv_shared_list(obj->resv); - if (!fobj || (fobj->shared_count == 0)) { - fence = dma_resv_excl_fence(obj->resv); - /* don't need to wait on our own fences, since ring is fifo */ - if (fence && (fence->context != fctx->context)) { - ret = dma_fence_wait(fence, true); - if (ret) - return ret; - } + fence = dma_resv_excl_fence(obj->resv); + /* don't need to wait on our own fences, since ring is fifo */ + if (fence && (fence->context != fctx->context)) { + ret = dma_fence_wait(fence, true); + if (ret) + return ret; } + fobj = dma_resv_shared_list(obj->resv); if (!exclusive || !fobj) return 0; -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 15/20] drm/msm: Don't break exclusive fence ordering
There's only one exclusive slot, and we must not break the ordering. Adding a new exclusive fence drops all previous fences from the dma_resv. To avoid violating the signalling order we err on the side of over-synchronizing by waiting for the existing fences, even if userspace asked us to ignore them. A better fix would be to us a dma_fence_chain or _array like e.g. amdgpu now uses, but - msm has a synchronous dma_fence_wait for anything from another context, so doesn't seem to care much, - and it probably makes sense to lift this into dma-resv.c code as a proper concept, so that drivers don't have to hack up their own solution each on their own. v2: Improve commit message per Lucas' suggestion. Cc: Lucas Stach Signed-off-by: Daniel Vetter Cc: Rob Clark Cc: Sean Paul Cc: linux-arm-...@vger.kernel.org Cc: freedr...@lists.freedesktop.org --- drivers/gpu/drm/msm/msm_gem_submit.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index b71da71a3dd8..edd0051d849f 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -306,7 +306,8 @@ static int submit_fence_sync(struct msm_gem_submit *submit, bool no_implicit) return ret; } - if (no_implicit) + /* exclusive fences must be ordered */ + if (no_implicit && !write) continue; ret = msm_gem_sync_object(&msm_obj->base, submit->ring->fctx, -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 14/20] drm/sched: Check locking in drm_sched_job_await_implicit
You really need to hold the reservation here or all kinds of funny things can happen between grabbing the dependencies and inserting the new fences. Signed-off-by: Daniel Vetter Cc: "Christian König" Cc: Daniel Vetter Cc: Luben Tuikov Cc: Andrey Grodzovsky Cc: Alex Deucher Cc: Jack Zhang --- drivers/gpu/drm/scheduler/sched_main.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index db326a1ebf3c..67eca88e070e 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -708,6 +708,8 @@ int drm_sched_job_await_implicit(struct drm_sched_job *job, struct dma_fence **fences; unsigned int i, fence_count; + dma_resv_assert_held(obj->resv); + if (!write) { struct dma_fence *fence = dma_resv_get_excl_unlocked(obj->resv); -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 13/20] drm/sched: Don't store self-dependencies
This is essentially part of drm_sched_dependency_optimized(), which only amdgpu seems to make use of. Use it a bit more. This would mean that as-is amdgpu can't use the dependency helpers, at least not with the current approach amdgpu has for deciding whether a vm_flush is needed. Since amdgpu also has very special rules around implicit fencing it can't use those helpers either, and adding a drm_sched_job_await_fence_always or similar for amdgpu wouldn't be too onerous. That way the special case handling for amdgpu sticks even more out and we have higher chances that reviewers that go across all drivers wont miss it. Reviewed-by: Lucas Stach Signed-off-by: Daniel Vetter Cc: "Christian König" Cc: Daniel Vetter Cc: Luben Tuikov Cc: Andrey Grodzovsky Cc: Alex Deucher Cc: Jack Zhang --- drivers/gpu/drm/scheduler/sched_main.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index ad62f1d2991c..db326a1ebf3c 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -654,6 +654,13 @@ int drm_sched_job_await_fence(struct drm_sched_job *job, if (!fence) return 0; + /* if it's a fence from us it's guaranteed to be earlier */ + if (fence->context == job->entity->fence_context || + fence->context == job->entity->fence_context + 1) { + dma_fence_put(fence); + return 0; + } + /* Deduplicate if we already depend on a fence from the same context. * This lets the size of the array of deps scale with the number of * engines involved, rather than the number of BOs. -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 12/20] drm/gem: Delete gem array fencing helpers
Integrated into the scheduler now and all users converted over. Signed-off-by: Daniel Vetter Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Thomas Zimmermann Cc: David Airlie Cc: Daniel Vetter Cc: Sumit Semwal Cc: "Christian König" Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org --- drivers/gpu/drm/drm_gem.c | 96 --- include/drm/drm_gem.h | 5 -- 2 files changed, 101 deletions(-) diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index 68deb1de8235..24d49a2636e0 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -1294,99 +1294,3 @@ drm_gem_unlock_reservations(struct drm_gem_object **objs, int count, ww_acquire_fini(acquire_ctx); } EXPORT_SYMBOL(drm_gem_unlock_reservations); - -/** - * drm_gem_fence_array_add - Adds the fence to an array of fences to be - * waited on, deduplicating fences from the same context. - * - * @fence_array: array of dma_fence * for the job to block on. - * @fence: the dma_fence to add to the list of dependencies. - * - * This functions consumes the reference for @fence both on success and error - * cases. - * - * Returns: - * 0 on success, or an error on failing to expand the array. - */ -int drm_gem_fence_array_add(struct xarray *fence_array, - struct dma_fence *fence) -{ - struct dma_fence *entry; - unsigned long index; - u32 id = 0; - int ret; - - if (!fence) - return 0; - - /* Deduplicate if we already depend on a fence from the same context. -* This lets the size of the array of deps scale with the number of -* engines involved, rather than the number of BOs. -*/ - xa_for_each(fence_array, index, entry) { - if (entry->context != fence->context) - continue; - - if (dma_fence_is_later(fence, entry)) { - dma_fence_put(entry); - xa_store(fence_array, index, fence, GFP_KERNEL); - } else { - dma_fence_put(fence); - } - return 0; - } - - ret = xa_alloc(fence_array, &id, fence, xa_limit_32b, GFP_KERNEL); - if (ret != 0) - dma_fence_put(fence); - - return ret; -} -EXPORT_SYMBOL(drm_gem_fence_array_add); - -/** - * drm_gem_fence_array_add_implicit - Adds the implicit dependencies tracked - * in the GEM object's reservation object to an array of dma_fences for use in - * scheduling a rendering job. - * - * This should be called after drm_gem_lock_reservations() on your array of - * GEM objects used in the job but before updating the reservations with your - * own fences. - * - * @fence_array: array of dma_fence * for the job to block on. - * @obj: the gem object to add new dependencies from. - * @write: whether the job might write the object (so we need to depend on - * shared fences in the reservation object). - */ -int drm_gem_fence_array_add_implicit(struct xarray *fence_array, -struct drm_gem_object *obj, -bool write) -{ - int ret; - struct dma_fence **fences; - unsigned int i, fence_count; - - if (!write) { - struct dma_fence *fence = - dma_resv_get_excl_unlocked(obj->resv); - - return drm_gem_fence_array_add(fence_array, fence); - } - - ret = dma_resv_get_fences(obj->resv, NULL, - &fence_count, &fences); - if (ret || !fence_count) - return ret; - - for (i = 0; i < fence_count; i++) { - ret = drm_gem_fence_array_add(fence_array, fences[i]); - if (ret) - break; - } - - for (; i < fence_count; i++) - dma_fence_put(fences[i]); - kfree(fences); - return ret; -} -EXPORT_SYMBOL(drm_gem_fence_array_add_implicit); diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index 240049566592..6d5e33b89074 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -409,11 +409,6 @@ int drm_gem_lock_reservations(struct drm_gem_object **objs, int count, struct ww_acquire_ctx *acquire_ctx); void drm_gem_unlock_reservations(struct drm_gem_object **objs, int count, struct ww_acquire_ctx *acquire_ctx); -int drm_gem_fence_array_add(struct xarray *fence_array, - struct dma_fence *fence); -int drm_gem_fence_array_add_implicit(struct xarray *fence_array, -struct drm_gem_object *obj, -bool write); int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev, u32 handle, u64 *offset); -- 2.32.0 ___ Intel-gfx mailing list Int
[Intel-gfx] [PATCH v3 11/20] drm/etnaviv: Use scheduler dependency handling
We need to pull the drm_sched_job_init much earlier, but that's very minor surgery. v2: Actually fix up cleanup paths by calling drm_sched_job_init, which I wanted to to in the previous round (and did, for all other drivers). Spotted by Lucas. Signed-off-by: Daniel Vetter Cc: Lucas Stach Cc: Russell King Cc: Christian Gmeiner Cc: Sumit Semwal Cc: "Christian König" Cc: etna...@lists.freedesktop.org Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org --- drivers/gpu/drm/etnaviv/etnaviv_gem.h| 5 +- drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 58 +- drivers/gpu/drm/etnaviv/etnaviv_sched.c | 63 +--- drivers/gpu/drm/etnaviv/etnaviv_sched.h | 3 +- 4 files changed, 35 insertions(+), 94 deletions(-) diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.h b/drivers/gpu/drm/etnaviv/etnaviv_gem.h index 98e60df882b6..63688e6e4580 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.h +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.h @@ -80,9 +80,6 @@ struct etnaviv_gem_submit_bo { u64 va; struct etnaviv_gem_object *obj; struct etnaviv_vram_mapping *mapping; - struct dma_fence *excl; - unsigned int nr_shared; - struct dma_fence **shared; }; /* Created per submit-ioctl, to track bo's and cmdstream bufs, etc, @@ -95,7 +92,7 @@ struct etnaviv_gem_submit { struct etnaviv_file_private *ctx; struct etnaviv_gpu *gpu; struct etnaviv_iommu_context *mmu_context, *prev_mmu_context; - struct dma_fence *out_fence, *in_fence; + struct dma_fence *out_fence; int out_fence_id; struct list_head node; /* GPU active submit list */ struct etnaviv_cmdbuf cmdbuf; diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c index 4dd7d9d541c0..5b97ce1299ad 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c @@ -188,16 +188,10 @@ static int submit_fence_sync(struct etnaviv_gem_submit *submit) if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT) continue; - if (bo->flags & ETNA_SUBMIT_BO_WRITE) { - ret = dma_resv_get_fences(robj, &bo->excl, - &bo->nr_shared, - &bo->shared); - if (ret) - return ret; - } else { - bo->excl = dma_resv_get_excl_unlocked(robj); - } - + ret = drm_sched_job_await_implicit(&submit->sched_job, &bo->obj->base, + bo->flags & ETNA_SUBMIT_BO_WRITE); + if (ret) + return ret; } return ret; @@ -403,8 +397,6 @@ static void submit_cleanup(struct kref *kref) wake_up_all(&submit->gpu->fence_event); - if (submit->in_fence) - dma_fence_put(submit->in_fence); if (submit->out_fence) { /* first remove from IDR, so fence can not be found anymore */ mutex_lock(&submit->gpu->fence_lock); @@ -529,7 +521,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, ret = etnaviv_cmdbuf_init(priv->cmdbuf_suballoc, &submit->cmdbuf, ALIGN(args->stream_size, 8) + 8); if (ret) - goto err_submit_objects; + goto err_submit_put; submit->ctx = file->driver_priv; etnaviv_iommu_context_get(submit->ctx->mmu); @@ -537,51 +529,61 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data, submit->exec_state = args->exec_state; submit->flags = args->flags; + ret = drm_sched_job_init(&submit->sched_job, +&ctx->sched_entity[args->pipe], +submit->ctx); + if (ret) + goto err_submit_put; + ret = submit_lookup_objects(submit, file, bos, args->nr_bos); if (ret) - goto err_submit_objects; + goto err_submit_job; if ((priv->mmu_global->version != ETNAVIV_IOMMU_V2) && !etnaviv_cmd_validate_one(gpu, stream, args->stream_size / 4, relocs, args->nr_relocs)) { ret = -EINVAL; - goto err_submit_objects; + goto err_submit_job; } if (args->flags & ETNA_SUBMIT_FENCE_FD_IN) { - submit->in_fence = sync_file_get_fence(args->fence_fd); - if (!submit->in_fence) { + struct dma_fence *in_fence = sync_file_get_fence(args->fence_fd); + if (!in_fence) { ret = -EINVAL; - goto err_submit_objects; + goto err_submit_job; } + + ret = dr
[Intel-gfx] [PATCH v3 09/20] drm/v3d: Move drm_sched_job_init to v3d_job_init
Prep work for using the scheduler dependency handling. We need to call drm_sched_job_init earlier so we can use the new drm_sched_job_await* functions for dependency handling here. v2: Slightly better commit message and rebase to include the drm_sched_job_arm() call (Emma). v3: Cleanup jobs under construction correctly (Emma) Cc: Melissa Wen Signed-off-by: Daniel Vetter Cc: Emma Anholt --- drivers/gpu/drm/v3d/v3d_drv.h | 1 + drivers/gpu/drm/v3d/v3d_gem.c | 88 ++--- drivers/gpu/drm/v3d/v3d_sched.c | 15 +++--- 3 files changed, 44 insertions(+), 60 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 8a390738d65b..1d870261eaac 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -332,6 +332,7 @@ int v3d_submit_csd_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); int v3d_wait_bo_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv); +void v3d_job_cleanup(struct v3d_job *job); void v3d_job_put(struct v3d_job *job); void v3d_reset(struct v3d_dev *v3d); void v3d_invalidate_caches(struct v3d_dev *v3d); diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c index 69ac20e11b09..5eccd3658938 100644 --- a/drivers/gpu/drm/v3d/v3d_gem.c +++ b/drivers/gpu/drm/v3d/v3d_gem.c @@ -392,6 +392,12 @@ v3d_render_job_free(struct kref *ref) v3d_job_free(ref); } +void v3d_job_cleanup(struct v3d_job *job) +{ + drm_sched_job_cleanup(&job->base); + v3d_job_put(job); +} + void v3d_job_put(struct v3d_job *job) { kref_put(&job->refcount, job->free); @@ -433,9 +439,10 @@ v3d_wait_bo_ioctl(struct drm_device *dev, void *data, static int v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, struct v3d_job *job, void (*free)(struct kref *ref), -u32 in_sync) +u32 in_sync, enum v3d_queue queue) { struct dma_fence *in_fence = NULL; + struct v3d_file_priv *v3d_priv = file_priv->driver_priv; int ret; job->v3d = v3d; @@ -446,35 +453,33 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, return ret; xa_init_flags(&job->deps, XA_FLAGS_ALLOC); + ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue], +v3d_priv); + if (ret) + goto fail; ret = drm_syncobj_find_fence(file_priv, in_sync, 0, 0, &in_fence); if (ret == -EINVAL) - goto fail; + goto fail_job; ret = drm_gem_fence_array_add(&job->deps, in_fence); if (ret) - goto fail; + goto fail_job; kref_init(&job->refcount); return 0; +fail_job: + drm_sched_job_cleanup(&job->base); fail: xa_destroy(&job->deps); pm_runtime_put_autosuspend(v3d->drm.dev); return ret; } -static int -v3d_push_job(struct v3d_file_priv *v3d_priv, -struct v3d_job *job, enum v3d_queue queue) +static void +v3d_push_job(struct v3d_job *job) { - int ret; - - ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue], -v3d_priv); - if (ret) - return ret; - drm_sched_job_arm(&job->base); job->done_fence = dma_fence_get(&job->base.s_fence->finished); @@ -483,8 +488,6 @@ v3d_push_job(struct v3d_file_priv *v3d_priv, kref_get(&job->refcount); drm_sched_entity_push_job(&job->base); - - return 0; } static void @@ -530,7 +533,6 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, struct drm_file *file_priv) { struct v3d_dev *v3d = to_v3d_dev(dev); - struct v3d_file_priv *v3d_priv = file_priv->driver_priv; struct drm_v3d_submit_cl *args = data; struct v3d_bin_job *bin = NULL; struct v3d_render_job *render; @@ -556,7 +558,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, INIT_LIST_HEAD(&render->unref_list); ret = v3d_job_init(v3d, file_priv, &render->base, - v3d_render_job_free, args->in_sync_rcl); + v3d_render_job_free, args->in_sync_rcl, V3D_RENDER); if (ret) { kfree(render); return ret; @@ -570,7 +572,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, } ret = v3d_job_init(v3d, file_priv, &bin->base, - v3d_job_free, args->in_sync_bcl); + v3d_job_free, args->in_sync_bcl, V3D_BIN); if (ret) { v3d_job_put(&render->base); kfree(bin); @@ -592,7 +594,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, goto fail; } - ret = v3d_
[Intel-gfx] [PATCH v3 08/20] drm/lima: use scheduler dependency tracking
Nothing special going on here. Aside reviewing the code, it seems like drm_sched_job_arm() should be moved into lima_sched_context_queue_task and put under some mutex together with drm_sched_push_job(). See the kerneldoc for drm_sched_push_job(). Signed-off-by: Daniel Vetter Cc: Qiang Yu Cc: Sumit Semwal Cc: "Christian König" Cc: l...@lists.freedesktop.org Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org --- drivers/gpu/drm/lima/lima_gem.c | 4 ++-- drivers/gpu/drm/lima/lima_sched.c | 21 - drivers/gpu/drm/lima/lima_sched.h | 3 --- 3 files changed, 2 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c index c528f40981bb..e54a88d5037a 100644 --- a/drivers/gpu/drm/lima/lima_gem.c +++ b/drivers/gpu/drm/lima/lima_gem.c @@ -267,7 +267,7 @@ static int lima_gem_sync_bo(struct lima_sched_task *task, struct lima_bo *bo, if (explicit) return 0; - return drm_gem_fence_array_add_implicit(&task->deps, &bo->base.base, write); + return drm_sched_job_await_implicit(&task->base, &bo->base.base, write); } static int lima_gem_add_deps(struct drm_file *file, struct lima_submit *submit) @@ -285,7 +285,7 @@ static int lima_gem_add_deps(struct drm_file *file, struct lima_submit *submit) if (err) return err; - err = drm_gem_fence_array_add(&submit->task->deps, fence); + err = drm_sched_job_await_fence(&submit->task->base, fence); if (err) { dma_fence_put(fence); return err; diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c index e968b5a8f0b0..99d5f6f1a882 100644 --- a/drivers/gpu/drm/lima/lima_sched.c +++ b/drivers/gpu/drm/lima/lima_sched.c @@ -134,24 +134,15 @@ int lima_sched_task_init(struct lima_sched_task *task, task->num_bos = num_bos; task->vm = lima_vm_get(vm); - xa_init_flags(&task->deps, XA_FLAGS_ALLOC); - return 0; } void lima_sched_task_fini(struct lima_sched_task *task) { - struct dma_fence *fence; - unsigned long index; int i; drm_sched_job_cleanup(&task->base); - xa_for_each(&task->deps, index, fence) { - dma_fence_put(fence); - } - xa_destroy(&task->deps); - if (task->bos) { for (i = 0; i < task->num_bos; i++) drm_gem_object_put(&task->bos[i]->base.base); @@ -186,17 +177,6 @@ struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task) return fence; } -static struct dma_fence *lima_sched_dependency(struct drm_sched_job *job, - struct drm_sched_entity *entity) -{ - struct lima_sched_task *task = to_lima_task(job); - - if (!xa_empty(&task->deps)) - return xa_erase(&task->deps, task->last_dep++); - - return NULL; -} - static int lima_pm_busy(struct lima_device *ldev) { int ret; @@ -472,7 +452,6 @@ static void lima_sched_free_job(struct drm_sched_job *job) } static const struct drm_sched_backend_ops lima_sched_ops = { - .dependency = lima_sched_dependency, .run_job = lima_sched_run_job, .timedout_job = lima_sched_timedout_job, .free_job = lima_sched_free_job, diff --git a/drivers/gpu/drm/lima/lima_sched.h b/drivers/gpu/drm/lima/lima_sched.h index ac70006b0e26..6a11764d87b3 100644 --- a/drivers/gpu/drm/lima/lima_sched.h +++ b/drivers/gpu/drm/lima/lima_sched.h @@ -23,9 +23,6 @@ struct lima_sched_task { struct lima_vm *vm; void *frame; - struct xarray deps; - unsigned long last_dep; - struct lima_bo **bos; int num_bos; -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 10/20] drm/v3d: Use scheduler dependency handling
With the prep work out of the way this isn't tricky anymore. Aside: The chaining of the various jobs is a bit awkward, with the possibility of failure in bad places. I think with the drm_sched_job_init/arm split and maybe preloading the job->dependencies xarray this should be fixable. Cc: Melissa Wen Signed-off-by: Daniel Vetter Cc: Cc: Emma Anholt --- drivers/gpu/drm/v3d/v3d_drv.h | 5 - drivers/gpu/drm/v3d/v3d_gem.c | 25 - drivers/gpu/drm/v3d/v3d_sched.c | 29 + 3 files changed, 9 insertions(+), 50 deletions(-) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 1d870261eaac..f80f4ff1f7aa 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -192,11 +192,6 @@ struct v3d_job { struct drm_gem_object **bo; u32 bo_count; - /* Array of struct dma_fence * to block on before submitting this job. -*/ - struct xarray deps; - unsigned long last_dep; - /* v3d fence to be signaled by IRQ handler when the job is complete. */ struct dma_fence *irq_fence; diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c index 5eccd3658938..42b07ffbea5e 100644 --- a/drivers/gpu/drm/v3d/v3d_gem.c +++ b/drivers/gpu/drm/v3d/v3d_gem.c @@ -257,8 +257,8 @@ v3d_lock_bo_reservations(struct v3d_job *job, return ret; for (i = 0; i < job->bo_count; i++) { - ret = drm_gem_fence_array_add_implicit(&job->deps, - job->bo[i], true); + ret = drm_sched_job_await_implicit(&job->base, + job->bo[i], true); if (ret) { drm_gem_unlock_reservations(job->bo, job->bo_count, acquire_ctx); @@ -354,8 +354,6 @@ static void v3d_job_free(struct kref *ref) { struct v3d_job *job = container_of(ref, struct v3d_job, refcount); - unsigned long index; - struct dma_fence *fence; int i; for (i = 0; i < job->bo_count; i++) { @@ -364,11 +362,6 @@ v3d_job_free(struct kref *ref) } kvfree(job->bo); - xa_for_each(&job->deps, index, fence) { - dma_fence_put(fence); - } - xa_destroy(&job->deps); - dma_fence_put(job->irq_fence); dma_fence_put(job->done_fence); @@ -452,7 +445,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, if (ret < 0) return ret; - xa_init_flags(&job->deps, XA_FLAGS_ALLOC); ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue], v3d_priv); if (ret) @@ -462,7 +454,7 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, if (ret == -EINVAL) goto fail_job; - ret = drm_gem_fence_array_add(&job->deps, in_fence); + ret = drm_sched_job_await_fence(&job->base, in_fence); if (ret) goto fail_job; @@ -472,7 +464,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv, fail_job: drm_sched_job_cleanup(&job->base); fail: - xa_destroy(&job->deps); pm_runtime_put_autosuspend(v3d->drm.dev); return ret; } @@ -619,8 +610,8 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, if (bin) { v3d_push_job(&bin->base); - ret = drm_gem_fence_array_add(&render->base.deps, - dma_fence_get(bin->base.done_fence)); + ret = drm_sched_job_await_fence(&render->base.base, + dma_fence_get(bin->base.done_fence)); if (ret) goto fail_unreserve; } @@ -630,7 +621,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data, if (clean_job) { struct dma_fence *render_fence = dma_fence_get(render->base.done_fence); - ret = drm_gem_fence_array_add(&clean_job->deps, render_fence); + ret = drm_sched_job_await_fence(&clean_job->base, render_fence); if (ret) goto fail_unreserve; v3d_push_job(clean_job); @@ -820,8 +811,8 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data, mutex_lock(&v3d->sched_lock); v3d_push_job(&job->base); - ret = drm_gem_fence_array_add(&clean_job->deps, - dma_fence_get(job->base.done_fence)); + ret = drm_sched_job_await_fence(&clean_job->base, + dma_fence_get(job->base.done_fence)); if (ret) goto fail_unreserve; diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index 3f352d73af9c..f0de584f452c 100644 --- a/drivers/gpu/drm
[Intel-gfx] [PATCH v3 06/20] drm/sched: improve docs around drm_sched_entity
I found a few too many things that are tricky and not documented, so I started typing. I found a few more things that looked broken while typing, see the varios FIXME in drm_sched_entity. Also some of the usual logics: - actually include sched_entity.c declarations, that was lost in the move here: 620e762f9a98 ("drm/scheduler: move entity handling into separate file") - Ditch the kerneldoc for internal functions, keep the comments where they're describing more than what the function name already implies. - Switch drm_sched_entity to inline docs. Signed-off-by: Daniel Vetter --- Documentation/gpu/drm-mm.rst | 3 + drivers/gpu/drm/scheduler/sched_entity.c | 85 - include/drm/gpu_scheduler.h | 145 ++- 3 files changed, 146 insertions(+), 87 deletions(-) diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst index d5a73fa2c9ef..0198fa43d254 100644 --- a/Documentation/gpu/drm-mm.rst +++ b/Documentation/gpu/drm-mm.rst @@ -504,3 +504,6 @@ Scheduler Function References .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c :export: + +.. kernel-doc:: drivers/gpu/drm/scheduler/sched_entity.c + :export: diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index e2a6803910ce..ebb0dcb8a942 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -45,8 +45,14 @@ * @guilty: atomic_t set to 1 when a job on this queue * is found to be guilty causing a timeout * - * Note: the sched_list must have at least one element to schedule - * the entity + * Note that the &sched_list must have at least one element to schedule the entity. + * + * For changing @priority later on at runtime see + * drm_sched_entity_set_priority(). For changing the set of schedulers + * @sched_list at runtime see drm_sched_entity_modify_sched(). + * + * An entity is cleaned up by callind drm_sched_entity_fini(). See also + * drm_sched_entity_destroy(). * * Returns 0 on success or a negative error code on failure. */ @@ -92,6 +98,11 @@ EXPORT_SYMBOL(drm_sched_entity_init); * @sched_list: the list of new drm scheds which will replace * existing entity->sched_list * @num_sched_list: number of drm sched in sched_list + * + * Note that this must be called under the same common lock for @entity as + * drm_sched_job_arm() and drm_sched_entity_push_job(), or the driver needs to + * guarantee through some other means that this is never called while new jobs + * can be pushed to @entity. */ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity, struct drm_gpu_scheduler **sched_list, @@ -104,13 +115,6 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity *entity, } EXPORT_SYMBOL(drm_sched_entity_modify_sched); -/** - * drm_sched_entity_is_idle - Check if entity is idle - * - * @entity: scheduler entity - * - * Returns true if the entity does not have any unscheduled jobs. - */ static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity) { rmb(); /* for list_empty to work without lock */ @@ -123,13 +127,7 @@ static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity) return false; } -/** - * drm_sched_entity_is_ready - Check if entity is ready - * - * @entity: scheduler entity - * - * Return true if entity could provide a job. - */ +/* Return true if entity could provide a job. */ bool drm_sched_entity_is_ready(struct drm_sched_entity *entity) { if (spsc_queue_peek(&entity->job_queue) == NULL) @@ -192,14 +190,7 @@ long drm_sched_entity_flush(struct drm_sched_entity *entity, long timeout) } EXPORT_SYMBOL(drm_sched_entity_flush); -/** - * drm_sched_entity_kill_jobs_cb - helper for drm_sched_entity_kill_jobs - * - * @f: signaled fence - * @cb: our callback structure - * - * Signal the scheduler finished fence when the entity in question is killed. - */ +/* Signal the scheduler finished fence when the entity in question is killed. */ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f, struct dma_fence_cb *cb) { @@ -224,14 +215,6 @@ drm_sched_job_dependency(struct drm_sched_job *job, return NULL; } -/** - * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed - * - * @entity: entity which is cleaned up - * - * Makes sure that all remaining jobs in an entity are killed before it is - * destroyed. - */ static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity) { struct drm_sched_job *job; @@ -273,9 +256,11 @@ static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity) * * @entity: scheduler entity * - * This should be called after @drm_sched_entity_do_release. It goes over the - * entity and signals all jobs with an error code if the process was killed. + * Cleanups up @entity which has been init
[Intel-gfx] [PATCH v3 05/20] drm/sched: drop entity parameter from drm_sched_push_job
Originally a job was only bound to the queue when we pushed this, but now that's done in drm_sched_job_init, making that parameter entirely redundant. Remove it. The same applies to the context parameter in lima_sched_context_queue_task, simplify that too. Reviewed-by: Steven Price (v1) Signed-off-by: Daniel Vetter Cc: Lucas Stach Cc: Russell King Cc: Christian Gmeiner Cc: Qiang Yu Cc: Rob Herring Cc: Tomeu Vizoso Cc: Steven Price Cc: Alyssa Rosenzweig Cc: Emma Anholt Cc: David Airlie Cc: Daniel Vetter Cc: Sumit Semwal Cc: "Christian König" Cc: Alex Deucher Cc: Nirmoy Das Cc: Dave Airlie Cc: Chen Li Cc: Lee Jones Cc: Deepak R Varma Cc: Kevin Wang Cc: Luben Tuikov Cc: "Marek Olšák" Cc: Maarten Lankhorst Cc: Andrey Grodzovsky Cc: Dennis Li Cc: Boris Brezillon Cc: etna...@lists.freedesktop.org Cc: l...@lists.freedesktop.org Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 +- drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 +- drivers/gpu/drm/lima/lima_gem.c | 3 +-- drivers/gpu/drm/lima/lima_sched.c| 5 ++--- drivers/gpu/drm/lima/lima_sched.h| 3 +-- drivers/gpu/drm/panfrost/panfrost_job.c | 2 +- drivers/gpu/drm/scheduler/sched_entity.c | 6 ++ drivers/gpu/drm/v3d/v3d_gem.c| 2 +- include/drm/gpu_scheduler.h | 3 +-- 10 files changed, 12 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index a4ec092af9a7..18f63567fb69 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1267,7 +1267,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, trace_amdgpu_cs_ioctl(job); amdgpu_vm_bo_trace_cs(&fpriv->vm, &p->ticket); - drm_sched_entity_push_job(&job->base, entity); + drm_sched_entity_push_job(&job->base); amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index 5ddb955d2315..b86099c1 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -174,7 +174,7 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity, *f = dma_fence_get(&job->base.s_fence->finished); amdgpu_job_free_resources(job); - drm_sched_entity_push_job(&job->base, entity); + drm_sched_entity_push_job(&job->base); return 0; } diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c index 05f412204118..180bb633d5c5 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c @@ -178,7 +178,7 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity, /* the scheduler holds on to the job now */ kref_get(&submit->refcount); - drm_sched_entity_push_job(&submit->sched_job, sched_entity); + drm_sched_entity_push_job(&submit->sched_job); out_unlock: mutex_unlock(&submit->gpu->fence_lock); diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c index de62966243cd..c528f40981bb 100644 --- a/drivers/gpu/drm/lima/lima_gem.c +++ b/drivers/gpu/drm/lima/lima_gem.c @@ -359,8 +359,7 @@ int lima_gem_submit(struct drm_file *file, struct lima_submit *submit) goto err_out2; } - fence = lima_sched_context_queue_task( - submit->ctx->context + submit->pipe, submit->task); + fence = lima_sched_context_queue_task(submit->task); for (i = 0; i < submit->nr_bos; i++) { if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE) diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c index 38f755580507..e968b5a8f0b0 100644 --- a/drivers/gpu/drm/lima/lima_sched.c +++ b/drivers/gpu/drm/lima/lima_sched.c @@ -177,13 +177,12 @@ void lima_sched_context_fini(struct lima_sched_pipe *pipe, drm_sched_entity_fini(&context->base); } -struct dma_fence *lima_sched_context_queue_task(struct lima_sched_context *context, - struct lima_sched_task *task) +struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task) { struct dma_fence *fence = dma_fence_get(&task->base.s_fence->finished); trace_lima_task_submit(task); - drm_sched_entity_push_job(&task->base, &context->base); + drm_sched_entity_push_job(&task->base); return fence; } diff --git a/drivers/gpu/drm/lima/lima_sched.h b/drivers/gpu/drm/lima/lima_sched.h index 90f03c48ef4a..ac70006b0e26 100644 --- a/drivers/gpu/drm/lima/lima_sched.h +++ b/drivers/gpu/drm/lima/lima_sched.h @@ -98,8 +98,7 @@ int lima_sched_context_init(struct lima_sched_pipe *pipe, atomic_t *guilty); void lim
[Intel-gfx] [PATCH v3 07/20] drm/panfrost: use scheduler dependency tracking
Just deletes some code that's now more shared. Note that thanks to the split into drm_sched_job_init/arm we can now easily pull the _init() part from under the submission lock way ahead where we're adding the sync file in-fences as dependencies. v2: Correctly clean up the partially set up job, now that job_init() and job_arm() are apart (Emma). Reviewed-by: Steven Price (v1) Signed-off-by: Daniel Vetter Cc: Rob Herring Cc: Tomeu Vizoso Cc: Steven Price Cc: Alyssa Rosenzweig Cc: Sumit Semwal Cc: "Christian König" Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org --- drivers/gpu/drm/panfrost/panfrost_drv.c | 16 --- drivers/gpu/drm/panfrost/panfrost_job.c | 37 +++-- drivers/gpu/drm/panfrost/panfrost_job.h | 5 +--- 3 files changed, 17 insertions(+), 41 deletions(-) diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c index 1ffaef5ec5ff..9f53bea07d61 100644 --- a/drivers/gpu/drm/panfrost/panfrost_drv.c +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c @@ -218,7 +218,7 @@ panfrost_copy_in_sync(struct drm_device *dev, if (ret) goto fail; - ret = drm_gem_fence_array_add(&job->deps, fence); + ret = drm_sched_job_await_fence(&job->base, fence); if (ret) goto fail; @@ -236,7 +236,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data, struct drm_panfrost_submit *args = data; struct drm_syncobj *sync_out = NULL; struct panfrost_job *job; - int ret = 0; + int ret = 0, slot; if (!args->jc) return -EINVAL; @@ -258,14 +258,20 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data, kref_init(&job->refcount); - xa_init_flags(&job->deps, XA_FLAGS_ALLOC); - job->pfdev = pfdev; job->jc = args->jc; job->requirements = args->requirements; job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev); job->file_priv = file->driver_priv; + slot = panfrost_job_get_slot(job); + + ret = drm_sched_job_init(&job->base, +&job->file_priv->sched_entity[slot], +NULL); + if (ret) + goto fail_job_put; + ret = panfrost_copy_in_sync(dev, file, args, job); if (ret) goto fail_job; @@ -283,6 +289,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data, drm_syncobj_replace_fence(sync_out, job->render_done_fence); fail_job: + drm_sched_job_cleanup(&job->base); +fail_job_put: panfrost_job_put(job); fail_out_sync: if (sync_out) diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c index 4bc962763e1f..86c843d8822e 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.c +++ b/drivers/gpu/drm/panfrost/panfrost_job.c @@ -102,7 +102,7 @@ static struct dma_fence *panfrost_fence_create(struct panfrost_device *pfdev, in return &fence->base; } -static int panfrost_job_get_slot(struct panfrost_job *job) +int panfrost_job_get_slot(struct panfrost_job *job) { /* JS0: fragment jobs. * JS1: vertex/tiler jobs @@ -242,13 +242,13 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js) static int panfrost_acquire_object_fences(struct drm_gem_object **bos, int bo_count, - struct xarray *deps) + struct drm_sched_job *job) { int i, ret; for (i = 0; i < bo_count; i++) { /* panfrost always uses write mode in its current uapi */ - ret = drm_gem_fence_array_add_implicit(deps, bos[i], true); + ret = drm_sched_job_await_implicit(job, bos[i], true); if (ret) return ret; } @@ -269,31 +269,21 @@ static void panfrost_attach_object_fences(struct drm_gem_object **bos, int panfrost_job_push(struct panfrost_job *job) { struct panfrost_device *pfdev = job->pfdev; - int slot = panfrost_job_get_slot(job); - struct drm_sched_entity *entity = &job->file_priv->sched_entity[slot]; struct ww_acquire_ctx acquire_ctx; int ret = 0; - ret = drm_gem_lock_reservations(job->bos, job->bo_count, &acquire_ctx); if (ret) return ret; mutex_lock(&pfdev->sched_lock); - - ret = drm_sched_job_init(&job->base, entity, NULL); - if (ret) { - mutex_unlock(&pfdev->sched_lock); - goto unlock; - } - drm_sched_job_arm(&job->base); job->render_done_fence = dma_fence_get(&job->base.s_fence->finished); ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
[Intel-gfx] [PATCH v3 04/20] drm/sched: Add dependency tracking
Instead of just a callback we can just glue in the gem helpers that panfrost, v3d and lima currently use. There's really not that many ways to skin this cat. On the naming bikeshed: The idea for using _await_ to denote adding dependencies to a job comes from i915, where that's used quite extensively all over the place, in lots of datastructures. v2/3: Rebased. Reviewed-by: Steven Price (v1) Signed-off-by: Daniel Vetter Cc: David Airlie Cc: Daniel Vetter Cc: Sumit Semwal Cc: "Christian König" Cc: Andrey Grodzovsky Cc: Lee Jones Cc: Nirmoy Das Cc: Boris Brezillon Cc: Luben Tuikov Cc: Alex Deucher Cc: Jack Zhang Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org --- drivers/gpu/drm/scheduler/sched_entity.c | 18 +++- drivers/gpu/drm/scheduler/sched_main.c | 103 +++ include/drm/gpu_scheduler.h | 31 ++- 3 files changed, 146 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 4e1124ed80e0..c7e6d29c9a33 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -211,6 +211,19 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f, job->sched->ops->free_job(job); } +static struct dma_fence * +drm_sched_job_dependency(struct drm_sched_job *job, +struct drm_sched_entity *entity) +{ + if (!xa_empty(&job->dependencies)) + return xa_erase(&job->dependencies, job->last_dependency++); + + if (job->sched->ops->dependency) + return job->sched->ops->dependency(job, entity); + + return NULL; +} + /** * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed * @@ -229,7 +242,7 @@ static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity) struct drm_sched_fence *s_fence = job->s_fence; /* Wait for all dependencies to avoid data corruptions */ - while ((f = job->sched->ops->dependency(job, entity))) + while ((f = drm_sched_job_dependency(job, entity))) dma_fence_wait(f, false); drm_sched_fence_scheduled(s_fence); @@ -419,7 +432,6 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity) */ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity) { - struct drm_gpu_scheduler *sched = entity->rq->sched; struct drm_sched_job *sched_job; sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue)); @@ -427,7 +439,7 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity) return NULL; while ((entity->dependency = - sched->ops->dependency(sched_job, entity))) { + drm_sched_job_dependency(sched_job, entity))) { trace_drm_sched_job_wait_dep(sched_job, entity->dependency); if (drm_sched_entity_add_dependency_cb(entity)) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 7e94754eb34c..ad62f1d2991c 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -594,6 +594,8 @@ int drm_sched_job_init(struct drm_sched_job *job, INIT_LIST_HEAD(&job->list); + xa_init_flags(&job->dependencies, XA_FLAGS_ALLOC); + return 0; } EXPORT_SYMBOL(drm_sched_job_init); @@ -631,6 +633,98 @@ void drm_sched_job_arm(struct drm_sched_job *job) } EXPORT_SYMBOL(drm_sched_job_arm); +/** + * drm_sched_job_await_fence - adds the fence as a job dependency + * @job: scheduler job to add the dependencies to + * @fence: the dma_fence to add to the list of dependencies. + * + * Note that @fence is consumed in both the success and error cases. + * + * Returns: + * 0 on success, or an error on failing to expand the array. + */ +int drm_sched_job_await_fence(struct drm_sched_job *job, + struct dma_fence *fence) +{ + struct dma_fence *entry; + unsigned long index; + u32 id = 0; + int ret; + + if (!fence) + return 0; + + /* Deduplicate if we already depend on a fence from the same context. +* This lets the size of the array of deps scale with the number of +* engines involved, rather than the number of BOs. +*/ + xa_for_each(&job->dependencies, index, entry) { + if (entry->context != fence->context) + continue; + + if (dma_fence_is_later(fence, entry)) { + dma_fence_put(entry); + xa_store(&job->dependencies, index, fence, GFP_KERNEL); + } else { + dma_fence_put(fence); + } + return 0; + } + + ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b,
[Intel-gfx] [PATCH v3 03/20] drm/sched: Barriers are needed for entity->last_scheduled
It might be good enough on x86 with just READ_ONCE, but the write side should then at least be WRITE_ONCE because x86 has total store order. It's definitely not enough on arm. Fix this proplery, which means - explain the need for the barrier in both places - point at the other side in each comment Also pull out the !sched_list case as the first check, so that the code flow is clearer. While at it sprinkle some comments around because it was very non-obvious to me what's actually going on here and why. Note that we really need full barriers here, at first I thought store-release and load-acquire on ->last_scheduled would be enough, but we actually requiring ordering between that and the queue state. Signed-off-by: Daniel Vetter Cc: "Christian König" Cc: Steven Price Cc: Daniel Vetter Cc: Andrey Grodzovsky Cc: Lee Jones Cc: Boris Brezillon --- drivers/gpu/drm/scheduler/sched_entity.c | 27 ++-- 1 file changed, 25 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 64d398166644..4e1124ed80e0 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity) dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED); dma_fence_put(entity->last_scheduled); + entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished); + /* +* if the queue is empty we allow drm_sched_job_arm() to locklessly +* access ->last_scheduled. This only works if we set the pointer before +* we dequeue and if we a write barrier here. +*/ + smp_wmb(); + spsc_queue_pop(&entity->job_queue); return sched_job; } @@ -459,10 +467,25 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity) struct drm_gpu_scheduler *sched; struct drm_sched_rq *rq; - if (spsc_queue_count(&entity->job_queue) || !entity->sched_list) + /* single possible engine and already selected */ + if (!entity->sched_list) + return; + + /* queue non-empty, stay on the same engine */ + if (spsc_queue_count(&entity->job_queue)) return; - fence = READ_ONCE(entity->last_scheduled); + fence = entity->last_scheduled; + + /* +* Only when the queue is empty are we guaranteed the the scheduler +* thread cannot change ->last_scheduled. To enforce ordering we need +* a read barrier here. See drm_sched_entity_pop_job() for the other +* side. +*/ + smp_rmb(); + + /* stay on the same engine if the previous job hasn't finished */ if (fence && !dma_fence_is_signaled(fence)) return; -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 01/20] drm/sched: entity->rq selection cannot fail
If it does, someone managed to set up a sched_entity without schedulers, which is just a driver bug. We BUG_ON() here because in the next patch drm_sched_job_init() will be split up, with drm_sched_job_arm() never failing. And that's the part where the rq selection will end up in. Note that if having an empty sched_list set on an entity is indeed a valid use-case, we can keep that check in job_init even after the split into job_init/arm. Signed-off-by: Daniel Vetter Cc: "Christian König" Cc: Luben Tuikov Cc: Daniel Vetter Cc: Steven Price Cc: Andrey Grodzovsky Cc: Boris Brezillon Cc: Jack Zhang --- drivers/gpu/drm/scheduler/sched_entity.c | 2 +- drivers/gpu/drm/scheduler/sched_main.c | 3 +-- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c index 79554aa4dbb1..6fc116ee7302 100644 --- a/drivers/gpu/drm/scheduler/sched_entity.c +++ b/drivers/gpu/drm/scheduler/sched_entity.c @@ -45,7 +45,7 @@ * @guilty: atomic_t set to 1 when a job on this queue * is found to be guilty causing a timeout * - * Note: the sched_list should have at least one element to schedule + * Note: the sched_list must have at least one element to schedule * the entity * * Returns 0 on success or a negative error code on failure. diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 33c414d55fab..01dd47154181 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -586,8 +586,7 @@ int drm_sched_job_init(struct drm_sched_job *job, struct drm_gpu_scheduler *sched; drm_sched_entity_select_rq(entity); - if (!entity->rq) - return -ENOENT; + BUG_ON(!entity->rq); sched = entity->rq->sched; -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH v3 02/20] drm/sched: Split drm_sched_job_init
This is a very confusingly named function, because not just does it init an object, it arms it and provides a point of no return for pushing a job into the scheduler. It would be nice if that's a bit clearer in the interface. But the real reason is that I want to push the dependency tracking helpers into the scheduler code, and that means drm_sched_job_init must be called a lot earlier, without arming the job. v2: - don't change .gitignore (Steven) - don't forget v3d (Emma) v3: Emma noticed that I leak the memory allocated in drm_sched_job_init if we bail out before the point of no return in subsequent driver patches. To be able to fix this change drm_sched_job_cleanup() so it can handle being called both before and after drm_sched_job_arm(). Also improve the kerneldoc for this. v4: - Fix the drm_sched_job_cleanup logic, I inverted the booleans, as usual (Melissa) - Christian pointed out that drm_sched_entity_select_rq() also needs to be moved into drm_sched_job_arm, which made me realize that the job->id definitely needs to be moved too. Shuffle things to fit between job_init and job_arm. Cc: Melissa Wen Acked-by: Steven Price (v2) Signed-off-by: Daniel Vetter Cc: Lucas Stach Cc: Russell King Cc: Christian Gmeiner Cc: Qiang Yu Cc: Rob Herring Cc: Tomeu Vizoso Cc: Steven Price Cc: Alyssa Rosenzweig Cc: David Airlie Cc: Daniel Vetter Cc: Sumit Semwal Cc: "Christian König" Cc: Masahiro Yamada Cc: Kees Cook Cc: Adam Borowski Cc: Nick Terrell Cc: Mauro Carvalho Chehab Cc: Paul Menzel Cc: Sami Tolvanen Cc: Viresh Kumar Cc: Alex Deucher Cc: Dave Airlie Cc: Nirmoy Das Cc: Deepak R Varma Cc: Lee Jones Cc: Kevin Wang Cc: Chen Li Cc: Luben Tuikov Cc: "Marek Olšák" Cc: Dennis Li Cc: Maarten Lankhorst Cc: Andrey Grodzovsky Cc: Sonny Jiang Cc: Boris Brezillon Cc: Tian Tao Cc: etna...@lists.freedesktop.org Cc: l...@lists.freedesktop.org Cc: linux-me...@vger.kernel.org Cc: linaro-mm-...@lists.linaro.org Cc: Emma Anholt --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 2 + drivers/gpu/drm/etnaviv/etnaviv_sched.c | 2 + drivers/gpu/drm/lima/lima_sched.c| 2 + drivers/gpu/drm/panfrost/panfrost_job.c | 2 + drivers/gpu/drm/scheduler/sched_entity.c | 6 +-- drivers/gpu/drm/scheduler/sched_fence.c | 19 --- drivers/gpu/drm/scheduler/sched_main.c | 64 drivers/gpu/drm/v3d/v3d_gem.c| 2 + include/drm/gpu_scheduler.h | 7 ++- 10 files changed, 86 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index c5386d13eb4a..a4ec092af9a7 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, if (r) goto error_unlock; + drm_sched_job_arm(&job->base); + /* No memory allocation is allowed while holding the notifier lock. * The lock is held until amdgpu_cs_submit is finished and fence is * added to BOs. diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index d33e6d97cc89..5ddb955d2315 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity, if (r) return r; + drm_sched_job_arm(&job->base); + *f = dma_fence_get(&job->base.s_fence->finished); amdgpu_job_free_resources(job); drm_sched_entity_push_job(&job->base, entity); diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c index feb6da1b6ceb..05f412204118 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity, if (ret) goto out_unlock; + drm_sched_job_arm(&submit->sched_job); + submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished); submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr, submit->out_fence, 0, diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c index dba8329937a3..38f755580507 100644 --- a/drivers/gpu/drm/lima/lima_sched.c +++ b/drivers/gpu/drm/lima/lima_sched.c @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task, return err; } + drm_sched_job_arm(&task->base); + task->num_bos = num_bos; task->vm = lima_vm_get(vm); diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c index 71a72fb50e6b..2992dc85325f 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.c +++ b/drivers/gpu/drm/panfrost/pan
[Intel-gfx] [PATCH v3 00/20] drm/sched dependency tracking and dma-resv fixes
Hil all, I figured I'll combine the two series, they build on top of each another anyway. Changes: - drop broken i915 patch (Matt) - typos and improvements in the dma-resv patch - bunch of fixes to the drm_sched_job_init/arm split (Melissa, Christian) - threw a drm_sched_entity doc patch on top Testing & review very much welcome. Cheers, Daniel Christian König (1): drm/msm: always wait for the exclusive fence Daniel Vetter (19): drm/sched: entity->rq selection cannot fail drm/sched: Split drm_sched_job_init drm/sched: Barriers are needed for entity->last_scheduled drm/sched: Add dependency tracking drm/sched: drop entity parameter from drm_sched_push_job drm/sched: improve docs around drm_sched_entity drm/panfrost: use scheduler dependency tracking drm/lima: use scheduler dependency tracking drm/v3d: Move drm_sched_job_init to v3d_job_init drm/v3d: Use scheduler dependency handling drm/etnaviv: Use scheduler dependency handling drm/gem: Delete gem array fencing helpers drm/sched: Don't store self-dependencies drm/sched: Check locking in drm_sched_job_await_implicit drm/msm: Don't break exclusive fence ordering drm/etnaviv: Don't break exclusive fence ordering drm/i915: delete exclude argument from i915_sw_fence_await_reservation drm/i915: Don't break exclusive fence ordering dma-resv: Give the docs a do-over Documentation/gpu/drm-mm.rst | 3 + drivers/dma-buf/dma-resv.c| 24 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c| 4 +- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 4 +- drivers/gpu/drm/drm_gem.c | 96 - drivers/gpu/drm/etnaviv/etnaviv_gem.h | 5 +- drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 64 +++--- drivers/gpu/drm/etnaviv/etnaviv_sched.c | 65 +- drivers/gpu/drm/etnaviv/etnaviv_sched.h | 3 +- drivers/gpu/drm/i915/display/intel_display.c | 4 +- drivers/gpu/drm/i915/gem/i915_gem_clflush.c | 2 +- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 8 +- drivers/gpu/drm/i915/i915_sw_fence.c | 6 +- drivers/gpu/drm/i915/i915_sw_fence.h | 1 - drivers/gpu/drm/lima/lima_gem.c | 7 +- drivers/gpu/drm/lima/lima_sched.c | 28 +-- drivers/gpu/drm/lima/lima_sched.h | 6 +- drivers/gpu/drm/msm/msm_gem.c | 16 +- drivers/gpu/drm/msm/msm_gem_submit.c | 3 +- drivers/gpu/drm/panfrost/panfrost_drv.c | 16 +- drivers/gpu/drm/panfrost/panfrost_job.c | 39 +--- drivers/gpu/drm/panfrost/panfrost_job.h | 5 +- drivers/gpu/drm/scheduler/sched_entity.c | 140 +++-- drivers/gpu/drm/scheduler/sched_fence.c | 19 +- drivers/gpu/drm/scheduler/sched_main.c| 177 +++-- drivers/gpu/drm/v3d/v3d_drv.h | 6 +- drivers/gpu/drm/v3d/v3d_gem.c | 115 +-- drivers/gpu/drm/v3d/v3d_sched.c | 44 + include/drm/drm_gem.h | 5 - include/drm/gpu_scheduler.h | 186 ++ include/linux/dma-buf.h | 7 + include/linux/dma-resv.h | 104 +- 32 files changed, 674 insertions(+), 538 deletions(-) -- 2.32.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
On Tue, Jul 06, 2021 at 12:14:16PM -0700, Nathan Chancellor wrote: > On 7/6/2021 10:06 AM, Will Deacon wrote: > > On Tue, Jul 06, 2021 at 04:39:11PM +0100, Robin Murphy wrote: > > > On 2021-07-06 15:05, Christoph Hellwig wrote: > > > > On Tue, Jul 06, 2021 at 03:01:04PM +0100, Robin Murphy wrote: > > > > > FWIW I was pondering the question of whether to do something along > > > > > those > > > > > lines or just scrap the default assignment entirely, so since I > > > > > hadn't got > > > > > round to saying that I've gone ahead and hacked up the alternative > > > > > (similarly untested) for comparison :) > > > > > > > > > > TBH I'm still not sure which one I prefer... > > > > > > > > Claire did implement something like your suggestion originally, but > > > > I don't really like it as it doesn't scale for adding multiple global > > > > pools, e.g. for the 64-bit addressable one for the various encrypted > > > > secure guest schemes. > > > > > > Ah yes, that had slipped my mind, and it's a fair point indeed. Since > > > we're > > > not concerned with a minimal fix for backports anyway I'm more than happy > > > to > > > focus on Will's approach. Another thing is that that looks to take us a > > > quiet step closer to the possibility of dynamically resizing a SWIOTLB > > > pool, > > > which is something that some of the hypervisor protection schemes looking > > > to > > > build on top of this series may want to explore at some point. > > > > Ok, I'll split that nasty diff I posted up into a reviewable series and we > > can take it from there. > > For what it's worth, I attempted to boot Will's diff on top of Konrad's > devel/for-linus-5.14 and it did not work; in fact, I got no output on my > monitor period, even with earlyprintk=, and I do not think this machine has > a serial console. Looking back at the diff, I completely messed up swiotlb_exit() by mixing up physical and virtual addresses. > Robin's fix does work, it survived ten reboots with no issues getting to X > and I do not see the KASAN and slub debug messages anymore but I understand > that this is not the preferred solution it seems (although Konrad did want > to know if it works). > > I am happy to test any further patches or follow ups as needed, just keep me > on CC. Cheers. Since this isn't 5.14 material any more, I'll CC you on a series next week. Will ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [v7 3/3] drm/i915/display/dsc: Force dsc BPP
> -Original Message- > From: Nikula, Jani > Sent: Thursday, July 8, 2021 9:53 PM > To: Kulkarni, Vandita ; intel- > g...@lists.freedesktop.org > Subject: RE: [v7 3/3] drm/i915/display/dsc: Force dsc BPP > > On Thu, 08 Jul 2021, "Kulkarni, Vandita" wrote: > >> -Original Message- > >> From: Nikula, Jani > >> Sent: Thursday, July 8, 2021 6:44 PM > >> To: Kulkarni, Vandita ; intel- > >> g...@lists.freedesktop.org > >> Cc: Kulkarni, Vandita > >> Subject: Re: [v7 3/3] drm/i915/display/dsc: Force dsc BPP > >> > >> On Thu, 08 Jul 2021, Jani Nikula wrote: > >> > On Thu, 08 Jul 2021, Vandita Kulkarni > wrote: > >> >> Set DSC BPP to the value forced through debugfs. It can go from > >> >> bpc to bpp-1. > >> >> > >> >> Signed-off-by: Vandita Kulkarni > >> >> --- > >> >> drivers/gpu/drm/i915/display/intel_dp.c | 17 + > >> >> 1 file changed, 17 insertions(+) > >> >> > >> >> diff --git a/drivers/gpu/drm/i915/display/intel_dp.c > >> >> b/drivers/gpu/drm/i915/display/intel_dp.c > >> >> index 5b52beaddada..3e50cdd7e448 100644 > >> >> --- a/drivers/gpu/drm/i915/display/intel_dp.c > >> >> +++ b/drivers/gpu/drm/i915/display/intel_dp.c > >> >> @@ -1240,6 +1240,23 @@ static int > >> >> intel_dp_dsc_compute_config(struct > >> intel_dp *intel_dp, > >> >> pipe_config->port_clock = intel_dp->common_rates[limits- > >> >max_clock]; > >> >> pipe_config->lane_count = limits->max_lane_count; > >> >> > >> >> + if (intel_dp->force_dsc_en) { > >> > >> Oh, this should check for intel_dp->force_dsc_bpp. We don't want to > >> always force the bpp when we force dsc enable. > > Okay will fix this. > > And I was returning -EINVAL , to fail the test on setting invalid BPP. > > Okay, if it makes the test easier, I guess it's fine. Up to you. Okay, for now I have sent a patch like you suggested, as I see that there are no negative test cases . Have sent the v2 of this patch. So, it wouldn't make much difference. Thanks, Vandita > > BR, > Jani. > > > > >> > >> >> + /* As of today we support DSC for only RGB */ > >> >> + if (intel_dp->force_dsc_bpp >= 8 && > >> >> + intel_dp->force_dsc_bpp < pipe_bpp) { > >> >> + drm_dbg_kms(&dev_priv->drm, > >> >> + "DSC BPP forced to %d", > >> >> + intel_dp->force_dsc_bpp); > >> >> + pipe_config->dsc.compressed_bpp = > >> >> + intel_dp->force_dsc_bpp; > >> >> + } else { > >> >> + drm_dbg_kms(&dev_priv->drm, > >> >> + "Invalid DSC BPP %d", > >> >> + intel_dp->force_dsc_bpp); > >> >> + return -EINVAL; > >> > > >> > I'd just let it use the normal compressed_bpp, with the debug > >> > message, instead of returning -EINVAL. > >> > > >> >> + } > >> >> + } > >> >> + > >> > > >> > This should be *after* the below blocks, because otherwise > >> > compressed_bpp will be overridden by the normal case, not by the > >> > force case! > >> > > >> > BR, > >> > Jani. > >> > > >> >> if (intel_dp_is_edp(intel_dp)) { > >> >> pipe_config->dsc.compressed_bpp = > >> >> min_t(u16, > >> drm_edp_dsc_sink_output_bpp(intel_dp->dsc_dpcd) >> 4, > >> > >> -- > >> Jani Nikula, Intel Open Source Graphics Center > > -- > Jani Nikula, Intel Open Source Graphics Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [v7 3/3] drm/i915/display/dsc: Force dsc BPP
On Thu, 08 Jul 2021, "Kulkarni, Vandita" wrote: >> -Original Message- >> From: Nikula, Jani >> Sent: Thursday, July 8, 2021 6:44 PM >> To: Kulkarni, Vandita ; intel- >> g...@lists.freedesktop.org >> Cc: Kulkarni, Vandita >> Subject: Re: [v7 3/3] drm/i915/display/dsc: Force dsc BPP >> >> On Thu, 08 Jul 2021, Jani Nikula wrote: >> > On Thu, 08 Jul 2021, Vandita Kulkarni wrote: >> >> Set DSC BPP to the value forced through debugfs. It can go from bpc >> >> to bpp-1. >> >> >> >> Signed-off-by: Vandita Kulkarni >> >> --- >> >> drivers/gpu/drm/i915/display/intel_dp.c | 17 + >> >> 1 file changed, 17 insertions(+) >> >> >> >> diff --git a/drivers/gpu/drm/i915/display/intel_dp.c >> >> b/drivers/gpu/drm/i915/display/intel_dp.c >> >> index 5b52beaddada..3e50cdd7e448 100644 >> >> --- a/drivers/gpu/drm/i915/display/intel_dp.c >> >> +++ b/drivers/gpu/drm/i915/display/intel_dp.c >> >> @@ -1240,6 +1240,23 @@ static int intel_dp_dsc_compute_config(struct >> intel_dp *intel_dp, >> >> pipe_config->port_clock = intel_dp->common_rates[limits- >> >max_clock]; >> >> pipe_config->lane_count = limits->max_lane_count; >> >> >> >> + if (intel_dp->force_dsc_en) { >> >> Oh, this should check for intel_dp->force_dsc_bpp. We don't want to always >> force the bpp when we force dsc enable. > Okay will fix this. > And I was returning -EINVAL , to fail the test on setting invalid BPP. Okay, if it makes the test easier, I guess it's fine. Up to you. BR, Jani. > >> >> >> + /* As of today we support DSC for only RGB */ >> >> + if (intel_dp->force_dsc_bpp >= 8 && >> >> + intel_dp->force_dsc_bpp < pipe_bpp) { >> >> + drm_dbg_kms(&dev_priv->drm, >> >> + "DSC BPP forced to %d", >> >> + intel_dp->force_dsc_bpp); >> >> + pipe_config->dsc.compressed_bpp = >> >> + intel_dp->force_dsc_bpp; >> >> + } else { >> >> + drm_dbg_kms(&dev_priv->drm, >> >> + "Invalid DSC BPP %d", >> >> + intel_dp->force_dsc_bpp); >> >> + return -EINVAL; >> > >> > I'd just let it use the normal compressed_bpp, with the debug message, >> > instead of returning -EINVAL. >> > >> >> + } >> >> + } >> >> + >> > >> > This should be *after* the below blocks, because otherwise >> > compressed_bpp will be overridden by the normal case, not by the force >> > case! >> > >> > BR, >> > Jani. >> > >> >> if (intel_dp_is_edp(intel_dp)) { >> >> pipe_config->dsc.compressed_bpp = >> >> min_t(u16, >> drm_edp_dsc_sink_output_bpp(intel_dp->dsc_dpcd) >> 4, >> >> -- >> Jani Nikula, Intel Open Source Graphics Center -- Jani Nikula, Intel Open Source Graphics Center ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 0/7] CT changes required for GuC submission
As part of enabling GuC submission discussed in [1], [2], and [3] we need optimize and update the CT code as this is now in the critical path of submission. This series includes the patches to do that which is the first 7 patches from [3]. The patches should have addressed all the feedback in [3] and should be ready to merge once CI returns a we get a few more RBs. v2: Fix checkpatch warning, address a couple of Michal's comments v3: Address John Harrison's comments v4: Address remaining comments, resend for patchworks to merge Signed-off-by: Matthew Brost [1] https://patchwork.freedesktop.org/series/89844/ [2] https://patchwork.freedesktop.org/series/91417/ [3] https://patchwork.freedesktop.org/series/91840/ Signed-off-by: Matthew Brost John Harrison (1): drm/i915/guc: Module load failure test for CT buffer creation Matthew Brost (6): drm/i915/guc: Relax CTB response timeout drm/i915/guc: Improve error message for unsolicited CT response drm/i915/guc: Increase size of CTB buffers drm/i915/guc: Add non blocking CTB send function drm/i915/guc: Add stall timer to non blocking CTB send function drm/i915/guc: Optimize CTB writes and reads .../gt/uc/abi/guc_communication_ctb_abi.h | 3 +- drivers/gpu/drm/i915/gt/uc/intel_guc.h| 11 +- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 256 +++--- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h | 14 +- 4 files changed, 234 insertions(+), 50 deletions(-) -- 2.28.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 3/7] drm/i915/guc: Increase size of CTB buffers
With the introduction of non-blocking CTBs more than one CTB can be in flight at a time. Increasing the size of the CTBs should reduce how often software hits the case where no space is available in the CTB buffer. Cc: John Harrison Signed-off-by: Matthew Brost Reviewed-by: Michal Wajdeczko --- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c index 80db59b45c45..43e03aa2dde8 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c @@ -58,11 +58,16 @@ static inline struct drm_device *ct_to_drm(struct intel_guc_ct *ct) * ++---+--+ * * Size of each `CT Buffer`_ must be multiple of 4K. - * As we don't expect too many messages, for now use minimum sizes. + * We don't expect too many messages in flight at any time, unless we are + * using the GuC submission. In that case each request requires a minimum + * 2 dwords which gives us a maximum 256 queue'd requests. Hopefully this + * enough space to avoid backpressure on the driver. We increase the size + * of the receive buffer (relative to the send) to ensure a G2H response + * CTB has a landing spot. */ #define CTB_DESC_SIZE ALIGN(sizeof(struct guc_ct_buffer_desc), SZ_2K) #define CTB_H2G_BUFFER_SIZE(SZ_4K) -#define CTB_G2H_BUFFER_SIZE(SZ_4K) +#define CTB_G2H_BUFFER_SIZE(4 * CTB_H2G_BUFFER_SIZE) struct ct_request { struct list_head link; @@ -643,7 +648,7 @@ static int ct_read(struct intel_guc_ct *ct, struct ct_incoming_msg **msg) /* beware of buffer wrap case */ if (unlikely(available < 0)) available += size; - CT_DEBUG(ct, "available %d (%u:%u)\n", available, head, tail); + CT_DEBUG(ct, "available %d (%u:%u:%u)\n", available, head, tail, size); GEM_BUG_ON(available < 0); header = cmds[head]; -- 2.28.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 6/7] drm/i915/guc: Optimize CTB writes and reads
CTB writes are now in the path of command submission and should be optimized for performance. Rather than reading CTB descriptor values (e.g. head, tail) which could result in accesses across the PCIe bus, store shadow local copies and only read/write the descriptor values when absolutely necessary. Also store the current space in the each channel locally. v2: (Michal) - Add additional sanity checks for head / tail pointers - Use GUC_CTB_HDR_LEN rather than magic 1 v3: (Michal / John H) - Drop redundant check of head value v4: (John H) - Drop redundant checks of tail / head values v5: (Michal) - Address more nits v6: (Michal) - Add GEM_BUG_ON sanity check on ctb->space Signed-off-by: John Harrison Signed-off-by: Matthew Brost Reviewed-by: Michal Wajdeczko --- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 93 +++ drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h | 6 ++ 2 files changed, 67 insertions(+), 32 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c index db3e85b89573..ad33708c2818 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c @@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct guc_ct_buffer_desc *desc) static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb) { ctb->broken = false; + ctb->tail = 0; + ctb->head = 0; + ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size); + guc_ct_buffer_desc_init(ctb->desc); } @@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct, { struct intel_guc_ct_buffer *ctb = &ct->ctbs.send; struct guc_ct_buffer_desc *desc = ctb->desc; - u32 head = desc->head; - u32 tail = desc->tail; + u32 tail = ctb->tail; u32 size = ctb->size; - u32 used; u32 header; u32 hxg; u32 type; @@ -396,25 +398,22 @@ static int ct_write(struct intel_guc_ct *ct, if (unlikely(desc->status)) goto corrupted; - if (unlikely((tail | head) >= size)) { - CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n", -head, tail, size); + GEM_BUG_ON(tail > size); + +#ifdef CONFIG_DRM_I915_DEBUG_GUC + if (unlikely(tail != READ_ONCE(desc->tail))) { + CT_ERROR(ct, "Tail was modified %u != %u\n", +desc->tail, tail); + desc->status |= GUC_CTB_STATUS_MISMATCH; + goto corrupted; + } + if (unlikely(READ_ONCE(desc->head) >= size)) { + CT_ERROR(ct, "Invalid head offset %u >= %u)\n", +desc->head, size); desc->status |= GUC_CTB_STATUS_OVERFLOW; goto corrupted; } - - /* -* tail == head condition indicates empty. GuC FW does not support -* using up the entire buffer to get tail == head meaning full. -*/ - if (tail < head) - used = (size - head) + tail; - else - used = tail - head; - - /* make sure there is a space including extra dw for the header */ - if (unlikely(used + len + GUC_CTB_HDR_LEN >= size)) - return -ENOSPC; +#endif /* * dw0: CT header (including fence) @@ -452,6 +451,11 @@ static int ct_write(struct intel_guc_ct *ct, */ write_barrier(ct); + /* update local copies */ + ctb->tail = tail; + GEM_BUG_ON(ctb->space < len + GUC_CTB_HDR_LEN); + ctb->space -= len + GUC_CTB_HDR_LEN; + /* now update descriptor */ WRITE_ONCE(desc->tail, tail); @@ -469,7 +473,7 @@ static int ct_write(struct intel_guc_ct *ct, * @req: pointer to pending request * @status:placeholder for status * - * For each sent request, Guc shall send bac CT response message. + * For each sent request, GuC shall send back CT response message. * Our message handler will update status of tracked request once * response message with given fence is received. Wait here and * check for valid response status value. @@ -525,24 +529,36 @@ static inline bool ct_deadlocked(struct intel_guc_ct *ct) return ret; } -static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw) +static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw) { + struct intel_guc_ct_buffer *ctb = &ct->ctbs.send; struct guc_ct_buffer_desc *desc = ctb->desc; - u32 head = READ_ONCE(desc->head); + u32 head; u32 space; - space = CIRC_SPACE(desc->tail, head, ctb->size); + if (ctb->space >= len_dw) + return true; + + head = READ_ONCE(desc->head); + if (unlikely(head > ctb->size)) { + CT_ERROR(ct, "Invalid head offset %u >= %u)\n", +head, ctb->size); + desc->status |= GUC_CTB_STATUS_OVERFLOW; + ctb->broken
[Intel-gfx] [PATCH 1/7] drm/i915/guc: Relax CTB response timeout
In upcoming patch we will allow more CTB requests to be sent in parallel to the GuC for processing, so we shouldn't assume any more that GuC will always reply without 10ms. Use bigger value hardcoded value of 1s instead. v2: Add CONFIG_DRM_I915_GUC_CTB_TIMEOUT config option v3: (Daniel Vetter) - Use hardcoded value of 1s rather than config option v4: (Michal) - Use defines for timeout values Signed-off-by: Matthew Brost Cc: Michal Wajdeczko Reviewed-by: Michal Wajdeczko --- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c index 43409044528e..b86575b99537 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c @@ -474,14 +474,18 @@ static int wait_for_ct_request_update(struct ct_request *req, u32 *status) /* * Fast commands should complete in less than 10us, so sample quickly * up to that length of time, then switch to a slower sleep-wait loop. -* No GuC command should ever take longer than 10ms. +* No GuC command should ever take longer than 10ms but many GuC +* commands can be inflight at time, so use a 1s timeout on the slower +* sleep-wait loop. */ +#define GUC_CTB_RESPONSE_TIMEOUT_SHORT_MS 10 +#define GUC_CTB_RESPONSE_TIMEOUT_LONG_MS 1000 #define done \ (FIELD_GET(GUC_HXG_MSG_0_ORIGIN, READ_ONCE(req->status)) == \ GUC_HXG_ORIGIN_GUC) - err = wait_for_us(done, 10); + err = wait_for_us(done, GUC_CTB_RESPONSE_TIMEOUT_SHORT_MS); if (err) - err = wait_for(done, 10); + err = wait_for(done, GUC_CTB_RESPONSE_TIMEOUT_LONG_MS); #undef done if (unlikely(err)) -- 2.28.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 4/7] drm/i915/guc: Add non blocking CTB send function
Add non blocking CTB send function, intel_guc_send_nb. GuC submission will send CTBs in the critical path and does not need to wait for these CTBs to complete before moving on, hence the need for this new function. The non-blocking CTB now must have a flow control mechanism to ensure the buffer isn't overrun. A lazy spin wait is used as we believe the flow control condition should be rare with a properly sized buffer. The function, intel_guc_send_nb, is exported in this patch but unused. Several patches later in the series make use of this function. v2: (Michal) - Use define for H2G room calculations - Move INTEL_GUC_SEND_NB define (Daniel Vetter) - Use msleep_interruptible rather than cond_resched v3: (Michal) - Move includes to following patch - s/INTEL_GUC_SEND_NB/INTEL_GUC_CT_SEND_NB/g v4: (John H) - Update comment, add type local variable Signed-off-by: John Harrison Signed-off-by: Matthew Brost Reviewed-by: John Harrison --- .../gt/uc/abi/guc_communication_ctb_abi.h | 3 +- drivers/gpu/drm/i915/gt/uc/intel_guc.h| 11 ++- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 88 --- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h | 4 +- 4 files changed, 91 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h index e933ca02d0eb..99e1fad5ca20 100644 --- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h +++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h @@ -79,7 +79,8 @@ static_assert(sizeof(struct guc_ct_buffer_desc) == 64); * +---+---+--+ */ -#define GUC_CTB_MSG_MIN_LEN1u +#define GUC_CTB_HDR_LEN1u +#define GUC_CTB_MSG_MIN_LENGUC_CTB_HDR_LEN #define GUC_CTB_MSG_MAX_LEN256u #define GUC_CTB_MSG_0_FENCE(0x << 16) #define GUC_CTB_MSG_0_FORMAT (0xf << 12) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index 4abc59f6f3cd..72e4653222e2 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -74,7 +74,14 @@ static inline struct intel_guc *log_to_guc(struct intel_guc_log *log) static inline int intel_guc_send(struct intel_guc *guc, const u32 *action, u32 len) { - return intel_guc_ct_send(&guc->ct, action, len, NULL, 0); + return intel_guc_ct_send(&guc->ct, action, len, NULL, 0, 0); +} + +static +inline int intel_guc_send_nb(struct intel_guc *guc, const u32 *action, u32 len) +{ + return intel_guc_ct_send(&guc->ct, action, len, NULL, 0, +INTEL_GUC_CT_SEND_NB); } static inline int @@ -82,7 +89,7 @@ intel_guc_send_and_receive(struct intel_guc *guc, const u32 *action, u32 len, u32 *response_buf, u32 response_buf_size) { return intel_guc_ct_send(&guc->ct, action, len, -response_buf, response_buf_size); +response_buf, response_buf_size, 0); } static inline void intel_guc_to_host_event_handler(struct intel_guc *guc) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c index 43e03aa2dde8..3d6cba8d91ad 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c @@ -3,6 +3,8 @@ * Copyright © 2016-2019 Intel Corporation */ +#include + #include "i915_drv.h" #include "intel_guc_ct.h" #include "gt/intel_gt.h" @@ -373,7 +375,7 @@ static void write_barrier(struct intel_guc_ct *ct) static int ct_write(struct intel_guc_ct *ct, const u32 *action, u32 len /* in dwords */, - u32 fence) + u32 fence, u32 flags) { struct intel_guc_ct_buffer *ctb = &ct->ctbs.send; struct guc_ct_buffer_desc *desc = ctb->desc; @@ -383,6 +385,7 @@ static int ct_write(struct intel_guc_ct *ct, u32 used; u32 header; u32 hxg; + u32 type; u32 *cmds = ctb->cmds; unsigned int i; @@ -408,8 +411,8 @@ static int ct_write(struct intel_guc_ct *ct, else used = tail - head; - /* make sure there is a space including extra dw for the fence */ - if (unlikely(used + len + 1 >= size)) + /* make sure there is a space including extra dw for the header */ + if (unlikely(used + len + GUC_CTB_HDR_LEN >= size)) return -ENOSPC; /* @@ -421,9 +424,11 @@ static int ct_write(struct intel_guc_ct *ct, FIELD_PREP(GUC_CTB_MSG_0_NUM_DWORDS, len) | FIELD_PREP(GUC_CTB_MSG_0_FENCE, fence); - hxg = FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) | - FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION
[Intel-gfx] [PATCH 5/7] drm/i915/guc: Add stall timer to non blocking CTB send function
Implement a stall timer which fails H2G CTBs once a period of time with no forward progress is reached to prevent deadlock. v2: (Michal) - Improve error message in ct_deadlock() - Set broken when ct_deadlock() returns true - Return -EPIPE on ct_deadlock() v3: (Michal) - Add ms to stall timer comment (Matthew) - Move broken check to intel_guc_ct_send() Signed-off-by: John Harrison Signed-off-by: Daniele Ceraolo Spurio Signed-off-by: Matthew Brost Reviewed-by: John Harrison --- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 62 --- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h | 4 ++ 2 files changed, 59 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c index 3d6cba8d91ad..db3e85b89573 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c @@ -4,6 +4,9 @@ */ #include +#include +#include +#include #include "i915_drv.h" #include "intel_guc_ct.h" @@ -316,6 +319,7 @@ int intel_guc_ct_enable(struct intel_guc_ct *ct) goto err_deregister; ct->enabled = true; + ct->stall_time = KTIME_MAX; return 0; @@ -389,9 +393,6 @@ static int ct_write(struct intel_guc_ct *ct, u32 *cmds = ctb->cmds; unsigned int i; - if (unlikely(ctb->broken)) - return -EPIPE; - if (unlikely(desc->status)) goto corrupted; @@ -505,6 +506,25 @@ static int wait_for_ct_request_update(struct ct_request *req, u32 *status) return err; } +#define GUC_CTB_TIMEOUT_MS 1500 +static inline bool ct_deadlocked(struct intel_guc_ct *ct) +{ + long timeout = GUC_CTB_TIMEOUT_MS; + bool ret = ktime_ms_delta(ktime_get(), ct->stall_time) > timeout; + + if (unlikely(ret)) { + struct guc_ct_buffer_desc *send = ct->ctbs.send.desc; + struct guc_ct_buffer_desc *recv = ct->ctbs.send.desc; + + CT_ERROR(ct, "Communication stalled for %lld ms, desc status=%#x,%#x\n", +ktime_ms_delta(ktime_get(), ct->stall_time), +send->status, recv->status); + ct->ctbs.send.broken = true; + } + + return ret; +} + static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw) { struct guc_ct_buffer_desc *desc = ctb->desc; @@ -516,6 +536,26 @@ static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw) return space >= len_dw; } +static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw) +{ + struct intel_guc_ct_buffer *ctb = &ct->ctbs.send; + + lockdep_assert_held(&ct->ctbs.send.lock); + + if (unlikely(!h2g_has_room(ctb, len_dw))) { + if (ct->stall_time == KTIME_MAX) + ct->stall_time = ktime_get(); + + if (unlikely(ct_deadlocked(ct))) + return -EPIPE; + else + return -EBUSY; + } + + ct->stall_time = KTIME_MAX; + return 0; +} + static int ct_send_nb(struct intel_guc_ct *ct, const u32 *action, u32 len, @@ -528,11 +568,9 @@ static int ct_send_nb(struct intel_guc_ct *ct, spin_lock_irqsave(&ctb->lock, spin_flags); - ret = h2g_has_room(ctb, len + GUC_CTB_HDR_LEN); - if (unlikely(!ret)) { - ret = -EBUSY; + ret = has_room_nb(ct, len + GUC_CTB_HDR_LEN); + if (unlikely(ret)) goto out; - } fence = ct_get_next_fence(ct); ret = ct_write(ct, action, len, fence, flags); @@ -575,8 +613,13 @@ static int ct_send(struct intel_guc_ct *ct, retry: spin_lock_irqsave(&ctb->lock, flags); if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) { + if (ct->stall_time == KTIME_MAX) + ct->stall_time = ktime_get(); spin_unlock_irqrestore(&ctb->lock, flags); + if (unlikely(ct_deadlocked(ct))) + return -EPIPE; + if (msleep_interruptible(sleep_period_ms)) return -EINTR; sleep_period_ms = sleep_period_ms << 1; @@ -584,6 +627,8 @@ static int ct_send(struct intel_guc_ct *ct, goto retry; } + ct->stall_time = KTIME_MAX; + fence = ct_get_next_fence(ct); request.fence = fence; request.status = 0; @@ -646,6 +691,9 @@ int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len, return -ENODEV; } + if (unlikely(ct->ctbs.send.broken)) + return -EPIPE; + if (flags & INTEL_GUC_CT_SEND_NB) return ct_send_nb(ct, action, len, flags); diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h index 5bb8bef024c8..bee03794c1eb 100644 --- a/drivers/gpu/dr
[Intel-gfx] [PATCH 2/7] drm/i915/guc: Improve error message for unsolicited CT response
Improve the error message when a unsolicited CT response is received by printing fence that couldn't be found, the last fence, and all requests with a response outstanding. Signed-off-by: Matthew Brost Reviewed-by: Michal Wajdeczko --- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c index b86575b99537..80db59b45c45 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c @@ -732,12 +732,16 @@ static int ct_handle_response(struct intel_guc_ct *ct, struct ct_incoming_msg *r found = true; break; } - spin_unlock_irqrestore(&ct->requests.lock, flags); - if (!found) { CT_ERROR(ct, "Unsolicited response (fence %u)\n", fence); - return -ENOKEY; + CT_ERROR(ct, "Could not find fence=%u, last_fence=%u\n", fence, +ct->requests.last_fence); + list_for_each_entry(req, &ct->requests.pending, link) + CT_ERROR(ct, "request %u awaits response\n", +req->fence); + err = -ENOKEY; } + spin_unlock_irqrestore(&ct->requests.lock, flags); if (unlikely(err)) return err; -- 2.28.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 7/7] drm/i915/guc: Module load failure test for CT buffer creation
From: John Harrison Add several module failure load inject points in the CT buffer creation code path. Signed-off-by: John Harrison Signed-off-by: Matthew Brost Reviewed-by: Michal Wajdeczko --- drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 8 1 file changed, 8 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c index ad33708c2818..83ec60ea3f89 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c @@ -175,6 +175,10 @@ static int ct_register_buffer(struct intel_guc_ct *ct, u32 type, { int err; + err = i915_inject_probe_error(guc_to_gt(ct_to_guc(ct))->i915, -ENXIO); + if (unlikely(err)) + return err; + err = guc_action_register_ct_buffer(ct_to_guc(ct), type, desc_addr, buff_addr, size); if (unlikely(err)) @@ -226,6 +230,10 @@ int intel_guc_ct_init(struct intel_guc_ct *ct) u32 *cmds; int err; + err = i915_inject_probe_error(guc_to_gt(guc)->i915, -ENXIO); + if (err) + return err; + GEM_BUG_ON(ct->vma); blob_size = 2 * CTB_DESC_SIZE + CTB_H2G_BUFFER_SIZE + CTB_G2H_BUFFER_SIZE; -- 2.28.0 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 29/30] drm/i915/gem: Roll all of context creation together
Now that we have the whole engine set and VM at context creation time, we can just assign those fields instead of creating first and handling the VM and engines later. This lets us avoid creating useless VMs and engine sets and lets us get rid of the complex VM setting code. Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 176 ++ .../gpu/drm/i915/gem/selftests/mock_context.c | 33 ++-- 2 files changed, 73 insertions(+), 136 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 5f5375b15c530..c67e305f5bc74 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1279,56 +1279,6 @@ static int __context_set_persistence(struct i915_gem_context *ctx, bool state) return 0; } -static struct i915_gem_context * -__create_context(struct drm_i915_private *i915, -const struct i915_gem_proto_context *pc) -{ - struct i915_gem_context *ctx; - struct i915_gem_engines *e; - int err; - int i; - - ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); - if (!ctx) - return ERR_PTR(-ENOMEM); - - kref_init(&ctx->ref); - ctx->i915 = i915; - ctx->sched = pc->sched; - mutex_init(&ctx->mutex); - INIT_LIST_HEAD(&ctx->link); - - spin_lock_init(&ctx->stale.lock); - INIT_LIST_HEAD(&ctx->stale.engines); - - mutex_init(&ctx->engines_mutex); - e = default_engines(ctx, pc->legacy_rcs_sseu); - if (IS_ERR(e)) { - err = PTR_ERR(e); - goto err_free; - } - RCU_INIT_POINTER(ctx->engines, e); - - INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL); - mutex_init(&ctx->lut_mutex); - - /* NB: Mark all slices as needing a remap so that when the context first -* loads it will restore whatever remap state already exists. If there -* is no remap info, it will be a NOP. */ - ctx->remap_slice = ALL_L3_SLICES(i915); - - ctx->user_flags = pc->user_flags; - - for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++) - ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES; - - return ctx; - -err_free: - kfree(ctx); - return ERR_PTR(err); -} - static inline struct i915_gem_engines * __context_engines_await(const struct i915_gem_context *ctx, bool *user_engines) @@ -1372,54 +1322,31 @@ context_apply_all(struct i915_gem_context *ctx, i915_sw_fence_complete(&e->fence); } -static void __apply_ppgtt(struct intel_context *ce, void *vm) -{ - i915_vm_put(ce->vm); - ce->vm = i915_vm_get(vm); -} - -static struct i915_address_space * -__set_ppgtt(struct i915_gem_context *ctx, struct i915_address_space *vm) -{ - struct i915_address_space *old; - - old = rcu_replace_pointer(ctx->vm, - i915_vm_open(vm), - lockdep_is_held(&ctx->mutex)); - GEM_BUG_ON(old && i915_vm_is_4lvl(vm) != i915_vm_is_4lvl(old)); - - context_apply_all(ctx, __apply_ppgtt, vm); - - return old; -} - -static void __assign_ppgtt(struct i915_gem_context *ctx, - struct i915_address_space *vm) -{ - if (vm == rcu_access_pointer(ctx->vm)) - return; - - vm = __set_ppgtt(ctx, vm); - if (vm) - i915_vm_close(vm); -} - static struct i915_gem_context * i915_gem_create_context(struct drm_i915_private *i915, const struct i915_gem_proto_context *pc) { struct i915_gem_context *ctx; - int ret; + struct i915_address_space *vm = NULL; + struct i915_gem_engines *e; + int err; + int i; - ctx = __create_context(i915, pc); - if (IS_ERR(ctx)) - return ctx; + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); + if (!ctx) + return ERR_PTR(-ENOMEM); + + kref_init(&ctx->ref); + ctx->i915 = i915; + ctx->sched = pc->sched; + mutex_init(&ctx->mutex); + INIT_LIST_HEAD(&ctx->link); + + spin_lock_init(&ctx->stale.lock); + INIT_LIST_HEAD(&ctx->stale.engines); if (pc->vm) { - /* __assign_ppgtt() requires this mutex to be held */ - mutex_lock(&ctx->mutex); - __assign_ppgtt(ctx, pc->vm); - mutex_unlock(&ctx->mutex); + vm = i915_vm_get(pc->vm); } else if (HAS_FULL_PPGTT(i915)) { struct i915_ppgtt *ppgtt; @@ -1427,50 +1354,65 @@ i915_gem_create_context(struct drm_i915_private *i915, if (IS_ERR(ppgtt)) { drm_dbg(&i915->drm, "PPGTT setup failed (%ld)\n", PTR_ERR(ppgtt)); - context_close(ctx); - return ERR_CAST(ppgtt); +
[Intel-gfx] [PATCH 30/30] drm/i915: Finalize contexts in GEM_CONTEXT_CREATE on version 13+
All the proto-context stuff for context creation exists to allow older userspace drivers to set VMs and engine sets via SET_CONTEXT_PARAM. Drivers need to update to use CONTEXT_CREATE_EXT_* for this going forward. Force the issue by blocking the old mechanism on any future hardware generations. Signed-off-by: Jason Ekstrand Cc: Jon Bloomfield Cc: Carl Zhang Cc: Michal Mrozek --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 39 - 1 file changed, 30 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index c67e305f5bc74..7d6f52d8a8012 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1996,9 +1996,28 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data, goto err_pc; } - ret = proto_context_register(ext_data.fpriv, ext_data.pc, &id); - if (ret < 0) - goto err_pc; + if (GRAPHICS_VER(i915) > 12) { + struct i915_gem_context *ctx; + + /* Get ourselves a context ID */ + ret = xa_alloc(&ext_data.fpriv->context_xa, &id, NULL, + xa_limit_32b, GFP_KERNEL); + if (ret) + goto err_pc; + + ctx = i915_gem_create_context(i915, ext_data.pc); + if (IS_ERR(ctx)) { + ret = PTR_ERR(ctx); + goto err_pc; + } + + proto_context_close(ext_data.pc); + gem_context_register(ctx, ext_data.fpriv, id); + } else { + ret = proto_context_register(ext_data.fpriv, ext_data.pc, &id); + if (ret < 0) + goto err_pc; + } args->ctx_id = id; drm_dbg(&i915->drm, "HW context %d created\n", args->ctx_id); @@ -2181,15 +2200,17 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, mutex_lock(&file_priv->proto_context_lock); ctx = __context_lookup(file_priv, args->ctx_id); if (!ctx) { - /* FIXME: We should consider disallowing SET_CONTEXT_PARAM -* for most things on future platforms. Clients should be -* using CONTEXT_CREATE_EXT_PARAM instead. -*/ pc = xa_load(&file_priv->proto_context_xa, args->ctx_id); - if (pc) + if (pc) { + /* Contexts should be finalized inside +* GEM_CONTEXT_CREATE starting with graphics +* version 13. +*/ + WARN_ON(GRAPHICS_VER(file_priv->dev_priv) > 12); ret = set_proto_ctx_param(file_priv, pc, args); - else + } else { ret = -ENOENT; + } } mutex_unlock(&file_priv->proto_context_lock); -- 2.31.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 28/30] i915/gem/selftests: Assign the VM at context creation in igt_shared_ctx_exec
We want to delete __assign_ppgtt and, generally, stop setting the VM after context creation. This is the one place I could find in the selftests where we set a VM after the fact. Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c index 3e59746afdc82..8eb5050f8cb3e 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c @@ -813,16 +813,12 @@ static int igt_shared_ctx_exec(void *arg) struct i915_gem_context *ctx; struct intel_context *ce; - ctx = kernel_context(i915, NULL); + ctx = kernel_context(i915, ctx_vm(parent)); if (IS_ERR(ctx)) { err = PTR_ERR(ctx); goto out_test; } - mutex_lock(&ctx->mutex); - __assign_ppgtt(ctx, ctx_vm(parent)); - mutex_unlock(&ctx->mutex); - ce = i915_gem_context_get_engine(ctx, engine->legacy_idx); GEM_BUG_ON(IS_ERR(ce)); -- 2.31.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 27/30] drm/i915/selftests: Take a VM in kernel_context()
This better models where we want to go with contexts in general where things like the VM and engine set are create parameters instead of being set after the fact. Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- .../drm/i915/gem/selftests/i915_gem_context.c | 4 ++-- .../gpu/drm/i915/gem/selftests/mock_context.c | 9 - .../gpu/drm/i915/gem/selftests/mock_context.h | 4 +++- drivers/gpu/drm/i915/gt/selftest_execlists.c | 20 +-- drivers/gpu/drm/i915/gt/selftest_hangcheck.c | 2 +- 5 files changed, 24 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c index 92544a174cc9a..3e59746afdc82 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c @@ -680,7 +680,7 @@ static int igt_ctx_exec(void *arg) struct i915_gem_context *ctx; struct intel_context *ce; - ctx = kernel_context(i915); + ctx = kernel_context(i915, NULL); if (IS_ERR(ctx)) { err = PTR_ERR(ctx); goto out_file; @@ -813,7 +813,7 @@ static int igt_shared_ctx_exec(void *arg) struct i915_gem_context *ctx; struct intel_context *ce; - ctx = kernel_context(i915); + ctx = kernel_context(i915, NULL); if (IS_ERR(ctx)) { err = PTR_ERR(ctx); goto out_test; diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c b/drivers/gpu/drm/i915/gem/selftests/mock_context.c index 61aaac4a334cf..500ef27ba4771 100644 --- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c +++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c @@ -150,7 +150,8 @@ live_context_for_engine(struct intel_engine_cs *engine, struct file *file) } struct i915_gem_context * -kernel_context(struct drm_i915_private *i915) +kernel_context(struct drm_i915_private *i915, + struct i915_address_space *vm) { struct i915_gem_context *ctx; struct i915_gem_proto_context *pc; @@ -159,6 +160,12 @@ kernel_context(struct drm_i915_private *i915) if (IS_ERR(pc)) return ERR_CAST(pc); + if (vm) { + if (pc->vm) + i915_vm_put(pc->vm); + pc->vm = i915_vm_get(vm); + } + ctx = i915_gem_create_context(i915, pc); proto_context_close(pc); if (IS_ERR(ctx)) diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.h b/drivers/gpu/drm/i915/gem/selftests/mock_context.h index 2a6121d33352d..7a02fd9b5866a 100644 --- a/drivers/gpu/drm/i915/gem/selftests/mock_context.h +++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.h @@ -10,6 +10,7 @@ struct file; struct drm_i915_private; struct intel_engine_cs; +struct i915_address_space; void mock_init_contexts(struct drm_i915_private *i915); @@ -25,7 +26,8 @@ live_context(struct drm_i915_private *i915, struct file *file); struct i915_gem_context * live_context_for_engine(struct intel_engine_cs *engine, struct file *file); -struct i915_gem_context *kernel_context(struct drm_i915_private *i915); +struct i915_gem_context *kernel_context(struct drm_i915_private *i915, + struct i915_address_space *vm); void kernel_context_close(struct i915_gem_context *ctx); #endif /* !__MOCK_CONTEXT_H */ diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c b/drivers/gpu/drm/i915/gt/selftest_execlists.c index 6abce18d0ef32..73ddc6e147305 100644 --- a/drivers/gpu/drm/i915/gt/selftest_execlists.c +++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c @@ -1539,12 +1539,12 @@ static int live_busywait_preempt(void *arg) * preempt the busywaits used to synchronise between rings. */ - ctx_hi = kernel_context(gt->i915); + ctx_hi = kernel_context(gt->i915, NULL); if (!ctx_hi) return -ENOMEM; ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY; - ctx_lo = kernel_context(gt->i915); + ctx_lo = kernel_context(gt->i915, NULL); if (!ctx_lo) goto err_ctx_hi; ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY; @@ -1741,12 +1741,12 @@ static int live_preempt(void *arg) if (igt_spinner_init(&spin_lo, gt)) goto err_spin_hi; - ctx_hi = kernel_context(gt->i915); + ctx_hi = kernel_context(gt->i915, NULL); if (!ctx_hi) goto err_spin_lo; ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY; - ctx_lo = kernel_context(gt->i915); + ctx_lo = kernel_context(gt->i915, NULL); if (!ctx_lo) goto err_ctx_hi; ctx_lo->sched.priority
[Intel-gfx] [PATCH 24/30] drm/i915/gem: Delay context creation (v3)
The current context uAPI allows for two methods of setting context parameters: SET_CONTEXT_PARAM and CONTEXT_CREATE_EXT_SETPARAM. The former is allowed to be called at any time while the later happens as part of GEM_CONTEXT_CREATE. Currently, everything settable via one is settable via the other. While some params are fairly simple and setting them on a live context is harmless such as the context priority, others are far trickier such as the VM or the set of engines. In order to swap out the VM, for instance, we have to delay until all current in-flight work is complete, swap in the new VM, and then continue. This leads to a plethora of potential race conditions we'd really rather avoid. In previous patches, we added a i915_gem_proto_context struct which is capable of storing and tracking all such create parameters. This commit delays the creation of the actual context until after the client is done configuring it with SET_CONTEXT_PARAM. From the perspective of the client, it has the same u32 context ID the whole time. From the perspective of i915, however, it's an i915_gem_proto_context right up until the point where we attempt to do something which the proto-context can't handle. Then the real context gets created. This is accomplished via a little xarray dance. When GEM_CONTEXT_CREATE is called, we create a proto-context, reserve a slot in context_xa but leave it NULL, the proto-context in the corresponding slot in proto_context_xa. Then, whenever we go to look up a context, we first check context_xa. If it's there, we return the i915_gem_context and we're done. If it's not, we look in proto_context_xa and, if we find it there, we create the actual context and kill the proto-context. In order for this dance to work properly, everything which ever touches a proto-context is guarded by drm_i915_file_private::proto_context_lock, including context creation. Yes, this means context creation now takes a giant global lock but it can't really be helped and that should never be on any driver's fast-path anyway. v2 (Daniel Vetter): - Commit message grammatical fixes. - Use WARN_ON instead of GEM_BUG_ON - Rename lazy_create_context_locked to finalize_create_context_locked - Rework the control-flow logic in the setparam ioctl - Better documentation all around v3 (kernel test robot): - Make finalize_create_context_locked static Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 203 ++ drivers/gpu/drm/i915/gem/i915_gem_context.h | 3 + .../gpu/drm/i915/gem/i915_gem_context_types.h | 54 + .../gpu/drm/i915/gem/selftests/mock_context.c | 5 +- drivers/gpu/drm/i915/i915_drv.h | 76 +-- 5 files changed, 283 insertions(+), 58 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 5a1402544d48d..c4f89e4b1665f 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -278,6 +278,42 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags) return err; } +static int proto_context_register_locked(struct drm_i915_file_private *fpriv, +struct i915_gem_proto_context *pc, +u32 *id) +{ + int ret; + void *old; + + lockdep_assert_held(&fpriv->proto_context_lock); + + ret = xa_alloc(&fpriv->context_xa, id, NULL, xa_limit_32b, GFP_KERNEL); + if (ret) + return ret; + + old = xa_store(&fpriv->proto_context_xa, *id, pc, GFP_KERNEL); + if (xa_is_err(old)) { + xa_erase(&fpriv->context_xa, *id); + return xa_err(old); + } + WARN_ON(old); + + return 0; +} + +static int proto_context_register(struct drm_i915_file_private *fpriv, + struct i915_gem_proto_context *pc, + u32 *id) +{ + int ret; + + mutex_lock(&fpriv->proto_context_lock); + ret = proto_context_register_locked(fpriv, pc, id); + mutex_unlock(&fpriv->proto_context_lock); + + return ret; +} + static int set_proto_ctx_vm(struct drm_i915_file_private *fpriv, struct i915_gem_proto_context *pc, const struct drm_i915_gem_context_param *args) @@ -1448,12 +1484,12 @@ void i915_gem_init__contexts(struct drm_i915_private *i915) init_contexts(&i915->gem.contexts); } -static int gem_context_register(struct i915_gem_context *ctx, - struct drm_i915_file_private *fpriv, - u32 *id) +static void gem_context_register(struct i915_gem_context *ctx, +struct drm_i915_file_private *fpriv, +u32 id) { struct drm_i915_private *i915 = ctx->i915; - int ret; +
[Intel-gfx] [PATCH 26/30] drm/i915/gem: Don't allow changing the engine set on running contexts (v3)
When the APIs were added to manage the engine set on a GEM context directly from userspace, the questionable choice was made to allow changing the engine set on a context at any time. This is horribly racy and there's absolutely no reason why any userspace would want to do this outside of trying to exercise interesting race conditions. By removing support for CONTEXT_PARAM_ENGINES from ctx_setparam, we make it impossible to change the engine set after the context has been fully created. This doesn't yet let us delete all the deferred engine clean-up code as that's still used for handling the case where the client dies or calls GEM_CONTEXT_DESTROY while work is in flight. However, moving to an API where the engine set is effectively immutable gives us more options to potentially clean that code up a bit going forward. It also removes a whole class of ways in which a client can hurt itself or try to get around kernel context banning. v2 (Jason Ekstrand): - Expand the commit mesage v3 (Jason Ekstrand): - Make it more obvious that I915_CONTEXT_PARAM_ENGINES returns -EINVAL Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 304 +--- 1 file changed, 1 insertion(+), 303 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 40acecfbbe5b5..5f5375b15c530 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1819,305 +1819,6 @@ static int set_sseu(struct i915_gem_context *ctx, return ret; } -struct set_engines { - struct i915_gem_context *ctx; - struct i915_gem_engines *engines; -}; - -static int -set_engines__load_balance(struct i915_user_extension __user *base, void *data) -{ - struct i915_context_engines_load_balance __user *ext = - container_of_user(base, typeof(*ext), base); - const struct set_engines *set = data; - struct drm_i915_private *i915 = set->ctx->i915; - struct intel_engine_cs *stack[16]; - struct intel_engine_cs **siblings; - struct intel_context *ce; - struct intel_sseu null_sseu = {}; - u16 num_siblings, idx; - unsigned int n; - int err; - - if (!HAS_EXECLISTS(i915)) - return -ENODEV; - - if (intel_uc_uses_guc_submission(&i915->gt.uc)) - return -ENODEV; /* not implement yet */ - - if (get_user(idx, &ext->engine_index)) - return -EFAULT; - - if (idx >= set->engines->num_engines) { - drm_dbg(&i915->drm, "Invalid placement value, %d >= %d\n", - idx, set->engines->num_engines); - return -EINVAL; - } - - idx = array_index_nospec(idx, set->engines->num_engines); - if (set->engines->engines[idx]) { - drm_dbg(&i915->drm, - "Invalid placement[%d], already occupied\n", idx); - return -EEXIST; - } - - if (get_user(num_siblings, &ext->num_siblings)) - return -EFAULT; - - err = check_user_mbz(&ext->flags); - if (err) - return err; - - err = check_user_mbz(&ext->mbz64); - if (err) - return err; - - siblings = stack; - if (num_siblings > ARRAY_SIZE(stack)) { - siblings = kmalloc_array(num_siblings, -sizeof(*siblings), -GFP_KERNEL); - if (!siblings) - return -ENOMEM; - } - - for (n = 0; n < num_siblings; n++) { - struct i915_engine_class_instance ci; - - if (copy_from_user(&ci, &ext->engines[n], sizeof(ci))) { - err = -EFAULT; - goto out_siblings; - } - - siblings[n] = intel_engine_lookup_user(i915, - ci.engine_class, - ci.engine_instance); - if (!siblings[n]) { - drm_dbg(&i915->drm, - "Invalid sibling[%d]: { class:%d, inst:%d }\n", - n, ci.engine_class, ci.engine_instance); - err = -EINVAL; - goto out_siblings; - } - } - - ce = intel_execlists_create_virtual(siblings, n); - if (IS_ERR(ce)) { - err = PTR_ERR(ce); - goto out_siblings; - } - - intel_context_set_gem(ce, set->ctx, null_sseu); - - if (cmpxchg(&set->engines->engines[idx], NULL, ce)) { - intel_context_put(ce); - err = -EEXIST; - goto out_siblings; - } - -out_siblings: - if (siblings != stack) - kfree(siblings); - - return err; -} - -static int -set_engin
[Intel-gfx] [PATCH 25/30] drm/i915/gem: Don't allow changing the VM on running contexts (v4)
When the APIs were added to manage VMs more directly from userspace, the questionable choice was made to allow changing out the VM on a context at any time. This is horribly racy and there's absolutely no reason why any userspace would want to do this outside of testing that exact race. By removing support for CONTEXT_PARAM_VM from ctx_setparam, we make it impossible to change out the VM after the context has been fully created. This lets us delete a bunch of deferred task code as well as a duplicated (and slightly different) copy of the code which programs the PPGTT registers. v2 (Jason Ekstrand): - Expand the commit message v3 (Daniel Vetter): - Don't drop the __rcu on the vm pointer v4 (Jason Ekstrand): - Make it more obvious that I915_CONTEXT_PARAM_VM returns -EINVAL Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 263 +- .../drm/i915/gem/selftests/i915_gem_context.c | 119 .../drm/i915/selftests/i915_mock_selftests.h | 1 - 3 files changed, 1 insertion(+), 382 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index c4f89e4b1665f..40acecfbbe5b5 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1633,120 +1633,6 @@ int i915_gem_vm_destroy_ioctl(struct drm_device *dev, void *data, return 0; } -struct context_barrier_task { - struct i915_active base; - void (*task)(void *data); - void *data; -}; - -static void cb_retire(struct i915_active *base) -{ - struct context_barrier_task *cb = container_of(base, typeof(*cb), base); - - if (cb->task) - cb->task(cb->data); - - i915_active_fini(&cb->base); - kfree(cb); -} - -I915_SELFTEST_DECLARE(static intel_engine_mask_t context_barrier_inject_fault); -static int context_barrier_task(struct i915_gem_context *ctx, - intel_engine_mask_t engines, - bool (*skip)(struct intel_context *ce, void *data), - int (*pin)(struct intel_context *ce, struct i915_gem_ww_ctx *ww, void *data), - int (*emit)(struct i915_request *rq, void *data), - void (*task)(void *data), - void *data) -{ - struct context_barrier_task *cb; - struct i915_gem_engines_iter it; - struct i915_gem_engines *e; - struct i915_gem_ww_ctx ww; - struct intel_context *ce; - int err = 0; - - GEM_BUG_ON(!task); - - cb = kmalloc(sizeof(*cb), GFP_KERNEL); - if (!cb) - return -ENOMEM; - - i915_active_init(&cb->base, NULL, cb_retire, 0); - err = i915_active_acquire(&cb->base); - if (err) { - kfree(cb); - return err; - } - - e = __context_engines_await(ctx, NULL); - if (!e) { - i915_active_release(&cb->base); - return -ENOENT; - } - - for_each_gem_engine(ce, e, it) { - struct i915_request *rq; - - if (I915_SELFTEST_ONLY(context_barrier_inject_fault & - ce->engine->mask)) { - err = -ENXIO; - break; - } - - if (!(ce->engine->mask & engines)) - continue; - - if (skip && skip(ce, data)) - continue; - - i915_gem_ww_ctx_init(&ww, true); -retry: - err = intel_context_pin_ww(ce, &ww); - if (err) - goto err; - - if (pin) - err = pin(ce, &ww, data); - if (err) - goto err_unpin; - - rq = i915_request_create(ce); - if (IS_ERR(rq)) { - err = PTR_ERR(rq); - goto err_unpin; - } - - err = 0; - if (emit) - err = emit(rq, data); - if (err == 0) - err = i915_active_add_request(&cb->base, rq); - - i915_request_add(rq); -err_unpin: - intel_context_unpin(ce); -err: - if (err == -EDEADLK) { - err = i915_gem_ww_ctx_backoff(&ww); - if (!err) - goto retry; - } - i915_gem_ww_ctx_fini(&ww); - - if (err) - break; - } - i915_sw_fence_complete(&e->fence); - - cb->task = err ? NULL : task; /* caller needs to unwind instead */ - cb->data = data; - - i915_active_release(&cb->base); - - return err; -} - static int get_ppgtt(struct drm_i915_file_private *file_priv, struct i915_gem_context *
[Intel-gfx] [PATCH 23/30] drm/i915/gt: Drop i915_address_space::file (v2)
There's a big comment saying how useful it is but no one is using this for anything anymore. It was added in 2bfa996e031b ("drm/i915: Store owning file on the i915_address_space") and used for debugfs at the time as well as telling the difference between the global GTT and a PPGTT. In f6e8aa387171 ("drm/i915: Report the number of closed vma held by each context in debugfs") we removed one use of it by switching to a context walk and comparing with the VM in the context. Finally, VM stats for debugfs were entirely nuked in db80a1294c23 ("drm/i915/gem: Remove per-client stats from debugfs/i915_gem_objects") v2 (Daniel Vetter): - Delete a struct drm_i915_file_private pre-declaration - Add a comment to the commit message about history Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 9 - drivers/gpu/drm/i915/gt/intel_gtt.h | 11 --- drivers/gpu/drm/i915/selftests/mock_gtt.c | 1 - 3 files changed, 21 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 7045e3afa7113..5a1402544d48d 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1453,17 +1453,10 @@ static int gem_context_register(struct i915_gem_context *ctx, u32 *id) { struct drm_i915_private *i915 = ctx->i915; - struct i915_address_space *vm; int ret; ctx->file_priv = fpriv; - mutex_lock(&ctx->mutex); - vm = i915_gem_context_vm(ctx); - if (vm) - WRITE_ONCE(vm->file, fpriv); /* XXX */ - mutex_unlock(&ctx->mutex); - ctx->pid = get_task_pid(current, PIDTYPE_PID); snprintf(ctx->name, sizeof(ctx->name), "%s[%d]", current->comm, pid_nr(ctx->pid)); @@ -1562,8 +1555,6 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void *data, if (IS_ERR(ppgtt)) return PTR_ERR(ppgtt); - ppgtt->vm.file = file_priv; - if (args->extensions) { err = i915_user_extensions(u64_to_user_ptr(args->extensions), NULL, 0, diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index 9bd89f2a01ff1..bc7153018ebd5 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -140,7 +140,6 @@ typedef u64 gen8_pte_t; enum i915_cache_level; -struct drm_i915_file_private; struct drm_i915_gem_object; struct i915_fence_reg; struct i915_vma; @@ -220,16 +219,6 @@ struct i915_address_space { struct intel_gt *gt; struct drm_i915_private *i915; struct device *dma; - /* -* Every address space belongs to a struct file - except for the global -* GTT that is owned by the driver (and so @file is set to NULL). In -* principle, no information should leak from one context to another -* (or between files/processes etc) unless explicitly shared by the -* owner. Tracking the owner is important in order to free up per-file -* objects along with the file, to aide resource tracking, and to -* assign blame. -*/ - struct drm_i915_file_private *file; u64 total; /* size addr space maps (ex. 2GB for ggtt) */ u64 reserved; /* size addr space reserved */ diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c index 5c7ae40bba634..cc047ec594f93 100644 --- a/drivers/gpu/drm/i915/selftests/mock_gtt.c +++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c @@ -73,7 +73,6 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, const char *name) ppgtt->vm.gt = &i915->gt; ppgtt->vm.i915 = i915; ppgtt->vm.total = round_down(U64_MAX, PAGE_SIZE); - ppgtt->vm.file = ERR_PTR(-ENODEV); ppgtt->vm.dma = i915->drm.dev; i915_address_space_init(&ppgtt->vm, VM_CLASS_PPGTT); -- 2.31.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 21/30] drm/i915/gem: Use the proto-context to handle create parameters (v5)
This means that the proto-context needs to grow support for engine configuration information as well as setparam logic. Fortunately, we'll be deleting a lot of setparam logic on the primary context shortly so it will hopefully balance out. There's an extra bit of fun here when it comes to setting SSEU and the way it interacts with PARAM_ENGINES. Unfortunately, thanks to SET_CONTEXT_PARAM and not being allowed to pick the order in which we handle certain parameters, we have think about those interactions. v2 (Daniel Vetter): - Add a proto_context_free_user_engines helper - Comment on SSEU in the commit message - Use proto_context_set_persistence in set_proto_ctx_param v3 (Daniel Vetter): - Fix a doc comment - Do an explicit HAS_FULL_PPGTT check in set_proto_ctx_vm instead of relying on pc->vm != NULL. - Handle errors for CONTEXT_PARAM_PERSISTENCE - Don't allow more resetting user engines - Rework initialization of UCONTEXT_PERSISTENCE v4 (Jason Ekstrand): - Move hand-rolled initialization of UCONTEXT_PERSISTENCE to an earlier patch v5 (Jason Ekstrand): - Move proto_context_set_persistence to this patch Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 577 +- .../gpu/drm/i915/gem/i915_gem_context_types.h | 58 ++ 2 files changed, 618 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index f135fbc97c5a7..4972b8c91d942 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -193,11 +193,59 @@ static int validate_priority(struct drm_i915_private *i915, static void proto_context_close(struct i915_gem_proto_context *pc) { + int i; + if (pc->vm) i915_vm_put(pc->vm); + if (pc->user_engines) { + for (i = 0; i < pc->num_user_engines; i++) + kfree(pc->user_engines[i].siblings); + kfree(pc->user_engines); + } kfree(pc); } +static int proto_context_set_persistence(struct drm_i915_private *i915, +struct i915_gem_proto_context *pc, +bool persist) +{ + if (persist) { + /* +* Only contexts that are short-lived [that will expire or be +* reset] are allowed to survive past termination. We require +* hangcheck to ensure that the persistent requests are healthy. +*/ + if (!i915->params.enable_hangcheck) + return -EINVAL; + + pc->user_flags |= BIT(UCONTEXT_PERSISTENCE); + } else { + /* To cancel a context we use "preempt-to-idle" */ + if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION)) + return -ENODEV; + + /* +* If the cancel fails, we then need to reset, cleanly! +* +* If the per-engine reset fails, all hope is lost! We resort +* to a full GPU reset in that unlikely case, but realistically +* if the engine could not reset, the full reset does not fare +* much better. The damage has been done. +* +* However, if we cannot reset an engine by itself, we cannot +* cleanup a hanging persistent context without causing +* colateral damage, and we should not pretend we can by +* exposing the interface. +*/ + if (!intel_has_reset_engine(&i915->gt)) + return -ENODEV; + + pc->user_flags &= ~BIT(UCONTEXT_PERSISTENCE); + } + + return 0; +} + static struct i915_gem_proto_context * proto_context_create(struct drm_i915_private *i915, unsigned int flags) { @@ -207,6 +255,8 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags) if (!pc) return ERR_PTR(-ENOMEM); + pc->num_user_engines = -1; + pc->user_engines = NULL; pc->user_flags = BIT(UCONTEXT_BANNABLE) | BIT(UCONTEXT_RECOVERABLE); if (i915->params.enable_hangcheck) @@ -228,6 +278,430 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags) return err; } +static int set_proto_ctx_vm(struct drm_i915_file_private *fpriv, + struct i915_gem_proto_context *pc, + const struct drm_i915_gem_context_param *args) +{ + struct drm_i915_private *i915 = fpriv->dev_priv; + struct i915_address_space *vm; + + if (args->size) + return -EINVAL; + + if (!HAS_FULL_PPGTT(i915)) + return -ENODEV; + + if (upper_32_bits(args->value)) + return -ENOENT; + + vm = i915_gem_vm_lookup
[Intel-gfx] [PATCH 22/30] drm/i915/gem: Return an error ptr from context_lookup
We're about to start doing lazy context creation which means contexts get created in i915_gem_context_lookup and we may start having more errors than -ENOENT. Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c| 12 ++-- drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 4 ++-- drivers/gpu/drm/i915/i915_drv.h| 2 +- drivers/gpu/drm/i915/i915_perf.c | 4 ++-- 4 files changed, 11 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 4972b8c91d942..7045e3afa7113 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -2636,8 +2636,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data, int ret = 0; ctx = i915_gem_context_lookup(file_priv, args->ctx_id); - if (!ctx) - return -ENOENT; + if (IS_ERR(ctx)) + return PTR_ERR(ctx); switch (args->param) { case I915_CONTEXT_PARAM_GTT_SIZE: @@ -2705,8 +2705,8 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data, int ret; ctx = i915_gem_context_lookup(file_priv, args->ctx_id); - if (!ctx) - return -ENOENT; + if (IS_ERR(ctx)) + return PTR_ERR(ctx); ret = ctx_setparam(file_priv, ctx, args); @@ -2725,8 +2725,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, return -EINVAL; ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id); - if (!ctx) - return -ENOENT; + if (IS_ERR(ctx)) + return PTR_ERR(ctx); /* * We opt for unserialised reads here. This may result in tearing diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 9aa7e10d16308..5ea8b4e23e428 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -750,8 +750,8 @@ static int eb_select_context(struct i915_execbuffer *eb) struct i915_gem_context *ctx; ctx = i915_gem_context_lookup(eb->file->driver_priv, eb->args->rsvd1); - if (unlikely(!ctx)) - return -ENOENT; + if (unlikely(IS_ERR(ctx))) + return PTR_ERR(ctx); eb->gem_context = ctx; if (rcu_access_pointer(ctx->vm)) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 8c1994c16b920..d9278c973a734 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1864,7 +1864,7 @@ i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id) ctx = NULL; rcu_read_unlock(); - return ctx; + return ctx ? ctx : ERR_PTR(-ENOENT); } static inline struct i915_address_space * diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 9f94914958c39..b4ec114a4698b 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -3414,10 +3414,10 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf, struct drm_i915_file_private *file_priv = file->driver_priv; specific_ctx = i915_gem_context_lookup(file_priv, ctx_handle); - if (!specific_ctx) { + if (IS_ERR(specific_ctx)) { DRM_DEBUG("Failed to look up context with ID %u for opening perf stream\n", ctx_handle); - ret = -ENOENT; + ret = PTR_ERR(specific_ctx); goto err; } } -- 2.31.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 20/30] drm/i915/gem: Make an alignment check more sensible
What we really want to check is that size of the engines array, i.e. args->size - sizeof(*user) is divisible by the element size, i.e. sizeof(*user->engines) because that's what's required for computing the array length right below the check. However, we're currently not doing this and instead doing a compile-time check that sizeof(*user) is divisible by sizeof(*user->engines) and avoiding the subtraction. As far as I can tell, the only reason for the more confusing pair of checks is to avoid a single subtraction of a constant. The other thing the BUILD_BUG_ON might be trying to implicitly check is that offsetof(user->engines) == sizeof(*user) and we don't have any weird padding throwing us off. However, that's not the check it's doing and it's not even a reliable way to do that check. Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 3c59d1e4080c4..f135fbc97c5a7 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1723,9 +1723,8 @@ set_engines(struct i915_gem_context *ctx, goto replace; } - BUILD_BUG_ON(!IS_ALIGNED(sizeof(*user), sizeof(*user->engines))); if (args->size < sizeof(*user) || - !IS_ALIGNED(args->size, sizeof(*user->engines))) { + !IS_ALIGNED(args->size - sizeof(*user), sizeof(*user->engines))) { drm_dbg(&i915->drm, "Invalid size for engine array: %d\n", args->size); return -EINVAL; -- 2.31.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 18/30] drm/i915/gem: Optionally set SSEU in intel_context_set_gem
For now this is a no-op because everyone passes in a null SSEU but it lets us get some of the error handling and selftest refactoring plumbed through. Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 41 +++ .../gpu/drm/i915/gem/selftests/mock_context.c | 6 ++- 2 files changed, 36 insertions(+), 11 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 5b75f98274b9e..206721dccd24e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -266,9 +266,12 @@ context_get_vm_rcu(struct i915_gem_context *ctx) } while (1); } -static void intel_context_set_gem(struct intel_context *ce, - struct i915_gem_context *ctx) +static int intel_context_set_gem(struct intel_context *ce, +struct i915_gem_context *ctx, +struct intel_sseu sseu) { + int ret = 0; + GEM_BUG_ON(rcu_access_pointer(ce->gem_context)); RCU_INIT_POINTER(ce->gem_context, ctx); @@ -295,6 +298,12 @@ static void intel_context_set_gem(struct intel_context *ce, intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000); } + + /* A valid SSEU has no zero fields */ + if (sseu.slice_mask && !WARN_ON(ce->engine->class != RENDER_CLASS)) + ret = intel_context_reconfigure_sseu(ce, sseu); + + return ret; } static void __free_engines(struct i915_gem_engines *e, unsigned int count) @@ -362,7 +371,8 @@ static struct i915_gem_engines *alloc_engines(unsigned int count) return e; } -static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx) +static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx, + struct intel_sseu rcs_sseu) { const struct intel_gt *gt = &ctx->i915->gt; struct intel_engine_cs *engine; @@ -375,6 +385,8 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx) for_each_engine(engine, gt, id) { struct intel_context *ce; + struct intel_sseu sseu = {}; + int ret; if (engine->legacy_idx == INVALID_ENGINE) continue; @@ -388,10 +400,18 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx) goto free_engines; } - intel_context_set_gem(ce, ctx); - e->engines[engine->legacy_idx] = ce; e->num_engines = max(e->num_engines, engine->legacy_idx + 1); + + if (engine->class == RENDER_CLASS) + sseu = rcs_sseu; + + ret = intel_context_set_gem(ce, ctx, sseu); + if (ret) { + err = ERR_PTR(ret); + goto free_engines; + } + } return e; @@ -705,6 +725,7 @@ __create_context(struct drm_i915_private *i915, { struct i915_gem_context *ctx; struct i915_gem_engines *e; + struct intel_sseu null_sseu = {}; int err; int i; @@ -722,7 +743,7 @@ __create_context(struct drm_i915_private *i915, INIT_LIST_HEAD(&ctx->stale.engines); mutex_init(&ctx->engines_mutex); - e = default_engines(ctx); + e = default_engines(ctx, null_sseu); if (IS_ERR(e)) { err = PTR_ERR(e); goto err_free; @@ -1508,6 +1529,7 @@ set_engines__load_balance(struct i915_user_extension __user *base, void *data) struct intel_engine_cs *stack[16]; struct intel_engine_cs **siblings; struct intel_context *ce; + struct intel_sseu null_sseu = {}; u16 num_siblings, idx; unsigned int n; int err; @@ -1580,7 +1602,7 @@ set_engines__load_balance(struct i915_user_extension __user *base, void *data) goto out_siblings; } - intel_context_set_gem(ce, set->ctx); + intel_context_set_gem(ce, set->ctx, null_sseu); if (cmpxchg(&set->engines->engines[idx], NULL, ce)) { intel_context_put(ce); @@ -1688,6 +1710,7 @@ set_engines(struct i915_gem_context *ctx, struct drm_i915_private *i915 = ctx->i915; struct i915_context_param_engines __user *user = u64_to_user_ptr(args->value); + struct intel_sseu null_sseu = {}; struct set_engines set = { .ctx = ctx }; unsigned int num_engines, n; u64 extensions; @@ -1697,7 +1720,7 @@ set_engines(struct i915_gem_context *ctx, if (!i915_gem_context_user_engines(ctx)) return 0; - set.engines = default_engines(ctx); + set.engines = default_engines(ctx, null_sseu); if (IS_ERR(set.eng
[Intel-gfx] [PATCH 19/30] drm/i915: Add an i915_gem_vm_lookup helper
This is the VM equivalent of i915_gem_context_lookup. It's only used once in this patch but future patches will need to duplicate this lookup code so it's better to have it in a helper. Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 6 +- drivers/gpu/drm/i915/i915_drv.h | 14 ++ 2 files changed, 15 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 206721dccd24e..3c59d1e4080c4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1311,11 +1311,7 @@ static int set_ppgtt(struct drm_i915_file_private *file_priv, if (upper_32_bits(args->value)) return -ENOENT; - rcu_read_lock(); - vm = xa_load(&file_priv->vm_xa, args->value); - if (vm && !kref_get_unless_zero(&vm->ref)) - vm = NULL; - rcu_read_unlock(); + vm = i915_gem_vm_lookup(file_priv, args->value); if (!vm) return -ENOENT; diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index ae45ea7b26997..8c1994c16b920 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1867,6 +1867,20 @@ i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id) return ctx; } +static inline struct i915_address_space * +i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id) +{ + struct i915_address_space *vm; + + rcu_read_lock(); + vm = xa_load(&file_priv->vm_xa, id); + if (vm && !kref_get_unless_zero(&vm->ref)) + vm = NULL; + rcu_read_unlock(); + + return vm; +} + /* i915_gem_evict.c */ int __must_check i915_gem_evict_something(struct i915_address_space *vm, u64 min_size, u64 alignment, -- 2.31.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 16/30] drm/i915/gem: Add an intermediate proto_context struct (v5)
The current context uAPI allows for two methods of setting context parameters: SET_CONTEXT_PARAM and CONTEXT_CREATE_EXT_SETPARAM. The former is allowed to be called at any time while the later happens as part of GEM_CONTEXT_CREATE. Currently, everything settable via one is settable via the other. While some params are fairly simple and setting them on a live context is harmless such the context priority, others are far trickier such as the VM or the set of engines. In order to swap out the VM, for instance, we have to delay until all current in-flight work is complete, swap in the new VM, and then continue. This leads to a plethora of potential race conditions we'd really rather avoid. Unfortunately, both methods of setting the VM and the engine set are in active use today so we can't simply disallow setting the VM or engine set vial SET_CONTEXT_PARAM. In order to work around this wart, this commit adds a proto-context struct which contains all the context create parameters. v2 (Daniel Vetter): - Better commit message - Use __set/clear_bit instead of set/clear_bit because there's no race and we don't need the atomics v3 (Daniel Vetter): - Use manual bitops and BIT() instead of __set_bit v4 (Daniel Vetter): - Add a changelog to the commit message - Better hyperlinking in docs - Create the default PPGTT in i915_gem_create_context v5 (Daniel Vetter): - Hand-roll the initialization of UCONTEXT_PERSISTENCE Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 84 +++ .../gpu/drm/i915/gem/i915_gem_context_types.h | 22 + .../gpu/drm/i915/gem/selftests/mock_context.c | 16 +++- 3 files changed, 105 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index f9a6eac78c0ae..741624da8db78 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -191,6 +191,43 @@ static int validate_priority(struct drm_i915_private *i915, return 0; } +static void proto_context_close(struct i915_gem_proto_context *pc) +{ + if (pc->vm) + i915_vm_put(pc->vm); + kfree(pc); +} + +static struct i915_gem_proto_context * +proto_context_create(struct drm_i915_private *i915, unsigned int flags) +{ + struct i915_gem_proto_context *pc, *err; + + pc = kzalloc(sizeof(*pc), GFP_KERNEL); + if (!pc) + return ERR_PTR(-ENOMEM); + + pc->user_flags = BIT(UCONTEXT_BANNABLE) | +BIT(UCONTEXT_RECOVERABLE); + if (i915->params.enable_hangcheck) + pc->user_flags |= BIT(UCONTEXT_PERSISTENCE); + pc->sched.priority = I915_PRIORITY_NORMAL; + + if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) { + if (!HAS_EXECLISTS(i915)) { + err = ERR_PTR(-EINVAL); + goto proto_close; + } + pc->single_timeline = true; + } + + return pc; + +proto_close: + proto_context_close(pc); + return err; +} + static struct i915_address_space * context_get_vm_rcu(struct i915_gem_context *ctx) { @@ -660,7 +697,8 @@ static int __context_set_persistence(struct i915_gem_context *ctx, bool state) } static struct i915_gem_context * -__create_context(struct drm_i915_private *i915) +__create_context(struct drm_i915_private *i915, +const struct i915_gem_proto_context *pc) { struct i915_gem_context *ctx; struct i915_gem_engines *e; @@ -673,7 +711,7 @@ __create_context(struct drm_i915_private *i915) kref_init(&ctx->ref); ctx->i915 = i915; - ctx->sched.priority = I915_PRIORITY_NORMAL; + ctx->sched = pc->sched; mutex_init(&ctx->mutex); INIT_LIST_HEAD(&ctx->link); @@ -696,9 +734,7 @@ __create_context(struct drm_i915_private *i915) * is no remap info, it will be a NOP. */ ctx->remap_slice = ALL_L3_SLICES(i915); - i915_gem_context_set_bannable(ctx); - i915_gem_context_set_recoverable(ctx); - __context_set_persistence(ctx, true /* cgroup hook? */); + ctx->user_flags = pc->user_flags; for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++) ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES; @@ -786,20 +822,22 @@ static void __assign_ppgtt(struct i915_gem_context *ctx, } static struct i915_gem_context * -i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags) +i915_gem_create_context(struct drm_i915_private *i915, + const struct i915_gem_proto_context *pc) { struct i915_gem_context *ctx; int ret; - if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE && - !HAS_EXECLISTS(i915)) - return ERR_PTR(-EINVAL); - - ctx = __create_context(i915); + ctx = __create_context(i915, pc);
[Intel-gfx] [PATCH 17/30] drm/i915/gem: Rework error handling in default_engines
Since free_engines works for partially constructed engine sets, we can use the usual goto pattern. Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 741624da8db78..5b75f98274b9e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -366,7 +366,7 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx) { const struct intel_gt *gt = &ctx->i915->gt; struct intel_engine_cs *engine; - struct i915_gem_engines *e; + struct i915_gem_engines *e, *err; enum intel_engine_id id; e = alloc_engines(I915_NUM_ENGINES); @@ -384,18 +384,21 @@ static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx) ce = intel_context_create(engine); if (IS_ERR(ce)) { - __free_engines(e, e->num_engines + 1); - return ERR_CAST(ce); + err = ERR_CAST(ce); + goto free_engines; } intel_context_set_gem(ce, ctx); e->engines[engine->legacy_idx] = ce; - e->num_engines = max(e->num_engines, engine->legacy_idx); + e->num_engines = max(e->num_engines, engine->legacy_idx + 1); } - e->num_engines++; return e; + +free_engines: + free_engines(e); + return err; } void i915_gem_context_release(struct kref *ref) -- 2.31.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 11/30] drm/i915/request: Remove the hook from await_execution
This was only ever used for FENCE_SUBMIT automatic engine selection which was removed in the previous commit. Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 3 +- drivers/gpu/drm/i915/i915_request.c | 42 --- drivers/gpu/drm/i915/i915_request.h | 4 +- 3 files changed, 9 insertions(+), 40 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 30498948c83d0..9aa7e10d16308 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -3492,8 +3492,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, if (in_fence) { if (args->flags & I915_EXEC_FENCE_SUBMIT) err = i915_request_await_execution(eb.request, - in_fence, - NULL); + in_fence); else err = i915_request_await_dma_fence(eb.request, in_fence); diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index c5989c0b83d3e..86b4c9f2613d5 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -49,7 +49,6 @@ struct execute_cb { struct irq_work work; struct i915_sw_fence *fence; - void (*hook)(struct i915_request *rq, struct dma_fence *signal); struct i915_request *signal; }; @@ -180,17 +179,6 @@ static void irq_execute_cb(struct irq_work *wrk) kmem_cache_free(global.slab_execute_cbs, cb); } -static void irq_execute_cb_hook(struct irq_work *wrk) -{ - struct execute_cb *cb = container_of(wrk, typeof(*cb), work); - - cb->hook(container_of(cb->fence, struct i915_request, submit), -&cb->signal->fence); - i915_request_put(cb->signal); - - irq_execute_cb(wrk); -} - static __always_inline void __notify_execute_cb(struct i915_request *rq, bool (*fn)(struct irq_work *wrk)) { @@ -517,17 +505,12 @@ static bool __request_in_flight(const struct i915_request *signal) static int __await_execution(struct i915_request *rq, struct i915_request *signal, - void (*hook)(struct i915_request *rq, - struct dma_fence *signal), gfp_t gfp) { struct execute_cb *cb; - if (i915_request_is_active(signal)) { - if (hook) - hook(rq, &signal->fence); + if (i915_request_is_active(signal)) return 0; - } cb = kmem_cache_alloc(global.slab_execute_cbs, gfp); if (!cb) @@ -537,12 +520,6 @@ __await_execution(struct i915_request *rq, i915_sw_fence_await(cb->fence); init_irq_work(&cb->work, irq_execute_cb); - if (hook) { - cb->hook = hook; - cb->signal = i915_request_get(signal); - cb->work.func = irq_execute_cb_hook; - } - /* * Register the callback first, then see if the signaler is already * active. This ensures that if we race with the @@ -1253,7 +1230,7 @@ emit_semaphore_wait(struct i915_request *to, goto await_fence; /* Only submit our spinner after the signaler is running! */ - if (__await_execution(to, from, NULL, gfp)) + if (__await_execution(to, from, gfp)) goto await_fence; if (__emit_semaphore_wait(to, from, from->fence.seqno)) @@ -1284,16 +1261,14 @@ static int intel_timeline_sync_set_start(struct intel_timeline *tl, static int __i915_request_await_execution(struct i915_request *to, - struct i915_request *from, - void (*hook)(struct i915_request *rq, - struct dma_fence *signal)) + struct i915_request *from) { int err; GEM_BUG_ON(intel_context_is_barrier(from->context)); /* Submit both requests at the same time */ - err = __await_execution(to, from, hook, I915_FENCE_GFP); + err = __await_execution(to, from, I915_FENCE_GFP); if (err) return err; @@ -1406,9 +1381,7 @@ i915_request_await_external(struct i915_request *rq, struct dma_fence *fence) int i915_request_await_execution(struct i915_request *rq, -struct dma_fence *fence, -void (*hook)(struct i915_request *rq, - struct dma_fence *signal)) +struct dma_fence *fence) { struct dma_fence **child = &fence; unsigned int nchild = 1; @@ -1441,8 +1414,7 @@ i915_request_await_ex
[Intel-gfx] [PATCH 15/30] drm/i915: Add gem/i915_gem_context.h to the docs
In order to prevent kernel doc warnings, also fill out docs for any missing fields and fix those that forgot the "@". Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- Documentation/gpu/i915.rst| 2 + .../gpu/drm/i915/gem/i915_gem_context_types.h | 43 --- 2 files changed, 38 insertions(+), 7 deletions(-) diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst index e6fd9608e9c6d..204ebdaadb45a 100644 --- a/Documentation/gpu/i915.rst +++ b/Documentation/gpu/i915.rst @@ -422,6 +422,8 @@ Batchbuffer Parsing User Batchbuffer Execution -- +.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_context_types.h + .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c :doc: User command execution diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h index df76767f0c41b..5f0673a2129f9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h @@ -30,19 +30,39 @@ struct i915_address_space; struct intel_timeline; struct intel_ring; +/** + * struct i915_gem_engines - A set of engines + */ struct i915_gem_engines { union { + /** @link: Link in i915_gem_context::stale::engines */ struct list_head link; + + /** @rcu: RCU to use when freeing */ struct rcu_head rcu; }; + + /** @fence: Fence used for delayed destruction of engines */ struct i915_sw_fence fence; + + /** @ctx: i915_gem_context backpointer */ struct i915_gem_context *ctx; + + /** @num_engines: Number of engines in this set */ unsigned int num_engines; + + /** @engines: Array of engines */ struct intel_context *engines[]; }; +/** + * struct i915_gem_engines_iter - Iterator for an i915_gem_engines set + */ struct i915_gem_engines_iter { + /** @idx: Index into i915_gem_engines::engines */ unsigned int idx; + + /** @engines: Engine set being iterated */ const struct i915_gem_engines *engines; }; @@ -53,10 +73,10 @@ struct i915_gem_engines_iter { * logical hardware state for a particular client. */ struct i915_gem_context { - /** i915: i915 device backpointer */ + /** @i915: i915 device backpointer */ struct drm_i915_private *i915; - /** file_priv: owning file descriptor */ + /** @file_priv: owning file descriptor */ struct drm_i915_file_private *file_priv; /** @@ -81,7 +101,9 @@ struct i915_gem_context { * CONTEXT_USER_ENGINES flag is set). */ struct i915_gem_engines __rcu *engines; - struct mutex engines_mutex; /* guards writes to engines */ + + /** @engines_mutex: guards writes to engines */ + struct mutex engines_mutex; /** * @syncobj: Shared timeline syncobj @@ -118,7 +140,7 @@ struct i915_gem_context { */ struct pid *pid; - /** link: place with &drm_i915_private.context_list */ + /** @link: place with &drm_i915_private.context_list */ struct list_head link; /** @@ -153,11 +175,13 @@ struct i915_gem_context { #define CONTEXT_CLOSED 0 #define CONTEXT_USER_ENGINES 1 + /** @mutex: guards everything that isn't engines or handles_vma */ struct mutex mutex; + /** @sched: scheduler parameters */ struct i915_sched_attr sched; - /** guilty_count: How many times this context has caused a GPU hang. */ + /** @guilty_count: How many times this context has caused a GPU hang. */ atomic_t guilty_count; /** * @active_count: How many times this context was active during a GPU @@ -171,15 +195,17 @@ struct i915_gem_context { unsigned long hang_timestamp[2]; #define CONTEXT_FAST_HANG_JIFFIES (120 * HZ) /* 3 hangs within 120s? Banned! */ - /** remap_slice: Bitmask of cache lines that need remapping */ + /** @remap_slice: Bitmask of cache lines that need remapping */ u8 remap_slice; /** -* handles_vma: rbtree to look up our context specific obj/vma for +* @handles_vma: rbtree to look up our context specific obj/vma for * the user handle. (user handles are per fd, but the binding is * per vm, which may be one per context or shared with the global GTT) */ struct radix_tree_root handles_vma; + + /** @lut_mutex: Locks handles_vma */ struct mutex lut_mutex; /** @@ -191,8 +217,11 @@ struct i915_gem_context { */ char name[TASK_COMM_LEN + 8]; + /** @stale: tracks stale engines to be destroyed */ struct { + /** @lock: guards engines */ spinlock_t lock; + /** @engines: list of stale engines */ struct list_head engines; } stal
[Intel-gfx] [PATCH 14/30] drm/i915/gem: Add a separate validate_priority helper
With the proto-context stuff added later in this series, we end up having to duplicate set_priority. This lets us avoid duplicating the validation logic. Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 42 + 1 file changed, 27 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 61fe6d18d4068..f9a6eac78c0ae 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -169,6 +169,28 @@ lookup_user_engine(struct i915_gem_context *ctx, return i915_gem_context_get_engine(ctx, idx); } +static int validate_priority(struct drm_i915_private *i915, +const struct drm_i915_gem_context_param *args) +{ + s64 priority = args->value; + + if (args->size) + return -EINVAL; + + if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_PRIORITY)) + return -ENODEV; + + if (priority > I915_CONTEXT_MAX_USER_PRIORITY || + priority < I915_CONTEXT_MIN_USER_PRIORITY) + return -EINVAL; + + if (priority > I915_CONTEXT_DEFAULT_PRIORITY && + !capable(CAP_SYS_NICE)) + return -EPERM; + + return 0; +} + static struct i915_address_space * context_get_vm_rcu(struct i915_gem_context *ctx) { @@ -1744,23 +1766,13 @@ static void __apply_priority(struct intel_context *ce, void *arg) static int set_priority(struct i915_gem_context *ctx, const struct drm_i915_gem_context_param *args) { - s64 priority = args->value; - - if (args->size) - return -EINVAL; - - if (!(ctx->i915->caps.scheduler & I915_SCHEDULER_CAP_PRIORITY)) - return -ENODEV; - - if (priority > I915_CONTEXT_MAX_USER_PRIORITY || - priority < I915_CONTEXT_MIN_USER_PRIORITY) - return -EINVAL; + int err; - if (priority > I915_CONTEXT_DEFAULT_PRIORITY && - !capable(CAP_SYS_NICE)) - return -EPERM; + err = validate_priority(ctx->i915, args); + if (err) + return err; - ctx->sched.priority = priority; + ctx->sched.priority = args->value; context_apply_all(ctx, __apply_priority, ctx); return 0; -- 2.31.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 09/30] drm/i915/gem: Disallow bonding of virtual engines (v3)
This adds a bunch of complexity which the media driver has never actually used. The media driver does technically bond a balanced engine to another engine but the balanced engine only has one engine in the sibling set. This doesn't actually result in a virtual engine. This functionality was originally added to handle cases where we may have more than two video engines and media might want to load-balance their bonded submits by, for instance, submitting to a balanced vcs0-1 as the primary and then vcs2-3 as the secondary. However, no such hardware has shipped thus far and, if we ever want to enable such use-cases in the future, we'll use the up-and-coming parallel submit API which targets GuC submission. This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op. We leave the validation code in place in case we ever decide we want to do something interesting with the bonding information. v2 (Jason Ekstrand): - Don't delete quite as much code. v3 (Tvrtko Ursulin): - Add some history to the commit message Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 18 +- .../drm/i915/gt/intel_execlists_submission.c | 69 -- .../drm/i915/gt/intel_execlists_submission.h | 5 +- drivers/gpu/drm/i915/gt/selftest_execlists.c | 229 -- 4 files changed, 8 insertions(+), 313 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index e36e3b1ae14e4..5eca91ded3423 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1552,6 +1552,12 @@ set_engines__bond(struct i915_user_extension __user *base, void *data) } virtual = set->engines->engines[idx]->engine; + if (intel_engine_is_virtual(virtual)) { + drm_dbg(&i915->drm, + "Bonding with virtual engines not allowed\n"); + return -EINVAL; + } + err = check_user_mbz(&ext->flags); if (err) return err; @@ -1592,18 +1598,6 @@ set_engines__bond(struct i915_user_extension __user *base, void *data) n, ci.engine_class, ci.engine_instance); return -EINVAL; } - - /* -* A non-virtual engine has no siblings to choose between; and -* a submit fence will always be directed to the one engine. -*/ - if (intel_engine_is_virtual(virtual)) { - err = intel_virtual_engine_attach_bond(virtual, - master, - bond); - if (err) - return err; - } } return 0; diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index 7dd7afccb3adc..98b256352c23d 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -182,18 +182,6 @@ struct virtual_engine { int prio; } nodes[I915_NUM_ENGINES]; - /* -* Keep track of bonded pairs -- restrictions upon on our selection -* of physical engines any particular request may be submitted to. -* If we receive a submit-fence from a master engine, we will only -* use one of sibling_mask physical engines. -*/ - struct ve_bond { - const struct intel_engine_cs *master; - intel_engine_mask_t sibling_mask; - } *bonds; - unsigned int num_bonds; - /* And finally, which physical engines this virtual engine maps onto. */ unsigned int num_siblings; struct intel_engine_cs *siblings[]; @@ -3413,7 +3401,6 @@ static void rcu_virtual_context_destroy(struct work_struct *wrk) i915_sched_engine_put(ve->base.sched_engine); intel_engine_free_request_pool(&ve->base); - kfree(ve->bonds); kfree(ve); } @@ -3668,33 +3655,13 @@ static void virtual_submit_request(struct i915_request *rq) spin_unlock_irqrestore(&ve->base.sched_engine->lock, flags); } -static struct ve_bond * -virtual_find_bond(struct virtual_engine *ve, - const struct intel_engine_cs *master) -{ - int i; - - for (i = 0; i < ve->num_bonds; i++) { - if (ve->bonds[i].master == master) - return &ve->bonds[i]; - } - - return NULL; -} - static void virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal) { - struct virtual_engine *ve = to_virtual_engine(rq->engine); intel_engine_mask_t allowed, exec; - struct ve_bond *bond; allowed = ~to_request(signal)->engine->mask; - bond = virtual_find_bond(ve, to_req
[Intel-gfx] [PATCH 10/30] drm/i915/gem: Remove engine auto-magic with FENCE_SUBMIT (v2)
Even though FENCE_SUBMIT is only documented to wait until the request in the in-fence starts instead of waiting until it completes, it has a bit more magic than that. If FENCE_SUBMIT is used to submit something to a balanced engine, we would wait to assign engines until the primary request was ready to start and then attempt to assign it to a different engine than the primary. There is an IGT test (the bonded-slice subtest of gem_exec_balancer) which exercises this by submitting a primary batch to a specific VCS and then using FENCE_SUBMIT to submit a secondary which can run on any VCS and have i915 figure out which VCS to run it on such that they can run in parallel. However, this functionality has never been used in the real world. The media driver (the only user of FENCE_SUBMIT) always picks exactly two physical engines to bond and never asks us to pick which to use. v2 (Daniel Vetter): - Mention the exact IGT test this breaks Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 +- drivers/gpu/drm/i915/gt/intel_engine_types.h| 7 --- .../drm/i915/gt/intel_execlists_submission.c| 17 - 3 files changed, 1 insertion(+), 25 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c index 7b7897242a837..30498948c83d0 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c @@ -3493,7 +3493,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, if (args->flags & I915_EXEC_FENCE_SUBMIT) err = i915_request_await_execution(eb.request, in_fence, - eb.engine->bond_execute); + NULL); else err = i915_request_await_dma_fence(eb.request, in_fence); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index 5b91068ab2779..1cb9c3b70b29a 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -416,13 +416,6 @@ struct intel_engine_cs { */ void(*submit_request)(struct i915_request *rq); - /* -* Called on signaling of a SUBMIT_FENCE, passing along the signaling -* request down to the bonded pairs. -*/ - void(*bond_execute)(struct i915_request *rq, - struct dma_fence *signal); - void(*release)(struct intel_engine_cs *engine); struct intel_engine_execlists execlists; diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index 98b256352c23d..56e25090da672 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -3655,22 +3655,6 @@ static void virtual_submit_request(struct i915_request *rq) spin_unlock_irqrestore(&ve->base.sched_engine->lock, flags); } -static void -virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal) -{ - intel_engine_mask_t allowed, exec; - - allowed = ~to_request(signal)->engine->mask; - - /* Restrict the bonded request to run on only the available engines */ - exec = READ_ONCE(rq->execution_mask); - while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed)) - ; - - /* Prevent the master from being re-run on the bonded engines */ - to_request(signal)->execution_mask &= ~allowed; -} - struct intel_context * intel_execlists_create_virtual(struct intel_engine_cs **siblings, unsigned int count) @@ -3731,7 +3715,6 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings, ve->base.sched_engine->schedule = i915_schedule; ve->base.sched_engine->kick_backend = kick_execlists; ve->base.submit_request = virtual_submit_request; - ve->base.bond_execute = virtual_bond_execute; INIT_LIST_HEAD(virtual_queue(ve)); tasklet_setup(&ve->base.sched_engine->tasklet, virtual_submission_tasklet); -- 2.31.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 12/30] drm/i915/gem: Disallow creating contexts with too many engines
There's no sense in allowing userspace to create more engines than it can possibly access via execbuf. Signed-off-by: Jason Ekstrand Reviewed-by: Daniel Vetter --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 5eca91ded3423..0ba8506fb966f 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1639,11 +1639,11 @@ set_engines(struct i915_gem_context *ctx, return -EINVAL; } - /* -* Note that I915_EXEC_RING_MASK limits execbuf to only using the -* first 64 engines defined here. -*/ num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines); + /* RING_MASK has no shift so we can use it directly here */ + if (num_engines > I915_EXEC_RING_MASK + 1) + return -EINVAL; + set.engines = alloc_engines(num_engines); if (!set.engines) return -ENOMEM; -- 2.31.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx