Re: [Intel-gfx] [PATCH] drm/i915/display/xelpd: Fix incorrect color capability reporting

2021-07-08 Thread Shankar, Uma



> -----Original Message-----
> From: Sharma, Swati2 
> Sent: Thursday, July 8, 2021 1:07 AM
> To: Shankar, Uma ; intel-gfx@lists.freedesktop.org
> Subject: Re: [PATCH] drm/i915/display/xelpd: Fix incorrect color capability 
> reporting
> 
> Reviewed-by: Swati Sharma 

Merged the change to drm-intel-next. Thanks for the review.

Regards,
Uma Shankar
> 
> On 07-Jul-21 3:22 PM, Uma Shankar wrote:
> > On XELPD platforms, color management support is not yet enabled.
> > Stop wrongly reporting it through the platform info, which was
> > resulting in incorrect initialization and usage.
> >
> > Cc: Swati Sharma 
> > Signed-off-by: Uma Shankar 
> > ---
> >   drivers/gpu/drm/i915/i915_pci.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> > index a7bfdd827bc8..8ff1990528d1 100644
> > --- a/drivers/gpu/drm/i915/i915_pci.c
> > +++ b/drivers/gpu/drm/i915/i915_pci.c
> > @@ -947,7 +947,7 @@ static const struct intel_device_info adl_s_info = {
> >
> >  #define XE_LPD_FEATURES \
> >  	.abox_mask = GENMASK(1, 0), \
> > -	.color = { .degamma_lut_size = 33, .gamma_lut_size = 262145 }, \
> > +	.color = { .degamma_lut_size = 0, .gamma_lut_size = 0 }, \
> >  	.cpu_transcoder_mask = BIT(TRANSCODER_A) | BIT(TRANSCODER_B) | \
> >  		BIT(TRANSCODER_C) | BIT(TRANSCODER_D), \
> >  	.dbuf.size = 4096, \
> >
> 
> --
> ~Swati Sharma
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 09/10] drm/i915/step: Add intel_step_name() helper

2021-07-08 Thread Matt Roper
On Thu, Jul 08, 2021 at 04:18:20PM -0700, Anusha Srivatsa wrote:
> Add a helper to convert the step info to string.
> This is specifically useful when we want to load a specific
> firmware for a given stepping/substepping combination.

What if we use macros to generate the per-stepping code here as well as
the stepping values in the enum?

In intel_step.h:

#define STEPPING_NAME_LIST(func) \
	func(A0) \
	func(A1) \
	func(A2) \
	func(B0) \
	...

#define STEPPING_ENUM_VAL(name)  STEP_##name,

enum intel_step {
	STEP_NONE = 0,
	STEPPING_NAME_LIST(STEPPING_ENUM_VAL)
	STEP_FUTURE,
	STEP_FOREVER,
};

and in intel_step.c:

#define STEPPING_NAME_CASE(name)	\
	case STEP_##name:		\
		return #name;		\
		break;

const char *intel_step_name(enum intel_step step) {
	switch(step) {
	STEPPING_NAME_LIST(STEPPING_NAME_CASE)

	default:
		return "**";
	}
}

This has the advantage that anytime a new stepping is added (in
STEPPING_NAME_LIST) it will generate a new "STEP_XX" enum value and a
new case statement to return "XX" as the name; we won't have to remember
to update two separate places in the code.
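As a rough illustration of the list-macro idea described above, here is a minimal standalone sketch. All DEMO_/demo_ names and the three-entry list are hypothetical and chosen only for this example; they are not the identifiers or the stepping list used in the driver.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, shortened list; a real list would name every stepping. */
#define DEMO_STEPPING_LIST(func) \
	func(A0) \
	func(A1) \
	func(B0)

/* Turn one list entry into an enumerator, e.g. A0 becomes DEMO_STEP_A0. */
#define DEMO_STEP_ENUM_VAL(name) DEMO_STEP_##name,

enum demo_step {
	DEMO_STEP_NONE = 0,
	DEMO_STEPPING_LIST(DEMO_STEP_ENUM_VAL)
	DEMO_STEP_FUTURE,
};

/* Turn one list entry into a case label returning the stringified name. */
#define DEMO_STEP_NAME_CASE(name) \
	case DEMO_STEP_##name: \
		return #name;

const char *demo_step_name(enum demo_step step)
{
	switch (step) {
	DEMO_STEPPING_LIST(DEMO_STEP_NAME_CASE)
	default:
		return "**";
	}
}

/* Self-check: the enum values and the strings come from the same list. */
void demo_step_selftest(void)
{
	assert(strcmp(demo_step_name(DEMO_STEP_A1), "A1") == 0);
	assert(strcmp(demo_step_name(DEMO_STEP_FUTURE), "**") == 0);
}
```

Adding one more func(...) line to the hypothetical list would extend both the enum and the switch, which is the single-point-of-update property the suggestion is after.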


Matt

> 
> Suggested-by: Jani Nikula 
> Signed-off-by: Anusha Srivatsa 
> ---
>  drivers/gpu/drm/i915/intel_step.c | 58 +++
>  drivers/gpu/drm/i915/intel_step.h |  1 +
>  2 files changed, 59 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c
> index 99c0d3df001b..9af7f30b777e 100644
> --- a/drivers/gpu/drm/i915/intel_step.c
> +++ b/drivers/gpu/drm/i915/intel_step.c
> @@ -182,3 +182,61 @@ void intel_step_init(struct drm_i915_private *i915)
>  
>   RUNTIME_INFO(i915)->step = step;
>  }
> +
> +const char *intel_step_name(enum intel_step step) {
> + switch (step) {
> + case STEP_A0:
> + return "A0";
> + break;
> + case STEP_A1:
> + return "A1";
> + break;
> + case STEP_A2:
> + return "A2";
> + break;
> + case STEP_B0:
> + return "B0";
> + break;
> + case STEP_B1:
> + return "B1";
> + break;
> + case STEP_B2:
> + return "B2";
> + break;
> + case STEP_C0:
> + return "C0";
> + break;
> + case STEP_C1:
> + return "C1";
> + break;
> + case STEP_D0:
> + return "D0";
> + break;
> + case STEP_D1:
> + return "D1";
> + break;
> + case STEP_E0:
> + return "E0";
> + break;
> + case STEP_F0:
> + return "F0";
> + break;
> + case STEP_G0:
> + return "G0";
> + break;
> + case STEP_H0:
> + return "H0";
> + break;
> + case STEP_I0:
> + return "I0";
> + break;
> + case STEP_I1:
> + return "I1";
> + break;
> + case STEP_J0:
> + return "J0";
> + break;
> + default:
> + return "**";
> + }
> +}
> diff --git a/drivers/gpu/drm/i915/intel_step.h b/drivers/gpu/drm/i915/intel_step.h
> index 3e8b2babd9da..2fbe51483472 100644
> --- a/drivers/gpu/drm/i915/intel_step.h
> +++ b/drivers/gpu/drm/i915/intel_step.h
> @@ -43,5 +43,6 @@ enum intel_step {
>  };
>  
>  void intel_step_init(struct drm_i915_private *i915);
> +const char *intel_step_name(enum intel_step step);
>  
>  #endif /* __INTEL_STEP_H__ */
> -- 
> 2.32.0
> 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795


Re: [Intel-gfx] [PATCH 08/10] drm/i915/bxt: Use revid->stepping tables

2021-07-08 Thread Matt Roper
On Thu, Jul 08, 2021 at 04:18:19PM -0700, Anusha Srivatsa wrote:
> Switch BXT to use a revid->stepping table as we're trying to do on all
> platforms going forward.
> 
> Signed-off-by: Anusha Srivatsa 
> ---
>  drivers/gpu/drm/i915/intel_step.c | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c
> index c4ce02d22828..99c0d3df001b 100644
> --- a/drivers/gpu/drm/i915/intel_step.c
> +++ b/drivers/gpu/drm/i915/intel_step.c
> @@ -31,6 +31,15 @@ static const struct intel_step_info skl_revid_step_tbl[] = {
>   [0xA] = { .gt_step = STEP_I1, .display_step = STEP_I1 },
>  };
>  
> +static const struct intel_step_info bxt_revids[] = {
> + [0] = { .gt_step = STEP_A0 },
> + [1] = { .gt_step = STEP_A1 },
> + [2] = { .gt_step = STEP_A2 },
> + [6] = { .gt_step = STEP_B0 },
> + [7] = { .gt_step = STEP_B1 },
> + [8] = { .gt_step = STEP_B2 },

I realize the mistake originates from the #define's that you're
replacing with these tables, but the values in this table aren't the
correct GT/display steppings, but rather the SoC stepping; that's the
wrong thing for us to be matching on for workarounds, DMC versions, etc.
You want to use column #4 of the bspec table, not column #2.

Also we need to update this to use the proper revisions from the bspec;
most of the ones you have here were temporary placeholders before the
platform was released and the actual revisions that showed up in real
hardware are higher than any of your table entries.  If we take into
account the right-most column of the bspec we'd actually want:

static const struct intel_step_info bxt_revids[] = {
	[0xA] = { .gt_step = STEP_C0 },
	[0xB] = { .gt_step = STEP_C0 },
	[0xC] = { .gt_step = STEP_D0 },
	[0xD] = { .gt_step = STEP_E0 },
};
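
To make the effect of such a sparse table concrete, here is a small standalone sketch of a lookup over designated initializers. The sketch_ names and the lookup function are assumptions made for illustration; the driver's own intel_step_init() logic is not reproduced here and may treat gaps differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's enum and struct. */
enum sketch_step {
	SKETCH_STEP_NONE = 0,
	SKETCH_STEP_C0,
	SKETCH_STEP_D0,
	SKETCH_STEP_E0,
};

struct sketch_step_info {
	enum sketch_step gt_step;
};

/*
 * Entries mirror the table suggested above. Indices that are not listed
 * are zero-initialized, so they read back as SKETCH_STEP_NONE.
 */
static const struct sketch_step_info sketch_bxt_revids[] = {
	[0xA] = { .gt_step = SKETCH_STEP_C0 },
	[0xB] = { .gt_step = SKETCH_STEP_C0 },
	[0xC] = { .gt_step = SKETCH_STEP_D0 },
	[0xD] = { .gt_step = SKETCH_STEP_E0 },
};

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Simplified lookup: out-of-range or unlisted revids yield SKETCH_STEP_NONE. */
enum sketch_step sketch_lookup_gt_step(unsigned int revid)
{
	if (revid >= SKETCH_ARRAY_SIZE(sketch_bxt_revids))
		return SKETCH_STEP_NONE;

	return sketch_bxt_revids[revid].gt_step;
}

/* Self-check covering a listed revid, a gap, and an out-of-range revid. */
void sketch_lookup_selftest(void)
{
	assert(sketch_lookup_gt_step(0xB) == SKETCH_STEP_C0);
	assert(sketch_lookup_gt_step(0x2) == SKETCH_STEP_NONE);
	assert(sketch_lookup_gt_step(0xE) == SKETCH_STEP_NONE);
}
```

The array is sized by its highest designated index, so the table above has 14 slots with only the last four populated.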


Matt

> +};
> +
>  static const struct intel_step_info kbl_revid_step_tbl[] = {
>   [0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
>   [1] = { .gt_step = STEP_B0, .display_step = STEP_B0 },
> @@ -129,6 +138,9 @@ void intel_step_init(struct drm_i915_private *i915)
>   } else if (IS_KABYLAKE(i915)) {
>   revids = kbl_revid_step_tbl;
>   size = ARRAY_SIZE(kbl_revid_step_tbl);
> + } else if (IS_BROXTON(i915)) {
> + revids = bxt_revids;
> + size = ARRAY_SIZE(bxt_revids);
>   } else if (IS_SKYLAKE(i915)) {
>   revids = skl_revid_step_tbl;
>   size = ARRAY_SIZE(skl_revid_step_tbl);
> -- 
> 2.32.0
> 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795


[Intel-gfx] ✗ Fi.CI.BAT: failure for Get stepping info from RUNTIME_INFO->step

2021-07-08 Thread Patchwork
== Series Details ==

Series: Get stepping info from RUNTIME_INFO->step
URL   : https://patchwork.freedesktop.org/series/92346/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10320 -> Patchwork_20560


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20560 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20560, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20560/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20560:

### IGT changes ###

#### Possible regressions ####

  * igt@core_hotunplug@unbind-rebind:
- fi-bxt-dsi: [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10320/fi-bxt-dsi/igt@core_hotunp...@unbind-rebind.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20560/fi-bxt-dsi/igt@core_hotunp...@unbind-rebind.html

  
Known issues


  Here are the changes found in Patchwork_20560 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_sync@basic-each:
- fi-bdw-5557u:   [PASS][3] -> [INCOMPLETE][4] ([i915#2944])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10320/fi-bdw-5557u/igt@gem_s...@basic-each.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20560/fi-bdw-5557u/igt@gem_s...@basic-each.html

  
#### Warnings ####

  * igt@runner@aborted:
- fi-bdw-5557u:   [FAIL][5] ([i915#2722] / [i915#3744]) -> [FAIL][6] 
([i915#2722])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10320/fi-bdw-5557u/igt@run...@aborted.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20560/fi-bdw-5557u/igt@run...@aborted.html

  
  [i915#2722]: https://gitlab.freedesktop.org/drm/intel/issues/2722
  [i915#2944]: https://gitlab.freedesktop.org/drm/intel/issues/2944
  [i915#3744]: https://gitlab.freedesktop.org/drm/intel/issues/3744


Participating hosts (40 -> 39)
--

  Missing(1): fi-bsw-cyan 


Build changes
-

  * Linux: CI_DRM_10320 -> Patchwork_20560

  CI-20190529: 20190529
  CI_DRM_10320: 7d61ab4a59bcbb206324b6a430748b4c15dd8adb @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6132: 61fb9cdf2a9132e3618c8b08b9d20fec0c347831 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20560: 2ecb4107afb2f9dfa4c87599cdb00c099b3de1e3 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

2ecb4107afb2 drm/i915/dmc: Modify intel_get_stepping_info()
de45975f5a75 drm/i915/step: Add intel_step_name() helper
5b936277263e drm/i915/bxt: Use revid->stepping tables
2bc809b73f67 drm/i915/cnl: Drop all workarounds
da4af914b48b drm/i915/dg1: Use revid->stepping tables
95363c8b9dc9 drm/i915/rkl: Use revid->stepping tables
2f61797e047b drm/i915/jsl_ehl: Use revid->stepping tables
daee55a59631 drm/i915/icl: Use revid->stepping tables
8ad43b5ade6f drm/i915/skl: Use revid->stepping tables
c109c3b789aa drm/i915: Make pre-production detection use direct revid comparison

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20560/index.html


[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Get stepping info from RUNTIME_INFO->step

2021-07-08 Thread Patchwork
== Series Details ==

Series: Get stepping info from RUNTIME_INFO->step
URL   : https://patchwork.freedesktop.org/series/92346/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/i915/display/intel_display.c:1896:21:expected struct 
i915_vma *[assigned] vma
+drivers/gpu/drm/i915/display/intel_display.c:1896:21:got void [noderef] 
__iomem *[assigned] iomem
+drivers/gpu/drm/i915/display/intel_display.c:1896:21: warning: incorrect type 
in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34: warning: incorrect type 
in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct 
i915_address_space *
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect 
type in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect 
type in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_reset.c:1396:5: warning: context imbalance in 
'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/gt/intel_ring_submission.c:1210:24: warning: Using plain 
integer as NULL pointer
+drivers/gpu/drm/i915/i915_perf.c:1434:15: warning: memset with byte count of 
16777216
+drivers/gpu/drm/i915/i915_perf.c:1488:15: warning: memset with byte count of 
16777216
+./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative 
(-262080)
+./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative 
(-262080)
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: 

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Get stepping info from RUNTIME_INFO->step

2021-07-08 Thread Patchwork
== Series Details ==

Series: Get stepping info from RUNTIME_INFO->step
URL   : https://patchwork.freedesktop.org/series/92346/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
c109c3b789aa drm/i915: Make pre-production detection use direct revid comparison
8ad43b5ade6f drm/i915/skl: Use revid->stepping tables
-:54: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#54: FILE: drivers/gpu/drm/i915/i915_drv.h:1512:
+#define IS_SKL_GT_STEP(p, since, until) (IS_SKYLAKE(p) && IS_GT_STEP(p, since, 
until))

total: 0 errors, 0 warnings, 1 checks, 85 lines checked
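
For readers unfamiliar with the MACRO_ARG_REUSE check reported above, the following standalone sketch shows the kind of side effect it warns about. The REUSE_/reuse_ names are hypothetical and unrelated to the i915 macros; in the flagged macros the argument is normally a plain pointer, in which case evaluating it twice is harmless.

```c
#include <assert.h>

/* Hypothetical macro that names its first argument twice. */
#define REUSE_IN_RANGE(p, lo, hi) ((p) >= (lo) && (p) <= (hi))

static int reuse_calls;

/* Hypothetical accessor with a visible side effect: it counts its calls. */
int reuse_next_value(void)
{
	reuse_calls++;
	return 5;
}

/* Returns how many times the macro evaluated its first argument. */
int reuse_count_evaluations(void)
{
	reuse_calls = 0;
	(void)REUSE_IN_RANGE(reuse_next_value(), 0, 10);
	return reuse_calls;
}

/* Self-check: the argument expression runs once per use in the macro body. */
void reuse_selftest(void)
{
	assert(reuse_count_evaluations() == 2);
}
```

Passing an expression without side effects, such as a local pointer variable, avoids the problem without changing the macro.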
daee55a59631 drm/i915/icl: Use revid->stepping tables
-:93: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#93: FILE: drivers/gpu/drm/i915/i915_drv.h:1519:
+#define IS_ICL_GT_STEP(p, since, until) \
+   (IS_ICELAKE(p) && IS_GT_STEP(p, since, until))

total: 0 errors, 0 warnings, 1 checks, 94 lines checked
2f61797e047b drm/i915/jsl_ehl: Use revid->stepping tables
-:51: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#51: FILE: drivers/gpu/drm/i915/i915_drv.h:1522:
+#define IS_JSL_EHL_GT_STEP(p, since, until) \
+   (IS_JSL_EHL(p) && IS_GT_STEP(p, since, until))

-:53: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#53: FILE: drivers/gpu/drm/i915/i915_drv.h:1524:
+#define IS_JSL_EHL_DISPLAY_STEP(p, since, until) \
+   (IS_JSL_EHL(p) && IS_DISPLAY_STEP(p, since, until))

total: 0 errors, 0 warnings, 2 checks, 51 lines checked
95363c8b9dc9 drm/i915/rkl: Use revid->stepping tables
-:48: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#48: FILE: drivers/gpu/drm/i915/i915_drv.h:1539:
+#define IS_RKL_DISPLAY_STEP(p, since, until) \
+   (IS_ROCKETLAKE(p) && IS_DISPLAY_STEP(p, since, until))

total: 0 errors, 0 warnings, 1 checks, 51 lines checked
da4af914b48b drm/i915/dg1: Use revid->stepping tables
-:124: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#124: FILE: drivers/gpu/drm/i915/i915_drv.h:1533:
+#define IS_DG1_GT_STEP(p, since, until) \
+   (IS_DG1(p) && IS_GT_STEP(p, since, until))

-:126: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'p' - possible side-effects?
#126: FILE: drivers/gpu/drm/i915/i915_drv.h:1535:
+#define IS_DG1_DISPLAY_STEP(p, since, until) \
+   (IS_DG1(p) && IS_DISPLAY_STEP(p, since, until))

total: 0 errors, 0 warnings, 2 checks, 118 lines checked
2bc809b73f67 drm/i915/cnl: Drop all workarounds
5b936277263e drm/i915/bxt: Use revid->stepping tables
de45975f5a75 drm/i915/step: Add intel_step_name() helper
-:22: ERROR:OPEN_BRACE: open brace '{' following function definitions go on the 
next line
#22: FILE: drivers/gpu/drm/i915/intel_step.c:186:
+const char *intel_step_name(enum intel_step step) {

-:26: WARNING:UNNECESSARY_BREAK: break is not useful after a return
#26: FILE: drivers/gpu/drm/i915/intel_step.c:190:
+   return "A0";
+   break;

-:29: WARNING:UNNECESSARY_BREAK: break is not useful after a return
#29: FILE: drivers/gpu/drm/i915/intel_step.c:193:
+   return "A1";
+   break;

-:32: WARNING:UNNECESSARY_BREAK: break is not useful after a return
#32: FILE: drivers/gpu/drm/i915/intel_step.c:196:
+   return "A2";
+   break;

-:35: WARNING:UNNECESSARY_BREAK: break is not useful after a return
#35: FILE: drivers/gpu/drm/i915/intel_step.c:199:
+   return "B0";
+   break;

-:38: WARNING:UNNECESSARY_BREAK: break is not useful after a return
#38: FILE: drivers/gpu/drm/i915/intel_step.c:202:
+   return "B1";
+   break;

-:41: WARNING:UNNECESSARY_BREAK: break is not useful after a return
#41: FILE: drivers/gpu/drm/i915/intel_step.c:205:
+   return "B2";
+   break;

-:44: WARNING:UNNECESSARY_BREAK: break is not useful after a return
#44: FILE: drivers/gpu/drm/i915/intel_step.c:208:
+   return "C0";
+   break;

-:47: WARNING:UNNECESSARY_BREAK: break is not useful after a return
#47: FILE: drivers/gpu/drm/i915/intel_step.c:211:
+   return "C1";
+   break;

-:50: WARNING:UNNECESSARY_BREAK: break is not useful after a return
#50: FILE: drivers/gpu/drm/i915/intel_step.c:214:
+   return "D0";
+   break;

-:53: WARNING:UNNECESSARY_BREAK: break is not useful after a return
#53: FILE: drivers/gpu/drm/i915/intel_step.c:217:
+   return "D1";
+   break;

-:56: WARNING:UNNECESSARY_BREAK: break is not useful after a return
#56: FILE: drivers/gpu/drm/i915/intel_step.c:220:
+   return "E0";
+   break;

-:59: WARNING:UNNECESSARY_BREAK: break is not useful after a return
#59: FILE: drivers/gpu/drm/i915/intel_step.c:223:
+   return "F0";
+   break;

-:62: WARNING:UNNECESSARY_BREAK: break is not useful after a return
#62: FILE: drivers/gpu/drm/i915/intel_step.c:226:
+   return "G0";

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/sched dependency tracking and dma-resv fixes (rev2)

2021-07-08 Thread Patchwork
== Series Details ==

Series: drm/sched dependency tracking and dma-resv fixes (rev2)
URL   : https://patchwork.freedesktop.org/series/92333/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10320 -> Patchwork_20559


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20559/index.html


Changes
---

  No changes found


Participating hosts (40 -> 39)
--

  Missing(1): fi-bsw-cyan 


Build changes
-

  * Linux: CI_DRM_10320 -> Patchwork_20559

  CI-20190529: 20190529
  CI_DRM_10320: 7d61ab4a59bcbb206324b6a430748b4c15dd8adb @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6132: 61fb9cdf2a9132e3618c8b08b9d20fec0c347831 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20559: ca32cb2b56ef01204baff47cde1058e14ddf23ca @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

ca32cb2b56ef dma-resv: Give the docs a do-over
10b231f895fe drm/i915: Don't break exclusive fence ordering
56fc78cd5073 drm/i915: delete exclude argument from 
i915_sw_fence_await_reservation
d022184c169b drm/etnaviv: Don't break exclusive fence ordering
9e395399b983 drm/msm: always wait for the exclusive fence
b149f2ca5c3f drm/msm: Don't break exclusive fence ordering
6219e79695d6 drm/sched: Check locking in drm_sched_job_await_implicit
588331541d89 drm/sched: Don't store self-dependencies
28feb868173e drm/gem: Delete gem array fencing helpers
5e34a7c752a3 drm/etnaviv: Use scheduler dependency handling
d4fa82c9c97f drm/v3d: Use scheduler dependency handling
34e837ac1d7b drm/v3d: Move drm_sched_job_init to v3d_job_init
a51bbbac956b drm/lima: use scheduler dependency tracking
dca5c513ac3d drm/panfrost: use scheduler dependency tracking
f46108672694 drm/sched: improve docs around drm_sched_entity
6f8a6dac8300 drm/sched: drop entity parameter from drm_sched_push_job
2554ae9762d1 drm/sched: Add dependency tracking
998b636c6f3c drm/sched: Barriers are needed for entity->last_scheduled
3afe40cffa3e drm/sched: Split drm_sched_job_init
aa63f4780928 drm/sched: entity->rq selection cannot fail

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20559/index.html


[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/sched dependency tracking and dma-resv fixes (rev2)

2021-07-08 Thread Patchwork
== Series Details ==

Series: drm/sched dependency tracking and dma-resv fixes (rev2)
URL   : https://patchwork.freedesktop.org/series/92333/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
aa63f4780928 drm/sched: entity->rq selection cannot fail
-:52: WARNING:AVOID_BUG: Avoid crashing the kernel - try using WARN_ON & 
recovery code rather than BUG() or BUG_ON()
#52: FILE: drivers/gpu/drm/scheduler/sched_main.c:584:
+   BUG_ON(!entity->rq);

-:55: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 2 warnings, 0 checks, 17 lines checked
3afe40cffa3e drm/sched: Split drm_sched_job_init
-:205: WARNING:UNSPECIFIED_INT: Prefer 'unsigned int' to bare use of 'unsigned'
#205: FILE: drivers/gpu/drm/scheduler/sched_fence.c:173:
+   unsigned seq;

-:284: WARNING:AVOID_BUG: Avoid crashing the kernel - try using WARN_ON & 
recovery code rather than BUG() or BUG_ON()
#284: FILE: drivers/gpu/drm/scheduler/sched_main.c:614:
+   BUG_ON(!entity);

-:365: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#365: FILE: include/drm/gpu_scheduler.h:391:
+struct drm_sched_fence *drm_sched_fence_alloc(

-:373: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 3 warnings, 1 checks, 235 lines checked
998b636c6f3c drm/sched: Barriers are needed for entity->last_scheduled
-:86: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 43 lines checked
2554ae9762d1 drm/sched: Add dependency tracking
-:193: CHECK:LINE_SPACING: Please don't use multiple blank lines
#193: FILE: drivers/gpu/drm/scheduler/sched_main.c:722:
+
+

-:268: WARNING:TYPO_SPELLING: 'ommitted' may be misspelled - perhaps 'omitted'?
#268: FILE: include/drm/gpu_scheduler.h:243:
+* drm_sched_job_await_implicit() this can be ommitted and left as NULL.
  

-:282: CHECK:LINE_SPACING: Please don't use multiple blank lines
#282: FILE: include/drm/gpu_scheduler.h:376:
+
+

-:285: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 2 warnings, 2 checks, 227 lines checked
6f8a6dac8300 drm/sched: drop entity parameter from drm_sched_push_job
-:203: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 102 lines checked
f46108672694 drm/sched: improve docs around drm_sched_entity
-:14: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit 620e762f9a98 ("drm/scheduler: 
move entity handling into separate file")'
#14: 
  move here: 620e762f9a98 ("drm/scheduler: move entity handling into

-:396: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 1 warnings, 0 checks, 346 lines checked
dca5c513ac3d drm/panfrost: use scheduler dependency tracking
-:209: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 157 lines checked
a51bbbac956b drm/lima: use scheduler dependency tracking
-:114: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 73 lines checked
34e837ac1d7b drm/v3d: Move drm_sched_job_init to v3d_job_init
-:356: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 302 lines checked
d4fa82c9c97f drm/v3d: Use scheduler dependency handling
-:202: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 161 lines checked
5e34a7c752a3 drm/etnaviv: Use scheduler dependency handling
-:13: WARNING:REPEATED_WORD: Possible repeated word: 'to'
#13: 
I wanted to to in the previous round (and did, for all other drivers).

-:119: WARNING:LINE_SPACING: Missing a blank line after declarations
#119: FILE: drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:551:
+   struct dma_fence *in_fence = 
sync_file_get_fence(args->fence_fd);
+   if (!in_fence) {

-:293: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 3 warnings, 0 checks, 241 lines checked
28feb868173e drm/gem: Delete gem array 

Re: [Intel-gfx] [PATCH 52/53] drm/i915/dg2: Update to bigjoiner path

2021-07-08 Thread Navare, Manasi
On Thu, Jul 01, 2021 at 01:24:26PM -0700, Matt Roper wrote:
> From: Animesh Manna 
> 
> In verify_mpllb_state(), the encoder is retrieved from the best_encoder
> of the connector_state. With bigjoiner there is only one
> connector_state, so checking the encoder may not be needed for the
> bigjoiner slave. This MPLL code path runs on DG2 and needs this fix
> to avoid a NULL pointer dereference.
> 
> Cc: Manasi Navare 
> Signed-off-by: Animesh Manna 
> Signed-off-by: Matt Roper 

Reviewed-by: Manasi Navare 

Manasi

> ---
>  drivers/gpu/drm/i915/display/intel_display.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index 9655f1b1b41b..3f4e811145b6 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -9153,6 +9153,9 @@ verify_mpllb_state(struct intel_atomic_state *state,
>   if (!new_crtc_state->hw.active)
>   return;
>  
> + if (new_crtc_state->bigjoiner_slave)
> + return;
> +
>   encoder = intel_get_crtc_new_encoder(state, new_crtc_state);
>   intel_mpllb_readout_hw_state(encoder, &mpllb_hw_state);
>  
> -- 
> 2.25.4
> 


[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/7] drm/i915: Settle on "adl-x" in WA comments

2021-07-08 Thread Patchwork
== Series Details ==

Series: series starting with [1/7] drm/i915: Settle on "adl-x" in WA comments
URL   : https://patchwork.freedesktop.org/series/92342/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10320 -> Patchwork_20558


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20558/index.html

Known issues


  Here are the changes found in Patchwork_20558 that come from known issues:

### IGT changes ###

  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888


Participating hosts (40 -> 39)
--

  Missing(1): fi-bsw-cyan 


Build changes
-

  * Linux: CI_DRM_10320 -> Patchwork_20558

  CI-20190529: 20190529
  CI_DRM_10320: 7d61ab4a59bcbb206324b6a430748b4c15dd8adb @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6132: 61fb9cdf2a9132e3618c8b08b9d20fec0c347831 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20558: 0be142e780934cd09e46b4699fe3f7cd9b7adde0 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

0be142e78093 drm/i915/display/xelpd: Exetend Wa_14011508470
8d9f68df0f34 drm/i915/display/adl_p: Correctly program MBUS DBOX A credits
ccd65a2b8785 drm/i915: Limit Wa_22010178259 to affected platforms
4818897f7082 drm/i915: Limit maximum number of memory channels
1a869323ea02 drm/i915/adl_s: Extend Wa_1406941453
6cea025795d1 drm/i915: Implement Wa_1508744258
78692bfbbdb4 drm/i915: Settle on "adl-x" in WA comments

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20558/index.html


[Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [1/7] drm/i915: Settle on "adl-x" in WA comments

2021-07-08 Thread Patchwork
== Series Details ==

Series: series starting with [1/7] drm/i915: Settle on "adl-x" in WA comments
URL   : https://patchwork.freedesktop.org/series/92342/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/i915/display/intel_display.c:1896:21:expected struct 
i915_vma *[assigned] vma
+drivers/gpu/drm/i915/display/intel_display.c:1896:21:got void [noderef] 
__iomem *[assigned] iomem
+drivers/gpu/drm/i915/display/intel_display.c:1896:21: warning: incorrect type 
in assignment (different address spaces)




[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/dg1: Compute MEM Bandwidth using MCHBAR (rev2)

2021-07-08 Thread Patchwork
== Series Details ==

Series: drm/i915/dg1: Compute MEM Bandwidth using MCHBAR (rev2)
URL   : https://patchwork.freedesktop.org/series/92094/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10320 -> Patchwork_20557


Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20557/index.html

Known issues


  Here are the changes found in Patchwork_20557 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s0:
- fi-cfl-8109u:   [PASS][1] -> [INCOMPLETE][2] ([i915#155])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10320/fi-cfl-8109u/igt@gem_exec_susp...@basic-s0.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20557/fi-cfl-8109u/igt@gem_exec_susp...@basic-s0.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#155]: https://gitlab.freedesktop.org/drm/intel/issues/155
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888


Participating hosts (40 -> 39)
------------------------------

  Missing(1): fi-bsw-cyan 


Build changes
-------------

  * Linux: CI_DRM_10320 -> Patchwork_20557

  CI-20190529: 20190529
  CI_DRM_10320: 7d61ab4a59bcbb206324b6a430748b4c15dd8adb @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6132: 61fb9cdf2a9132e3618c8b08b9d20fec0c347831 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20557: 11350b4a8bc8324518472387c20aebe74ec93069 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

11350b4a8bc8 drm/i915/dg1: Compute MEM Bandwidth using MCHBAR

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20557/index.html


[Intel-gfx] [PATCH 09/10] drm/i915/step: Add intel_step_name() helper

2021-07-08 Thread Anusha Srivatsa
Add a helper to convert the step info to a string.
This is specifically useful when we want to load a specific
firmware for a given stepping/substepping combination.

Suggested-by: Jani Nikula 
Signed-off-by: Anusha Srivatsa 
---
 drivers/gpu/drm/i915/intel_step.c | 58 +++
 drivers/gpu/drm/i915/intel_step.h |  1 +
 2 files changed, 59 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_step.c 
b/drivers/gpu/drm/i915/intel_step.c
index 99c0d3df001b..9af7f30b777e 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -182,3 +182,61 @@ void intel_step_init(struct drm_i915_private *i915)
 
RUNTIME_INFO(i915)->step = step;
 }
+
+const char *intel_step_name(enum intel_step step) {
+   switch (step) {
+   case STEP_A0:
+   return "A0";
+   break;
+   case STEP_A1:
+   return "A1";
+   break;
+   case STEP_A2:
+   return "A2";
+   break;
+   case STEP_B0:
+   return "B0";
+   break;
+   case STEP_B1:
+   return "B1";
+   break;
+   case STEP_B2:
+   return "B2";
+   break;
+   case STEP_C0:
+   return "C0";
+   break;
+   case STEP_C1:
+   return "C1";
+   break;
+   case STEP_D0:
+   return "D0";
+   break;
+   case STEP_D1:
+   return "D1";
+   break;
+   case STEP_E0:
+   return "E0";
+   break;
+   case STEP_F0:
+   return "F0";
+   break;
+   case STEP_G0:
+   return "G0";
+   break;
+   case STEP_H0:
+   return "H0";
+   break;
+   case STEP_I0:
+   return "I0";
+   break;
+   case STEP_I1:
+   return "I1";
+   break;
+   case STEP_J0:
+   return "J0";
+   break;
+   default:
+   return "**";
+   }
+}
diff --git a/drivers/gpu/drm/i915/intel_step.h 
b/drivers/gpu/drm/i915/intel_step.h
index 3e8b2babd9da..2fbe51483472 100644
--- a/drivers/gpu/drm/i915/intel_step.h
+++ b/drivers/gpu/drm/i915/intel_step.h
@@ -43,5 +43,6 @@ enum intel_step {
 };
 
 void intel_step_init(struct drm_i915_private *i915);
+const char *intel_step_name(enum intel_step step);
 
 #endif /* __INTEL_STEP_H__ */
-- 
2.32.0
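
The switch in intel_step_name() above could also be written as a lookup table. The following is only a minimal standalone sketch, not the posted patch: the enum is a hypothetical subset (the real one lives in intel_step.h and is not shown in full here), and step_names and intel_step_name_selftest() are illustrative names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of the driver's enum; the real one lives in intel_step.h. */
enum intel_step {
	STEP_NONE = 0,
	STEP_A0,
	STEP_A1,
	STEP_B0,
	STEP_C0,
};

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Names indexed by stepping; entries that are not listed stay NULL. */
static const char *const step_names[] = {
	[STEP_A0] = "A0",
	[STEP_A1] = "A1",
	[STEP_B0] = "B0",
	[STEP_C0] = "C0",
};

/* Return the two-character name, or "**" for anything without an entry. */
const char *intel_step_name(enum intel_step step)
{
	if ((size_t)step < ARRAY_SIZE(step_names) && step_names[step])
		return step_names[step];

	return "**";
}

/* Usage check: known steppings map to their names, unknown ones to "**". */
void intel_step_name_selftest(void)
{
	assert(strcmp(intel_step_name(STEP_B0), "B0") == 0);
	assert(strcmp(intel_step_name(STEP_NONE), "**") == 0);
}
```

A table keeps each stepping on one line and avoids the unreachable break statements after each return, at the cost of one bounds check.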



[Intel-gfx] [PATCH 07/10] drm/i915/cnl: Drop all workarounds

2021-07-08 Thread Anusha Srivatsa
From: Matt Roper 

All of the Cannon Lake hardware that came out had graphics fused off,
and our userspace drivers have already dropped their support for the
platform; CNL-specific code in i915 that isn't inherited by subsequent
platforms is effectively dead code.  Let's remove all of the
CNL-specific workarounds as a quick and easy first step.

References: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6899
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 55 -
 1 file changed, 55 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 62321e9149db..9b257a394305 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -514,35 +514,6 @@ static void cfl_ctx_workarounds_init(struct 
intel_engine_cs *engine,
 GEN7_SBE_SS_CACHE_DISPATCH_PORT_SHARING_DISABLE);
 }
 
-static void cnl_ctx_workarounds_init(struct intel_engine_cs *engine,
-struct i915_wa_list *wal)
-{
-   /* WaForceContextSaveRestoreNonCoherent:cnl */
-   wa_masked_en(wal, CNL_HDC_CHICKEN0,
-HDC_FORCE_CONTEXT_SAVE_RESTORE_NON_COHERENT);
-
-   /* WaDisableReplayBufferBankArbitrationOptimization:cnl */
-   wa_masked_en(wal, COMMON_SLICE_CHICKEN2,
-GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
-
-   /* WaPushConstantDereferenceHoldDisable:cnl */
-   wa_masked_en(wal, GEN7_ROW_CHICKEN2, PUSH_CONSTANT_DEREF_DISABLE);
-
-   /* FtrEnableFastAnisoL1BankingFix:cnl */
-   wa_masked_en(wal, HALF_SLICE_CHICKEN3, CNL_FAST_ANISO_L1_BANKING_FIX);
-
-   /* WaDisable3DMidCmdPreemption:cnl */
-   wa_masked_dis(wal, GEN8_CS_CHICKEN1, GEN9_PREEMPT_3D_OBJECT_LEVEL);
-
-   /* WaDisableGPGPUMidCmdPreemption:cnl */
-   wa_masked_field_set(wal, GEN8_CS_CHICKEN1,
-   GEN9_PREEMPT_GPGPU_LEVEL_MASK,
-   GEN9_PREEMPT_GPGPU_COMMAND_LEVEL);
-
-   /* WaDisableEarlyEOT:cnl */
-   wa_masked_en(wal, GEN8_ROW_CHICKEN, DISABLE_EARLY_EOT);
-}
-
 static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
 struct i915_wa_list *wal)
 {
@@ -704,8 +675,6 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
gen12_ctx_workarounds_init(engine, wal);
else if (GRAPHICS_VER(i915) == 11)
icl_ctx_workarounds_init(engine, wal);
-   else if (IS_CANNONLAKE(i915))
-   cnl_ctx_workarounds_init(engine, wal);
else if (IS_COFFEELAKE(i915) || IS_COMETLAKE(i915))
cfl_ctx_workarounds_init(engine, wal);
else if (IS_GEMINILAKE(i915))
@@ -982,15 +951,6 @@ icl_wa_init_mcr(struct drm_i915_private *i915, struct 
i915_wa_list *wal)
wa_write_clr_set(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
 }
 
-static void
-cnl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list 
*wal)
-{
-   /* WaInPlaceDecompressionHang:cnl */
-   wa_write_or(wal,
-   GEN9_GAMT_ECO_REG_RW_IA,
-   GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
-}
-
 static void
 icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list 
*wal)
 {
@@ -1140,8 +1100,6 @@ gt_init_workarounds(struct drm_i915_private *i915, struct 
i915_wa_list *wal)
gen12_gt_workarounds_init(i915, wal);
else if (GRAPHICS_VER(i915) == 11)
icl_gt_workarounds_init(i915, wal);
-   else if (IS_CANNONLAKE(i915))
-   cnl_gt_workarounds_init(i915, wal);
else if (IS_COFFEELAKE(i915) || IS_COMETLAKE(i915))
cfl_gt_workarounds_init(i915, wal);
else if (IS_GEMINILAKE(i915))
@@ -1418,17 +1376,6 @@ static void cml_whitelist_build(struct intel_engine_cs 
*engine)
cfl_whitelist_build(engine);
 }
 
-static void cnl_whitelist_build(struct intel_engine_cs *engine)
-{
-   struct i915_wa_list *w = &engine->whitelist;
-
-   if (engine->class != RENDER_CLASS)
-   return;
-
-   /* WaEnablePreemptionGranularityControlByUMD:cnl */
-   whitelist_reg(w, GEN8_CS_CHICKEN1);
-}
-
 static void icl_whitelist_build(struct intel_engine_cs *engine)
 {
struct i915_wa_list *w = &engine->whitelist;
@@ -1542,8 +1489,6 @@ void intel_engine_init_whitelist(struct intel_engine_cs 
*engine)
tgl_whitelist_build(engine);
else if (GRAPHICS_VER(i915) == 11)
icl_whitelist_build(engine);
-   else if (IS_CANNONLAKE(i915))
-   cnl_whitelist_build(engine);
else if (IS_COMETLAKE(i915))
cml_whitelist_build(engine);
else if (IS_COFFEELAKE(i915))
-- 
2.32.0



[Intel-gfx] [PATCH 05/10] drm/i915/rkl: Use revid->stepping tables

2021-07-08 Thread Anusha Srivatsa
From: Matt Roper 

Switch RKL to use a revid->stepping table as we're trying to do on all
platforms going forward.

Bspec: 44501
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_psr.c | 4 ++--
 drivers/gpu/drm/i915/i915_drv.h  | 8 ++--
 drivers/gpu/drm/i915/intel_step.c| 9 +
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_psr.c 
b/drivers/gpu/drm/i915/display/intel_psr.c
index 9643624fe160..74b2aa3c2946 100644
--- a/drivers/gpu/drm/i915/display/intel_psr.c
+++ b/drivers/gpu/drm/i915/display/intel_psr.c
@@ -594,7 +594,7 @@ static void hsw_activate_psr2(struct intel_dp *intel_dp)
if (intel_dp->psr.psr2_sel_fetch_enabled) {
/* WA 1408330847 */
if (IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) ||
-   IS_RKL_REVID(dev_priv, RKL_REVID_A0, RKL_REVID_A0))
+   IS_RKL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0))
intel_de_rmw(dev_priv, CHICKEN_PAR1_1,
 DIS_RAM_BYPASS_PSR2_MAN_TRACK,
 DIS_RAM_BYPASS_PSR2_MAN_TRACK);
@@ -1342,7 +1342,7 @@ static void intel_psr_disable_locked(struct intel_dp 
*intel_dp)
/* WA 1408330847 */
if (intel_dp->psr.psr2_sel_fetch_enabled &&
(IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) ||
-IS_RKL_REVID(dev_priv, RKL_REVID_A0, RKL_REVID_A0)))
+IS_RKL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0)))
intel_de_rmw(dev_priv, CHICKEN_PAR1_1,
 DIS_RAM_BYPASS_PSR2_MAN_TRACK, 0);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 78db92bbb1c6..592e7177202e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1536,12 +1536,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
(IS_TIGERLAKE(__i915) && !(IS_TGL_U(__i915) || IS_TGL_Y(__i915)) && \
 IS_GT_STEP(__i915, since, until))
 
-#define RKL_REVID_A0   0x0
-#define RKL_REVID_B0   0x1
-#define RKL_REVID_C0   0x4
-
-#define IS_RKL_REVID(p, since, until) \
-   (IS_ROCKETLAKE(p) && IS_REVID(p, since, until))
+#define IS_RKL_DISPLAY_STEP(p, since, until) \
+   (IS_ROCKETLAKE(p) && IS_DISPLAY_STEP(p, since, until))
 
 #define DG1_REVID_A0   0x0
 #define DG1_REVID_B0   0x1
diff --git a/drivers/gpu/drm/i915/intel_step.c 
b/drivers/gpu/drm/i915/intel_step.c
index 61666a3dd672..1593ab25f41a 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -69,6 +69,12 @@ static const struct intel_step_info tgl_revid_step_tbl[] = {
[1] = { .gt_step = STEP_B0, .display_step = STEP_D0 },
 };
 
+static const struct intel_step_info rkl_revid_step_tbl[] = {
+   [0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
+   [1] = { .gt_step = STEP_B0, .display_step = STEP_B0 },
+   [4] = { .gt_step = STEP_C0, .display_step = STEP_C0 },
+};
+
 static const struct intel_step_info adls_revid_step_tbl[] = {
[0x0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
[0x1] = { .gt_step = STEP_A0, .display_step = STEP_A2 },
@@ -97,6 +103,9 @@ void intel_step_init(struct drm_i915_private *i915)
} else if (IS_ALDERLAKE_S(i915)) {
revids = adls_revid_step_tbl;
size = ARRAY_SIZE(adls_revid_step_tbl);
+   } else if (IS_ROCKETLAKE(i915)) {
+   revids = rkl_revid_step_tbl;
+   size = ARRAY_SIZE(rkl_revid_step_tbl);
} else if (IS_TGL_U(i915) || IS_TGL_Y(i915)) {
revids = tgl_uy_revid_step_tbl;
size = ARRAY_SIZE(tgl_uy_revid_step_tbl);
-- 
2.32.0
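
To illustrate how a sparse revid->stepping table such as rkl_revid_step_tbl behaves, here is a minimal standalone sketch. The types are reduced stand-ins, and lookup_step() and rkl_lookup() are hypothetical helpers; in particular, returning STEP_NONE for holes and out-of-range revisions is an assumption of this sketch, since this patch does not show how intel_step_init() treats them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-ins for the driver's types. */
enum intel_step { STEP_NONE = 0, STEP_A0, STEP_B0, STEP_C0 };

struct intel_step_info {
	uint8_t gt_step;
	uint8_t display_step;
};

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Same shape as rkl_revid_step_tbl: revids 2 and 3 are zero-filled holes. */
static const struct intel_step_info rkl_tbl_sketch[] = {
	[0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
	[1] = { .gt_step = STEP_B0, .display_step = STEP_B0 },
	[4] = { .gt_step = STEP_C0, .display_step = STEP_C0 },
};

/* Index the table by PCI revision ID; anything past the end is STEP_NONE. */
struct intel_step_info lookup_step(const struct intel_step_info *tbl,
				   size_t size, unsigned int revid)
{
	struct intel_step_info none = { STEP_NONE, STEP_NONE };

	if (revid >= size)
		return none;

	return tbl[revid];
}

/* Convenience wrapper for the sketch table above. */
struct intel_step_info rkl_lookup(unsigned int revid)
{
	return lookup_step(rkl_tbl_sketch, ARRAY_SIZE(rkl_tbl_sketch), revid);
}

/* Usage check: revid 4 is C0, revid 2 is a hole. */
void rkl_lookup_selftest(void)
{
	assert(rkl_lookup(4).gt_step == STEP_C0);
	assert(rkl_lookup(2).gt_step == STEP_NONE);
}
```

Designated initializers leave the unlisted indices zeroed, so a hole is distinguishable from a real stepping only because zero is reserved for STEP_NONE in this sketch.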



[Intel-gfx] [PATCH 10/10] drm/i915/dmc: Modify intel_get_stepping_info()

2021-07-08 Thread Anusha Srivatsa
With all platforms having the stepping info in intel_step.c,
it makes no sense to maintain a separate lookup table
in intel_dmc.c. Let's modify intel_get_stepping_info()
to grab stepping info from the central location towards
which everything is moving.

Cc: Jani Nikula 
Signed-off-by: Anusha Srivatsa 
---
 drivers/gpu/drm/i915/display/intel_dmc.c | 51 +---
 1 file changed, 9 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dmc.c 
b/drivers/gpu/drm/i915/display/intel_dmc.c
index f8789d4543bf..895bee8f9782 100644
--- a/drivers/gpu/drm/i915/display/intel_dmc.c
+++ b/drivers/gpu/drm/i915/display/intel_dmc.c
@@ -247,50 +247,16 @@ bool intel_dmc_has_payload(struct drm_i915_private *i915)
return i915->dmc.dmc_info[DMC_FW_MAIN].payload;
 }
 
-static const struct stepping_info skl_stepping_info[] = {
-   {'A', '0'}, {'B', '0'}, {'C', '0'},
-   {'D', '0'}, {'E', '0'}, {'F', '0'},
-   {'G', '0'}, {'H', '0'}, {'I', '0'},
-   {'J', '0'}, {'K', '0'}
-};
-
-static const struct stepping_info bxt_stepping_info[] = {
-   {'A', '0'}, {'A', '1'}, {'A', '2'},
-   {'B', '0'}, {'B', '1'}, {'B', '2'}
-};
-
-static const struct stepping_info icl_stepping_info[] = {
-   {'A', '0'}, {'A', '1'}, {'A', '2'},
-   {'B', '0'}, {'B', '2'},
-   {'C', '0'}
-};
-
-static const struct stepping_info no_stepping_info = { '*', '*' };
-
 static const struct stepping_info *
-intel_get_stepping_info(struct drm_i915_private *dev_priv)
+intel_get_stepping_info(struct drm_i915_private *dev_priv,
+   struct stepping_info *si)
 {
-   const struct stepping_info *si;
-   unsigned int size;
-
-   if (IS_ICELAKE(dev_priv)) {
-   size = ARRAY_SIZE(icl_stepping_info);
-   si = icl_stepping_info;
-   } else if (IS_SKYLAKE(dev_priv)) {
-   size = ARRAY_SIZE(skl_stepping_info);
-   si = skl_stepping_info;
-   } else if (IS_BROXTON(dev_priv)) {
-   size = ARRAY_SIZE(bxt_stepping_info);
-   si = bxt_stepping_info;
-   } else {
-   size = 0;
-   si = NULL;
-   }
-
-   if (INTEL_REVID(dev_priv) < size)
-   return si + INTEL_REVID(dev_priv);
+   struct intel_step_info step = RUNTIME_INFO(dev_priv)->step;
+   const char *step_name = intel_step_name(step.display_step);
 
-   return &no_stepping_info;
+   si->stepping = step_name[0];
+   si->substepping = step_name[1];
+   return si;
 }
 
 static void gen9_set_dc_state_debugmask(struct drm_i915_private *dev_priv)
@@ -616,7 +582,8 @@ static void parse_dmc_fw(struct drm_i915_private *dev_priv,
struct intel_package_header *package_header;
struct intel_dmc_header_base *dmc_header;
struct intel_dmc *dmc = &dev_priv->dmc;
-   const struct stepping_info *si = intel_get_stepping_info(dev_priv);
+   struct stepping_info display_info = { '*', '*'};
+   const struct stepping_info *si = intel_get_stepping_info(dev_priv, 
&display_info);
u32 readcount = 0;
u32 r, offset;
int id;
-- 
2.32.0
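
The new intel_get_stepping_info() splits a two-character step name into a stepping letter and a substepping digit. A minimal standalone sketch of that split follows; fill_stepping_info() is a hypothetical name, and the length check is an addition of this sketch (the patch indexes step_name[0] and step_name[1] directly, relying on the helper always returning at least two characters).

```c
#include <assert.h>
#include <string.h>

/* Mirrors the character pair used by the DMC code in the patch. */
struct stepping_info {
	char stepping;
	char substepping;
};

/*
 * Fill *si from a name such as "B1". If the name is missing or shorter
 * than two characters, the caller's defaults are left untouched.
 */
const struct stepping_info *fill_stepping_info(const char *step_name,
					       struct stepping_info *si)
{
	if (step_name && strlen(step_name) >= 2) {
		si->stepping = step_name[0];
		si->substepping = step_name[1];
	}

	return si;
}

/* Usage check: "B1" becomes { 'B', '1' }; "**" keeps the wildcard pair. */
void fill_stepping_info_selftest(void)
{
	struct stepping_info si = { '*', '*' };

	fill_stepping_info("B1", &si);
	assert(si.stepping == 'B' && si.substepping == '1');

	fill_stepping_info("**", &si);
	assert(si.stepping == '*' && si.substepping == '*');
}
```

Because the helper's fallback string is "**", an unknown stepping produces the same { '*', '*' } pair that the removed no_stepping_info provided.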



[Intel-gfx] [PATCH 00/10] Get stepping info from RUNTIME_INFO->step

2021-07-08 Thread Anusha Srivatsa
The changes are added on top of Matt's series:
https://patchwork.freedesktop.org/series/92299/
This series modifies the way we get stepping info for DMC
to load the right firmware for the right stepping/substepping
combinations.

Since we have a lookup table for BXT in intel_dmc.c and BXT
stepping changes were missing from Matt's series, I have added a
patch for it.

Anusha Srivatsa (3):
  drm/i915/bxt: Use revid->stepping tables
  drm/i915/step: Add intel_step_name() helper
  drm/i915/dmc: Modify intel_get_stepping_info()

Matt Roper (7):
  drm/i915: Make pre-production detection use direct revid comparison
  drm/i915/skl: Use revid->stepping tables
  drm/i915/icl: Use revid->stepping tables
  drm/i915/jsl_ehl: Use revid->stepping tables
  drm/i915/rkl: Use revid->stepping tables
  drm/i915/dg1: Use revid->stepping tables
  drm/i915/cnl: Drop all workarounds

 .../drm/i915/display/intel_display_power.c|   2 +-
 drivers/gpu/drm/i915/display/intel_dmc.c  |  51 ++-
 drivers/gpu/drm/i915/display/intel_dpll_mgr.c |   2 +-
 drivers/gpu/drm/i915/display/intel_psr.c  |   4 +-
 drivers/gpu/drm/i915/gt/intel_region_lmem.c   |   2 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c   |  81 ++
 drivers/gpu/drm/i915/i915_drv.c   |   8 +-
 drivers/gpu/drm/i915/i915_drv.h   |  80 ++
 drivers/gpu/drm/i915/intel_pm.c   |   2 +-
 drivers/gpu/drm/i915/intel_step.c | 142 +-
 drivers/gpu/drm/i915/intel_step.h |   8 +
 11 files changed, 187 insertions(+), 195 deletions(-)

-- 
2.32.0



[Intel-gfx] [PATCH 04/10] drm/i915/jsl_ehl: Use revid->stepping tables

2021-07-08 Thread Anusha Srivatsa
From: Matt Roper 

Switch JSL/EHL to use a revid->stepping table as we're trying to do on
all platforms going forward.

Bspec: 29153
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/display/intel_dpll_mgr.c | 2 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c   | 2 +-
 drivers/gpu/drm/i915/i915_drv.h   | 9 -
 drivers/gpu/drm/i915/intel_step.c | 8 
 4 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c 
b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
index 882bfd499e55..dfc31b682848 100644
--- a/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
+++ b/drivers/gpu/drm/i915/display/intel_dpll_mgr.c
@@ -2674,7 +2674,7 @@ static bool
 ehl_combo_pll_div_frac_wa_needed(struct drm_i915_private *i915)
 {
return ((IS_PLATFORM(i915, INTEL_ELKHARTLAKE) &&
-IS_JSL_EHL_REVID(i915, EHL_REVID_B0, REVID_FOREVER)) ||
+IS_JSL_EHL_DISPLAY_STEP(i915, STEP_B0, STEP_FOREVER)) ||
 IS_TIGERLAKE(i915) || IS_ALDERLAKE_P(i915)) &&
 i915->dpll.ref_clks.nssc == 38400;
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index e2d8acb8c1c9..4c0c15bbdac2 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1043,7 +1043,7 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, 
struct i915_wa_list *wal)
 
/* Wa_1607087056:icl,ehl,jsl */
if (IS_ICELAKE(i915) ||
-   IS_JSL_EHL_REVID(i915, EHL_REVID_A0, EHL_REVID_A0))
+   IS_JSL_EHL_GT_STEP(i915, STEP_A0, STEP_A0))
wa_write_or(wal,
SLICE_UNIT_LEVEL_CLKGATE,
L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e26ff8624945..78db92bbb1c6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1519,11 +1519,10 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define IS_ICL_GT_STEP(p, since, until) \
(IS_ICELAKE(p) && IS_GT_STEP(p, since, until))
 
-#define EHL_REVID_A00x0
-#define EHL_REVID_B00x1
-
-#define IS_JSL_EHL_REVID(p, since, until) \
-   (IS_JSL_EHL(p) && IS_REVID(p, since, until))
+#define IS_JSL_EHL_GT_STEP(p, since, until) \
+   (IS_JSL_EHL(p) && IS_GT_STEP(p, since, until))
+#define IS_JSL_EHL_DISPLAY_STEP(p, since, until) \
+   (IS_JSL_EHL(p) && IS_DISPLAY_STEP(p, since, until))
 
 #define IS_TGL_DISPLAY_STEP(__i915, since, until) \
(IS_TIGERLAKE(__i915) && \
diff --git a/drivers/gpu/drm/i915/intel_step.c 
b/drivers/gpu/drm/i915/intel_step.c
index 4d8248cf67d3..61666a3dd672 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -51,6 +51,11 @@ static const struct intel_step_info icl_revid_step_tbl[] = {
[7] = { .gt_step = STEP_D0, .display_step = STEP_D0 },
 };
 
+static const struct intel_step_info jsl_ehl_revid_step_tbl[] = {
+   [0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
+   [1] = { .gt_step = STEP_B0, .display_step = STEP_B0 },
+};
+
 static const struct intel_step_info tgl_uy_revid_step_tbl[] = {
[0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
[1] = { .gt_step = STEP_B0, .display_step = STEP_C0 },
@@ -98,6 +103,9 @@ void intel_step_init(struct drm_i915_private *i915)
} else if (IS_TIGERLAKE(i915)) {
revids = tgl_revid_step_tbl;
size = ARRAY_SIZE(tgl_revid_step_tbl);
+   } else if (IS_JSL_EHL(i915)) {
+   revids = jsl_ehl_revid_step_tbl;
+   size = ARRAY_SIZE(jsl_ehl_revid_step_tbl);
} else if (IS_ICELAKE(i915)) {
revids = icl_revid_step_tbl;
size = ARRAY_SIZE(icl_revid_step_tbl);
-- 
2.32.0
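
The split into IS_JSL_EHL_GT_STEP() and IS_JSL_EHL_DISPLAY_STEP() matters on platforms where one revision ID maps to different GT and display steppings. The sketch below is standalone and hypothetical: the enum ordering, the inclusive bounds (inferred from uses such as STEP_A0, STEP_A0), and the assumption that STEP_FOREVER compares above every real stepping are not confirmed by this patch. The example pair reuses the tgl_uy table entry for revid 1 quoted in the context above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ordering; STEP_FOREVER is assumed to sort last. */
enum intel_step { STEP_NONE = 0, STEP_A0, STEP_B0, STEP_C0, STEP_FOREVER };

struct step_pair {
	enum intel_step gt_step;
	enum intel_step display_step;
};

/* Inclusive range test, standing in for IS_GT_STEP()/IS_DISPLAY_STEP(). */
bool step_in_range(enum intel_step step, enum intel_step since,
		   enum intel_step until)
{
	return step >= since && step <= until;
}

/* Usage check with revid 1 of the tgl_uy table: GT is B0, display is C0. */
void step_in_range_selftest(void)
{
	struct step_pair rev1 = { STEP_B0, STEP_C0 };

	/* A workaround bounded to A0..B0 matches the GT stepping... */
	assert(step_in_range(rev1.gt_step, STEP_A0, STEP_B0));
	/* ...but not the display stepping of the same revision. */
	assert(!step_in_range(rev1.display_step, STEP_A0, STEP_B0));
	/* An open-ended bound matches everything from B0 onwards. */
	assert(step_in_range(rev1.display_step, STEP_B0, STEP_FOREVER));
}
```

A single revid comparison cannot express both checks for such a revision, which is one reason to compare against the per-IP stepping rather than the raw PCI revision.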



[Intel-gfx] [PATCH 08/10] drm/i915/bxt: Use revid->stepping tables

2021-07-08 Thread Anusha Srivatsa
Switch BXT to use a revid->stepping table as we're trying to do on all
platforms going forward.

Signed-off-by: Anusha Srivatsa 
---
 drivers/gpu/drm/i915/intel_step.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_step.c 
b/drivers/gpu/drm/i915/intel_step.c
index c4ce02d22828..99c0d3df001b 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -31,6 +31,15 @@ static const struct intel_step_info skl_revid_step_tbl[] = {
[0xA] = { .gt_step = STEP_I1, .display_step = STEP_I1 },
 };
 
+static const struct intel_step_info bxt_revids[] = {
+   [0] = { .gt_step = STEP_A0 },
+   [1] = { .gt_step = STEP_A1 },
+   [2] = { .gt_step = STEP_A2 },
+   [6] = { .gt_step = STEP_B0 },
+   [7] = { .gt_step = STEP_B1 },
+   [8] = { .gt_step = STEP_B2 },
+};
+
 static const struct intel_step_info kbl_revid_step_tbl[] = {
[0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
[1] = { .gt_step = STEP_B0, .display_step = STEP_B0 },
@@ -129,6 +138,9 @@ void intel_step_init(struct drm_i915_private *i915)
} else if (IS_KABYLAKE(i915)) {
revids = kbl_revid_step_tbl;
size = ARRAY_SIZE(kbl_revid_step_tbl);
+   } else if (IS_BROXTON(i915)) {
+   revids = bxt_revids;
+   size = ARRAY_SIZE(bxt_revids);
} else if (IS_SKYLAKE(i915)) {
revids = skl_revid_step_tbl;
size = ARRAY_SIZE(skl_revid_step_tbl);
-- 
2.32.0



[Intel-gfx] [PATCH 06/10] drm/i915/dg1: Use revid->stepping tables

2021-07-08 Thread Anusha Srivatsa
From: Matt Roper 

Switch DG1 to use a revid->stepping table as we're trying to do on all
platforms going forward.

This removes the last use of IS_REVID() and REVID_FOREVER, so remove
those now-unused macros as well to prevent their accidental use on
future platforms.

Bspec: 44463
Signed-off-by: Matt Roper 
---
 .../gpu/drm/i915/display/intel_display_power.c |  2 +-
 drivers/gpu/drm/i915/gt/intel_region_lmem.c|  2 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c| 10 +-
 drivers/gpu/drm/i915/i915_drv.h| 18 --
 drivers/gpu/drm/i915/intel_pm.c|  2 +-
 drivers/gpu/drm/i915/intel_step.c  |  8 
 6 files changed, 20 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c 
b/drivers/gpu/drm/i915/display/intel_display_power.c
index 285380079aab..975a7e25cea5 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -5799,7 +5799,7 @@ static void tgl_bw_buddy_init(struct drm_i915_private 
*dev_priv)
int config, i;
 
if (IS_ALDERLAKE_S(dev_priv) ||
-   IS_DG1_REVID(dev_priv, DG1_REVID_A0, DG1_REVID_A0) ||
+   IS_DG1_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) ||
IS_TGL_DISPLAY_STEP(dev_priv, STEP_A0, STEP_B0))
/* Wa_1409767108:tgl,dg1,adl-s */
table = wa_1409767108_buddy_page_masks;
diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c 
b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
index 1f43aba2e9e2..50d11a84e7a9 100644
--- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c
+++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c
@@ -157,7 +157,7 @@ intel_gt_setup_fake_lmem(struct intel_gt *gt)
 static bool get_legacy_lowmem_region(struct intel_uncore *uncore,
 u64 *start, u32 *size)
 {
-   if (!IS_DG1_REVID(uncore->i915, DG1_REVID_A0, DG1_REVID_B0))
+   if (!IS_DG1_GT_STEP(uncore->i915, STEP_A0, STEP_B0))
return false;
 
*start = 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 4c0c15bbdac2..62321e9149db 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -,7 +,7 @@ dg1_gt_workarounds_init(struct drm_i915_private *i915, 
struct i915_wa_list *wal)
gen12_gt_workarounds_init(i915, wal);
 
/* Wa_1607087056:dg1 */
-   if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0))
+   if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0))
wa_write_or(wal,
SLICE_UNIT_LEVEL_CLKGATE,
L3_CLKGATE_DIS | L3_CR2X_CLKGATE_DIS);
@@ -1522,7 +1522,7 @@ static void dg1_whitelist_build(struct intel_engine_cs 
*engine)
tgl_whitelist_build(engine);
 
/* GEN:BUG:1409280441:dg1 */
-   if (IS_DG1_REVID(engine->i915, DG1_REVID_A0, DG1_REVID_A0) &&
+   if (IS_DG1_GT_STEP(engine->i915, STEP_A0, STEP_A0) &&
(engine->class == RENDER_CLASS ||
 engine->class == COPY_ENGINE_CLASS))
whitelist_reg_ext(w, RING_ID(engine->mmio_base),
@@ -1592,7 +1592,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct 
i915_wa_list *wal)
 {
struct drm_i915_private *i915 = engine->i915;
 
-   if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
+   if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) ||
IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_A0)) {
/*
 * Wa_1607138336:tgl[a0],dg1[a0]
@@ -1638,7 +1638,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct 
i915_wa_list *wal)
}
 
if (IS_ALDERLAKE_P(i915) || IS_ALDERLAKE_S(i915) ||
-   IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
+   IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) ||
IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
/* Wa_1409804808:tgl,rkl,dg1[a0],adl-s,adl-p */
wa_masked_en(wal, GEN7_ROW_CHICKEN2,
@@ -1652,7 +1652,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct 
i915_wa_list *wal)
}
 
 
-   if (IS_DG1_REVID(i915, DG1_REVID_A0, DG1_REVID_A0) ||
+   if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_A0) ||
IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
/*
 * Wa_1607030317:tgl
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 592e7177202e..496c468229fc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1317,19 +1317,10 @@ static inline struct drm_i915_private 
*pdev_to_i915(struct pci_dev *pdev)
 #define IS_DISPLAY_VER(i915, from, until) \
(DISPLAY_VER(i915) >= (from) && DISPLAY_VER(i915) <= (until))
 
-#define REVID_FOREVER  0xff
 #define INTEL_REVID(dev_priv)  (to_pci_dev((dev_priv)->drm.dev)->revision)
 
 #define HAS_DSB(dev_priv)  

[Intel-gfx] [PATCH 02/10] drm/i915/skl: Use revid->stepping tables

2021-07-08 Thread Anusha Srivatsa
From: Matt Roper 

Switch SKL to use a revid->stepping table as we're trying to do on all
platforms going forward.  Also add some additional stepping definitions
for completeness, even if we don't have any workarounds tied to them.

Note that SKL has a case where a newer revision ID corresponds to an
older GT/disp stepping (0x9 -> STEP_J0, 0xA -> STEP_I1).  Also, the lack
of a revision ID 0x8 in the table is intentional and not an oversight.
We'll re-write the KBL-specific comment to make it clear that these kinds
of quirks are expected.

Finally, since we're already touching the KBL area too, let's rename the
KBL table to match the naming convention used by all of the other
platforms.

Bspec: 13626
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c |  2 +-
 drivers/gpu/drm/i915/i915_drv.h | 11 +--
 drivers/gpu/drm/i915/intel_step.c   | 35 -
 drivers/gpu/drm/i915/intel_step.h   |  4 +++
 4 files changed, 33 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index d9a5a445ceec..6dfd564e078f 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -883,7 +883,7 @@ skl_gt_workarounds_init(struct drm_i915_private *i915, 
struct i915_wa_list *wal)
GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE);
 
/* WaInPlaceDecompressionHang:skl */
-   if (IS_SKL_REVID(i915, SKL_REVID_H0, REVID_FOREVER))
+   if (IS_SKL_GT_STEP(i915, STEP_H0, STEP_FOREVER))
wa_write_or(wal,
GEN9_GAMT_ECO_REG_RW_IA,
GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4f2a61cb024a..775057626ee6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1509,16 +1509,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define IS_TGL_Y(dev_priv) \
IS_SUBPLATFORM(dev_priv, INTEL_TIGERLAKE, INTEL_SUBPLATFORM_ULX)
 
-#define SKL_REVID_A0   0x0
-#define SKL_REVID_B0   0x1
-#define SKL_REVID_C0   0x2
-#define SKL_REVID_D0   0x3
-#define SKL_REVID_E0   0x4
-#define SKL_REVID_F0   0x5
-#define SKL_REVID_G0   0x6
-#define SKL_REVID_H0   0x7
-
-#define IS_SKL_REVID(p, since, until) (IS_SKYLAKE(p) && IS_REVID(p, since, 
until))
+#define IS_SKL_GT_STEP(p, since, until) (IS_SKYLAKE(p) && IS_GT_STEP(p, since, 
until))
 
 #define IS_KBL_GT_STEP(dev_priv, since, until) \
(IS_KABYLAKE(dev_priv) && IS_GT_STEP(dev_priv, since, until))
diff --git a/drivers/gpu/drm/i915/intel_step.c 
b/drivers/gpu/drm/i915/intel_step.c
index ba9479a67521..bfd63f56c200 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -7,15 +7,31 @@
 #include "intel_step.h"
 
 /*
- * KBL revision ID ordering is bizarre; higher revision ID's map to lower
- * steppings in some cases.  So rather than test against the revision ID
- * directly, let's map that into our own range of increasing ID's that we
- * can test against in a regular manner.
+ * Some platforms have unusual ways of mapping PCI revision ID to GT/display
+ * steppings.  E.g., in some cases a higher PCI revision may translate to a
+ * lower stepping of the GT and/or display IP.  This file provides lookup
+ * tables to map the PCI revision into a standard set of stepping values that
+ * can be compared numerically.
+ *
+ * Also note that some revisions/steppings may have been set aside as
+ * placeholders but never materialized in real hardware; in those cases there
+ * may be jumps in the revision IDs or stepping values in the tables below.
  */
 
+static const struct intel_step_info skl_revid_step_tbl[] = {
+   [0x0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
+   [0x1] = { .gt_step = STEP_B0, .display_step = STEP_B0 },
+   [0x2] = { .gt_step = STEP_C0, .display_step = STEP_C0 },
+   [0x3] = { .gt_step = STEP_D0, .display_step = STEP_D0 },
+   [0x4] = { .gt_step = STEP_E0, .display_step = STEP_E0 },
+   [0x5] = { .gt_step = STEP_F0, .display_step = STEP_F0 },
+   [0x6] = { .gt_step = STEP_G0, .display_step = STEP_G0 },
+   [0x7] = { .gt_step = STEP_H0, .display_step = STEP_H0 },
+   [0x9] = { .gt_step = STEP_J0, .display_step = STEP_J0 },
+   [0xA] = { .gt_step = STEP_I1, .display_step = STEP_I1 },
+};
 
-/* FIXME: what about REVID_E0 */
-static const struct intel_step_info kbl_revids[] = {
+static const struct intel_step_info kbl_revid_step_tbl[] = {
[0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
[1] = { .gt_step = STEP_B0, .display_step = STEP_B0 },
[2] = { .gt_step = STEP_C0, .display_step = STEP_B0 },
@@ -74,8 +90,11 @@ void intel_step_init(struct drm_i915_private *i915)
revids = tgl_revid_step_tbl;

[Intel-gfx] [PATCH 03/10] drm/i915/icl: Use revid->stepping tables

2021-07-08 Thread Anusha Srivatsa
From: Matt Roper 

Switch ICL to use a revid->stepping table as we're trying to do on all
platforms going forward.  While we're at it, let's include some
additional steppings that have popped up, even if we don't yet have any
workarounds tied to those steppings (we probably need to audit our
workaround list soon to see if any of the bounds have moved or if new
workarounds have appeared).
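
For readers unfamiliar with the approach, here is a minimal, self-contained
sketch of a revid->stepping table and an inclusive range check. It is not the
table added by this patch: the example_* names are hypothetical, the entries
simply reuse the values of the ICL_REVID_* defines being removed, and
returning "none" for unknown revision IDs is a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for enum intel_step; ordered so steps compare numerically. */
enum example_step {
	EX_STEP_NONE = 0,
	EX_STEP_A0,
	EX_STEP_A2,
	EX_STEP_B0,
	EX_STEP_B2,
	EX_STEP_C0,
};

/*
 * Index is the PCI revision ID.  Values mirror the removed ICL_REVID_*
 * defines purely for illustration; revid 0x2 has no entry and therefore
 * reads back as EX_STEP_NONE (zero-initialized).
 */
static const enum example_step example_revid_tbl[] = {
	[0x0] = EX_STEP_A0,
	[0x1] = EX_STEP_A2,
	[0x3] = EX_STEP_B0,
	[0x4] = EX_STEP_B2,
	[0x5] = EX_STEP_C0,
};

/* Translate a revision ID to a stepping; unknown IDs yield EX_STEP_NONE. */
enum example_step example_gt_step(uint8_t revid)
{
	size_t n = sizeof(example_revid_tbl) / sizeof(example_revid_tbl[0]);

	return revid < n ? example_revid_tbl[revid] : EX_STEP_NONE;
}

/* Inclusive range check, in the spirit of IS_ICL_GT_STEP(i915, since, until). */
int example_is_gt_step(uint8_t revid, enum example_step since,
		       enum example_step until)
{
	enum example_step step = example_gt_step(revid);

	assert(since <= until);
	return step != EX_STEP_NONE && step >= since && step <= until;
}
```

The point of the indirection is that callers compare stepping values, which
increase monotonically, instead of raw revision IDs, which may not.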

Note that the current bspec table is missing information about how to
map PCI revision ID to GT/display steppings; it only provides an SoC
stepping.  The mapping to GT/display steppings (which aren't always the
same as the SoC stepping) used to be in the bspec, but was apparently
dropped during an update in Nov 2019; I've made my changes here based on
an older bspec snapshot that still had the necessary information.  We've
requested that the missing information be restored.

Bspec: 21441  # pre-Nov 2019 snapshot
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 12 ++--
 drivers/gpu/drm/i915/i915_drv.h | 10 ++
 drivers/gpu/drm/i915/intel_step.c   | 12 
 drivers/gpu/drm/i915/intel_step.h   |  2 ++
 4 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 6dfd564e078f..e2d8acb8c1c9 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -557,7 +557,7 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
/* Wa_1604370585:icl (pre-prod)
 * Formerly known as WaPushConstantDereferenceHoldDisable
 */
-   if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0))
+   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0))
wa_masked_en(wal, GEN7_ROW_CHICKEN2,
 PUSH_CONSTANT_DEREF_DISABLE);
 
@@ -573,12 +573,12 @@ static void icl_ctx_workarounds_init(struct intel_engine_cs *engine,
/* Wa_2006611047:icl (pre-prod)
 * Formerly known as WaDisableImprovedTdlClkGating
 */
-   if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
+   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0))
wa_masked_en(wal, GEN7_ROW_CHICKEN2,
 GEN11_TDL_CLOCK_GATING_FIX_DISABLE);
 
/* Wa_2006665173:icl (pre-prod) */
-   if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
+   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0))
wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
 GEN11_BLEND_EMB_FIX_DISABLE_IN_RCC);
 
@@ -1023,13 +1023,13 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
GAMW_ECO_DEV_CTX_RELOAD_DISABLE);
 
/* Wa_1405779004:icl (pre-prod) */
-   if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_A0))
+   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_A0))
wa_write_or(wal,
SLICE_UNIT_LEVEL_CLKGATE,
MSCUNIT_CLKGATE_DIS);
 
/* Wa_1406838659:icl (pre-prod) */
-   if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0))
+   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0))
wa_write_or(wal,
INF_UNIT_LEVEL_CLKGATE,
CGPSF_CLKGATE_DIS);
@@ -1725,7 +1725,7 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
PMFLUSHDONE_LNEBLK);
 
/* Wa_1406609255:icl (pre-prod) */
-   if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0))
+   if (IS_ICL_GT_STEP(i915, STEP_A0, STEP_B0))
wa_write_or(wal,
GEN7_SARCHKMD,
GEN7_DISABLE_DEMAND_PREFETCH);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 775057626ee6..e26ff8624945 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1516,14 +1516,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 #define IS_KBL_DISPLAY_STEP(dev_priv, since, until) \
(IS_KABYLAKE(dev_priv) && IS_DISPLAY_STEP(dev_priv, since, until))
 
-#define ICL_REVID_A0   0x0
-#define ICL_REVID_A2   0x1
-#define ICL_REVID_B0   0x3
-#define ICL_REVID_B2   0x4
-#define ICL_REVID_C0   0x5
-
-#define IS_ICL_REVID(p, since, until) \
-   (IS_ICELAKE(p) && IS_REVID(p, since, until))
+#define IS_ICL_GT_STEP(p, since, until) \
+   (IS_ICELAKE(p) && IS_GT_STEP(p, since, until))
 
 #define EHL_REVID_A0   0x0
 #define EHL_REVID_B0   0x1
diff --git a/drivers/gpu/drm/i915/intel_step.c b/drivers/gpu/drm/i915/intel_step.c
index bfd63f56c200..4d8248cf67d3 100644
--- a/drivers/gpu/drm/i915/intel_step.c
+++ b/drivers/gpu/drm/i915/intel_step.c
@@ -42,6 +42,15 @@ static const struct intel_step_info kbl_revid_step_tbl[] 

[Intel-gfx] [PATCH 01/10] drm/i915: Make pre-production detection use direct revid comparison

2021-07-08 Thread Anusha Srivatsa
From: Matt Roper 

Although we're converting our workarounds to use a revid->stepping
lookup table, the function that detects pre-production hardware should
continue to compare against PCI revision ID values directly.  These are
listed in the bspec as integers, so it's easier to confirm their
correctness if we just use an integer literal rather than a symbolic
name anyway.
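
As a rough illustration of the direct comparison, the sketch below keeps the
integer thresholds visible next to each platform so they can be checked
against the bspec at a glance. The thresholds are the ones used in the diff
further down; the example_* names and the table layout are hypothetical, and
the Haswell early-SDV check is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical platform tags; the driver uses IS_SKYLAKE() and friends. */
enum example_platform {
	EX_OTHER = 0,
	EX_SKYLAKE,
	EX_BROXTON,
	EX_KABYLAKE,
	EX_GEMINILAKE,
	EX_PLATFORM_COUNT,
};

/*
 * Lowest revision ID that is NOT treated as pre-production.
 * 0 (the default for EX_OTHER) means "no pre-production range".
 */
static const uint8_t example_first_prod_revid[EX_PLATFORM_COUNT] = {
	[EX_SKYLAKE]    = 0x6,
	[EX_BROXTON]    = 0xA,
	[EX_KABYLAKE]   = 0x1,
	[EX_GEMINILAKE] = 0x3,
};

/* True if the raw PCI revision ID falls below the platform's threshold. */
bool example_is_preproduction(enum example_platform p, uint8_t revid)
{
	assert((unsigned int)p < EX_PLATFORM_COUNT);
	return revid < example_first_prod_revid[p];
}
```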

Since the BXT, GLK, and CNL revid macros were never used in any
workaround code, just remove them completely.

Bspec: 13620, 19131, 13626, 18329
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/i915_drv.c   |  8 
 drivers/gpu/drm/i915/i915_drv.h   | 24 
 drivers/gpu/drm/i915/intel_step.h |  1 +
 3 files changed, 5 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 30d8cd8c69b1..90136995f5eb 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -271,10 +271,10 @@ static void intel_detect_preproduction_hw(struct drm_i915_private *dev_priv)
bool pre = false;
 
pre |= IS_HSW_EARLY_SDV(dev_priv);
-   pre |= IS_SKL_REVID(dev_priv, 0, SKL_REVID_F0);
-   pre |= IS_BXT_REVID(dev_priv, 0, BXT_REVID_B_LAST);
-   pre |= IS_KBL_GT_STEP(dev_priv, 0, STEP_A0);
-   pre |= IS_GLK_REVID(dev_priv, 0, GLK_REVID_A2);
+   pre |= IS_SKYLAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x6;
+   pre |= IS_BROXTON(dev_priv) && INTEL_REVID(dev_priv) < 0xA;
+   pre |= IS_KABYLAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x1;
+   pre |= IS_GEMINILAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x3;
 
if (pre) {
drm_err(&dev_priv->drm, "This is a pre-production stepping. "
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d14cda2ff923..4f2a61cb024a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1520,35 +1520,11 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 
 #define IS_SKL_REVID(p, since, until) (IS_SKYLAKE(p) && IS_REVID(p, since, until))
 
-#define BXT_REVID_A0   0x0
-#define BXT_REVID_A1   0x1
-#define BXT_REVID_B0   0x3
-#define BXT_REVID_B_LAST   0x8
-#define BXT_REVID_C0   0x9
-
-#define IS_BXT_REVID(dev_priv, since, until) \
-   (IS_BROXTON(dev_priv) && IS_REVID(dev_priv, since, until))
-
 #define IS_KBL_GT_STEP(dev_priv, since, until) \
(IS_KABYLAKE(dev_priv) && IS_GT_STEP(dev_priv, since, until))
 #define IS_KBL_DISPLAY_STEP(dev_priv, since, until) \
(IS_KABYLAKE(dev_priv) && IS_DISPLAY_STEP(dev_priv, since, until))
 
-#define GLK_REVID_A0   0x0
-#define GLK_REVID_A1   0x1
-#define GLK_REVID_A2   0x2
-#define GLK_REVID_B0   0x3
-
-#define IS_GLK_REVID(dev_priv, since, until) \
-   (IS_GEMINILAKE(dev_priv) && IS_REVID(dev_priv, since, until))
-
-#define CNL_REVID_A0   0x0
-#define CNL_REVID_B0   0x1
-#define CNL_REVID_C0   0x2
-
-#define IS_CNL_REVID(p, since, until) \
-   (IS_CANNONLAKE(p) && IS_REVID(p, since, until))
-
 #define ICL_REVID_A0   0x0
 #define ICL_REVID_A2   0x1
 #define ICL_REVID_B0   0x3
diff --git a/drivers/gpu/drm/i915/intel_step.h b/drivers/gpu/drm/i915/intel_step.h
index 958a8bb5d677..8efacef6ab31 100644
--- a/drivers/gpu/drm/i915/intel_step.h
+++ b/drivers/gpu/drm/i915/intel_step.h
@@ -22,6 +22,7 @@ struct intel_step_info {
 enum intel_step {
STEP_NONE = 0,
STEP_A0,
+   STEP_A1,
STEP_A2,
STEP_B0,
STEP_B1,
-- 
2.32.0



Re: [Intel-gfx] [PATCH 0/7] Minor revid/stepping and workaround cleanup

2021-07-08 Thread Srivatsa, Anusha



> -Original Message-
> From: Roper, Matthew D 
> Sent: Thursday, July 8, 2021 4:05 PM
> To: Srivatsa, Anusha 
> Cc: Jani Nikula ; intel-gfx@lists.freedesktop.org
> Subject: Re: [PATCH 0/7] Minor revid/stepping and workaround cleanup
> 
> On Thu, Jul 08, 2021 at 11:37:50AM -0700, Srivatsa, Anusha wrote:
> >
> >
> > > -Original Message-
> > > From: Jani Nikula 
> > > Sent: Thursday, July 8, 2021 12:33 AM
> > > To: Roper, Matthew D ; intel-
> > > g...@lists.freedesktop.org
> > > Cc: Srivatsa, Anusha 
> > > Subject: Re: [PATCH 0/7] Minor revid/stepping and workaround cleanup
> > >
> > > On Wed, 07 Jul 2021, Matt Roper  wrote:
> > > > PCI revision IDs don't always map to GT and display IP steppings
> > > > in an intuitive/sensible way.  On many of our recent platforms
> > > > we've switched to using revid->stepping lookup tables with the
> > > > infrastructure in intel_step.c to handle stepping lookups and
> > > > comparisons.  Since it's confusing to have some of our platforms
> > > > using the new lookup tables and some still using old revid
> > > > comparisons, let's migrate all the old platforms over to the table
> > > > approach since that's what we want to standardize on going
> > > > forward.  The only place that revision ID's should really get used
> > > > directly now is when checking to see if we're running on pre-production hardware.
> > >
> > > Anusha, Matt, please sort this out between the two of you. :)
> > >
> > > https://patchwork.freedesktop.org/series/92257/
> > >
> > @Roper, Matthew D the series doesn't add the stepping table for BXT and GLK.
> 
> Right, that was intentional because we don't use the steppings for those
> platforms anywhere in the code.  But if that's changing with your DMC series,
> I can add the tables for those two as well.
> 
Yes, will need GLK and BXT
Thanks

Anusha
> Matt
> 
> >
> > Anusha
> > > BR,
> > > Jani.
> > >
> > >
> > > >
> > > > Let's also take the opportunity to drop a bit of effectively dead
> > > > code in the workarounds file too.
> > > >
> > > > Cc: Jani Nikula 
> > > >
> > > > Matt Roper (7):
> > > >   drm/i915: Make pre-production detection use direct revid comparison
> > > >   drm/i915/skl: Use revid->stepping tables
> > > >   drm/i915/icl: Use revid->stepping tables
> > > >   drm/i915/jsl_ehl: Use revid->stepping tables
> > > >   drm/i915/rkl: Use revid->stepping tables
> > > >   drm/i915/dg1: Use revid->stepping tables
> > > >   drm/i915/cnl: Drop all workarounds
> > > >
> > > >  .../drm/i915/display/intel_display_power.c|  2 +-
> > > >  drivers/gpu/drm/i915/display/intel_dpll_mgr.c |  2 +-
> > > >  drivers/gpu/drm/i915/display/intel_psr.c  |  4 +-
> > > >  drivers/gpu/drm/i915/gt/intel_region_lmem.c   |  2 +-
> > > >  drivers/gpu/drm/i915/gt/intel_workarounds.c   | 81 +++
> > > >  drivers/gpu/drm/i915/i915_drv.c   |  8 +-
> > > >  drivers/gpu/drm/i915/i915_drv.h   | 80 +++---
> > > >  drivers/gpu/drm/i915/intel_pm.c   |  2 +-
> > > >  drivers/gpu/drm/i915/intel_step.c | 72 +++--
> > > >  drivers/gpu/drm/i915/intel_step.h |  7 ++
> > > >  10 files changed, 107 insertions(+), 153 deletions(-)
> > >
> > > --
> > > Jani Nikula, Intel Open Source Graphics Center
> 
> --
> Matt Roper
> Graphics Software Engineer
> VTT-OSGC Platform Enablement
> Intel Corporation
> (916) 356-2795


Re: [Intel-gfx] [PATCH 0/7] Minor revid/stepping and workaround cleanup

2021-07-08 Thread Matt Roper
On Thu, Jul 08, 2021 at 11:37:50AM -0700, Srivatsa, Anusha wrote:
> 
> 
> > -Original Message-
> > From: Jani Nikula 
> > Sent: Thursday, July 8, 2021 12:33 AM
> > To: Roper, Matthew D ; intel-
> > g...@lists.freedesktop.org
> > Cc: Srivatsa, Anusha 
> > Subject: Re: [PATCH 0/7] Minor revid/stepping and workaround cleanup
> > 
> > On Wed, 07 Jul 2021, Matt Roper  wrote:
> > > PCI revision IDs don't always map to GT and display IP steppings in an
> > > intuitive/sensible way.  On many of our recent platforms we've
> > > switched to using revid->stepping lookup tables with the
> > > infrastructure in intel_step.c to handle stepping lookups and
> > > comparisons.  Since it's confusing to have some of our platforms using
> > > the new lookup tables and some still using old revid comparisons,
> > > let's migrate all the old platforms over to the table approach since
> > > that's what we want to standardize on going forward.  The only place
> > > that revision ID's should really get used directly now is when
> > > checking to see if we're running on pre-production hardware.
> > 
> > Anusha, Matt, please sort this out between the two of you. :)
> > 
> > https://patchwork.freedesktop.org/series/92257/
> > 
> @Roper, Matthew D the series doesn't add the stepping table for BXT and GLK.

Right, that was intentional because we don't use the steppings for those
platforms anywhere in the code.  But if that's changing with your DMC
series, I can add the tables for those two as well.


Matt

> 
> Anusha
> > BR,
> > Jani.
> > 
> > 
> > >
> > > Let's also take the opportunity to drop a bit of effectively dead code
> > > in the workarounds file too.
> > >
> > > Cc: Jani Nikula 
> > >
> > > Matt Roper (7):
> > >   drm/i915: Make pre-production detection use direct revid comparison
> > >   drm/i915/skl: Use revid->stepping tables
> > >   drm/i915/icl: Use revid->stepping tables
> > >   drm/i915/jsl_ehl: Use revid->stepping tables
> > >   drm/i915/rkl: Use revid->stepping tables
> > >   drm/i915/dg1: Use revid->stepping tables
> > >   drm/i915/cnl: Drop all workarounds
> > >
> > >  .../drm/i915/display/intel_display_power.c|  2 +-
> > >  drivers/gpu/drm/i915/display/intel_dpll_mgr.c |  2 +-
> > >  drivers/gpu/drm/i915/display/intel_psr.c  |  4 +-
> > >  drivers/gpu/drm/i915/gt/intel_region_lmem.c   |  2 +-
> > >  drivers/gpu/drm/i915/gt/intel_workarounds.c   | 81 +++
> > >  drivers/gpu/drm/i915/i915_drv.c   |  8 +-
> > >  drivers/gpu/drm/i915/i915_drv.h   | 80 +++---
> > >  drivers/gpu/drm/i915/intel_pm.c   |  2 +-
> > >  drivers/gpu/drm/i915/intel_step.c | 72 +++--
> > >  drivers/gpu/drm/i915/intel_step.h |  7 ++
> > >  10 files changed, 107 insertions(+), 153 deletions(-)
> > 
> > --
> > Jani Nikula, Intel Open Source Graphics Center

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795


Re: [Intel-gfx] [PATCH v3 03/20] drm/sched: Barriers are needed for entity->last_scheduled

2021-07-08 Thread Andrey Grodzovsky


On 2021-07-08 1:37 p.m., Daniel Vetter wrote:

It might be good enough on x86 with just READ_ONCE, but the write side
should then at least be WRITE_ONCE because x86 has total store order.

It's definitely not enough on arm.

Fix this properly, which means
- explain the need for the barrier in both places
- point at the other side in each comment

Also pull out the !sched_list case as the first check, so that the
code flow is clearer.

While at it sprinkle some comments around because it was very
non-obvious to me what's actually going on here and why.

Note that we really need full barriers here, at first I thought
store-release and load-acquire on ->last_scheduled would be enough,
but we actually require ordering between that and the queue state.
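
To make the ordering requirement concrete, here is a user-space sketch using
C11 atomics in place of the kernel primitives. Everything in it is an
assumption for illustration: the example_* names are hypothetical, a plain
counter stands in for the spsc queue, and seq_cst fences stand in for full
barriers. The reader-side fence is placed between the queue-count check and
the pointer load, which is one reading of the requirement and not necessarily
where the patch below puts it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant parts of struct drm_sched_entity. */
struct example_entity {
	_Atomic(int) queued;          /* plays the role of spsc_queue_count() */
	_Atomic(const void *) last;   /* plays the role of ->last_scheduled   */
};

/* Scheduler side: publish the fence pointer first, then dequeue the job. */
void example_pop_job(struct example_entity *e, const void *fence)
{
	assert(atomic_load_explicit(&e->queued, memory_order_relaxed) > 0);

	atomic_store_explicit(&e->last, fence, memory_order_relaxed);
	/* Order the pointer store before the queue-count update. */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_fetch_sub_explicit(&e->queued, 1, memory_order_relaxed);
}

/*
 * Submit side: only look at the pointer once the queue is seen empty.
 * Returns NULL while jobs are still queued.
 */
const void *example_last_if_idle(struct example_entity *e)
{
	if (atomic_load_explicit(&e->queued, memory_order_relaxed) != 0)
		return NULL;

	/* Order the queue-count load before the pointer load. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&e->last, memory_order_relaxed);
}
```

With both fences in place, a reader that observes the queue as empty also
observes the pointer stored before the corresponding dequeue.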

Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Steven Price 
Cc: Daniel Vetter 
Cc: Andrey Grodzovsky 
Cc: Lee Jones 
Cc: Boris Brezillon 
---
  drivers/gpu/drm/scheduler/sched_entity.c | 27 ++--
  1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 64d398166644..4e1124ed80e0 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED);
  
  	dma_fence_put(entity->last_scheduled);

+
entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);
  
+	/*

+* if the queue is empty we allow drm_sched_job_arm() to locklessly



Probably meant drm_sched_entity_select_rq here



+* access ->last_scheduled. This only works if we set the pointer before
+* we dequeue and if we have a write barrier here.
+*/
+   smp_wmb();
+
spsc_queue_pop(&entity->job_queue);
return sched_job;
  }
@@ -459,10 +467,25 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
struct drm_gpu_scheduler *sched;
struct drm_sched_rq *rq;
  
-	if (spsc_queue_count(&entity->job_queue) || !entity->sched_list)

+   /* single possible engine and already selected */
+   if (!entity->sched_list)
+   return;
+
+   /* queue non-empty, stay on the same engine */
+   if (spsc_queue_count(&entity->job_queue))
return;



Shouldn't the smp_rmb() go here, in between? Given that the queue is empty, we
want to be certain we are reading the most recent value of entity->last_scheduled.

Andrey



  
-	fence = READ_ONCE(entity->last_scheduled);

+   fence = entity->last_scheduled;
+
+   /*
+* Only when the queue is empty are we guaranteed that the scheduler
+* thread cannot change ->last_scheduled. To enforce ordering we need
+* a read barrier here. See drm_sched_entity_pop_job() for the other
+* side.
+*/
+   smp_rmb();
+
+   /* stay on the same engine if the previous job hasn't finished */
if (fence && !dma_fence_is_signaled(fence))
return;
  



[Intel-gfx] ✓ Fi.CI.BAT: success for CT changes required for GuC submission

2021-07-08 Thread Patchwork
== Series Details ==

Series: CT changes required for GuC submission
URL   : https://patchwork.freedesktop.org/series/92330/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10320 -> Patchwork_20556


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20556/index.html


Changes
---

  No changes found


Participating hosts (40 -> 39)
--

  Missing(1): fi-bsw-cyan 


Build changes
-

  * Linux: CI_DRM_10320 -> Patchwork_20556

  CI-20190529: 20190529
  CI_DRM_10320: 7d61ab4a59bcbb206324b6a430748b4c15dd8adb @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6132: 61fb9cdf2a9132e3618c8b08b9d20fec0c347831 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20556: 2d17849eda02a4072eb8ab2ba74f5bb44dc8a027 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

2d17849eda02 drm/i915/guc: Module load failure test for CT buffer creation
f6019e334f58 drm/i915/guc: Optimize CTB writes and reads
8ca5983abea9 drm/i915/guc: Add stall timer to non blocking CTB send function
d8d221ac4b12 drm/i915/guc: Add non blocking CTB send function
df6020220043 drm/i915/guc: Increase size of CTB buffers
6d0520a4e3e0 drm/i915/guc: Improve error message for unsolicited CT response
c5e508db6c3f drm/i915/guc: Relax CTB response timeout

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20556/index.html


[Intel-gfx] ✗ Fi.CI.SPARSE: warning for CT changes required for GuC submission

2021-07-08 Thread Patchwork
== Series Details ==

Series: CT changes required for GuC submission
URL   : https://patchwork.freedesktop.org/series/92330/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/i915/display/intel_display.c:1896:21:expected struct 
i915_vma *[assigned] vma
+drivers/gpu/drm/i915/display/intel_display.c:1896:21:got void [noderef] 
__iomem *[assigned] iomem
+drivers/gpu/drm/i915/display/intel_display.c:1896:21: warning: incorrect type 
in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34: warning: incorrect type 
in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct 
i915_address_space *
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect 
type in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect 
type in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_reset.c:1396:5: warning: context imbalance in 
'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/gt/intel_ring_submission.c:1210:24: warning: Using plain 
integer as NULL pointer
+drivers/gpu/drm/i915/i915_perf.c:1434:15: warning: memset with byte count of 
16777216
+drivers/gpu/drm/i915/i915_perf.c:1488:15: warning: memset with byte count of 
16777216
+./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative 
(-262080)
+./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative 
(-262080)
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: 

[Intel-gfx] ✓ Fi.CI.BAT: success for Set BPP in the kernel (rev2)

2021-07-08 Thread Patchwork
== Series Details ==

Series: Set BPP in the kernel (rev2)
URL   : https://patchwork.freedesktop.org/series/92312/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10320 -> Patchwork_20554


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20554/index.html

Known issues


  Here are the changes found in Patchwork_20554 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@gem_exec_suspend@basic-s0:
- fi-cfl-8109u:   [PASS][1] -> [INCOMPLETE][2] ([i915#155])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10320/fi-cfl-8109u/igt@gem_exec_susp...@basic-s0.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20554/fi-cfl-8109u/igt@gem_exec_susp...@basic-s0.html

  
 Warnings 

  * igt@runner@aborted:
- fi-bdw-5557u:   [FAIL][3] ([i915#2722] / [i915#3744]) -> [FAIL][4] 
([i915#1602] / [i915#2029])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10320/fi-bdw-5557u/igt@run...@aborted.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20554/fi-bdw-5557u/igt@run...@aborted.html

  
  [i915#155]: https://gitlab.freedesktop.org/drm/intel/issues/155
  [i915#1602]: https://gitlab.freedesktop.org/drm/intel/issues/1602
  [i915#2029]: https://gitlab.freedesktop.org/drm/intel/issues/2029
  [i915#2722]: https://gitlab.freedesktop.org/drm/intel/issues/2722
  [i915#3744]: https://gitlab.freedesktop.org/drm/intel/issues/3744


Participating hosts (40 -> 39)
--

  Missing(1): fi-bsw-cyan 


Build changes
-

  * Linux: CI_DRM_10320 -> Patchwork_20554

  CI-20190529: 20190529
  CI_DRM_10320: 7d61ab4a59bcbb206324b6a430748b4c15dd8adb @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6132: 61fb9cdf2a9132e3618c8b08b9d20fec0c347831 @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20554: ed4b3f15937bc71ae8a2db7c8c6a998ef31d3067 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

ed4b3f15937b drm/i915/display/dsc: Force dsc BPP
c4f9e04d46c8 drm/i915/display/dsc: Add Per connector debugfs node for DSC BPP enable
ea416f2dba91 drm/i915/display: Add write permissions for fec support

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20554/index.html


[Intel-gfx] ✗ Fi.CI.BUILD: failure for drm/i915/gem: ioctl clean-ups (rev8)

2021-07-08 Thread Patchwork
== Series Details ==

Series: drm/i915/gem: ioctl clean-ups (rev8)
URL   : https://patchwork.freedesktop.org/series/89443/
State : failure

== Summary ==

Applying: drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZE
Using index info to reconstruct a base tree...
M   drivers/gpu/drm/i915/Makefile
M   drivers/gpu/drm/i915/gem/i915_gem_context.c
A   drivers/gpu/drm/i915/gt/intel_context_param.c
M   drivers/gpu/drm/i915/gt/intel_context_param.h
M   include/uapi/drm/i915_drm.h
Falling back to patching base and 3-way merge...
Auto-merging include/uapi/drm/i915_drm.h
Auto-merging drivers/gpu/drm/i915/gt/intel_context_param.h
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/gt/intel_context_param.h
Auto-merging drivers/gpu/drm/i915/gem/i915_gem_context.c
CONFLICT (content): Merge conflict in drivers/gpu/drm/i915/gem/i915_gem_context.c
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZE
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".




[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Set BPP in the kernel (rev2)

2021-07-08 Thread Patchwork
== Series Details ==

Series: Set BPP in the kernel (rev2)
URL   : https://patchwork.freedesktop.org/series/92312/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/i915/display/intel_display.c:1896:21:expected struct 
i915_vma *[assigned] vma
+drivers/gpu/drm/i915/display/intel_display.c:1896:21:got void [noderef] 
__iomem *[assigned] iomem
+drivers/gpu/drm/i915/display/intel_display.c:1896:21: warning: incorrect type 
in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1412:34: warning: incorrect type 
in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct 
i915_address_space *
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect 
type in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect 
type in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_reset.c:1396:5: warning: context imbalance in 
'intel_gt_reset_trylock' - different lock contexts for basic block
+./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative 
(-262080)
+./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative 
(-262080)
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 

[Intel-gfx] [PATCH] drm/sched: Barriers are needed for entity->last_scheduled

2021-07-08 Thread Daniel Vetter
It might be good enough on x86 with just READ_ONCE, but the write side
should then at least be WRITE_ONCE because x86 has total store order.

It's definitely not enough on arm.

Fix this properly, which means
- explain the need for the barrier in both places
- point at the other side in each comment

Also pull out the !sched_list case as the first check, so that the
code flow is clearer.

While at it sprinkle some comments around because it was very
non-obvious to me what's actually going on here and why.

Note that we really need full barriers here. At first I thought
store-release and load-acquire on ->last_scheduled would be enough,
but we actually require ordering between that and the queue state.

v2: Put smp_rmb() in the right place and fix up comment (Andrey)

Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Steven Price 
Cc: Daniel Vetter 
Cc: Andrey Grodzovsky 
Cc: Lee Jones 
Cc: Boris Brezillon 
---
 drivers/gpu/drm/scheduler/sched_entity.c | 27 ++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 64d398166644..6366006c0fcf 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED);
 
dma_fence_put(entity->last_scheduled);
+
entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);
 
+   /*
+* If the queue is empty we allow drm_sched_entity_select_rq() to
+* locklessly access ->last_scheduled. This only works if we set the
+* pointer before we dequeue and if we have a write barrier here.
+*/
+   smp_wmb();
+
spsc_queue_pop(&entity->job_queue);
return sched_job;
 }
@@ -459,10 +467,25 @@ void drm_sched_entity_select_rq(struct drm_sched_entity 
*entity)
struct drm_gpu_scheduler *sched;
struct drm_sched_rq *rq;
 
-   if (spsc_queue_count(&entity->job_queue) || !entity->sched_list)
+   /* single possible engine and already selected */
+   if (!entity->sched_list)
+   return;
+
+   /* queue non-empty, stay on the same engine */
+   if (spsc_queue_count(&entity->job_queue))
return;
 
-   fence = READ_ONCE(entity->last_scheduled);
+   /*
+* Only when the queue is empty are we guaranteed that the scheduler
+* thread cannot change ->last_scheduled. To enforce ordering we need
+* a read barrier here. See drm_sched_entity_pop_job() for the other
+* side.
+*/
+   smp_rmb();
+
+   fence = entity->last_scheduled;
+
+   /* stay on the same engine if the previous job hasn't finished */
if (fence && !dma_fence_is_signaled(fence))
return;
 
-- 
2.32.0

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
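
For readers who want to poke at the ordering outside the kernel, here is a
minimal userspace sketch of the pattern the patch describes: publish
->last_scheduled, barrier, then dequeue on one side; check the queue,
barrier, then read the pointer on the other. Everything in it is an
assumption made for illustration: struct fence, struct entity, pop_job()
and may_pick_new_rq() are hypothetical names, and the C11 fences are only
rough stand-ins for smp_wmb()/smp_rmb(), not exact equivalents.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these types exist in the kernel. */
struct fence {
	atomic_bool signaled;
};

struct entity {
	struct fence *_Atomic last_scheduled;
	atomic_int queue_count;
};

/* Scheduler-thread side, loosely mirroring drm_sched_entity_pop_job(). */
static void pop_job(struct entity *e, struct fence *finished)
{
	/* Publish the fence pointer first ... */
	atomic_store_explicit(&e->last_scheduled, finished,
			      memory_order_relaxed);
	/* ... then order that store before the dequeue (smp_wmb() stand-in). */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_sub_explicit(&e->queue_count, 1, memory_order_relaxed);
}

/* Submitter side, loosely mirroring drm_sched_entity_select_rq(). */
static bool may_pick_new_rq(struct entity *e)
{
	struct fence *f;

	/* Queue non-empty: stay on the same engine. */
	if (atomic_load_explicit(&e->queue_count, memory_order_relaxed) != 0)
		return false;

	/* Order the count load before the pointer load (smp_rmb() stand-in). */
	atomic_thread_fence(memory_order_acquire);
	f = atomic_load_explicit(&e->last_scheduled, memory_order_relaxed);

	/* Stay on the same engine if the previous job hasn't finished. */
	if (f && !atomic_load(&f->signaled))
		return false;

	return true;
}
```

If the reader observes an empty queue, the paired fences guarantee it also
observes the pointer that was stored before the dequeue, which is the
property the two comments in the patch point at.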


[Intel-gfx] [PATCH 3/7] drm/i915/adl_s: Extend Wa_1406941453

2021-07-08 Thread José Roberto de Souza
BSpec: 54370
Cc: Gwan-gyeong Mun 
Signed-off-by: José Roberto de Souza 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index c346229e2be00..72562c233ad20 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1677,8 +1677,9 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct 
i915_wa_list *wal)
 GEN8_RC_SEMA_IDLE_MSG_DISABLE);
}
 
-   if (IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915)) {
-   /* Wa_1406941453:tgl,rkl,dg1 */
+   if (IS_DG1(i915) || IS_ROCKETLAKE(i915) || IS_TIGERLAKE(i915) ||
+   IS_ALDERLAKE_S(i915)) {
+   /* Wa_1406941453:tgl,rkl,dg1,adl-s */
wa_masked_en(wal,
 GEN10_SAMPLER_MODE,
 ENABLE_SMALLPL);
-- 
2.32.0



[Intel-gfx] [PATCH 4/7] drm/i915: Limit maximum number of memory channels

2021-07-08 Thread José Roberto de Souza
Alderlake-P PCODE is returning 4 memory channels while the platform has
a maximum of 2.
So add this limit and print an informational message; the real fix
will need to come from PCODE.

HSDES: 22013272110
Signed-off-by: José Roberto de Souza 
---
 drivers/gpu/drm/i915/intel_dram.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_dram.c 
b/drivers/gpu/drm/i915/intel_dram.c
index 879b0f007be31..de1d426627ef1 100644
--- a/drivers/gpu/drm/i915/intel_dram.c
+++ b/drivers/gpu/drm/i915/intel_dram.c
@@ -467,6 +467,10 @@ static int icl_pcode_read_mem_global_info(struct 
drm_i915_private *dev_priv)
}
 
dram_info->num_channels = (val & 0xf0) >> 4;
+   if (dram_info->num_channels > 2) {
+   drm_info(&dev_priv->drm, "More DRAM channels than expected, setting to max.\n");
+   dram_info->num_channels = 2;
+   }
dram_info->num_qgv_points = (val & 0xf00) >> 8;
 
return 0;
-- 
2.32.0
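
As a quick sanity check of the field extraction and the new limit, the
decode can be sketched as a standalone function. This is only an
illustration under assumptions: struct dram_global_info,
decode_mem_global_info() and MAX_EXPECTED_CHANNELS are hypothetical names,
and only the masks, shifts and the limit of 2 are taken from the diff
above.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_EXPECTED_CHANNELS 2u /* hypothetical name; the patch hard-codes 2 */

struct dram_global_info {
	unsigned int num_channels;
	unsigned int num_qgv_points;
	int clamped; /* non-zero when the reported channel count was limited */
};

static struct dram_global_info decode_mem_global_info(uint32_t val)
{
	struct dram_global_info info;

	/* Bits 7:4 carry the channel count reported by PCODE. */
	info.num_channels = (val & 0xf0) >> 4;

	/* Limit the count when PCODE reports more than expected. */
	info.clamped = info.num_channels > MAX_EXPECTED_CHANNELS;
	if (info.clamped)
		info.num_channels = MAX_EXPECTED_CHANNELS;

	/* Bits 11:8 carry the number of QGV points. */
	info.num_qgv_points = (val & 0xf00) >> 8;

	return info;
}
```

For example, a raw value of 0x340 reports 4 channels and 3 QGV points, and
the sketch limits the channel count to 2 while leaving the QGV count alone.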



[Intel-gfx] [PATCH 1/7] drm/i915: Settle on "adl-x" in WA comments

2021-07-08 Thread José Roberto de Souza
From: Lucas De Marchi 

Most places already use this format, so let's consolidate on it.

Signed-off-by: José Roberto de Souza 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/i915/display/intel_cdclk.c |  2 +-
 drivers/gpu/drm/i915/display/intel_cursor.c|  2 +-
 drivers/gpu/drm/i915/display/intel_display.c   |  2 +-
 drivers/gpu/drm/i915/display/intel_psr.c   | 10 +-
 drivers/gpu/drm/i915/display/skl_universal_plane.c |  2 +-
 drivers/gpu/drm/i915/gt/intel_workarounds.c|  2 +-
 drivers/gpu/drm/i915/intel_pm.c|  4 ++--
 7 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c 
b/drivers/gpu/drm/i915/display/intel_cdclk.c
index df2d8ce4a12f6..71067a62264de 100644
--- a/drivers/gpu/drm/i915/display/intel_cdclk.c
+++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
@@ -2878,7 +2878,7 @@ void intel_init_cdclk_hooks(struct drm_i915_private 
*dev_priv)
dev_priv->display.bw_calc_min_cdclk = skl_bw_calc_min_cdclk;
dev_priv->display.modeset_calc_cdclk = bxt_modeset_calc_cdclk;
dev_priv->display.calc_voltage_level = tgl_calc_voltage_level;
-   /* Wa_22011320316:adlp[a0] */
+   /* Wa_22011320316:adl-p[a0] */
if (IS_ADLP_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0))
dev_priv->cdclk.table = adlp_a_step_cdclk_table;
else
diff --git a/drivers/gpu/drm/i915/display/intel_cursor.c 
b/drivers/gpu/drm/i915/display/intel_cursor.c
index bb61e736de911..f61a25fb87e90 100644
--- a/drivers/gpu/drm/i915/display/intel_cursor.c
+++ b/drivers/gpu/drm/i915/display/intel_cursor.c
@@ -383,7 +383,7 @@ static u32 i9xx_cursor_ctl(const struct intel_crtc_state 
*crtc_state,
if (plane_state->hw.rotation & DRM_MODE_ROTATE_180)
cntl |= MCURSOR_ROTATE_180;
 
-   /* Wa_22012358565:adlp */
+   /* Wa_22012358565:adl-p */
if (DISPLAY_VER(dev_priv) == 13)
cntl |= MCURSOR_ARB_SLOTS(1);
 
diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 026c28c612f07..65ddb6ca16e67 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -975,7 +975,7 @@ void intel_enable_pipe(const struct intel_crtc_state 
*new_crtc_state)
/* FIXME: assert CPU port conditions for SNB+ */
}
 
-   /* Wa_22012358565:adlp */
+   /* Wa_22012358565:adl-p */
if (DISPLAY_VER(dev_priv) == 13)
intel_de_rmw(dev_priv, PIPE_ARB_CTL(pipe),
 0, PIPE_ARB_USE_PROG_SLOTS);
diff --git a/drivers/gpu/drm/i915/display/intel_psr.c 
b/drivers/gpu/drm/i915/display/intel_psr.c
index 9643624fe160d..4dfe1dceb8635 100644
--- a/drivers/gpu/drm/i915/display/intel_psr.c
+++ b/drivers/gpu/drm/i915/display/intel_psr.c
@@ -545,7 +545,7 @@ static void hsw_activate_psr2(struct intel_dp *intel_dp)
val |= EDP_PSR2_FRAME_BEFORE_SU(intel_dp->psr.sink_sync_latency + 1);
val |= intel_psr2_get_tp_time(intel_dp);
 
-   /* Wa_22012278275:adlp */
+   /* Wa_22012278275:adl-p */
if (IS_ADLP_DISPLAY_STEP(dev_priv, STEP_A0, STEP_D1)) {
static const u8 map[] = {
2, /* 5 lines */
@@ -733,7 +733,7 @@ tgl_dc3co_exitline_compute_config(struct intel_dp *intel_dp,
if (!dc3co_is_pipe_port_compatible(intel_dp, crtc_state))
return;
 
-   /* Wa_16011303918:adlp */
+   /* Wa_16011303918:adl-p */
if (IS_ADLP_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0))
return;
 
@@ -965,7 +965,7 @@ static bool intel_psr2_config_valid(struct intel_dp 
*intel_dp,
return false;
}
 
-   /* Wa_16011303918:adlp */
+   /* Wa_16011303918:adl-p */
if (crtc_state->vrr.enable &&
IS_ADLP_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0)) {
drm_dbg_kms(&dev_priv->drm,
@@ -1160,7 +1160,7 @@ static void intel_psr_enable_source(struct intel_dp 
*intel_dp)
 intel_dp->psr.psr2_sel_fetch_enabled ?
 IGNORE_PSR2_HW_TRACKING : 0);
 
-   /* Wa_16011168373:adlp */
+   /* Wa_16011168373:adl-p */
if (IS_ADLP_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) &&
intel_dp->psr.psr2_enabled)
intel_de_rmw(dev_priv,
@@ -1346,7 +1346,7 @@ static void intel_psr_disable_locked(struct intel_dp 
*intel_dp)
intel_de_rmw(dev_priv, CHICKEN_PAR1_1,
 DIS_RAM_BYPASS_PSR2_MAN_TRACK, 0);
 
-   /* Wa_16011168373:adlp */
+   /* Wa_16011168373:adl-p */
if (IS_ADLP_DISPLAY_STEP(dev_priv, STEP_A0, STEP_A0) &&
intel_dp->psr.psr2_enabled)
intel_de_rmw(dev_priv,
diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c 

[Intel-gfx] [PATCH 6/7] drm/i915/display/adl_p: Correctly program MBUS DBOX A credits

2021-07-08 Thread José Roberto de Souza
Alderlake-P has different values for MBUS DBOX A credits depending
on whether MBUS join is enabled or not.

BSpec: 50343
BSpec: 54369
Cc: Matt Atwood 
Signed-off-by: José Roberto de Souza 
---
 drivers/gpu/drm/i915/display/intel_display.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 65ddb6ca16e67..fe380896eb99e 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -3400,13 +3400,17 @@ static void glk_pipe_scaler_clock_gating_wa(struct 
drm_i915_private *dev_priv,
intel_de_write(dev_priv, CLKGATE_DIS_PSL(pipe), val);
 }
 
-static void icl_pipe_mbus_enable(struct intel_crtc *crtc)
+static void icl_pipe_mbus_enable(struct intel_crtc *crtc, bool joined_mbus)
 {
struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
enum pipe pipe = crtc->pipe;
u32 val;
 
-   val = MBUS_DBOX_A_CREDIT(2);
+   /* Wa_22010947358:adl-p */
+   if (IS_ALDERLAKE_P(dev_priv))
+   val = joined_mbus ? MBUS_DBOX_A_CREDIT(6) : MBUS_DBOX_A_CREDIT(4);
+   else
+   val = MBUS_DBOX_A_CREDIT(2);
 
if (DISPLAY_VER(dev_priv) >= 12) {
val |= MBUS_DBOX_BW_CREDIT(2);
@@ -3561,8 +3565,12 @@ static void hsw_crtc_enable(struct intel_atomic_state 
*state,
if (dev_priv->display.initial_watermarks)
dev_priv->display.initial_watermarks(state, crtc);
 
-   if (DISPLAY_VER(dev_priv) >= 11)
-   icl_pipe_mbus_enable(crtc);
+   if (DISPLAY_VER(dev_priv) >= 11) {
+   const struct intel_dbuf_state *dbuf_state =
+   intel_atomic_get_new_dbuf_state(state);
+
+   icl_pipe_mbus_enable(crtc, dbuf_state->joined_mbus);
+   }
 
if (new_crtc_state->bigjoiner_slave)
intel_crtc_vblank_on(new_crtc_state);
-- 
2.32.0
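
The credit selection in the diff above boils down to a small decision
table. A minimal sketch, assuming a hypothetical helper
mbus_dbox_a_credits() that returns the bare credit count rather than the
MBUS_DBOX_A_CREDIT() register field:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns the DBOX A credit count that the patch
 * programs, given the platform and the MBUS join state.
 */
static unsigned int mbus_dbox_a_credits(bool is_adl_p, bool joined_mbus)
{
	/* Alderlake-P: 6 credits with MBUS joined, 4 otherwise. */
	if (is_adl_p)
		return joined_mbus ? 6 : 4;

	/* Every other platform keeps the previous value. */
	return 2;
}
```

The joined_mbus argument only matters on Alderlake-P; other platforms get
2 credits either way, matching the else branch in the patch.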



[Intel-gfx] [PATCH 5/7] drm/i915: Limit Wa_22010178259 to affected platforms

2021-07-08 Thread José Roberto de Souza
This workaround is not needed for platforms with display 13.

Cc: Gwan-gyeong Mun 
Signed-off-by: José Roberto de Souza 
---
 drivers/gpu/drm/i915/display/intel_display_power.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c 
b/drivers/gpu/drm/i915/display/intel_display_power.c
index 285380079aab2..6fc766da66054 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -5822,10 +5822,11 @@ static void tgl_bw_buddy_init(struct drm_i915_private 
*dev_priv)
intel_de_write(dev_priv, BW_BUDDY_PAGE_MASK(i),
   table[config].page_mask);
 
-   /* Wa_22010178259:tgl,rkl */
-   intel_de_rmw(dev_priv, BW_BUDDY_CTL(i),
-BW_BUDDY_TLB_REQ_TIMER_MASK,
-BW_BUDDY_TLB_REQ_TIMER(0x8));
+   /* Wa_22010178259:tgl,dg1,rkl,adl-s */
+   if (DISPLAY_VER(dev_priv) == 12)
+   intel_de_rmw(dev_priv, BW_BUDDY_CTL(i),
+BW_BUDDY_TLB_REQ_TIMER_MASK,
+BW_BUDDY_TLB_REQ_TIMER(0x8));
}
}
 }
-- 
2.32.0



[Intel-gfx] [PATCH 7/7] drm/i915/display/xelpd: Extend Wa_14011508470

2021-07-08 Thread José Roberto de Souza
This workaround is also applicable to xelpd display, so extend it.

Cc: Gwan-gyeong Mun 
Signed-off-by: José Roberto de Souza 
---
 drivers/gpu/drm/i915/display/intel_display_power.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display_power.c 
b/drivers/gpu/drm/i915/display/intel_display_power.c
index 6fc766da66054..d92db471411e5 100644
--- a/drivers/gpu/drm/i915/display/intel_display_power.c
+++ b/drivers/gpu/drm/i915/display/intel_display_power.c
@@ -5883,8 +5883,8 @@ static void icl_display_core_init(struct drm_i915_private 
*dev_priv,
if (resume && intel_dmc_has_payload(dev_priv))
intel_dmc_load_program(dev_priv);
 
-   /* Wa_14011508470 */
-   if (DISPLAY_VER(dev_priv) == 12) {
+   /* Wa_14011508470:tgl,dg1,rkl,adl-s,adl-p */
+   if (DISPLAY_VER(dev_priv) >= 12) {
val = DCPR_CLEAR_MEMSTAT_DIS | DCPR_SEND_RESP_IMM |
  DCPR_MASK_LPMODE | DCPR_MASK_MAXLATENCY_MEMUP_CLR;
intel_uncore_rmw(&dev_priv->uncore, GEN11_CHICKEN_DCPR_2, 0, val);
-- 
2.32.0
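
To make the difference between this patch and patch 5/7 of the series
explicit, the two display-version checks can be written as predicates.
This is a sketch only; needs_wa_22010178259() and needs_wa_14011508470()
are hypothetical helpers, and the plain int stands in for what
DISPLAY_VER(dev_priv) returns in the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Wa_22010178259: per patch 5/7, only display version 12 needs it. */
static bool needs_wa_22010178259(int display_ver)
{
	return display_ver == 12;
}

/* Wa_14011508470: with this patch, display version 12 and anything newer. */
static bool needs_wa_14011508470(int display_ver)
{
	return display_ver >= 12;
}
```

With these checks a display version 13 platform gets Wa_14011508470 but
not Wa_22010178259, while display version 12 platforms get both.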



[Intel-gfx] [PATCH 2/7] drm/i915: Implement Wa_1508744258

2021-07-08 Thread José Roberto de Souza
The same bit was required for Wa_14012131227 on DG1; it is now also
required as Wa_1508744258 on TGL, RKL, DG1, ADL-S and ADL-P.

Cc: Gwan-gyeong Mun 
Signed-off-by: José Roberto de Souza 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index e5e3f820074a9..c346229e2be00 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -670,6 +670,13 @@ static void gen12_ctx_workarounds_init(struct 
intel_engine_cs *engine,
   FF_MODE2_GS_TIMER_MASK,
   FF_MODE2_GS_TIMER_224,
   0);
+
+   /*
+* Wa_14012131227:dg1
+* Wa_1508744258:tgl,rkl,dg1,adl-s,adl-p
+*/
+   wa_masked_en(wal, GEN7_COMMON_SLICE_CHICKEN1,
+GEN9_RHWO_OPTIMIZATION_DISABLE);
 }
 
 static void dg1_ctx_workarounds_init(struct intel_engine_cs *engine,
-- 
2.32.0



[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/display/xelpd: Fix incorrect color capability reporting

2021-07-08 Thread Patchwork
== Series Details ==

Series: drm/i915/display/xelpd: Fix incorrect color capability reporting
URL   : https://patchwork.freedesktop.org/series/92266/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10308_full -> Patchwork_20542_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20542_full:

### IGT changes ###

 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@kms_ccs@pipe-a-bad-aux-stride-yf_tiled_ccs:
- {shard-rkl}:[FAIL][1] ([i915#3678]) -> [SKIP][2] +2 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-rkl-2/igt@kms_ccs@pipe-a-bad-aux-stride-yf_tiled_ccs.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-rkl-6/igt@kms_ccs@pipe-a-bad-aux-stride-yf_tiled_ccs.html

  * igt@sysfs_timeslice_duration@timeout@vcs0:
- {shard-rkl}:[PASS][3] -> [FAIL][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-rkl-6/igt@sysfs_timeslice_duration@time...@vcs0.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-rkl-6/igt@sysfs_timeslice_duration@time...@vcs0.html

  
Known issues


  Here are the changes found in Patchwork_20542_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_create@create-massive:
- shard-snb:  NOTRUN -> [DMESG-WARN][5] ([i915#3002])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-snb2/igt@gem_cre...@create-massive.html
- shard-kbl:  NOTRUN -> [DMESG-WARN][6] ([i915#3002])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-kbl1/igt@gem_cre...@create-massive.html

  * igt@gem_ctx_persistence@legacy-engines-mixed:
- shard-snb:  NOTRUN -> [SKIP][7] ([fdo#109271] / [i915#1099]) +5 
similar issues
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-snb6/igt@gem_ctx_persiste...@legacy-engines-mixed.html

  * igt@gem_eio@in-flight-contexts-1us:
- shard-tglb: [PASS][8] -> [TIMEOUT][9] ([i915#3063])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-tglb7/igt@gem_...@in-flight-contexts-1us.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-tglb3/igt@gem_...@in-flight-contexts-1us.html

  * igt@gem_exec_fair@basic-flow@rcs0:
- shard-tglb: [PASS][10] -> [FAIL][11] ([i915#2842]) +2 similar 
issues
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-tglb5/igt@gem_exec_fair@basic-f...@rcs0.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-tglb1/igt@gem_exec_fair@basic-f...@rcs0.html

  * igt@gem_exec_fair@basic-none@rcs0:
- shard-kbl:  [PASS][12] -> [FAIL][13] ([i915#2842]) +1 similar 
issue
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-kbl7/igt@gem_exec_fair@basic-n...@rcs0.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-kbl7/igt@gem_exec_fair@basic-n...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vecs0:
- shard-kbl:  [PASS][14] -> [SKIP][15] ([fdo#109271]) +1 similar 
issue
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-kbl3/igt@gem_exec_fair@basic-p...@vecs0.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-kbl4/igt@gem_exec_fair@basic-p...@vecs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-glk:  [PASS][16] -> [FAIL][17] ([i915#2842]) +2 similar 
issues
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-glk8/igt@gem_exec_fair@basic-throt...@rcs0.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-glk8/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_exec_reloc@basic-wide-active@vcs1:
- shard-iclb: NOTRUN -> [FAIL][18] ([i915#3633])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-iclb1/igt@gem_exec_reloc@basic-wide-act...@vcs1.html

  * igt@gem_exec_suspend@basic-s3-devices:
- shard-iclb: [PASS][19] -> [INCOMPLETE][20] ([i915#1185])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-iclb4/igt@gem_exec_susp...@basic-s3-devices.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-iclb3/igt@gem_exec_susp...@basic-s3-devices.html

  * igt@gem_mmap_gtt@big-copy-xy:
- shard-skl:  [PASS][21] -> [FAIL][22] ([i915#307])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10308/shard-skl4/igt@gem_mmap_...@big-copy-xy.html
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20542/shard-skl1/igt@gem_mmap_...@big-copy-xy.html

  * igt@gem_ppgtt@flink-and-close-vma-leak:
- shard-glk:  [PASS][23] -> [FAIL][24] 

Re: [Intel-gfx] [PATCH v3 03/20] drm/sched: Barriers are needed for entity->last_scheduled

2021-07-08 Thread Daniel Vetter
On Thu, Jul 8, 2021 at 8:56 PM Andrey Grodzovsky
 wrote:
> On 2021-07-08 1:37 p.m., Daniel Vetter wrote:
> > It might be good enough on x86 with just READ_ONCE, but the write side
> > should then at least be WRITE_ONCE because x86 has total store order.
> >
> > It's definitely not enough on arm.
> >
> > Fix this proplery, which means
> > - explain the need for the barrier in both places
> > - point at the other side in each comment
> >
> > Also pull out the !sched_list case as the first check, so that the
> > code flow is clearer.
> >
> > While at it sprinkle some comments around because it was very
> > non-obvious to me what's actually going on here and why.
> >
> > Note that we really need full barriers here, at first I thought
> > store-release and load-acquire on ->last_scheduled would be enough,
> > but we actually requiring ordering between that and the queue state.
> >
> > Signed-off-by: Daniel Vetter 
> > Cc: "Christian König" 
> > Cc: Steven Price 
> > Cc: Daniel Vetter 
> > Cc: Andrey Grodzovsky 
> > Cc: Lee Jones 
> > Cc: Boris Brezillon 
> > ---
> >   drivers/gpu/drm/scheduler/sched_entity.c | 27 ++--
> >   1 file changed, 25 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
> > b/drivers/gpu/drm/scheduler/sched_entity.c
> > index 64d398166644..4e1124ed80e0 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
> > drm_sched_entity *entity)
> >   dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED);
> >
> >   dma_fence_put(entity->last_scheduled);
> > +
> >   entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);
> >
> > + /*
> > +  * if the queue is empty we allow drm_sched_job_arm() to locklessly
>
>
> Probably meant drm_sched_entity_select_rq here

Which is called from drm_sched_job_arm but yes. I'll switch it around.

> > +  * access ->last_scheduled. This only works if we set the pointer 
> > before
> > +  * we dequeue and if we a write barrier here.
> > +  */
> > + smp_wmb();
> > +
> >   spsc_queue_pop(&entity->job_queue);
> >   return sched_job;
> >   }
> > @@ -459,10 +467,25 @@ void drm_sched_entity_select_rq(struct 
> > drm_sched_entity *entity)
> >   struct drm_gpu_scheduler *sched;
> >   struct drm_sched_rq *rq;
> >
> > - if (spsc_queue_count(&entity->job_queue) || !entity->sched_list)
> > + /* single possible engine and already selected */
> > + if (!entity->sched_list)
> > + return;
> > +
> > + /* queue non-empty, stay on the same engine */
> > + if (spsc_queue_count(&entity->job_queue))
> >   return;
>
>
> Shouldn't smp_rmb be here in between ? Given the queue is empty we want to
> be certain we are reading the most recent value of entity->last_scheduled

Yeah I had a load_acquire barrier here earlier and then put the
smp_rmb() on the wrong side. Will fix.
>
> Andrey
>
>
>
> >
> > - fence = READ_ONCE(entity->last_scheduled);
> > + fence = entity->last_scheduled;
> > +
> > + /*
> > +  * Only when the queue is empty are we guaranteed the the scheduler
> > +  * thread cannot change ->last_scheduled. To enforce ordering we need
> > +  * a read barrier here. See drm_sched_entity_pop_job() for the other
> > +  * side.
> > +  */
> > + smp_rmb();
> > +
> > + /* stay on the same engine if the previous job hasn't finished */
> >   if (fence && !dma_fence_is_signaled(fence))
> >   return;
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Intel-gfx] [PATCH 0/7] Minor revid/stepping and workaround cleanup

2021-07-08 Thread Srivatsa, Anusha



> -Original Message-
> From: Jani Nikula 
> Sent: Thursday, July 8, 2021 12:33 AM
> To: Roper, Matthew D ; intel-
> g...@lists.freedesktop.org
> Cc: Srivatsa, Anusha 
> Subject: Re: [PATCH 0/7] Minor revid/stepping and workaround cleanup
> 
> On Wed, 07 Jul 2021, Matt Roper  wrote:
> > PCI revision IDs don't always map to GT and display IP steppings in an
> > intuitive/sensible way.  On many of our recent platforms we've
> > switched to using revid->stepping lookup tables with the
> > infrastructure in intel_step.c to handle stepping lookups and
> > comparisons.  Since it's confusing to have some of our platforms using
> > the new lookup tables and some still using old revid comparisons,
> > let's migrate all the old platforms over to the table approach since
> > that's what we want to standardize on going forward.  The only place
> > that revision ID's should really get used directly now is when
> > checking to see if we're running on pre-production hardware.
> 
> Anusha, Matt, please sort this out between the two of you. :)
> 
> https://patchwork.freedesktop.org/series/92257/
> 
@Roper, Matthew D the series doesn't add the stepping table for BXT and GLK.

Anusha
> BR,
> Jani.
> 
> 
> >
> > Let's also take the opportunity to drop a bit of effectively dead code
> > in the workarounds file too.
> >
> > Cc: Jani Nikula 
> >
> > Matt Roper (7):
> >   drm/i915: Make pre-production detection use direct revid comparison
> >   drm/i915/skl: Use revid->stepping tables
> >   drm/i915/icl: Use revid->stepping tables
> >   drm/i915/jsl_ehl: Use revid->stepping tables
> >   drm/i915/rkl: Use revid->stepping tables
> >   drm/i915/dg1: Use revid->stepping tables
> >   drm/i915/cnl: Drop all workarounds
> >
> >  .../drm/i915/display/intel_display_power.c|  2 +-
> >  drivers/gpu/drm/i915/display/intel_dpll_mgr.c |  2 +-
> >  drivers/gpu/drm/i915/display/intel_psr.c  |  4 +-
> >  drivers/gpu/drm/i915/gt/intel_region_lmem.c   |  2 +-
> >  drivers/gpu/drm/i915/gt/intel_workarounds.c   | 81 +++
> >  drivers/gpu/drm/i915/i915_drv.c   |  8 +-
> >  drivers/gpu/drm/i915/i915_drv.h   | 80 +++---
> >  drivers/gpu/drm/i915/intel_pm.c   |  2 +-
> >  drivers/gpu/drm/i915/intel_step.c | 72 +++--
> >  drivers/gpu/drm/i915/intel_step.h |  7 ++
> >  10 files changed, 107 insertions(+), 153 deletions(-)
> 
> --
> Jani Nikula, Intel Open Source Graphics Center
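
For anyone following along, the table approach from the cover letter can
be sketched in standalone C roughly as follows. This is only an
illustration under assumptions: the enum values, skl_revid_to_step[] and
revid_to_step() are hypothetical and much simpler than the
intel_step_info tables in intel_step.c. The revid to stepping pairs are
the SKL ones listed in patch 2/7, including the out-of-order 0x9/0xA pair
and the intentional hole at 0x8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative enum: only the ordering matters, not the actual values. */
enum step {
	STEP_NONE = 0,
	STEP_A0, STEP_B0, STEP_C0, STEP_D0,
	STEP_E0, STEP_F0, STEP_G0, STEP_H0,
	STEP_I1, STEP_J0,
};

/* Entries that are not listed are zero-initialized, i.e. STEP_NONE. */
static const enum step skl_revid_to_step[] = {
	[0x0] = STEP_A0,
	[0x1] = STEP_B0,
	[0x2] = STEP_C0,
	[0x3] = STEP_D0,
	[0x4] = STEP_E0,
	[0x5] = STEP_F0,
	[0x6] = STEP_G0,
	[0x7] = STEP_H0,
	/* 0x8 is intentionally absent */
	[0x9] = STEP_J0,
	[0xA] = STEP_I1,
};

/* Map a PCI revision ID to a stepping that can be compared numerically. */
static enum step revid_to_step(uint8_t revid)
{
	size_t n = sizeof(skl_revid_to_step) / sizeof(skl_revid_to_step[0]);

	if (revid >= n)
		return STEP_NONE;

	return skl_revid_to_step[revid];
}
```

Once the revision ID has gone through the table, "newer than H0" style
checks become plain integer comparisons, even though revid 0xA maps to an
older stepping than revid 0x9.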


Re: [Intel-gfx] [PATCH 2/7] drm/i915/skl: Use revid->stepping tables

2021-07-08 Thread Srivatsa, Anusha



> -Original Message-
> From: Intel-gfx  On Behalf Of
> Matt Roper
> Sent: Wednesday, July 7, 2021 10:38 PM
> To: intel-gfx@lists.freedesktop.org
> Subject: [Intel-gfx] [PATCH 2/7] drm/i915/skl: Use revid->stepping tables
> 
> Switch SKL to use a revid->stepping table as we're trying to do on all
> platforms going forward.  Also add some additional stepping definitions for
> completeness, even if we don't have any workarounds tied to them.
> 
> Note that SKL has a case where a newer revision ID corresponds to an older
> GT/disp stepping (0x9 -> STEP_J0, 0xA -> STEP_I1).  Also, the lack of a 
> revision
> ID 0x8 in the table is intentional and not an oversight.
> We'll re-write the KBL-specific comment to make it clear that these kind of
> quirks are expected.
> 
> Finally, since we're already touching the KBL area too, let's rename the KBL
> table to match the naming convention used by all of the other platforms.
> 
> Bspec: 13626
> Signed-off-by: Matt Roper 
> ---
>  drivers/gpu/drm/i915/gt/intel_workarounds.c |  2 +-
>  drivers/gpu/drm/i915/i915_drv.h | 11 +--
>  drivers/gpu/drm/i915/intel_step.c   | 35 -
>  drivers/gpu/drm/i915/intel_step.h   |  4 +++
>  4 files changed, 33 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index d9a5a445ceec..6dfd564e078f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -883,7 +883,7 @@ skl_gt_workarounds_init(struct drm_i915_private
> *i915, struct i915_wa_list *wal)
>   GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE);
> 
>   /* WaInPlaceDecompressionHang:skl */
> - if (IS_SKL_REVID(i915, SKL_REVID_H0, REVID_FOREVER))
> + if (IS_SKL_GT_STEP(i915, STEP_H0, STEP_FOREVER))
>   wa_write_or(wal,
>   GEN9_GAMT_ECO_REG_RW_IA,
>   GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h index 796e6838bc79..300575f64ca6
> 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1462,16 +1462,7 @@ IS_SUBPLATFORM(const struct drm_i915_private
> *i915,  #define IS_TGL_Y(dev_priv) \
>   IS_SUBPLATFORM(dev_priv, INTEL_TIGERLAKE,
> INTEL_SUBPLATFORM_ULX)
> 
> -#define SKL_REVID_A0 0x0
> -#define SKL_REVID_B0 0x1
> -#define SKL_REVID_C0 0x2
> -#define SKL_REVID_D0 0x3
> -#define SKL_REVID_E0 0x4
> -#define SKL_REVID_F0 0x5
> -#define SKL_REVID_G0 0x6
> -#define SKL_REVID_H0 0x7
> -
> -#define IS_SKL_REVID(p, since, until) (IS_SKYLAKE(p) && IS_REVID(p, since,
> until))
> +#define IS_SKL_GT_STEP(p, since, until) (IS_SKYLAKE(p) && IS_GT_STEP(p,
> +since, until))
> 
>  #define IS_KBL_GT_STEP(dev_priv, since, until) \
>   (IS_KABYLAKE(dev_priv) && IS_GT_STEP(dev_priv, since, until)) diff -
> -git a/drivers/gpu/drm/i915/intel_step.c
> b/drivers/gpu/drm/i915/intel_step.c
> index ba9479a67521..bfd63f56c200 100644
> --- a/drivers/gpu/drm/i915/intel_step.c
> +++ b/drivers/gpu/drm/i915/intel_step.c
> @@ -7,15 +7,31 @@
>  #include "intel_step.h"
> 
>  /*
> - * KBL revision ID ordering is bizarre; higher revision ID's map to lower
> - * steppings in some cases.  So rather than test against the revision ID
> - * directly, let's map that into our own range of increasing ID's that we
> - * can test against in a regular manner.
> + * Some platforms have unusual ways of mapping PCI revision ID to
> + GT/display
> + * steppings.  E.g., in some cases a higher PCI revision may translate
> + to a
> + * lower stepping of the GT and/or display IP.  This file provides
> + lookup
> + * tables to map the PCI revision into a standard set of stepping
> + values that
> + * can be compared numerically.
> + *
> + * Also note that some revisions/steppings may have been set aside as
> + * placeholders but never materialized in real hardware; in those cases
> + there
> + * may be jumps in the revision IDs or stepping values in the tables below.
>   */
> 
> +static const struct intel_step_info skl_revid_step_tbl[] = {
> + [0x0] = { .gt_step = STEP_A0, .display_step = STEP_A0 },
> + [0x1] = { .gt_step = STEP_B0, .display_step = STEP_B0 },
> + [0x2] = { .gt_step = STEP_C0, .display_step = STEP_C0 },
> + [0x3] = { .gt_step = STEP_D0, .display_step = STEP_D0 },
> + [0x4] = { .gt_step = STEP_E0, .display_step = STEP_E0 },
> + [0x5] = { .gt_step = STEP_F0, .display_step = STEP_F0 },
> + [0x6] = { .gt_step = STEP_G0, .display_step = STEP_G0 },
> + [0x7] = { .gt_step = STEP_H0, .display_step = STEP_H0 },
> + [0x9] = { .gt_step = STEP_J0, .display_step = STEP_J0 },
> + [0xA] = { .gt_step = STEP_I1, .display_step = STEP_I1 }, };

Feedback I received was to avoid adding .display_step if it is same as .gt_step 
and have something 

Re: [Intel-gfx] [PATCH 1/7] drm/i915: Make pre-production detection use direct revid comparison

2021-07-08 Thread Srivatsa, Anusha



> -Original Message-
> From: Intel-gfx  On Behalf Of
> Matt Roper
> Sent: Wednesday, July 7, 2021 10:38 PM
> To: intel-gfx@lists.freedesktop.org
> Subject: [Intel-gfx] [PATCH 1/7] drm/i915: Make pre-production detection
> use direct revid comparison
> 
> Although we're converting our workarounds to use a revid->stepping lookup
> table, the function that detects pre-production hardware should continue to
> compare against PCI revision ID values directly.  These are listed in the 
> bspec
> as integers, so it's easier to confirm their correctness if we just use an 
> integer
> literal rather than a symbolic name anyway.
> 
> Since the BXT, GLK, and CNL revid macros were never used in any
> workaround code, just remove them completely.
> 
> Bspec: 13620, 19131, 13626, 18329
> Signed-off-by: Matt Roper 
> ---
>  drivers/gpu/drm/i915/i915_drv.c   |  8 
>  drivers/gpu/drm/i915/i915_drv.h   | 24 
>  drivers/gpu/drm/i915/intel_step.h |  1 +
>  3 files changed, 5 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c
> b/drivers/gpu/drm/i915/i915_drv.c index 30d8cd8c69b1..90136995f5eb
> 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -271,10 +271,10 @@ static void intel_detect_preproduction_hw(struct
> drm_i915_private *dev_priv)
>   bool pre = false;
> 
>   pre |= IS_HSW_EARLY_SDV(dev_priv);
> - pre |= IS_SKL_REVID(dev_priv, 0, SKL_REVID_F0);
> - pre |= IS_BXT_REVID(dev_priv, 0, BXT_REVID_B_LAST);
> - pre |= IS_KBL_GT_STEP(dev_priv, 0, STEP_A0);
> - pre |= IS_GLK_REVID(dev_priv, 0, GLK_REVID_A2);
> + pre |= IS_SKYLAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x6;
> + pre |= IS_BROXTON(dev_priv) && INTEL_REVID(dev_priv) < 0xA;
> + pre |= IS_KABYLAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x1;
> + pre |= IS_GEMINILAKE(dev_priv) && INTEL_REVID(dev_priv) < 0x3;
> 
>   if (pre) {
>   drm_err(&dev_priv->drm, "This is a pre-production stepping.
> "
> diff --git a/drivers/gpu/drm/i915/i915_drv.h
> b/drivers/gpu/drm/i915/i915_drv.h index 6dff4ca01241..796e6838bc79
> 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1473,35 +1473,11 @@ IS_SUBPLATFORM(const struct drm_i915_private
> *i915,
> 
>  #define IS_SKL_REVID(p, since, until) (IS_SKYLAKE(p) && IS_REVID(p, since,
> until))
> 
> -#define BXT_REVID_A0 0x0
> -#define BXT_REVID_A1 0x1
> -#define BXT_REVID_B0 0x3
> -#define BXT_REVID_B_LAST 0x8
> -#define BXT_REVID_C0 0x9
> -
> -#define IS_BXT_REVID(dev_priv, since, until) \
> - (IS_BROXTON(dev_priv) && IS_REVID(dev_priv, since, until))

Here, we can have IS_BXT_GT_STEP, similar to other platforms, and use it in
intel_detect_preproduction_hw() above.
Same for the other platforms, SKL and GLK. KBL already uses IS_KBL_GT_STEP.

Anusha 
>  #define IS_KBL_GT_STEP(dev_priv, since, until) \
>   (IS_KABYLAKE(dev_priv) && IS_GT_STEP(dev_priv, since, until))
> #define IS_KBL_DISPLAY_STEP(dev_priv, since, until) \
>   (IS_KABYLAKE(dev_priv) && IS_DISPLAY_STEP(dev_priv, since,
> until))
> 
> -#define GLK_REVID_A0 0x0
> -#define GLK_REVID_A1 0x1
> -#define GLK_REVID_A2 0x2
> -#define GLK_REVID_B0 0x3
> -
> -#define IS_GLK_REVID(dev_priv, since, until) \
> - (IS_GEMINILAKE(dev_priv) && IS_REVID(dev_priv, since, until))
> -
> -#define CNL_REVID_A0 0x0
> -#define CNL_REVID_B0 0x1
> -#define CNL_REVID_C0 0x2
> -
> -#define IS_CNL_REVID(p, since, until) \
> - (IS_CANNONLAKE(p) && IS_REVID(p, since, until))
> -
>  #define ICL_REVID_A0 0x0
>  #define ICL_REVID_A2 0x1
>  #define ICL_REVID_B0 0x3
> diff --git a/drivers/gpu/drm/i915/intel_step.h
> b/drivers/gpu/drm/i915/intel_step.h
> index 958a8bb5d677..8efacef6ab31 100644
> --- a/drivers/gpu/drm/i915/intel_step.h
> +++ b/drivers/gpu/drm/i915/intel_step.h
> @@ -22,6 +22,7 @@ struct intel_step_info {  enum intel_step {
>   STEP_NONE = 0,
>   STEP_A0,
> + STEP_A1,
>   STEP_A2,
>   STEP_B0,
>   STEP_B1,
> --
> 2.25.4
> 
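For readers following the revid vs. stepping discussion in this thread, here
is a minimal standalone sketch of the two styles of check. Every name in it
(example_revid_tbl, revid_to_step, is_step, is_preproduction), the table
contents, and the inclusive range bounds are assumptions made for
illustration; none of them are the i915 definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stepping enum; only the ordering matters. */
enum step { STEP_NONE = 0, STEP_A0, STEP_B0, STEP_C0 };

/*
 * Sparse revid -> stepping table. Revision IDs without an entry are
 * zero-initialized and therefore read back as STEP_NONE.
 */
static const enum step example_revid_tbl[] = {
	[0x0] = STEP_A0,
	[0x3] = STEP_B0,
	[0x9] = STEP_C0,
};

static enum step revid_to_step(uint8_t revid)
{
	size_t n = sizeof(example_revid_tbl) / sizeof(example_revid_tbl[0]);

	if (revid >= n)
		return STEP_NONE;
	return example_revid_tbl[revid];
}

/* Workaround-style check: is the stepping within [since, until]? */
static bool is_step(uint8_t revid, enum step since, enum step until)
{
	enum step s = revid_to_step(revid);

	return s != STEP_NONE && s >= since && s <= until;
}

/* Pre-production-style check: raw revid against an integer literal. */
static bool is_preproduction(uint8_t revid, uint8_t first_production_revid)
{
	return revid < first_production_revid;
}
```

With this made-up table, revid 0x3 maps to STEP_B0 while revid 0x1 has no
entry and yields STEP_NONE. The pre-production check never consults the
table, which is the property the commit message argues for.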


Re: [Intel-gfx] [PATCH 00/30] drm/i915/gem: ioctl clean-ups (v9)

2021-07-08 Thread Daniel Vetter
On Thu, Jul 08, 2021 at 10:48:05AM -0500, Jason Ekstrand wrote:
> Overview:
> -
> 
> This patch series attempts to clean up some of the IOCTL mess we've created
> over the last few years.  The most egregious bit being context mutability.
> In summary, this series:
> 
>  1. Drops two never-used context params: RINGSIZE and NO_ZEROMAP
>  2. Drops the entire CONTEXT_CLONE API
>  3. Implements SINGLE_TIMELINE with a syncobj instead of actually sharing
> intel_timeline between engines.
>  4. Adds a few sanity restrictions to the balancing/bonding API.
>  5. Implements a proto-ctx mechanism so that the engine set and VM can only
> be set early on in the lifetime of a context, before anything ever
> executes on it.  This effectively makes the VM and engine set
> immutable.
> 
> This series has been tested with IGT as well as the Iris, ANV, and the
> Intel media driver doing an 8K decode (this uses bonding/balancing).  I've
> also done quite a bit of git archeology to ensure that nothing in here will
> break anything that's already shipped at some point in history.  It's
> possible I've missed something, but I've dug quite a bit.
> 
> 
> Details and motivation:
> ---
> 
> In very broad strokes, there's an effort going on right now within Intel to
> try and clean up and simplify i915 anywhere we can.  We obviously don't
> want to break any shipping userspace but, as can be seen by this series,
> there's a lot i915 theoretically supports which userspace doesn't actually
> need.  Some of this, like the two context params used here, were simply
> oversights where we went through the usual API review process and merged
> the i915 bits but the userspace bits never landed for some reason.
> 
> Not all are so innocent, however.  For instance, there's an entire context
> cloning API which allows one to create a context with certain parameters
> "cloned" from some other context.  This entire API has never been used by
> any userspace except IGT and there were never patches to any other
> userspace to use it.  It never should have landed.  Also, when we added
> support for setting explicit engine sets and sharing VMs across contexts,
> people decided to do so via SET_CONTEXT_PARAM.  While this allowed them to
> re-use existing API, it did so at the cost of making those states mutable
> which leads to a plethora of potential race conditions.  There were even
> IGT tests merged to cover some of these:
> 
>  - gem_vm_create@async-destroy and gem_vm_create@destroy-race which test
>swapping out the VM on a running context.
> 
>  - gem_ctx_persistence@replace* which test whether a client can escape a
>non-persistent context by submitting a hanging batch and then swapping
>out the engine set before the hang is detected.
> 
>  - api_intel_bb@bb-with-vm which tests that intel_bb_assign_vm works
>properly.  This API is never used by any other IGT test.
> 
> There is also an entire deferred flush and set state framework in
> i915_gem_context.c which exists for safely swapping out the VM while there
> is work in-flight on a context.
> 
> So, clearly people knew that this API was inherently racy and difficult to
> implement but they landed it anyway.  Why?  The best explanation I've been
> given is because it makes the API more "unified" or "symmetric" for this
> stuff to go through SET_CONTEXT_PARAM.  It's not because any userspace
> actually wants to be able to swap out the VM or the set of engines on a
> running context.  That would be utterly insane.
> 
> This patch series cleans up this particular mess by introducing the concept
> of an i915_gem_proto_context data structure which contains context creation
> information.  When you initially call GEM_CONTEXT_CREATE, a proto-context
> is created instead of an actual context.  Then, the first time something is
> done on the context besides SET_CONTEXT_PARAM, an actual context is
> created.  This allows us to keep the old drivers which use
> SET_CONTEXT_PARAM to set up the engine set (see also media) while ensuring
> that, once you have an i915_gem_context, the VM and the engine set are
> immutable state.
> 
> Eventually, there are more clean-ups I'd like to do on top of this which
> should make working with contexts inside i915 simpler and safer:
> 
>  1. Move the GEM handle -> vma LUT from i915_gem_context into either
> i915_ppgtt or drm_i915_file_private depending on whether or not the
> hardware has a full PPGTT.
> 
>  2. Move the delayed context destruction code into intel_context or a
> per-engine wrapper struct rather than i915_gem_context.
> 
>  3. Get rid of the separation between context close and context destroy
> 
>  4. Get rid of the RCU on i915_gem_context
> 
> However, these should probably be done as a separate patch series as this
> one is already starting to get longish, especially if you consider the 89
> IGT patches that go along with it.
> 
> Test-with: 
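
The proto-context idea described in the cover letter can be modelled in a
few lines. This is only a sketch under stated assumptions: the structures,
the function names (ctx_create, ctx_set_vm, ctx_set_engines, ctx_finalize,
ctx_execbuf) and the -EINVAL return value are hypothetical and are not taken
from the series itself.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical creation parameters; not the real i915 structures. */
struct proto_ctx {
	int vm_id;
	int num_engines;
};

struct ctx_handle {
	struct proto_ctx proto; /* mutable until the context is finalized */
	bool finalized;         /* set once the real context exists */
	int vm_id;              /* frozen at finalization */
	int num_engines;        /* frozen at finalization */
};

/* Models GEM_CONTEXT_CREATE: only a proto-context is set up. */
static void ctx_create(struct ctx_handle *h)
{
	h->proto.vm_id = 0;
	h->proto.num_engines = 1;
	h->finalized = false;
	h->vm_id = 0;
	h->num_engines = 0;
}

/* Models SET_CONTEXT_PARAM for the VM: rejected after finalization. */
static int ctx_set_vm(struct ctx_handle *h, int vm_id)
{
	if (h->finalized)
		return -EINVAL; /* error code chosen for this sketch */
	h->proto.vm_id = vm_id;
	return 0;
}

/* Models SET_CONTEXT_PARAM for the engine set, with the same rule. */
static int ctx_set_engines(struct ctx_handle *h, int num_engines)
{
	if (h->finalized)
		return -EINVAL;
	h->proto.num_engines = num_engines;
	return 0;
}

/* Any other use of the handle turns the proto-context into a context. */
static void ctx_finalize(struct ctx_handle *h)
{
	if (h->finalized)
		return;
	h->vm_id = h->proto.vm_id;
	h->num_engines = h->proto.num_engines;
	h->finalized = true;
}

/* Models a first submission; returns the number of engines available. */
static int ctx_execbuf(struct ctx_handle *h)
{
	ctx_finalize(h);
	return h->num_engines;
}
```

The point of the split is that the setters only ever touch the proto half,
so once ctx_finalize() has run there is no code path left that can change
the VM or the engine set under in-flight work.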

[Intel-gfx] [PATCH] drm/i915/dg1: Compute MEM Bandwidth using MCHBAR

2021-07-08 Thread Lucas De Marchi
From: Clint Taylor 

The PUNIT FW is currently returning 0 for all memory bandwidth
parameters. Read the values directly from MCHBAR offsets 0x5918,
0x4000 and 0x4004.

v2 (Lucas): tidy up checking for ret slightly
v3 (Lucas):
  - Squash change to double the memory bandwidth based on
MCHBAR Gear_type
  - Move ICL_GEAR_TYPE_MASK to the appropriate place and change prefix
to DG1
  - Move register definitions to i915_reg.h
  - Make the MCHBAR path permanent for DG1
  - Convert to REG_BIT()/REG_GENMASK()
v4: Drop unneeded initializations

Cc: Ville Syrjälä 
Cc: Matt Roper 
Cc: Jani Saarinen 
Signed-off-by: Clint Taylor 
Signed-off-by: Jani Nikula 
Signed-off-by: Matthew Auld 
Reviewed-by: Lucas De Marchi 
Signed-off-by: Lucas De Marchi 
Reviewed-by: José Roberto de Souza 
---
 drivers/gpu/drm/i915/display/intel_bw.c | 41 -
 drivers/gpu/drm/i915/i915_reg.h | 12 
 2 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_bw.c 
b/drivers/gpu/drm/i915/display/intel_bw.c
index bfb398f0432e..0d5d52548925 100644
--- a/drivers/gpu/drm/i915/display/intel_bw.c
+++ b/drivers/gpu/drm/i915/display/intel_bw.c
@@ -23,6 +23,41 @@ struct intel_qgv_info {
u8 t_bl;
 };
 
+static int dg1_mchbar_read_qgv_point_info(struct drm_i915_private *dev_priv,
+ struct intel_qgv_point *sp,
+ int point)
+{
+   u32 dclk_ratio, dclk_reference;
+   u32 val;
+
+   val = intel_uncore_read(&dev_priv->uncore, 
SA_PERF_STATUS_0_0_0_MCHBAR_PC);
+   dclk_ratio = REG_FIELD_GET(DG1_QCLK_RATIO_MASK, val);
+   if (val & DG1_QCLK_REFERENCE)
+   dclk_reference = 6; /* 6 * 16.666 MHz = 100 MHz */
+   else
+   dclk_reference = 8; /* 8 * 16.666 MHz = 133 MHz */
+   sp->dclk = dclk_ratio * dclk_reference;
+
+   val = intel_uncore_read(&dev_priv->uncore, 
SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU);
+   if (val & DG1_GEAR_TYPE)
+   sp->dclk *= 2;
+
+   if (sp->dclk == 0)
+   return -EINVAL;
+
+   val = intel_uncore_read(&dev_priv->uncore, 
MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR);
+   sp->t_rp = REG_FIELD_GET(DG1_DRAM_T_RP_MASK, val);
+   sp->t_rdpre = REG_FIELD_GET(DG1_DRAM_T_RDPRE_MASK, val);
+
+   val = intel_uncore_read(&dev_priv->uncore, 
MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR_HIGH);
+   sp->t_rcd = REG_FIELD_GET(DG1_DRAM_T_RCD_MASK, val);
+   sp->t_ras = REG_FIELD_GET(DG1_DRAM_T_RAS_MASK, val);
+
+   sp->t_rc = sp->t_rp + sp->t_ras;
+
+   return 0;
+}
+
 static int icl_pcode_read_qgv_point_info(struct drm_i915_private *dev_priv,
 struct intel_qgv_point *sp,
 int point)
@@ -99,7 +134,11 @@ static int icl_get_qgv_points(struct drm_i915_private 
*dev_priv,
for (i = 0; i < qi->num_points; i++) {
struct intel_qgv_point *sp = &qi->points[i];
 
-   ret = icl_pcode_read_qgv_point_info(dev_priv, sp, i);
+   if (IS_DG1(dev_priv))
+   ret = dg1_mchbar_read_qgv_point_info(dev_priv, sp, i);
+   else
+   ret = icl_pcode_read_qgv_point_info(dev_priv, sp, i);
+
if (ret)
return ret;
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 16a19239d86d..943fe485c662 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -11060,6 +11060,7 @@ enum skl_power_gate {
 #define SKL_MEMORY_FREQ_MULTIPLIER_HZ  2
 #define SKL_MC_BIOS_DATA_0_0_0_MCHBAR_PCU  _MMIO(MCHBAR_MIRROR_BASE_SNB + 
0x5E04)
 #define  SKL_REQ_DATA_MASK (0xF << 0)
+#define  DG1_GEAR_TYPE REG_BIT(16)
 
 #define SKL_MAD_INTER_CHANNEL_0_0_0_MCHBAR_MCMAIN _MMIO(MCHBAR_MIRROR_BASE_SNB 
+ 0x5000)
 #define  SKL_DRAM_DDR_TYPE_MASK(0x3 << 0)
@@ -11095,6 +11096,17 @@ enum skl_power_gate {
 #define  CNL_DRAM_RANK_3   (0x2 << 9)
 #define  CNL_DRAM_RANK_4   (0x3 << 9)
 
+#define SA_PERF_STATUS_0_0_0_MCHBAR_PC _MMIO(MCHBAR_MIRROR_BASE_SNB + 
0x5918)
+#define  DG1_QCLK_RATIO_MASK   REG_GENMASK(9, 2)
+#define  DG1_QCLK_REFERENCEREG_BIT(10)
+
+#define MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR  _MMIO(MCHBAR_MIRROR_BASE_SNB + 
0x4000)
+#define   DG1_DRAM_T_RDPRE_MASKREG_GENMASK(16, 11)
+#define   DG1_DRAM_T_RP_MASK   REG_GENMASK(6, 0)
+#define MCHBAR_CH0_CR_TC_PRE_0_0_0_MCHBAR_HIGH _MMIO(MCHBAR_MIRROR_BASE_SNB + 
0x4004)
+#define   DG1_DRAM_T_RCD_MASK  REG_GENMASK(15, 9)
+#define   DG1_DRAM_T_RAS_MASK  REG_GENMASK(8, 1)
+
 /*
  * Please see hsw_read_dcomp() and hsw_write_dcomp() before using this 
register,
  * since on HSW we can't write to it using intel_uncore_write.
-- 
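
The dclk arithmetic in dg1_mchbar_read_qgv_point_info() can be checked in
isolation with the sketch below. The bit positions follow the register
definitions in the patch (ratio in bits 9:2, reference select in bit 10,
gear type in bit 16); the EX_* macro names and the dg1_dclk() helper are
made up for this example, and the register values are passed in as plain
integers rather than read from hardware.

```c
#include <stdint.h>

/* Local stand-ins for the patch's masks; names are hypothetical. */
#define EX_QCLK_RATIO_SHIFT 2
#define EX_QCLK_RATIO_MASK  (0xFFu << EX_QCLK_RATIO_SHIFT) /* bits 9:2 */
#define EX_QCLK_REFERENCE   (1u << 10)
#define EX_GEAR_TYPE        (1u << 16)

/*
 * Returns dclk in units of 16.666 MHz, following the patch:
 * ratio * 6 (100 MHz reference) or ratio * 8 (133 MHz reference),
 * doubled when the gear-type bit is set.
 */
static uint32_t dg1_dclk(uint32_t sa_perf_status, uint32_t mc_bios_data)
{
	uint32_t ratio = (sa_perf_status & EX_QCLK_RATIO_MASK) >>
			 EX_QCLK_RATIO_SHIFT;
	uint32_t reference = (sa_perf_status & EX_QCLK_REFERENCE) ? 6 : 8;
	uint32_t dclk = ratio * reference;

	if (mc_bios_data & EX_GEAR_TYPE)
		dclk *= 2;

	return dclk;
}
```

A return value of 0 corresponds to the case the patch rejects with -EINVAL.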

[Intel-gfx] PR for new GuC v62.0.3 and HuC v7.9.3 binaries

2021-07-08 Thread John . C . Harrison
The following changes since commit d79c26779d459063b8052b7fe0a48bce4e08d0d9:

  amdgpu: update vcn firmware for green sardine for 21.20 (2021-06-29 07:26:03 
-0400)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-firmware guc_62.0_huc_7.9

for you to fetch changes up to f4d897acd200190350a5f2148316c51c6c57bc9b:

  firmware/i915/guc: Add HuC v7.9.3 for TGL & DG1 (2021-06-29 14:20:03 -0700)


John Harrison (3):
  firmware/i915/guc: Add GuC v62.0.0 for all platforms
  firmware/i915/guc: Add GuC v62.0.3 for ADL-P
  firmware/i915/guc: Add HuC v7.9.3 for TGL & DG1

 WHENCE   |  38 +-
 i915/adlp_guc_62.0.3.bin | Bin 0 -> 336704 bytes
 i915/bxt_guc_62.0.0.bin  | Bin 0 -> 199616 bytes
 i915/cml_guc_62.0.0.bin  | Bin 0 -> 200448 bytes
 i915/dg1_guc_62.0.0.bin  | Bin 0 -> 315648 bytes
 i915/dg1_huc_7.9.3.bin   | Bin 0 -> 589888 bytes
 i915/ehl_guc_62.0.0.bin  | Bin 0 -> 327488 bytes
 i915/glk_guc_62.0.0.bin  | Bin 0 -> 20 bytes
 i915/icl_guc_62.0.0.bin  | Bin 0 -> 327488 bytes
 i915/kbl_guc_62.0.0.bin  | Bin 0 -> 200448 bytes
 i915/skl_guc_62.0.0.bin  | Bin 0 -> 199552 bytes
 i915/tgl_guc_62.0.0.bin  | Bin 0 -> 326016 bytes
 i915/tgl_huc_7.9.3.bin   | Bin 0 -> 589888 bytes
 13 files changed, 37 insertions(+), 1 deletion(-)
 create mode 100644 i915/adlp_guc_62.0.3.bin
 create mode 100644 i915/bxt_guc_62.0.0.bin
 create mode 100644 i915/cml_guc_62.0.0.bin
 create mode 100644 i915/dg1_guc_62.0.0.bin
 create mode 100644 i915/dg1_huc_7.9.3.bin
 create mode 100644 i915/ehl_guc_62.0.0.bin
 create mode 100644 i915/glk_guc_62.0.0.bin
 create mode 100644 i915/icl_guc_62.0.0.bin
 create mode 100644 i915/kbl_guc_62.0.0.bin
 create mode 100644 i915/skl_guc_62.0.0.bin
 create mode 100644 i915/tgl_guc_62.0.0.bin
 create mode 100644 i915/tgl_huc_7.9.3.bin


[Intel-gfx] [PATCH v3 20/20] dma-resv: Give the docs a do-over

2021-07-08 Thread Daniel Vetter
Specifically document the new/clarified rules around how the shared
fences do not have any ordering requirements against the exclusive
fence.

Also document everything a bit better: given how central struct
dma_resv is to dynamic buffer management, the docs have been very
inadequate.

- Lots more links to other pieces of the puzzle. Unfortunately
  ttm_buffer_object has no docs, so no links :-(

- Explain/complain a bit about dma_resv_locking_ctx(). I still don't
  like that one, but fixing the ttm call chains is going to be
  horrible. Plus we want to plug in real slowpath locking when we do
  that anyway.

- Main part of the patch is some actual docs for struct dma_resv.

Overall I think we still have a lot of bad naming in this area (e.g.
dma_resv.fence is singular, but contains the multiple shared fences),
but I think that's more indicative of how the semantics and rules are
just not great.

Another thing that's really awkward is how chaining exclusive fences
right now means direct dma_resv.exclusive_fence pointer access with an
rcu_assign_pointer. Not so great either.

v2:
- Fix a pile of typos (Matt, Jason)
- Hammer it in that breaking the rules leads to use-after-free issues
  around dma-buf sharing (Christian)

Reviewed-by: Christian König 
Cc: Jason Ekstrand 
Cc: Matthew Auld 
Reviewed-by: Matthew Auld 
Signed-off-by: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/dma-buf/dma-resv.c |  24 ++---
 include/linux/dma-buf.h|   7 +++
 include/linux/dma-resv.h   | 104 +++--
 3 files changed, 124 insertions(+), 11 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index f26c71747d43..a3acb6479ddf 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -48,6 +48,8 @@
  * write operations) or N shared fences (read operations).  The RCU
  * mechanism is used to protect read access to fences from locked
  * write-side updates.
+ *
+ * See struct dma_resv for more details.
  */
 
 DEFINE_WD_CLASS(reservation_ww_class);
@@ -137,7 +139,11 @@ EXPORT_SYMBOL(dma_resv_fini);
  * @num_fences: number of fences we want to add
  *
  * Should be called before dma_resv_add_shared_fence().  Must
- * be called with obj->lock held.
+ * be called with @obj locked through dma_resv_lock().
+ *
+ * Note that the preallocated slots need to be re-reserved if @obj is unlocked
+ * at any time before calling dma_resv_add_shared_fence(). This is validated
+ * when CONFIG_DEBUG_MUTEXES is enabled.
  *
  * RETURNS
  * Zero for success, or -errno
@@ -234,8 +240,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
  * @obj: the reservation object
  * @fence: the shared fence to add
  *
- * Add a fence to a shared slot, obj->lock must be held, and
+ * Add a fence to a shared slot, @obj must be locked with dma_resv_lock(), and
  * dma_resv_reserve_shared() has been called.
+ *
+ * See also &dma_resv.fence for a discussion of the semantics.
  */
 void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
 {
@@ -278,9 +286,11 @@ EXPORT_SYMBOL(dma_resv_add_shared_fence);
 /**
  * dma_resv_add_excl_fence - Add an exclusive fence.
  * @obj: the reservation object
- * @fence: the shared fence to add
+ * @fence: the exclusive fence to add
  *
- * Add a fence to the exclusive slot.  The obj->lock must be held.
+ * Add a fence to the exclusive slot. @obj must be locked with dma_resv_lock().
+ * Note that this function replaces all fences attached to @obj, see also
+ * &dma_resv.fence_excl for a discussion of the semantics.
  */
 void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence)
 {
@@ -609,9 +619,11 @@ static inline int dma_resv_test_signaled_single(struct 
dma_fence *passed_fence)
  * fence
  *
  * Callers are not required to hold specific locks, but maybe hold
- * dma_resv_lock() already
+ * dma_resv_lock() already.
+ *
  * RETURNS
- * true if all fences signaled, else false
+ *
+ * True if all fences signaled, else false.
  */
 bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 {
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 2b814fde0d11..8cc0c55877a6 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -420,6 +420,13 @@ struct dma_buf {
 * - Dynamic importers should set fences for any access that they can't
 *   disable immediately from their &dma_buf_attach_ops.move_notify
 *   callback.
+*
+* IMPORTANT:
+*
+* All drivers must obey the struct dma_resv rules, specifically the
+* rules for updating fences, see &dma_resv.fence_excl and
+* &dma_resv.fence. If these dependency rules are broken access tracking
+* can be lost resulting in use after free issues.
 */
struct dma_resv *resv;
 
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index e1ca2080a1ff..9100dd3dc21f 100644
--- 

[Intel-gfx] [PATCH v3 19/20] drm/i915: Don't break exclusive fence ordering

2021-07-08 Thread Daniel Vetter
There's only one exclusive slot, and we must not break the ordering.
Adding a new exclusive fence drops all previous fences from the
dma_resv. To avoid violating the signalling order we err on the side of
over-synchronizing by waiting for the existing fences, even if
userspace asked us to ignore them.

A better fix would be to use a dma_fence_chain or _array like e.g.
amdgpu now uses, but it probably makes sense to lift this into
dma-resv.c code as a proper concept, so that drivers don't have to
hack up their own solution each on their own. Hence go with the simple
fix for now.

Another option is the fence import ioctl from Jason:

https://lore.kernel.org/dri-devel/20210610210925.642582-7-ja...@jlekstrand.net/

v2: Improve commit message per Lucas' suggestion.

Cc: Lucas Stach 
Signed-off-by: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: "Thomas Hellström" 
Cc: Jason Ekstrand 
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 47e07179347a..9d717c8842e2 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1775,6 +1775,7 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
struct i915_vma *vma = ev->vma;
unsigned int flags = ev->flags;
struct drm_i915_gem_object *obj = vma->obj;
+   bool async, write;
 
assert_vma_held(vma);
 
@@ -1806,7 +1807,10 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
flags &= ~EXEC_OBJECT_ASYNC;
}
 
-   if (err == 0 && !(flags & EXEC_OBJECT_ASYNC)) {
+   async = flags & EXEC_OBJECT_ASYNC;
+   write = flags & EXEC_OBJECT_WRITE;
+
+   if (err == 0 && (!async || write)) {
err = i915_request_await_object
(eb->request, obj, flags & EXEC_OBJECT_WRITE);
}
-- 
2.32.0
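
The new condition in eb_move_to_gpu() boils down to a small truth table,
sketched here as a standalone predicate. The EX_OBJ_* flag values and the
function name are placeholders for illustration and are not the uAPI
definitions of EXEC_OBJECT_WRITE and EXEC_OBJECT_ASYNC.

```c
#include <stdbool.h>

/* Hypothetical flag values; not the real EXEC_OBJECT_* bits. */
#define EX_OBJ_WRITE (1u << 0)
#define EX_OBJ_ASYNC (1u << 1)

/*
 * Returns true when the submission has to wait for the fences already
 * attached to the buffer. A writer installs a new exclusive fence, which
 * drops the existing fences, so it must wait even if userspace asked for
 * async behaviour.
 */
static bool must_await_existing_fences(unsigned int flags)
{
	bool async = flags & EX_OBJ_ASYNC;
	bool write = flags & EX_OBJ_WRITE;

	return !async || write;
}
```

Only the async reader case skips the wait; async writers are
over-synchronized on purpose, as the commit message explains.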



[Intel-gfx] [PATCH v3 18/20] drm/i915: delete exclude argument from i915_sw_fence_await_reservation

2021-07-08 Thread Daniel Vetter
No longer used, the last user disappeared with

commit d07f0e59b2c762584478920cd2d11fba2980a94a
Author: Chris Wilson 
Date:   Fri Oct 28 13:58:44 2016 +0100

drm/i915: Move GEM activity tracking into a common struct reservation_object

Signed-off-by: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: "Thomas Hellström" 
Cc: Jason Ekstrand 
---
 drivers/gpu/drm/i915/display/intel_display.c   | 4 ++--
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c| 2 +-
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 +-
 drivers/gpu/drm/i915/i915_sw_fence.c   | 6 +-
 drivers/gpu/drm/i915/i915_sw_fence.h   | 1 -
 5 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index 98e0f4ed7e4a..678c7839034e 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -9,7 +9,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
 */
if (intel_crtc_needs_modeset(crtc_state)) {
ret = 
i915_sw_fence_await_reservation(&state->commit_ready,
- 
old_obj->base.resv, NULL,
+ 
old_obj->base.resv,
  false, 0,
  GFP_KERNEL);
if (ret < 0)
@@ -11153,7 +11153,7 @@ intel_prepare_plane_fb(struct drm_plane *_plane,
struct dma_fence *fence;
 
ret = i915_sw_fence_await_reservation(&state->commit_ready,
- obj->base.resv, NULL,
+ obj->base.resv,
  false,
  
i915_fence_timeout(dev_priv),
  GFP_KERNEL);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c 
b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
index daf9284ef1f5..93439d2c7a58 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_clflush.c
@@ -106,7 +106,7 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object 
*obj,
clflush = clflush_work_create(obj);
if (clflush) {
i915_sw_fence_await_reservation(&clflush->base.chain,
-   obj->base.resv, NULL, true,
+   obj->base.resv, true,

i915_fence_timeout(to_i915(obj->base.dev)),
I915_FENCE_GFP);
dma_resv_add_excl_fence(obj->base.resv, &clflush->base.dma);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 68ce366f46cf..47e07179347a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2095,7 +2095,7 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb,
 
/* Wait for all writes (and relocs) into the batch to complete */
err = i915_sw_fence_await_reservation(&pw->base.chain,
- pw->batch->resv, NULL, false,
+ pw->batch->resv, false,
  0, I915_FENCE_GFP);
if (err < 0)
goto err_commit;
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c 
b/drivers/gpu/drm/i915/i915_sw_fence.c
index c589a681da77..91711a46b1c7 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -567,7 +567,6 @@ int __i915_sw_fence_await_dma_fence(struct i915_sw_fence 
*fence,
 
 int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
struct dma_resv *resv,
-   const struct dma_fence_ops *exclude,
bool write,
unsigned long timeout,
gfp_t gfp)
@@ -587,9 +586,6 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence 
*fence,
return ret;
 
for (i = 0; i < count; i++) {
-   if (shared[i]->ops == exclude)
-   continue;
-
pending = i915_sw_fence_await_dma_fence(fence,
shared[i],
timeout,
@@ -609,7 +605,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence 
*fence,
excl = dma_resv_get_excl_unlocked(resv);
}
 
-   if (ret >= 0 && excl && excl->ops != exclude) {
+   if (ret >= 0 

[Intel-gfx] [PATCH v3 17/20] drm/etnaviv: Don't break exclusive fence ordering

2021-07-08 Thread Daniel Vetter
There's only one exclusive slot, and we must not break the ordering.
Adding a new exclusive fence drops all previous fences from the
dma_resv. To avoid violating the signalling order we err on the side of
over-synchronizing by waiting for the existing fences, even if
userspace asked us to ignore them.

A better fix would be to use a dma_fence_chain or _array like e.g.
amdgpu now uses, but it probably makes sense to lift this into
dma-resv.c code as a proper concept, so that drivers don't have to
hack up their own solution each on their own. Hence go with the simple
fix for now.

Another option is the fence import ioctl from Jason:

https://lore.kernel.org/dri-devel/20210610210925.642582-7-ja...@jlekstrand.net/

v2: Improve commit message per Lucas' suggestion.

Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: etna...@lists.freedesktop.org
---
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 5b97ce1299ad..07454db4b150 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -178,18 +178,20 @@ static int submit_fence_sync(struct etnaviv_gem_submit 
*submit)
for (i = 0; i < submit->nr_bos; i++) {
struct etnaviv_gem_submit_bo *bo = &submit->bos[i];
struct dma_resv *robj = bo->obj->base.resv;
+   bool write = bo->flags & ETNA_SUBMIT_BO_WRITE;
 
-   if (!(bo->flags & ETNA_SUBMIT_BO_WRITE)) {
+   if (!(write)) {
ret = dma_resv_reserve_shared(robj, 1);
if (ret)
return ret;
}
 
-   if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
+   /* exclusive fences must be ordered */
+   if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT && !write)
continue;
 
ret = drm_sched_job_await_implicit(&submit->sched_job, 
&bo->obj->base,
-  bo->flags & 
ETNA_SUBMIT_BO_WRITE);
+  write);
if (ret)
return ret;
}
-- 
2.32.0



[Intel-gfx] [PATCH v3 16/20] drm/msm: always wait for the exclusive fence

2021-07-08 Thread Daniel Vetter
From: Christian König 

Drivers also need to sync to the exclusive fence when
a shared one is present.

Signed-off-by: Christian König 
[danvet: Not that hard to compile-test on arm ...]
Signed-off-by: Daniel Vetter 
Cc: Rob Clark 
Cc: Sean Paul 
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
---
 drivers/gpu/drm/msm/msm_gem.c | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index 141178754231..d9c4f1deeafb 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -812,17 +812,15 @@ int msm_gem_sync_object(struct drm_gem_object *obj,
struct dma_fence *fence;
int i, ret;
 
-   fobj = dma_resv_shared_list(obj->resv);
-   if (!fobj || (fobj->shared_count == 0)) {
-   fence = dma_resv_excl_fence(obj->resv);
-   /* don't need to wait on our own fences, since ring is fifo */
-   if (fence && (fence->context != fctx->context)) {
-   ret = dma_fence_wait(fence, true);
-   if (ret)
-   return ret;
-   }
+   fence = dma_resv_excl_fence(obj->resv);
+   /* don't need to wait on our own fences, since ring is fifo */
+   if (fence && (fence->context != fctx->context)) {
+   ret = dma_fence_wait(fence, true);
+   if (ret)
+   return ret;
}
 
+   fobj = dma_resv_shared_list(obj->resv);
if (!exclusive || !fobj)
return 0;
 
-- 
2.32.0
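
To make the behavioural change easier to see, here is a simplified model of
the dependency selection after this patch. It only counts the fences a
submission would wait on; struct fence, struct resv and count_waits() are
invented for the example and do not correspond to the dma_resv API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fence: only the owning context matters here. */
struct fence {
	unsigned int context;
};

/* Hypothetical reservation object: one writer slot, N reader slots. */
struct resv {
	const struct fence *excl;
	const struct fence *const *shared;
	size_t shared_count;
};

/*
 * Counts the fences that a submission from own_context would wait on.
 * Fences from the same context are skipped because the ring is FIFO.
 */
static size_t count_waits(const struct resv *r, unsigned int own_context,
			  bool exclusive)
{
	size_t n = 0;
	size_t i;

	/* The exclusive fence is a dependency whether or not readers exist. */
	if (r->excl && r->excl->context != own_context)
		n++;

	/* Readers depend only on the writer; writers also wait for readers. */
	if (!exclusive)
		return n;

	for (i = 0; i < r->shared_count; i++)
		if (r->shared[i]->context != own_context)
			n++;

	return n;
}
```

Before the patch, the exclusive fence was only considered when the shared
list was empty; in this model it is always checked first.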



[Intel-gfx] [PATCH v3 15/20] drm/msm: Don't break exclusive fence ordering

2021-07-08 Thread Daniel Vetter
There's only one exclusive slot, and we must not break the ordering.

Adding a new exclusive fence drops all previous fences from the
dma_resv. To avoid violating the signalling order we err on the side of
over-synchronizing by waiting for the existing fences, even if
userspace asked us to ignore them.

A better fix would be to use a dma_fence_chain or _array like e.g.
amdgpu now uses, but
- msm has a synchronous dma_fence_wait for anything from another
  context, so doesn't seem to care much,
- and it probably makes sense to lift this into dma-resv.c code as a
  proper concept, so that drivers don't have to hack up their own
  solution each on their own.

v2: Improve commit message per Lucas' suggestion.

Cc: Lucas Stach 
Signed-off-by: Daniel Vetter 
Cc: Rob Clark 
Cc: Sean Paul 
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
---
 drivers/gpu/drm/msm/msm_gem_submit.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index b71da71a3dd8..edd0051d849f 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -306,7 +306,8 @@ static int submit_fence_sync(struct msm_gem_submit *submit, 
bool no_implicit)
return ret;
}
 
-   if (no_implicit)
+   /* exclusive fences must be ordered */
+   if (no_implicit && !write)
continue;
 
ret = msm_gem_sync_object(_obj->base, submit->ring->fctx,
-- 
2.32.0


[Intel-gfx] [PATCH v3 14/20] drm/sched: Check locking in drm_sched_job_await_implicit

2021-07-08 Thread Daniel Vetter
You really need to hold the reservation here or all kinds of funny
things can happen between grabbing the dependencies and inserting the
new fences.

Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Daniel Vetter 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Cc: Alex Deucher 
Cc: Jack Zhang 
---
 drivers/gpu/drm/scheduler/sched_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index db326a1ebf3c..67eca88e070e 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -708,6 +708,8 @@ int drm_sched_job_await_implicit(struct drm_sched_job *job,
struct dma_fence **fences;
unsigned int i, fence_count;
 
+   dma_resv_assert_held(obj->resv);
+
if (!write) {
struct dma_fence *fence = dma_resv_get_excl_unlocked(obj->resv);
 
-- 
2.32.0


[Intel-gfx] [PATCH v3 13/20] drm/sched: Don't store self-dependencies

2021-07-08 Thread Daniel Vetter
This is essentially part of drm_sched_dependency_optimized(), which
only amdgpu seems to make use of. Use it a bit more.

This would mean that as-is amdgpu can't use the dependency helpers, at
least not with the current approach amdgpu has for deciding whether a
vm_flush is needed. Since amdgpu also has very special rules around
implicit fencing it can't use those helpers either, and adding a
drm_sched_job_await_fence_always or similar for amdgpu wouldn't be too
onerous. That way the special case handling for amdgpu sticks out even
more and we have higher chances that reviewers that go across all
drivers won't miss it.
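
To make the idea concrete, here is a minimal user-space sketch, assuming simplified stand-ins for the kernel structures. All model_* names, MAX_DEPS and the plain seqno comparison are hypothetical simplifications; the two-context check mirrors the condition in the diff below, and the per-context deduplication mirrors the comment that follows it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct dma_fence and struct drm_sched_entity. */
struct model_fence {
	uint64_t context;	/* timeline the fence belongs to */
	uint64_t seqno;		/* position on that timeline */
};

struct model_entity {
	uint64_t fence_context;	/* the diff checks this value and this value + 1 */
};

/* True when the fence was produced by the entity itself. */
static bool is_self_dependency(const struct model_entity *entity,
			       const struct model_fence *fence)
{
	return fence->context == entity->fence_context ||
	       fence->context == entity->fence_context + 1;
}

#define MAX_DEPS 8

struct model_job {
	const struct model_entity *entity;
	struct model_fence deps[MAX_DEPS];
	size_t num_deps;
};

/*
 * Returns 0 on success, -1 when the dependency table is full.
 * Self-dependencies are dropped. A fence from a context that is already
 * tracked replaces the stored one only if its seqno is larger.
 */
static int model_job_await_fence(struct model_job *job, struct model_fence fence)
{
	size_t i;

	if (is_self_dependency(job->entity, &fence))
		return 0;

	for (i = 0; i < job->num_deps; i++) {
		if (job->deps[i].context != fence.context)
			continue;
		if (fence.seqno > job->deps[i].seqno)
			job->deps[i] = fence;
		return 0;
	}

	if (job->num_deps == MAX_DEPS)
		return -1;

	job->deps[job->num_deps++] = fence;
	return 0;
}
```

The sketch ignores reference counting and seqno wraparound, both of which the real helpers have to handle.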

Reviewed-by: Lucas Stach 
Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Daniel Vetter 
Cc: Luben Tuikov 
Cc: Andrey Grodzovsky 
Cc: Alex Deucher 
Cc: Jack Zhang 
---
 drivers/gpu/drm/scheduler/sched_main.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index ad62f1d2991c..db326a1ebf3c 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -654,6 +654,13 @@ int drm_sched_job_await_fence(struct drm_sched_job *job,
if (!fence)
return 0;
 
+   /* if it's a fence from us it's guaranteed to be earlier */
+   if (fence->context == job->entity->fence_context ||
+   fence->context == job->entity->fence_context + 1) {
+   dma_fence_put(fence);
+   return 0;
+   }
+
/* Deduplicate if we already depend on a fence from the same context.
 * This lets the size of the array of deps scale with the number of
 * engines involved, rather than the number of BOs.
-- 
2.32.0


[Intel-gfx] [PATCH v3 12/20] drm/gem: Delete gem array fencing helpers

2021-07-08 Thread Daniel Vetter
Integrated into the scheduler now and all users converted over.

Signed-off-by: Daniel Vetter 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/drm_gem.c | 96 ---
 include/drm/drm_gem.h |  5 --
 2 files changed, 101 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 68deb1de8235..24d49a2636e0 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1294,99 +1294,3 @@ drm_gem_unlock_reservations(struct drm_gem_object 
**objs, int count,
ww_acquire_fini(acquire_ctx);
 }
 EXPORT_SYMBOL(drm_gem_unlock_reservations);
-
-/**
- * drm_gem_fence_array_add - Adds the fence to an array of fences to be
- * waited on, deduplicating fences from the same context.
- *
- * @fence_array: array of dma_fence * for the job to block on.
- * @fence: the dma_fence to add to the list of dependencies.
- *
- * This functions consumes the reference for @fence both on success and error
- * cases.
- *
- * Returns:
- * 0 on success, or an error on failing to expand the array.
- */
-int drm_gem_fence_array_add(struct xarray *fence_array,
-   struct dma_fence *fence)
-{
-   struct dma_fence *entry;
-   unsigned long index;
-   u32 id = 0;
-   int ret;
-
-   if (!fence)
-   return 0;
-
-   /* Deduplicate if we already depend on a fence from the same context.
-* This lets the size of the array of deps scale with the number of
-* engines involved, rather than the number of BOs.
-*/
-   xa_for_each(fence_array, index, entry) {
-   if (entry->context != fence->context)
-   continue;
-
-   if (dma_fence_is_later(fence, entry)) {
-   dma_fence_put(entry);
-   xa_store(fence_array, index, fence, GFP_KERNEL);
-   } else {
-   dma_fence_put(fence);
-   }
-   return 0;
-   }
-
-   ret = xa_alloc(fence_array, , fence, xa_limit_32b, GFP_KERNEL);
-   if (ret != 0)
-   dma_fence_put(fence);
-
-   return ret;
-}
-EXPORT_SYMBOL(drm_gem_fence_array_add);
-
-/**
- * drm_gem_fence_array_add_implicit - Adds the implicit dependencies tracked
- * in the GEM object's reservation object to an array of dma_fences for use in
- * scheduling a rendering job.
- *
- * This should be called after drm_gem_lock_reservations() on your array of
- * GEM objects used in the job but before updating the reservations with your
- * own fences.
- *
- * @fence_array: array of dma_fence * for the job to block on.
- * @obj: the gem object to add new dependencies from.
- * @write: whether the job might write the object (so we need to depend on
- * shared fences in the reservation object).
- */
-int drm_gem_fence_array_add_implicit(struct xarray *fence_array,
-struct drm_gem_object *obj,
-bool write)
-{
-   int ret;
-   struct dma_fence **fences;
-   unsigned int i, fence_count;
-
-   if (!write) {
-   struct dma_fence *fence =
-   dma_resv_get_excl_unlocked(obj->resv);
-
-   return drm_gem_fence_array_add(fence_array, fence);
-   }
-
-   ret = dma_resv_get_fences(obj->resv, NULL,
-   _count, );
-   if (ret || !fence_count)
-   return ret;
-
-   for (i = 0; i < fence_count; i++) {
-   ret = drm_gem_fence_array_add(fence_array, fences[i]);
-   if (ret)
-   break;
-   }
-
-   for (; i < fence_count; i++)
-   dma_fence_put(fences[i]);
-   kfree(fences);
-   return ret;
-}
-EXPORT_SYMBOL(drm_gem_fence_array_add_implicit);
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 240049566592..6d5e33b89074 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -409,11 +409,6 @@ int drm_gem_lock_reservations(struct drm_gem_object 
**objs, int count,
  struct ww_acquire_ctx *acquire_ctx);
 void drm_gem_unlock_reservations(struct drm_gem_object **objs, int count,
 struct ww_acquire_ctx *acquire_ctx);
-int drm_gem_fence_array_add(struct xarray *fence_array,
-   struct dma_fence *fence);
-int drm_gem_fence_array_add_implicit(struct xarray *fence_array,
-struct drm_gem_object *obj,
-bool write);
 int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev,
u32 handle, u64 *offset);
 
-- 
2.32.0


[Intel-gfx] [PATCH v3 11/20] drm/etnaviv: Use scheduler dependency handling

2021-07-08 Thread Daniel Vetter
We need to pull the drm_sched_job_init much earlier, but that's very
minor surgery.

v2: Actually fix up cleanup paths by calling drm_sched_job_init, which
I wanted to do in the previous round (and did, for all other drivers).
Spotted by Lucas.

Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: etna...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/etnaviv/etnaviv_gem.h|  5 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 58 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  | 63 +---
 drivers/gpu/drm/etnaviv/etnaviv_sched.h  |  3 +-
 4 files changed, 35 insertions(+), 94 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.h 
b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
index 98e60df882b6..63688e6e4580 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
@@ -80,9 +80,6 @@ struct etnaviv_gem_submit_bo {
u64 va;
struct etnaviv_gem_object *obj;
struct etnaviv_vram_mapping *mapping;
-   struct dma_fence *excl;
-   unsigned int nr_shared;
-   struct dma_fence **shared;
 };
 
 /* Created per submit-ioctl, to track bo's and cmdstream bufs, etc,
@@ -95,7 +92,7 @@ struct etnaviv_gem_submit {
struct etnaviv_file_private *ctx;
struct etnaviv_gpu *gpu;
struct etnaviv_iommu_context *mmu_context, *prev_mmu_context;
-   struct dma_fence *out_fence, *in_fence;
+   struct dma_fence *out_fence;
int out_fence_id;
struct list_head node; /* GPU active submit list */
struct etnaviv_cmdbuf cmdbuf;
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 4dd7d9d541c0..5b97ce1299ad 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -188,16 +188,10 @@ static int submit_fence_sync(struct etnaviv_gem_submit 
*submit)
if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
continue;
 
-   if (bo->flags & ETNA_SUBMIT_BO_WRITE) {
-   ret = dma_resv_get_fences(robj, >excl,
- >nr_shared,
- >shared);
-   if (ret)
-   return ret;
-   } else {
-   bo->excl = dma_resv_get_excl_unlocked(robj);
-   }
-
+   ret = drm_sched_job_await_implicit(>sched_job, 
>obj->base,
+  bo->flags & 
ETNA_SUBMIT_BO_WRITE);
+   if (ret)
+   return ret;
}
 
return ret;
@@ -403,8 +397,6 @@ static void submit_cleanup(struct kref *kref)
 
wake_up_all(>gpu->fence_event);
 
-   if (submit->in_fence)
-   dma_fence_put(submit->in_fence);
if (submit->out_fence) {
/* first remove from IDR, so fence can not be found anymore */
mutex_lock(>gpu->fence_lock);
@@ -529,7 +521,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void 
*data,
ret = etnaviv_cmdbuf_init(priv->cmdbuf_suballoc, >cmdbuf,
  ALIGN(args->stream_size, 8) + 8);
if (ret)
-   goto err_submit_objects;
+   goto err_submit_put;
 
submit->ctx = file->driver_priv;
etnaviv_iommu_context_get(submit->ctx->mmu);
@@ -537,51 +529,61 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void 
*data,
submit->exec_state = args->exec_state;
submit->flags = args->flags;
 
+   ret = drm_sched_job_init(>sched_job,
+>sched_entity[args->pipe],
+submit->ctx);
+   if (ret)
+   goto err_submit_put;
+
ret = submit_lookup_objects(submit, file, bos, args->nr_bos);
if (ret)
-   goto err_submit_objects;
+   goto err_submit_job;
 
if ((priv->mmu_global->version != ETNAVIV_IOMMU_V2) &&
!etnaviv_cmd_validate_one(gpu, stream, args->stream_size / 4,
  relocs, args->nr_relocs)) {
ret = -EINVAL;
-   goto err_submit_objects;
+   goto err_submit_job;
}
 
if (args->flags & ETNA_SUBMIT_FENCE_FD_IN) {
-   submit->in_fence = sync_file_get_fence(args->fence_fd);
-   if (!submit->in_fence) {
+   struct dma_fence *in_fence = 
sync_file_get_fence(args->fence_fd);
+   if (!in_fence) {
ret = -EINVAL;
-   goto err_submit_objects;
+   goto err_submit_job;
}
+
+   ret = drm_sched_job_await_fence(>sched_job, in_fence);
+ 

[Intel-gfx] [PATCH v3 09/20] drm/v3d: Move drm_sched_job_init to v3d_job_init

2021-07-08 Thread Daniel Vetter
Prep work for using the scheduler dependency handling. We need to call
drm_sched_job_init earlier so we can use the new drm_sched_job_await*
functions for dependency handling here.
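
The ordering this relies on can be pictured with a small user-space state machine. This is only a sketch under stated assumptions: the lifecycle_* names are hypothetical, and the rule that dependencies are added and error-path cleanup happens between init and arm is a simplification of how the patches in this series use the helpers.

```c
/* Hypothetical model of the job lifecycle: init, add deps, arm, push. */
enum lifecycle_state {
	LIFECYCLE_NEW,
	LIFECYCLE_INITIALIZED,
	LIFECYCLE_ARMED,
	LIFECYCLE_PUSHED,
	LIFECYCLE_CLEANED,
};

struct lifecycle_job {
	enum lifecycle_state state;
	int num_deps;
};

/* Models drm_sched_job_init(): must come first. */
static int lifecycle_init(struct lifecycle_job *job)
{
	if (job->state != LIFECYCLE_NEW)
		return -1;
	job->state = LIFECYCLE_INITIALIZED;
	job->num_deps = 0;
	return 0;
}

/* Models a drm_sched_job_await* call: only valid on an initialized job. */
static int lifecycle_await(struct lifecycle_job *job)
{
	if (job->state != LIFECYCLE_INITIALIZED)
		return -1;
	job->num_deps++;
	return 0;
}

/* Models drm_sched_job_arm(): the point of no return in this model. */
static int lifecycle_arm(struct lifecycle_job *job)
{
	if (job->state != LIFECYCLE_INITIALIZED)
		return -1;
	job->state = LIFECYCLE_ARMED;
	return 0;
}

/* Models drm_sched_entity_push_job(): only valid after arming. */
static int lifecycle_push(struct lifecycle_job *job)
{
	if (job->state != LIFECYCLE_ARMED)
		return -1;
	job->state = LIFECYCLE_PUSHED;
	return 0;
}

/* Models the error path for a job that is still under construction. */
static int lifecycle_cleanup(struct lifecycle_job *job)
{
	if (job->state != LIFECYCLE_INITIALIZED)
		return -1;
	job->state = LIFECYCLE_CLEANED;
	return 0;
}
```

Calling init early is what lets the dependency step run before arm, and it is also why the failure paths now need an explicit cleanup step.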

v2: Slightly better commit message and rebase to include the
drm_sched_job_arm() call (Emma).

v3: Cleanup jobs under construction correctly (Emma)

Cc: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: Emma Anholt 
---
 drivers/gpu/drm/v3d/v3d_drv.h   |  1 +
 drivers/gpu/drm/v3d/v3d_gem.c   | 88 ++---
 drivers/gpu/drm/v3d/v3d_sched.c | 15 +++---
 3 files changed, 44 insertions(+), 60 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 8a390738d65b..1d870261eaac 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -332,6 +332,7 @@ int v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 struct drm_file *file_priv);
 int v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
  struct drm_file *file_priv);
+void v3d_job_cleanup(struct v3d_job *job);
 void v3d_job_put(struct v3d_job *job);
 void v3d_reset(struct v3d_dev *v3d);
 void v3d_invalidate_caches(struct v3d_dev *v3d);
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 69ac20e11b09..5eccd3658938 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -392,6 +392,12 @@ v3d_render_job_free(struct kref *ref)
v3d_job_free(ref);
 }
 
+void v3d_job_cleanup(struct v3d_job *job)
+{
+   drm_sched_job_cleanup(>base);
+   v3d_job_put(job);
+}
+
 void v3d_job_put(struct v3d_job *job)
 {
kref_put(>refcount, job->free);
@@ -433,9 +439,10 @@ v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
 static int
 v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 struct v3d_job *job, void (*free)(struct kref *ref),
-u32 in_sync)
+u32 in_sync, enum v3d_queue queue)
 {
struct dma_fence *in_fence = NULL;
+   struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
int ret;
 
job->v3d = v3d;
@@ -446,35 +453,33 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file 
*file_priv,
return ret;
 
xa_init_flags(>deps, XA_FLAGS_ALLOC);
+   ret = drm_sched_job_init(>base, _priv->sched_entity[queue],
+v3d_priv);
+   if (ret)
+   goto fail;
 
ret = drm_syncobj_find_fence(file_priv, in_sync, 0, 0, _fence);
if (ret == -EINVAL)
-   goto fail;
+   goto fail_job;
 
ret = drm_gem_fence_array_add(>deps, in_fence);
if (ret)
-   goto fail;
+   goto fail_job;
 
kref_init(>refcount);
 
return 0;
+fail_job:
+   drm_sched_job_cleanup(>base);
 fail:
xa_destroy(>deps);
pm_runtime_put_autosuspend(v3d->drm.dev);
return ret;
 }
 
-static int
-v3d_push_job(struct v3d_file_priv *v3d_priv,
-struct v3d_job *job, enum v3d_queue queue)
+static void
+v3d_push_job(struct v3d_job *job)
 {
-   int ret;
-
-   ret = drm_sched_job_init(>base, _priv->sched_entity[queue],
-v3d_priv);
-   if (ret)
-   return ret;
-
drm_sched_job_arm(>base);
 
job->done_fence = dma_fence_get(>base.s_fence->finished);
@@ -483,8 +488,6 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
kref_get(>refcount);
 
drm_sched_entity_push_job(>base);
-
-   return 0;
 }
 
 static void
@@ -530,7 +533,6 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
struct drm_file *file_priv)
 {
struct v3d_dev *v3d = to_v3d_dev(dev);
-   struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
struct drm_v3d_submit_cl *args = data;
struct v3d_bin_job *bin = NULL;
struct v3d_render_job *render;
@@ -556,7 +558,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
INIT_LIST_HEAD(>unref_list);
 
ret = v3d_job_init(v3d, file_priv, >base,
-  v3d_render_job_free, args->in_sync_rcl);
+  v3d_render_job_free, args->in_sync_rcl, V3D_RENDER);
if (ret) {
kfree(render);
return ret;
@@ -570,7 +572,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
}
 
ret = v3d_job_init(v3d, file_priv, >base,
-  v3d_job_free, args->in_sync_bcl);
+  v3d_job_free, args->in_sync_bcl, V3D_BIN);
if (ret) {
v3d_job_put(>base);
kfree(bin);
@@ -592,7 +594,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
goto fail;
}
 
-   ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0);
+   ret = v3d_job_init(v3d, file_priv, 

[Intel-gfx] [PATCH v3 08/20] drm/lima: use scheduler dependency tracking

2021-07-08 Thread Daniel Vetter
Nothing special going on here.

Aside: reviewing the code, it seems like drm_sched_job_arm() should be
moved into lima_sched_context_queue_task and put under some mutex
together with drm_sched_push_job(). See the kerneldoc for
drm_sched_push_job().

Signed-off-by: Daniel Vetter 
Cc: Qiang Yu 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/lima/lima_gem.c   |  4 ++--
 drivers/gpu/drm/lima/lima_sched.c | 21 -
 drivers/gpu/drm/lima/lima_sched.h |  3 ---
 3 files changed, 2 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index c528f40981bb..e54a88d5037a 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -267,7 +267,7 @@ static int lima_gem_sync_bo(struct lima_sched_task *task, 
struct lima_bo *bo,
if (explicit)
return 0;
 
-   return drm_gem_fence_array_add_implicit(>deps, >base.base, 
write);
+   return drm_sched_job_await_implicit(>base, >base.base, write);
 }
 
 static int lima_gem_add_deps(struct drm_file *file, struct lima_submit *submit)
@@ -285,7 +285,7 @@ static int lima_gem_add_deps(struct drm_file *file, struct 
lima_submit *submit)
if (err)
return err;
 
-   err = drm_gem_fence_array_add(>task->deps, fence);
+   err = drm_sched_job_await_fence(>task->base, fence);
if (err) {
dma_fence_put(fence);
return err;
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index e968b5a8f0b0..99d5f6f1a882 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -134,24 +134,15 @@ int lima_sched_task_init(struct lima_sched_task *task,
task->num_bos = num_bos;
task->vm = lima_vm_get(vm);
 
-   xa_init_flags(>deps, XA_FLAGS_ALLOC);
-
return 0;
 }
 
 void lima_sched_task_fini(struct lima_sched_task *task)
 {
-   struct dma_fence *fence;
-   unsigned long index;
int i;
 
drm_sched_job_cleanup(>base);
 
-   xa_for_each(>deps, index, fence) {
-   dma_fence_put(fence);
-   }
-   xa_destroy(>deps);
-
if (task->bos) {
for (i = 0; i < task->num_bos; i++)
drm_gem_object_put(>bos[i]->base.base);
@@ -186,17 +177,6 @@ struct dma_fence *lima_sched_context_queue_task(struct 
lima_sched_task *task)
return fence;
 }
 
-static struct dma_fence *lima_sched_dependency(struct drm_sched_job *job,
-  struct drm_sched_entity *entity)
-{
-   struct lima_sched_task *task = to_lima_task(job);
-
-   if (!xa_empty(>deps))
-   return xa_erase(>deps, task->last_dep++);
-
-   return NULL;
-}
-
 static int lima_pm_busy(struct lima_device *ldev)
 {
int ret;
@@ -472,7 +452,6 @@ static void lima_sched_free_job(struct drm_sched_job *job)
 }
 
 static const struct drm_sched_backend_ops lima_sched_ops = {
-   .dependency = lima_sched_dependency,
.run_job = lima_sched_run_job,
.timedout_job = lima_sched_timedout_job,
.free_job = lima_sched_free_job,
diff --git a/drivers/gpu/drm/lima/lima_sched.h 
b/drivers/gpu/drm/lima/lima_sched.h
index ac70006b0e26..6a11764d87b3 100644
--- a/drivers/gpu/drm/lima/lima_sched.h
+++ b/drivers/gpu/drm/lima/lima_sched.h
@@ -23,9 +23,6 @@ struct lima_sched_task {
struct lima_vm *vm;
void *frame;
 
-   struct xarray deps;
-   unsigned long last_dep;
-
struct lima_bo **bos;
int num_bos;
 
-- 
2.32.0


[Intel-gfx] [PATCH v3 10/20] drm/v3d: Use scheduler dependency handling

2021-07-08 Thread Daniel Vetter
With the prep work out of the way this isn't tricky anymore.

Aside: The chaining of the various jobs is a bit awkward, with the
possibility of failure in bad places. I think with the
drm_sched_job_init/arm split and maybe preloading the
job->dependencies xarray this should be fixable.

Cc: Melissa Wen 
Signed-off-by: Daniel Vetter 
Cc: Emma Anholt 
---
 drivers/gpu/drm/v3d/v3d_drv.h   |  5 -
 drivers/gpu/drm/v3d/v3d_gem.c   | 25 -
 drivers/gpu/drm/v3d/v3d_sched.c | 29 +
 3 files changed, 9 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 1d870261eaac..f80f4ff1f7aa 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -192,11 +192,6 @@ struct v3d_job {
struct drm_gem_object **bo;
u32 bo_count;
 
-   /* Array of struct dma_fence * to block on before submitting this job.
-*/
-   struct xarray deps;
-   unsigned long last_dep;
-
/* v3d fence to be signaled by IRQ handler when the job is complete. */
struct dma_fence *irq_fence;
 
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 5eccd3658938..42b07ffbea5e 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -257,8 +257,8 @@ v3d_lock_bo_reservations(struct v3d_job *job,
return ret;
 
for (i = 0; i < job->bo_count; i++) {
-   ret = drm_gem_fence_array_add_implicit(>deps,
-  job->bo[i], true);
+   ret = drm_sched_job_await_implicit(>base,
+  job->bo[i], true);
if (ret) {
drm_gem_unlock_reservations(job->bo, job->bo_count,
acquire_ctx);
@@ -354,8 +354,6 @@ static void
 v3d_job_free(struct kref *ref)
 {
struct v3d_job *job = container_of(ref, struct v3d_job, refcount);
-   unsigned long index;
-   struct dma_fence *fence;
int i;
 
for (i = 0; i < job->bo_count; i++) {
@@ -364,11 +362,6 @@ v3d_job_free(struct kref *ref)
}
kvfree(job->bo);
 
-   xa_for_each(>deps, index, fence) {
-   dma_fence_put(fence);
-   }
-   xa_destroy(>deps);
-
dma_fence_put(job->irq_fence);
dma_fence_put(job->done_fence);
 
@@ -452,7 +445,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file 
*file_priv,
if (ret < 0)
return ret;
 
-   xa_init_flags(>deps, XA_FLAGS_ALLOC);
ret = drm_sched_job_init(>base, _priv->sched_entity[queue],
 v3d_priv);
if (ret)
@@ -462,7 +454,7 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file 
*file_priv,
if (ret == -EINVAL)
goto fail_job;
 
-   ret = drm_gem_fence_array_add(>deps, in_fence);
+   ret = drm_sched_job_await_fence(>base, in_fence);
if (ret)
goto fail_job;
 
@@ -472,7 +464,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file 
*file_priv,
 fail_job:
drm_sched_job_cleanup(>base);
 fail:
-   xa_destroy(>deps);
pm_runtime_put_autosuspend(v3d->drm.dev);
return ret;
 }
@@ -619,8 +610,8 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
if (bin) {
v3d_push_job(>base);
 
-   ret = drm_gem_fence_array_add(>base.deps,
- 
dma_fence_get(bin->base.done_fence));
+   ret = drm_sched_job_await_fence(>base.base,
+   
dma_fence_get(bin->base.done_fence));
if (ret)
goto fail_unreserve;
}
@@ -630,7 +621,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
if (clean_job) {
struct dma_fence *render_fence =
dma_fence_get(render->base.done_fence);
-   ret = drm_gem_fence_array_add(_job->deps, render_fence);
+   ret = drm_sched_job_await_fence(_job->base, render_fence);
if (ret)
goto fail_unreserve;
v3d_push_job(clean_job);
@@ -820,8 +811,8 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
mutex_lock(>sched_lock);
v3d_push_job(>base);
 
-   ret = drm_gem_fence_array_add(_job->deps,
- dma_fence_get(job->base.done_fence));
+   ret = drm_sched_job_await_fence(_job->base,
+   dma_fence_get(job->base.done_fence));
if (ret)
goto fail_unreserve;
 
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 3f352d73af9c..f0de584f452c 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -13,7 +13,7 @@
  * jobs when bulk background jobs 

[Intel-gfx] [PATCH v3 06/20] drm/sched: improve docs around drm_sched_entity

2021-07-08 Thread Daniel Vetter
I found a few too many things that are tricky and not documented, so I
started typing.

I found a few more things that looked broken while typing, see the
various FIXMEs in drm_sched_entity.

Also some of the usual logics:
- actually include sched_entity.c declarations, that was lost in the
  move here: 620e762f9a98 ("drm/scheduler: move entity handling into
  separate file")

- Ditch the kerneldoc for internal functions, keep the comments where
  they're describing more than what the function name already implies.

- Switch drm_sched_entity to inline docs.

Signed-off-by: Daniel Vetter 
---
 Documentation/gpu/drm-mm.rst |   3 +
 drivers/gpu/drm/scheduler/sched_entity.c |  85 -
 include/drm/gpu_scheduler.h  | 145 ++-
 3 files changed, 146 insertions(+), 87 deletions(-)

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index d5a73fa2c9ef..0198fa43d254 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -504,3 +504,6 @@ Scheduler Function References
 
 .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
:export:
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/sched_entity.c
+   :export:
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index e2a6803910ce..ebb0dcb8a942 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -45,8 +45,14 @@
  * @guilty: atomic_t set to 1 when a job on this queue
  *  is found to be guilty causing a timeout
  *
- * Note: the sched_list must have at least one element to schedule
- *   the entity
+ * Note that the _list must have at least one element to schedule the 
entity.
+ *
+ * For changing @priority later on at runtime see
+ * drm_sched_entity_set_priority(). For changing the set of schedulers
+ * @sched_list at runtime see drm_sched_entity_modify_sched().
+ *
+ * An entity is cleaned up by calling drm_sched_entity_fini(). See also
+ * drm_sched_entity_destroy().
  *
  * Returns 0 on success or a negative error code on failure.
  */
@@ -92,6 +98,11 @@ EXPORT_SYMBOL(drm_sched_entity_init);
  * @sched_list: the list of new drm scheds which will replace
  *  existing entity->sched_list
  * @num_sched_list: number of drm sched in sched_list
+ *
+ * Note that this must be called under the same common lock for @entity as
+ * drm_sched_job_arm() and drm_sched_entity_push_job(), or the driver needs to
+ * guarantee through some other means that this is never called while new jobs
+ * can be pushed to @entity.
  */
 void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
struct drm_gpu_scheduler **sched_list,
@@ -104,13 +115,6 @@ void drm_sched_entity_modify_sched(struct drm_sched_entity 
*entity,
 }
 EXPORT_SYMBOL(drm_sched_entity_modify_sched);
 
-/**
- * drm_sched_entity_is_idle - Check if entity is idle
- *
- * @entity: scheduler entity
- *
- * Returns true if the entity does not have any unscheduled jobs.
- */
 static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
 {
rmb(); /* for list_empty to work without lock */
@@ -123,13 +127,7 @@ static bool drm_sched_entity_is_idle(struct 
drm_sched_entity *entity)
return false;
 }
 
-/**
- * drm_sched_entity_is_ready - Check if entity is ready
- *
- * @entity: scheduler entity
- *
- * Return true if entity could provide a job.
- */
+/* Return true if entity could provide a job. */
 bool drm_sched_entity_is_ready(struct drm_sched_entity *entity)
 {
if (spsc_queue_peek(>job_queue) == NULL)
@@ -192,14 +190,7 @@ long drm_sched_entity_flush(struct drm_sched_entity 
*entity, long timeout)
 }
 EXPORT_SYMBOL(drm_sched_entity_flush);
 
-/**
- * drm_sched_entity_kill_jobs_cb - helper for drm_sched_entity_kill_jobs
- *
- * @f: signaled fence
- * @cb: our callback structure
- *
- * Signal the scheduler finished fence when the entity in question is killed.
- */
+/* Signal the scheduler finished fence when the entity in question is killed. 
*/
 static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
  struct dma_fence_cb *cb)
 {
@@ -224,14 +215,6 @@ drm_sched_job_dependency(struct drm_sched_job *job,
return NULL;
 }
 
-/**
- * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed
- *
- * @entity: entity which is cleaned up
- *
- * Makes sure that all remaining jobs in an entity are killed before it is
- * destroyed.
- */
 static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
 {
struct drm_sched_job *job;
@@ -273,9 +256,11 @@ static void drm_sched_entity_kill_jobs(struct 
drm_sched_entity *entity)
  *
  * @entity: scheduler entity
  *
- * This should be called after @drm_sched_entity_do_release. It goes over the
- * entity and signals all jobs with an error code if the process was killed.
+ * Cleanups up @entity which has been initialized by 

[Intel-gfx] [PATCH v3 05/20] drm/sched: drop entity parameter from drm_sched_push_job

2021-07-08 Thread Daniel Vetter
Originally a job was only bound to the queue when we pushed this, but
now that's done in drm_sched_job_init, making that parameter entirely
redundant.

Remove it.

The same applies to the context parameter in
lima_sched_context_queue_task, simplify that too.
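
A tiny user-space sketch of why the parameter is redundant, assuming hypothetical queue_* types that are not the scheduler's own: once the job records its entity at init time, the push step can look it up from the job.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the scheduler entity and job. */
struct queue_entity {
	int pushed;		/* number of jobs pushed to this entity */
};

struct queue_job {
	struct queue_entity *entity;	/* bound once, at init time */
};

/* The job is tied to its entity here, as in drm_sched_job_init(). */
static void queue_job_init(struct queue_job *job, struct queue_entity *entity)
{
	job->entity = entity;
}

/* No entity argument needed: the job already knows where it belongs. */
static void queue_job_push(struct queue_job *job)
{
	job->entity->pushed++;
}
```

Passing the entity a second time would only add a chance for the two values to disagree.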

Reviewed-by: Steven Price  (v1)
Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: Emma Anholt 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Alex Deucher 
Cc: Nirmoy Das 
Cc: Dave Airlie 
Cc: Chen Li 
Cc: Lee Jones 
Cc: Deepak R Varma 
Cc: Kevin Wang 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Dennis Li 
Cc: Boris Brezillon 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  | 2 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  | 2 +-
 drivers/gpu/drm/lima/lima_gem.c  | 3 +--
 drivers/gpu/drm/lima/lima_sched.c| 5 ++---
 drivers/gpu/drm/lima/lima_sched.h| 3 +--
 drivers/gpu/drm/panfrost/panfrost_job.c  | 2 +-
 drivers/gpu/drm/scheduler/sched_entity.c | 6 ++
 drivers/gpu/drm/v3d/v3d_gem.c| 2 +-
 include/drm/gpu_scheduler.h  | 3 +--
 10 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index a4ec092af9a7..18f63567fb69 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1267,7 +1267,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 
trace_amdgpu_cs_ioctl(job);
amdgpu_vm_bo_trace_cs(>vm, >ticket);
-   drm_sched_entity_push_job(>base, entity);
+   drm_sched_entity_push_job(>base);
 
amdgpu_vm_move_to_lru_tail(p->adev, >vm);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 5ddb955d2315..b86099c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -174,7 +174,7 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
 
*f = dma_fence_get(>base.s_fence->finished);
amdgpu_job_free_resources(job);
-   drm_sched_entity_push_job(>base, entity);
+   drm_sched_entity_push_job(>base);
 
return 0;
 }
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 05f412204118..180bb633d5c5 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -178,7 +178,7 @@ int etnaviv_sched_push_job(struct drm_sched_entity 
*sched_entity,
/* the scheduler holds on to the job now */
kref_get(>refcount);
 
-   drm_sched_entity_push_job(>sched_job, sched_entity);
+   drm_sched_entity_push_job(>sched_job);
 
 out_unlock:
mutex_unlock(>gpu->fence_lock);
diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index de62966243cd..c528f40981bb 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -359,8 +359,7 @@ int lima_gem_submit(struct drm_file *file, struct 
lima_submit *submit)
goto err_out2;
}
 
-   fence = lima_sched_context_queue_task(
-   submit->ctx->context + submit->pipe, submit->task);
+   fence = lima_sched_context_queue_task(submit->task);
 
for (i = 0; i < submit->nr_bos; i++) {
if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE)
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index 38f755580507..e968b5a8f0b0 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -177,13 +177,12 @@ void lima_sched_context_fini(struct lima_sched_pipe *pipe,
drm_sched_entity_fini(>base);
 }
 
-struct dma_fence *lima_sched_context_queue_task(struct lima_sched_context 
*context,
-   struct lima_sched_task *task)
+struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
 {
struct dma_fence *fence = dma_fence_get(>base.s_fence->finished);
 
trace_lima_task_submit(task);
-   drm_sched_entity_push_job(>base, >base);
+   drm_sched_entity_push_job(>base);
return fence;
 }
 
diff --git a/drivers/gpu/drm/lima/lima_sched.h 
b/drivers/gpu/drm/lima/lima_sched.h
index 90f03c48ef4a..ac70006b0e26 100644
--- a/drivers/gpu/drm/lima/lima_sched.h
+++ b/drivers/gpu/drm/lima/lima_sched.h
@@ -98,8 +98,7 @@ int lima_sched_context_init(struct lima_sched_pipe *pipe,
atomic_t *guilty);
 void lima_sched_context_fini(struct lima_sched_pipe *pipe,
 struct lima_sched_context 

[Intel-gfx] [PATCH v3 07/20] drm/panfrost: use scheduler dependency tracking

2021-07-08 Thread Daniel Vetter
Just deletes some code that's now more shared.

Note that thanks to the split into drm_sched_job_init/arm we can now
easily pull the _init() part from under the submission lock way ahead
where we're adding the sync file in-fences as dependencies.

v2: Correctly clean up the partially set up job, now that job_init()
and job_arm() are apart (Emma).
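
For readers following along, the two-label unwinding this relies on can be sketched in plain userspace C. This is only an illustration under stated assumptions: every mock_* name below is hypothetical and is not the panfrost or drm/sched API. The point is that a failure before the scheduler state exists skips the scheduler cleanup, while a failure after it (for example while adding in-fence dependencies) has to undo it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the panfrost or drm/sched API. */
struct mock_job {
	int refcount;
	void *sched_state;	/* set up by mock_sched_init() */
};

int mock_cleanups;	/* how often scheduler state was torn down */
int mock_puts;		/* how often a job reference was dropped */

static int mock_sched_init(struct mock_job *job, int fail)
{
	if (fail)
		return -1;
	job->sched_state = malloc(16);
	return job->sched_state ? 0 : -1;
}

static void mock_sched_cleanup(struct mock_job *job)
{
	free(job->sched_state);
	job->sched_state = NULL;
	mock_cleanups++;
}

static void mock_job_put(struct mock_job *job)
{
	assert(job->refcount > 0);
	job->refcount--;
	mock_puts++;
}

/* fail_at: 0 = success, 1 = init fails, 2 = adding dependencies fails */
int mock_submit(struct mock_job *job, int fail_at)
{
	int ret;

	job->refcount = 1;
	job->sched_state = NULL;

	ret = mock_sched_init(job, fail_at == 1);
	if (ret)
		goto fail_job_put;	/* no scheduler state to undo yet */

	ret = (fail_at == 2) ? -1 : 0;	/* stands in for adding in-fences */
	if (ret)
		goto fail_job;		/* scheduler state must be undone */

	/* In this model a successful submit only drops its own reference. */
	mock_job_put(job);
	return 0;

fail_job:
	mock_sched_cleanup(job);
fail_job_put:
	mock_job_put(job);
	return ret;
}
```

The model deliberately leaves the scheduler state alive on success, since what happens to it after a push is outside the scope of this sketch.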

Reviewed-by: Steven Price  (v1)
Signed-off-by: Daniel Vetter 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 16 ---
 drivers/gpu/drm/panfrost/panfrost_job.c | 37 +++--
 drivers/gpu/drm/panfrost/panfrost_job.h |  5 +---
 3 files changed, 17 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c 
b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 1ffaef5ec5ff..9f53bea07d61 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -218,7 +218,7 @@ panfrost_copy_in_sync(struct drm_device *dev,
if (ret)
goto fail;
 
-   ret = drm_gem_fence_array_add(>deps, fence);
+   ret = drm_sched_job_await_fence(>base, fence);
 
if (ret)
goto fail;
@@ -236,7 +236,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
struct drm_panfrost_submit *args = data;
struct drm_syncobj *sync_out = NULL;
struct panfrost_job *job;
-   int ret = 0;
+   int ret = 0, slot;
 
if (!args->jc)
return -EINVAL;
@@ -258,14 +258,20 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
 
kref_init(>refcount);
 
-   xa_init_flags(>deps, XA_FLAGS_ALLOC);
-
job->pfdev = pfdev;
job->jc = args->jc;
job->requirements = args->requirements;
job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev);
job->file_priv = file->driver_priv;
 
+   slot = panfrost_job_get_slot(job);
+
+   ret = drm_sched_job_init(>base,
+>file_priv->sched_entity[slot],
+NULL);
+   if (ret)
+   goto fail_job_put;
+
ret = panfrost_copy_in_sync(dev, file, args, job);
if (ret)
goto fail_job;
@@ -283,6 +289,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, 
void *data,
drm_syncobj_replace_fence(sync_out, job->render_done_fence);
 
 fail_job:
+   drm_sched_job_cleanup(>base);
+fail_job_put:
panfrost_job_put(job);
 fail_out_sync:
if (sync_out)
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 4bc962763e1f..86c843d8822e 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -102,7 +102,7 @@ static struct dma_fence *panfrost_fence_create(struct 
panfrost_device *pfdev, in
return >base;
 }
 
-static int panfrost_job_get_slot(struct panfrost_job *job)
+int panfrost_job_get_slot(struct panfrost_job *job)
 {
/* JS0: fragment jobs.
 * JS1: vertex/tiler jobs
@@ -242,13 +242,13 @@ static void panfrost_job_hw_submit(struct panfrost_job 
*job, int js)
 
 static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
  int bo_count,
- struct xarray *deps)
+ struct drm_sched_job *job)
 {
int i, ret;
 
for (i = 0; i < bo_count; i++) {
/* panfrost always uses write mode in its current uapi */
-   ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
+   ret = drm_sched_job_await_implicit(job, bos[i], true);
if (ret)
return ret;
}
@@ -269,31 +269,21 @@ static void panfrost_attach_object_fences(struct 
drm_gem_object **bos,
 int panfrost_job_push(struct panfrost_job *job)
 {
struct panfrost_device *pfdev = job->pfdev;
-   int slot = panfrost_job_get_slot(job);
-   struct drm_sched_entity *entity = >file_priv->sched_entity[slot];
struct ww_acquire_ctx acquire_ctx;
int ret = 0;
 
-
ret = drm_gem_lock_reservations(job->bos, job->bo_count,
_ctx);
if (ret)
return ret;
 
mutex_lock(>sched_lock);
-
-   ret = drm_sched_job_init(>base, entity, NULL);
-   if (ret) {
-   mutex_unlock(>sched_lock);
-   goto unlock;
-   }
-
drm_sched_job_arm(>base);
 
job->render_done_fence = dma_fence_get(>base.s_fence->finished);
 
ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
->deps);
+  

[Intel-gfx] [PATCH v3 04/20] drm/sched: Add dependency tracking

2021-07-08 Thread Daniel Vetter
Instead of just a callback we can just glue in the gem helpers that
panfrost, v3d and lima currently use. There's really not that many
ways to skin this cat.

On the naming bikeshed: The idea for using _await_ to denote adding
dependencies to a job comes from i915, where that's used quite
extensively all over the place, in lots of datastructures.
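
As a rough illustration of the dependency handling being moved into the scheduler, here is a userspace sketch of the "at most one fence per context" deduplication. It assumes a fixed-size array instead of an xarray, plain (context, seqno) pairs instead of refcounted dma_fences, and no seqno wraparound; all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_DEPS 8

/* Hypothetical fence: a point on a per-context timeline. */
struct toy_fence {
	uint64_t context;
	uint64_t seqno;
};

struct toy_job {
	struct toy_fence deps[TOY_MAX_DEPS];
	size_t num_deps;
};

/* Only meaningful for two fences from the same context. */
static bool toy_fence_is_later(const struct toy_fence *a,
			       const struct toy_fence *b)
{
	assert(a->context == b->context);
	return a->seqno > b->seqno;
}

/* Returns 0 on success, -1 if the dependency array is full. */
int toy_job_await_fence(struct toy_job *job, struct toy_fence fence)
{
	size_t i;

	/* Keep one entry per context, and keep only the later fence. */
	for (i = 0; i < job->num_deps; i++) {
		if (job->deps[i].context != fence.context)
			continue;

		if (toy_fence_is_later(&fence, &job->deps[i]))
			job->deps[i] = fence;
		return 0;
	}

	if (job->num_deps == TOY_MAX_DEPS)
		return -1;

	job->deps[job->num_deps++] = fence;
	return 0;
}
```

With this shape the number of stored dependencies grows with the number of distinct contexts, not with the number of buffers that were fenced.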

v2/3: Rebased.

Reviewed-by: Steven Price  (v1)
Signed-off-by: Daniel Vetter 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Andrey Grodzovsky 
Cc: Lee Jones 
Cc: Nirmoy Das 
Cc: Boris Brezillon 
Cc: Luben Tuikov 
Cc: Alex Deucher 
Cc: Jack Zhang 
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
---
 drivers/gpu/drm/scheduler/sched_entity.c |  18 +++-
 drivers/gpu/drm/scheduler/sched_main.c   | 103 +++
 include/drm/gpu_scheduler.h  |  31 ++-
 3 files changed, 146 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 4e1124ed80e0..c7e6d29c9a33 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -211,6 +211,19 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence 
*f,
job->sched->ops->free_job(job);
 }
 
+static struct dma_fence *
+drm_sched_job_dependency(struct drm_sched_job *job,
+struct drm_sched_entity *entity)
+{
+   if (!xa_empty(&job->dependencies))
+   return xa_erase(&job->dependencies, job->last_dependency++);
+
+   if (job->sched->ops->dependency)
+   return job->sched->ops->dependency(job, entity);
+
+   return NULL;
+}
+
 /**
  * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed
  *
@@ -229,7 +242,7 @@ static void drm_sched_entity_kill_jobs(struct 
drm_sched_entity *entity)
struct drm_sched_fence *s_fence = job->s_fence;
 
/* Wait for all dependencies to avoid data corruptions */
-   while ((f = job->sched->ops->dependency(job, entity)))
+   while ((f = drm_sched_job_dependency(job, entity)))
dma_fence_wait(f, false);
 
drm_sched_fence_scheduled(s_fence);
@@ -419,7 +432,6 @@ static bool drm_sched_entity_add_dependency_cb(struct 
drm_sched_entity *entity)
  */
 struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
 {
-   struct drm_gpu_scheduler *sched = entity->rq->sched;
struct drm_sched_job *sched_job;
 
sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
@@ -427,7 +439,7 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
drm_sched_entity *entity)
return NULL;
 
while ((entity->dependency =
-   sched->ops->dependency(sched_job, entity))) {
+   drm_sched_job_dependency(sched_job, entity))) {
trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
 
if (drm_sched_entity_add_dependency_cb(entity))
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 7e94754eb34c..ad62f1d2991c 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -594,6 +594,8 @@ int drm_sched_job_init(struct drm_sched_job *job,
 
INIT_LIST_HEAD(&job->list);
 
+   xa_init_flags(&job->dependencies, XA_FLAGS_ALLOC);
+
return 0;
 }
 EXPORT_SYMBOL(drm_sched_job_init);
@@ -631,6 +633,98 @@ void drm_sched_job_arm(struct drm_sched_job *job)
 }
 EXPORT_SYMBOL(drm_sched_job_arm);
 
+/**
+ * drm_sched_job_await_fence - adds the fence as a job dependency
+ * @job: scheduler job to add the dependencies to
+ * @fence: the dma_fence to add to the list of dependencies.
+ *
+ * Note that @fence is consumed in both the success and error cases.
+ *
+ * Returns:
+ * 0 on success, or an error on failing to expand the array.
+ */
+int drm_sched_job_await_fence(struct drm_sched_job *job,
+ struct dma_fence *fence)
+{
+   struct dma_fence *entry;
+   unsigned long index;
+   u32 id = 0;
+   int ret;
+
+   if (!fence)
+   return 0;
+
+   /* Deduplicate if we already depend on a fence from the same context.
+* This lets the size of the array of deps scale with the number of
+* engines involved, rather than the number of BOs.
+*/
+   xa_for_each(&job->dependencies, index, entry) {
+   if (entry->context != fence->context)
+   continue;
+
+   if (dma_fence_is_later(fence, entry)) {
+   dma_fence_put(entry);
+   xa_store(&job->dependencies, index, fence, GFP_KERNEL);
+   } else {
+   dma_fence_put(fence);
+   }
+   return 0;
+   }
+
+   ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL);
+   if (ret != 0)
+ 

[Intel-gfx] [PATCH v3 03/20] drm/sched: Barriers are needed for entity->last_scheduled

2021-07-08 Thread Daniel Vetter
It might be good enough on x86 with just READ_ONCE, but the write side
should then at least be WRITE_ONCE because x86 has total store order.

It's definitely not enough on arm.

Fix this properly, which means
- explain the need for the barrier in both places
- point at the other side in each comment

Also pull out the !sched_list case as the first check, so that the
code flow is clearer.

While at it sprinkle some comments around because it was very
non-obvious to me what's actually going on here and why.

Note that we really need full barriers here. At first I thought
store-release and load-acquire on ->last_scheduled would be enough,
but we actually require ordering between that and the queue state.
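
To make the pairing concrete, here is a userspace sketch of the "publish the pointer, then dequeue" ordering using C11 atomics. It is only an analogy under stated assumptions: C11 release/acquire fences are not the same thing as the kernel's smp_wmb()/smp_rmb(), the queue is reduced to a counter, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical model of an entity: a published pointer plus a queue count. */
struct demo_entity {
	_Atomic(const char *) last_scheduled;
	atomic_int queue_count;
};

/* Scheduler side: publish the pointer first, then dequeue. */
void demo_pop_job(struct demo_entity *e, const char *finished)
{
	assert(atomic_load_explicit(&e->queue_count, memory_order_relaxed) > 0);

	atomic_store_explicit(&e->last_scheduled, finished,
			      memory_order_relaxed);

	/* Order the pointer store before the dequeue (write-barrier side). */
	atomic_thread_fence(memory_order_release);

	atomic_fetch_sub_explicit(&e->queue_count, 1, memory_order_relaxed);
}

/*
 * Submitter side: only once the queue is seen empty may the pointer be
 * read without a lock. Returns NULL while the queue is non-empty.
 */
const char *demo_peek_last_scheduled(struct demo_entity *e)
{
	if (atomic_load_explicit(&e->queue_count, memory_order_relaxed) != 0)
		return NULL;

	/* Order the emptiness check before the pointer load (read side). */
	atomic_thread_fence(memory_order_acquire);

	return atomic_load_explicit(&e->last_scheduled, memory_order_relaxed);
}
```

A reader that observes the queue as empty is then guaranteed to observe the pointer written before the matching dequeue, which is the property the two comments in the diff below point at.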

Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Steven Price 
Cc: Daniel Vetter 
Cc: Andrey Grodzovsky 
Cc: Lee Jones 
Cc: Boris Brezillon 
---
 drivers/gpu/drm/scheduler/sched_entity.c | 27 ++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 64d398166644..4e1124ed80e0 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -439,8 +439,16 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct 
drm_sched_entity *entity)
dma_fence_set_error(&sched_job->s_fence->finished, -ECANCELED);
 
dma_fence_put(entity->last_scheduled);
+
entity->last_scheduled = dma_fence_get(&sched_job->s_fence->finished);
 
+   /*
+* if the queue is empty we allow drm_sched_job_arm() to locklessly
+* access ->last_scheduled. This only works if we set the pointer before
+* we dequeue and if we have a write barrier here.
+*/
+   smp_wmb();
+
spsc_queue_pop(&entity->job_queue);
return sched_job;
 }
@@ -459,10 +467,25 @@ void drm_sched_entity_select_rq(struct drm_sched_entity 
*entity)
struct drm_gpu_scheduler *sched;
struct drm_sched_rq *rq;
 
-   if (spsc_queue_count(&entity->job_queue) || !entity->sched_list)
+   /* single possible engine and already selected */
+   if (!entity->sched_list)
+   return;
+
+   /* queue non-empty, stay on the same engine */
+   if (spsc_queue_count(&entity->job_queue))
return;
 
-   fence = READ_ONCE(entity->last_scheduled);
+   fence = entity->last_scheduled;
+
+   /*
+* Only when the queue is empty are we guaranteed that the scheduler
+* thread cannot change ->last_scheduled. To enforce ordering we need
+* a read barrier here. See drm_sched_entity_pop_job() for the other
+* side.
+*/
+   smp_rmb();
+
+   /* stay on the same engine if the previous job hasn't finished */
if (fence && !dma_fence_is_signaled(fence))
return;
 
-- 
2.32.0



[Intel-gfx] [PATCH v3 01/20] drm/sched: entity->rq selection cannot fail

2021-07-08 Thread Daniel Vetter
If it does, someone managed to set up a sched_entity without
schedulers, which is just a driver bug.

We BUG_ON() here because in the next patch drm_sched_job_init() will
be split up, with drm_sched_job_arm() never failing. And that's the
part where the rq selection will end up in.

Note that if having an empty sched_list set on an entity is indeed a
valid use-case, we can keep that check in job_init even after the split
into job_init/arm.

Signed-off-by: Daniel Vetter 
Cc: "Christian König" 
Cc: Luben Tuikov 
Cc: Daniel Vetter 
Cc: Steven Price 
Cc: Andrey Grodzovsky 
Cc: Boris Brezillon 
Cc: Jack Zhang 
---
 drivers/gpu/drm/scheduler/sched_entity.c | 2 +-
 drivers/gpu/drm/scheduler/sched_main.c   | 3 +--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 79554aa4dbb1..6fc116ee7302 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -45,7 +45,7 @@
  * @guilty: atomic_t set to 1 when a job on this queue
  *  is found to be guilty causing a timeout
  *
- * Note: the sched_list should have at least one element to schedule
+ * Note: the sched_list must have at least one element to schedule
  *   the entity
  *
  * Returns 0 on success or a negative error code on failure.
diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 33c414d55fab..01dd47154181 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -586,8 +586,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
struct drm_gpu_scheduler *sched;
 
drm_sched_entity_select_rq(entity);
-   if (!entity->rq)
-   return -ENOENT;
+   BUG_ON(!entity->rq);
 
sched = entity->rq->sched;
 
-- 
2.32.0



[Intel-gfx] [PATCH v3 02/20] drm/sched: Split drm_sched_job_init

2021-07-08 Thread Daniel Vetter
This is a very confusingly named function, because it does not just
init an object: it also arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that were a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

v3: Emma noticed that I leak the memory allocated in
drm_sched_job_init if we bail out before the point of no return in
subsequent driver patches. To be able to fix this, change
drm_sched_job_cleanup() so it can handle being called both before and
after drm_sched_job_arm().

Also improve the kerneldoc for this.

v4:
- Fix the drm_sched_job_cleanup logic, I inverted the booleans, as
  usual (Melissa)

- Christian pointed out that drm_sched_entity_select_rq() also needs
  to be moved into drm_sched_job_arm, which made me realize that the
  job->id definitely needs to be moved too.

  Shuffle things to fit between job_init and job_arm.
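
For anyone trying to picture the resulting lifecycle, here is a small userspace sketch of the init/arm/cleanup split. It is a model under stated assumptions and not the scheduler code: the sketch_* names are hypothetical, a malloc() stands in for the scheduler fence allocation, and the armed case simply frees where a real implementation would drop a reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a job with a separate init and arm step. */
struct sketch_job {
	void *fence_mem;	/* allocated at init time */
	bool armed;		/* true once the point of no return is passed */
	unsigned long id;	/* only assigned when arming */
};

enum sketch_cleanup_path {
	SKETCH_FREED_UNARMED,	/* bailed out before arm */
	SKETCH_RELEASED_ARMED,	/* cleaned up after arm */
};

static unsigned long sketch_next_id = 1;

/* Phase 1: allocation only. May fail, and can be undone by cleanup. */
int sketch_job_init(struct sketch_job *job)
{
	job->fence_mem = malloc(64);
	if (!job->fence_mem)
		return -1;

	job->armed = false;
	job->id = 0;
	return 0;
}

/* Phase 2: cannot fail; after this the job must be pushed. */
void sketch_job_arm(struct sketch_job *job)
{
	assert(job->fence_mem && !job->armed);

	job->id = sketch_next_id++;
	job->armed = true;
}

/* Valid both before and after arm; reports which path was taken. */
enum sketch_cleanup_path sketch_job_cleanup(struct sketch_job *job)
{
	enum sketch_cleanup_path path =
		job->armed ? SKETCH_RELEASED_ARMED : SKETCH_FREED_UNARMED;

	/* The model frees in both cases, see the assumptions above. */
	free(job->fence_mem);
	job->fence_mem = NULL;
	job->armed = false;
	return path;
}
```

The id is assigned in the arm step only, which mirrors the v4 note above about moving the job id assignment along with the rq selection.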

Cc: Melissa Wen 
Acked-by: Steven Price  (v2)
Signed-off-by: Daniel Vetter 
Cc: Lucas Stach 
Cc: Russell King 
Cc: Christian Gmeiner 
Cc: Qiang Yu 
Cc: Rob Herring 
Cc: Tomeu Vizoso 
Cc: Steven Price 
Cc: Alyssa Rosenzweig 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: Masahiro Yamada 
Cc: Kees Cook 
Cc: Adam Borowski 
Cc: Nick Terrell 
Cc: Mauro Carvalho Chehab 
Cc: Paul Menzel 
Cc: Sami Tolvanen 
Cc: Viresh Kumar 
Cc: Alex Deucher 
Cc: Dave Airlie 
Cc: Nirmoy Das 
Cc: Deepak R Varma 
Cc: Lee Jones 
Cc: Kevin Wang 
Cc: Chen Li 
Cc: Luben Tuikov 
Cc: "Marek Olšák" 
Cc: Dennis Li 
Cc: Maarten Lankhorst 
Cc: Andrey Grodzovsky 
Cc: Sonny Jiang 
Cc: Boris Brezillon 
Cc: Tian Tao 
Cc: etna...@lists.freedesktop.org
Cc: l...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Cc: Emma Anholt 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 +
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 +
 drivers/gpu/drm/lima/lima_sched.c|  2 +
 drivers/gpu/drm/panfrost/panfrost_job.c  |  2 +
 drivers/gpu/drm/scheduler/sched_entity.c |  6 +--
 drivers/gpu/drm/scheduler/sched_fence.c  | 19 ---
 drivers/gpu/drm/scheduler/sched_main.c   | 64 
 drivers/gpu/drm/v3d/v3d_gem.c|  2 +
 include/drm/gpu_scheduler.h  |  7 ++-
 10 files changed, 86 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index c5386d13eb4a..a4ec092af9a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
if (r)
goto error_unlock;
 
+   drm_sched_job_arm(>base);
+
/* No memory allocation is allowed while holding the notifier lock.
 * The lock is held until amdgpu_cs_submit is finished and fence is
 * added to BOs.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index d33e6d97cc89..5ddb955d2315 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct 
drm_sched_entity *entity,
if (r)
return r;
 
+   drm_sched_job_arm(>base);
+
*f = dma_fence_get(>base.s_fence->finished);
amdgpu_job_free_resources(job);
drm_sched_entity_push_job(>base, entity);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index feb6da1b6ceb..05f412204118 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity 
*sched_entity,
if (ret)
goto out_unlock;
 
+   drm_sched_job_arm(>sched_job);
+
submit->out_fence = dma_fence_get(>sched_job.s_fence->finished);
submit->out_fence_id = idr_alloc_cyclic(>gpu->fence_idr,
submit->out_fence, 0,
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index dba8329937a3..38f755580507 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
return err;
}
 
+   drm_sched_job_arm(>base);
+
task->num_bos = num_bos;
task->vm = lima_vm_get(vm);
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
b/drivers/gpu/drm/panfrost/panfrost_job.c
index 71a72fb50e6b..2992dc85325f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -288,6 +288,8 @@ int 

[Intel-gfx] [PATCH v3 00/20] drm/sched dependency tracking and dma-resv fixes

2021-07-08 Thread Daniel Vetter
Hi all,

I figured I'll combine the two series, they build on top of each other
anyway. Changes:

- drop broken i915 patch (Matt)
- typos and improvements in the dma-resv patch
- bunch of fixes to the drm_sched_job_init/arm split (Melissa, Christian)
- threw a drm_sched_entity doc patch on top

Testing & review very much welcome.

Cheers, Daniel

Christian König (1):
  drm/msm: always wait for the exclusive fence

Daniel Vetter (19):
  drm/sched: entity->rq selection cannot fail
  drm/sched: Split drm_sched_job_init
  drm/sched: Barriers are needed for entity->last_scheduled
  drm/sched: Add dependency tracking
  drm/sched: drop entity parameter from drm_sched_push_job
  drm/sched: improve docs around drm_sched_entity
  drm/panfrost: use scheduler dependency tracking
  drm/lima: use scheduler dependency tracking
  drm/v3d: Move drm_sched_job_init to v3d_job_init
  drm/v3d: Use scheduler dependency handling
  drm/etnaviv: Use scheduler dependency handling
  drm/gem: Delete gem array fencing helpers
  drm/sched: Don't store self-dependencies
  drm/sched: Check locking in drm_sched_job_await_implicit
  drm/msm: Don't break exclusive fence ordering
  drm/etnaviv: Don't break exclusive fence ordering
  drm/i915: delete exclude argument from i915_sw_fence_await_reservation
  drm/i915: Don't break exclusive fence ordering
  dma-resv: Give the docs a do-over

 Documentation/gpu/drm-mm.rst  |   3 +
 drivers/dma-buf/dma-resv.c|  24 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c|   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |   4 +-
 drivers/gpu/drm/drm_gem.c |  96 -
 drivers/gpu/drm/etnaviv/etnaviv_gem.h |   5 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c  |  64 +++---
 drivers/gpu/drm/etnaviv/etnaviv_sched.c   |  65 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.h   |   3 +-
 drivers/gpu/drm/i915/display/intel_display.c  |   4 +-
 drivers/gpu/drm/i915/gem/i915_gem_clflush.c   |   2 +-
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|   8 +-
 drivers/gpu/drm/i915/i915_sw_fence.c  |   6 +-
 drivers/gpu/drm/i915/i915_sw_fence.h  |   1 -
 drivers/gpu/drm/lima/lima_gem.c   |   7 +-
 drivers/gpu/drm/lima/lima_sched.c |  28 +--
 drivers/gpu/drm/lima/lima_sched.h |   6 +-
 drivers/gpu/drm/msm/msm_gem.c |  16 +-
 drivers/gpu/drm/msm/msm_gem_submit.c  |   3 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c   |  16 +-
 drivers/gpu/drm/panfrost/panfrost_job.c   |  39 +---
 drivers/gpu/drm/panfrost/panfrost_job.h   |   5 +-
 drivers/gpu/drm/scheduler/sched_entity.c  | 140 +++--
 drivers/gpu/drm/scheduler/sched_fence.c   |  19 +-
 drivers/gpu/drm/scheduler/sched_main.c| 177 +++--
 drivers/gpu/drm/v3d/v3d_drv.h |   6 +-
 drivers/gpu/drm/v3d/v3d_gem.c | 115 +--
 drivers/gpu/drm/v3d/v3d_sched.c   |  44 +
 include/drm/drm_gem.h |   5 -
 include/drm/gpu_scheduler.h   | 186 ++
 include/linux/dma-buf.h   |   7 +
 include/linux/dma-resv.h  | 104 +-
 32 files changed, 674 insertions(+), 538 deletions(-)

-- 
2.32.0



Re: [Intel-gfx] [PATCH v15 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing

2021-07-08 Thread Will Deacon
On Tue, Jul 06, 2021 at 12:14:16PM -0700, Nathan Chancellor wrote:
> On 7/6/2021 10:06 AM, Will Deacon wrote:
> > On Tue, Jul 06, 2021 at 04:39:11PM +0100, Robin Murphy wrote:
> > > On 2021-07-06 15:05, Christoph Hellwig wrote:
> > > > On Tue, Jul 06, 2021 at 03:01:04PM +0100, Robin Murphy wrote:
> > > > > FWIW I was pondering the question of whether to do something along 
> > > > > those
> > > > > lines or just scrap the default assignment entirely, so since I 
> > > > > hadn't got
> > > > > round to saying that I've gone ahead and hacked up the alternative
> > > > > (similarly untested) for comparison :)
> > > > > 
> > > > > TBH I'm still not sure which one I prefer...
> > > > 
> > > > Claire did implement something like your suggestion originally, but
> > > > I don't really like it as it doesn't scale for adding multiple global
> > > > pools, e.g. for the 64-bit addressable one for the various encrypted
> > > > secure guest schemes.
> > > 
> > > Ah yes, that had slipped my mind, and it's a fair point indeed. Since 
> > > we're
> > > not concerned with a minimal fix for backports anyway I'm more than happy 
> > > to
> > > focus on Will's approach. Another thing is that that looks to take us a
> > > quiet step closer to the possibility of dynamically resizing a SWIOTLB 
> > > pool,
> > > which is something that some of the hypervisor protection schemes looking 
> > > to
> > > build on top of this series may want to explore at some point.
> > 
> > Ok, I'll split that nasty diff I posted up into a reviewable series and we
> > can take it from there.
> 
> For what it's worth, I attempted to boot Will's diff on top of Konrad's
> devel/for-linus-5.14 and it did not work; in fact, I got no output on my
> monitor period, even with earlyprintk=, and I do not think this machine has
> a serial console.

Looking back at the diff, I completely messed up swiotlb_exit() by mixing up
physical and virtual addresses.

> Robin's fix does work, it survived ten reboots with no issues getting to X
> and I do not see the KASAN and slub debug messages anymore but I understand
> that this is not the preferred solution it seems (although Konrad did want
> to know if it works).
> 
> I am happy to test any further patches or follow ups as needed, just keep me
> on CC.

Cheers. Since this isn't 5.14 material any more, I'll CC you on a series
next week.

Will


Re: [Intel-gfx] [v7 3/3] drm/i915/display/dsc: Force dsc BPP

2021-07-08 Thread Kulkarni, Vandita


> -Original Message-
> From: Nikula, Jani 
> Sent: Thursday, July 8, 2021 9:53 PM
> To: Kulkarni, Vandita ; intel-
> g...@lists.freedesktop.org
> Subject: RE: [v7 3/3] drm/i915/display/dsc: Force dsc BPP
> 
> On Thu, 08 Jul 2021, "Kulkarni, Vandita"  wrote:
> >> -Original Message-
> >> From: Nikula, Jani 
> >> Sent: Thursday, July 8, 2021 6:44 PM
> >> To: Kulkarni, Vandita ; intel-
> >> g...@lists.freedesktop.org
> >> Cc: Kulkarni, Vandita 
> >> Subject: Re: [v7 3/3] drm/i915/display/dsc: Force dsc BPP
> >>
> >> On Thu, 08 Jul 2021, Jani Nikula  wrote:
> >> > On Thu, 08 Jul 2021, Vandita Kulkarni 
> wrote:
> >> >> Set DSC BPP to the value forced through debugfs. It can go from
> >> >> bpc to bpp-1.
> >> >>
> >> >> Signed-off-by: Vandita Kulkarni 
> >> >> ---
> >> >>  drivers/gpu/drm/i915/display/intel_dp.c | 17 +
> >> >>  1 file changed, 17 insertions(+)
> >> >>
> >> >> diff --git a/drivers/gpu/drm/i915/display/intel_dp.c
> >> >> b/drivers/gpu/drm/i915/display/intel_dp.c
> >> >> index 5b52beaddada..3e50cdd7e448 100644
> >> >> --- a/drivers/gpu/drm/i915/display/intel_dp.c
> >> >> +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> >> >> @@ -1240,6 +1240,23 @@ static int
> >> >> intel_dp_dsc_compute_config(struct
> >> intel_dp *intel_dp,
> >> >> pipe_config->port_clock = intel_dp->common_rates[limits-
> >> >max_clock];
> >> >> pipe_config->lane_count = limits->max_lane_count;
> >> >>
> >> >> +   if (intel_dp->force_dsc_en) {
> >>
> >> Oh, this should check for intel_dp->force_dsc_bpp. We don't want to
> >> always force the bpp when we force dsc enable.
> > Okay will fix this.
> > And I was returning -EINVAL , to fail the test on setting invalid BPP.
> 
> Okay, if it makes the test easier, I guess it's fine. Up to you.

Okay, for now I have sent a patch like you suggested, as I see that there are
no negative test cases.
I have sent the v2 of this patch, so it wouldn't make much difference.

Thanks,
Vandita
> 
> BR,
> Jani.
> 
> >
> >>
> >> >> +   /* As of today we support DSC for only RGB */
> >> >> +   if (intel_dp->force_dsc_bpp >= 8 &&
> >> >> +   intel_dp->force_dsc_bpp < pipe_bpp) {
> >> >> +   drm_dbg_kms(_priv->drm,
> >> >> +   "DSC BPP forced to %d",
> >> >> +   intel_dp->force_dsc_bpp);
> >> >> +   pipe_config->dsc.compressed_bpp =
> >> >> +   intel_dp->force_dsc_bpp;
> >> >> +   } else {
> >> >> +   drm_dbg_kms(_priv->drm,
> >> >> +   "Invalid DSC BPP %d",
> >> >> +   intel_dp->force_dsc_bpp);
> >> >> +   return -EINVAL;
> >> >
> >> > I'd just let it use the normal compressed_bpp, with the debug
> >> > message, instead of returning -EINVAL.
> >> >
> >> >> +   }
> >> >> +   }
> >> >> +
> >> >
> >> > This should be *after* the below blocks, because otherwise
> >> > compressed_bpp will be overridden by the normal case, not by the
> >> > force case!
> >> >
> >> > BR,
> >> > Jani.
> >> >
> >> >> if (intel_dp_is_edp(intel_dp)) {
> >> >> pipe_config->dsc.compressed_bpp =
> >> >> min_t(u16,
> >> drm_edp_dsc_sink_output_bpp(intel_dp->dsc_dpcd) >> 4,
> >>
> >> --
> >> Jani Nikula, Intel Open Source Graphics Center
> 
> --
> Jani Nikula, Intel Open Source Graphics Center


Re: [Intel-gfx] [v7 3/3] drm/i915/display/dsc: Force dsc BPP

2021-07-08 Thread Jani Nikula
On Thu, 08 Jul 2021, "Kulkarni, Vandita"  wrote:
>> -Original Message-
>> From: Nikula, Jani 
>> Sent: Thursday, July 8, 2021 6:44 PM
>> To: Kulkarni, Vandita ; intel-
>> g...@lists.freedesktop.org
>> Cc: Kulkarni, Vandita 
>> Subject: Re: [v7 3/3] drm/i915/display/dsc: Force dsc BPP
>> 
>> On Thu, 08 Jul 2021, Jani Nikula  wrote:
>> > On Thu, 08 Jul 2021, Vandita Kulkarni  wrote:
>> >> Set DSC BPP to the value forced through debugfs. It can go from bpc
>> >> to bpp-1.
>> >>
>> >> Signed-off-by: Vandita Kulkarni 
>> >> ---
>> >>  drivers/gpu/drm/i915/display/intel_dp.c | 17 +
>> >>  1 file changed, 17 insertions(+)
>> >>
>> >> diff --git a/drivers/gpu/drm/i915/display/intel_dp.c
>> >> b/drivers/gpu/drm/i915/display/intel_dp.c
>> >> index 5b52beaddada..3e50cdd7e448 100644
>> >> --- a/drivers/gpu/drm/i915/display/intel_dp.c
>> >> +++ b/drivers/gpu/drm/i915/display/intel_dp.c
>> >> @@ -1240,6 +1240,23 @@ static int intel_dp_dsc_compute_config(struct
>> intel_dp *intel_dp,
>> >>   pipe_config->port_clock = intel_dp->common_rates[limits-
>> >max_clock];
>> >>   pipe_config->lane_count = limits->max_lane_count;
>> >>
>> >> + if (intel_dp->force_dsc_en) {
>> 
>> Oh, this should check for intel_dp->force_dsc_bpp. We don't want to always
>> force the bpp when we force dsc enable.
> Okay will fix this.
> And I was returning -EINVAL , to fail the test on setting invalid BPP.

Okay, if it makes the test easier, I guess it's fine. Up to you.

BR,
Jani.

>
>> 
>> >> + /* As of today we support DSC for only RGB */
>> >> + if (intel_dp->force_dsc_bpp >= 8 &&
>> >> + intel_dp->force_dsc_bpp < pipe_bpp) {
>> >> + drm_dbg_kms(_priv->drm,
>> >> + "DSC BPP forced to %d",
>> >> + intel_dp->force_dsc_bpp);
>> >> + pipe_config->dsc.compressed_bpp =
>> >> + intel_dp->force_dsc_bpp;
>> >> + } else {
>> >> + drm_dbg_kms(_priv->drm,
>> >> + "Invalid DSC BPP %d",
>> >> + intel_dp->force_dsc_bpp);
>> >> + return -EINVAL;
>> >
>> > I'd just let it use the normal compressed_bpp, with the debug message,
>> > instead of returning -EINVAL.
>> >
>> >> + }
>> >> + }
>> >> +
>> >
>> > This should be *after* the below blocks, because otherwise
>> > compressed_bpp will be overridden by the normal case, not by the force
>> > case!
>> >
>> > BR,
>> > Jani.
>> >
>> >>   if (intel_dp_is_edp(intel_dp)) {
>> >>   pipe_config->dsc.compressed_bpp =
>> >>   min_t(u16,
>> drm_edp_dsc_sink_output_bpp(intel_dp->dsc_dpcd) >> 4,
>> 
>> --
>> Jani Nikula, Intel Open Source Graphics Center

-- 
Jani Nikula, Intel Open Source Graphics Center


[Intel-gfx] [PATCH 0/7] CT changes required for GuC submission

2021-07-08 Thread Matthew Brost
As part of enabling GuC submission discussed in [1], [2], and [3], we
need to optimize and update the CT code, as this is now in the critical
path of submission. This series includes the patches to do that, which are
the first 7 patches from [3]. The patches should have addressed all the
feedback in [3] and should be ready to merge once CI returns and we get a
few more RBs.

v2: Fix checkpatch warning, address a couple of Michal's comments
v3: Address John Harrison's comments
v4: Address remaining comments, resend for patchworks to merge

Signed-off-by: Matthew Brost 

[1] https://patchwork.freedesktop.org/series/89844/
[2] https://patchwork.freedesktop.org/series/91417/
[3] https://patchwork.freedesktop.org/series/91840/

John Harrison (1):
  drm/i915/guc: Module load failure test for CT buffer creation

Matthew Brost (6):
  drm/i915/guc: Relax CTB response timeout
  drm/i915/guc: Improve error message for unsolicited CT response
  drm/i915/guc: Increase size of CTB buffers
  drm/i915/guc: Add non blocking CTB send function
  drm/i915/guc: Add stall timer to non blocking CTB send function
  drm/i915/guc: Optimize CTB writes and reads

 .../gt/uc/abi/guc_communication_ctb_abi.h |   3 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.h|  11 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 256 +++---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  14 +-
 4 files changed, 234 insertions(+), 50 deletions(-)

-- 
2.28.0



[Intel-gfx] [PATCH 3/7] drm/i915/guc: Increase size of CTB buffers

2021-07-08 Thread Matthew Brost
With the introduction of non-blocking CTBs more than one CTB can be in
flight at a time. Increasing the size of the CTBs should reduce how
often software hits the case where no space is available in the CTB
buffer.
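
To illustrate how free space in such a buffer can be computed, here is a minimal user-space sketch. It follows the same convention as the kernel's `CIRC_SPACE()` macro, where one slot always stays unused so that head == tail means "empty". The helper name `ring_space` is hypothetical, and the sketch uses a modulo instead of a power-of-2 mask.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Free dwords in a ring of `size` dwords. `tail` is the producer index,
 * `head` is the consumer index, and both must be smaller than `size`.
 * One slot is kept unused so that head == tail always means "empty".
 */
static uint32_t ring_space(uint32_t tail, uint32_t head, uint32_t size)
{
	return (head + size - tail - 1) % size;
}
```

With a larger `size`, the producer can queue more dwords before this value reaches zero and it has to wait for the consumer.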

Cc: John Harrison 
Signed-off-by: Matthew Brost 
Reviewed-by: Michal Wajdeczko 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 80db59b45c45..43e03aa2dde8 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -58,11 +58,16 @@ static inline struct drm_device *ct_to_drm(struct 
intel_guc_ct *ct)
  *  ++---+--+
  *
  * Size of each `CT Buffer`_ must be multiple of 4K.
- * As we don't expect too many messages, for now use minimum sizes.
+ * We don't expect too many messages in flight at any time, unless we are
+ * using the GuC submission. In that case each request requires a minimum
+ * 2 dwords which gives us a maximum 256 queue'd requests. Hopefully this
+ * enough space to avoid backpressure on the driver. We increase the size
+ * of the receive buffer (relative to the send) to ensure a G2H response
+ * CTB has a landing spot.
  */
 #define CTB_DESC_SIZE  ALIGN(sizeof(struct guc_ct_buffer_desc), SZ_2K)
 #define CTB_H2G_BUFFER_SIZE(SZ_4K)
-#define CTB_G2H_BUFFER_SIZE(SZ_4K)
+#define CTB_G2H_BUFFER_SIZE(4 * CTB_H2G_BUFFER_SIZE)
 
 struct ct_request {
struct list_head link;
@@ -643,7 +648,7 @@ static int ct_read(struct intel_guc_ct *ct, struct 
ct_incoming_msg **msg)
/* beware of buffer wrap case */
if (unlikely(available < 0))
available += size;
-   CT_DEBUG(ct, "available %d (%u:%u)\n", available, head, tail);
+   CT_DEBUG(ct, "available %d (%u:%u:%u)\n", available, head, tail, size);
GEM_BUG_ON(available < 0);
 
header = cmds[head];
-- 
2.28.0



[Intel-gfx] [PATCH 6/7] drm/i915/guc: Optimize CTB writes and reads

2021-07-08 Thread Matthew Brost
CTB writes are now in the path of command submission and should be
optimized for performance. Rather than reading CTB descriptor values
(e.g. head, tail) which could result in accesses across the PCIe bus,
store shadow local copies and only read/write the descriptor values when
absolutely necessary. Also store the current space in each channel
locally.
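
The shadow-copy idea can be sketched in user space as follows. This is only an illustration under stated assumptions: `struct shared_desc`, `struct ring`, and the `ring_*` helpers are hypothetical names, the descriptor is an ordinary struct rather than memory shared with firmware, and `desc_reads` exists purely to count how often the "expensive" descriptor is read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct shared_desc {		/* stands in for the shared descriptor */
	volatile uint32_t head;	/* advanced by the consumer */
	volatile uint32_t tail;	/* advanced by the producer */
};

struct ring {
	struct shared_desc *desc;
	uint32_t size;		/* ring size in dwords */
	uint32_t tail;		/* local shadow of desc->tail */
	uint32_t space;		/* cached free dwords */
	unsigned int desc_reads;/* number of descriptor reads (demo only) */
};

static uint32_t ring_space(uint32_t tail, uint32_t head, uint32_t size)
{
	return (head + size - tail - 1) % size;
}

static void ring_init(struct ring *r, struct shared_desc *d, uint32_t size)
{
	d->head = 0;
	d->tail = 0;
	r->desc = d;
	r->size = size;
	r->tail = 0;
	r->space = ring_space(0, 0, size);
	r->desc_reads = 0;
}

static bool ring_has_room(struct ring *r, uint32_t len)
{
	/* Fast path: the cached value already proves there is room. */
	if (r->space >= len)
		return true;

	/* Slow path: refresh the cache from the shared descriptor. */
	r->desc_reads++;
	r->space = ring_space(r->tail, r->desc->head, r->size);
	return r->space >= len;
}

static void ring_commit(struct ring *r, uint32_t len)
{
	/* Update the local copies first, then publish the new tail. */
	r->tail = (r->tail + len) % r->size;
	r->space -= len;
	r->desc->tail = r->tail;
}
```

The cached `space` can only underestimate the real free space, because the consumer only ever frees dwords, so skipping the descriptor read on the fast path is safe in this model.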

v2:
 (Michal)
  - Add additional sanity checks for head / tail pointers
  - Use GUC_CTB_HDR_LEN rather than magic 1
v3:
 (Michal / John H)
  - Drop redundant check of head value
v4:
 (John H)
  - Drop redundant checks of tail / head values
v5:
 (Michal)
  - Address more nits
v6:
 (Michal)
  - Add GEM_BUG_ON sanity check on ctb->space

Signed-off-by: John Harrison 
Signed-off-by: Matthew Brost 
Reviewed-by: Michal Wajdeczko 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 93 +++
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  6 ++
 2 files changed, 67 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index db3e85b89573..ad33708c2818 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -130,6 +130,10 @@ static void guc_ct_buffer_desc_init(struct 
guc_ct_buffer_desc *desc)
 static void guc_ct_buffer_reset(struct intel_guc_ct_buffer *ctb)
 {
ctb->broken = false;
+   ctb->tail = 0;
+   ctb->head = 0;
+   ctb->space = CIRC_SPACE(ctb->tail, ctb->head, ctb->size);
+
guc_ct_buffer_desc_init(ctb->desc);
 }
 
@@ -383,10 +387,8 @@ static int ct_write(struct intel_guc_ct *ct,
 {
struct intel_guc_ct_buffer *ctb = >ctbs.send;
struct guc_ct_buffer_desc *desc = ctb->desc;
-   u32 head = desc->head;
-   u32 tail = desc->tail;
+   u32 tail = ctb->tail;
u32 size = ctb->size;
-   u32 used;
u32 header;
u32 hxg;
u32 type;
@@ -396,25 +398,22 @@ static int ct_write(struct intel_guc_ct *ct,
if (unlikely(desc->status))
goto corrupted;
 
-   if (unlikely((tail | head) >= size)) {
-   CT_ERROR(ct, "Invalid offsets head=%u tail=%u (size=%u)\n",
-head, tail, size);
+   GEM_BUG_ON(tail > size);
+
+#ifdef CONFIG_DRM_I915_DEBUG_GUC
+   if (unlikely(tail != READ_ONCE(desc->tail))) {
+   CT_ERROR(ct, "Tail was modified %u != %u\n",
+desc->tail, tail);
+   desc->status |= GUC_CTB_STATUS_MISMATCH;
+   goto corrupted;
+   }
+   if (unlikely(READ_ONCE(desc->head) >= size)) {
+   CT_ERROR(ct, "Invalid head offset %u >= %u)\n",
+desc->head, size);
desc->status |= GUC_CTB_STATUS_OVERFLOW;
goto corrupted;
}
-
-   /*
-* tail == head condition indicates empty. GuC FW does not support
-* using up the entire buffer to get tail == head meaning full.
-*/
-   if (tail < head)
-   used = (size - head) + tail;
-   else
-   used = tail - head;
-
-   /* make sure there is a space including extra dw for the header */
-   if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
-   return -ENOSPC;
+#endif
 
/*
 * dw0: CT header (including fence)
@@ -452,6 +451,11 @@ static int ct_write(struct intel_guc_ct *ct,
 */
write_barrier(ct);
 
+   /* update local copies */
+   ctb->tail = tail;
+   GEM_BUG_ON(ctb->space < len + GUC_CTB_HDR_LEN);
+   ctb->space -= len + GUC_CTB_HDR_LEN;
+
/* now update descriptor */
WRITE_ONCE(desc->tail, tail);
 
@@ -469,7 +473,7 @@ static int ct_write(struct intel_guc_ct *ct,
  * @req:   pointer to pending request
  * @status:placeholder for status
  *
- * For each sent request, Guc shall send bac CT response message.
+ * For each sent request, GuC shall send back CT response message.
  * Our message handler will update status of tracked request once
  * response message with given fence is received. Wait here and
  * check for valid response status value.
@@ -525,24 +529,36 @@ static inline bool ct_deadlocked(struct intel_guc_ct *ct)
return ret;
 }
 
-static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
+static inline bool h2g_has_room(struct intel_guc_ct *ct, u32 len_dw)
 {
+   struct intel_guc_ct_buffer *ctb = >ctbs.send;
struct guc_ct_buffer_desc *desc = ctb->desc;
-   u32 head = READ_ONCE(desc->head);
+   u32 head;
u32 space;
 
-   space = CIRC_SPACE(desc->tail, head, ctb->size);
+   if (ctb->space >= len_dw)
+   return true;
+
+   head = READ_ONCE(desc->head);
+   if (unlikely(head > ctb->size)) {
+   CT_ERROR(ct, "Invalid head offset %u >= %u)\n",
+head, ctb->size);
+   desc->status |= GUC_CTB_STATUS_OVERFLOW;
+   ctb->broken = true;

[Intel-gfx] [PATCH 1/7] drm/i915/guc: Relax CTB response timeout

2021-07-08 Thread Matthew Brost
In an upcoming patch we will allow more CTB requests to be sent in
parallel to the GuC for processing, so we shouldn't assume any more
that GuC will always reply within 10ms.

Use a bigger hardcoded value of 1s instead.
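
The two-phase wait (a short busy-poll followed by a longer sleeping poll) can be sketched in user space as below. This is an assumption-laden illustration, not the i915 code: `wait_two_phase`, `wait_for_response`, `countdown_done`, and the two constants are hypothetical names, and POSIX `clock_gettime()` and `nanosleep()` stand in for the kernel's wait helpers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define RESPONSE_TIMEOUT_SHORT_US	10	/* fast, busy-polled phase */
#define RESPONSE_TIMEOUT_LONG_MS	1000	/* slow, sleeping phase */

static int64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

static int wait_two_phase(bool (*done)(void *), void *arg,
			  int64_t short_us, int64_t long_ms)
{
	int64_t deadline = now_ns() + short_us * 1000;

	/* Phase 1: busy-poll, since fast commands complete quickly. */
	do {
		if (done(arg))
			return 0;
	} while (now_ns() < deadline);

	/* Phase 2: sleep between polls, up to the long timeout. */
	deadline = now_ns() + long_ms * 1000000;
	do {
		struct timespec nap = { .tv_sec = 0, .tv_nsec = 1000000 };

		if (done(arg))
			return 0;
		nanosleep(&nap, NULL);
	} while (now_ns() < deadline);

	return done(arg) ? 0 : -ETIMEDOUT;
}

static int wait_for_response(bool (*done)(void *), void *arg)
{
	return wait_two_phase(done, arg, RESPONSE_TIMEOUT_SHORT_US,
			      RESPONSE_TIMEOUT_LONG_MS);
}

/* Demo predicate: true once it has been polled *remaining times. */
static bool countdown_done(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0)
		(*remaining)--;
	return *remaining == 0;
}
```

A caller would pass its own completion predicate to `wait_for_response()`; a negative counter given to `countdown_done` never completes and so exercises the timeout path.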

v2: Add CONFIG_DRM_I915_GUC_CTB_TIMEOUT config option
v3:
 (Daniel Vetter)
  - Use hardcoded value of 1s rather than config option
v4:
 (Michal)
  - Use defines for timeout values

Signed-off-by: Matthew Brost 
Cc: Michal Wajdeczko 
Reviewed-by: Michal Wajdeczko 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 43409044528e..b86575b99537 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -474,14 +474,18 @@ static int wait_for_ct_request_update(struct ct_request 
*req, u32 *status)
/*
 * Fast commands should complete in less than 10us, so sample quickly
 * up to that length of time, then switch to a slower sleep-wait loop.
-* No GuC command should ever take longer than 10ms.
+* No GuC command should ever take longer than 10ms but many GuC
+* commands can be inflight at time, so use a 1s timeout on the slower
+* sleep-wait loop.
 */
+#define GUC_CTB_RESPONSE_TIMEOUT_SHORT_MS 10
+#define GUC_CTB_RESPONSE_TIMEOUT_LONG_MS 1000
 #define done \
(FIELD_GET(GUC_HXG_MSG_0_ORIGIN, READ_ONCE(req->status)) == \
 GUC_HXG_ORIGIN_GUC)
-   err = wait_for_us(done, 10);
+   err = wait_for_us(done, GUC_CTB_RESPONSE_TIMEOUT_SHORT_MS);
if (err)
-   err = wait_for(done, 10);
+   err = wait_for(done, GUC_CTB_RESPONSE_TIMEOUT_LONG_MS);
 #undef done
 
if (unlikely(err))
-- 
2.28.0



[Intel-gfx] [PATCH 4/7] drm/i915/guc: Add non blocking CTB send function

2021-07-08 Thread Matthew Brost
Add non blocking CTB send function, intel_guc_send_nb. GuC submission
will send CTBs in the critical path and does not need to wait for these
CTBs to complete before moving on, hence the need for this new function.

The non-blocking CTB now must have a flow control mechanism to ensure
the buffer isn't overrun. A lazy spin wait is used as we believe the
flow control condition should be rare with a properly sized buffer.

The function, intel_guc_send_nb, is exported in this patch but unused.
Several patches later in the series make use of this function.
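
As a rough user-space sketch of the flow control described above, a non-blocking send can check for room (payload plus one header dword) and return -EBUSY instead of waiting. The names `struct nb_ring`, `nb_space`, and `send_nb`, the tiny ring size, and the use of the payload length as the header are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define RING_DWORDS	8u	/* deliberately tiny for the demo */
#define HDR_LEN		1u	/* one header dword per message */

struct nb_ring {
	uint32_t cmds[RING_DWORDS];
	uint32_t head;	/* consumer index */
	uint32_t tail;	/* producer index */
};

static uint32_t nb_space(const struct nb_ring *r)
{
	/* One slot stays unused so head == tail means "empty". */
	return (r->head + RING_DWORDS - r->tail - 1) % RING_DWORDS;
}

static int send_nb(struct nb_ring *r, const uint32_t *action, uint32_t len)
{
	uint32_t tail = r->tail;
	uint32_t i;

	/* Flow control: refuse instead of overrunning the buffer. */
	if (nb_space(r) < len + HDR_LEN)
		return -EBUSY;

	r->cmds[tail] = len;	/* stand-in header: payload length */
	tail = (tail + 1) % RING_DWORDS;

	for (i = 0; i < len; i++) {
		r->cmds[tail] = action[i];
		tail = (tail + 1) % RING_DWORDS;
	}

	/* Publish the message only after it is fully written. */
	r->tail = tail;
	return 0;
}
```

The caller decides what to do with -EBUSY, for example retrying later, which keeps the send path itself free of waits.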

v2:
 (Michal)
  - Use define for H2G room calculations
  - Move INTEL_GUC_SEND_NB define
 (Daniel Vetter)
  - Use msleep_interruptible rather than cond_resched
v3:
 (Michal)
  - Move includes to following patch
  - s/INTEL_GUC_SEND_NB/INTEL_GUC_CT_SEND_NB/g
v4:
 (John H)
  - Update comment, add type local variable

Signed-off-by: John Harrison 
Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 .../gt/uc/abi/guc_communication_ctb_abi.h |  3 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.h| 11 ++-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 88 ---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  4 +-
 4 files changed, 91 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h 
b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
index e933ca02d0eb..99e1fad5ca20 100644
--- a/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
+++ b/drivers/gpu/drm/i915/gt/uc/abi/guc_communication_ctb_abi.h
@@ -79,7 +79,8 @@ static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
  *  
+---+---+--+
  */
 
-#define GUC_CTB_MSG_MIN_LEN1u
+#define GUC_CTB_HDR_LEN1u
+#define GUC_CTB_MSG_MIN_LENGUC_CTB_HDR_LEN
 #define GUC_CTB_MSG_MAX_LEN256u
 #define GUC_CTB_MSG_0_FENCE(0x << 16)
 #define GUC_CTB_MSG_0_FORMAT   (0xf << 12)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
index 4abc59f6f3cd..72e4653222e2 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h
@@ -74,7 +74,14 @@ static inline struct intel_guc *log_to_guc(struct 
intel_guc_log *log)
 static
 inline int intel_guc_send(struct intel_guc *guc, const u32 *action, u32 len)
 {
-   return intel_guc_ct_send(>ct, action, len, NULL, 0);
+   return intel_guc_ct_send(>ct, action, len, NULL, 0, 0);
+}
+
+static
+inline int intel_guc_send_nb(struct intel_guc *guc, const u32 *action, u32 len)
+{
+   return intel_guc_ct_send(>ct, action, len, NULL, 0,
+INTEL_GUC_CT_SEND_NB);
 }
 
 static inline int
@@ -82,7 +89,7 @@ intel_guc_send_and_receive(struct intel_guc *guc, const u32 
*action, u32 len,
   u32 *response_buf, u32 response_buf_size)
 {
return intel_guc_ct_send(>ct, action, len,
-response_buf, response_buf_size);
+response_buf, response_buf_size, 0);
 }
 
 static inline void intel_guc_to_host_event_handler(struct intel_guc *guc)
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 43e03aa2dde8..3d6cba8d91ad 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -3,6 +3,8 @@
  * Copyright © 2016-2019 Intel Corporation
  */
 
+#include 
+
 #include "i915_drv.h"
 #include "intel_guc_ct.h"
 #include "gt/intel_gt.h"
@@ -373,7 +375,7 @@ static void write_barrier(struct intel_guc_ct *ct)
 static int ct_write(struct intel_guc_ct *ct,
const u32 *action,
u32 len /* in dwords */,
-   u32 fence)
+   u32 fence, u32 flags)
 {
struct intel_guc_ct_buffer *ctb = >ctbs.send;
struct guc_ct_buffer_desc *desc = ctb->desc;
@@ -383,6 +385,7 @@ static int ct_write(struct intel_guc_ct *ct,
u32 used;
u32 header;
u32 hxg;
+   u32 type;
u32 *cmds = ctb->cmds;
unsigned int i;
 
@@ -408,8 +411,8 @@ static int ct_write(struct intel_guc_ct *ct,
else
used = tail - head;
 
-   /* make sure there is a space including extra dw for the fence */
-   if (unlikely(used + len + 1 >= size))
+   /* make sure there is a space including extra dw for the header */
+   if (unlikely(used + len + GUC_CTB_HDR_LEN >= size))
return -ENOSPC;
 
/*
@@ -421,9 +424,11 @@ static int ct_write(struct intel_guc_ct *ct,
 FIELD_PREP(GUC_CTB_MSG_0_NUM_DWORDS, len) |
 FIELD_PREP(GUC_CTB_MSG_0_FENCE, fence);
 
-   hxg = FIELD_PREP(GUC_HXG_MSG_0_TYPE, GUC_HXG_TYPE_REQUEST) |
- FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION |
- 

[Intel-gfx] [PATCH 5/7] drm/i915/guc: Add stall timer to non blocking CTB send function

2021-07-08 Thread Matthew Brost
Implement a stall timer which fails H2G CTBs once a period of time
with no forward progress is reached to prevent deadlock.
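
The stall timer logic can be sketched in user space as below. Timestamps are passed in explicitly to keep the example deterministic; `struct channel`, `check_room`, and `STALL_NONE` are hypothetical names, and the 1500 ms value is taken from the diff in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define STALL_TIMEOUT_MS	1500
#define STALL_NONE		INT64_MAX	/* "not currently stalled" */

struct channel {
	int64_t stall_start_ms;	/* when the current stall began */
	bool broken;		/* set once a deadlock is declared */
};

static int check_room(struct channel *ch, bool has_room, int64_t now_ms)
{
	if (has_room) {
		/* Forward progress: reset the stall timer. */
		ch->stall_start_ms = STALL_NONE;
		return 0;
	}

	/* First failure in a row: remember when the stall started. */
	if (ch->stall_start_ms == STALL_NONE)
		ch->stall_start_ms = now_ms;

	if (now_ms - ch->stall_start_ms > STALL_TIMEOUT_MS) {
		ch->broken = true;
		return -EPIPE;	/* stalled too long, give up */
	}

	return -EBUSY;		/* still within the grace period */
}
```

Any successful check resets the timer, so only an uninterrupted stall longer than the timeout marks the channel as broken.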

v2:
 (Michal)
  - Improve error message in ct_deadlock()
  - Set broken when ct_deadlock() returns true
  - Return -EPIPE on ct_deadlock()
v3:
 (Michal)
  - Add ms to stall timer comment
 (Matthew)
  - Move broken check to intel_guc_ct_send()

Signed-off-by: John Harrison 
Signed-off-by: Daniele Ceraolo Spurio 
Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 62 ---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h |  4 ++
 2 files changed, 59 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index 3d6cba8d91ad..db3e85b89573 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -4,6 +4,9 @@
  */
 
 #include 
+#include 
+#include 
+#include 
 
 #include "i915_drv.h"
 #include "intel_guc_ct.h"
@@ -316,6 +319,7 @@ int intel_guc_ct_enable(struct intel_guc_ct *ct)
goto err_deregister;
 
ct->enabled = true;
+   ct->stall_time = KTIME_MAX;
 
return 0;
 
@@ -389,9 +393,6 @@ static int ct_write(struct intel_guc_ct *ct,
u32 *cmds = ctb->cmds;
unsigned int i;
 
-   if (unlikely(ctb->broken))
-   return -EPIPE;
-
if (unlikely(desc->status))
goto corrupted;
 
@@ -505,6 +506,25 @@ static int wait_for_ct_request_update(struct ct_request 
*req, u32 *status)
return err;
 }
 
+#define GUC_CTB_TIMEOUT_MS 1500
+static inline bool ct_deadlocked(struct intel_guc_ct *ct)
+{
+   long timeout = GUC_CTB_TIMEOUT_MS;
+   bool ret = ktime_ms_delta(ktime_get(), ct->stall_time) > timeout;
+
+   if (unlikely(ret)) {
+   struct guc_ct_buffer_desc *send = ct->ctbs.send.desc;
+   struct guc_ct_buffer_desc *recv = ct->ctbs.recv.desc;
+
+   CT_ERROR(ct, "Communication stalled for %lld ms, desc 
status=%#x,%#x\n",
+ktime_ms_delta(ktime_get(), ct->stall_time),
+send->status, recv->status);
+   ct->ctbs.send.broken = true;
+   }
+
+   return ret;
+}
+
 static inline bool h2g_has_room(struct intel_guc_ct_buffer *ctb, u32 len_dw)
 {
struct guc_ct_buffer_desc *desc = ctb->desc;
@@ -516,6 +536,26 @@ static inline bool h2g_has_room(struct intel_guc_ct_buffer 
*ctb, u32 len_dw)
return space >= len_dw;
 }
 
+static int has_room_nb(struct intel_guc_ct *ct, u32 len_dw)
+{
+   struct intel_guc_ct_buffer *ctb = >ctbs.send;
+
+   lockdep_assert_held(>ctbs.send.lock);
+
+   if (unlikely(!h2g_has_room(ctb, len_dw))) {
+   if (ct->stall_time == KTIME_MAX)
+   ct->stall_time = ktime_get();
+
+   if (unlikely(ct_deadlocked(ct)))
+   return -EPIPE;
+   else
+   return -EBUSY;
+   }
+
+   ct->stall_time = KTIME_MAX;
+   return 0;
+}
+
 static int ct_send_nb(struct intel_guc_ct *ct,
  const u32 *action,
  u32 len,
@@ -528,11 +568,9 @@ static int ct_send_nb(struct intel_guc_ct *ct,
 
spin_lock_irqsave(>lock, spin_flags);
 
-   ret = h2g_has_room(ctb, len + GUC_CTB_HDR_LEN);
-   if (unlikely(!ret)) {
-   ret = -EBUSY;
+   ret = has_room_nb(ct, len + GUC_CTB_HDR_LEN);
+   if (unlikely(ret))
goto out;
-   }
 
fence = ct_get_next_fence(ct);
ret = ct_write(ct, action, len, fence, flags);
@@ -575,8 +613,13 @@ static int ct_send(struct intel_guc_ct *ct,
 retry:
spin_lock_irqsave(>lock, flags);
if (unlikely(!h2g_has_room(ctb, len + GUC_CTB_HDR_LEN))) {
+   if (ct->stall_time == KTIME_MAX)
+   ct->stall_time = ktime_get();
spin_unlock_irqrestore(>lock, flags);
 
+   if (unlikely(ct_deadlocked(ct)))
+   return -EPIPE;
+
if (msleep_interruptible(sleep_period_ms))
return -EINTR;
sleep_period_ms = sleep_period_ms << 1;
@@ -584,6 +627,8 @@ static int ct_send(struct intel_guc_ct *ct,
goto retry;
}
 
+   ct->stall_time = KTIME_MAX;
+
fence = ct_get_next_fence(ct);
request.fence = fence;
request.status = 0;
@@ -646,6 +691,9 @@ int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 
*action, u32 len,
return -ENODEV;
}
 
+   if (unlikely(ct->ctbs.send.broken))
+   return -EPIPE;
+
if (flags & INTEL_GUC_CT_SEND_NB)
return ct_send_nb(ct, action, len, flags);
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.h
index 5bb8bef024c8..bee03794c1eb 100644
--- 

[Intel-gfx] [PATCH 2/7] drm/i915/guc: Improve error message for unsolicited CT response

2021-07-08 Thread Matthew Brost
Improve the error message when an unsolicited CT response is received by
printing the fence that couldn't be found, the last fence, and all requests
with a response outstanding.
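
A minimal user-space sketch of this diagnostic, assuming the pending requests are kept in a plain array rather than a kernel list: `match_response` is a hypothetical helper, `fprintf(stderr, ...)` stands in for `CT_ERROR()`, and the `ENOKEY` fallback value is only there for platforms whose `<errno.h>` lacks it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#ifndef ENOKEY
#define ENOKEY 126	/* Linux value; fallback for other platforms */
#endif

static int match_response(const uint32_t *pending, size_t n,
			  uint32_t last_fence, uint32_t fence)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (pending[i] == fence)
			return 0;

	/* Not found: report everything needed to debug the mismatch. */
	fprintf(stderr, "Unsolicited response (fence %u)\n", (unsigned)fence);
	fprintf(stderr, "Could not find fence=%u, last_fence=%u\n",
		(unsigned)fence, (unsigned)last_fence);
	for (i = 0; i < n; i++)
		fprintf(stderr, "request %u awaits response\n",
			(unsigned)pending[i]);

	return -ENOKEY;
}
```

Listing the outstanding fences next to the unexpected one makes it easier to tell a stale or duplicated response from a corrupted fence value.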

Signed-off-by: Matthew Brost 
Reviewed-by: Michal Wajdeczko 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index b86575b99537..80db59b45c45 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -732,12 +732,16 @@ static int ct_handle_response(struct intel_guc_ct *ct, 
struct ct_incoming_msg *r
found = true;
break;
}
-   spin_unlock_irqrestore(>requests.lock, flags);
-
if (!found) {
CT_ERROR(ct, "Unsolicited response (fence %u)\n", fence);
-   return -ENOKEY;
+   CT_ERROR(ct, "Could not find fence=%u, last_fence=%u\n", fence,
+ct->requests.last_fence);
+   list_for_each_entry(req, >requests.pending, link)
+   CT_ERROR(ct, "request %u awaits response\n",
+req->fence);
+   err = -ENOKEY;
}
+   spin_unlock_irqrestore(>requests.lock, flags);
 
if (unlikely(err))
return err;
-- 
2.28.0



[Intel-gfx] [PATCH 7/7] drm/i915/guc: Module load failure test for CT buffer creation

2021-07-08 Thread Matthew Brost
From: John Harrison 

Add several module load failure injection points in the CT buffer creation
code path.
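
The general idea of a probe-failure injection point can be sketched in user space as follows. This is loosely modeled on the concept only: `inject_countdown`, `inject_probe_error`, and `fake_ct_init` are hypothetical names, and the countdown mechanism is an assumption made for this example rather than a description of how i915 selects the failing checkpoint.

```c
#include <assert.h>
#include <errno.h>

/* 0 disables injection; N makes the Nth checkpoint reached fail. */
static int inject_countdown;

static int inject_probe_error(int err)
{
	if (inject_countdown == 0)
		return 0;
	if (--inject_countdown > 0)
		return 0;
	return err;	/* this checkpoint was selected to fail */
}

static int fake_ct_init(void)
{
	int err;

	/* Checkpoint 1: before allocating the buffers. */
	err = inject_probe_error(-ENXIO);
	if (err)
		return err;

	/* ... buffer allocation would happen here ... */

	/* Checkpoint 2: before registering the buffers. */
	err = inject_probe_error(-ENXIO);
	if (err)
		return err;

	return 0;
}
```

A test can then step the counter through each checkpoint in turn and verify that every early-exit path unwinds cleanly.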

Signed-off-by: John Harrison 
Signed-off-by: Matthew Brost 
Reviewed-by: Michal Wajdeczko 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
index ad33708c2818..83ec60ea3f89 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
@@ -175,6 +175,10 @@ static int ct_register_buffer(struct intel_guc_ct *ct, u32 
type,
 {
int err;
 
+   err = i915_inject_probe_error(guc_to_gt(ct_to_guc(ct))->i915, -ENXIO);
+   if (unlikely(err))
+   return err;
+
err = guc_action_register_ct_buffer(ct_to_guc(ct), type,
desc_addr, buff_addr, size);
if (unlikely(err))
@@ -226,6 +230,10 @@ int intel_guc_ct_init(struct intel_guc_ct *ct)
u32 *cmds;
int err;
 
+   err = i915_inject_probe_error(guc_to_gt(guc)->i915, -ENXIO);
+   if (err)
+   return err;
+
GEM_BUG_ON(ct->vma);
 
blob_size = 2 * CTB_DESC_SIZE + CTB_H2G_BUFFER_SIZE + 
CTB_G2H_BUFFER_SIZE;
-- 
2.28.0



[Intel-gfx] [PATCH 29/30] drm/i915/gem: Roll all of context creation together

2021-07-08 Thread Jason Ekstrand
Now that we have the whole engine set and VM at context creation time,
we can just assign those fields instead of creating first and handling
the VM and engines later.  This lets us avoid creating useless VMs and
engine sets and lets us get rid of the complex VM setting code.
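
The "assign at creation" approach can be sketched in user space with a simple reference count. All names here (`struct vm`, `struct proto`, `struct context`, `vm_get`, `vm_put`, `context_create`, `context_destroy`) are hypothetical stand-ins, and the sketch ignores locking, RCU, and engine sets.

```c
#include <assert.h>
#include <stdlib.h>

struct vm {
	int refcount;
};

struct proto {			/* parameters gathered before creation */
	struct vm *vm;
	int priority;
};

struct context {
	struct vm *vm;
	int priority;
};

static struct vm *vm_get(struct vm *vm)
{
	vm->refcount++;
	return vm;
}

static void vm_put(struct vm *vm)
{
	vm->refcount--;
}

static struct context *context_create(const struct proto *pc)
{
	struct context *ctx = calloc(1, sizeof(*ctx));

	if (!ctx)
		return NULL;

	/* Everything is known up front, so just assign the fields. */
	ctx->priority = pc->priority;
	if (pc->vm)
		ctx->vm = vm_get(pc->vm);

	return ctx;
}

static void context_destroy(struct context *ctx)
{
	if (ctx->vm)
		vm_put(ctx->vm);
	free(ctx);
}
```

Since the context never holds a temporary VM, there is nothing to swap out or release later in this model.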

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 176 ++
 .../gpu/drm/i915/gem/selftests/mock_context.c |  33 ++--
 2 files changed, 73 insertions(+), 136 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 5f5375b15c530..c67e305f5bc74 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1279,56 +1279,6 @@ static int __context_set_persistence(struct 
i915_gem_context *ctx, bool state)
return 0;
 }
 
-static struct i915_gem_context *
-__create_context(struct drm_i915_private *i915,
-const struct i915_gem_proto_context *pc)
-{
-   struct i915_gem_context *ctx;
-   struct i915_gem_engines *e;
-   int err;
-   int i;
-
-   ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
-   if (!ctx)
-   return ERR_PTR(-ENOMEM);
-
-   kref_init(>ref);
-   ctx->i915 = i915;
-   ctx->sched = pc->sched;
-   mutex_init(>mutex);
-   INIT_LIST_HEAD(>link);
-
-   spin_lock_init(>stale.lock);
-   INIT_LIST_HEAD(>stale.engines);
-
-   mutex_init(>engines_mutex);
-   e = default_engines(ctx, pc->legacy_rcs_sseu);
-   if (IS_ERR(e)) {
-   err = PTR_ERR(e);
-   goto err_free;
-   }
-   RCU_INIT_POINTER(ctx->engines, e);
-
-   INIT_RADIX_TREE(>handles_vma, GFP_KERNEL);
-   mutex_init(>lut_mutex);
-
-   /* NB: Mark all slices as needing a remap so that when the context first
-* loads it will restore whatever remap state already exists. If there
-* is no remap info, it will be a NOP. */
-   ctx->remap_slice = ALL_L3_SLICES(i915);
-
-   ctx->user_flags = pc->user_flags;
-
-   for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++)
-   ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
-
-   return ctx;
-
-err_free:
-   kfree(ctx);
-   return ERR_PTR(err);
-}
-
 static inline struct i915_gem_engines *
 __context_engines_await(const struct i915_gem_context *ctx,
bool *user_engines)
@@ -1372,54 +1322,31 @@ context_apply_all(struct i915_gem_context *ctx,
i915_sw_fence_complete(>fence);
 }
 
-static void __apply_ppgtt(struct intel_context *ce, void *vm)
-{
-   i915_vm_put(ce->vm);
-   ce->vm = i915_vm_get(vm);
-}
-
-static struct i915_address_space *
-__set_ppgtt(struct i915_gem_context *ctx, struct i915_address_space *vm)
-{
-   struct i915_address_space *old;
-
-   old = rcu_replace_pointer(ctx->vm,
- i915_vm_open(vm),
- lockdep_is_held(>mutex));
-   GEM_BUG_ON(old && i915_vm_is_4lvl(vm) != i915_vm_is_4lvl(old));
-
-   context_apply_all(ctx, __apply_ppgtt, vm);
-
-   return old;
-}
-
-static void __assign_ppgtt(struct i915_gem_context *ctx,
-  struct i915_address_space *vm)
-{
-   if (vm == rcu_access_pointer(ctx->vm))
-   return;
-
-   vm = __set_ppgtt(ctx, vm);
-   if (vm)
-   i915_vm_close(vm);
-}
-
 static struct i915_gem_context *
 i915_gem_create_context(struct drm_i915_private *i915,
const struct i915_gem_proto_context *pc)
 {
struct i915_gem_context *ctx;
-   int ret;
+   struct i915_address_space *vm = NULL;
+   struct i915_gem_engines *e;
+   int err;
+   int i;
 
-   ctx = __create_context(i915, pc);
-   if (IS_ERR(ctx))
-   return ctx;
+   ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+   if (!ctx)
+   return ERR_PTR(-ENOMEM);
+
+   kref_init(>ref);
+   ctx->i915 = i915;
+   ctx->sched = pc->sched;
+   mutex_init(>mutex);
+   INIT_LIST_HEAD(>link);
+
+   spin_lock_init(>stale.lock);
+   INIT_LIST_HEAD(>stale.engines);
 
if (pc->vm) {
-   /* __assign_ppgtt() requires this mutex to be held */
-   mutex_lock(>mutex);
-   __assign_ppgtt(ctx, pc->vm);
-   mutex_unlock(>mutex);
+   vm = i915_vm_get(pc->vm);
} else if (HAS_FULL_PPGTT(i915)) {
struct i915_ppgtt *ppgtt;
 
@@ -1427,50 +1354,65 @@ i915_gem_create_context(struct drm_i915_private *i915,
if (IS_ERR(ppgtt)) {
drm_dbg(>drm, "PPGTT setup failed (%ld)\n",
PTR_ERR(ppgtt));
-   context_close(ctx);
-   return ERR_CAST(ppgtt);
+   err = PTR_ERR(ppgtt);
+   goto err_ctx;
   

[Intel-gfx] [PATCH 30/30] drm/i915: Finalize contexts in GEM_CONTEXT_CREATE on version 13+

2021-07-08 Thread Jason Ekstrand
All the proto-context stuff for context creation exists to allow older
userspace drivers to set VMs and engine sets via SET_CONTEXT_PARAM.
Drivers need to update to use CONTEXT_CREATE_EXT_* for this going
forward.  Force the issue by blocking the old mechanism on any future
hardware generations.

Signed-off-by: Jason Ekstrand 
Cc: Jon Bloomfield 
Cc: Carl Zhang 
Cc: Michal Mrozek 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 39 -
 1 file changed, 30 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index c67e305f5bc74..7d6f52d8a8012 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1996,9 +1996,28 @@ int i915_gem_context_create_ioctl(struct drm_device 
*dev, void *data,
goto err_pc;
}
 
-   ret = proto_context_register(ext_data.fpriv, ext_data.pc, );
-   if (ret < 0)
-   goto err_pc;
+   if (GRAPHICS_VER(i915) > 12) {
+   struct i915_gem_context *ctx;
+
+   /* Get ourselves a context ID */
+   ret = xa_alloc(_data.fpriv->context_xa, , NULL,
+  xa_limit_32b, GFP_KERNEL);
+   if (ret)
+   goto err_pc;
+
+   ctx = i915_gem_create_context(i915, ext_data.pc);
+   if (IS_ERR(ctx)) {
+   ret = PTR_ERR(ctx);
+   goto err_pc;
+   }
+
+   proto_context_close(ext_data.pc);
+   gem_context_register(ctx, ext_data.fpriv, id);
+   } else {
+   ret = proto_context_register(ext_data.fpriv, ext_data.pc, );
+   if (ret < 0)
+   goto err_pc;
+   }
 
args->ctx_id = id;
drm_dbg(>drm, "HW context %d created\n", args->ctx_id);
@@ -2181,15 +2200,17 @@ int i915_gem_context_setparam_ioctl(struct drm_device 
*dev, void *data,
mutex_lock(_priv->proto_context_lock);
ctx = __context_lookup(file_priv, args->ctx_id);
if (!ctx) {
-   /* FIXME: We should consider disallowing SET_CONTEXT_PARAM
-* for most things on future platforms.  Clients should be
-* using CONTEXT_CREATE_EXT_PARAM instead.
-*/
pc = xa_load(_priv->proto_context_xa, args->ctx_id);
-   if (pc)
+   if (pc) {
+   /* Contexts should be finalized inside
+* GEM_CONTEXT_CREATE starting with graphics
+* version 13.
+*/
+   WARN_ON(GRAPHICS_VER(file_priv->dev_priv) > 12);
ret = set_proto_ctx_param(file_priv, pc, args);
-   else
+   } else {
ret = -ENOENT;
+   }
}
mutex_unlock(_priv->proto_context_lock);
 
-- 
2.31.1



[Intel-gfx] [PATCH 28/30] i915/gem/selftests: Assign the VM at context creation in igt_shared_ctx_exec

2021-07-08 Thread Jason Ekstrand
We want to delete __assign_ppgtt and, generally, stop setting the VM
after context creation.  This is the one place I could find in the
selftests where we set a VM after the fact.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 3e59746afdc82..8eb5050f8cb3e 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -813,16 +813,12 @@ static int igt_shared_ctx_exec(void *arg)
struct i915_gem_context *ctx;
struct intel_context *ce;
 
-   ctx = kernel_context(i915, NULL);
+   ctx = kernel_context(i915, ctx_vm(parent));
if (IS_ERR(ctx)) {
err = PTR_ERR(ctx);
goto out_test;
}
 
-   mutex_lock(>mutex);
-   __assign_ppgtt(ctx, ctx_vm(parent));
-   mutex_unlock(>mutex);
-
ce = i915_gem_context_get_engine(ctx, 
engine->legacy_idx);
GEM_BUG_ON(IS_ERR(ce));
 
-- 
2.31.1



[Intel-gfx] [PATCH 27/30] drm/i915/selftests: Take a VM in kernel_context()

2021-07-08 Thread Jason Ekstrand
This better models where we want to go with contexts in general where
things like the VM and engine set are create parameters instead of being
set after the fact.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 .../drm/i915/gem/selftests/i915_gem_context.c |  4 ++--
 .../gpu/drm/i915/gem/selftests/mock_context.c |  9 -
 .../gpu/drm/i915/gem/selftests/mock_context.h |  4 +++-
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 20 +--
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |  2 +-
 5 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
index 92544a174cc9a..3e59746afdc82 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_context.c
@@ -680,7 +680,7 @@ static int igt_ctx_exec(void *arg)
struct i915_gem_context *ctx;
struct intel_context *ce;
 
-   ctx = kernel_context(i915);
+   ctx = kernel_context(i915, NULL);
if (IS_ERR(ctx)) {
err = PTR_ERR(ctx);
goto out_file;
@@ -813,7 +813,7 @@ static int igt_shared_ctx_exec(void *arg)
struct i915_gem_context *ctx;
struct intel_context *ce;
 
-   ctx = kernel_context(i915);
+   ctx = kernel_context(i915, NULL);
if (IS_ERR(ctx)) {
err = PTR_ERR(ctx);
goto out_test;
diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.c 
b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
index 61aaac4a334cf..500ef27ba4771 100644
--- a/drivers/gpu/drm/i915/gem/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.c
@@ -150,7 +150,8 @@ live_context_for_engine(struct intel_engine_cs *engine, 
struct file *file)
 }
 
 struct i915_gem_context *
-kernel_context(struct drm_i915_private *i915)
+kernel_context(struct drm_i915_private *i915,
+  struct i915_address_space *vm)
 {
struct i915_gem_context *ctx;
struct i915_gem_proto_context *pc;
@@ -159,6 +160,12 @@ kernel_context(struct drm_i915_private *i915)
if (IS_ERR(pc))
return ERR_CAST(pc);
 
+   if (vm) {
+   if (pc->vm)
+   i915_vm_put(pc->vm);
+   pc->vm = i915_vm_get(vm);
+   }
+
ctx = i915_gem_create_context(i915, pc);
proto_context_close(pc);
if (IS_ERR(ctx))
diff --git a/drivers/gpu/drm/i915/gem/selftests/mock_context.h 
b/drivers/gpu/drm/i915/gem/selftests/mock_context.h
index 2a6121d33352d..7a02fd9b5866a 100644
--- a/drivers/gpu/drm/i915/gem/selftests/mock_context.h
+++ b/drivers/gpu/drm/i915/gem/selftests/mock_context.h
@@ -10,6 +10,7 @@
 struct file;
 struct drm_i915_private;
 struct intel_engine_cs;
+struct i915_address_space;
 
 void mock_init_contexts(struct drm_i915_private *i915);
 
@@ -25,7 +26,8 @@ live_context(struct drm_i915_private *i915, struct file 
*file);
 struct i915_gem_context *
 live_context_for_engine(struct intel_engine_cs *engine, struct file *file);
 
-struct i915_gem_context *kernel_context(struct drm_i915_private *i915);
+struct i915_gem_context *kernel_context(struct drm_i915_private *i915,
+   struct i915_address_space *vm);
 void kernel_context_close(struct i915_gem_context *ctx);
 
 #endif /* !__MOCK_CONTEXT_H */
diff --git a/drivers/gpu/drm/i915/gt/selftest_execlists.c 
b/drivers/gpu/drm/i915/gt/selftest_execlists.c
index 6abce18d0ef32..73ddc6e147305 100644
--- a/drivers/gpu/drm/i915/gt/selftest_execlists.c
+++ b/drivers/gpu/drm/i915/gt/selftest_execlists.c
@@ -1539,12 +1539,12 @@ static int live_busywait_preempt(void *arg)
 * preempt the busywaits used to synchronise between rings.
 */
 
-   ctx_hi = kernel_context(gt->i915);
+   ctx_hi = kernel_context(gt->i915, NULL);
if (!ctx_hi)
return -ENOMEM;
ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
 
-   ctx_lo = kernel_context(gt->i915);
+   ctx_lo = kernel_context(gt->i915, NULL);
if (!ctx_lo)
goto err_ctx_hi;
ctx_lo->sched.priority = I915_CONTEXT_MIN_USER_PRIORITY;
@@ -1741,12 +1741,12 @@ static int live_preempt(void *arg)
if (igt_spinner_init(&spin_lo, gt))
goto err_spin_hi;
 
-   ctx_hi = kernel_context(gt->i915);
+   ctx_hi = kernel_context(gt->i915, NULL);
if (!ctx_hi)
goto err_spin_lo;
ctx_hi->sched.priority = I915_CONTEXT_MAX_USER_PRIORITY;
 
-   ctx_lo = kernel_context(gt->i915);
+   ctx_lo = kernel_context(gt->i915, NULL);
if (!ctx_lo)
goto err_ctx_hi;
ctx_lo->sched.priority = 

[Intel-gfx] [PATCH 24/30] drm/i915/gem: Delay context creation (v3)

2021-07-08 Thread Jason Ekstrand
The current context uAPI allows for two methods of setting context
parameters: SET_CONTEXT_PARAM and CONTEXT_CREATE_EXT_SETPARAM.  The
former is allowed to be called at any time while the latter happens as
part of GEM_CONTEXT_CREATE.  Currently, everything settable via one is
settable via the other.  While some params are fairly simple and setting
them on a live context is harmless such as the context priority, others
are far trickier such as the VM or the set of engines.  In order to swap
out the VM, for instance, we have to delay until all current in-flight
work is complete, swap in the new VM, and then continue.  This leads to
a plethora of potential race conditions we'd really rather avoid.

In previous patches, we added an i915_gem_proto_context struct which is
capable of storing and tracking all such create parameters.  This commit
delays the creation of the actual context until after the client is done
configuring it with SET_CONTEXT_PARAM.  From the perspective of the
client, it has the same u32 context ID the whole time.  From the
perspective of i915, however, it's an i915_gem_proto_context right up
until the point where we attempt to do something which the proto-context
can't handle.  Then the real context gets created.

This is accomplished via a little xarray dance.  When GEM_CONTEXT_CREATE
is called, we create a proto-context, reserve a slot in context_xa but
leave it NULL, and store the proto-context in the corresponding slot in
proto_context_xa.  Then, whenever we go to look up a context, we first
check context_xa.  If it's there, we return the i915_gem_context and
we're done.  If it's not, we look in proto_context_xa and, if we find it
there, we create the actual context and kill the proto-context.
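
As a rough userspace illustration of the two-table lookup described above
(not the driver code): the arrays stand in for context_xa and
proto_context_xa, and every name here (file_state, create_context_id,
lookup_context) is hypothetical.  Locking is left out entirely; in the
patch, everything touching a proto-context is guarded by proto_context_lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_IDS 8

struct proto_ctx { int priority; };	/* parameters recorded before creation */
struct real_ctx { int priority; };	/* the finalized context */

/* Hypothetical per-file state; the arrays model the two xarrays. */
struct file_state {
	struct real_ctx *context_slots[MAX_IDS];	/* models context_xa */
	struct proto_ctx *proto_slots[MAX_IDS];		/* models proto_context_xa */
	int reserved[MAX_IDS];
};

/* Models GEM_CONTEXT_CREATE: reserve an ID, store only a proto-context. */
static int create_context_id(struct file_state *f, int priority)
{
	for (int id = 0; id < MAX_IDS; id++) {
		struct proto_ctx *pc;

		if (f->reserved[id])
			continue;

		pc = malloc(sizeof(*pc));
		if (!pc)
			return -1;
		pc->priority = priority;

		f->reserved[id] = 1;
		f->context_slots[id] = NULL;	/* slot reserved but left empty */
		f->proto_slots[id] = pc;
		return id;
	}
	return -1;
}

/* Models lookup: return the real context, finalizing the proto if needed. */
static struct real_ctx *lookup_context(struct file_state *f, int id)
{
	struct proto_ctx *pc;
	struct real_ctx *ctx;

	if (id < 0 || id >= MAX_IDS || !f->reserved[id])
		return NULL;
	if (f->context_slots[id])
		return f->context_slots[id];

	pc = f->proto_slots[id];
	if (!pc)
		return NULL;

	ctx = malloc(sizeof(*ctx));
	if (!ctx)
		return NULL;
	ctx->priority = pc->priority;

	f->context_slots[id] = ctx;	/* the real context takes over the ID */
	f->proto_slots[id] = NULL;	/* and the proto-context is killed */
	free(pc);
	return ctx;
}
```

The ID handed to the client never changes; only the table holding its
object does.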

In order for this dance to work properly, everything which ever touches
a proto-context is guarded by drm_i915_file_private::proto_context_lock,
including context creation.  Yes, this means context creation now takes
a giant global lock but it can't really be helped and that should never
be on any driver's fast-path anyway.

v2 (Daniel Vetter):
 - Commit message grammatical fixes.
 - Use WARN_ON instead of GEM_BUG_ON
 - Rename lazy_create_context_locked to finalize_create_context_locked
 - Rework the control-flow logic in the setparam ioctl
 - Better documentation all around

v3 (kernel test robot):
 - Make finalize_create_context_locked static

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 203 ++
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |   3 +
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  54 +
 .../gpu/drm/i915/gem/selftests/mock_context.c |   5 +-
 drivers/gpu/drm/i915/i915_drv.h   |  76 +--
 5 files changed, 283 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 5a1402544d48d..c4f89e4b1665f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -278,6 +278,42 @@ proto_context_create(struct drm_i915_private *i915, 
unsigned int flags)
return err;
 }
 
+static int proto_context_register_locked(struct drm_i915_file_private *fpriv,
+struct i915_gem_proto_context *pc,
+u32 *id)
+{
+   int ret;
+   void *old;
+
+   lockdep_assert_held(&fpriv->proto_context_lock);
+
+   ret = xa_alloc(&fpriv->context_xa, id, NULL, xa_limit_32b, GFP_KERNEL);
+   if (ret)
+   return ret;
+
+   old = xa_store(&fpriv->proto_context_xa, *id, pc, GFP_KERNEL);
+   if (xa_is_err(old)) {
+   xa_erase(&fpriv->context_xa, *id);
+   return xa_err(old);
+   }
+   WARN_ON(old);
+
+   return 0;
+}
+
+static int proto_context_register(struct drm_i915_file_private *fpriv,
+ struct i915_gem_proto_context *pc,
+ u32 *id)
+{
+   int ret;
+
+   mutex_lock(&fpriv->proto_context_lock);
+   ret = proto_context_register_locked(fpriv, pc, id);
+   mutex_unlock(&fpriv->proto_context_lock);
+
+   return ret;
+}
+
 static int set_proto_ctx_vm(struct drm_i915_file_private *fpriv,
struct i915_gem_proto_context *pc,
const struct drm_i915_gem_context_param *args)
@@ -1448,12 +1484,12 @@ void i915_gem_init__contexts(struct drm_i915_private 
*i915)
init_contexts(&i915->gem.contexts);
 }
 
-static int gem_context_register(struct i915_gem_context *ctx,
-   struct drm_i915_file_private *fpriv,
-   u32 *id)
+static void gem_context_register(struct i915_gem_context *ctx,
+struct drm_i915_file_private *fpriv,
+u32 id)
 {
struct drm_i915_private *i915 = ctx->i915;
-   int ret;
+   void *old;
 
ctx->file_priv = 

[Intel-gfx] [PATCH 26/30] drm/i915/gem: Don't allow changing the engine set on running contexts (v3)

2021-07-08 Thread Jason Ekstrand
When the APIs were added to manage the engine set on a GEM context
directly from userspace, the questionable choice was made to allow
changing the engine set on a context at any time.  This is horribly racy
and there's absolutely no reason why any userspace would want to do this
outside of trying to exercise interesting race conditions.  By removing
support for CONTEXT_PARAM_ENGINES from ctx_setparam, we make it
impossible to change the engine set after the context has been fully
created.
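
A minimal sketch of the resulting setparam behaviour, assuming a
simplified dispatcher: the enum values and ctx_setparam() signature here
are hypothetical stand-ins, not the i915 definitions.  It also assumes the
VM parameter is already rejected by the time this patch applies, since the
VM patch precedes it in the series.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical parameter IDs standing in for I915_CONTEXT_PARAM_*. */
enum ctx_param { PARAM_PRIORITY, PARAM_VM, PARAM_ENGINES };

struct ctx { int priority; };

/* Simple params stay settable on a live context; VM and ENGINES do not. */
static int ctx_setparam(struct ctx *ctx, enum ctx_param param, int value)
{
	switch (param) {
	case PARAM_PRIORITY:
		ctx->priority = value;
		return 0;
	case PARAM_VM:
	case PARAM_ENGINES:
	default:
		/* Only configurable before the context is fully created. */
		return -EINVAL;
	}
}
```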

This doesn't yet let us delete all the deferred engine clean-up code as
that's still used for handling the case where the client dies or calls
GEM_CONTEXT_DESTROY while work is in flight.  However, moving to an API
where the engine set is effectively immutable gives us more options to
potentially clean that code up a bit going forward.  It also removes a
whole class of ways in which a client can hurt itself or try to get
around kernel context banning.

v2 (Jason Ekstrand):
 - Expand the commit message

v3 (Jason Ekstrand):
 - Make it more obvious that I915_CONTEXT_PARAM_ENGINES returns -EINVAL

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 304 +---
 1 file changed, 1 insertion(+), 303 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 40acecfbbe5b5..5f5375b15c530 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1819,305 +1819,6 @@ static int set_sseu(struct i915_gem_context *ctx,
return ret;
 }
 
-struct set_engines {
-   struct i915_gem_context *ctx;
-   struct i915_gem_engines *engines;
-};
-
-static int
-set_engines__load_balance(struct i915_user_extension __user *base, void *data)
-{
-   struct i915_context_engines_load_balance __user *ext =
-   container_of_user(base, typeof(*ext), base);
-   const struct set_engines *set = data;
-   struct drm_i915_private *i915 = set->ctx->i915;
-   struct intel_engine_cs *stack[16];
-   struct intel_engine_cs **siblings;
-   struct intel_context *ce;
-   struct intel_sseu null_sseu = {};
-   u16 num_siblings, idx;
-   unsigned int n;
-   int err;
-
-   if (!HAS_EXECLISTS(i915))
-   return -ENODEV;
-
-   if (intel_uc_uses_guc_submission(&i915->gt.uc))
-   return -ENODEV; /* not implement yet */
-
-   if (get_user(idx, &ext->engine_index))
-   return -EFAULT;
-
-   if (idx >= set->engines->num_engines) {
-   drm_dbg(&i915->drm, "Invalid placement value, %d >= %d\n",
-   idx, set->engines->num_engines);
-   return -EINVAL;
-   }
-
-   idx = array_index_nospec(idx, set->engines->num_engines);
-   if (set->engines->engines[idx]) {
-   drm_dbg(&i915->drm,
-   "Invalid placement[%d], already occupied\n", idx);
-   return -EEXIST;
-   }
-
-   if (get_user(num_siblings, &ext->num_siblings))
-   return -EFAULT;
-
-   err = check_user_mbz(&ext->flags);
-   if (err)
-   return err;
-
-   err = check_user_mbz(&ext->mbz64);
-   if (err)
-   return err;
-
-   siblings = stack;
-   if (num_siblings > ARRAY_SIZE(stack)) {
-   siblings = kmalloc_array(num_siblings,
-sizeof(*siblings),
-GFP_KERNEL);
-   if (!siblings)
-   return -ENOMEM;
-   }
-
-   for (n = 0; n < num_siblings; n++) {
-   struct i915_engine_class_instance ci;
-
-   if (copy_from_user(&ci, &ext->engines[n], sizeof(ci))) {
-   err = -EFAULT;
-   goto out_siblings;
-   }
-
-   siblings[n] = intel_engine_lookup_user(i915,
-  ci.engine_class,
-  ci.engine_instance);
-   if (!siblings[n]) {
-   drm_dbg(&i915->drm,
-   "Invalid sibling[%d]: { class:%d, inst:%d }\n",
-   n, ci.engine_class, ci.engine_instance);
-   err = -EINVAL;
-   goto out_siblings;
-   }
-   }
-
-   ce = intel_execlists_create_virtual(siblings, n);
-   if (IS_ERR(ce)) {
-   err = PTR_ERR(ce);
-   goto out_siblings;
-   }
-
-   intel_context_set_gem(ce, set->ctx, null_sseu);
-
-   if (cmpxchg(&set->engines->engines[idx], NULL, ce)) {
-   intel_context_put(ce);
-   err = -EEXIST;
-   goto out_siblings;
-   }
-
-out_siblings:
-   if (siblings != stack)
-   kfree(siblings);
-
-   return err;
-}
-
-static int
-set_engines__bond(struct i915_user_extension __user *base, void 

[Intel-gfx] [PATCH 25/30] drm/i915/gem: Don't allow changing the VM on running contexts (v4)

2021-07-08 Thread Jason Ekstrand
When the APIs were added to manage VMs more directly from userspace, the
questionable choice was made to allow changing out the VM on a context
at any time.  This is horribly racy and there's absolutely no reason why
any userspace would want to do this outside of testing that exact race.
By removing support for CONTEXT_PARAM_VM from ctx_setparam, we make it
impossible to change out the VM after the context has been fully
created.  This lets us delete a bunch of deferred task code as well as a
duplicated (and slightly different) copy of the code which programs the
PPGTT registers.

v2 (Jason Ekstrand):
 - Expand the commit message

v3 (Daniel Vetter):
 - Don't drop the __rcu on the vm pointer

v4 (Jason Ekstrand):
 - Make it more obvious that I915_CONTEXT_PARAM_VM returns -EINVAL

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 263 +-
 .../drm/i915/gem/selftests/i915_gem_context.c | 119 
 .../drm/i915/selftests/i915_mock_selftests.h  |   1 -
 3 files changed, 1 insertion(+), 382 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index c4f89e4b1665f..40acecfbbe5b5 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1633,120 +1633,6 @@ int i915_gem_vm_destroy_ioctl(struct drm_device *dev, 
void *data,
return 0;
 }
 
-struct context_barrier_task {
-   struct i915_active base;
-   void (*task)(void *data);
-   void *data;
-};
-
-static void cb_retire(struct i915_active *base)
-{
-   struct context_barrier_task *cb = container_of(base, typeof(*cb), base);
-
-   if (cb->task)
-   cb->task(cb->data);
-
-   i915_active_fini(&cb->base);
-   kfree(cb);
-}
-
-I915_SELFTEST_DECLARE(static intel_engine_mask_t context_barrier_inject_fault);
-static int context_barrier_task(struct i915_gem_context *ctx,
-   intel_engine_mask_t engines,
-   bool (*skip)(struct intel_context *ce, void 
*data),
-   int (*pin)(struct intel_context *ce, struct 
i915_gem_ww_ctx *ww, void *data),
-   int (*emit)(struct i915_request *rq, void 
*data),
-   void (*task)(void *data),
-   void *data)
-{
-   struct context_barrier_task *cb;
-   struct i915_gem_engines_iter it;
-   struct i915_gem_engines *e;
-   struct i915_gem_ww_ctx ww;
-   struct intel_context *ce;
-   int err = 0;
-
-   GEM_BUG_ON(!task);
-
-   cb = kmalloc(sizeof(*cb), GFP_KERNEL);
-   if (!cb)
-   return -ENOMEM;
-
-   i915_active_init(&cb->base, NULL, cb_retire, 0);
-   err = i915_active_acquire(&cb->base);
-   if (err) {
-   kfree(cb);
-   return err;
-   }
-
-   e = __context_engines_await(ctx, NULL);
-   if (!e) {
-   i915_active_release(&cb->base);
-   return -ENOENT;
-   }
-
-   for_each_gem_engine(ce, e, it) {
-   struct i915_request *rq;
-
-   if (I915_SELFTEST_ONLY(context_barrier_inject_fault &
-  ce->engine->mask)) {
-   err = -ENXIO;
-   break;
-   }
-
-   if (!(ce->engine->mask & engines))
-   continue;
-
-   if (skip && skip(ce, data))
-   continue;
-
-   i915_gem_ww_ctx_init(&ww, true);
-retry:
-   err = intel_context_pin_ww(ce, &ww);
-   if (err)
-   goto err;
-
-   if (pin)
-   err = pin(ce, &ww, data);
-   if (err)
-   goto err_unpin;
-
-   rq = i915_request_create(ce);
-   if (IS_ERR(rq)) {
-   err = PTR_ERR(rq);
-   goto err_unpin;
-   }
-
-   err = 0;
-   if (emit)
-   err = emit(rq, data);
-   if (err == 0)
-   err = i915_active_add_request(&cb->base, rq);
-
-   i915_request_add(rq);
-err_unpin:
-   intel_context_unpin(ce);
-err:
-   if (err == -EDEADLK) {
-   err = i915_gem_ww_ctx_backoff(&ww);
-   if (!err)
-   goto retry;
-   }
-   i915_gem_ww_ctx_fini(&ww);
-
-   if (err)
-   break;
-   }
-   i915_sw_fence_complete(&e->fence);
-
-   cb->task = err ? NULL : task; /* caller needs to unwind instead */
-   cb->data = data;
-
-   i915_active_release(&cb->base);
-
-   return err;
-}
-
 static int get_ppgtt(struct drm_i915_file_private *file_priv,
 struct i915_gem_context *ctx,
 struct 

[Intel-gfx] [PATCH 23/30] drm/i915/gt: Drop i915_address_space::file (v2)

2021-07-08 Thread Jason Ekstrand
There's a big comment saying how useful it is but no one is using this
for anything anymore.

It was added in 2bfa996e031b ("drm/i915: Store owning file on the
i915_address_space") and used for debugfs at the time as well as telling
the difference between the global GTT and a PPGTT.  In f6e8aa387171
("drm/i915: Report the number of closed vma held by each context in
debugfs") we removed one use of it by switching to a context walk and
comparing with the VM in the context.  Finally, VM stats for debugfs
were entirely nuked in db80a1294c23 ("drm/i915/gem: Remove per-client
stats from debugfs/i915_gem_objects")

v2 (Daniel Vetter):
 - Delete a struct drm_i915_file_private pre-declaration
 - Add a comment to the commit message about history

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c |  9 -
 drivers/gpu/drm/i915/gt/intel_gtt.h | 11 ---
 drivers/gpu/drm/i915/selftests/mock_gtt.c   |  1 -
 3 files changed, 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 7045e3afa7113..5a1402544d48d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1453,17 +1453,10 @@ static int gem_context_register(struct i915_gem_context 
*ctx,
u32 *id)
 {
struct drm_i915_private *i915 = ctx->i915;
-   struct i915_address_space *vm;
int ret;
 
ctx->file_priv = fpriv;
 
-   mutex_lock(&ctx->mutex);
-   vm = i915_gem_context_vm(ctx);
-   if (vm)
-   WRITE_ONCE(vm->file, fpriv); /* XXX */
-   mutex_unlock(&ctx->mutex);
-
ctx->pid = get_task_pid(current, PIDTYPE_PID);
snprintf(ctx->name, sizeof(ctx->name), "%s[%d]",
 current->comm, pid_nr(ctx->pid));
@@ -1562,8 +1555,6 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void 
*data,
if (IS_ERR(ppgtt))
return PTR_ERR(ppgtt);
 
-   ppgtt->vm.file = file_priv;
-
if (args->extensions) {
err = i915_user_extensions(u64_to_user_ptr(args->extensions),
   NULL, 0,
diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h 
b/drivers/gpu/drm/i915/gt/intel_gtt.h
index 9bd89f2a01ff1..bc7153018ebd5 100644
--- a/drivers/gpu/drm/i915/gt/intel_gtt.h
+++ b/drivers/gpu/drm/i915/gt/intel_gtt.h
@@ -140,7 +140,6 @@ typedef u64 gen8_pte_t;
 
 enum i915_cache_level;
 
-struct drm_i915_file_private;
 struct drm_i915_gem_object;
 struct i915_fence_reg;
 struct i915_vma;
@@ -220,16 +219,6 @@ struct i915_address_space {
struct intel_gt *gt;
struct drm_i915_private *i915;
struct device *dma;
-   /*
-* Every address space belongs to a struct file - except for the global
-* GTT that is owned by the driver (and so @file is set to NULL). In
-* principle, no information should leak from one context to another
-* (or between files/processes etc) unless explicitly shared by the
-* owner. Tracking the owner is important in order to free up per-file
-* objects along with the file, to aide resource tracking, and to
-* assign blame.
-*/
-   struct drm_i915_file_private *file;
u64 total;  /* size addr space maps (ex. 2GB for ggtt) */
u64 reserved;   /* size addr space reserved */
 
diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c 
b/drivers/gpu/drm/i915/selftests/mock_gtt.c
index 5c7ae40bba634..cc047ec594f93 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
@@ -73,7 +73,6 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, 
const char *name)
ppgtt->vm.gt = &i915->gt;
ppgtt->vm.i915 = i915;
ppgtt->vm.total = round_down(U64_MAX, PAGE_SIZE);
-   ppgtt->vm.file = ERR_PTR(-ENODEV);
ppgtt->vm.dma = i915->drm.dev;
 
i915_address_space_init(&ppgtt->vm, VM_CLASS_PPGTT);
-- 
2.31.1



[Intel-gfx] [PATCH 21/30] drm/i915/gem: Use the proto-context to handle create parameters (v5)

2021-07-08 Thread Jason Ekstrand
This means that the proto-context needs to grow support for engine
configuration information as well as setparam logic.  Fortunately, we'll
be deleting a lot of setparam logic on the primary context shortly so it
will hopefully balance out.

There's an extra bit of fun here when it comes to setting SSEU and the
way it interacts with PARAM_ENGINES.  Unfortunately, thanks to
SET_CONTEXT_PARAM and not being allowed to pick the order in which we
handle certain parameters, we have to think about those interactions.

v2 (Daniel Vetter):
 - Add a proto_context_free_user_engines helper
 - Comment on SSEU in the commit message
 - Use proto_context_set_persistence in set_proto_ctx_param

v3 (Daniel Vetter):
 - Fix a doc comment
 - Do an explicit HAS_FULL_PPGTT check in set_proto_ctx_vm instead of
   relying on pc->vm != NULL.
 - Handle errors for CONTEXT_PARAM_PERSISTENCE
 - Don't allow more resetting user engines
 - Rework initialization of UCONTEXT_PERSISTENCE

v4 (Jason Ekstrand):
 - Move hand-rolled initialization of UCONTEXT_PERSISTENCE to an
   earlier patch

v5 (Jason Ekstrand):
 - Move proto_context_set_persistence to this patch

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 577 +-
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  58 ++
 2 files changed, 618 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index f135fbc97c5a7..4972b8c91d942 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -193,11 +193,59 @@ static int validate_priority(struct drm_i915_private 
*i915,
 
 static void proto_context_close(struct i915_gem_proto_context *pc)
 {
+   int i;
+
if (pc->vm)
i915_vm_put(pc->vm);
+   if (pc->user_engines) {
+   for (i = 0; i < pc->num_user_engines; i++)
+   kfree(pc->user_engines[i].siblings);
+   kfree(pc->user_engines);
+   }
kfree(pc);
 }
 
+static int proto_context_set_persistence(struct drm_i915_private *i915,
+struct i915_gem_proto_context *pc,
+bool persist)
+{
+   if (persist) {
+   /*
+* Only contexts that are short-lived [that will expire or be
+* reset] are allowed to survive past termination. We require
+* hangcheck to ensure that the persistent requests are healthy.
+*/
+   if (!i915->params.enable_hangcheck)
+   return -EINVAL;
+
+   pc->user_flags |= BIT(UCONTEXT_PERSISTENCE);
+   } else {
+   /* To cancel a context we use "preempt-to-idle" */
+   if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
+   return -ENODEV;
+
+   /*
+* If the cancel fails, we then need to reset, cleanly!
+*
+* If the per-engine reset fails, all hope is lost! We resort
+* to a full GPU reset in that unlikely case, but realistically
+* if the engine could not reset, the full reset does not fare
+* much better. The damage has been done.
+*
+* However, if we cannot reset an engine by itself, we cannot
+* cleanup a hanging persistent context without causing
+* collateral damage, and we should not pretend we can by
+* exposing the interface.
+*/
+   if (!intel_has_reset_engine(&i915->gt))
+   return -ENODEV;
+
+   pc->user_flags &= ~BIT(UCONTEXT_PERSISTENCE);
+   }
+
+   return 0;
+}
+
 static struct i915_gem_proto_context *
 proto_context_create(struct drm_i915_private *i915, unsigned int flags)
 {
@@ -207,6 +255,8 @@ proto_context_create(struct drm_i915_private *i915, 
unsigned int flags)
if (!pc)
return ERR_PTR(-ENOMEM);
 
+   pc->num_user_engines = -1;
+   pc->user_engines = NULL;
pc->user_flags = BIT(UCONTEXT_BANNABLE) |
 BIT(UCONTEXT_RECOVERABLE);
if (i915->params.enable_hangcheck)
@@ -228,6 +278,430 @@ proto_context_create(struct drm_i915_private *i915, 
unsigned int flags)
return err;
 }
 
+static int set_proto_ctx_vm(struct drm_i915_file_private *fpriv,
+   struct i915_gem_proto_context *pc,
+   const struct drm_i915_gem_context_param *args)
+{
+   struct drm_i915_private *i915 = fpriv->dev_priv;
+   struct i915_address_space *vm;
+
+   if (args->size)
+   return -EINVAL;
+
+   if (!HAS_FULL_PPGTT(i915))
+   return -ENODEV;
+
+   if (upper_32_bits(args->value))
+   return -ENOENT;
+
+   vm = 

[Intel-gfx] [PATCH 22/30] drm/i915/gem: Return an error ptr from context_lookup

2021-07-08 Thread Jason Ekstrand
We're about to start doing lazy context creation which means contexts
get created in i915_gem_context_lookup and we may start having more
errors than -ENOENT.
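
For readers unfamiliar with the error-pointer convention this moves to,
here is a small userspace sketch.  The lowercase err_ptr/is_err/ptr_err
helpers are hypothetical re-implementations modelled on the kernel's
ERR_PTR/IS_ERR/PTR_ERR, and ctx_lookup() is an invented stand-in for
i915_gem_context_lookup.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in the pointer itself. */
static inline void *err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct ctx { int id; };

static struct ctx table[2] = { { 0 }, { 1 } };

/* Never returns NULL: failure is reported as an encoded errno. */
static struct ctx *ctx_lookup(unsigned int id)
{
	if (id >= 2)
		return err_ptr(-ENOENT);
	return &table[id];
}

/* Callers propagate whatever error the lookup produced. */
static int ctx_use(unsigned int id)
{
	struct ctx *c = ctx_lookup(id);

	if (is_err(c))
		return (int)ptr_err(c);
	return c->id;
}
```

The point of the conversion is that callers no longer hard-code -ENOENT,
so a lookup that fails for some other reason (for example an allocation
failure while finalizing a context) can report that reason instead.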

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c| 12 ++--
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c |  4 ++--
 drivers/gpu/drm/i915/i915_drv.h|  2 +-
 drivers/gpu/drm/i915/i915_perf.c   |  4 ++--
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 4972b8c91d942..7045e3afa7113 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -2636,8 +2636,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device 
*dev, void *data,
int ret = 0;
 
ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
-   if (!ctx)
-   return -ENOENT;
+   if (IS_ERR(ctx))
+   return PTR_ERR(ctx);
 
switch (args->param) {
case I915_CONTEXT_PARAM_GTT_SIZE:
@@ -2705,8 +2705,8 @@ int i915_gem_context_setparam_ioctl(struct drm_device 
*dev, void *data,
int ret;
 
ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
-   if (!ctx)
-   return -ENOENT;
+   if (IS_ERR(ctx))
+   return PTR_ERR(ctx);
 
ret = ctx_setparam(file_priv, ctx, args);
 
@@ -2725,8 +2725,8 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device 
*dev,
return -EINVAL;
 
ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id);
-   if (!ctx)
-   return -ENOENT;
+   if (IS_ERR(ctx))
+   return PTR_ERR(ctx);
 
/*
 * We opt for unserialised reads here. This may result in tearing
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 9aa7e10d16308..5ea8b4e23e428 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -750,8 +750,8 @@ static int eb_select_context(struct i915_execbuffer *eb)
struct i915_gem_context *ctx;
 
ctx = i915_gem_context_lookup(eb->file->driver_priv, eb->args->rsvd1);
-   if (unlikely(!ctx))
-   return -ENOENT;
+   if (unlikely(IS_ERR(ctx)))
+   return PTR_ERR(ctx);
 
eb->gem_context = ctx;
if (rcu_access_pointer(ctx->vm))
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8c1994c16b920..d9278c973a734 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1864,7 +1864,7 @@ i915_gem_context_lookup(struct drm_i915_file_private 
*file_priv, u32 id)
ctx = NULL;
rcu_read_unlock();
 
-   return ctx;
+   return ctx ? ctx : ERR_PTR(-ENOENT);
 }
 
 static inline struct i915_address_space *
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 9f94914958c39..b4ec114a4698b 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -3414,10 +3414,10 @@ i915_perf_open_ioctl_locked(struct i915_perf *perf,
struct drm_i915_file_private *file_priv = file->driver_priv;
 
specific_ctx = i915_gem_context_lookup(file_priv, ctx_handle);
-   if (!specific_ctx) {
+   if (IS_ERR(specific_ctx)) {
DRM_DEBUG("Failed to look up context with ID %u for 
opening perf stream\n",
  ctx_handle);
-   ret = -ENOENT;
+   ret = PTR_ERR(specific_ctx);
goto err;
}
}
-- 
2.31.1



[Intel-gfx] [PATCH 20/30] drm/i915/gem: Make an alignment check more sensible

2021-07-08 Thread Jason Ekstrand
What we really want to check is that the size of the engines array, i.e.
args->size - sizeof(*user) is divisible by the element size, i.e.
sizeof(*user->engines) because that's what's required for computing the
array length right below the check.  However, we're currently not doing
this and instead doing a compile-time check that sizeof(*user) is
divisible by sizeof(*user->engines) and avoiding the subtraction.  As
far as I can tell, the only reason for the more confusing pair of checks
is to avoid a single subtraction of a constant.

The other thing the BUILD_BUG_ON might be trying to implicitly check is
that offsetof(user->engines) == sizeof(*user) and we don't have any
weird padding throwing us off.  However, that's not the check it's doing
and it's not even a reliable way to do that check.
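
To make the intended check concrete, a userspace sketch with hypothetical
types: engines_param and engine_entry are stand-ins for the uAPI structs,
not their real layouts, and count_engines() is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct engine_entry {
	uint16_t engine_class;
	uint16_t engine_instance;
};

/* Hypothetical header followed by a variable-length array of entries. */
struct engines_param {
	uint64_t extensions;
	struct engine_entry engines[];
};

/* Returns the number of array entries, or -1 if the size is invalid. */
static long count_engines(size_t size)
{
	const size_t hdr = sizeof(struct engines_param);
	const size_t elem = sizeof(struct engine_entry);

	/* Check the array portion itself, not the total size. */
	if (size < hdr || (size - hdr) % elem != 0)
		return -1;

	return (long)((size - hdr) / elem);
}
```

The divisibility test is applied to exactly the quantity that is divided
afterwards, so the check and the length computation cannot disagree.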

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 3c59d1e4080c4..f135fbc97c5a7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1723,9 +1723,8 @@ set_engines(struct i915_gem_context *ctx,
goto replace;
}
 
-   BUILD_BUG_ON(!IS_ALIGNED(sizeof(*user), sizeof(*user->engines)));
if (args->size < sizeof(*user) ||
-   !IS_ALIGNED(args->size, sizeof(*user->engines))) {
+   !IS_ALIGNED(args->size -  sizeof(*user), sizeof(*user->engines))) {
drm_dbg(&i915->drm, "Invalid size for engine array: %d\n",
args->size);
return -EINVAL;
-- 
2.31.1



[Intel-gfx] [PATCH 18/30] drm/i915/gem: Optionally set SSEU in intel_context_set_gem

2021-07-08 Thread Jason Ekstrand
For now this is a no-op because everyone passes in a null SSEU but it
lets us get some of the error handling and selftest refactoring plumbed
through.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 41 +++
 .../gpu/drm/i915/gem/selftests/mock_context.c |  6 ++-
 2 files changed, 36 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 5b75f98274b9e..206721dccd24e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -266,9 +266,12 @@ context_get_vm_rcu(struct i915_gem_context *ctx)
} while (1);
 }
 
-static void intel_context_set_gem(struct intel_context *ce,
- struct i915_gem_context *ctx)
+static int intel_context_set_gem(struct intel_context *ce,
+struct i915_gem_context *ctx,
+struct intel_sseu sseu)
 {
+   int ret = 0;
+
GEM_BUG_ON(rcu_access_pointer(ce->gem_context));
RCU_INIT_POINTER(ce->gem_context, ctx);
 
@@ -295,6 +298,12 @@ static void intel_context_set_gem(struct intel_context *ce,
 
intel_context_set_watchdog_us(ce, (u64)timeout_ms * 1000);
}
+
+   /* A valid SSEU has no zero fields */
+   if (sseu.slice_mask && !WARN_ON(ce->engine->class != RENDER_CLASS))
+   ret = intel_context_reconfigure_sseu(ce, sseu);
+
+   return ret;
 }
 
 static void __free_engines(struct i915_gem_engines *e, unsigned int count)
@@ -362,7 +371,8 @@ static struct i915_gem_engines *alloc_engines(unsigned int 
count)
return e;
 }
 
-static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx)
+static struct i915_gem_engines *default_engines(struct i915_gem_context *ctx,
+   struct intel_sseu rcs_sseu)
 {
const struct intel_gt *gt = &ctx->i915->gt;
struct intel_engine_cs *engine;
@@ -375,6 +385,8 @@ static struct i915_gem_engines *default_engines(struct 
i915_gem_context *ctx)
 
for_each_engine(engine, gt, id) {
struct intel_context *ce;
+   struct intel_sseu sseu = {};
+   int ret;
 
if (engine->legacy_idx == INVALID_ENGINE)
continue;
@@ -388,10 +400,18 @@ static struct i915_gem_engines *default_engines(struct 
i915_gem_context *ctx)
goto free_engines;
}
 
-   intel_context_set_gem(ce, ctx);
-
e->engines[engine->legacy_idx] = ce;
e->num_engines = max(e->num_engines, engine->legacy_idx + 1);
+
+   if (engine->class == RENDER_CLASS)
+   sseu = rcs_sseu;
+
+   ret = intel_context_set_gem(ce, ctx, sseu);
+   if (ret) {
+   err = ERR_PTR(ret);
+   goto free_engines;
+   }
+
}
 
return e;
@@ -705,6 +725,7 @@ __create_context(struct drm_i915_private *i915,
 {
struct i915_gem_context *ctx;
struct i915_gem_engines *e;
+   struct intel_sseu null_sseu = {};
int err;
int i;
 
@@ -722,7 +743,7 @@ __create_context(struct drm_i915_private *i915,
INIT_LIST_HEAD(&ctx->stale.engines);
 
mutex_init(&ctx->engines_mutex);
-   e = default_engines(ctx);
+   e = default_engines(ctx, null_sseu);
if (IS_ERR(e)) {
err = PTR_ERR(e);
goto err_free;
@@ -1508,6 +1529,7 @@ set_engines__load_balance(struct i915_user_extension 
__user *base, void *data)
struct intel_engine_cs *stack[16];
struct intel_engine_cs **siblings;
struct intel_context *ce;
+   struct intel_sseu null_sseu = {};
u16 num_siblings, idx;
unsigned int n;
int err;
@@ -1580,7 +1602,7 @@ set_engines__load_balance(struct i915_user_extension 
__user *base, void *data)
goto out_siblings;
}
 
-   intel_context_set_gem(ce, set->ctx);
+   intel_context_set_gem(ce, set->ctx, null_sseu);
 
if (cmpxchg(&set->engines->engines[idx], NULL, ce)) {
intel_context_put(ce);
@@ -1688,6 +1710,7 @@ set_engines(struct i915_gem_context *ctx,
struct drm_i915_private *i915 = ctx->i915;
struct i915_context_param_engines __user *user =
u64_to_user_ptr(args->value);
+   struct intel_sseu null_sseu = {};
struct set_engines set = { .ctx = ctx };
unsigned int num_engines, n;
u64 extensions;
@@ -1697,7 +1720,7 @@ set_engines(struct i915_gem_context *ctx,
if (!i915_gem_context_user_engines(ctx))
return 0;
 
-   set.engines = default_engines(ctx);
+   set.engines = default_engines(ctx, null_sseu);
if (IS_ERR(set.engines))
 

[Intel-gfx] [PATCH 19/30] drm/i915: Add an i915_gem_vm_lookup helper

2021-07-08 Thread Jason Ekstrand
This is the VM equivalent of i915_gem_context_lookup.  It's only used
once in this patch but future patches will need to duplicate this lookup
code so it's better to have it in a helper.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c |  6 +-
 drivers/gpu/drm/i915/i915_drv.h | 14 ++
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 206721dccd24e..3c59d1e4080c4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1311,11 +1311,7 @@ static int set_ppgtt(struct drm_i915_file_private 
*file_priv,
if (upper_32_bits(args->value))
return -ENOENT;
 
-   rcu_read_lock();
-   vm = xa_load(&file_priv->vm_xa, args->value);
-   if (vm && !kref_get_unless_zero(&vm->ref))
-   vm = NULL;
-   rcu_read_unlock();
+   vm = i915_gem_vm_lookup(file_priv, args->value);
if (!vm)
return -ENOENT;
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ae45ea7b26997..8c1994c16b920 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1867,6 +1867,20 @@ i915_gem_context_lookup(struct drm_i915_file_private 
*file_priv, u32 id)
return ctx;
 }
 
+static inline struct i915_address_space *
+i915_gem_vm_lookup(struct drm_i915_file_private *file_priv, u32 id)
+{
+   struct i915_address_space *vm;
+
+   rcu_read_lock();
+   vm = xa_load(&file_priv->vm_xa, id);
+   if (vm && !kref_get_unless_zero(&vm->ref))
+   vm = NULL;
+   rcu_read_unlock();
+
+   return vm;
+}
+
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct i915_address_space *vm,
  u64 min_size, u64 alignment,
-- 
2.31.1
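
As a rough userspace illustration of the pattern this helper wraps (find an
object by id, then take a reference only if its refcount has not already
dropped to zero), here is a minimal sketch using C11 atomics.  It is not
kernel code: toy_vm, toy_table, toy_ref_get_unless_zero() and
toy_vm_lookup() are hypothetical names, a fixed array stands in for the
xarray, and the RCU read-side critical section is left out entirely.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TABLE_SIZE 8

/* Hypothetical stand-in for struct i915_address_space. */
struct toy_vm {
	atomic_uint ref;
};

/* Hypothetical stand-in for the per-file table of VMs. */
struct toy_table {
	struct toy_vm *slots[TOY_TABLE_SIZE];
};

/* Take a reference unless the count already hit zero (object is dying). */
static bool toy_ref_get_unless_zero(atomic_uint *ref)
{
	unsigned int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Returns a referenced VM, or NULL if the id is unknown or the VM is dying. */
static struct toy_vm *toy_vm_lookup(struct toy_table *t, uint32_t id)
{
	struct toy_vm *vm;

	if (id >= TOY_TABLE_SIZE)
		return NULL;

	vm = t->slots[id];
	if (vm && !toy_ref_get_unless_zero(&vm->ref))
		vm = NULL;

	return vm;
}
```

A caller treats NULL the same way set_ppgtt() does above, as "no such VM",
and must drop the reference it was handed once it is done with the object.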



[Intel-gfx] [PATCH 16/30] drm/i915/gem: Add an intermediate proto_context struct (v5)

2021-07-08 Thread Jason Ekstrand
The current context uAPI allows for two methods of setting context
parameters: SET_CONTEXT_PARAM and CONTEXT_CREATE_EXT_SETPARAM.  The
former is allowed to be called at any time while the latter happens as
part of GEM_CONTEXT_CREATE.  Currently, everything settable via one is
settable via the other.  While some params are fairly simple and setting
them on a live context is harmless, such as the context priority, others
are far trickier, such as the VM or the set of engines.  In order to swap
out the VM, for instance, we have to delay until all current in-flight
work is complete, swap in the new VM, and then continue.  This leads to a
plethora of potential race conditions we'd really rather avoid.

Unfortunately, both methods of setting the VM and the engine set are in
active use today so we can't simply disallow setting the VM or engine
set via SET_CONTEXT_PARAM.  In order to work around this wart, this
commit adds a proto-context struct which contains all the context create
parameters.

v2 (Daniel Vetter):
 - Better commit message
 - Use __set/clear_bit instead of set/clear_bit because there's no race
   and we don't need the atomics

v3 (Daniel Vetter):
 - Use manual bitops and BIT() instead of __set_bit

v4 (Daniel Vetter):
 - Add a changelog to the commit message
 - Better hyperlinking in docs
 - Create the default PPGTT in i915_gem_create_context

v5 (Daniel Vetter):
 - Hand-roll the initialization of UCONTEXT_PERSISTENCE

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 84 +++
 .../gpu/drm/i915/gem/i915_gem_context_types.h | 22 +
 .../gpu/drm/i915/gem/selftests/mock_context.c | 16 +++-
 3 files changed, 105 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index f9a6eac78c0ae..741624da8db78 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -191,6 +191,43 @@ static int validate_priority(struct drm_i915_private *i915,
return 0;
 }
 
+static void proto_context_close(struct i915_gem_proto_context *pc)
+{
+   if (pc->vm)
+   i915_vm_put(pc->vm);
+   kfree(pc);
+}
+
+static struct i915_gem_proto_context *
+proto_context_create(struct drm_i915_private *i915, unsigned int flags)
+{
+   struct i915_gem_proto_context *pc, *err;
+
+   pc = kzalloc(sizeof(*pc), GFP_KERNEL);
+   if (!pc)
+   return ERR_PTR(-ENOMEM);
+
+   pc->user_flags = BIT(UCONTEXT_BANNABLE) |
+BIT(UCONTEXT_RECOVERABLE);
+   if (i915->params.enable_hangcheck)
+   pc->user_flags |= BIT(UCONTEXT_PERSISTENCE);
+   pc->sched.priority = I915_PRIORITY_NORMAL;
+
+   if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE) {
+   if (!HAS_EXECLISTS(i915)) {
+   err = ERR_PTR(-EINVAL);
+   goto proto_close;
+   }
+   pc->single_timeline = true;
+   }
+
+   return pc;
+
+proto_close:
+   proto_context_close(pc);
+   return err;
+}
+
 static struct i915_address_space *
 context_get_vm_rcu(struct i915_gem_context *ctx)
 {
@@ -660,7 +697,8 @@ static int __context_set_persistence(struct 
i915_gem_context *ctx, bool state)
 }
 
 static struct i915_gem_context *
-__create_context(struct drm_i915_private *i915)
+__create_context(struct drm_i915_private *i915,
+const struct i915_gem_proto_context *pc)
 {
struct i915_gem_context *ctx;
struct i915_gem_engines *e;
@@ -673,7 +711,7 @@ __create_context(struct drm_i915_private *i915)
 
kref_init(&ctx->ref);
ctx->i915 = i915;
-   ctx->sched.priority = I915_PRIORITY_NORMAL;
+   ctx->sched = pc->sched;
mutex_init(&ctx->mutex);
INIT_LIST_HEAD(&ctx->link);
 
@@ -696,9 +734,7 @@ __create_context(struct drm_i915_private *i915)
 * is no remap info, it will be a NOP. */
ctx->remap_slice = ALL_L3_SLICES(i915);
 
-   i915_gem_context_set_bannable(ctx);
-   i915_gem_context_set_recoverable(ctx);
-   __context_set_persistence(ctx, true /* cgroup hook? */);
+   ctx->user_flags = pc->user_flags;
 
for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++)
ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
@@ -786,20 +822,22 @@ static void __assign_ppgtt(struct i915_gem_context *ctx,
 }
 
 static struct i915_gem_context *
-i915_gem_create_context(struct drm_i915_private *i915, unsigned int flags)
+i915_gem_create_context(struct drm_i915_private *i915,
+   const struct i915_gem_proto_context *pc)
 {
struct i915_gem_context *ctx;
int ret;
 
-   if (flags & I915_CONTEXT_CREATE_FLAGS_SINGLE_TIMELINE &&
-   !HAS_EXECLISTS(i915))
-   return ERR_PTR(-EINVAL);
-
-   ctx = __create_context(i915);
+   ctx = __create_context(i915, pc);
if 
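
The proto-context idea from patch 16 can be sketched in plain userspace C:
creation parameters are collected and validated in a small intermediate
struct, and the real context only copies from it once.  This is a minimal
sketch under stated assumptions, not the i915 implementation.  Every toy_*
name is hypothetical, the bit numbers are illustrative, and the VM handling
is omitted.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_BIT(n) (1UL << (n))

/* Illustrative bit numbers; not the kernel's UCONTEXT_* values. */
enum {
	TOY_UCONTEXT_BANNABLE,
	TOY_UCONTEXT_RECOVERABLE,
	TOY_UCONTEXT_PERSISTENCE,
};

/* Hypothetical device description. */
struct toy_device {
	bool enable_hangcheck;
	bool has_execlists;
};

/* Collects creation parameters before any context exists. */
struct toy_proto_context {
	unsigned long user_flags;
	int priority;
	bool single_timeline;
};

struct toy_context {
	unsigned long user_flags;
	int priority;
};

/* Returns a proto-context with defaults, or NULL with *err set. */
static struct toy_proto_context *
toy_proto_create(const struct toy_device *dev, bool want_single_timeline,
		 int *err)
{
	struct toy_proto_context *pc = calloc(1, sizeof(*pc));

	if (!pc) {
		*err = -ENOMEM;
		return NULL;
	}

	pc->user_flags = TOY_BIT(TOY_UCONTEXT_BANNABLE) |
			 TOY_BIT(TOY_UCONTEXT_RECOVERABLE);
	if (dev->enable_hangcheck)
		pc->user_flags |= TOY_BIT(TOY_UCONTEXT_PERSISTENCE);
	pc->priority = 0;

	if (want_single_timeline) {
		/* Validation happens here, before a context is built. */
		if (!dev->has_execlists) {
			free(pc);
			*err = -EINVAL;
			return NULL;
		}
		pc->single_timeline = true;
	}

	*err = 0;
	return pc;
}

/* The context copies the already validated parameters. */
static struct toy_context *
toy_context_create(const struct toy_proto_context *pc)
{
	struct toy_context *ctx = calloc(1, sizeof(*ctx));

	if (!ctx)
		return NULL;

	ctx->user_flags = pc->user_flags;
	ctx->priority = pc->priority;
	return ctx;
}
```

The point of the split is that parameters can be changed freely on the
proto-context, where nothing is in flight, instead of on a live context.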

[Intel-gfx] [PATCH 17/30] drm/i915/gem: Rework error handling in default_engines

2021-07-08 Thread Jason Ekstrand
Since free_engines works for partially constructed engine sets, we can
use the usual goto pattern.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 741624da8db78..5b75f98274b9e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -366,7 +366,7 @@ static struct i915_gem_engines *default_engines(struct 
i915_gem_context *ctx)
 {
const struct intel_gt *gt = &ctx->i915->gt;
struct intel_engine_cs *engine;
-   struct i915_gem_engines *e;
+   struct i915_gem_engines *e, *err;
enum intel_engine_id id;
 
e = alloc_engines(I915_NUM_ENGINES);
@@ -384,18 +384,21 @@ static struct i915_gem_engines *default_engines(struct 
i915_gem_context *ctx)
 
ce = intel_context_create(engine);
if (IS_ERR(ce)) {
-   __free_engines(e, e->num_engines + 1);
-   return ERR_CAST(ce);
+   err = ERR_CAST(ce);
+   goto free_engines;
}
 
intel_context_set_gem(ce, ctx);
 
e->engines[engine->legacy_idx] = ce;
-   e->num_engines = max(e->num_engines, engine->legacy_idx);
+   e->num_engines = max(e->num_engines, engine->legacy_idx + 1);
}
-   e->num_engines++;
 
return e;
+
+free_engines:
+   free_engines(e);
+   return err;
 }
 
 void i915_gem_context_release(struct kref *ref)
-- 
2.31.1
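
The change above relies on the engine count being valid after every loop
iteration, so a single cleanup label can free a partially built set.  A
minimal userspace sketch of that pattern follows.  The toy_* names are
hypothetical, and fail_at only exists to simulate an allocation failure.

```c
#include <stdlib.h>

#define TOY_MAX_ENGINES 4
#define TOY_INVALID     (-1)

/* Hypothetical stand-in for struct i915_gem_engines. */
struct toy_engines {
	unsigned int num_engines;
	int *engines[TOY_MAX_ENGINES];
};

/* Frees whatever has been populated so far; free(NULL) is a no-op. */
static void toy_free_engines(struct toy_engines *e)
{
	unsigned int i;

	for (i = 0; i < e->num_engines; i++)
		free(e->engines[i]);
	free(e);
}

/*
 * Builds a set from legacy_idx[].  If fail_at >= 0, the allocation for
 * that input position is made to fail.  Returns NULL on failure.
 */
static struct toy_engines *
toy_default_engines(const int *legacy_idx, int count, int fail_at)
{
	struct toy_engines *e = calloc(1, sizeof(*e));
	int i;

	if (!e)
		return NULL;

	for (i = 0; i < count; i++) {
		int idx = legacy_idx[i];
		int *ce;

		if (idx < 0 || idx >= TOY_MAX_ENGINES)
			continue;

		ce = (i == fail_at) ? NULL : malloc(sizeof(*ce));
		if (!ce)
			goto free_engines;

		*ce = idx;
		e->engines[idx] = ce;
		/* Equivalent of max(num_engines, idx + 1): always a count. */
		if (e->num_engines < (unsigned int)idx + 1)
			e->num_engines = (unsigned int)idx + 1;
	}

	return e;

free_engines:
	toy_free_engines(e);
	return NULL;
}
```

With the old scheme (track the highest index in the loop, increment once
after it), the count was off by one on the error path, which is why the
original code had to pass num_engines + 1 to __free_engines() by hand.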



[Intel-gfx] [PATCH 11/30] drm/i915/request: Remove the hook from await_execution

2021-07-08 Thread Jason Ekstrand
This was only ever used for FENCE_SUBMIT automatic engine selection
which was removed in the previous commit.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  3 +-
 drivers/gpu/drm/i915/i915_request.c   | 42 ---
 drivers/gpu/drm/i915/i915_request.h   |  4 +-
 3 files changed, 9 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 30498948c83d0..9aa7e10d16308 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -3492,8 +3492,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
if (in_fence) {
if (args->flags & I915_EXEC_FENCE_SUBMIT)
err = i915_request_await_execution(eb.request,
-  in_fence,
-  NULL);
+  in_fence);
else
err = i915_request_await_dma_fence(eb.request,
   in_fence);
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index c5989c0b83d3e..86b4c9f2613d5 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -49,7 +49,6 @@
 struct execute_cb {
struct irq_work work;
struct i915_sw_fence *fence;
-   void (*hook)(struct i915_request *rq, struct dma_fence *signal);
struct i915_request *signal;
 };
 
@@ -180,17 +179,6 @@ static void irq_execute_cb(struct irq_work *wrk)
kmem_cache_free(global.slab_execute_cbs, cb);
 }
 
-static void irq_execute_cb_hook(struct irq_work *wrk)
-{
-   struct execute_cb *cb = container_of(wrk, typeof(*cb), work);
-
-   cb->hook(container_of(cb->fence, struct i915_request, submit),
-        &cb->signal->fence);
-   i915_request_put(cb->signal);
-
-   irq_execute_cb(wrk);
-}
-
 static __always_inline void
 __notify_execute_cb(struct i915_request *rq, bool (*fn)(struct irq_work *wrk))
 {
@@ -517,17 +505,12 @@ static bool __request_in_flight(const struct i915_request 
*signal)
 static int
 __await_execution(struct i915_request *rq,
  struct i915_request *signal,
- void (*hook)(struct i915_request *rq,
-  struct dma_fence *signal),
  gfp_t gfp)
 {
struct execute_cb *cb;
 
-   if (i915_request_is_active(signal)) {
-   if (hook)
-   hook(rq, &signal->fence);
+   if (i915_request_is_active(signal))
return 0;
-   }
 
cb = kmem_cache_alloc(global.slab_execute_cbs, gfp);
if (!cb)
@@ -537,12 +520,6 @@ __await_execution(struct i915_request *rq,
i915_sw_fence_await(cb->fence);
init_irq_work(&cb->work, irq_execute_cb);
 
-   if (hook) {
-   cb->hook = hook;
-   cb->signal = i915_request_get(signal);
-   cb->work.func = irq_execute_cb_hook;
-   }
-
/*
 * Register the callback first, then see if the signaler is already
 * active. This ensures that if we race with the
@@ -1253,7 +1230,7 @@ emit_semaphore_wait(struct i915_request *to,
goto await_fence;
 
/* Only submit our spinner after the signaler is running! */
-   if (__await_execution(to, from, NULL, gfp))
+   if (__await_execution(to, from, gfp))
goto await_fence;
 
if (__emit_semaphore_wait(to, from, from->fence.seqno))
@@ -1284,16 +1261,14 @@ static int intel_timeline_sync_set_start(struct 
intel_timeline *tl,
 
 static int
 __i915_request_await_execution(struct i915_request *to,
-  struct i915_request *from,
-  void (*hook)(struct i915_request *rq,
-   struct dma_fence *signal))
+  struct i915_request *from)
 {
int err;
 
GEM_BUG_ON(intel_context_is_barrier(from->context));
 
/* Submit both requests at the same time */
-   err = __await_execution(to, from, hook, I915_FENCE_GFP);
+   err = __await_execution(to, from, I915_FENCE_GFP);
if (err)
return err;
 
@@ -1406,9 +1381,7 @@ i915_request_await_external(struct i915_request *rq, 
struct dma_fence *fence)
 
 int
 i915_request_await_execution(struct i915_request *rq,
-struct dma_fence *fence,
-void (*hook)(struct i915_request *rq,
- struct dma_fence *signal))
+struct dma_fence *fence)
 {
struct dma_fence **child = &fence;
unsigned int nchild = 1;
@@ -1441,8 +1414,7 @@ i915_request_await_execution(struct 

[Intel-gfx] [PATCH 15/30] drm/i915: Add gem/i915_gem_context.h to the docs

2021-07-08 Thread Jason Ekstrand
In order to prevent kernel doc warnings, also fill out docs for any
missing fields and fix those that forgot the "@".

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 Documentation/gpu/i915.rst|  2 +
 .../gpu/drm/i915/gem/i915_gem_context_types.h | 43 ---
 2 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index e6fd9608e9c6d..204ebdaadb45a 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -422,6 +422,8 @@ Batchbuffer Parsing
 User Batchbuffer Execution
 --------------------------
 
+.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+
 .. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
:doc: User command execution
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h 
b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index df76767f0c41b..5f0673a2129f9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -30,19 +30,39 @@ struct i915_address_space;
 struct intel_timeline;
 struct intel_ring;
 
+/**
+ * struct i915_gem_engines - A set of engines
+ */
 struct i915_gem_engines {
union {
+   /** @link: Link in i915_gem_context::stale::engines */
struct list_head link;
+
+   /** @rcu: RCU to use when freeing */
struct rcu_head rcu;
};
+
+   /** @fence: Fence used for delayed destruction of engines */
struct i915_sw_fence fence;
+
+   /** @ctx: i915_gem_context backpointer */
struct i915_gem_context *ctx;
+
+   /** @num_engines: Number of engines in this set */
unsigned int num_engines;
+
+   /** @engines: Array of engines */
struct intel_context *engines[];
 };
 
+/**
+ * struct i915_gem_engines_iter - Iterator for an i915_gem_engines set
+ */
 struct i915_gem_engines_iter {
+   /** @idx: Index into i915_gem_engines::engines */
unsigned int idx;
+
+   /** @engines: Engine set being iterated */
const struct i915_gem_engines *engines;
 };
 
@@ -53,10 +73,10 @@ struct i915_gem_engines_iter {
  * logical hardware state for a particular client.
  */
 struct i915_gem_context {
-   /** i915: i915 device backpointer */
+   /** @i915: i915 device backpointer */
struct drm_i915_private *i915;
 
-   /** file_priv: owning file descriptor */
+   /** @file_priv: owning file descriptor */
struct drm_i915_file_private *file_priv;
 
/**
@@ -81,7 +101,9 @@ struct i915_gem_context {
 * CONTEXT_USER_ENGINES flag is set).
 */
struct i915_gem_engines __rcu *engines;
-   struct mutex engines_mutex; /* guards writes to engines */
+
+   /** @engines_mutex: guards writes to engines */
+   struct mutex engines_mutex;
 
/**
 * @syncobj: Shared timeline syncobj
@@ -118,7 +140,7 @@ struct i915_gem_context {
 */
struct pid *pid;
 
-   /** link: place with &drm_i915_private.context_list */
+   /** @link: place with &drm_i915_private.context_list */
struct list_head link;
 
/**
@@ -153,11 +175,13 @@ struct i915_gem_context {
 #define CONTEXT_CLOSED 0
 #define CONTEXT_USER_ENGINES   1
 
+   /** @mutex: guards everything that isn't engines or handles_vma */
struct mutex mutex;
 
+   /** @sched: scheduler parameters */
struct i915_sched_attr sched;
 
-   /** guilty_count: How many times this context has caused a GPU hang. */
+   /** @guilty_count: How many times this context has caused a GPU hang. */
atomic_t guilty_count;
/**
 * @active_count: How many times this context was active during a GPU
@@ -171,15 +195,17 @@ struct i915_gem_context {
unsigned long hang_timestamp[2];
 #define CONTEXT_FAST_HANG_JIFFIES (120 * HZ) /* 3 hangs within 120s? Banned! */
 
-   /** remap_slice: Bitmask of cache lines that need remapping */
+   /** @remap_slice: Bitmask of cache lines that need remapping */
u8 remap_slice;
 
/**
-* handles_vma: rbtree to look up our context specific obj/vma for
+* @handles_vma: rbtree to look up our context specific obj/vma for
 * the user handle. (user handles are per fd, but the binding is
 * per vm, which may be one per context or shared with the global GTT)
 */
struct radix_tree_root handles_vma;
+
+   /** @lut_mutex: Locks handles_vma */
struct mutex lut_mutex;
 
/**
@@ -191,8 +217,11 @@ struct i915_gem_context {
 */
char name[TASK_COMM_LEN + 8];
 
+   /** @stale: tracks stale engines to be destroyed */
struct {
+   /** @lock: guards engines */
spinlock_t lock;
+   /** @engines: list of stale engines */
struct list_head engines;
} stale;
 };

[Intel-gfx] [PATCH 14/30] drm/i915/gem: Add a separate validate_priority helper

2021-07-08 Thread Jason Ekstrand
With the proto-context stuff added later in this series, we end up
having to duplicate set_priority.  This lets us avoid duplicating the
validation logic.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 42 +
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 61fe6d18d4068..f9a6eac78c0ae 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -169,6 +169,28 @@ lookup_user_engine(struct i915_gem_context *ctx,
return i915_gem_context_get_engine(ctx, idx);
 }
 
+static int validate_priority(struct drm_i915_private *i915,
+const struct drm_i915_gem_context_param *args)
+{
+   s64 priority = args->value;
+
+   if (args->size)
+   return -EINVAL;
+
+   if (!(i915->caps.scheduler & I915_SCHEDULER_CAP_PRIORITY))
+   return -ENODEV;
+
+   if (priority > I915_CONTEXT_MAX_USER_PRIORITY ||
+   priority < I915_CONTEXT_MIN_USER_PRIORITY)
+   return -EINVAL;
+
+   if (priority > I915_CONTEXT_DEFAULT_PRIORITY &&
+   !capable(CAP_SYS_NICE))
+   return -EPERM;
+
+   return 0;
+}
+
 static struct i915_address_space *
 context_get_vm_rcu(struct i915_gem_context *ctx)
 {
@@ -1744,23 +1766,13 @@ static void __apply_priority(struct intel_context *ce, 
void *arg)
 static int set_priority(struct i915_gem_context *ctx,
const struct drm_i915_gem_context_param *args)
 {
-   s64 priority = args->value;
-
-   if (args->size)
-   return -EINVAL;
-
-   if (!(ctx->i915->caps.scheduler & I915_SCHEDULER_CAP_PRIORITY))
-   return -ENODEV;
-
-   if (priority > I915_CONTEXT_MAX_USER_PRIORITY ||
-   priority < I915_CONTEXT_MIN_USER_PRIORITY)
-   return -EINVAL;
+   int err;
 
-   if (priority > I915_CONTEXT_DEFAULT_PRIORITY &&
-   !capable(CAP_SYS_NICE))
-   return -EPERM;
+   err = validate_priority(ctx->i915, args);
+   if (err)
+   return err;
 
-   ctx->sched.priority = priority;
+   ctx->sched.priority = args->value;
context_apply_all(ctx, __apply_priority, ctx);
 
return 0;
-- 
2.31.1
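
The benefit of splitting validation from application can be shown with a
small userspace sketch: two different setters share one validator.  All
toy_* names are hypothetical, the limits are restated only for
illustration, and a boolean stands in for capable(CAP_SYS_NICE).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative limits; the real ones come from the i915 uAPI header. */
#define TOY_MAX_USER_PRIORITY   1023
#define TOY_MIN_USER_PRIORITY   (-1023)
#define TOY_DEFAULT_PRIORITY    0

/* Hypothetical stand-in for struct drm_i915_gem_context_param. */
struct toy_param {
	uint64_t size;
	uint64_t value;
};

/* Hypothetical stand-in for device caps plus caller privileges. */
struct toy_caps {
	bool has_priority; /* scheduler supports priorities */
	bool sys_nice;     /* caller may raise priority */
};

/* Pure check: touches neither a context nor a proto-context. */
static int toy_validate_priority(const struct toy_caps *caps,
				 const struct toy_param *args)
{
	int64_t priority = (int64_t)args->value;

	if (args->size)
		return -EINVAL;

	if (!caps->has_priority)
		return -ENODEV;

	if (priority > TOY_MAX_USER_PRIORITY ||
	    priority < TOY_MIN_USER_PRIORITY)
		return -EINVAL;

	if (priority > TOY_DEFAULT_PRIORITY && !caps->sys_nice)
		return -EPERM;

	return 0;
}

/* Setter for a live context: validate, then apply. */
static int toy_ctx_set_priority(int *ctx_priority,
				const struct toy_caps *caps,
				const struct toy_param *args)
{
	int err = toy_validate_priority(caps, args);

	if (err)
		return err;

	*ctx_priority = (int)(int64_t)args->value;
	return 0;
}

/* A second setter (e.g. for a proto-context) reuses the same check. */
static int toy_proto_set_priority(int *pc_priority,
				  const struct toy_caps *caps,
				  const struct toy_param *args)
{
	int err = toy_validate_priority(caps, args);

	if (err)
		return err;

	*pc_priority = (int)(int64_t)args->value;
	return 0;
}
```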



[Intel-gfx] [PATCH 09/30] drm/i915/gem: Disallow bonding of virtual engines (v3)

2021-07-08 Thread Jason Ekstrand
This adds a bunch of complexity which the media driver has never
actually used.  The media driver does technically bond a balanced engine
to another engine but the balanced engine only has one engine in the
sibling set.  This doesn't actually result in a virtual engine.

This functionality was originally added to handle cases where we may
have more than two video engines and media might want to load-balance
their bonded submits by, for instance, submitting to a balanced vcs0-1
as the primary and then vcs2-3 as the secondary.  However, no such
hardware has shipped thus far and, if we ever want to enable such
use-cases in the future, we'll use the up-and-coming parallel submit API
which targets GuC submission.

This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op.  We leave the
validation code in place in case we ever decide we want to do something
interesting with the bonding information.

v2 (Jason Ekstrand):
 - Don't delete quite as much code.

v3 (Tvrtko Ursulin):
 - Add some history to the commit message

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |  18 +-
 .../drm/i915/gt/intel_execlists_submission.c  |  69 --
 .../drm/i915/gt/intel_execlists_submission.h  |   5 +-
 drivers/gpu/drm/i915/gt/selftest_execlists.c  | 229 --
 4 files changed, 8 insertions(+), 313 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index e36e3b1ae14e4..5eca91ded3423 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1552,6 +1552,12 @@ set_engines__bond(struct i915_user_extension __user 
*base, void *data)
}
virtual = set->engines->engines[idx]->engine;
 
+   if (intel_engine_is_virtual(virtual)) {
+   drm_dbg(&i915->drm,
+   "Bonding with virtual engines not allowed\n");
+   return -EINVAL;
+   }
+
err = check_user_mbz(&ext->flags);
if (err)
return err;
@@ -1592,18 +1598,6 @@ set_engines__bond(struct i915_user_extension __user 
*base, void *data)
n, ci.engine_class, ci.engine_instance);
return -EINVAL;
}
-
-   /*
-* A non-virtual engine has no siblings to choose between; and
-* a submit fence will always be directed to the one engine.
-*/
-   if (intel_engine_is_virtual(virtual)) {
-   err = intel_virtual_engine_attach_bond(virtual,
-  master,
-  bond);
-   if (err)
-   return err;
-   }
}
 
return 0;
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 7dd7afccb3adc..98b256352c23d 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -182,18 +182,6 @@ struct virtual_engine {
int prio;
} nodes[I915_NUM_ENGINES];
 
-   /*
-* Keep track of bonded pairs -- restrictions upon on our selection
-* of physical engines any particular request may be submitted to.
-* If we receive a submit-fence from a master engine, we will only
-* use one of sibling_mask physical engines.
-*/
-   struct ve_bond {
-   const struct intel_engine_cs *master;
-   intel_engine_mask_t sibling_mask;
-   } *bonds;
-   unsigned int num_bonds;
-
/* And finally, which physical engines this virtual engine maps onto. */
unsigned int num_siblings;
struct intel_engine_cs *siblings[];
@@ -3413,7 +3401,6 @@ static void rcu_virtual_context_destroy(struct 
work_struct *wrk)
i915_sched_engine_put(ve->base.sched_engine);
intel_engine_free_request_pool(&ve->base);
 
-   kfree(ve->bonds);
kfree(ve);
 }
 
@@ -3668,33 +3655,13 @@ static void virtual_submit_request(struct i915_request 
*rq)
spin_unlock_irqrestore(&ve->base.sched_engine->lock, flags);
 }
 
-static struct ve_bond *
-virtual_find_bond(struct virtual_engine *ve,
- const struct intel_engine_cs *master)
-{
-   int i;
-
-   for (i = 0; i < ve->num_bonds; i++) {
-   if (ve->bonds[i].master == master)
-   return &ve->bonds[i];
-   }
-
-   return NULL;
-}
-
 static void
 virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
 {
-   struct virtual_engine *ve = to_virtual_engine(rq->engine);
intel_engine_mask_t allowed, exec;
-   struct ve_bond *bond;
 
allowed = ~to_request(signal)->engine->mask;
 
-   bond = virtual_find_bond(ve, to_request(signal)->engine);

[Intel-gfx] [PATCH 10/30] drm/i915/gem: Remove engine auto-magic with FENCE_SUBMIT (v2)

2021-07-08 Thread Jason Ekstrand
Even though FENCE_SUBMIT is only documented to wait until the request in
the in-fence starts instead of waiting until it completes, it has a bit
more magic than that.  If FENCE_SUBMIT is used to submit something to a
balanced engine, we would wait to assign engines until the primary
request was ready to start and then attempt to assign it to a different
engine than the primary.  There is an IGT test (the bonded-slice subtest
of gem_exec_balancer) which exercises this by submitting a primary batch
to a specific VCS and then using FENCE_SUBMIT to submit a secondary
which can run on any VCS and have i915 figure out which VCS to run it on
such that they can run in parallel.

However, this functionality has never been used in the real world.  The
media driver (the only user of FENCE_SUBMIT) always picks exactly two
physical engines to bond and never asks us to pick which to use.

v2 (Daniel Vetter):
 - Mention the exact IGT test this breaks

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c  |  2 +-
 drivers/gpu/drm/i915/gt/intel_engine_types.h|  7 ---
 .../drm/i915/gt/intel_execlists_submission.c| 17 -
 3 files changed, 1 insertion(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 7b7897242a837..30498948c83d0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -3493,7 +3493,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
if (args->flags & I915_EXEC_FENCE_SUBMIT)
err = i915_request_await_execution(eb.request,
   in_fence,
-  
eb.engine->bond_execute);
+  NULL);
else
err = i915_request_await_dma_fence(eb.request,
   in_fence);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h 
b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 5b91068ab2779..1cb9c3b70b29a 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -416,13 +416,6 @@ struct intel_engine_cs {
 */
void(*submit_request)(struct i915_request *rq);
 
-   /*
-* Called on signaling of a SUBMIT_FENCE, passing along the signaling
-* request down to the bonded pairs.
-*/
-   void(*bond_execute)(struct i915_request *rq,
-   struct dma_fence *signal);
-
void(*release)(struct intel_engine_cs *engine);
 
struct intel_engine_execlists execlists;
diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 98b256352c23d..56e25090da672 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -3655,22 +3655,6 @@ static void virtual_submit_request(struct i915_request 
*rq)
spin_unlock_irqrestore(&ve->base.sched_engine->lock, flags);
 }
 
-static void
-virtual_bond_execute(struct i915_request *rq, struct dma_fence *signal)
-{
-   intel_engine_mask_t allowed, exec;
-
-   allowed = ~to_request(signal)->engine->mask;
-
-   /* Restrict the bonded request to run on only the available engines */
-   exec = READ_ONCE(rq->execution_mask);
-   while (!try_cmpxchg(&rq->execution_mask, &exec, exec & allowed))
-   ;
-
-   /* Prevent the master from being re-run on the bonded engines */
-   to_request(signal)->execution_mask &= ~allowed;
-}
-
 struct intel_context *
 intel_execlists_create_virtual(struct intel_engine_cs **siblings,
   unsigned int count)
@@ -3731,7 +3715,6 @@ intel_execlists_create_virtual(struct intel_engine_cs 
**siblings,
ve->base.sched_engine->schedule = i915_schedule;
ve->base.sched_engine->kick_backend = kick_execlists;
ve->base.submit_request = virtual_submit_request;
-   ve->base.bond_execute = virtual_bond_execute;
 
INIT_LIST_HEAD(virtual_queue(ve));
tasklet_setup(&ve->base.sched_engine->tasklet, virtual_submission_tasklet);
-- 
2.31.1
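
For readers unfamiliar with what the removed virtual_bond_execute() did,
here is a minimal userspace sketch of the same mask arithmetic using C11
atomics.  It only approximates the kernel code: toy_request and
toy_bond_execute() are hypothetical names, and the engine is reduced to a
plain bitmask.

```c
#include <stdatomic.h>
#include <stdint.h>

typedef uint32_t toy_engine_mask_t;

/* Hypothetical stand-in for struct i915_request. */
struct toy_request {
	_Atomic toy_engine_mask_t execution_mask; /* engines it may run on */
	toy_engine_mask_t engine_mask;            /* engine it is on now */
};

/*
 * Called when the primary (signal) starts executing: keep the secondary
 * (rq) off the primary's engine, and pin the primary to that engine.
 */
static void toy_bond_execute(struct toy_request *rq,
			     struct toy_request *signal)
{
	toy_engine_mask_t allowed = ~signal->engine_mask;
	toy_engine_mask_t exec = atomic_load(&rq->execution_mask);

	/* Retry until the narrowed mask is installed atomically. */
	while (!atomic_compare_exchange_weak(&rq->execution_mask, &exec,
					     exec & allowed))
		;

	/* Prevent the primary from being re-run on the other engines. */
	atomic_fetch_and(&signal->execution_mask, ~allowed);
}
```

With two engines (mask 0x3) and the primary running on engine 0, the
secondary ends up restricted to 0x2 and the primary to 0x1, which is the
"run in parallel on different engines" behaviour the commit message
describes.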



[Intel-gfx] [PATCH 12/30] drm/i915/gem: Disallow creating contexts with too many engines

2021-07-08 Thread Jason Ekstrand
There's no sense in allowing userspace to create more engines than it
can possibly access via execbuf.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Daniel Vetter 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 5eca91ded3423..0ba8506fb966f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1639,11 +1639,11 @@ set_engines(struct i915_gem_context *ctx,
return -EINVAL;
}
 
-   /*
-* Note that I915_EXEC_RING_MASK limits execbuf to only using the
-* first 64 engines defined here.
-*/
num_engines = (args->size - sizeof(*user)) / sizeof(*user->engines);
+   /* RING_MASK has no shift so we can use it directly here */
+   if (num_engines > I915_EXEC_RING_MASK + 1)
+   return -EINVAL;
+
set.engines = alloc_engines(num_engines);
if (!set.engines)
return -ENOMEM;
-- 
2.31.1
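
The new limit follows from how execbuf selects an engine: the index is
taken from the low bits of the flags, so with an unshifted mask of 0x3f
only indices 0..63 are reachable.  A minimal userspace sketch of the size
arithmetic follows.  The toy_* names and struct layouts are assumptions
made for illustration, and the size and alignment checks stand in for
validation that the kernel performs before the hunk shown above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Restated for the sketch; the commit says the mask covers 64 engines. */
#define TOY_EXEC_RING_MASK 0x3fu

/* Hypothetical per-engine entry in the user-supplied array. */
struct toy_engine_class_instance {
	uint16_t engine_class;
	uint16_t engine_instance;
};

/* Hypothetical fixed header preceding the engine array. */
struct toy_param_engines_hdr {
	uint64_t extensions;
};

/* Derives the engine count from the parameter size, or returns -EINVAL. */
static int toy_count_engines(size_t size, size_t *num_engines)
{
	const size_t hdr = sizeof(struct toy_param_engines_hdr);
	const size_t ent = sizeof(struct toy_engine_class_instance);
	size_t n;

	if (size < hdr || (size - hdr) % ent)
		return -EINVAL;

	n = (size - hdr) / ent;

	/* The mask has no shift, so mask + 1 is the number of indices. */
	if (n > TOY_EXEC_RING_MASK + 1)
		return -EINVAL;

	*num_engines = n;
	return 0;
}
```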


