date:20210817

== Series Details ==

Series: drm/i915: Use designated initializers for init/exit table
URL   : https://patchwork.freedesktop.org/series/93768/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10491_full -> Patchwork_20840_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Known issues


  Here are the changes found in Patchwork_20840_full that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@feature_discovery@display-2x:
- shard-iclb: NOTRUN -> [SKIP][1] ([i915#1839])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-iclb1/igt@feature_discov...@display-2x.html

  * igt@gem_create@create-massive:
- shard-apl:  NOTRUN -> [DMESG-WARN][2] ([i915#3002])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-apl2/igt@gem_cre...@create-massive.html

  * igt@gem_ctx_persistence@process:
- shard-snb:  NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#1099])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-snb6/igt@gem_ctx_persiste...@process.html

  * igt@gem_ctx_shared@q-in-order:
- shard-snb:  NOTRUN -> [SKIP][4] ([fdo#109271]) +304 similar issues
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-snb6/igt@gem_ctx_sha...@q-in-order.html

  * igt@gem_exec_fair@basic-none@vcs1:
- shard-iclb: NOTRUN -> [FAIL][5] ([i915#2842])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-iclb2/igt@gem_exec_fair@basic-n...@vcs1.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-glk:  [PASS][6] -> [FAIL][7] ([i915#2842])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-glk5/igt@gem_exec_fair@basic-throt...@rcs0.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-glk3/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_exec_gttfill@engines@rcs0:
- shard-glk:  [PASS][8] -> [DMESG-WARN][9] ([i915#118] / [i915#95]) +1 similar issue
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-glk7/igt@gem_exec_gttfill@engi...@rcs0.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-glk8/igt@gem_exec_gttfill@engi...@rcs0.html

  * igt@gem_huc_copy@huc-copy:
- shard-kbl:  NOTRUN -> [SKIP][10] ([fdo#109271] / [i915#2190])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-kbl2/igt@gem_huc_c...@huc-copy.html

  * igt@gem_pread@exhaustion:
- shard-apl:  NOTRUN -> [WARN][11] ([i915#2658]) +1 similar issue
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-apl7/igt@gem_pr...@exhaustion.html

  * igt@gem_pwrite@basic-exhaustion:
- shard-tglb: NOTRUN -> [WARN][12] ([i915#2658])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-tglb1/igt@gem_pwr...@basic-exhaustion.html

  * igt@gem_render_copy@y-tiled-to-vebox-yf-tiled:
- shard-iclb: NOTRUN -> [SKIP][13] ([i915#768])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-iclb1/igt@gem_render_c...@y-tiled-to-vebox-yf-tiled.html

  * igt@gem_userptr_blits@dmabuf-sync:
- shard-tglb: NOTRUN -> [SKIP][14] ([i915#3323])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-tglb2/igt@gem_userptr_bl...@dmabuf-sync.html

  * igt@gem_userptr_blits@dmabuf-unsync:
- shard-iclb: NOTRUN -> [SKIP][15] ([i915#3297]) +1 similar issue
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-iclb1/igt@gem_userptr_bl...@dmabuf-unsync.html

  * igt@gem_userptr_blits@unsync-unmap-cycles:
- shard-tglb: NOTRUN -> [SKIP][16] ([i915#3297]) +3 similar issues
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-tglb2/igt@gem_userptr_bl...@unsync-unmap-cycles.html

  * igt@gem_userptr_blits@vma-merge:
- shard-apl:  NOTRUN -> [FAIL][17] ([i915#3318])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-apl1/igt@gem_userptr_bl...@vma-merge.html

  * igt@gen3_mixed_blits:
- shard-iclb: NOTRUN -> [SKIP][18] ([fdo#109289])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-iclb1/igt@gen3_mixed_blits.html

  * igt@gen3_render_tiledx_blits:
- shard-tglb: NOTRUN -> [SKIP][19] ([fdo#109289])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-tglb2/igt@gen3_render_tiledx_blits.html

  * igt@gen9_exec_parse@allowed-single:
- shard-skl:  [PASS][20] -> [DMESG-WARN][21] ([i915#1436] / [i915#716])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-skl1/igt@gen9_exec_pa...@allowed-single.html
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/shard-skl9/igt@gen9_exec_pa...@allowed-single.html

  * igt@gen9_exec_parse@bb-start-out:
- shard-iclb: NOTRUN -> [SKIP][22]

[Intel-gfx] ✗ Fi.CI.BAT: failure for Drop frontbuffer rendering support from Skylake and newer

== Series Details ==

Series: Drop frontbuffer rendering support from Skylake and newer
URL   : https://patchwork.freedesktop.org/series/93769/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10491 -> Patchwork_20841


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20841 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20841, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20841/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20841:

### IGT changes ###

 Possible regressions 

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-atomic:
- fi-skl-guc: [PASS][1] -> [FAIL][2] +3 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-skl-guc/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20841/fi-skl-guc/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html
- fi-cfl-guc: [PASS][3] -> [FAIL][4] +3 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-cfl-guc/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20841/fi-cfl-guc/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html
- fi-icl-y:   [PASS][5] -> [FAIL][6] +3 similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-icl-y/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20841/fi-icl-y/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html
- fi-rkl-guc: [PASS][7] -> [FAIL][8] +3 similar issues
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-rkl-guc/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20841/fi-rkl-guc/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html
- fi-skl-6700k2:  [PASS][9] -> [FAIL][10] +3 similar issues
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-skl-6700k2/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20841/fi-skl-6700k2/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-atomic.html

  * igt@kms_cursor_legacy@basic-busy-flip-before-cursor-legacy:
- fi-icl-u2:  [PASS][11] -> [FAIL][12] +3 similar issues
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-icl-u2/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-legacy.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20841/fi-icl-u2/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-legacy.html
- fi-cfl-8700k:   [PASS][13] -> [FAIL][14] +3 similar issues
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-cfl-8700k/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-legacy.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20841/fi-cfl-8700k/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-legacy.html
- fi-cfl-8109u:   [PASS][15] -> [FAIL][16] +3 similar issues
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-cfl-8109u/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-legacy.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20841/fi-cfl-8109u/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-legacy.html
- fi-glk-dsi: [PASS][17] -> [FAIL][18] +3 similar issues
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-glk-dsi/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-legacy.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20841/fi-glk-dsi/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-legacy.html
- fi-kbl-soraka:  NOTRUN -> [FAIL][19] +3 similar issues
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20841/fi-kbl-soraka/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-legacy.html
- fi-kbl-7500u:   [PASS][20] -> [FAIL][21] +3 similar issues
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-kbl-7500u/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-legacy.html
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20841/fi-kbl-7500u/igt@kms_cursor_leg...@basic-busy-flip-before-cursor-legacy.html

  * igt@kms_cursor_legacy@basic-flip-before-cursor-atomic:
- fi-bxt-dsi: [PASS][22] -> [FAIL][23] +3 similar issues
   [22]:

[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/damage_helper: Fix handling of cursor dirty buffers

== Series Details ==

Series: drm/damage_helper: Fix handling of cursor dirty buffers
URL   : https://patchwork.freedesktop.org/series/93765/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10491_full -> Patchwork_20839_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20839_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20839_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20839_full:

### IGT changes ###

 Possible regressions 

  * igt@gem_exec_fair@basic-pace@vcs0:
- shard-kbl:  [PASS][1] -> [FAIL][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-kbl6/igt@gem_exec_fair@basic-p...@vcs0.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/shard-kbl1/igt@gem_exec_fair@basic-p...@vcs0.html

  
Known issues


  Here are the changes found in Patchwork_20839_full that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@feature_discovery@display-2x:
- shard-iclb: NOTRUN -> [SKIP][3] ([i915#1839])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/shard-iclb3/igt@feature_discov...@display-2x.html

  * igt@gem_create@create-massive:
- shard-apl:  NOTRUN -> [DMESG-WARN][4] ([i915#3002])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/shard-apl6/igt@gem_cre...@create-massive.html

  * igt@gem_ctx_isolation@preservation-s3@rcs0:
- shard-apl:  NOTRUN -> [DMESG-WARN][5] ([i915#180]) +1 similar issue
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/shard-apl8/igt@gem_ctx_isolation@preservation...@rcs0.html

  * igt@gem_ctx_persistence@idempotent:
- shard-snb:  NOTRUN -> [SKIP][6] ([fdo#109271] / [i915#1099]) +3 similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/shard-snb5/igt@gem_ctx_persiste...@idempotent.html

  * igt@gem_eio@unwedge-stress:
- shard-tglb: [PASS][7] -> [TIMEOUT][8] ([i915#2369] / [i915#3063] / [i915#3648])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-tglb6/igt@gem_...@unwedge-stress.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/shard-tglb7/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
- shard-tglb: [PASS][9] -> [FAIL][10] ([i915#2842])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-tglb7/igt@gem_exec_fair@basic-none-sh...@rcs0.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/shard-tglb8/igt@gem_exec_fair@basic-none-sh...@rcs0.html

  * igt@gem_exec_fair@basic-none@vcs0:
- shard-apl:  [PASS][11] -> [FAIL][12] ([i915#2842])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-apl6/igt@gem_exec_fair@basic-n...@vcs0.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/shard-apl7/igt@gem_exec_fair@basic-n...@vcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-iclb: NOTRUN -> [FAIL][13] ([i915#2842]) +1 similar issue
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/shard-iclb4/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-glk:  [PASS][14] -> [FAIL][15] ([i915#2842])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-glk5/igt@gem_exec_fair@basic-throt...@rcs0.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/shard-glk7/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_huc_copy@huc-copy:
- shard-kbl:  NOTRUN -> [SKIP][16] ([fdo#109271] / [i915#2190])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/shard-kbl4/igt@gem_huc_c...@huc-copy.html

  * igt@gem_mmap_gtt@big-copy-odd:
- shard-glk:  [PASS][17] -> [FAIL][18] ([i915#1888] / [i915#307])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-glk5/igt@gem_mmap_...@big-copy-odd.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/shard-glk9/igt@gem_mmap_...@big-copy-odd.html

  * igt@gem_pread@exhaustion:
- shard-apl:  NOTRUN -> [WARN][19] ([i915#2658]) +1 similar issue
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/shard-apl1/igt@gem_pr...@exhaustion.html

  * igt@gem_pwrite@basic-exhaustion:
- shard-tglb: NOTRUN -> [WARN][20] ([i915#2658])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/shard-tglb7/igt@gem_pwr...@basic-exhaustion.html

  * igt@gem_render_copy@y-tiled-to-vebox-yf-tiled:
- shard-iclb: NOTRUN -> [SKIP][21] ([i915#768])
   [21]:

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Drop frontbuffer rendering support from Skylake and newer

== Series Details ==

Series: Drop frontbuffer rendering support from Skylake and newer
URL   : https://patchwork.freedesktop.org/series/93769/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
+./include/linux/stddef.h:17:9: this was the original definition
+./include/linux/stddef.h:17:9: this was the original definition
+./include/linux/stddef.h:17:9: this was the original definition
+/usr/lib/gcc/x86_64-linux-gnu/8/include/stddef.h:417:9: warning: preprocessor token offsetof redefined
+/usr/lib/gcc/x86_64-linux-gnu/8/include/stddef.h:417:9: warning: preprocessor token offsetof redefined
+/usr/lib/gcc/x86_64-linux-gnu/8/include/stddef.h:417:9: warning: preprocessor token offsetof redefined

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Drop frontbuffer rendering support from Skylake and newer

== Series Details ==

Series: Drop frontbuffer rendering support from Skylake and newer
URL   : https://patchwork.freedesktop.org/series/93769/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
c4bc916870ba drm/damage_helper: Fix handling of cursor dirty buffers
1080e192fe73 drm/i915/display: Drop PSR support from HSW and BDW
-:267: WARNING:LONG_LINE_COMMENT: line length of 113 exceeds 100 columns
#267: FILE: drivers/gpu/drm/i915/i915_reg.h:4563:
+#define EDP_PSR_AUX_DATA(tran, i)  _MMIO(_TRANS2(tran, _SRD_AUX_DATA_A) + (i) + 4) /* 5 registers */

total: 0 errors, 1 warnings, 0 checks, 240 lines checked
6d37b8b938ac drm/i915/display: Move DRRS code its own file
-:604: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#604: 
new file mode 100644

-:725: CHECK:COMPARISON_TO_NULL: Comparison to NULL could be written "!intel_dp"
#725: FILE: drivers/gpu/drm/i915/display/intel_drrs.c:117:
+   if (intel_dp == NULL) {

-:934: WARNING:LONG_LINE: line length of 111 exceeds 100 columns
#934: FILE: drivers/gpu/drm/i915/display/intel_drrs.c:326:
+   drm_mode_vrefresh(intel_dp->attached_connector->panel.downclock_mode));

-:980: WARNING:LONG_LINE: line length of 107 exceeds 100 columns
#980: FILE: drivers/gpu/drm/i915/display/intel_drrs.c:372:
+   drm_mode_vrefresh(intel_dp->attached_connector->panel.fixed_mode));

-:1026: WARNING:LONG_LINE: line length of 107 exceeds 100 columns
#1026: FILE: drivers/gpu/drm/i915/display/intel_drrs.c:418:
+   drm_mode_vrefresh(intel_dp->attached_connector->panel.fixed_mode));

total: 0 errors, 4 warnings, 1 checks, 1071 lines checked
f8a34cf4d131 drm/i915/display: Some code improvements and code style fixes for DRRS
0be6fec2e58f drm/i915/display: Share code between intel_edp_drrs_flush and invalidate
d392eb08d84c drm/i915/display: Prepare DRRS for frontbuffer rendering drop
7abf1b3f8539 drm/i915/display/skl+: Drop frontbuffer rendering support
deb2e98252ec drm/i915/display: Drop PSR frontbuffer rendering support
-:250: WARNING:IF_0: Consider removing the code enclosed by this #if 0 and its #endif
#250: FILE: drivers/gpu/drm/i915/display/intel_psr.c:1894:
+#if 0

total: 0 errors, 1 warnings, 0 checks, 298 lines checked

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Use designated initializers for init/exit table

== Series Details ==

Series: drm/i915: Use designated initializers for init/exit table
URL   : https://patchwork.freedesktop.org/series/93768/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10491 -> Patchwork_20840


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/index.html

Known issues


  Here are the changes found in Patchwork_20840 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@cs-gfx:
- fi-rkl-guc: NOTRUN -> [SKIP][1] ([fdo#109315]) +17 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/fi-rkl-guc/igt@amdgpu/amd_ba...@cs-gfx.html
- fi-kbl-soraka:  NOTRUN -> [SKIP][2] ([fdo#109271]) +21 similar issues
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/fi-kbl-soraka/igt@amdgpu/amd_ba...@cs-gfx.html

  * igt@gem_huc_copy@huc-copy:
- fi-kbl-soraka:  NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#2190])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/fi-kbl-soraka/igt@gem_huc_c...@huc-copy.html

  * igt@i915_selftest@live@gt_pm:
- fi-kbl-soraka:  NOTRUN -> [DMESG-FAIL][4] ([i915#1886] / [i915#2291])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html

  * igt@i915_selftest@live@workarounds:
- fi-rkl-guc: NOTRUN -> [DMESG-WARN][5] ([i915#3967])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html

  * igt@kms_chamelium@common-hpd-after-suspend:
- fi-kbl-soraka:  NOTRUN -> [SKIP][6] ([fdo#109271] / [fdo#111827]) +8 similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/fi-kbl-soraka/igt@kms_chamel...@common-hpd-after-suspend.html

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
- fi-kbl-soraka:  NOTRUN -> [SKIP][7] ([fdo#109271] / [i915#533])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/fi-kbl-soraka/igt@kms_pipe_crc_ba...@compare-crc-sanitycheck-pipe-d.html

  * igt@runner@aborted:
- fi-bdw-5557u:   NOTRUN -> [FAIL][8] ([i915#1602] / [i915#2029])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/fi-bdw-5557u/igt@run...@aborted.html

  
 Possible fixes 

  * igt@core_hotunplug@unbind-rebind:
- fi-rkl-guc: [DMESG-WARN][9] ([i915#3925]) -> [PASS][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-rkl-guc/igt@core_hotunp...@unbind-rebind.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/fi-rkl-guc/igt@core_hotunp...@unbind-rebind.html
- fi-ilk-650: [DMESG-WARN][11] ([i915#164]) -> [PASS][12] +1 similar issue
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-ilk-650/igt@core_hotunp...@unbind-rebind.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/fi-ilk-650/igt@core_hotunp...@unbind-rebind.html

  * igt@gem_exec_suspend@basic-s0:
- fi-kbl-soraka:  [INCOMPLETE][13] ([i915#155]) -> [PASS][14]
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-kbl-soraka/igt@gem_exec_susp...@basic-s0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/fi-kbl-soraka/igt@gem_exec_susp...@basic-s0.html

  * igt@gem_exec_suspend@basic-s3:
- fi-tgl-1115g4:  [FAIL][15] ([i915#1888]) -> [PASS][16]
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html

  * igt@i915_selftest@live@execlists:
- {fi-tgl-dsi}:   [DMESG-FAIL][17] ([i915#1993]) -> [PASS][18]
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-tgl-dsi/igt@i915_selftest@l...@execlists.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20840/fi-tgl-dsi/igt@i915_selftest@l...@execlists.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#155]: https://gitlab.freedesktop.org/drm/intel/issues/155
  [i915#1602]: https://gitlab.freedesktop.org/drm/intel/issues/1602
  [i915#164]: https://gitlab.freedesktop.org/drm/intel/issues/164
  [i915#1886]: https://gitlab.freedesktop.org/drm/intel/issues/1886
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#1993]: https://gitlab.freedesktop.org/drm/intel/issues/1993
  [i915#2029]: https://gitlab.freedesktop.org/drm/intel/issues/2029
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190

[Intel-gfx] [PATCH 5/8] drm/i915/display: Share code between intel_edp_drrs_flush and invalidate

Both functions are pretty much equal, with minor changes that can be
handled by a single parameter.

Signed-off-by: José Roberto de Souza 
---
 drivers/gpu/drm/i915/display/intel_drrs.c | 82 +--
 1 file changed, 32 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_drrs.c b/drivers/gpu/drm/i915/display/intel_drrs.c
index e96033bc6c658..b885c1ec76bf9 100644
--- a/drivers/gpu/drm/i915/display/intel_drrs.c
+++ b/drivers/gpu/drm/i915/display/intel_drrs.c
@@ -312,18 +312,9 @@ static void intel_edp_drrs_downclock_work(struct work_struct *work)
mutex_unlock(&dev_priv->drrs.mutex);
 }
 
-/**
- * intel_edp_drrs_invalidate - Disable Idleness DRRS
- * @dev_priv: i915 device
- * @frontbuffer_bits: frontbuffer plane tracking bits
- *
- * This function gets called everytime rendering on the given planes start.
- * Hence DRRS needs to be Upclocked, i.e. (LOW_RR -> HIGH_RR).
- *
- * Dirty frontbuffers relevant to DRRS are tracked in busy_frontbuffer_bits.
- */
-void intel_edp_drrs_invalidate(struct drm_i915_private *dev_priv,
-  unsigned int frontbuffer_bits)
+static void intel_edp_drrs_frontbuffer_update(struct drm_i915_private *dev_priv,
+ unsigned int frontbuffer_bits,
+ bool invalidate)
 {
struct intel_dp *intel_dp;
struct drm_crtc *crtc;
@@ -346,16 +337,42 @@ void intel_edp_drrs_invalidate(struct drm_i915_private *dev_priv,
pipe = to_intel_crtc(crtc)->pipe;
 
frontbuffer_bits &= INTEL_FRONTBUFFER_ALL_MASK(pipe);
-   dev_priv->drrs.busy_frontbuffer_bits |= frontbuffer_bits;
+   if (invalidate)
+   dev_priv->drrs.busy_frontbuffer_bits |= frontbuffer_bits;
+   else
+   dev_priv->drrs.busy_frontbuffer_bits &= ~frontbuffer_bits;
 
-   /* invalidate means busy screen hence upclock */
+   /* flush/invalidate means busy screen hence upclock */
if (frontbuffer_bits)
intel_dp_set_drrs_state(dev_priv, to_intel_crtc(crtc)->config,
DRRS_HIGH_RR);
 
+   /*
+* flush also means no more activity hence schedule downclock, if all
+* other fbs are quiescent too
+*/
+   if (!dev_priv->drrs.busy_frontbuffer_bits)
+   schedule_delayed_work(&dev_priv->drrs.work,
+ msecs_to_jiffies(1000));
mutex_unlock(&dev_priv->drrs.mutex);
 }
 
+/**
+ * intel_edp_drrs_invalidate - Disable Idleness DRRS
+ * @dev_priv: i915 device
+ * @frontbuffer_bits: frontbuffer plane tracking bits
+ *
+ * This function gets called everytime rendering on the given planes start.
+ * Hence DRRS needs to be Upclocked, i.e. (LOW_RR -> HIGH_RR).
+ *
+ * Dirty frontbuffers relevant to DRRS are tracked in busy_frontbuffer_bits.
+ */
+void intel_edp_drrs_invalidate(struct drm_i915_private *dev_priv,
+  unsigned int frontbuffer_bits)
+{
+   intel_edp_drrs_frontbuffer_update(dev_priv, frontbuffer_bits, true);
+}
+
 /**
  * intel_edp_drrs_flush - Restart Idleness DRRS
  * @dev_priv: i915 device
@@ -371,42 +388,7 @@ void intel_edp_drrs_invalidate(struct drm_i915_private *dev_priv,
 void intel_edp_drrs_flush(struct drm_i915_private *dev_priv,
  unsigned int frontbuffer_bits)
 {
-   struct intel_dp *intel_dp;
-   struct drm_crtc *crtc;
-   enum pipe pipe;
-
-   if (dev_priv->drrs.type == DRRS_NOT_SUPPORTED)
-   return;
-
-   cancel_delayed_work(&dev_priv->drrs.work);
-
-   mutex_lock(&dev_priv->drrs.mutex);
-
-   intel_dp = dev_priv->drrs.dp;
-   if (!intel_dp) {
-   mutex_unlock(&dev_priv->drrs.mutex);
-   return;
-   }
-
-   crtc = dp_to_dig_port(intel_dp)->base.base.crtc;
-   pipe = to_intel_crtc(crtc)->pipe;
-
-   frontbuffer_bits &= INTEL_FRONTBUFFER_ALL_MASK(pipe);
-   dev_priv->drrs.busy_frontbuffer_bits &= ~frontbuffer_bits;
-
-   /* flush means busy screen hence upclock */
-   if (frontbuffer_bits)
-   intel_dp_set_drrs_state(dev_priv, to_intel_crtc(crtc)->config,
-   DRRS_HIGH_RR);
-
-   /*
-* flush also means no more activity hence schedule downclock, if all
-* other fbs are quiescent too
-*/
-   if (!dev_priv->drrs.busy_frontbuffer_bits)
-   schedule_delayed_work(&dev_priv->drrs.work,
- msecs_to_jiffies(1000));
-   mutex_unlock(&dev_priv->drrs.mutex);
+   intel_edp_drrs_frontbuffer_update(dev_priv, frontbuffer_bits, false);
 }
 
 /**
-- 
2.32.0
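
For readers following the refactor in the patch above, here is a minimal
user-space sketch of the same idea: two near-identical entry points folded
into one helper that takes a bool. It is not the driver code. The struct
drrs_model type and the drrs_* function names are hypothetical stand-ins,
and locking, the delayed work item, and the refresh-rate switch are reduced
to plain flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the DRRS state kept in drm_i915_private. */
struct drrs_model {
	unsigned int busy_frontbuffer_bits; /* planes with rendering in flight */
	bool high_refresh;                  /* models DRRS_HIGH_RR being selected */
	bool downclock_scheduled;           /* models the pending delayed work */
};

/* Shared body: 'invalidate' picks between marking bits busy and clearing them. */
static void drrs_frontbuffer_update(struct drrs_model *drrs,
				    unsigned int frontbuffer_bits,
				    bool invalidate)
{
	assert(drrs != NULL);

	/* Models the cancel_delayed_work() call seen in the removed flush body. */
	drrs->downclock_scheduled = false;

	if (invalidate)
		drrs->busy_frontbuffer_bits |= frontbuffer_bits;
	else
		drrs->busy_frontbuffer_bits &= ~frontbuffer_bits;

	/* Activity on a tracked plane means a busy screen, hence upclock. */
	if (frontbuffer_bits)
		drrs->high_refresh = true;

	/* With nothing left busy, arm the downclock timer. */
	if (!drrs->busy_frontbuffer_bits)
		drrs->downclock_scheduled = true;
}

/* Thin wrappers keep the two public entry points unchanged for callers. */
static void drrs_invalidate(struct drrs_model *drrs, unsigned int bits)
{
	drrs_frontbuffer_update(drrs, bits, true);
}

static void drrs_flush(struct drrs_model *drrs, unsigned int bits)
{
	drrs_frontbuffer_update(drrs, bits, false);
}
```

In this model, as in the merged body shown in the diff, the "nothing busy"
check runs on both paths, so it is no longer specific to flush.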

[Intel-gfx] [PATCH 6/8] drm/i915/display: Prepare DRRS for frontbuffer rendering drop

Frontbuffer rendering will be dropped for modern platforms, but
before that we need to prepare DRRS for it.

intel_edp_drrs_flush and intel_edp_drrs_invalidate will not be called
on platforms that do not support frontbuffer rendering, so DRRS
needs another way to be notified about page flips so it can change
between high and low refresh rates as needed.

Signed-off-by: José Roberto de Souza 
---
 drivers/gpu/drm/i915/display/intel_display.c | 2 ++
 drivers/gpu/drm/i915/display/intel_drrs.c| 9 +
 drivers/gpu/drm/i915/display/intel_drrs.h| 4 
 3 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index a257e5dc381c6..e55c9e2cb254a 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -52,6 +52,7 @@
 #include "display/intel_dp_mst.h"
 #include "display/intel_dpll.h"
 #include "display/intel_dpll_mgr.h"
+#include "display/intel_drrs.h"
 #include "display/intel_dsi.h"
 #include "display/intel_dvo.h"
 #include "display/intel_fb.h"
@@ -2872,6 +2873,7 @@ static void intel_post_plane_update(struct intel_atomic_state *state,
hsw_enable_ips(new_crtc_state);
 
intel_fbc_post_update(state, crtc);
+   intel_edp_drrs_page_flip(state, crtc);
 
if (needs_nv12_wa(old_crtc_state) &&
!needs_nv12_wa(new_crtc_state))
diff --git a/drivers/gpu/drm/i915/display/intel_drrs.c b/drivers/gpu/drm/i915/display/intel_drrs.c
index b885c1ec76bf9..c5509ed9666be 100644
--- a/drivers/gpu/drm/i915/display/intel_drrs.c
+++ b/drivers/gpu/drm/i915/display/intel_drrs.c
@@ -391,6 +391,15 @@ void intel_edp_drrs_flush(struct drm_i915_private *dev_priv,
intel_edp_drrs_frontbuffer_update(dev_priv, frontbuffer_bits, false);
 }
 
+void intel_edp_drrs_page_flip(struct intel_atomic_state *state,
+ struct intel_crtc *crtc)
+{
+   struct drm_i915_private *dev_priv = to_i915(state->base.dev);
+   unsigned int frontbuffer_bits = INTEL_FRONTBUFFER_ALL_MASK(crtc->pipe);
+
+   intel_edp_drrs_frontbuffer_update(dev_priv, frontbuffer_bits, false);
+}
+
 /**
  * intel_dp_drrs_init - Init basic DRRS work and mutex.
  * @connector: eDP connector
diff --git a/drivers/gpu/drm/i915/display/intel_drrs.h b/drivers/gpu/drm/i915/display/intel_drrs.h
index ffa175b4cf4f4..5ae3769700bf3 100644
--- a/drivers/gpu/drm/i915/display/intel_drrs.h
+++ b/drivers/gpu/drm/i915/display/intel_drrs.h
@@ -9,6 +9,8 @@
 #include 
 
 struct drm_i915_private;
+struct intel_atomic_state;
+struct intel_crtc;
 struct intel_crtc_state;
 struct intel_connector;
 struct intel_dp;
@@ -23,6 +25,8 @@ void intel_edp_drrs_invalidate(struct drm_i915_private *dev_priv,
   unsigned int frontbuffer_bits);
 void intel_edp_drrs_flush(struct drm_i915_private *dev_priv,
  unsigned int frontbuffer_bits);
+void intel_edp_drrs_page_flip(struct intel_atomic_state *state,
+ struct intel_crtc *crtc);
 void intel_dp_drrs_compute_config(struct intel_dp *intel_dp,
  struct intel_crtc_state *pipe_config,
  int output_bpp, bool constant_n);
-- 
2.32.0
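
The page-flip hook in the patch above treats a flip as a flush of every
frontbuffer bit belonging to one pipe. The sketch below illustrates that
masking step only. The figure of 8 tracking bits per pipe, the limit of four
pipes, and the names pipe_all_mask and page_flip_clear are assumptions made
for illustration; none of them is taken from the patch.

```c
#include <assert.h>

/* Assumed width of the per-pipe bit group; illustrative only. */
#define BITS_PER_PIPE 8u

/* All tracking bits that belong to one pipe, in the assumed layout. */
static unsigned int pipe_all_mask(unsigned int pipe)
{
	assert(pipe < 4); /* keeps the shift inside a 32-bit value */
	return 0xffu << (BITS_PER_PIPE * pipe);
}

/* A page flip clears every busy bit of its pipe and leaves other pipes alone. */
static unsigned int page_flip_clear(unsigned int busy_bits, unsigned int pipe)
{
	return busy_bits & ~pipe_all_mask(pipe);
}
```

Because the whole pipe group is cleared at once, a flip on the only active
pipe leaves no busy bits behind. In the patch, that is the condition under
which the downclock work is scheduled.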

[Intel-gfx] [PATCH 7/8] drm/i915/display/skl+: Drop frontbuffer rendering support

By now all the userspace applications should have migrated to atomic
or at least be calling DRM_IOCTL_MODE_DIRTYFB.

With that we can kill frontbuffer rendering support in i915 for
modern platforms.

So this patch converts the legacy APIs into atomic commits so that they
can be properly handled by i915.

Several IGT tests will fail with these changes, because some tests
were stressing frontbuffer rendering scenarios that no userspace
should be using by now; fixes to IGT should be sent soon.

Cc: Daniel Vetter 
Cc: Gwan-gyeong Mun 
Cc: Ville Syrjälä 
Cc: Jani Nikula 
Cc: Rodrigo Vivi 
Signed-off-by: José Roberto de Souza 
---
 drivers/gpu/drm/i915/display/intel_cursor.c  | 6 ++
 drivers/gpu/drm/i915/display/intel_display.c | 7 ++-
 drivers/gpu/drm/i915/display/intel_frontbuffer.c | 6 ++
 drivers/gpu/drm/i915/i915_drv.h  | 2 ++
 4 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_cursor.c b/drivers/gpu/drm/i915/display/intel_cursor.c
index c7618fef01439..5aa996c3b7980 100644
--- a/drivers/gpu/drm/i915/display/intel_cursor.c
+++ b/drivers/gpu/drm/i915/display/intel_cursor.c
@@ -617,6 +617,7 @@ intel_legacy_cursor_update(struct drm_plane *_plane,
   u32 src_w, u32 src_h,
   struct drm_modeset_acquire_ctx *ctx)
 {
+   struct drm_i915_private *i915 = to_i915(_crtc->dev);
struct intel_plane *plane = to_intel_plane(_plane);
struct intel_crtc *crtc = to_intel_crtc(_crtc);
struct intel_plane_state *old_plane_state =
@@ -633,12 +634,9 @@ intel_legacy_cursor_update(struct drm_plane *_plane,
 * PSR2 selective fetch also requires the slow path as
 * PSR2 plane and transcoder registers can only be updated during
 * vblank.
-*
-* FIXME bigjoiner fastpath would be good
 */
if (!crtc_state->hw.active || intel_crtc_needs_modeset(crtc_state) ||
-   crtc_state->update_pipe || crtc_state->bigjoiner ||
-   crtc_state->enable_psr2_sel_fetch)
+   crtc_state->update_pipe || !HAS_FRONTBUFFER_RENDERING(i915))
goto slow;
 
/*
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index e55c9e2cb254a..f700544454ad5 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -11744,10 +11744,15 @@ static int intel_user_framebuffer_dirty(struct drm_framebuffer *fb,
unsigned num_clips)
 {
struct drm_i915_gem_object *obj = intel_fb_obj(fb);
+   struct drm_i915_private *i915 = to_i915(obj->base.dev);
 
i915_gem_object_flush_if_display(obj);
-   intel_frontbuffer_flush(to_intel_frontbuffer(fb), ORIGIN_DIRTYFB);
 
+   if (!HAS_FRONTBUFFER_RENDERING(i915))
+   return drm_atomic_helper_dirtyfb(fb, file, flags, color, clips,
+num_clips);
+
+   intel_frontbuffer_flush(to_intel_frontbuffer(fb), ORIGIN_DIRTYFB);
return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/display/intel_frontbuffer.c 
b/drivers/gpu/drm/i915/display/intel_frontbuffer.c
index e4834d84ce5e3..6be2f767a203c 100644
--- a/drivers/gpu/drm/i915/display/intel_frontbuffer.c
+++ b/drivers/gpu/drm/i915/display/intel_frontbuffer.c
@@ -91,6 +91,9 @@ static void frontbuffer_flush(struct drm_i915_private *i915,
 
trace_intel_frontbuffer_flush(frontbuffer_bits, origin);
 
+   if (!HAS_FRONTBUFFER_RENDERING(i915))
+   return;
+
might_sleep();
intel_edp_drrs_flush(i915, frontbuffer_bits);
intel_psr_flush(i915, frontbuffer_bits, origin);
@@ -179,6 +182,9 @@ void __intel_fb_invalidate(struct intel_frontbuffer *front,
 
trace_intel_frontbuffer_invalidate(frontbuffer_bits, origin);
 
+   if (!HAS_FRONTBUFFER_RENDERING(i915))
+   return;
+
might_sleep();
intel_psr_invalidate(i915, frontbuffer_bits, origin);
intel_edp_drrs_invalidate(i915, frontbuffer_bits);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1ea27c4e94a6d..fe1dc8b7871a0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1719,6 +1719,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
 
 #define HAS_VRR(i915)  (GRAPHICS_VER(i915) >= 12)
 
+#define HAS_FRONTBUFFER_RENDERING(i915)(GRAPHICS_VER(i915) < 9)
+
 /* Only valid when HAS_DISPLAY() is true */
 #define INTEL_DISPLAY_ENABLED(dev_priv) \
(drm_WARN_ON(&(dev_priv)->drm, !HAS_DISPLAY(dev_priv)), 
!(dev_priv)->params.disable_display)
-- 
2.32.0
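
The gating added by the patch above can be modelled outside the kernel. This is only a minimal sketch: struct fake_i915, has_frontbuffer_rendering() and model_frontbuffer_flush() are hypothetical stand-ins rather than i915 symbols, and the only thing taken from the patch is the "version < 9" predicate and the early return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct drm_i915_private. */
struct fake_i915 {
	int graphics_ver;          /* what GRAPHICS_VER(i915) would report */
	unsigned int flushes_seen; /* flushes that reached the feature code */
};

/* Same predicate as the HAS_FRONTBUFFER_RENDERING() macro in the patch. */
static bool has_frontbuffer_rendering(const struct fake_i915 *i915)
{
	return i915->graphics_ver < 9;
}

/*
 * Simplified model of frontbuffer_flush(): returns true when the flush
 * was propagated to the (here imaginary) DRRS/PSR/FBC handlers.
 */
static bool model_frontbuffer_flush(struct fake_i915 *i915,
				    unsigned int frontbuffer_bits)
{
	assert(i915 != NULL);

	if (!has_frontbuffer_rendering(i915))
		return false; /* version 9 and newer: nothing is tracked */

	(void)frontbuffer_bits; /* a real driver would pass these on */
	i915->flushes_seen++;
	return true;
}
```

With graphics_ver set to 9 the call returns false and the counter stays untouched; with 8 the flush is counted as before.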

[Intel-gfx] [PATCH 2/8] drm/i915/display: Drop PSR support from HSW and BDW

At this point it is certain that HSW and BDW will never have PSR enabled
by default, so drop it from the device info and clean up the code.

v2:
- enable psr support for display 9

Cc: Gwan-gyeong Mun 
Signed-off-by: José Roberto de Souza 
---
 drivers/gpu/drm/i915/display/intel_psr.c | 97 
 drivers/gpu/drm/i915/i915_drv.h  |  2 -
 drivers/gpu/drm/i915/i915_irq.c  | 16 
 drivers/gpu/drm/i915/i915_pci.c  |  4 +-
 drivers/gpu/drm/i915/i915_reg.h  | 21 ++---
 5 files changed, 20 insertions(+), 120 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_psr.c 
b/drivers/gpu/drm/i915/display/intel_psr.c
index cade37f67f33c..3f6fb7d67f84d 100644
--- a/drivers/gpu/drm/i915/display/intel_psr.c
+++ b/drivers/gpu/drm/i915/display/intel_psr.c
@@ -364,41 +364,6 @@ void intel_psr_init_dpcd(struct intel_dp *intel_dp)
}
 }
 
-static void hsw_psr_setup_aux(struct intel_dp *intel_dp)
-{
-   struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
-   u32 aux_clock_divider, aux_ctl;
-   int i;
-   static const u8 aux_msg[] = {
-   [0] = DP_AUX_NATIVE_WRITE << 4,
-   [1] = DP_SET_POWER >> 8,
-   [2] = DP_SET_POWER & 0xff,
-   [3] = 1 - 1,
-   [4] = DP_SET_POWER_D0,
-   };
-   u32 psr_aux_mask = EDP_PSR_AUX_CTL_TIME_OUT_MASK |
-  EDP_PSR_AUX_CTL_MESSAGE_SIZE_MASK |
-  EDP_PSR_AUX_CTL_PRECHARGE_2US_MASK |
-  EDP_PSR_AUX_CTL_BIT_CLOCK_2X_MASK;
-
-   BUILD_BUG_ON(sizeof(aux_msg) > 20);
-   for (i = 0; i < sizeof(aux_msg); i += 4)
-   intel_de_write(dev_priv,
-  EDP_PSR_AUX_DATA(intel_dp->psr.transcoder, i >> 2),
-  intel_dp_pack_aux(&aux_msg[i], sizeof(aux_msg) - i));
-
-   aux_clock_divider = intel_dp->get_aux_clock_divider(intel_dp, 0);
-
-   /* Start with bits set for DDI_AUX_CTL register */
-   aux_ctl = intel_dp->get_aux_send_ctl(intel_dp, sizeof(aux_msg),
-aux_clock_divider);
-
-   /* Select only valid bits for SRD_AUX_CTL */
-   aux_ctl &= psr_aux_mask;
-   intel_de_write(dev_priv, EDP_PSR_AUX_CTL(intel_dp->psr.transcoder),
-  aux_ctl);
-}
-
 static void intel_psr_enable_sink(struct intel_dp *intel_dp)
 {
struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
@@ -621,9 +586,7 @@ static void hsw_activate_psr2(struct intel_dp *intel_dp)
 static bool
 transcoder_has_psr2(struct drm_i915_private *dev_priv, enum transcoder trans)
 {
-   if (DISPLAY_VER(dev_priv) < 9)
-   return false;
-   else if (DISPLAY_VER(dev_priv) >= 12)
+   if (DISPLAY_VER(dev_priv) >= 12)
return trans == TRANSCODER_A;
else
return trans == TRANSCODER_EDP;
@@ -1114,12 +1077,6 @@ static void intel_psr_enable_source(struct intel_dp 
*intel_dp)
enum transcoder cpu_transcoder = intel_dp->psr.transcoder;
u32 mask;
 
-   /* Only HSW and BDW have PSR AUX registers that need to be setup. SKL+
-* use hardcoded values PSR AUX transactions
-*/
-   if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
-   hsw_psr_setup_aux(intel_dp);
-
if (intel_dp->psr.psr2_enabled && DISPLAY_VER(dev_priv) == 9) {
i915_reg_t reg = CHICKEN_TRANS(cpu_transcoder);
u32 chicken = intel_de_read(dev_priv, reg);
@@ -1460,23 +1417,16 @@ static void psr_force_hw_tracking_exit(struct intel_dp 
*intel_dp)
 {
struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
 
-   if (DISPLAY_VER(dev_priv) >= 9)
-   /*
-* Display WA #0884: skl+
-* This documented WA for bxt can be safely applied
-* broadly so we can force HW tracking to exit PSR
-* instead of disabling and re-enabling.
-* Workaround tells us to write 0 to CUR_SURFLIVE_A,
-* but it makes more sense write to the current active
-* pipe.
-*/
-   intel_de_write(dev_priv, CURSURFLIVE(intel_dp->psr.pipe), 0);
-   else
-   /*
-* A write to CURSURFLIVE do not cause HW tracking to exit PSR
-* on older gens so doing the manual exit instead.
-*/
-   intel_psr_exit(intel_dp);
+   /*
+* Display WA #0884: skl+
+* This documented WA for bxt can be safely applied
+* broadly so we can force HW tracking to exit PSR
+* instead of disabling and re-enabling.
+* Workaround tells us to write 0 to CUR_SURFLIVE_A,
+* but it makes more sense write to the current active
+* pipe.
+*/
+   intel_de_write(dev_priv, CURSURFLIVE(intel_dp->psr.pipe), 0);
 }
 
 void intel_psr2_program_plane_sel_fetch(struct intel_plane

[Intel-gfx] [PATCH 0/8] Drop frontbuffer rendering support from Skylake and newer

This will break some IGT tests. In
https://patchwork.freedesktop.org/series/93764/
I fixed the ones that are part of the fast-feedback test list, but there
will probably be more tests needing fixes.

The first patch was also sent separately to intel-gfx and dri-devel.

Cc: Gwan-gyeong Mun 
Cc: Daniel Vetter 

José Roberto de Souza (8):
  drm/damage_helper: Fix handling of cursor dirty buffers
  drm/i915/display: Drop PSR support from HSW and BDW
  drm/i915/display: Move DRRS code its own file
  drm/i915/display: Some code improvements and code style fixes for DRRS
  drm/i915/display: Share code between intel_edp_drrs_flush and
invalidate
  drm/i915/display: Prepare DRRS for frontbuffer rendering drop
  drm/i915/display/skl+: Drop frontbuffer rendering support
  drm/i915/display: Drop PSR frontbuffer rendering support

 Documentation/gpu/i915.rst|  14 +-
 drivers/gpu/drm/drm_damage_helper.c   |   8 +-
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/display/intel_cursor.c   |   6 +-
 drivers/gpu/drm/i915/display/intel_ddi.c  |   1 +
 drivers/gpu/drm/i915/display/intel_display.c  |   9 +-
 .../drm/i915/display/intel_display_debugfs.c  |   3 +-
 .../drm/i915/display/intel_display_types.h|   2 -
 drivers/gpu/drm/i915/display/intel_dp.c   | 467 +-
 drivers/gpu/drm/i915/display/intel_dp.h   |  11 -
 drivers/gpu/drm/i915/display/intel_drrs.c | 450 +
 drivers/gpu/drm/i915/display/intel_drrs.h |  36 ++
 .../gpu/drm/i915/display/intel_frontbuffer.c  |   9 +-
 drivers/gpu/drm/i915/display/intel_psr.c  | 283 ++-
 drivers/gpu/drm/i915/display/intel_psr.h  |   8 +-
 drivers/gpu/drm/i915/i915_drv.h   |   4 +-
 drivers/gpu/drm/i915/i915_irq.c   |  16 -
 drivers/gpu/drm/i915/i915_pci.c   |   4 +-
 drivers/gpu/drm/i915/i915_reg.h   |  21 +-
 19 files changed, 561 insertions(+), 792 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/display/intel_drrs.c
 create mode 100644 drivers/gpu/drm/i915/display/intel_drrs.h

-- 
2.32.0

[Intel-gfx] [PATCH 3/8] drm/i915/display: Move DRRS code its own file

intel_dp.c is a 5k-line monster, so move the DRRS code out of it to
trim it down.

Cc: Jani Nikula 
Cc: Rodrigo Vivi 
Signed-off-by: José Roberto de Souza 
---
 Documentation/gpu/i915.rst|  14 +-
 drivers/gpu/drm/i915/Makefile |   1 +
 drivers/gpu/drm/i915/display/intel_ddi.c  |   1 +
 .../drm/i915/display/intel_display_debugfs.c  |   1 +
 drivers/gpu/drm/i915/display/intel_dp.c   | 467 +
 drivers/gpu/drm/i915/display/intel_dp.h   |  11 -
 drivers/gpu/drm/i915/display/intel_drrs.c | 477 ++
 drivers/gpu/drm/i915/display/intel_drrs.h |  32 ++
 .../gpu/drm/i915/display/intel_frontbuffer.c  |   1 +
 9 files changed, 521 insertions(+), 484 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/display/intel_drrs.c
 create mode 100644 drivers/gpu/drm/i915/display/intel_drrs.h

diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 204ebdaadb45a..03021dfa0dd81 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -183,25 +183,25 @@ Frame Buffer Compression (FBC)
 Display Refresh Rate Switching (DRRS)
 -
 
-.. kernel-doc:: drivers/gpu/drm/i915/display/intel_dp.c
+.. kernel-doc:: drivers/gpu/drm/i915/display/intel_drrs.c
:doc: Display Refresh Rate Switching (DRRS)
 
-.. kernel-doc:: drivers/gpu/drm/i915/display/intel_dp.c
+.. kernel-doc:: drivers/gpu/drm/i915/display/intel_drrs.c
:functions: intel_dp_set_drrs_state
 
-.. kernel-doc:: drivers/gpu/drm/i915/display/intel_dp.c
+.. kernel-doc:: drivers/gpu/drm/i915/display/intel_drrs.c
:functions: intel_edp_drrs_enable
 
-.. kernel-doc:: drivers/gpu/drm/i915/display/intel_dp.c
+.. kernel-doc:: drivers/gpu/drm/i915/display/intel_drrs.c
:functions: intel_edp_drrs_disable
 
-.. kernel-doc:: drivers/gpu/drm/i915/display/intel_dp.c
+.. kernel-doc:: drivers/gpu/drm/i915/display/intel_drrs.c
:functions: intel_edp_drrs_invalidate
 
-.. kernel-doc:: drivers/gpu/drm/i915/display/intel_dp.c
+.. kernel-doc:: drivers/gpu/drm/i915/display/intel_drrs.c
:functions: intel_edp_drrs_flush
 
-.. kernel-doc:: drivers/gpu/drm/i915/display/intel_dp.c
+.. kernel-doc:: drivers/gpu/drm/i915/display/intel_drrs.c
:functions: intel_dp_drrs_init
 
 DPIO
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 642a5b5a1b81c..c7cf4dfdc6379 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -212,6 +212,7 @@ i915-y += \
display/intel_dpio_phy.o \
display/intel_dpll.o \
display/intel_dpll_mgr.o \
+   display/intel_drrs.o \
display/intel_dsb.o \
display/intel_fb.o \
display/intel_fbc.o \
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c 
b/drivers/gpu/drm/i915/display/intel_ddi.c
index 1ef7a65feb660..828df570a4809 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -40,6 +40,7 @@
 #include "intel_dp_link_training.h"
 #include "intel_dp_mst.h"
 #include "intel_dpio_phy.h"
+#include "intel_drrs.h"
 #include "intel_dsi.h"
 #include "intel_fdi.h"
 #include "intel_fifo_underrun.h"
diff --git a/drivers/gpu/drm/i915/display/intel_display_debugfs.c 
b/drivers/gpu/drm/i915/display/intel_display_debugfs.c
index 8fdacb252bb19..b136a0fc0963b 100644
--- a/drivers/gpu/drm/i915/display/intel_display_debugfs.c
+++ b/drivers/gpu/drm/i915/display/intel_display_debugfs.c
@@ -13,6 +13,7 @@
 #include "intel_display_types.h"
 #include "intel_dmc.h"
 #include "intel_dp.h"
+#include "intel_drrs.h"
 #include "intel_fbc.h"
 #include "intel_hdcp.h"
 #include "intel_hdmi.h"
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
b/drivers/gpu/drm/i915/display/intel_dp.c
index 75d4ebc669411..10583b0aa489e 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -55,6 +55,7 @@
 #include "intel_dp_mst.h"
 #include "intel_dpio_phy.h"
 #include "intel_dpll.h"
+#include "intel_drrs.h"
 #include "intel_fifo_underrun.h"
 #include "intel_hdcp.h"
 #include "intel_hdmi.h"
@@ -1603,46 +1604,6 @@ intel_dp_compute_hdr_metadata_infoframe_sdp(struct 
intel_dp *intel_dp,
intel_hdmi_infoframe_enable(HDMI_PACKET_TYPE_GAMUT_METADATA);
 }
 
-static void
-intel_dp_drrs_compute_config(struct intel_dp *intel_dp,
-struct intel_crtc_state *pipe_config,
-int output_bpp, bool constant_n)
-{
-   struct intel_connector *intel_connector = intel_dp->attached_connector;
-   struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
-   int pixel_clock;
-
-   if (pipe_config->vrr.enable)
-   return;
-
-   /*
-* DRRS and PSR can't be enable together, so giving preference to PSR
-* as it allows more power-savings by complete shutting down display,
-* so to guarantee this, intel_dp_drrs_compute_config() must be called
-* after

[Intel-gfx] [PATCH 8/8] drm/i915/display: Drop PSR frontbuffer rendering support

After commit "drm/i915/display/skl+: Drop frontbuffer rendering
support", frontbuffer rendering is no longer supported on display 9 and
newer. As PSR is only supported by default on display 9 and newer, we
can now drop all frontbuffer rendering support from the PSR code.

Some DC3CO code was commented out with a macro because its caller is
being dropped.

Two DC3CO functions lost their callers while dropping frontbuffer
rendering, but as DC3CO is already disabled by default because it
requires fixes to its sequences, that cleanup is left to whoever fixes
DC3CO.

Cc: Gwan-gyeong Mun 
Signed-off-by: José Roberto de Souza 
---
 .../drm/i915/display/intel_display_debugfs.c  |   2 -
 .../drm/i915/display/intel_display_types.h|   2 -
 .../gpu/drm/i915/display/intel_frontbuffer.c  |   2 -
 drivers/gpu/drm/i915/display/intel_psr.c  | 186 ++
 drivers/gpu/drm/i915/display/intel_psr.h  |   8 +-
 5 files changed, 18 insertions(+), 182 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display_debugfs.c 
b/drivers/gpu/drm/i915/display/intel_display_debugfs.c
index b136a0fc0963b..64a03ae56d6fe 100644
--- a/drivers/gpu/drm/i915/display/intel_display_debugfs.c
+++ b/drivers/gpu/drm/i915/display/intel_display_debugfs.c
@@ -374,8 +374,6 @@ static int intel_psr_status(struct seq_file *m, struct 
intel_dp *intel_dp)
seq_printf(m, "Source PSR ctl: %s [0x%08x]\n",
   enableddisabled(enabled), val);
psr_source_status(intel_dp, m);
-   seq_printf(m, "Busy frontbuffer bits: 0x%08x\n",
-  psr->busy_frontbuffer_bits);
 
/*
 * SKL+ Perf counter is reset to 0 everytime DC state is entered
diff --git a/drivers/gpu/drm/i915/display/intel_display_types.h 
b/drivers/gpu/drm/i915/display/intel_display_types.h
index 6bba1bed2..a6b08032917a7 100644
--- a/drivers/gpu/drm/i915/display/intel_display_types.h
+++ b/drivers/gpu/drm/i915/display/intel_display_types.h
@@ -1512,7 +1512,6 @@ struct intel_psr {
enum transcoder transcoder;
bool active;
struct work_struct work;
-   unsigned int busy_frontbuffer_bits;
bool sink_psr2_support;
bool link_standby;
bool colorimetry_support;
@@ -1523,7 +1522,6 @@ struct intel_psr {
ktime_t last_entry_attempt;
ktime_t last_exit;
bool sink_not_reliable;
-   bool irq_aux_error;
u16 su_w_granularity;
u16 su_y_granularity;
u32 dc3co_exitline;
diff --git a/drivers/gpu/drm/i915/display/intel_frontbuffer.c 
b/drivers/gpu/drm/i915/display/intel_frontbuffer.c
index 6be2f767a203c..784aa423b84bf 100644
--- a/drivers/gpu/drm/i915/display/intel_frontbuffer.c
+++ b/drivers/gpu/drm/i915/display/intel_frontbuffer.c
@@ -96,7 +96,6 @@ static void frontbuffer_flush(struct drm_i915_private *i915,
 
might_sleep();
intel_edp_drrs_flush(i915, frontbuffer_bits);
-   intel_psr_flush(i915, frontbuffer_bits, origin);
intel_fbc_flush(i915, frontbuffer_bits, origin);
 }
 
@@ -186,7 +185,6 @@ void __intel_fb_invalidate(struct intel_frontbuffer *front,
return;
 
might_sleep();
-   intel_psr_invalidate(i915, frontbuffer_bits, origin);
intel_edp_drrs_invalidate(i915, frontbuffer_bits);
intel_fbc_invalidate(i915, frontbuffer_bits, origin);
 }
diff --git a/drivers/gpu/drm/i915/display/intel_psr.c 
b/drivers/gpu/drm/i915/display/intel_psr.c
index 3f6fb7d67f84d..8c9bd5846a8d0 100644
--- a/drivers/gpu/drm/i915/display/intel_psr.c
+++ b/drivers/gpu/drm/i915/display/intel_psr.c
@@ -224,15 +224,12 @@ void intel_psr_irq_handler(struct intel_dp *intel_dp, u32 
psr_iir)
drm_warn(&dev_priv->drm, "[transcoder %s] PSR aux error\n",
 transcoder_name(cpu_transcoder));
 
-   intel_dp->psr.irq_aux_error = true;
-
/*
 * If this interruption is not masked it will keep
 * interrupting so fast that it prevents the scheduled
 * work to run.
 * Also after a PSR error, we don't want to arm PSR
 * again so we don't care about unmask the interruption
-* or unset irq_aux_error.
 */
val = intel_de_read(dev_priv, imr_reg);
val |= EDP_PSR_ERROR(trans_shift);
@@ -614,14 +611,6 @@ static void psr2_program_idle_frames(struct intel_dp 
*intel_dp,
intel_de_write(dev_priv, EDP_PSR2_CTL(intel_dp->psr.transcoder), val);
 }
 
-static void tgl_psr2_enable_dc3co(struct intel_dp *intel_dp)
-{
-   struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
-
-   psr2_program_idle_frames(intel_dp, 0);
-   intel_display_power_set_target_dc_state(dev_priv, DC_STATE_EN_DC3CO);
-}
-
 static void tgl_psr2_disable_dc3co(struct intel_dp *intel_dp)
 {
struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
@@ -1177,7

[Intel-gfx] [PATCH 4/8] drm/i915/display: Some code improvements and code style fixes for DRRS

This started as a code style fix for lines over 100 columns but it
turned into simplifications of intel_dp_set_drrs_state().
It now receives the desired refresh rate type, high or low.

Signed-off-by: José Roberto de Souza 
---
 drivers/gpu/drm/i915/display/intel_drrs.c | 60 ---
 1 file changed, 21 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_drrs.c 
b/drivers/gpu/drm/i915/display/intel_drrs.c
index be9b6d4482f04..e96033bc6c658 100644
--- a/drivers/gpu/drm/i915/display/intel_drrs.c
+++ b/drivers/gpu/drm/i915/display/intel_drrs.c
@@ -91,7 +91,7 @@ intel_dp_drrs_compute_config(struct intel_dp *intel_dp,
  * intel_dp_set_drrs_state - program registers for RR switch to take effect
  * @dev_priv: i915 device
  * @crtc_state: a pointer to the active intel_crtc_state
- * @refresh_rate: RR to be programmed
+ * @refresh_type: high or low refresh rate to be programmed
  *
  * This function gets called when refresh rate (RR) has to be changed from
  * one frequency to another. Switches can be between high and low RR
@@ -102,19 +102,13 @@ intel_dp_drrs_compute_config(struct intel_dp *intel_dp,
  */
 static void intel_dp_set_drrs_state(struct drm_i915_private *dev_priv,
const struct intel_crtc_state *crtc_state,
-   int refresh_rate)
+   enum drrs_refresh_rate_type refresh_type)
 {
struct intel_dp *intel_dp = dev_priv->drrs.dp;
struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
-   enum drrs_refresh_rate_type index = DRRS_HIGH_RR;
+   struct drm_display_mode *mode;
 
-   if (refresh_rate <= 0) {
-   drm_dbg_kms(&dev_priv->drm,
-   "Refresh rate should be positive non-zero.\n");
-   return;
-   }
-
-   if (intel_dp == NULL) {
+   if (!intel_dp) {
drm_dbg_kms(&dev_priv->drm, "DRRS not supported.\n");
return;
}
@@ -130,15 +124,8 @@ static void intel_dp_set_drrs_state(struct 
drm_i915_private *dev_priv,
return;
}
 
-   if (drm_mode_vrefresh(intel_dp->attached_connector->panel.downclock_mode) ==
-   refresh_rate)
-   index = DRRS_LOW_RR;
-
-   if (index == dev_priv->drrs.refresh_rate_type) {
-   drm_dbg_kms(&dev_priv->drm,
-   "DRRS requested for previously set RR...ignoring\n");
+   if (refresh_type == dev_priv->drrs.refresh_rate_type)
return;
-   }
 
if (!crtc_state->hw.active) {
drm_dbg_kms(&dev_priv->drm,
@@ -147,7 +134,7 @@ static void intel_dp_set_drrs_state(struct drm_i915_private 
*dev_priv,
}
 
if (DISPLAY_VER(dev_priv) >= 8 && !IS_CHERRYVIEW(dev_priv)) {
-   switch (index) {
+   switch (refresh_type) {
case DRRS_HIGH_RR:
intel_dp_set_m_n(crtc_state, M1_N1);
break;
@@ -164,7 +151,7 @@ static void intel_dp_set_drrs_state(struct drm_i915_private 
*dev_priv,
u32 val;
 
val = intel_de_read(dev_priv, reg);
-   if (index > DRRS_HIGH_RR) {
+   if (refresh_type == DRRS_LOW_RR) {
if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
val |= PIPECONF_EDP_RR_MODE_SWITCH_VLV;
else
@@ -178,10 +165,14 @@ static void intel_dp_set_drrs_state(struct 
drm_i915_private *dev_priv,
intel_de_write(dev_priv, reg, val);
}
 
-   dev_priv->drrs.refresh_rate_type = index;
+   dev_priv->drrs.refresh_rate_type = refresh_type;
 
+   if (refresh_type == DRRS_LOW_RR)
+   mode = intel_dp->attached_connector->panel.downclock_mode;
+   else
+   mode = intel_dp->attached_connector->panel.fixed_mode;
drm_dbg_kms(&dev_priv->drm, "eDP Refresh Rate set to : %dHz\n",
-   refresh_rate);
+   drm_mode_vrefresh(mode));
 }
 
 static void
@@ -229,13 +220,7 @@ intel_edp_drrs_disable_locked(struct intel_dp *intel_dp,
 {
struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
 
-   if (dev_priv->drrs.refresh_rate_type == DRRS_LOW_RR) {
-   int refresh;
-
-   refresh = drm_mode_vrefresh(intel_dp->attached_connector->panel.fixed_mode);
-   intel_dp_set_drrs_state(dev_priv, crtc_state, refresh);
-   }
-
+   intel_dp_set_drrs_state(dev_priv, crtc_state, DRRS_HIGH_RR);
dev_priv->drrs.dp = NULL;
 }
 
@@ -303,6 +288,7 @@ static void intel_edp_drrs_downclock_work(struct 
work_struct *work)
struct drm_i915_private *dev_priv =
container_of(work, typeof(*dev_priv), drrs.work.work);
struct intel_dp *intel_dp;
+   struct drm_crtc *crtc;
 
mutex_lock(&dev_priv->drrs.mutex);
 
@@ -319,12 +305,8 @@ static void

[Intel-gfx] [PATCH 1/8] drm/damage_helper: Fix handling of cursor dirty buffers

Cursors don't have a framebuffer, so the fb comparison was always
failing and the atomic state was being committed without any plane state.

So also check whether the objects match when checking cursors.

Fixes: b9fc5e01d1ce ("drm: Add helper to implement legacy dirtyfb")
Cc: Daniel Vetter 
Cc: Rob Clark 
Cc: Deepak Rawat 
Cc: Gwan-gyeong Mun 
Signed-off-by: José Roberto de Souza 
---
 drivers/gpu/drm/drm_damage_helper.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_damage_helper.c 
b/drivers/gpu/drm/drm_damage_helper.c
index 8eeff0c7bdd47..595187d97c131 100644
--- a/drivers/gpu/drm/drm_damage_helper.c
+++ b/drivers/gpu/drm/drm_damage_helper.c
@@ -157,12 +157,18 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb,
 retry:
drm_for_each_plane(plane, fb->dev) {
struct drm_plane_state *plane_state;
+   bool match;
 
ret = drm_modeset_lock(&plane->mutex, state->acquire_ctx);
if (ret)
goto out;
 
-   if (plane->state->fb != fb) {
+   match = plane->state->fb == fb;
+   /* Check if objs match to handle dirty buffers of cursors */
+   if (plane->type == DRM_PLANE_TYPE_CURSOR && plane->state->fb)
+   match |= fb->obj[0] == plane->state->fb->obj[0];
+
+   if (!match) {
drm_modeset_unlock(&plane->mutex);
continue;
}
-- 
2.32.0
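
The matching rule this patch introduces can be tried in isolation. The following is a minimal sketch under stated assumptions: struct fake_fb, struct fake_plane and plane_shows_fb() are hypothetical stand-ins for the DRM types, and only the comparison logic (pointer match, or for cursor planes a match on the first backing object) is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced versions of the DRM structures. */
enum fake_plane_type { FAKE_PLANE_PRIMARY, FAKE_PLANE_CURSOR };

struct fake_fb {
	const void *obj0; /* stands in for fb->obj[0] */
};

struct fake_plane {
	enum fake_plane_type type;
	const struct fake_fb *fb; /* stands in for plane->state->fb */
};

/* Returns true when a dirtyfb call on @fb should touch @plane. */
static bool plane_shows_fb(const struct fake_plane *plane,
			   const struct fake_fb *fb)
{
	bool match;

	assert(plane != NULL && fb != NULL);

	match = plane->fb == fb;
	/* Cursors: also accept a different fb wrapping the same object. */
	if (plane->type == FAKE_PLANE_CURSOR && plane->fb)
		match |= fb->obj0 == plane->fb->obj0;

	return match;
}
```

A cursor plane whose framebuffer wraps the same object as the dirty framebuffer is matched, while a primary plane in the same situation is still skipped.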

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/damage_helper: Fix handling of cursor dirty buffers

== Series Details ==

Series: drm/damage_helper: Fix handling of cursor dirty buffers
URL   : https://patchwork.freedesktop.org/series/93765/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10491 -> Patchwork_20839


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/index.html

Known issues


  Here are the changes found in Patchwork_20839 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@cs-gfx:
- fi-rkl-guc: NOTRUN -> [SKIP][1] ([fdo#109315]) +17 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/fi-rkl-guc/igt@amdgpu/amd_ba...@cs-gfx.html
- fi-kbl-soraka:  NOTRUN -> [SKIP][2] ([fdo#109271]) +20 similar issues
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/fi-kbl-soraka/igt@amdgpu/amd_ba...@cs-gfx.html

  * igt@gem_huc_copy@huc-copy:
- fi-kbl-soraka:  NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#2190])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/fi-kbl-soraka/igt@gem_huc_c...@huc-copy.html

  * igt@i915_selftest@live@gt_pm:
- fi-kbl-soraka:  NOTRUN -> [DMESG-FAIL][4] ([i915#1886] / [i915#2291])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html

  * igt@kms_chamelium@common-hpd-after-suspend:
- fi-kbl-soraka:  NOTRUN -> [SKIP][5] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/fi-kbl-soraka/igt@kms_chamel...@common-hpd-after-suspend.html

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
- fi-kbl-soraka:  NOTRUN -> [SKIP][6] ([fdo#109271] / [i915#533])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/fi-kbl-soraka/igt@kms_pipe_crc_ba...@compare-crc-sanitycheck-pipe-d.html

  
 Possible fixes 

  * igt@core_hotunplug@unbind-rebind:
- fi-rkl-guc: [DMESG-WARN][7] ([i915#3925]) -> [PASS][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-rkl-guc/igt@core_hotunp...@unbind-rebind.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/fi-rkl-guc/igt@core_hotunp...@unbind-rebind.html
- fi-ilk-650: [DMESG-WARN][9] ([i915#164]) -> [PASS][10] +1 similar 
issue
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-ilk-650/igt@core_hotunp...@unbind-rebind.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/fi-ilk-650/igt@core_hotunp...@unbind-rebind.html

  * igt@gem_exec_suspend@basic-s0:
- fi-kbl-soraka:  [INCOMPLETE][11] ([i915#155]) -> [PASS][12]
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-kbl-soraka/igt@gem_exec_susp...@basic-s0.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/fi-kbl-soraka/igt@gem_exec_susp...@basic-s0.html

  * igt@i915_selftest@live@execlists:
- {fi-tgl-dsi}:   [DMESG-FAIL][13] ([i915#1993]) -> [PASS][14]
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-tgl-dsi/igt@i915_selftest@l...@execlists.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/fi-tgl-dsi/igt@i915_selftest@l...@execlists.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#155]: https://gitlab.freedesktop.org/drm/intel/issues/155
  [i915#164]: https://gitlab.freedesktop.org/drm/intel/issues/164
  [i915#1886]: https://gitlab.freedesktop.org/drm/intel/issues/1886
  [i915#1993]: https://gitlab.freedesktop.org/drm/intel/issues/1993
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2291]: https://gitlab.freedesktop.org/drm/intel/issues/2291
  [i915#3925]: https://gitlab.freedesktop.org/drm/intel/issues/3925
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533


Participating hosts (36 -> 33)
--

  Missing(3): fi-bdw-samus fi-bsw-cyan fi-apl-guc 


Build changes
-

  * Linux: CI_DRM_10491 -> Patchwork_20839

  CI-20190529: 20190529
  CI_DRM_10491: efa09f306ade4b8550404d7248ac743fc0cb2c7d @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6177: f474644e7226dd319195ca03b3cde82ad10ac54c @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20839: 8ccb68e74c2246f70ce5216c06afed3e88e804f4 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

8ccb68e74c22 drm/damage_helper: Fix handling of cursor dirty buffers

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20839/index.html

Re: [Intel-gfx] [PATCH 1/1] drm/i915: Check if engine has heartbeat when closing a context

2021-08-17 Thread John Harrison


On 8/9/2021 23:36, Daniel Vetter wrote:

On Mon, Aug 09, 2021 at 04:12:52PM -0700, John Harrison wrote:

On 8/6/2021 12:46, Daniel Vetter wrote:

Seen this fly by and figured I dropped a few thoughts in here. At the
likely cost of looking a bit out of whack :-)

On Fri, Aug 6, 2021 at 8:01 PM John Harrison  wrote:

On 8/2/2021 02:40, Tvrtko Ursulin wrote:

On 30/07/2021 19:13, John Harrison wrote:

On 7/30/2021 02:49, Tvrtko Ursulin wrote:

On 30/07/2021 01:13, John Harrison wrote:

On 7/28/2021 17:34, Matthew Brost wrote:

If an engine associated with a context does not have a heartbeat,
ban it
immediately. This is needed for GuC submission as a idle pulse
doesn't
kick the context off the hardware where it then can check for a
heartbeat and ban the context.

Pulse, that is a request with I915_PRIORITY_BARRIER, does not
preempt a running normal priority context?

Why does it matter then whether or not heartbeats are enabled - when
heartbeat just ends up sending the same engine pulse (eventually,
with raising priority)?

The point is that the pulse is pointless. See the rest of my comments
below, specifically "the context will get resubmitted to the hardware
after the pulse completes". To re-iterate...

Yes, it preempts the context. Yes, it does so whether heartbeats are
enabled or not. But so what? Who cares? You have preempted a context.
It is no longer running on the hardware. BUT IT IS STILL A VALID
CONTEXT.

It is valid yes, and it even may be the current ABI so another
question is whether it is okay to change that.


The backend scheduler will just resubmit it to the hardware as soon
as the pulse completes. The only reason this works at all is because
of the horrid hack in the execlist scheduler's back end
implementation (in __execlists_schedule_in):
   if (unlikely(intel_context_is_closed(ce) &&
                !intel_engine_has_heartbeat(engine)))
           intel_context_set_banned(ce);

Right, is the above code then needed with this patch - when ban is
immediately applied on the higher level?


The actual back end scheduler is saying "Is this a zombie context? Is
the heartbeat disabled? Then ban it". No other scheduler backend is
going to have knowledge of zombie context status or of the heartbeat
status. Nor are they going to call back into the higher levels of the
i915 driver to trigger a ban operation. Certainly a hardware
implemented scheduler is not going to be looking at private i915
driver information to decide whether to submit a context or whether
to tell the OS to kill it off instead.

For persistence to work with a hardware scheduler (or a non-Intel
specific scheduler such as the DRM one), the handling of zombie
contexts, banning, etc. *must* be done entirely in the front end. It
cannot rely on any backend hacks. That means you can't rely on any
fancy behaviour of pulses.

If you want to ban a context then you must explicitly ban that
context. If you want to ban it at some later point then you need to
track it at the top level as a zombie and then explicitly ban that
zombie at whatever later point.
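
To make the front-end-only handling described above concrete, here is a rough sketch. Everything in it is hypothetical (struct fake_engine, struct fake_context, frontend_close_context() and frontend_ban_zombie() are not i915 functions); it only illustrates the rule "ban immediately if any engine has no heartbeat, otherwise track the context as a zombie and ban it explicitly later".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical models; none of these are real i915 types. */
struct fake_engine {
	bool has_heartbeat;
};

struct fake_context {
	const struct fake_engine *engines;
	size_t num_engines;
	bool closed;
	bool zombie; /* closed, still allowed to run, tracked by the front end */
	bool banned;
};

/* Called when userspace closes the context; no backend involvement. */
static void frontend_close_context(struct fake_context *ctx)
{
	size_t i;

	assert(ctx != NULL);
	ctx->closed = true;

	for (i = 0; i < ctx->num_engines; i++) {
		if (!ctx->engines[i].has_heartbeat) {
			/* Nothing would ever kick it off later: ban now. */
			ctx->banned = true;
			return;
		}
	}

	/* Heartbeats exist everywhere: keep it as a zombie for now. */
	ctx->zombie = true;
}

/* Called at whatever later point the front end decides to act. */
static void frontend_ban_zombie(struct fake_context *ctx)
{
	assert(ctx != NULL);

	if (ctx->zombie) {
		ctx->zombie = false;
		ctx->banned = true;
	}
}
```

In this model the scheduler backend never inspects the closed or heartbeat state, which is the property being argued for.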

I am still trying to understand it all. If I go by the commit message:

"""
This is needed for GuC submission as a idle pulse doesn't
kick the context off the hardware where it then can check for a
heartbeat and ban the context.
"""

That did not explain things for me. Sentence does not appear to make
sense. Now, it seems "kick off the hardware" is meant as revoke and
not just preempt. Which is fine, perhaps just needs to be written more
explicitly. But the part of checking for heartbeat after idle pulse
does not compute for me. It is the heartbeat which emits idle pulses,
not idle pulse emitting heartbeats.

I am in agreement that the commit message is confusing and does not
explain either the problem or the solution.



But anyway, I can buy the handling at the front end story completely.
It makes sense. We just need to agree that a) it is okay to change the
ABI and b) remove the backend check from execlists if it is not needed
any longer.

And if ABI change is okay then commit message needs to talk about it
loudly and clearly.

I don't think we have a choice. The current ABI is not and cannot ever
be compatible with any scheduler external to i915. It cannot be
implemented with a hardware scheduler such as the GuC and it cannot be
implemented with an external software scheduler such as the DRM one.

So generally on Linux we implement helper libraries, which means
massive flexibility everywhere.

https://blog.ffwll.ch/2016/12/midlayers-once-more-with-feeling.html

So it shouldn't be an insurmountable problem to make this happen even
with drm/scheduler, we can patch it up.

Whether that's justified is another question.

Helper libraries won't work with a hardware scheduler.

Hm I guess I misunderstood then what exactly the hold-up is. This entire
discussion feels at least a bit like "heartbeat is unchangeable and guc
must fit", which is pretty much the midlayer

[Intel-gfx] ✗ Fi.CI.IGT: failure for drm + usb-type-c: Add support for out-of-band hotplug notification (v4 resend)

== Series Details ==

Series: drm + usb-type-c: Add support for out-of-band hotplug notification (v4 
resend)
URL   : https://patchwork.freedesktop.org/series/93762/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10491_full -> Patchwork_20838_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20838_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20838_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20838_full:

### IGT changes ###

 Possible regressions 

  * igt@i915_pm_backlight@bad-brightness:
- shard-iclb: [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-iclb4/igt@i915_pm_backli...@bad-brightness.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/shard-iclb4/igt@i915_pm_backli...@bad-brightness.html

  * igt@sysfs_heartbeat_interval@mixed@vcs0:
- shard-skl:  [PASS][3] -> [WARN][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-skl2/igt@sysfs_heartbeat_interval@mi...@vcs0.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/shard-skl9/igt@sysfs_heartbeat_interval@mi...@vcs0.html

  
Known issues


  Here are the changes found in Patchwork_20838_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@feature_discovery@display-2x:
- shard-iclb: NOTRUN -> [SKIP][5] ([i915#1839])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/shard-iclb6/igt@feature_discov...@display-2x.html

  * igt@gem_ctx_isolation@preservation-s3@rcs0:
- shard-apl:  NOTRUN -> [DMESG-WARN][6] ([i915#180]) +1 similar 
issue
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/shard-apl6/igt@gem_ctx_isolation@preservation...@rcs0.html

  * igt@gem_ctx_isolation@preservation-s3@vcs0:
- shard-kbl:  [PASS][7] -> [DMESG-WARN][8] ([i915#180]) +4 similar 
issues
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-kbl6/igt@gem_ctx_isolation@preservation...@vcs0.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/shard-kbl6/igt@gem_ctx_isolation@preservation...@vcs0.html

  * igt@gem_ctx_persistence@idempotent:
- shard-snb:  NOTRUN -> [SKIP][9] ([fdo#109271] / [i915#1099])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/shard-snb6/igt@gem_ctx_persiste...@idempotent.html

  * igt@gem_ctx_shared@q-in-order:
- shard-snb:  NOTRUN -> [SKIP][10] ([fdo#109271]) +222 similar 
issues
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/shard-snb6/igt@gem_ctx_sha...@q-in-order.html

  * igt@gem_eio@unwedge-stress:
- shard-tglb: [PASS][11] -> [TIMEOUT][12] ([i915#2369] / 
[i915#3063] / [i915#3648])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-tglb6/igt@gem_...@unwedge-stress.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/shard-tglb3/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-none-rrul@rcs0:
- shard-kbl:  [PASS][13] -> [FAIL][14] ([i915#2842])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-kbl6/igt@gem_exec_fair@basic-none-r...@rcs0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/shard-kbl4/igt@gem_exec_fair@basic-none-r...@rcs0.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
- shard-tglb: [PASS][15] -> [FAIL][16] ([i915#2842])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-tglb7/igt@gem_exec_fair@basic-none-sh...@rcs0.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/shard-tglb3/igt@gem_exec_fair@basic-none-sh...@rcs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-glk:  [PASS][17] -> [FAIL][18] ([i915#2842])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-glk5/igt@gem_exec_fair@basic-throt...@rcs0.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/shard-glk4/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_huc_copy@huc-copy:
- shard-kbl:  NOTRUN -> [SKIP][19] ([fdo#109271] / [i915#2190])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/shard-kbl4/igt@gem_huc_c...@huc-copy.html

  * igt@gem_mmap_gtt@cpuset-big-copy-xy:
- shard-iclb: [PASS][20] -> [FAIL][21] ([i915#307])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-iclb4/igt@gem_mmap_...@cpuset-big-copy-xy.html
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/shard-iclb4/igt@gem_mmap_...@cpuset-big-copy-xy.html

  *

Re: [Intel-gfx] [PATCH 0/1] Fix gem_ctx_persistence failures with GuC submission

2021-08-17 Thread John Harrison


On 8/9/2021 23:38, Daniel Vetter wrote:

On Wed, Jul 28, 2021 at 05:33:59PM -0700, Matthew Brost wrote:

Should fix below failures with GuC submission for the following tests:
gem_exec_balancer --r noheartbeat
gem_ctx_persistence --r heartbeat-close

Not going to fix:
gem_ctx_persistence --r heartbeat-many
gem_ctx_persistence --r heartbeat-stop

After looking at that big thread and being very confused: Are we fixing an
actual use-case here, or is this another case of blindly following igts
tests just because they exist?

My understanding is that this is established behaviour and therefore 
must be maintained because the UAPI (whether documented or not) is 
inviolate. Therefore IGTs have been written to validate this past 
behaviour and now we must conform to the IGTs in order to keep the 
existing behaviour unchanged.


Whether anybody actually makes use of this behaviour or not is another 
matter entirely. I am certainly not aware of any vital use case. Others 
might have more recollection. I do know that we tell the UMD teams to 
explicitly disable persistence on every context they create.




I'm leaning towards that we should stall on this, and first document what
exactly is the actual intention behind all this, and then fix up the tests

I'm not sure there ever was an 'intention'. The rumour I heard way back 
when was that persistence was a bug on earlier platforms (or possibly we 
didn't have hardware support for doing engine resets?). But once the bug 
was realised (or the hardware support was added), it was too late to 
change the default behaviour because existing kernel behaviour must 
never change on pain of painful things. Thus the persistence flag was 
added so that people could opt out of the broken, leaky behaviour and 
have their contexts clean up properly.


Feel free to document what you believe should be the behaviour from a 
software architect point of view. Any documentation I produce is 
basically going to be created by reverse engineering the existing code. 
That is the only 'spec' that I am aware of and as I keep saying, I 
personally think it is a totally broken concept that should just be removed.



to match (if needed). And only then fix up GuC to match whatever we
actually want to do.

I also still maintain there is no 'fix up the GuC'. This is not 
behaviour we should be adding to a hardware scheduler. It is behaviour 
that should be implemented at the front end not the back end. If we 
absolutely need to do this then we need to do it solely at the context 
management level not at the back end submission level. And the solution 
should work by default on any submission back end.


John.



-Daniel


As the above tests change the heartbeat value to 0 (off) after the
context is closed and we have no way to detect that with GuC submission
unless we keep a list of closed but running contexts which seems like
overkill for a non-real world use case. We likely should just skip these
tests with GuC submission.

Signed-off-by: Matthew Brost 

Matthew Brost (1):
   drm/i915: Check if engine has heartbeat when closing a context

  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  5 +++--
  drivers/gpu/drm/i915/gt/intel_context_types.h |  2 ++
  drivers/gpu/drm/i915/gt/intel_engine.h| 21 ++-
  .../drm/i915/gt/intel_execlists_submission.c  | 14 +
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  6 +-
  .../gpu/drm/i915/gt/uc/intel_guc_submission.h |  2 --
  6 files changed, 26 insertions(+), 24 deletions(-)

--
2.28.0

[Intel-gfx] [PATCH] drm/i915: Use designated initializers for init/exit table

2021-08-17 Thread Kees Cook

The kernel builds with -Werror=designated-init, and __designated_init
is used by CONFIG_GCC_PLUGIN_RANDSTRUCT for automatically selected (all
function pointer) structures. Include the field names in the init/exit
table. Avoids warnings like:

drivers/gpu/drm/i915/i915_module.c:59:4: error: positional initialization of 
field in 'struct' declared with 'designated_init' attribute 
[-Werror=designated-init]

Cc: Jani Nikula 
Cc: Joonas Lahtinen 
Cc: Rodrigo Vivi 
Cc: David Airlie 
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Fixes: a04ea6ae7c67 ("drm/i915: Use a table for i915_init/exit (v2)")
Signed-off-by: Kees Cook 
---
 drivers/gpu/drm/i915/i915_module.c | 37 +++---
 1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_module.c b/drivers/gpu/drm/i915/i915_module.c
index c578ea8f56a0..d8b4482c69d0 100644
--- a/drivers/gpu/drm/i915/i915_module.c
+++ b/drivers/gpu/drm/i915/i915_module.c
@@ -47,19 +47,30 @@ static const struct {
int (*init)(void);
void (*exit)(void);
 } init_funcs[] = {
-   { i915_check_nomodeset, NULL },
-   { i915_active_module_init, i915_active_module_exit },
-   { i915_buddy_module_init, i915_buddy_module_exit },
-   { i915_context_module_init, i915_context_module_exit },
-   { i915_gem_context_module_init, i915_gem_context_module_exit },
-   { i915_objects_module_init, i915_objects_module_exit },
-   { i915_request_module_init, i915_request_module_exit },
-   { i915_scheduler_module_init, i915_scheduler_module_exit },
-   { i915_vma_module_init, i915_vma_module_exit },
-   { i915_mock_selftests, NULL },
-   { i915_pmu_init, i915_pmu_exit },
-   { i915_register_pci_driver, i915_unregister_pci_driver },
-   { i915_perf_sysctl_register, i915_perf_sysctl_unregister },
+   { .init = i915_check_nomodeset },
+   { .init = i915_active_module_init,
+ .exit = i915_active_module_exit },
+   { .init = i915_buddy_module_init,
+ .exit = i915_buddy_module_exit },
+   { .init = i915_context_module_init,
+ .exit = i915_context_module_exit },
+   { .init = i915_gem_context_module_init,
+ .exit = i915_gem_context_module_exit },
+   { .init = i915_objects_module_init,
+ .exit = i915_objects_module_exit },
+   { .init = i915_request_module_init,
+ .exit = i915_request_module_exit },
+   { .init = i915_scheduler_module_init,
+ .exit = i915_scheduler_module_exit },
+   { .init = i915_vma_module_init,
+ .exit = i915_vma_module_exit },
+   { .init = i915_mock_selftests },
+   { .init = i915_pmu_init,
+ .exit = i915_pmu_exit },
+   { .init = i915_register_pci_driver,
+ .exit = i915_unregister_pci_driver },
+   { .init = i915_perf_sysctl_register,
+ .exit = i915_perf_sysctl_unregister },
 };
 static int init_progress;
 
-- 
2.30.2
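
For readers unfamiliar with the pattern, the kind of table the patch above touches can be sketched in standalone C. This is a minimal illustration only: the step_* hooks, sketch_funcs and sketch_module_init are hypothetical stand-ins, and the unwind loop is an assumption about how such a table is typically consumed, not a copy of the i915 code. With designated initializers an entry without an exit hook simply omits .exit, which is then NULL, and the entries no longer depend on the order of the struct fields (which matters if the layout can be randomized).

```c
#include <assert.h>
#include <stddef.h>

/* Records which hooks ran, so the ordering can be inspected. */
static int sketch_log[8];
static int sketch_log_len;

static void note(int id)
{
	assert(sketch_log_len < 8);
	sketch_log[sketch_log_len++] = id;
}

/* Hypothetical init/exit hooks; step_c_init fails on purpose. */
static int step_a_init(void) { note(1); return 0; }
static void step_a_exit(void) { note(-1); }
static int step_b_init(void) { note(2); return 0; }
static int step_c_init(void) { note(3); return -1; }
static void step_c_exit(void) { note(-3); }

static const struct {
	int (*init)(void);
	void (*exit)(void);
} sketch_funcs[] = {
	{ .init = step_a_init, .exit = step_a_exit },
	{ .init = step_b_init },	/* .exit is implicitly NULL */
	{ .init = step_c_init, .exit = step_c_exit },
};

#define SKETCH_COUNT (sizeof(sketch_funcs) / sizeof(sketch_funcs[0]))

/* Run each init in order; on failure, unwind the completed steps. */
int sketch_module_init(void)
{
	size_t i;

	for (i = 0; i < SKETCH_COUNT; i++) {
		int err = sketch_funcs[i].init();

		if (err) {
			while (i--) {
				if (sketch_funcs[i].exit)
					sketch_funcs[i].exit();
			}
			return err;
		}
	}
	return 0;
}
```

Calling sketch_module_init() here runs the three init hooks in order, then unwinds only step_a_exit, since step b has no exit hook and step c never completed.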

[Intel-gfx] ✓ Fi.CI.IGT: success for GPD Win Max display fixes (rev4)

== Series Details ==

Series: GPD Win Max display fixes (rev4)
URL   : https://patchwork.freedesktop.org/series/90483/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10491_full -> Patchwork_20837_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Known issues


  Here are the changes found in Patchwork_20837_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@feature_discovery@display-2x:
- shard-iclb: NOTRUN -> [SKIP][1] ([i915#1839])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-iclb3/igt@feature_discov...@display-2x.html

  * igt@gem_create@create-massive:
- shard-apl:  NOTRUN -> [DMESG-WARN][2] ([i915#3002])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-apl8/igt@gem_cre...@create-massive.html

  * igt@gem_ctx_persistence@engines-mixed:
- shard-snb:  NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#1099]) +3 
similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-snb5/igt@gem_ctx_persiste...@engines-mixed.html

  * igt@gem_eio@in-flight-suspend:
- shard-apl:  [PASS][4] -> [DMESG-WARN][5] ([i915#180]) +1 similar 
issue
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-apl8/igt@gem_...@in-flight-suspend.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-apl8/igt@gem_...@in-flight-suspend.html

  * igt@gem_eio@unwedge-stress:
- shard-tglb: [PASS][6] -> [TIMEOUT][7] ([i915#2369] / [i915#3063] 
/ [i915#3648])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-tglb6/igt@gem_...@unwedge-stress.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-tglb6/igt@gem_...@unwedge-stress.html
- shard-iclb: [PASS][8] -> [TIMEOUT][9] ([i915#2369] / [i915#2481] 
/ [i915#3070])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-iclb5/igt@gem_...@unwedge-stress.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-iclb1/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-pace-solo@rcs0:
- shard-kbl:  [PASS][10] -> [FAIL][11] ([i915#2842])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-kbl4/igt@gem_exec_fair@basic-pace-s...@rcs0.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-kbl6/igt@gem_exec_fair@basic-pace-s...@rcs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-glk:  [PASS][12] -> [FAIL][13] ([i915#2842])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-glk5/igt@gem_exec_fair@basic-throt...@rcs0.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-glk4/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_huc_copy@huc-copy:
- shard-tglb: [PASS][14] -> [SKIP][15] ([i915#2190])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/shard-tglb8/igt@gem_huc_c...@huc-copy.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-tglb6/igt@gem_huc_c...@huc-copy.html
- shard-kbl:  NOTRUN -> [SKIP][16] ([fdo#109271] / [i915#2190])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-kbl3/igt@gem_huc_c...@huc-copy.html

  * igt@gem_pwrite@basic-exhaustion:
- shard-apl:  NOTRUN -> [WARN][17] ([i915#2658])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-apl6/igt@gem_pwr...@basic-exhaustion.html
- shard-tglb: NOTRUN -> [WARN][18] ([i915#2658])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-tglb5/igt@gem_pwr...@basic-exhaustion.html

  * igt@gem_render_copy@y-tiled-to-vebox-yf-tiled:
- shard-iclb: NOTRUN -> [SKIP][19] ([i915#768])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-iclb3/igt@gem_render_c...@y-tiled-to-vebox-yf-tiled.html

  * igt@gem_userptr_blits@dmabuf-sync:
- shard-tglb: NOTRUN -> [SKIP][20] ([i915#3323])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-tglb8/igt@gem_userptr_bl...@dmabuf-sync.html

  * igt@gem_userptr_blits@dmabuf-unsync:
- shard-iclb: NOTRUN -> [SKIP][21] ([i915#3297]) +1 similar issue
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-iclb3/igt@gem_userptr_bl...@dmabuf-unsync.html

  * igt@gem_userptr_blits@unsync-unmap-cycles:
- shard-tglb: NOTRUN -> [SKIP][22] ([i915#3297]) +3 similar issues
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-tglb8/igt@gem_userptr_bl...@unsync-unmap-cycles.html

  * igt@gen3_mixed_blits:
- shard-iclb: NOTRUN -> [SKIP][23] ([fdo#109289])
   [23]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/shard-iclb3/igt@gen3_mixed_blits.html

  * igt@gen3_render_tiledx_blits:
- shard-tglb: NOTRUN -> [SKIP][24]

[Intel-gfx] [PATCH] drm/damage_helper: Fix handling of cursor dirty buffers

Cursors don't have a framebuffer so the fb comparison was always
failing and atomic state was being committed without any plane state.

So also check whether the backing objects match when handling cursors.

Fixes: b9fc5e01d1ce ("drm: Add helper to implement legacy dirtyfb")
Cc: Daniel Vetter 
Cc: Rob Clark 
Cc: Deepak Rawat 
Cc: Gwan-gyeong Mun 
Signed-off-by: José Roberto de Souza 
---
 drivers/gpu/drm/drm_damage_helper.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_damage_helper.c b/drivers/gpu/drm/drm_damage_helper.c
index 8eeff0c7bdd47..595187d97c131 100644
--- a/drivers/gpu/drm/drm_damage_helper.c
+++ b/drivers/gpu/drm/drm_damage_helper.c
@@ -157,12 +157,18 @@ int drm_atomic_helper_dirtyfb(struct drm_framebuffer *fb,
 retry:
drm_for_each_plane(plane, fb->dev) {
struct drm_plane_state *plane_state;
+   bool match;
 
ret = drm_modeset_lock(&plane->mutex, state->acquire_ctx);
if (ret)
goto out;
 
-   if (plane->state->fb != fb) {
+   match = plane->state->fb == fb;
+   /* Check if objs match to handle dirty buffers of cursors */
+   if (plane->type == DRM_PLANE_TYPE_CURSOR && plane->state->fb)
+   match |= fb->obj[0] == plane->state->fb->obj[0];
+
+   if (!match) {
drm_modeset_unlock(&plane->mutex);
continue;
}
-- 
2.32.0
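
The matching rule introduced by the patch above can be isolated in a small standalone sketch. The types here (sketch_fb, sketch_plane, with obj0 standing in for fb->obj[0] and the fb member standing in for plane->state->fb) are hypothetical, cut-down stand-ins and not the DRM structures; only the decision logic is meant to mirror the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the DRM structures. */
enum sketch_plane_type { SKETCH_PLANE_PRIMARY, SKETCH_PLANE_CURSOR };

struct sketch_fb {
	const void *obj0;		/* stands in for fb->obj[0] */
};

struct sketch_plane {
	enum sketch_plane_type type;
	const struct sketch_fb *fb;	/* stands in for plane->state->fb */
};

/*
 * A plane is affected by a dirty fb if it scans out that exact fb or,
 * for cursor planes only, if its fb wraps the same backing object.
 */
bool sketch_plane_matches(const struct sketch_plane *plane,
			  const struct sketch_fb *fb)
{
	bool match;

	assert(plane != NULL && fb != NULL);

	match = plane->fb == fb;
	if (plane->type == SKETCH_PLANE_CURSOR && plane->fb)
		match = match || fb->obj0 == plane->fb->obj0;

	return match;
}
```

Under these assumptions, two distinct fb wrappers around the same object match for a cursor plane but not for a primary plane, and a cursor plane with no fb attached never matches.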

[Intel-gfx] ✓ Fi.CI.BAT: success for drm + usb-type-c: Add support for out-of-band hotplug notification (v4 resend)

== Series Details ==

Series: drm + usb-type-c: Add support for out-of-band hotplug notification (v4 
resend)
URL   : https://patchwork.freedesktop.org/series/93762/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10491 -> Patchwork_20838


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/index.html

Known issues


  Here are the changes found in Patchwork_20838 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@cs-gfx:
- fi-rkl-guc: NOTRUN -> [SKIP][1] ([fdo#109315]) +17 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/fi-rkl-guc/igt@amdgpu/amd_ba...@cs-gfx.html
- fi-kbl-soraka:  NOTRUN -> [SKIP][2] ([fdo#109271]) +10 similar issues
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/fi-kbl-soraka/igt@amdgpu/amd_ba...@cs-gfx.html

  * igt@gem_huc_copy@huc-copy:
- fi-kbl-soraka:  NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#2190])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/fi-kbl-soraka/igt@gem_huc_c...@huc-copy.html

  * igt@i915_selftest@live@gt_lrc:
- fi-rkl-guc: NOTRUN -> [DMESG-WARN][4] ([i915#3958])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html

  * igt@i915_selftest@live@gt_pm:
- fi-kbl-soraka:  NOTRUN -> [DMESG-FAIL][5] ([i915#1886] / [i915#2291])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html

  * igt@kms_chamelium@common-hpd-after-suspend:
- fi-kbl-soraka:  NOTRUN -> [SKIP][6] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/fi-kbl-soraka/igt@kms_chamel...@common-hpd-after-suspend.html

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
- fi-kbl-soraka:  NOTRUN -> [SKIP][7] ([fdo#109271] / [i915#533])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/fi-kbl-soraka/igt@kms_pipe_crc_ba...@compare-crc-sanitycheck-pipe-d.html

  * igt@runner@aborted:
- fi-bdw-5557u:   NOTRUN -> [FAIL][8] ([i915#1602] / [i915#2029])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/fi-bdw-5557u/igt@run...@aborted.html

  
 Possible fixes 

  * igt@core_hotunplug@unbind-rebind:
- fi-rkl-guc: [DMESG-WARN][9] ([i915#3925]) -> [PASS][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-rkl-guc/igt@core_hotunp...@unbind-rebind.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/fi-rkl-guc/igt@core_hotunp...@unbind-rebind.html
- fi-ilk-650: [DMESG-WARN][11] ([i915#164]) -> [PASS][12] +1 
similar issue
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-ilk-650/igt@core_hotunp...@unbind-rebind.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/fi-ilk-650/igt@core_hotunp...@unbind-rebind.html

  * igt@gem_exec_suspend@basic-s0:
- fi-kbl-soraka:  [INCOMPLETE][13] ([i915#155]) -> [PASS][14]
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-kbl-soraka/igt@gem_exec_susp...@basic-s0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/fi-kbl-soraka/igt@gem_exec_susp...@basic-s0.html

  * igt@gem_exec_suspend@basic-s3:
- fi-tgl-1115g4:  [FAIL][15] ([i915#1888]) -> [PASS][16]
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html

  * igt@i915_selftest@live@execlists:
- {fi-tgl-dsi}:   [DMESG-FAIL][17] ([i915#1993]) -> [PASS][18]
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-tgl-dsi/igt@i915_selftest@l...@execlists.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20838/fi-tgl-dsi/igt@i915_selftest@l...@execlists.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#155]: https://gitlab.freedesktop.org/drm/intel/issues/155
  [i915#1602]: https://gitlab.freedesktop.org/drm/intel/issues/1602
  [i915#164]: https://gitlab.freedesktop.org/drm/intel/issues/164
  [i915#1886]: https://gitlab.freedesktop.org/drm/intel/issues/1886
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#1993]: https://gitlab.freedesktop.org/drm/intel/issues/1993
  [i915#2029]: https://gitlab.freedesktop.org/drm/intel/issues/2029
  [i915#2190]:

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for drm + usb-type-c: Add support for out-of-band hotplug notification (v4 resend)

== Series Details ==

Series: drm + usb-type-c: Add support for out-of-band hotplug notification (v4 
resend)
URL   : https://patchwork.freedesktop.org/series/93762/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must be 1 KB"
+./drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgv_sriovmsg.h:312:49: error: static 
assertion failed: "amd_sriov_msg_vf2pf_info must

[Intel-gfx] ✓ Fi.CI.BAT: success for GPD Win Max display fixes (rev4)

== Series Details ==

Series: GPD Win Max display fixes (rev4)
URL   : https://patchwork.freedesktop.org/series/90483/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10491 -> Patchwork_20837


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/index.html

Known issues


  Here are the changes found in Patchwork_20837 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@cs-gfx:
- fi-kbl-soraka:  NOTRUN -> [SKIP][1] ([fdo#109271]) +11 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/fi-kbl-soraka/igt@amdgpu/amd_ba...@cs-gfx.html

  * igt@gem_huc_copy@huc-copy:
- fi-kbl-soraka:  NOTRUN -> [SKIP][2] ([fdo#109271] / [i915#2190])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/fi-kbl-soraka/igt@gem_huc_c...@huc-copy.html

  * igt@i915_selftest@live@gt_pm:
- fi-kbl-soraka:  NOTRUN -> [DMESG-FAIL][3] ([i915#1886] / [i915#2291])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/fi-kbl-soraka/igt@i915_selftest@live@gt_pm.html

  * igt@i915_selftest@live@workarounds:
- fi-rkl-guc: NOTRUN -> [DMESG-FAIL][4] ([i915#3928])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html

  * igt@kms_chamelium@common-hpd-after-suspend:
- fi-kbl-soraka:  NOTRUN -> [SKIP][5] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/fi-kbl-soraka/igt@kms_chamel...@common-hpd-after-suspend.html

  * igt@kms_chamelium@hdmi-hpd-fast:
- fi-icl-u2:  [PASS][6] -> [DMESG-WARN][7] ([i915#2203] / 
[i915#2868])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-icl-u2/igt@kms_chamel...@hdmi-hpd-fast.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/fi-icl-u2/igt@kms_chamel...@hdmi-hpd-fast.html

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
- fi-kbl-soraka:  NOTRUN -> [SKIP][8] ([fdo#109271] / [i915#533])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/fi-kbl-soraka/igt@kms_pipe_crc_ba...@compare-crc-sanitycheck-pipe-d.html

  
 Possible fixes 

  * igt@core_hotunplug@unbind-rebind:
- fi-rkl-guc: [DMESG-WARN][9] ([i915#3925]) -> [PASS][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-rkl-guc/igt@core_hotunp...@unbind-rebind.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/fi-rkl-guc/igt@core_hotunp...@unbind-rebind.html
- fi-ilk-650: [DMESG-WARN][11] ([i915#164]) -> [PASS][12] +1 
similar issue
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-ilk-650/igt@core_hotunp...@unbind-rebind.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/fi-ilk-650/igt@core_hotunp...@unbind-rebind.html

  * igt@gem_exec_suspend@basic-s0:
- fi-kbl-soraka:  [INCOMPLETE][13] ([i915#155]) -> [PASS][14]
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-kbl-soraka/igt@gem_exec_susp...@basic-s0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/fi-kbl-soraka/igt@gem_exec_susp...@basic-s0.html

  * igt@gem_exec_suspend@basic-s3:
- fi-tgl-1115g4:  [FAIL][15] ([i915#1888]) -> [PASS][16]
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html

  * igt@i915_selftest@live@execlists:
- {fi-tgl-dsi}:   [DMESG-FAIL][17] ([i915#1993]) -> [PASS][18]
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-tgl-dsi/igt@i915_selftest@l...@execlists.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/fi-tgl-dsi/igt@i915_selftest@l...@execlists.html

  
 Warnings 

  * igt@runner@aborted:
- fi-rkl-guc: [FAIL][19] ([i915#1602]) -> [FAIL][20] ([i915#3928])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10491/fi-rkl-guc/igt@run...@aborted.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20837/fi-rkl-guc/igt@run...@aborted.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#155]: https://gitlab.freedesktop.org/drm/intel/issues/155
  [i915#1602]: https://gitlab.freedesktop.org/drm/intel/issues/1602
  [i915#164]: https://gitlab.freedesktop.org/drm/intel/issues/164
  [i915#1886]: https://gitlab.freedesktop.org/drm/intel/issues/1886
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#1993]:

[Intel-gfx] [PATCH 8/8] usb: typec: altmodes/displayport: Notify drm subsys of hotplug events

Use the new drm_connector_oob_hotplug_event() function to let drm/kms
drivers know about DisplayPort over Type-C hotplug events.
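
The v3 changelog of this patch says the notification is only sent when the
HPD status bit changes. A minimal userspace sketch of that edge detection
follows; it is not the kernel code. The sketch_* names, the bit position of
SKETCH_STATUS_HPD and the event counter (standing in for the call to
drm_connector_oob_hotplug_event()) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the HPD bit of the status VDO (position assumed). */
#define SKETCH_STATUS_HPD (UINT32_C(1) << 7)

struct sketch_altmode {
	bool hpd;            /* last HPD state that was reported */
	unsigned int events; /* counts notifications instead of calling into drm */
};

/* Notify only when the HPD bit differs from the cached state. */
static void sketch_status_update(struct sketch_altmode *dp, uint32_t status)
{
	bool hpd = !!(status & SKETCH_STATUS_HPD);

	assert(dp);
	if (dp->hpd != hpd) {
		dp->events++; /* the patch notifies the drm connector here */
		dp->hpd = hpd;
	}
}
```

Repeated status updates with an unchanged HPD bit leave the counter alone,
so the display driver is presumably not asked to re-probe on every status
message.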

Reviewed-by: Heikki Krogerus 
Tested-by: Heikki Krogerus 
Signed-off-by: Hans de Goede 
---
Changes in v3:
- Only call drm_connector_oob_hotplug_event() on hpd status bit change
- Adjust for drm_connector_oob_hotplug_event() no longer having a data
  argument

Changes in v2:
- Add missing depends on DRM to TYPEC_DP_ALTMODE Kconfig entry
---
 drivers/usb/typec/altmodes/Kconfig   |  1 +
 drivers/usb/typec/altmodes/displayport.c | 23 +++
 2 files changed, 24 insertions(+)

diff --git a/drivers/usb/typec/altmodes/Kconfig 
b/drivers/usb/typec/altmodes/Kconfig
index 60d375e9c3c7..1a6b5e872b0d 100644
--- a/drivers/usb/typec/altmodes/Kconfig
+++ b/drivers/usb/typec/altmodes/Kconfig
@@ -4,6 +4,7 @@ menu "USB Type-C Alternate Mode drivers"
 
 config TYPEC_DP_ALTMODE
tristate "DisplayPort Alternate Mode driver"
+   depends on DRM
help
  DisplayPort USB Type-C Alternate Mode allows DisplayPort
  displays and adapters to be attached to the USB Type-C
diff --git a/drivers/usb/typec/altmodes/displayport.c 
b/drivers/usb/typec/altmodes/displayport.c
index aa669b9cf70e..c1d8c23baa39 100644
--- a/drivers/usb/typec/altmodes/displayport.c
+++ b/drivers/usb/typec/altmodes/displayport.c
@@ -11,8 +11,10 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
+#include 
 #include "displayport.h"
 
 #define DP_HEADER(_dp, ver, cmd)   (VDO((_dp)->alt->svid, 1, ver, cmd) 
\
@@ -57,11 +59,13 @@ struct dp_altmode {
struct typec_displayport_data data;
 
enum dp_state state;
+   bool hpd;
 
struct mutex lock; /* device lock */
struct work_struct work;
struct typec_altmode *alt;
const struct typec_altmode *port;
+   struct fwnode_handle *connector_fwnode;
 };
 
 static int dp_altmode_notify(struct dp_altmode *dp)
@@ -125,6 +129,7 @@ static int dp_altmode_configure(struct dp_altmode *dp, u8 
con)
 static int dp_altmode_status_update(struct dp_altmode *dp)
 {
bool configured = !!DP_CONF_GET_PIN_ASSIGN(dp->data.conf);
+   bool hpd = !!(dp->data.status & DP_STATUS_HPD_STATE);
u8 con = DP_STATUS_CONNECTION(dp->data.status);
int ret = 0;
 
@@ -137,6 +142,11 @@ static int dp_altmode_status_update(struct dp_altmode *dp)
ret = dp_altmode_configure(dp, con);
if (!ret)
dp->state = DP_STATE_CONFIGURE;
+   } else {
+   if (dp->hpd != hpd) {
+   drm_connector_oob_hotplug_event(dp->connector_fwnode);
+   dp->hpd = hpd;
+   }
}
 
return ret;
@@ -512,6 +522,7 @@ static const struct attribute_group dp_altmode_group = {
 int dp_altmode_probe(struct typec_altmode *alt)
 {
const struct typec_altmode *port = typec_altmode_get_partner(alt);
+   struct fwnode_handle *fwnode;
struct dp_altmode *dp;
int ret;
 
@@ -540,6 +551,11 @@ int dp_altmode_probe(struct typec_altmode *alt)
alt->desc = "DisplayPort";
alt->ops = &dp_altmode_ops;
 
+   fwnode = dev_fwnode(alt->dev.parent->parent); /* typec_port fwnode */
+   dp->connector_fwnode = fwnode_find_reference(fwnode, "displayport", 0);
+   if (IS_ERR(dp->connector_fwnode))
+   dp->connector_fwnode = NULL;
+
typec_altmode_set_drvdata(alt, dp);
 
dp->state = DP_STATE_ENTER;
@@ -555,6 +571,13 @@ void dp_altmode_remove(struct typec_altmode *alt)
 
sysfs_remove_group(&alt->dev.kobj, &dp_altmode_group);
cancel_work_sync(&dp->work);
+
+   if (dp->connector_fwnode) {
+   if (dp->hpd)
+   drm_connector_oob_hotplug_event(dp->connector_fwnode);
+
+   fwnode_handle_put(dp->connector_fwnode);
+   }
 }
 EXPORT_SYMBOL_GPL(dp_altmode_remove);
 
-- 
2.31.1

[Intel-gfx] [PATCH 7/8] usb: typec: altmodes/displayport: Make dp_altmode_notify() more generic

Make dp_altmode_notify() handle the dp->data.conf == 0 case too,
rather than having separate code-paths for this in various places
which call it.
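
The state selection described above can be sketched in plain C as follows.
This is a simplified userspace model, not the driver code: the sketch_*
names, the numeric state values and the bit layout of the pin assignment
field are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_STATE_USB   1u  /* hypothetical "plain USB" state value */
#define SKETCH_STATE_MODAL 2u  /* hypothetical base value of the modal states */
/* Assumed layout: pin assignment in bits 15:8 of the configuration word. */
#define SKETCH_PIN_ASSIGN(conf) (((conf) >> 8) & 0xffu)

/* Smallest order with (1 << order) >= n; returns 0 for n == 0 in this sketch. */
static unsigned int sketch_count_order(uint32_t n)
{
	unsigned int order = 0;

	assert(n <= (UINT32_C(1) << 31));
	while ((UINT32_C(1) << order) < n)
		order++;
	return order;
}

/* One helper covers both cases: conf == 0 means USB, anything else is modal. */
static unsigned long sketch_notify_state(uint32_t conf)
{
	if (!conf)
		return SKETCH_STATE_USB;

	return SKETCH_STATE_MODAL + sketch_count_order(SKETCH_PIN_ASSIGN(conf));
}
```

With the zero case handled inside the helper, callers no longer need their
own conf == 0 branch, which is the simplification this patch is after.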

Reviewed-by: Heikki Krogerus 
Tested-by: Heikki Krogerus 
Signed-off-by: Hans de Goede 
---
 drivers/usb/typec/altmodes/displayport.c | 35 +---
 1 file changed, 13 insertions(+), 22 deletions(-)

diff --git a/drivers/usb/typec/altmodes/displayport.c 
b/drivers/usb/typec/altmodes/displayport.c
index b7f094435b00..aa669b9cf70e 100644
--- a/drivers/usb/typec/altmodes/displayport.c
+++ b/drivers/usb/typec/altmodes/displayport.c
@@ -66,10 +66,17 @@ struct dp_altmode {
 
 static int dp_altmode_notify(struct dp_altmode *dp)
 {
-   u8 state = get_count_order(DP_CONF_GET_PIN_ASSIGN(dp->data.conf));
+   unsigned long conf;
+   u8 state;
+
+   if (dp->data.conf) {
+   state = get_count_order(DP_CONF_GET_PIN_ASSIGN(dp->data.conf));
+   conf = TYPEC_MODAL_STATE(state);
+   } else {
+   conf = TYPEC_STATE_USB;
+   }
 
-   return typec_altmode_notify(dp->alt, TYPEC_MODAL_STATE(state),
-  &dp->data);
+   return typec_altmode_notify(dp->alt, conf, &dp->data);
 }
 
 static int dp_altmode_configure(struct dp_altmode *dp, u8 con)
@@ -137,21 +144,10 @@ static int dp_altmode_status_update(struct dp_altmode *dp)
 
 static int dp_altmode_configured(struct dp_altmode *dp)
 {
-   int ret;
-
sysfs_notify(&dp->alt->dev.kobj, "displayport", "configuration");
-
-   if (!dp->data.conf)
-   return typec_altmode_notify(dp->alt, TYPEC_STATE_USB,
-   &dp->data);
-
-   ret = dp_altmode_notify(dp);
-   if (ret)
-   return ret;
-
sysfs_notify(&dp->alt->dev.kobj, "displayport", "pin_assignment");
 
-   return 0;
+   return dp_altmode_notify(dp);
 }
 
 static int dp_altmode_configure_vdm(struct dp_altmode *dp, u32 conf)
@@ -172,13 +168,8 @@ static int dp_altmode_configure_vdm(struct dp_altmode *dp, 
u32 conf)
}
 
ret = typec_altmode_vdm(dp->alt, header, &conf, 2);
-   if (ret) {
-   if (DP_CONF_GET_PIN_ASSIGN(dp->data.conf))
-   dp_altmode_notify(dp);
-   else
-   typec_altmode_notify(dp->alt, TYPEC_STATE_USB,
-   &dp->data);
-   }
+   if (ret)
+   dp_altmode_notify(dp);
 
return ret;
 }
-- 
2.31.1

[Intel-gfx] [PATCH 6/8] drm/i915/dp: Add support for out-of-bound hotplug events

On some Cherry Trail devices, DisplayPort over Type-C is supported through
a USB-PD microcontroller (e.g. a fusb302) + a mux to switch the superspeed
datalines between USB-3 and DP (e.g. a pi3usb30532). The kernel in this
case does the PD/alt-mode negotiation itself, rather than everything being
handled in firmware.

So the kernel itself picks an alt-mode, tells the Type-C "dongle" to switch
to DP mode and sets the mux accordingly. In this setup the HPD pin is not
connected, so the i915 driver needs to respond to a software event and scan
the DP port for changes manually.

This commit adds support for this. Together with the recent addition of
DP alt-mode support to the Type-C subsystem this makes DP over Type-C
work on these devices.
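
The "software event" handling can be modelled roughly as below: the event
source marks the affected HPD pin in a bitmask and schedules the hotplug
worker, which later picks up all pending pins in one go. This is a
userspace sketch under simplifying assumptions: the sketch_* names are
hypothetical, locking is left out entirely (the patch protects the bitmask
with a spinlock), and a flag replaces the real work queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_hotplug {
	uint32_t event_bits; /* one bit per HPD pin that needs a re-probe */
	bool work_queued;    /* stands in for queueing the hotplug work */
};

/* Producer side: mark the pin and make sure the worker will run. */
static void sketch_oob_event(struct sketch_hotplug *hp, unsigned int hpd_pin)
{
	assert(hpd_pin < 32);
	hp->event_bits |= UINT32_C(1) << hpd_pin;
	hp->work_queued = true;
}

/* Worker side: take all pending bits at once and clear them. */
static uint32_t sketch_hotplug_work(struct sketch_hotplug *hp)
{
	uint32_t pending = hp->event_bits;

	hp->event_bits = 0;
	hp->work_queued = false;
	return pending;
}
```

Several events that arrive before the worker runs collapse into a single
pass over the pending pins.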

Tested-by: Heikki Krogerus 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/i915/display/intel_dp.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
b/drivers/gpu/drm/i915/display/intel_dp.c
index 75d4ebc66941..e807ffc2d782 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -4590,6 +4590,17 @@ static int intel_dp_connector_atomic_check(struct 
drm_connector *conn,
return intel_modeset_synced_crtcs(state, conn);
 }
 
+static void intel_dp_oob_hotplug_event(struct drm_connector *connector)
+{
+   struct intel_encoder *encoder = intel_attached_encoder(to_intel_connector(connector));
+   struct drm_i915_private *i915 = to_i915(connector->dev);
+
+   spin_lock_irq(&i915->irq_lock);
+   i915->hotplug.event_bits |= BIT(encoder->hpd_pin);
+   spin_unlock_irq(&i915->irq_lock);
+   queue_delayed_work(system_wq, &i915->hotplug.hotplug_work, 0);
+}
+
 static const struct drm_connector_funcs intel_dp_connector_funcs = {
.force = intel_dp_force,
.fill_modes = drm_helper_probe_single_connector_modes,
@@ -4600,6 +4611,7 @@ static const struct drm_connector_funcs 
intel_dp_connector_funcs = {
.destroy = intel_connector_destroy,
.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
.atomic_duplicate_state = intel_digital_connector_duplicate_state,
+   .oob_hotplug_event = intel_dp_oob_hotplug_event,
 };
 
 static const struct drm_connector_helper_funcs intel_dp_connector_helper_funcs 
= {
-- 
2.31.1

[Intel-gfx] [PATCH 5/8] drm/i915: Associate ACPI connector nodes with connector entries (v2)

From: Heikki Krogerus 

On Intel platforms we know that the ACPI connector device
node order will follow the order the driver (i915) decides.
The decision is made using the custom Intel ACPI OpRegion
(intel_opregion.c), though the driver does not actually know
that the values it sends to ACPI there are used for
associating a device node for the connectors, and assigning
address for them.

In reality that custom Intel ACPI OpRegion actually violates
the ACPI specification (we supply dynamic information to
objects that are defined static, for example _ADR). However,
it makes assigning the correct connector node to a connector
entry straightforward (it's a one-to-one mapping).

Changes in v2 (Hans de Goede):
- Take a reference on the fwnode which we assign to the connector,
  for ACPI nodes this is a no-op but in the future we may see
  software-fwnodes assigned to connectors which are ref-counted.
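
The ownership rule from the v2 change (take a reference when storing the
fwnode, drop it when the connector is cleaned up) can be sketched like
this. It is a userspace illustration with hypothetical sketch_* types and a
plain integer refcount, not the kernel's fwnode implementation.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_fwnode {
	int refcount;
};

struct sketch_connector {
	struct sketch_fwnode *fwnode; /* owns one reference while non-NULL */
};

static struct sketch_fwnode *sketch_fwnode_get(struct sketch_fwnode *node)
{
	if (node)
		node->refcount++;
	return node;
}

/* Like the put described in the patch, a NULL node is a no-op. */
static void sketch_fwnode_put(struct sketch_fwnode *node)
{
	if (!node)
		return;
	assert(node->refcount > 0);
	node->refcount--;
}

/* Storing the pointer takes a reference of its own. */
static void sketch_assign_fwnode(struct sketch_connector *c,
				 struct sketch_fwnode *node)
{
	c->fwnode = sketch_fwnode_get(node);
}

/* Cleanup drops that reference and clears the pointer. */
static void sketch_connector_cleanup(struct sketch_connector *c)
{
	sketch_fwnode_put(c->fwnode);
	c->fwnode = NULL;
}
```

Because cleanup clears the pointer, calling it twice does not drop the
reference twice.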

Signed-off-by: Heikki Krogerus 
Tested-by: Heikki Krogerus 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/i915/display/intel_acpi.c| 46 
 drivers/gpu/drm/i915/display/intel_acpi.h|  3 ++
 drivers/gpu/drm/i915/display/intel_display.c |  1 +
 3 files changed, 50 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_acpi.c 
b/drivers/gpu/drm/i915/display/intel_acpi.c
index 7cfe91fc05f2..72cac55c0f0f 100644
--- a/drivers/gpu/drm/i915/display/intel_acpi.c
+++ b/drivers/gpu/drm/i915/display/intel_acpi.c
@@ -282,3 +282,49 @@ void intel_acpi_device_id_update(struct drm_i915_private 
*dev_priv)
}
drm_connector_list_iter_end(&conn_iter);
 }
+
+/* NOTE: The connector order must be final before this is called. */
+void intel_acpi_assign_connector_fwnodes(struct drm_i915_private *i915)
+{
+   struct drm_connector_list_iter conn_iter;
+   struct drm_device *drm_dev = &i915->drm;
+   struct fwnode_handle *fwnode = NULL;
+   struct drm_connector *connector;
+   struct acpi_device *adev;
+
+   drm_connector_list_iter_begin(drm_dev, &conn_iter);
+   drm_for_each_connector_iter(connector, &conn_iter) {
+   /* Always getting the next, even when the last was not used. */
+   fwnode = device_get_next_child_node(drm_dev->dev, fwnode);
+   if (!fwnode)
+   break;
+
+   switch (connector->connector_type) {
+   case DRM_MODE_CONNECTOR_LVDS:
+   case DRM_MODE_CONNECTOR_eDP:
+   case DRM_MODE_CONNECTOR_DSI:
+   /*
+* Integrated displays have a specific address 0x1f on
+* most Intel platforms, but not on all of them.
+*/
+   adev = acpi_find_child_device(ACPI_COMPANION(drm_dev->dev),
+ 0x1f, 0);
+   if (adev) {
+   connector->fwnode =
+   fwnode_handle_get(acpi_fwnode_handle(adev));
+   break;
+   }
+   fallthrough;
+   default:
+   connector->fwnode = fwnode_handle_get(fwnode);
+   break;
+   }
+   }
+   drm_connector_list_iter_end(&conn_iter);
+   /*
+* device_get_next_child_node() takes a reference on the fwnode, if
+* we stopped iterating because we are out of connectors we need to
+* put this, otherwise fwnode is NULL and the put is a no-op.
+*/
+   fwnode_handle_put(fwnode);
+}
diff --git a/drivers/gpu/drm/i915/display/intel_acpi.h 
b/drivers/gpu/drm/i915/display/intel_acpi.h
index 9f197401c313..4a760a2baed9 100644
--- a/drivers/gpu/drm/i915/display/intel_acpi.h
+++ b/drivers/gpu/drm/i915/display/intel_acpi.h
@@ -13,6 +13,7 @@ void intel_register_dsm_handler(void);
 void intel_unregister_dsm_handler(void);
 void intel_dsm_get_bios_data_funcs_supported(struct drm_i915_private *i915);
 void intel_acpi_device_id_update(struct drm_i915_private *i915);
+void intel_acpi_assign_connector_fwnodes(struct drm_i915_private *i915);
 #else
 static inline void intel_register_dsm_handler(void) { return; }
 static inline void intel_unregister_dsm_handler(void) { return; }
@@ -20,6 +21,8 @@ static inline
 void intel_dsm_get_bios_data_funcs_supported(struct drm_i915_private *i915) { 
return; }
 static inline
 void intel_acpi_device_id_update(struct drm_i915_private *i915) { return; }
+static inline
+void intel_acpi_assign_connector_fwnodes(struct drm_i915_private *i915) { 
return; }
 #endif /* CONFIG_ACPI */
 
 #endif /* __INTEL_ACPI_H__ */
diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index a257e5dc381c..88e5fff64b8c 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -12561,6 +12561,7 @@ int intel_modeset_init_nogem(struct drm_i915_private 
*i915)

[Intel-gfx] [PATCH 4/8] drm/connector: Add support for out-of-band hotplug notification (v3)

Add a new drm_connector_oob_hotplug_event() function and
oob_hotplug_event drm_connector_funcs member.

On some hardware a hotplug event notification may come from outside the
display driver / device. An example of this is some USB Type-C setups
where the hardware muxes the DisplayPort data and aux-lines but does
not pass the altmode HPD status bit to the GPU's DP HPD pin.

In cases like this the new drm_connector_oob_hotplug_event() function can
be used to report these out-of-band events.
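
As a rough illustration of the optional-callback pattern described above,
here is a small userspace sketch. The sketch_* names are hypothetical; the
fwnode lookup and the reference counting that the real function performs
are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_connector;

struct sketch_connector_funcs {
	/* Optional hook: drivers that do not need it leave this NULL. */
	void (*oob_hotplug_event)(struct sketch_connector *connector);
};

struct sketch_connector {
	const struct sketch_connector_funcs *funcs;
	int events; /* incremented by the example driver handler */
};

/* Core side: forward the event only if the driver provided a handler. */
static void sketch_oob_hotplug_event(struct sketch_connector *connector)
{
	if (!connector)
		return;

	assert(connector->funcs);
	if (connector->funcs->oob_hotplug_event)
		connector->funcs->oob_hotplug_event(connector);
}

/* Driver side: a trivial handler that just records the event. */
static void sketch_driver_handler(struct sketch_connector *connector)
{
	connector->events++;
}
```

A driver opts in by filling in the hook; connectors of drivers without the
hook silently ignore the event.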

Changes in v2:
- Make drm_connector_oob_hotplug_event() take a fwnode as argument and
  have it call drm_connector_find_by_fwnode() internally. This allows
  making drm_connector_find_by_fwnode() a drm-internal function and
  avoids code outside the drm subsystem potentially holding on to
  a drm_connector reference for a longer period.

Changes in v3:
- Drop the data argument to the drm_connector_oob_hotplug_event
  function since it is not used atm. This can be re-added later when
  a use for it actually arises.

Tested-by: Heikki Krogerus 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/drm_connector.c | 27 +++
 include/drm/drm_connector.h |  9 +
 2 files changed, 36 insertions(+)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 7d72bcefa4d6..e0a30e0ee86a 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -2595,6 +2595,33 @@ struct drm_connector 
*drm_connector_find_by_fwnode(struct fwnode_handle *fwnode)
return found;
 }
 
+/**
+ * drm_connector_oob_hotplug_event - Report out-of-band hotplug event to 
connector
+ * @connector: connector to report the event on
+ *
+ * On some hardware a hotplug event notification may come from outside the 
display
+ * driver / device. An example of this is some USB Type-C setups where the 
hardware
+ * muxes the DisplayPort data and aux-lines but does not pass the altmode HPD
+ * status bit to the GPU's DP HPD pin.
+ *
+ * This function can be used to report these out-of-band events after obtaining
+ * a drm_connector reference through calling drm_connector_find_by_fwnode().
+ */
+void drm_connector_oob_hotplug_event(struct fwnode_handle *connector_fwnode)
+{
+   struct drm_connector *connector;
+
+   connector = drm_connector_find_by_fwnode(connector_fwnode);
+   if (IS_ERR(connector))
+   return;
+
+   if (connector->funcs->oob_hotplug_event)
+   connector->funcs->oob_hotplug_event(connector);
+
+   drm_connector_put(connector);
+}
+EXPORT_SYMBOL(drm_connector_oob_hotplug_event);
+
 
 /**
  * DOC: Tile group
diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
index 8132c48b56ae..79fa34e5ccdb 100644
--- a/include/drm/drm_connector.h
+++ b/include/drm/drm_connector.h
@@ -1084,6 +1084,14 @@ struct drm_connector_funcs {
 */
void (*atomic_print_state)(struct drm_printer *p,
   const struct drm_connector_state *state);
+
+   /**
+* @oob_hotplug_event:
+*
+* This will get called when a hotplug-event for a drm-connector
+* has been received from a source outside the display driver / device.
+*/
+   void (*oob_hotplug_event)(struct drm_connector *connector);
 };
 
 /**
@@ -1666,6 +1674,7 @@ drm_connector_is_unregistered(struct drm_connector 
*connector)
DRM_CONNECTOR_UNREGISTERED;
 }
 
+void drm_connector_oob_hotplug_event(struct fwnode_handle *connector_fwnode);
 const char *drm_get_connector_type_name(unsigned int connector_type);
 const char *drm_get_connector_status_name(enum drm_connector_status status);
 const char *drm_get_subpixel_order_name(enum subpixel_order order);
-- 
2.31.1

[Intel-gfx] [PATCH 3/8] drm/connector: Add drm_connector_find_by_fwnode() function (v3)

Add a function to find a connector based on a fwnode.

This will be used by the new drm_connector_oob_hotplug_event()
function which is added by the next patch in this patch-set.
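
The lookup can be pictured with the following userspace sketch, which walks
a global list and hands back a referenced entry when the fwnode (or its
secondary) matches. It simplifies several things on purpose: the sketch_*
names are hypothetical, a singly linked list replaces the kernel list, the
mutex is omitted, and NULL is returned where the patch returns an ERR_PTR().

```c
#include <assert.h>
#include <stddef.h>

struct sketch_fwnode {
	struct sketch_fwnode *secondary;
};

struct sketch_connector {
	struct sketch_fwnode *fwnode;
	int refcount;
	struct sketch_connector *next; /* link in the global list */
};

/* Global list of registered connectors (no locking in this sketch). */
static struct sketch_connector *sketch_connector_list;

static void sketch_connector_register(struct sketch_connector *c)
{
	assert(c);
	c->next = sketch_connector_list;
	sketch_connector_list = c;
}

/* Returns a referenced connector, or NULL when nothing matches. */
static struct sketch_connector *
sketch_find_by_fwnode(struct sketch_fwnode *fwnode)
{
	struct sketch_connector *c;

	if (!fwnode)
		return NULL;

	for (c = sketch_connector_list; c; c = c->next) {
		if (c->fwnode == fwnode ||
		    (c->fwnode && c->fwnode->secondary == fwnode)) {
			c->refcount++; /* caller must drop this reference */
			return c;
		}
	}

	return NULL;
}
```

The caller is expected to release the reference once it is done, which is
why the patch documents drm_connector_put() as the counterpart.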

Changes in v2:
- Complete rewrite to use a global connector list in drm_connector.c
  rather than using a class-dev-iter in drm_sysfs.c

Changes in v3:
- Add forward declaration for struct fwnode_handle to drm_crtc_internal.h
  (fixes warning reported by kernel test robot)

Tested-by: Heikki Krogerus 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/drm_connector.c | 50 +
 drivers/gpu/drm/drm_crtc_internal.h |  2 ++
 include/drm/drm_connector.h |  8 +
 3 files changed, 60 insertions(+)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 3ad359a216ff..7d72bcefa4d6 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -65,6 +65,14 @@
  * support can instead use e.g. drm_helper_hpd_irq_event().
  */
 
+/*
+ * Global connector list for drm_connector_find_by_fwnode().
+ * Note drm_connector_[un]register() first take connector->lock and then
+ * take the connector_list_lock.
+ */
+static DEFINE_MUTEX(connector_list_lock);
+static LIST_HEAD(connector_list);
+
 struct drm_conn_prop_enum_list {
int type;
const char *name;
@@ -267,6 +275,7 @@ int drm_connector_init(struct drm_device *dev,
goto out_put_type_id;
}
 
+   INIT_LIST_HEAD(&connector->global_connector_list_entry);
INIT_LIST_HEAD(&connector->probed_modes);
INIT_LIST_HEAD(&connector->modes);
mutex_init(&connector->mutex);
@@ -534,6 +543,9 @@ int drm_connector_register(struct drm_connector *connector)
/* Let userspace know we have a new connector */
drm_sysfs_hotplug_event(connector->dev);
 
+   mutex_lock(&connector_list_lock);
+   list_add_tail(&connector->global_connector_list_entry, &connector_list);
+   mutex_unlock(&connector_list_lock);
goto unlock;
 
 err_debugfs:
@@ -562,6 +574,10 @@ void drm_connector_unregister(struct drm_connector 
*connector)
return;
}
 
+   mutex_lock(&connector_list_lock);
+   list_del_init(&connector->global_connector_list_entry);
+   mutex_unlock(&connector_list_lock);
+
if (connector->funcs->early_unregister)
connector->funcs->early_unregister(connector);
 
@@ -2545,6 +2561,40 @@ int drm_mode_getconnector(struct drm_device *dev, void 
*data,
return ret;
 }
 
+/**
+ * drm_connector_find_by_fwnode - Find a connector based on the associated fwnode
+ * @fwnode: fwnode for which to find the matching drm_connector
+ *
+ * This function looks up a drm_connector based on its associated fwnode. When
+ * a connector is found a reference to the connector is returned. The caller must
+ * call drm_connector_put() to release this reference when it is done with the
+ * connector.
+ *
+ * Returns: A reference to the found connector or an ERR_PTR().
+ */
+struct drm_connector *drm_connector_find_by_fwnode(struct fwnode_handle *fwnode)
+{
+   struct drm_connector *connector, *found = ERR_PTR(-ENODEV);
+
+   if (!fwnode)
+   return ERR_PTR(-ENODEV);
+
+   mutex_lock(&connector_list_lock);
+
+   list_for_each_entry(connector, &connector_list, global_connector_list_entry) {
+   if (connector->fwnode == fwnode ||
+   (connector->fwnode && connector->fwnode->secondary == fwnode)) {
+   drm_connector_get(connector);
+   found = connector;
+   break;
+   }
+   }
+
+   mutex_unlock(&connector_list_lock);
+
+   return found;
+}
+
 
 /**
  * DOC: Tile group
diff --git a/drivers/gpu/drm/drm_crtc_internal.h 
b/drivers/gpu/drm/drm_crtc_internal.h
index edb772947cb4..63279e984342 100644
--- a/drivers/gpu/drm/drm_crtc_internal.h
+++ b/drivers/gpu/drm/drm_crtc_internal.h
@@ -58,6 +58,7 @@ struct drm_property;
 struct edid;
 struct kref;
 struct work_struct;
+struct fwnode_handle;
 
 /* drm_crtc.c */
 int drm_mode_crtc_set_obj_prop(struct drm_mode_object *obj,
@@ -186,6 +187,7 @@ int drm_connector_set_obj_prop(struct drm_mode_object *obj,
 int drm_connector_create_standard_properties(struct drm_device *dev);
 const char *drm_get_connector_force_name(enum drm_connector_force force);
 void drm_connector_free_work_fn(struct work_struct *work);
+struct drm_connector *drm_connector_find_by_fwnode(struct fwnode_handle 
*fwnode);
 
 /* IOCTL */
 int drm_connector_property_set_ioctl(struct drm_device *dev,
diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
index 69dd488a2154..8132c48b56ae 100644
--- a/include/drm/drm_connector.h
+++ b/include/drm/drm_connector.h
@@ -1247,6 +1247,14 @@ struct drm_connector {
 */
struct list_head head;
 
+   /**
+* @global_connector_list_entry:
+*
+* Connector entry in the global connector-list, used by
+* drm_connector_find_by_fwnode().
+*/
+   struct list_head global_connector_list_entry;
+

[Intel-gfx] [PATCH 2/8] drm/connector: Add a fwnode pointer to drm_connector and register with ACPI (v2)

Add a fwnode pointer to struct drm_connector and register an acpi_bus_type
for the connectors with the ACPI subsystem (when CONFIG_ACPI is enabled).

The adding of the fwnode pointer allows drivers to associate a fwnode
that represents a connector with that connector.

When the new fwnode pointer points to an ACPI-companion, then the new
acpi_bus_type will cause the ACPI subsys to bind the device instantiated
for the connector with the fwnode by calling acpi_bind_one(). This will
result in a firmware_node symlink under /sys/class/card#-/
which helps to verify that the fwnode-s and connectors are properly
matched.

Changes in v2:
- Make drm_connector_cleanup() call fwnode_handle_put() on
  connector->fwnode and document this

Co-developed-by: Heikki Krogerus 
Signed-off-by: Heikki Krogerus 
Tested-by: Heikki Krogerus 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/drm_connector.c |  2 ++
 drivers/gpu/drm/drm_sysfs.c | 37 +
 include/drm/drm_connector.h |  8 +++
 3 files changed, 47 insertions(+)

diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 2ba257b1ae20..3ad359a216ff 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -474,6 +474,8 @@ void drm_connector_cleanup(struct drm_connector *connector)
drm_mode_object_unregister(dev, &connector->base);
kfree(connector->name);
connector->name = NULL;
+   fwnode_handle_put(connector->fwnode);
+   connector->fwnode = NULL;
spin_lock_irq(&dev->mode_config.connector_list_lock);
list_del(&connector->head);
dev->mode_config.num_connector--;
diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c
index f9d92bbb1f98..bf9edce8e2d1 100644
--- a/drivers/gpu/drm/drm_sysfs.c
+++ b/drivers/gpu/drm/drm_sysfs.c
@@ -10,6 +10,7 @@
  * Copyright (c) 2003-2004 IBM Corp.
  */
 
+#include 
 #include 
 #include 
 #include 
@@ -56,6 +57,39 @@ static struct device_type drm_sysfs_device_connector = {
 
 struct class *drm_class;
 
+#ifdef CONFIG_ACPI
+static bool drm_connector_acpi_bus_match(struct device *dev)
+{
+   return dev->type == &drm_sysfs_device_connector;
+}
+
+static struct acpi_device *drm_connector_acpi_find_companion(struct device 
*dev)
+{
+   struct drm_connector *connector = to_drm_connector(dev);
+
+   return to_acpi_device_node(connector->fwnode);
+}
+
+static struct acpi_bus_type drm_connector_acpi_bus = {
+   .name = "drm_connector",
+   .match = drm_connector_acpi_bus_match,
+   .find_companion = drm_connector_acpi_find_companion,
+};
+
+static void drm_sysfs_acpi_register(void)
+{
+   register_acpi_bus_type(&drm_connector_acpi_bus);
+}
+
+static void drm_sysfs_acpi_unregister(void)
+{
+   unregister_acpi_bus_type(&drm_connector_acpi_bus);
+}
+#else
+static void drm_sysfs_acpi_register(void) { }
+static void drm_sysfs_acpi_unregister(void) { }
+#endif
+
 static char *drm_devnode(struct device *dev, umode_t *mode)
 {
return kasprintf(GFP_KERNEL, "dri/%s", dev_name(dev));
@@ -89,6 +123,8 @@ int drm_sysfs_init(void)
}
 
drm_class->devnode = drm_devnode;
+
+   drm_sysfs_acpi_register();
return 0;
 }
 
@@ -101,6 +137,7 @@ void drm_sysfs_destroy(void)
 {
if (IS_ERR_OR_NULL(drm_class))
return;
+   drm_sysfs_acpi_unregister();
class_remove_file(drm_class, &class_attr_version.attr);
class_destroy(drm_class);
drm_class = NULL;
diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
index 1647960c9e50..69dd488a2154 100644
--- a/include/drm/drm_connector.h
+++ b/include/drm/drm_connector.h
@@ -1228,6 +1228,14 @@ struct drm_connector {
struct device *kdev;
/** @attr: sysfs attributes */
struct device_attribute *attr;
+   /**
+* @fwnode: associated fwnode supplied by platform firmware
+*
+* Drivers can set this to associate a fwnode with a connector, drivers
+* are expected to get a reference on the fwnode when setting this.
+* drm_connector_cleanup() will call fwnode_handle_put() on this.
+*/
+   struct fwnode_handle *fwnode;
 
/**
 * @head:
-- 
2.31.1

[Intel-gfx] [PATCH 0/8] drm + usb-type-c: Add support for out-of-band hotplug notification (v4 resend)

Hi all,

Here is a rebased-resend of v4 of my patchset making DP over Type-C work on
devices where the Type-C controller does not drive the HPD pin on the GPU,
but instead we need to forward HPD events from the Type-C controller to
the DRM driver.

Changes in v4 resend:
- Rebase on top of latest drm-tip

Changes in v4:
- Rebase on top of latest drm-tip
- Add forward declaration for struct fwnode_handle to drm_crtc_internal.h
  (fixes warning reported by kernel test robot)
- Add Heikki's Reviewed-by to patch 7 & 8
- Add Heikki's Tested-by to the series

Changes in v3:
- Base on top of latest drm-tip, which should fix the CI being unable to
  apply (and thus to test) the patches
- Make intel_acpi_assign_connector_fwnodes() take a ref on the fwnode
  it stores in connector->fwnode and have drm_connector_cleanup() put
  this reference
- Drop data argument from drm_connector_oob_hotplug_event()
- Make the Type-C DP altmode code only call drm_connector_oob_hotplug_event()
  when the HPD bit in the status vdo changes
- Drop the platform/x86/intel_cht_int33fe: Correct "displayport" fwnode
  reference patch, this will be merged independently through the pdx86 tree

Changes in v2:
- Replace the bogus "drm/connector: Make the drm_sysfs connector->kdev
  device hold a reference to the connector" patch with:
  "drm/connector: Give connector sysfs devices there own device_type"
  the new patch is a dep for patch 2/9 see the patches

- Stop using a class-dev-iter, instead add a global connector list
  to drm_connector.c and use that to find the connector by the fwnode,
  similar to how we already do this in drm_panel.c and drm_bridge.c

- Make drm_connector_oob_hotplug_event() take a fwnode pointer as
  argument, rather than a drm_connector pointer and let it do the
  lookup itself. This allows making drm_connector_find_by_fwnode() a
  drm-internal function and avoids code outside the drm subsystem
  potentially holding on to a drm_connector reference for a longer
  period.

This series not only touches drm subsys files but it also touches
drivers/usb/typec/altmodes/displayport.c, a file that usually
does not see a whole lot of changes. So I believe it would be best
to just merge the entire series through drm-misc, assuming we can
get an ack from Greg for merging the displayport.c changes
this way.

Regards,

Hans

Hans de Goede (7):
  drm/connector: Give connector sysfs devices there own device_type
  drm/connector: Add a fwnode pointer to drm_connector and register with
ACPI (v2)
  drm/connector: Add drm_connector_find_by_fwnode() function (v3)
  drm/connector: Add support for out-of-band hotplug notification (v3)
  drm/i915/dp: Add support for out-of-bound hotplug events
  usb: typec: altmodes/displayport: Make dp_altmode_notify() more
generic
  usb: typec: altmodes/displayport: Notify drm subsys of hotplug events

Heikki Krogerus (1):
  drm/i915: Associate ACPI connector nodes with connector entries (v2)

 drivers/gpu/drm/drm_connector.c  | 79 ++
 drivers/gpu/drm/drm_crtc_internal.h  |  2 +
 drivers/gpu/drm/drm_sysfs.c  | 87 +---
 drivers/gpu/drm/i915/display/intel_acpi.c| 46 +++
 drivers/gpu/drm/i915/display/intel_acpi.h|  3 +
 drivers/gpu/drm/i915/display/intel_display.c |  1 +
 drivers/gpu/drm/i915/display/intel_dp.c  | 12 +++
 drivers/usb/typec/altmodes/Kconfig   |  1 +
 drivers/usb/typec/altmodes/displayport.c | 58 -
 include/drm/drm_connector.h  | 25 ++
 10 files changed, 279 insertions(+), 35 deletions(-)

-- 
2.31.1

[Intel-gfx] [PATCH 1/8] drm/connector: Give connector sysfs devices there own device_type

Give connector sysfs devices their own device_type; this allows us to
check if a device passed to functions dealing with generic devices is
a drm_connector or not.

A check like this is necessary in the drm_connector_acpi_bus_match()
function added in the next patch in this series.
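
The kind of check this enables can be sketched as a pointer comparison
against a dedicated type object. The following is a userspace illustration
with hypothetical sketch_* structures, not the driver-core definitions.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_device_type {
	const char *name;
};

struct sketch_device {
	const struct sketch_device_type *type;
};

static const struct sketch_device_type sketch_minor_type = {
	.name = "drm_minor",
};

static const struct sketch_device_type sketch_connector_type = {
	.name = "drm_connector",
};

/* Identity check: compares the type pointer, not the name string. */
static bool sketch_is_connector(const struct sketch_device *dev)
{
	assert(dev);
	return dev->type == &sketch_connector_type;
}
```

Code that only receives a generic device can use such a check before it
treats the device as a connector.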

Tested-by: Heikki Krogerus 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/drm_sysfs.c | 50 +++--
 1 file changed, 37 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c
index 968a9560b4aa..f9d92bbb1f98 100644
--- a/drivers/gpu/drm/drm_sysfs.c
+++ b/drivers/gpu/drm/drm_sysfs.c
@@ -50,6 +50,10 @@ static struct device_type drm_sysfs_device_minor = {
.name = "drm_minor"
 };
 
+static struct device_type drm_sysfs_device_connector = {
+   .name = "drm_connector",
+};
+
 struct class *drm_class;
 
 static char *drm_devnode(struct device *dev, umode_t *mode)
@@ -102,6 +106,11 @@ void drm_sysfs_destroy(void)
drm_class = NULL;
 }
 
+static void drm_sysfs_release(struct device *dev)
+{
+   kfree(dev);
+}
+
 /*
  * Connector properties
  */
@@ -273,27 +282,47 @@ static const struct attribute_group 
*connector_dev_groups[] = {
 int drm_sysfs_connector_add(struct drm_connector *connector)
 {
struct drm_device *dev = connector->dev;
+   struct device *kdev;
+   int r;
 
if (connector->kdev)
return 0;
 
-   connector->kdev =
-   device_create_with_groups(drm_class, dev->primary->kdev, 0,
- connector, connector_dev_groups,
- "card%d-%s", dev->primary->index,
- connector->name);
+   kdev = kzalloc(sizeof(*kdev), GFP_KERNEL);
+   if (!kdev)
+   return -ENOMEM;
+
+   device_initialize(kdev);
+   kdev->class = drm_class;
+   kdev->type = &drm_sysfs_device_connector;
+   kdev->parent = dev->primary->kdev;
+   kdev->groups = connector_dev_groups;
+   kdev->release = drm_sysfs_release;
+   dev_set_drvdata(kdev, connector);
+
+   r = dev_set_name(kdev, "card%d-%s", dev->primary->index, 
connector->name);
+   if (r)
+   goto err_free;
+
DRM_DEBUG("adding \"%s\" to sysfs\n",
  connector->name);
 
-   if (IS_ERR(connector->kdev)) {
-   DRM_ERROR("failed to register connector device: %ld\n", 
PTR_ERR(connector->kdev));
-   return PTR_ERR(connector->kdev);
+   r = device_add(kdev);
+   if (r) {
+   DRM_ERROR("failed to register connector device: %d\n", r);
+   goto err_free;
}
 
+   connector->kdev = kdev;
+
if (connector->ddc)
return sysfs_create_link(&connector->kdev->kobj,
 &connector->ddc->dev.kobj, "ddc");
return 0;
+
+err_free:
+   put_device(kdev);
+   return r;
 }
 
 void drm_sysfs_connector_remove(struct drm_connector *connector)
@@ -374,11 +403,6 @@ void drm_sysfs_connector_status_event(struct drm_connector *connector,
 }
 EXPORT_SYMBOL(drm_sysfs_connector_status_event);
 
-static void drm_sysfs_release(struct device *dev)
-{
-   kfree(dev);
-}
-
 struct device *drm_sysfs_minor_alloc(struct drm_minor *minor)
 {
const char *minor_str;
-- 
2.31.1
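
Note on the error path in the patch above: once device_initialize() has run, the structure is reference counted and is freed through its release callback, which is why the err_free label calls put_device() instead of kfree(). The following is only a userspace sketch of that pattern. The names struct obj, obj_init(), obj_put(), obj_release() and obj_add_failing() are hypothetical stand-ins chosen for illustration; they are not kernel APIs and not part of the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a reference-counted struct device. */
struct obj {
	int refcount;
	void (*release)(struct obj *o);
};

/* Counts release calls; exists only so the behaviour can be observed. */
static int released_count;

/* Plays the role of drm_sysfs_release(): the callback frees the object. */
static void obj_release(struct obj *o)
{
	released_count++;
	free(o);
}

/* Plays the role of device_initialize(): the caller now holds one reference. */
static void obj_init(struct obj *o, void (*release)(struct obj *))
{
	o->refcount = 1;
	o->release = release;
}

/* Plays the role of put_device(): drop a reference, release on the last one. */
static void obj_put(struct obj *o)
{
	assert(o->refcount > 0);
	if (--o->refcount == 0)
		o->release(o);
}

/*
 * Allocate, initialize, then simulate a failure in a later step.
 * Returns -1 on allocation failure and -2 for the simulated failure.
 */
static int obj_add_failing(void)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (!o)
		return -1;

	obj_init(o, obj_release);

	/* Pretend a later registration step failed here. */
	obj_put(o); /* not free(o): the release callback owns the free */
	return -2;
}
```

In this sketch the simulated failure path leaves exactly one release call behind and no direct free(), which is the shape the err_free label has in the patch.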

Re: [Intel-gfx] [PATCH 22/22] drm/i915/guc: Add GuC kernel doc

On Tue, Aug 17, 2021 at 10:41 PM Michal Wajdeczko
 wrote:
> On 17.08.2021 19:34, Daniel Vetter wrote:
> > On Tue, Aug 17, 2021 at 07:27:18PM +0200, Michal Wajdeczko wrote:
> >> On 17.08.2021 19:20, Daniel Vetter wrote:
> >>> On Tue, Aug 17, 2021 at 09:36:49AM -0700, Matthew Brost wrote:
> >>>> On Tue, Aug 17, 2021 at 01:11:41PM +0200, Daniel Vetter wrote:
> >>>>> On Mon, Aug 16, 2021 at 06:51:39AM -0700, Matthew Brost wrote:
> >>>>>> Add GuC kernel doc for all structures added thus far for GuC submission
> >>>>>> and update the main GuC submission section with the new interface
> >>>>>> details.
> >>>>>>
> >>>>>> Signed-off-by: Matthew Brost 
> >>>>>
> >>>>> There's quite a bit more, e.g. intel_guc_ct, which has it's own world of
> >>>>> locking design that also doesn't feel too consistent.
> >>>>>
> >>>>
> >>>> That is a different layer than GuC submission so I don't we should
> >>>> mention anything about that layer here. Didn't really write that layer
> >>>> and it super painful to touch that code so I'm going to stay out of any
> >>>> rework you think we need to do there.
> >>>
> >>> Well there's three locks
> >>
> >> It's likely me.
> >>
> >> There is one lock for the recv CTB, one for the send CTB, one for the
> >> list of read messages ready to post process - do you want to use single
> >> lock for both CTBs or single lock for all cases in CT ?
> >>
> >> Michal
> >>
> >> disclaimer: outstanding_g2h are not part of the CTB layer
> >
> > Why? Like apparently there's not enough provided by that right now, so
> > Matt is now papering over that gap with more book-keeping in the next
> > layer. If the layer is not doing a good job it's either the wrong layer,
> > or shouldn't be a layer.
>
> Note that all "outstanding g2h" used by Matt are kind of unsolicited
> "event" messages received from the GuC, that CTB layer is unable
> correlate. CTB only tracks "requests" messages for which "response" (or
> "error") reply is expected. Thus if CTB client is expecting some extra
> message for its previous communication with GuC, it must track it on its
> own, as only client knows where in the CTB message payload, actual
> correlation data (like context ID) is stored.

I thought there's some patches already to reserve g2h space because
guc dies if there's none left? Which would mean ctb should know
already when there's more coming.

The problem is if every user of guc has to track this themselves we
get a pretty bad spaghetti monster around guc reset. Currently it's
only guc submission, so we could fix it there by wrapping a lock
around all guc submissions it does, but already on the wakeup side
it's more tricky. That really feels like working around issues somewhere
else.

> > And yeah the locking looks like serious amounts of overkill, was it
> > benchmarked that we need the 3 separate locks for this?
>
> I'm not aware of any (micro)benchmarking, but definitely we need some,
> we were just gradually moving from single threaded blocking CTB calls
> (waiting for CTB descriptor updates under mutex) to non-blocking calls
> (protecting only reads/writes to CTB descriptors with spinlock - to
> allow CTB usage from tasklet/irq).

Spinlock is fine, if it really protects everything (I've found a bunch
of checks outside of these locks that leave me wondering). Multiple
spinlocks need justification since at least to my understanding there's
a pile of overlapping stuff you need to protect. Like the reservations
of g2h space.

> And I was just assuming that we can sacrifice few more integers [1] and
> have dedicated spinlocks and avoid early over-optimization.

None of this has anything to do with saving memory, that's entirely
irrelevant here, but about complexity. Any lock you add makes the
complexity worse, and I'm not understanding why ctb needs 3 spinlocks
instead of just one.

If the only justification for this is that maybe it makes things
faster, and it was not properly benchmarked first (microbenchmarks
don't count if it's not a relevant end use case that umds actually
care about) then it has to go and be simplified. Really should have
never landed, because taking locking complexity out is much harder
than adding it in the first place.

And the current overall i915-gem code is definitely on the wrong side
of "too complex locking design", so there's no wiggle room here for
exceptions.

> > While reading ctb code I also noticed that a bunch of stuff is checked
> > before we grab the relevant spinlocks, and it's not
> > - wrapped in a WARN_ON or GEM_BUG_ON or similar to just check everything
> >   works as expected
> > - there's no other locks
> >
> > So either racy, buggy or playing some extremely clever tricks. None of
> > which is very good.
>
> I'm open to improve that code as needed, but maybe in exchange and to
> increase motivation please provide feedback on already posted fixes [2] ;)

Sure can try, but also these patches have been sitting on the list for
almost 7 weeks now with absolutely. It's your job as

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for GPD Win Max display fixes (rev4)

== Series Details ==

Series: GPD Win Max display fixes (rev4)
URL   : https://patchwork.freedesktop.org/series/90483/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/i915/display/intel_display.c:1901:21:expected struct 
i915_vma *[assigned] vma
+drivers/gpu/drm/i915/display/intel_display.c:1901:21:got void [noderef] 
__iomem *[assigned] iomem
+drivers/gpu/drm/i915/display/intel_display.c:1901:21: warning: incorrect type 
in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1374:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1374:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1374:34: warning: incorrect type 
in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct 
i915_address_space *
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect 
type in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect 
type in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_reset.c:1392:5: warning: context imbalance in 
'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/gt/intel_ring_submission.c:1268:24: warning: Using plain 
integer as NULL pointer
+drivers/gpu/drm/i915/i915_perf.c:1442:15: warning: memset with byte count of 
16777216
+drivers/gpu/drm/i915/i915_perf.c:1496:15: warning: memset with byte count of 
16777216
+./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative 
(-262080)
+./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative 
(-262080)
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for GPD Win Max display fixes (rev4)

== Series Details ==

Series: GPD Win Max display fixes (rev4)
URL   : https://patchwork.freedesktop.org/series/90483/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
b615ccdfa166 drm/i915/opregion: add support for mailbox #5 EDID
-:24: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#24: 
https://patchwork.kernel.org/project/intel-gfx/patch/20200828061941.17051-1-jani.nik...@intel.com/

total: 0 errors, 1 warnings, 0 checks, 141 lines checked
b8a3470e871a drm: Add orientation quirk for GPD Win Max

[Intel-gfx] [PATCH v3 2/2] drm: Add orientation quirk for GPD Win Max

2021-08-17 Thread Anisse Astier

Panel is 800x1280, but mounted on a laptop form factor, sideways.

Reviewed-by: Hans de Goede 
Signed-off-by: Anisse Astier 
---
 drivers/gpu/drm/drm_panel_orientation_quirks.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/drm_panel_orientation_quirks.c b/drivers/gpu/drm/drm_panel_orientation_quirks.c
index 4e965b0f5502..643b55f9a9d1 100644
--- a/drivers/gpu/drm/drm_panel_orientation_quirks.c
+++ b/drivers/gpu/drm/drm_panel_orientation_quirks.c
@@ -160,6 +160,12 @@ static const struct dmi_system_id orientation_data[] = {
  DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "MicroPC"),
},
.driver_data = (void *)&lcd720x1280_rightside_up,
+   }, {/* GPD Win Max */
+   .matches = {
+ DMI_EXACT_MATCH(DMI_SYS_VENDOR, "GPD"),
+ DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "G1619-01"),
+   },
+   .driver_data = (void *)&lcd800x1280_rightside_up,
}, {/*
 * GPD Pocket, note that the the DMI data is less generic then
 * it seems, devices with a board-vendor of "AMI Corporation"
-- 
2.31.1

[Intel-gfx] [PATCH v3 1/2] drm/i915/opregion: add support for mailbox #5 EDID

2021-08-17 Thread Anisse Astier

The ACPI OpRegion Mailbox #5 ASLE extension may contain an EDID to be
used for the embedded display. Add support for using it by adding
the EDID to the list of available modes on the connector, and use it for
eDP when available.

If a panel's EDID is broken, there may be an override EDID set in the
ACPI OpRegion mailbox #5. Use it if available.

Fixes the GPD Win Max laptop display, which seems to only use this
mechanism to provide a proper EDID for its eDP screen. It would have
been better to provide the EDID through the ACPI _DDC method instead, to
have a more generic solution, but it seems the designers of this system
did not consider it, and shipped the firmware without it.

Based on original patch series by: Jani Nikula 
https://patchwork.kernel.org/project/intel-gfx/patch/20200828061941.17051-1-jani.nik...@intel.com/

Changes since Jani Nikula's series:
 - EDID is copied and validated with drm_edid_is_valid
 - Mode is now added via drm_add_edid_modes instead of using override
   mechanism
 - squashed the two patches

Cc: Jani Nikula 
Cc: Uma Shankar 
Cc: Ville Syrjälä 
Signed-off-by: Anisse Astier 
---
 drivers/gpu/drm/i915/display/intel_dp.c   |  3 +
 drivers/gpu/drm/i915/display/intel_opregion.c | 69 ++-
 drivers/gpu/drm/i915/display/intel_opregion.h |  8 +++
 3 files changed, 79 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 75d4ebc66941..f9254c0df1a2 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -5183,6 +5183,9 @@ static bool intel_edp_init_connector(struct intel_dp *intel_dp,
goto out_vdd_off;
}
 
+   /* Set up override EDID, if any, from ACPI OpRegion */
+   intel_opregion_edid_probe(intel_connector);
+
mutex_lock(&dev->mode_config.mutex);
edid = drm_get_edid(connector, &intel_dp->aux.ddc);
if (edid) {
diff --git a/drivers/gpu/drm/i915/display/intel_opregion.c b/drivers/gpu/drm/i915/display/intel_opregion.c
index 3855fba70980..b1b87ed758ba 100644
--- a/drivers/gpu/drm/i915/display/intel_opregion.c
+++ b/drivers/gpu/drm/i915/display/intel_opregion.c
@@ -196,6 +196,8 @@ struct opregion_asle_ext {
 #define ASLE_IUER_WINDOWS_BTN  (1 << 1)
 #define ASLE_IUER_POWER_BTN    (1 << 0)
 
+#define ASLE_PHED_EDID_VALID_MASK  0x3
+
 /* Software System Control Interrupt (SWSCI) */
 #define SWSCI_SCIC_INDICATOR   (1 << 0)
 #define SWSCI_SCIC_MAIN_FUNCTION_SHIFT 1
@@ -909,8 +911,10 @@ int intel_opregion_setup(struct drm_i915_private *dev_priv)
opregion->asle->ardy = ASLE_ARDY_NOT_READY;
}
 
-   if (mboxes & MBOX_ASLE_EXT)
+   if (mboxes & MBOX_ASLE_EXT) {
drm_dbg(&dev_priv->drm, "ASLE extension supported\n");
+   opregion->asle_ext = base + OPREGION_ASLE_EXT_OFFSET;
+   }
 
if (intel_load_vbt_firmware(dev_priv) == 0)
goto out;
@@ -1037,6 +1041,68 @@ intel_opregion_get_panel_type(struct drm_i915_private *dev_priv)
return ret - 1;
 }
 
+/**
+ * intel_opregion_edid_probe - Add EDID from ACPI OpRegion mailbox #5
+ * @intel_connector: eDP connector
+ *
+ * This reads the ACPI Opregion mailbox #5 to extract the EDID that is passed
+ * to it.
+ *
+ * Will take a lock on the DRM mode_config to add the EDID; make sure it isn't
+ * called with lock taken.
+ *
+ */
+void intel_opregion_edid_probe(struct intel_connector *intel_connector)
+{
+   struct drm_connector *connector = &intel_connector->base;
+   struct drm_i915_private *i915 = to_i915(connector->dev);
+   struct intel_opregion *opregion = &i915->opregion;
+   const void *in_edid;
+   const struct edid *edid;
+   struct edid *new_edid;
+   int len, ret, num;
+
+   if (!opregion->asle_ext || connector->override_edid)
+   return;
+
+   in_edid = opregion->asle_ext->bddc;
+
+   /* Validity corresponds to number of 128-byte blocks */
+   len = (opregion->asle_ext->phed & ASLE_PHED_EDID_VALID_MASK) * 128;
+   if (!len || !memchr_inv(in_edid, 0, len))
+   return;
+
+   edid = in_edid;
+
+   if (len < EDID_LENGTH * (1 + edid->extensions)) {
+   drm_dbg_kms(&i915->drm, "Invalid EDID in ACPI OpRegion (Mailbox #5)\n");
+   return;
+   }
+   new_edid = drm_edid_duplicate(edid);
+   if (!new_edid) {
+   drm_err(&i915->drm, "Cannot duplicate EDID\n");
+   return;
+   }
+   if (!drm_edid_is_valid(new_edid)) {
+   kfree(new_edid);
+   drm_dbg_kms(&i915->drm, "Cannot validate EDID in ACPI OpRegion (Mailbox #5)\n");
+   return;
+   }
+
+   ret = drm_connector_update_edid_property(connector, new_edid);
+   if (ret) {
+   kfree(new_edid);
+   return;
+   }
+
+   mutex_lock(&connector->dev->mode_config.mutex);
+   num = drm_add_edid_modes(connector, new_edid);
+
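
The length handling in intel_opregion_edid_probe() above can be illustrated outside the kernel. The sketch below assumes a caller-provided buffer of at least 384 bytes (three 128-byte blocks, the most the two-bit PHED field can describe). opregion_edid_len() and the macro names are hypothetical, introduced only for this illustration. It reads the extension count from byte 126 of the base block, which is where struct edid keeps its extensions field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EDID_BLOCK_SIZE       128 /* one EDID block, in bytes */
#define PHED_VALID_MASK       0x3 /* same value as ASLE_PHED_EDID_VALID_MASK */
#define EDID_EXT_COUNT_OFFSET 126 /* byte holding the extension block count */

/*
 * Hypothetical helper: returns the EDID length in bytes advertised by phed,
 * or 0 when the buffer should be ignored (no blocks advertised, all-zero
 * contents, or fewer blocks advertised than the base block claims).
 * buf must point to at least 3 * EDID_BLOCK_SIZE readable bytes.
 */
static size_t opregion_edid_len(const uint8_t *buf, uint32_t phed)
{
	size_t len = (size_t)(phed & PHED_VALID_MASK) * EDID_BLOCK_SIZE;
	size_t needed, i;
	int all_zero = 1;

	assert(buf != NULL);

	if (len == 0)
		return 0;

	/* Equivalent in spirit to the !memchr_inv(in_edid, 0, len) check. */
	for (i = 0; i < len; i++) {
		if (buf[i] != 0) {
			all_zero = 0;
			break;
		}
	}
	if (all_zero)
		return 0;

	/* The base block plus every extension it declares must fit in len. */
	needed = EDID_BLOCK_SIZE * (1 + (size_t)buf[EDID_EXT_COUNT_OFFSET]);
	if (len < needed)
		return 0;

	return len;
}
```

The sketch stops at the length check; the checksum and header validation that drm_edid_is_valid() performs in the patch is deliberately left out.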

[Intel-gfx] [PATCH v3 0/2] GPD Win Max display fixes

2021-08-17 Thread Anisse Astier

This patch series is for making the GPD Win Max display usable with
Linux.

The GPD Win Max is a small laptop, and its eDP panel does not send an
EDID over DPCD; the EDID is instead available in the intel opregion, in
mailbox #5 [1].

The first patch is based on Jani's patch series [2] adding support for
the opregion, with changes. I've changed authorship, but I'd be glad to
revert it.

The second patch is just to fix the orientation of the panel.

Changes since v1:
 - rebased on drm-tip
 - squashed patch 1 & 2
 - picked up Reviewed-by from Hans de Goede (thanks for the review)

Changes since v2:
 - rebased on drm-tip
 - updated commit message

When v2 was initially sent [3] Ville Syrjälä suggested that it might be
a good idea to use the ACPI _DDC method instead to get the EDID, to
cover a wider range of hardware. Unfortunately, it doesn't seem
available on GPD Win Max, so I think this work should be done
independently, and this patch series considered separately.

[1]: https://gitlab.freedesktop.org/drm/intel/-/issues/3454
[2]: 
https://patchwork.kernel.org/project/intel-gfx/patch/20200828061941.17051-1-jani.nik...@intel.com/
[3]: 
https://patchwork.kernel.org/project/intel-gfx/patch/20210531204642.4907-2-ani...@astier.eu/


Anisse Astier (2):
  drm/i915/opregion: add support for mailbox #5 EDID
  drm: Add orientation quirk for GPD Win Max

 .../gpu/drm/drm_panel_orientation_quirks.c|  6 ++
 drivers/gpu/drm/i915/display/intel_dp.c   |  3 +
 drivers/gpu/drm/i915/display/intel_opregion.c | 69 ++-
 drivers/gpu/drm/i915/display/intel_opregion.h |  8 +++
 4 files changed, 85 insertions(+), 1 deletion(-)

-- 
2.31.1

Re: [Intel-gfx] [PATCH 22/22] drm/i915/guc: Add GuC kernel doc

2021-08-17 Thread Michal Wajdeczko




On 17.08.2021 19:34, Daniel Vetter wrote:
> On Tue, Aug 17, 2021 at 07:27:18PM +0200, Michal Wajdeczko wrote:
>>
>>
>> On 17.08.2021 19:20, Daniel Vetter wrote:
>>> On Tue, Aug 17, 2021 at 09:36:49AM -0700, Matthew Brost wrote:
>>>> On Tue, Aug 17, 2021 at 01:11:41PM +0200, Daniel Vetter wrote:
>>>>> On Mon, Aug 16, 2021 at 06:51:39AM -0700, Matthew Brost wrote:
>>>>>> Add GuC kernel doc for all structures added thus far for GuC submission
>>>>>> and update the main GuC submission section with the new interface
>>>>>> details.
>>>>>>
>>>>>> Signed-off-by: Matthew Brost 
>>>>>
>>>>> There's quite a bit more, e.g. intel_guc_ct, which has it's own world of
>>>>> locking design that also doesn't feel too consistent.
>>>>>
>>>>
>>>> That is a different layer than GuC submission so I don't we should
>>>> mention anything about that layer here. Didn't really write that layer
>>>> and it super painful to touch that code so I'm going to stay out of any
>>>> rework you think we need to do there.
>>>
>>> Well there's three locks 
>>
>> It's likely me.
>>
>> There is one lock for the recv CTB, one for the send CTB, one for the
>> list of read messages ready to post process - do you want to use single
>> lock for both CTBs or single lock for all cases in CT ?
>>
>> Michal
>>
>> disclaimer: outstanding_g2h are not part of the CTB layer
> 
> Why? Like apparently there's not enough provided by that right now, so
> Matt is now papering over that gap with more book-keeping in the next
> layer. If the layer is not doing a good job it's either the wrong layer,
> or shouldn't be a layer.

Note that all "outstanding g2h" used by Matt are kind of unsolicited
"event" messages received from the GuC, that CTB layer is unable
correlate. CTB only tracks "requests" messages for which "response" (or
"error") reply is expected. Thus if CTB client is expecting some extra
message for its previous communication with GuC, it must track it on its
own, as only client knows where in the CTB message payload, actual
correlation data (like context ID) is stored.

> 
> And yeah the locking looks like serious amounts of overkill, was it
> benchmarked that we need the 3 separate locks for this?

I'm not aware of any (micro)benchmarking, but definitely we need some,
we were just gradually moving from single threaded blocking CTB calls
(waiting for CTB descriptor updates under mutex) to non-blocking calls
(protecting only reads/writes to CTB descriptors with spinlock - to
allow CTB usage from tasklet/irq).

And I was just assuming that we can sacrifice few more integers [1] and
have dedicated spinlocks and avoid early over-optimization.

> 
> While reading ctb code I also noticed that a bunch of stuff is checked
> before we grab the relevant spinlocks, and it's not
> - wrapped in a WARN_ON or GEM_BUG_ON or similar to just check everything
>   works as expected
> - there's no other locks
> 
> So either racy, buggy or playing some extremely clever tricks. None of
> which is very good.

I'm open to improve that code as needed, but maybe in exchange and to
increase motivation please provide feedback on already posted fixes [2] ;)

Michal

[1]
https://elixir.bootlin.com/linux/latest/source/arch/ia64/include/asm/spinlock_types.h#L10
[2] https://patchwork.freedesktop.org/series/92118/

> -Daniel
> 
>>
>>
>>> there plus it leaks out (you have your
>>> outstanding_submission_g2h atomic_t which is very closed tied to well,
>>> outstanding guc transmissions), so I guess I need someone else for that?
>>>
>

[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/adl_p: Also disable underrun recovery with MSO

== Series Details ==

Series: drm/i915/adl_p: Also disable underrun recovery with MSO
URL   : https://patchwork.freedesktop.org/series/93732/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10490_full -> Patchwork_20835_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Known issues


  Here are the changes found in Patchwork_20835_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_ctx_persistence@legacy-engines-queued:
- shard-snb:  NOTRUN -> [SKIP][1] ([fdo#109271] / [i915#1099]) +4 
similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-snb6/igt@gem_ctx_persiste...@legacy-engines-queued.html

  * igt@gem_eio@unwedge-stress:
- shard-tglb: [PASS][2] -> [TIMEOUT][3] ([i915#2369] / [i915#3063] 
/ [i915#3648])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-tglb7/igt@gem_...@unwedge-stress.html
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-tglb2/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-deadline:
- shard-apl:  NOTRUN -> [FAIL][4] ([i915#2846])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-apl2/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-kbl:  [PASS][5] -> [FAIL][6] ([i915#2842]) +1 similar issue
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-kbl6/igt@gem_exec_fair@basic-none-s...@rcs0.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-kbl2/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-iclb: NOTRUN -> [FAIL][7] ([i915#2842])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-iclb2/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-glk:  [PASS][8] -> [FAIL][9] ([i915#2842]) +1 similar issue
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-glk5/igt@gem_exec_fair@basic-throt...@rcs0.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-glk7/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_exec_suspend@basic-s4-devices:
- shard-glk:  NOTRUN -> [DMESG-WARN][10] ([i915#1610])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-glk6/igt@gem_exec_susp...@basic-s4-devices.html

  * igt@gem_mmap_gtt@cpuset-big-copy:
- shard-iclb: [PASS][11] -> [FAIL][12] ([i915#307])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-iclb3/igt@gem_mmap_...@cpuset-big-copy.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-iclb2/igt@gem_mmap_...@cpuset-big-copy.html

  * igt@gem_pread@exhaustion:
- shard-apl:  NOTRUN -> [WARN][13] ([i915#2658])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-apl7/igt@gem_pr...@exhaustion.html

  * igt@gem_pwrite@basic-exhaustion:
- shard-tglb: NOTRUN -> [WARN][14] ([i915#2658])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-tglb3/igt@gem_pwr...@basic-exhaustion.html

  * igt@gem_render_copy@y-tiled-to-vebox-y-tiled:
- shard-iclb: NOTRUN -> [SKIP][15] ([i915#768])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-iclb8/igt@gem_render_c...@y-tiled-to-vebox-y-tiled.html

  * igt@gem_userptr_blits@access-control:
- shard-tglb: NOTRUN -> [SKIP][16] ([i915#3297]) +1 similar issue
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-tglb6/igt@gem_userptr_bl...@access-control.html
- shard-iclb: NOTRUN -> [SKIP][17] ([i915#3297]) +1 similar issue
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-iclb8/igt@gem_userptr_bl...@access-control.html

  * igt@gem_userptr_blits@input-checking:
- shard-apl:  NOTRUN -> [DMESG-WARN][18] ([i915#3002])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-apl7/igt@gem_userptr_bl...@input-checking.html

  * igt@gen9_exec_parse@allowed-single:
- shard-skl:  [PASS][19] -> [DMESG-WARN][20] ([i915#1436] / 
[i915#716])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-skl4/igt@gen9_exec_pa...@allowed-single.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-skl10/igt@gen9_exec_pa...@allowed-single.html

  * igt@i915_suspend@forcewake:
- shard-apl:  NOTRUN -> [DMESG-WARN][21] ([i915#180])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-apl7/igt@i915_susp...@forcewake.html

  * igt@kms_addfb_basic@invalid-smem-bo-on-discrete:
- shard-tglb: NOTRUN -> [SKIP][22] ([i915#3826])
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/shard-tglb6/igt@kms_addfb_ba...@invalid-smem-bo-on-discrete.html
-

Re: [Intel-gfx] [PATCH 02/22] drm/i915/guc: Fix outstanding G2H accounting

On Tue, Aug 17, 2021 at 11:39:29AM +0200, Daniel Vetter wrote:
> On Mon, Aug 16, 2021 at 06:51:19AM -0700, Matthew Brost wrote:
> > A small race that could result in incorrect accounting of the number
> > of outstanding G2H. Basically prior to this patch we did not increment
> > the number of outstanding G2H if we encoutered a GT reset while sending
> > a H2G. This was incorrect as the context state had already been updated
> > to anticipate a G2H response thus the counter should be incremented.
> > 
> > Fixes: f4eb1f3fe946 ("drm/i915/guc: Ensure G2H response has space in buffer")
> > Signed-off-by: Matthew Brost 
> > Cc: 
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 8 +---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index 69faa39da178..b5d3972ae164 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -360,11 +360,13 @@ static int guc_submission_send_busy_loop(struct intel_guc *guc,
> >  {
> > int err;
> >  
> > -   err = intel_guc_send_busy_loop(guc, action, len, g2h_len_dw, loop);
> > -
> > -   if (!err && g2h_len_dw)
> > +   if (g2h_len_dw)
> > atomic_inc(&guc->outstanding_submission_g2h);
> >  
> > +   err = intel_guc_send_busy_loop(guc, action, len, g2h_len_dw, loop);
> 
> I'm majorly confused by the _busy_loop naming scheme, especially here.
> Like "why do we want to send a busy loop comand to guc, this doesn't make
> sense".
> 
> It seems like you're using _busy_loop as a suffix for "this is ok to be
> called in atomic context". The linux kernel bikeshed for this is generally
> _atomic() (or _in_atomic() or something like that).  Would be good to
> rename to make this slightly less confusing.

I'd like to save the bikeshedding for follow ups if we can as we should
get the functional fixes in to stabilize the stack + clean up the locking
to a somewhat sane state ASAP. Everyone has their favorite color of
paint...

> -Daniel
> 
> > +   if (err == -EBUSY && g2h_len_dw)
> > +   atomic_dec(&guc->outstanding_submission_g2h);
> > +

Also here is an example of why this really should be owned by the
submission code, it wants to increment this here even if the send failed
due to -ENODEV (GT reset in flight) as this is an internal counter of
how many G2H will need to be scrubbed.

Matt

> > return err;
> >  }
> >  
> > -- 
> > 2.32.0
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/sched dependency handling and implicit sync fixes (rev4)

== Series Details ==

Series: drm/sched dependency handling and implicit sync fixes (rev4)
URL   : https://patchwork.freedesktop.org/series/93415/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10490_full -> Patchwork_20836_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20836_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20836_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20836_full:

### IGT changes ###

 Possible regressions 

  * igt@gem_exec_async@concurrent-writes@bcs0:
- shard-tglb: [PASS][1] -> [FAIL][2] +4 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-tglb3/igt@gem_exec_async@concurrent-wri...@bcs0.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/shard-tglb7/igt@gem_exec_async@concurrent-wri...@bcs0.html

  * igt@gem_exec_async@concurrent-writes@vcs1:
- shard-kbl:  [PASS][3] -> [FAIL][4] +1 similar issue
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-kbl7/igt@gem_exec_async@concurrent-wri...@vcs1.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/shard-kbl4/igt@gem_exec_async@concurrent-wri...@vcs1.html

  * igt@gem_exec_async@concurrent-writes@vecs0:
- shard-iclb: [PASS][5] -> [FAIL][6] +3 similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-iclb7/igt@gem_exec_async@concurrent-wri...@vecs0.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/shard-iclb7/igt@gem_exec_async@concurrent-wri...@vecs0.html

  * igt@gem_exec_async@forked-writes@bcs0:
- shard-snb:  [PASS][7] -> [FAIL][8] +2 similar issues
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-snb5/igt@gem_exec_async@forked-wri...@bcs0.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/shard-snb6/igt@gem_exec_async@forked-wri...@bcs0.html

  
Known issues


  Here are the changes found in Patchwork_20836_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_create@create-massive:
- shard-snb:  NOTRUN -> [DMESG-WARN][9] ([i915#3002])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/shard-snb6/igt@gem_cre...@create-massive.html

  * igt@gem_ctx_persistence@legacy-engines-queued:
- shard-snb:  NOTRUN -> [SKIP][10] ([fdo#109271] / [i915#1099]) +4 
similar issues
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/shard-snb2/igt@gem_ctx_persiste...@legacy-engines-queued.html

  * igt@gem_exec_fair@basic-deadline:
- shard-apl:  NOTRUN -> [FAIL][11] ([i915#2846])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/shard-apl2/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-kbl:  [PASS][12] -> [FAIL][13] ([i915#2842]) +3 similar 
issues
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-kbl6/igt@gem_exec_fair@basic-none-s...@rcs0.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/shard-kbl7/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-none-vip@rcs0:
- shard-glk:  [PASS][14] -> [FAIL][15] ([i915#2842])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-glk5/igt@gem_exec_fair@basic-none-...@rcs0.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/shard-glk3/igt@gem_exec_fair@basic-none-...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs0:
- shard-kbl:  [PASS][16] -> [SKIP][17] ([fdo#109271])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-kbl7/igt@gem_exec_fair@basic-p...@vcs0.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/shard-kbl3/igt@gem_exec_fair@basic-p...@vcs0.html

  * igt@gem_mmap_gtt@cpuset-big-copy-odd:
- shard-iclb: [PASS][18] -> [FAIL][19] ([i915#2428])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-iclb6/igt@gem_mmap_...@cpuset-big-copy-odd.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/shard-iclb6/igt@gem_mmap_...@cpuset-big-copy-odd.html

  * igt@gem_pread@exhaustion:
- shard-apl:  NOTRUN -> [WARN][20] ([i915#2658])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/shard-apl6/igt@gem_pr...@exhaustion.html

  * igt@gem_pwrite@basic-exhaustion:
- shard-tglb: NOTRUN -> [WARN][21] ([i915#2658])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/shard-tglb5/igt@gem_pwr...@basic-exhaustion.html

  *

Re: [Intel-gfx] [PATCH 22/22] drm/i915/guc: Add GuC kernel doc

On Tue, Aug 17, 2021 at 07:27:18PM +0200, Michal Wajdeczko wrote:
> 
> 
> On 17.08.2021 19:20, Daniel Vetter wrote:
> > On Tue, Aug 17, 2021 at 09:36:49AM -0700, Matthew Brost wrote:
> >> On Tue, Aug 17, 2021 at 01:11:41PM +0200, Daniel Vetter wrote:
> >>> On Mon, Aug 16, 2021 at 06:51:39AM -0700, Matthew Brost wrote:
> >>>> Add GuC kernel doc for all structures added thus far for GuC submission
> >>>> and update the main GuC submission section with the new interface
> >>>> details.
> >>>>
> >>>> Signed-off-by: Matthew Brost 
> >>>
> > >>> There's quite a bit more, e.g. intel_guc_ct, which has its own world of
> >>> locking design that also doesn't feel too consistent.
> >>>
> >>
> >> That is a different layer than GuC submission so I don't think we should
> >> mention anything about that layer here. Didn't really write that layer
> >> and it is super painful to touch that code so I'm going to stay out of any
> >> rework you think we need to do there. 
> > 
> > Well there's three locks 
> 
> It's likely me.
> 
> There is one lock for the recv CTB, one for the send CTB, one for the
> list of read messages ready to post process - do you want to use single
> lock for both CTBs or single lock for all cases in CT ?
> 
> Michal
> 
> disclaimer: outstanding_g2h are not part of the CTB layer

Why? Like apparently there's not enough provided by that right now, so
Matt is now papering over that gap with more book-keeping in the next
layer. If the layer is not doing a good job it's either the wrong layer,
or shouldn't be a layer.

And yeah the locking looks like serious amounts of overkill, was it
benchmarked that we need the 3 separate locks for this?

While reading ctb code I also noticed that a bunch of stuff is checked
before we grab the relevant spinlocks, and
- it's not wrapped in a WARN_ON or GEM_BUG_ON or similar to just check that
  everything works as expected
- there are no other locks

So either racy, buggy or playing some extremely clever tricks. None of
which is very good.
-Daniel
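
The concern about state being checked before the spinlock is taken can be
illustrated with a small userspace model. This is only a sketch under stated
assumptions: chan_model, chan_send and chan_disable are hypothetical names,
a pthread mutex stands in for the kernel spinlock, and none of this is the
actual intel_guc_ct code. It shows the "check under the lock" shape, where
the flag is read only while the lock that protects it is held.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a send channel; not the real CTB structure. */
struct chan_model {
	pthread_mutex_t lock;
	bool enabled;		/* protected by lock */
	unsigned int sent;	/* protected by lock */
};

/* Returns true if the message was accepted. */
static bool chan_send(struct chan_model *ch)
{
	bool ok;

	pthread_mutex_lock(&ch->lock);
	ok = ch->enabled;	/* checked while holding the lock */
	if (ok)
		ch->sent++;
	pthread_mutex_unlock(&ch->lock);

	return ok;
}

/* Disabling takes the same lock, so it cannot race with chan_send(). */
static void chan_disable(struct chan_model *ch)
{
	pthread_mutex_lock(&ch->lock);
	ch->enabled = false;
	pthread_mutex_unlock(&ch->lock);
}
```

If the enabled check were done before pthread_mutex_lock(), a concurrent
chan_disable() could slip in between the check and the increment, which is
the kind of window the paragraph above is worried about.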

> 
> 
> > there plus it leaks out (you have your
> > outstanding_submission_g2h atomic_t which is very closely tied to, well,
> > outstanding guc transmissions), so I guess I need someone else for that?
> > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Ditch the i915_gem_ww_ctx loop member (rev2)

== Series Details ==

Series: drm/i915: Ditch the i915_gem_ww_ctx loop member (rev2)
URL   : https://patchwork.freedesktop.org/series/93711/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10490_full -> Patchwork_20834_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20834_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20834_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20834_full:

### IGT changes ###

 Possible regressions 

  * igt@kms_atomic_transition@plane-toggle-modeset-transition:
- shard-tglb: [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-tglb1/igt@kms_atomic_transit...@plane-toggle-modeset-transition.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/shard-tglb3/igt@kms_atomic_transit...@plane-toggle-modeset-transition.html

  
Known issues


  Here are the changes found in Patchwork_20834_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_create@create-massive:
- shard-snb:  NOTRUN -> [DMESG-WARN][3] ([i915#3002])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/shard-snb2/igt@gem_cre...@create-massive.html

  * igt@gem_ctx_isolation@preservation-s3@bcs0:
- shard-kbl:  [PASS][4] -> [DMESG-WARN][5] ([i915#180]) +1 similar 
issue
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-kbl6/igt@gem_ctx_isolation@preservation...@bcs0.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/shard-kbl1/igt@gem_ctx_isolation@preservation...@bcs0.html

  * igt@gem_ctx_persistence@legacy-engines-queued:
- shard-snb:  NOTRUN -> [SKIP][6] ([fdo#109271] / [i915#1099]) +3 
similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/shard-snb7/igt@gem_ctx_persiste...@legacy-engines-queued.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-kbl:  [PASS][7] -> [FAIL][8] ([i915#2842])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-kbl6/igt@gem_exec_fair@basic-none-s...@rcs0.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/shard-kbl7/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-none@vcs1:
- shard-iclb: NOTRUN -> [FAIL][9] ([i915#2842])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/shard-iclb1/igt@gem_exec_fair@basic-n...@vcs1.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-glk:  [PASS][10] -> [FAIL][11] ([i915#2842]) +1 similar 
issue
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-glk5/igt@gem_exec_fair@basic-throt...@rcs0.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/shard-glk7/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_exec_suspend@basic-s4-devices:
- shard-glk:  NOTRUN -> [DMESG-WARN][12] ([i915#1610])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/shard-glk1/igt@gem_exec_susp...@basic-s4-devices.html

  * igt@gem_huc_copy@huc-copy:
- shard-tglb: [PASS][13] -> [SKIP][14] ([i915#2190])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-tglb2/igt@gem_huc_c...@huc-copy.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/shard-tglb6/igt@gem_huc_c...@huc-copy.html

  * igt@gem_pwrite@basic-exhaustion:
- shard-tglb: NOTRUN -> [WARN][15] ([i915#2658])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/shard-tglb1/igt@gem_pwr...@basic-exhaustion.html

  * igt@gem_userptr_blits@access-control:
- shard-tglb: NOTRUN -> [SKIP][16] ([i915#3297]) +1 similar issue
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/shard-tglb8/igt@gem_userptr_bl...@access-control.html
- shard-iclb: NOTRUN -> [SKIP][17] ([i915#3297]) +1 similar issue
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/shard-iclb1/igt@gem_userptr_bl...@access-control.html

  * igt@gem_userptr_blits@input-checking:
- shard-apl:  NOTRUN -> [DMESG-WARN][18] ([i915#3002])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/shard-apl3/igt@gem_userptr_bl...@input-checking.html

  * igt@i915_pm_backlight@fade_with_suspend:
- shard-skl:  [PASS][19] -> [INCOMPLETE][20] ([i915#198])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-skl1/igt@i915_pm_backlight@fade_with_suspend.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/shard-skl6/igt@i915_pm_backlight@fade_with_suspend.html

  *

Re: [Intel-gfx] [PATCH 22/22] drm/i915/guc: Add GuC kernel doc

2021-08-17 Thread Michal Wajdeczko




On 17.08.2021 19:20, Daniel Vetter wrote:
> On Tue, Aug 17, 2021 at 09:36:49AM -0700, Matthew Brost wrote:
>> On Tue, Aug 17, 2021 at 01:11:41PM +0200, Daniel Vetter wrote:
>>> On Mon, Aug 16, 2021 at 06:51:39AM -0700, Matthew Brost wrote:
>>>> Add GuC kernel doc for all structures added thus far for GuC submission
>>>> and update the main GuC submission section with the new interface
>>>> details.
>>>>
>>>> Signed-off-by: Matthew Brost 
>>>
>>> There's quite a bit more, e.g. intel_guc_ct, which has its own world of
>>> locking design that also doesn't feel too consistent.
>>>
>>
>> That is a different layer than GuC submission so I don't think we should
>> mention anything about that layer here. Didn't really write that layer
>> and it is super painful to touch that code so I'm going to stay out of any
>> rework you think we need to do there. 
> 
> Well there's three locks 

It's likely me.

There is one lock for the recv CTB, one for the send CTB, one for the
list of read messages ready to post process - do you want to use single
lock for both CTBs or single lock for all cases in CT ?

Michal

disclaimer: outstanding_g2h are not part of the CTB layer


> there plus it leaks out (you have your
> outstanding_submission_g2h atomic_t which is very closely tied to, well,
> outstanding guc transmissions), so I guess I need someone else for that?
>

Re: [Intel-gfx] [PATCH 22/22] drm/i915/guc: Add GuC kernel doc

On Tue, Aug 17, 2021 at 09:36:49AM -0700, Matthew Brost wrote:
> On Tue, Aug 17, 2021 at 01:11:41PM +0200, Daniel Vetter wrote:
> > On Mon, Aug 16, 2021 at 06:51:39AM -0700, Matthew Brost wrote:
> > > Add GuC kernel doc for all structures added thus far for GuC submission
> > > and update the main GuC submission section with the new interface
> > > details.
> > > 
> > > Signed-off-by: Matthew Brost 
> > 
> > There's quite a bit more, e.g. intel_guc_ct, which has its own world of
> > locking design that also doesn't feel too consistent.
> >
> 
> That is a different layer than GuC submission so I don't think we should
> mention anything about that layer here. Didn't really write that layer
> and it is super painful to touch that code so I'm going to stay out of any
> rework you think we need to do there. 

Well there's three locks there plus it leaks out (you have your
outstanding_submission_g2h atomic_t which is very closely tied to, well,
outstanding guc transmissions), so I guess I need someone else for that?

> > > ---
> > >  drivers/gpu/drm/i915/gt/intel_context_types.h |  42 +---
> > >  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  19 +++-
> > >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 101 ++
> > >  drivers/gpu/drm/i915/i915_request.h   |  18 ++--
> > >  4 files changed, 131 insertions(+), 49 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
> > > b/drivers/gpu/drm/i915/gt/intel_context_types.h
> > > index f6989e6807f7..75d609a1bc33 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> > > +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> > > @@ -156,44 +156,56 @@ struct intel_context {
> > >   u8 wa_bb_page; /* if set, page num reserved for context workarounds */
> > >  
> > >   struct {
> > > - /** lock: protects everything in guc_state */
> > > + /** @lock: protects everything in guc_state */
> > >   spinlock_t lock;
> > >   /**
> > > -  * sched_state: scheduling state of this context using GuC
> > > +  * @sched_state: scheduling state of this context using GuC
> > >* submission
> > >*/
> > >   u32 sched_state;
> > >   /*
> > > -  * fences: maintains of list of requests that have a submit
> > > -  * fence related to GuC submission
> > > +  * @fences: maintains a list of requests are currently being
> > > +  * fenced until a GuC operation completes
> > >*/
> > >   struct list_head fences;
> > > - /* GuC context blocked fence */
> > > + /**
> > > +  * @blocked_fence: fence used to signal when the blocking of a
> > > +  * contexts submissions is complete.
> > > +  */
> > >   struct i915_sw_fence blocked_fence;
> > > - /* GuC committed requests */
> > > + /** @number_committed_requests: number of committed requests */
> > >   int number_committed_requests;
> > >   } guc_state;
> > >  
> > >   struct {
> > > - /** lock: protects everything in guc_active */
> > > + /** @lock: protects everything in guc_active */
> > >   spinlock_t lock;
> > 
> > Why do we have two spinlocks to protect guc context state?
> > 
> > I do understand the need for a spinlock (at least for now) because of how
> > i915-scheduler runs in tasklet context. But beyond that we really
> > shouldn't need more than two locks to protect context state. You still
> > have an entire pile here, plus some atomics, plus more.
> >
> 
> Yea I actually thought about this after I sent it out, guc_active &
> guc_state should be combined into a single lock. Originally I had two
> different locks because of the old hierarchy; this is no longer needed. Can
> fix.
>  
> > And this is on a single context, where concurrently submitting stuff
> > really isn't a thing. I'd expect actual benchmarking would show a perf
> > hit, since all these locks and atomics aren't free. This is at least the
> > case with execbuf and the various i915_vma locks we currently have.
> > 
> > What I expect intel_context locking to be is roughly:
> > 
> > - One lock to protect all intel_context state. This probably should be a
> >   dma_resv_lock for a few reasons, least so we can pin state objects
> >   underneath that lock.
> > 
> > - A separate lock if there's anything you need to coordinate with the
> >   backend scheduler while that's running, to avoid dma_fence inversions.
> >   Right now this separate lock might need to be a spinlock because our
> >   scheduler runs in tasklets, and that might mean we need both a mutex and
> >   a spinlock here.
> >
> > Anything that goes beyond that is premature optimization and kills us code
> > complexity wise. I'd be _extremely_ surprised if an IA core cannot keep up
> > with GuC, and therefore anything that goes beyond "one lock per object",
> > plus/minus execution context issues like the above tasklet issue, is
> >

Re: [Intel-gfx] [PATCH 19/22] drm/i915/guc: Proper xarray usage for contexts_lookup

On Tue, Aug 17, 2021 at 07:13:33PM +0200, Daniel Vetter wrote:
> On Tue, Aug 17, 2021 at 08:26:28AM -0700, Matthew Brost wrote:
> > On Tue, Aug 17, 2021 at 12:27:29PM +0200, Daniel Vetter wrote:
> > > On Mon, Aug 16, 2021 at 06:51:36AM -0700, Matthew Brost wrote:
> > > > Lock the xarray and take ref to the context if needed.
> > > > 
> > > > v2:
> > > >  (Checkpatch)
> > > >   - Add new line after declaration
> > > > 
> > > > Signed-off-by: Matthew Brost 
> > > > ---
> > > >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 84 ---
> > > >  1 file changed, 73 insertions(+), 11 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > > index ba19b99173fc..2ecb2f002bed 100644
> > > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > > @@ -599,8 +599,18 @@ static void 
> > > > scrub_guc_desc_for_outstanding_g2h(struct intel_guc *guc)
> > > > unsigned long index, flags;
> > > > bool pending_disable, pending_enable, deregister, destroyed, 
> > > > banned;
> > > >  
> > > > +   xa_lock_irqsave(&guc->context_lookup, flags);
> > > > xa_for_each(&guc->context_lookup, index, ce) {
> > > > -   spin_lock_irqsave(&ce->guc_state.lock, flags);
> > > > +   /*
> > > > +* Corner case where the ref count on the object is 
> > > > zero but and
> > > > +* deregister G2H was lost. In this case we don't touch 
> > > > the ref
> > > > +* count and finish the destroy of the context.
> > > > +*/
> > > > +   bool do_put = kref_get_unless_zero(&ce->ref);
> > > 
> > > This looks really scary, because in another loop below you have an
> > > unconditional refcount increase. This means sometimes guc->context_lookup
> > 
> > Yea, good catch those loops need something like this too.
> > 
> > > xarray guarantees we hold a full reference on the context, sometimes we
> > > don't. So we're right back in "protect the code" O(N^2) review complexity
> > > instead of invariant rules about the datastructure, which is linear.
> > > 
> > > Essentially anytime you feel like you have to add a comment to explain
> > > what's going on about concurrent stuff you're racing with, you're
> > > protecting code, not data.
> > > 
> > > > Since guc can't do a whole lot without the guc_id registered and all that,
> > > I kinda expected you'd always have a full reference here. If there's
> > 
> > The deregister is triggered by the ref count going to zero and we can't
> > fully release the guc_id until that operation completes hence why it is
> > still in the xarray. I think the solution here is to use iterator like
> > you mention below that ref counts this correctly.
> 
> Hm but if the refcount drops to zero while we have a guc_id, how does that
> work? Do we delay the guc_context_destroy until that's done, or is the

Yes, we don't want to release the guc_id and deregister the context with
the GuC until the i915 is done with the context (no refs). We issue the
deregister when we have no refs (done directly now, add worker to do
this in an upcoming patch). We release the guc_id, remove from xarray, and
destroy context when the deregister completes.
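
The ordering described above (deregister issued once the last reference is
dropped, guc_id and lookup entry released only when the deregister
completes) can be sketched as a tiny state machine. This is a
single-threaded userspace model, not driver code: ctx_life, ctx_life_put and
ctx_life_deregister_done are hypothetical names chosen for illustration, and
locking is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

enum ctx_state { CTX_REGISTERED, CTX_DEREGISTER_PENDING, CTX_DESTROYED };

/* Hypothetical model of a context's lifetime; not struct intel_context. */
struct ctx_life {
	int refs;
	int guc_id;		/* -1 once released */
	bool in_lookup;		/* still reachable through the lookup table */
	enum ctx_state state;
};

/* Dropping the last reference only *starts* the teardown. */
static void ctx_life_put(struct ctx_life *c)
{
	assert(c->refs > 0);
	if (--c->refs == 0)
		c->state = CTX_DEREGISTER_PENDING; /* id and lookup entry kept */
}

/* Called when the (modelled) deregister completion arrives. */
static void ctx_life_deregister_done(struct ctx_life *c)
{
	assert(c->state == CTX_DEREGISTER_PENDING);
	c->guc_id = -1;
	c->in_lookup = false;
	c->state = CTX_DESTROYED;
}
```

The window between the two functions is exactly the case discussed in this
thread: the entry is still in the lookup table while its reference count is
already zero.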

> context handed off internally somehow to a worker?
> 
> Afaik intel_context_put is called from all kinds of nasty context, so
> waiting is not an option as-is ...

Right, it definitely can be called from nasty contexts, hence why we move
this to a worker in an upcoming patch.

Matt

> -Daniel
> 
> > > intermediate stages (e.g. around unregister) where this is currently not
> > > always the case, then those should make sure a full reference is held.
> > > 
> > > > Another option would be to treat ->context_lookup as a weak reference that
> > > > we lazily clean up when the context is finalized. That works too, but
> > > > probably not with a spinlock (since you most likely have to wait for all
> > > > pending guc transactions to complete), but it's another option.
> > > 
> > > Either way I think standard process is needed here for locking design,
> > > i.e.
> > > 1. come up with the right invariants ("we always have a full reference
> > > > when a context is on the guc->context_lookup xarray")
> > > 2. come up with the locks. From the guc side the xa_lock is maybe good
> > > enough, but from the context side this doesn't protect against a
> > > re-registering racing against a deregistering. So probably needs more
> > > rules on top, and then you have a nice lock inversion in a few places like
> > > here.
> > > 3. document it and roll it out.
> > > 
> > > The other thing is that this is a very tricky iterator, and there's a few
> > > copies of it. That is, if this is the right solution. As-is this should be
> > > abstracted away into guc_context_iter_begin/next_end() helpers, e.g. like
> > > we have for

Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/adl_p: Also disable underrun recovery with MSO

2021-08-17 Thread Vudum, Lakshminarayana

Re-reported.

-----Original Message-----
From: Roper, Matthew D  
Sent: Tuesday, August 17, 2021 9:26 AM
To: intel-gfx@lists.freedesktop.org
Cc: Vudum, Lakshminarayana 
Subject: Re: ✗ Fi.CI.BAT: failure for drm/i915/adl_p: Also disable underrun 
recovery with MSO

On Tue, Aug 17, 2021 at 04:02:14PM +, Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915/adl_p: Also disable underrun recovery with MSO
> URL   : https://patchwork.freedesktop.org/series/93732/
> State : failure
> 
> == Summary ==
> 
> CI Bug Log - changes from CI_DRM_10490 -> Patchwork_20835 
> 
> 
> Summary
> ---
> 
>   **FAILURE**
> 
>   Serious unknown changes coming with Patchwork_20835 absolutely need to be
>   verified manually.
>   
>   If you think the reported changes have nothing to do with the changes
>   introduced in Patchwork_20835, please notify your bug team to allow them
>   to document this new failure mode, which will reduce false positives in CI.
> 
>   External URL: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/index.html
> 
> Possible new issues
> ---
> 
>   Here are the unknown changes that may have been introduced in 
> Patchwork_20835:
> 
> ### IGT changes ###
> 
>  Possible regressions 
> 
>   * igt@i915_selftest@live@hangcheck:
> - fi-ivb-3770:[PASS][1] -> [INCOMPLETE][2]
>[1]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/fi-ivb-3770/igt@i915_selftest@l...@hangcheck.html
>[2]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-ivb-3770/igt@i915_selftest@l...@hangcheck.html

This IVB error is unrelated to the patch here (which would only affect 
platforms with display version >= 13).


Matt

> 
>   
> Known issues
> 
> 
>   Here are the changes found in Patchwork_20835 that come from known issues:
> 
> ### IGT changes ###
> 
>  Issues hit 
> 
>   * igt@amdgpu/amd_basic@semaphore:
> - fi-bdw-5557u:   NOTRUN -> [SKIP][3] ([fdo#109271]) +27 similar 
> issues
>[3]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-bdw-5557u/igt@amdgpu/amd_ba...@semaphore.html
> 
>   * igt@core_hotunplug@unbind-rebind:
> - fi-bdw-5557u:   NOTRUN -> [WARN][4] ([i915#3718])
>[4]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-bdw-5557u/igt@core_hotunp...@unbind-rebind.html
> 
>   * igt@kms_chamelium@dp-crc-fast:
> - fi-bdw-5557u:   NOTRUN -> [SKIP][5] ([fdo#109271] / [fdo#111827]) 
> +8 similar issues
>[5]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-bdw-5557u/igt@kms_chamel...@dp-crc-fast.html
> 
>   * igt@runner@aborted:
> - fi-ivb-3770:NOTRUN -> [FAIL][6] ([fdo#109271])
>[6]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-ivb-3770/igt@run...@aborted.html
> 
>   
>   [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
>   [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
>   [i915#3718]: https://gitlab.freedesktop.org/drm/intel/issues/3718
> 
> 
> Participating hosts (36 -> 34)
> --
> 
>   Missing(2): fi-bsw-cyan fi-bdw-samus 
> 
> 
> Build changes
> -
> 
>   * Linux: CI_DRM_10490 -> Patchwork_20835
> 
>   CI-20190529: 20190529
>   CI_DRM_10490: 3bd74b377986fcb89cf4563629f97c5b3199ca6f @ 
> git://anongit.freedesktop.org/gfx-ci/linux
>   IGT_6177: f474644e7226dd319195ca03b3cde82ad10ac54c @ 
> https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
>   Patchwork_20835: 4a9bac99ddffb1e355f2084d1b46465aac20b6c8 @ 
> git://anongit.freedesktop.org/gfx-ci/linux
> 
> 
> == Linux commits ==
> 
> 4a9bac99ddff drm/i915/adl_p: Also disable underrun recovery with MSO
> 
> == Logs ==
> 
> For more details see: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/index.html

--
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795

Re: [Intel-gfx] [PATCH 19/22] drm/i915/guc: Proper xarray usage for contexts_lookup

On Tue, Aug 17, 2021 at 08:26:28AM -0700, Matthew Brost wrote:
> On Tue, Aug 17, 2021 at 12:27:29PM +0200, Daniel Vetter wrote:
> > On Mon, Aug 16, 2021 at 06:51:36AM -0700, Matthew Brost wrote:
> > > Lock the xarray and take ref to the context if needed.
> > > 
> > > v2:
> > >  (Checkpatch)
> > >   - Add new line after declaration
> > > 
> > > Signed-off-by: Matthew Brost 
> > > ---
> > >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 84 ---
> > >  1 file changed, 73 insertions(+), 11 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > index ba19b99173fc..2ecb2f002bed 100644
> > > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > > @@ -599,8 +599,18 @@ static void 
> > > scrub_guc_desc_for_outstanding_g2h(struct intel_guc *guc)
> > >   unsigned long index, flags;
> > >   bool pending_disable, pending_enable, deregister, destroyed, banned;
> > >  
> > > + xa_lock_irqsave(&guc->context_lookup, flags);
> > >   xa_for_each(&guc->context_lookup, index, ce) {
> > > - spin_lock_irqsave(&ce->guc_state.lock, flags);
> > > + /*
> > > +  * Corner case where the ref count on the object is zero but and
> > > +  * deregister G2H was lost. In this case we don't touch the ref
> > > +  * count and finish the destroy of the context.
> > > +  */
> > > + bool do_put = kref_get_unless_zero(&ce->ref);
> > 
> > This looks really scary, because in another loop below you have an
> > unconditional refcount increase. This means sometimes guc->context_lookup
> 
> Yea, good catch those loops need something like this too.
> 
> > xarray guarantees we hold a full reference on the context, sometimes we
> > don't. So we're right back in "protect the code" O(N^2) review complexity
> > instead of invariant rules about the datastructure, which is linear.
> > 
> > Essentially anytime you feel like you have to add a comment to explain
> > what's going on about concurrent stuff you're racing with, you're
> > protecting code, not data.
> > 
> > Since guc can't do a whole lot without the guc_id registered and all that,
> > I kinda expected you'd always have a full reference here. If there's
> 
> The deregister is triggered by the ref count going to zero and we can't
> fully release the guc_id until that operation completes hence why it is
> still in the xarray. I think the solution here is to use iterator like
> you mention below that ref counts this correctly.

Hm but if the refcount drops to zero while we have a guc_id, how does that
work? Do we delay the guc_context_destroy until that's done, or is the
context handed off internally somehow to a worker?

Afaik intel_context_put is called from all kinds of nasty context, so
waiting is not an option as-is ...
-Daniel

> > intermediate stages (e.g. around unregister) where this is currently not
> > always the case, then those should make sure a full reference is held.
> > 
> > Another option would be to treat ->context_lookup as a weak reference that
> > we lazily clean up when the context is finalized. That works too, but
> > probably not with a spinlock (since you most likely have to wait for all
> > pending guc transactions to complete), but it's another option.
> > 
> > Either way I think standard process is needed here for locking design,
> > i.e.
> > 1. come up with the right invariants ("we always have a full reference
> > when a context is on the guc->context_lookup xarray")
> > 2. come up with the locks. From the guc side the xa_lock is maybe good
> > enough, but from the context side this doesn't protect against a
> > re-registering racing against a deregistering. So probably needs more
> > rules on top, and then you have a nice lock inversion in a few places like
> > here.
> > 3. document it and roll it out.
> > 
> > The other thing is that this is a very tricky iterator, and there's a few
> > copies of it. That is, if this is the right solution. As-is this should be
> > abstracted away into guc_context_iter_begin/next_end() helpers, e.g. like
> > we have for drm_connector_list_iter_begin/end_next as an example.
> >
> 
> I can check this out.
> 
> Matt
>  
> > Cheers, Daniel
> > 
> > > +
> > > + xa_unlock(&guc->context_lookup);
> > > +
> > > + spin_lock(&ce->guc_state.lock);
> > >  
> > >   /*
> > >* Once we are at this point submission_disabled() is guaranteed
> > > @@ -616,7 +626,9 @@ static void scrub_guc_desc_for_outstanding_g2h(struct 
> > > intel_guc *guc)
> > >   banned = context_banned(ce);
> > >   init_sched_state(ce);
> > >  
> > > - spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> > > + spin_unlock(&ce->guc_state.lock);
> > > +
> > > + GEM_BUG_ON(!do_put && !destroyed);
> > >  
> > >   if (pending_enable || destroyed || deregister) {
> > >
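
The reference-taking iterator suggested in this thread can be sketched in
userspace C. This is a minimal model, assuming a plain array in place of the
xarray and C11 atomics in place of kref; ctx_model, ctx_get_unless_zero,
ctx_put and ctx_iter_next are hypothetical names, not existing i915 or
xarray helpers. The point it illustrates is the invariant: every entry
handed to the loop body carries a reference owned by the caller, and
entries whose count already reached zero are skipped rather than revived.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a context stored in a lookup table. */
struct ctx_model {
	atomic_int ref;	/* 0 means teardown has already started */
	int id;
};

/* Take a reference only if the count is still non-zero. */
static bool ctx_get_unless_zero(struct ctx_model *ctx)
{
	int old = atomic_load(&ctx->ref);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&ctx->ref, &old, old + 1))
			return true;
	}
	return false;
}

static void ctx_put(struct ctx_model *ctx)
{
	int old = atomic_fetch_sub(&ctx->ref, 1);

	assert(old > 0);
}

/*
 * Return the next entry at or after *pos on which a reference could be
 * taken, or NULL when the table is exhausted. The caller owns one
 * reference on the returned entry and must drop it with ctx_put().
 */
static struct ctx_model *ctx_iter_next(struct ctx_model *table, size_t n,
				       size_t *pos)
{
	while (*pos < n) {
		struct ctx_model *ctx = &table[(*pos)++];

		if (ctx_get_unless_zero(ctx))
			return ctx;
	}
	return NULL;
}
```

A caller would loop with ctx_iter_next() and call ctx_put() at the end of
each iteration. Entries with a zero count (the "deregister pending" case)
never reach the loop body in this model, so they would need a separate,
explicit cleanup path.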

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/adl_p: Also disable underrun recovery with MSO

== Series Details ==

Series: drm/i915/adl_p: Also disable underrun recovery with MSO
URL   : https://patchwork.freedesktop.org/series/93732/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10490 -> Patchwork_20835


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/index.html

Known issues


  Here are the changes found in Patchwork_20835 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@semaphore:
- fi-bdw-5557u:   NOTRUN -> [SKIP][1] ([fdo#109271]) +27 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-bdw-5557u/igt@amdgpu/amd_ba...@semaphore.html

  * igt@core_hotunplug@unbind-rebind:
- fi-bdw-5557u:   NOTRUN -> [WARN][2] ([i915#3718])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-bdw-5557u/igt@core_hotunp...@unbind-rebind.html

  * igt@i915_selftest@live@hangcheck:
- fi-ivb-3770:[PASS][3] -> [INCOMPLETE][4] ([i915#3303])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/fi-ivb-3770/igt@i915_selftest@l...@hangcheck.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-ivb-3770/igt@i915_selftest@l...@hangcheck.html

  * igt@kms_chamelium@dp-crc-fast:
- fi-bdw-5557u:   NOTRUN -> [SKIP][5] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-bdw-5557u/igt@kms_chamel...@dp-crc-fast.html

  * igt@runner@aborted:
- fi-ivb-3770:NOTRUN -> [FAIL][6] ([fdo#109271])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-ivb-3770/igt@run...@aborted.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#3303]: https://gitlab.freedesktop.org/drm/intel/issues/3303
  [i915#3718]: https://gitlab.freedesktop.org/drm/intel/issues/3718


Participating hosts (36 -> 34)
--

  Missing(2): fi-bsw-cyan fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_10490 -> Patchwork_20835

  CI-20190529: 20190529
  CI_DRM_10490: 3bd74b377986fcb89cf4563629f97c5b3199ca6f @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6177: f474644e7226dd319195ca03b3cde82ad10ac54c @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20835: 4a9bac99ddffb1e355f2084d1b46465aac20b6c8 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

4a9bac99ddff drm/i915/adl_p: Also disable underrun recovery with MSO

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/index.html

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/sched dependency handling and implicit sync fixes (rev4)

== Series Details ==

Series: drm/sched dependency handling and implicit sync fixes (rev4)
URL   : https://patchwork.freedesktop.org/series/93415/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10490 -> Patchwork_20836


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/index.html

Known issues


  Here are the changes found in Patchwork_20836 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@semaphore:
- fi-bdw-5557u:   NOTRUN -> [SKIP][1] ([fdo#109271]) +27 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/fi-bdw-5557u/igt@amdgpu/amd_ba...@semaphore.html

  * igt@core_hotunplug@unbind-rebind:
- fi-bdw-5557u:   NOTRUN -> [WARN][2] ([i915#3718])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/fi-bdw-5557u/igt@core_hotunp...@unbind-rebind.html

  * igt@kms_chamelium@dp-crc-fast:
- fi-bdw-5557u:   NOTRUN -> [SKIP][3] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/fi-bdw-5557u/igt@kms_chamel...@dp-crc-fast.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#3718]: https://gitlab.freedesktop.org/drm/intel/issues/3718


Participating hosts (36 -> 34)
--

  Missing(2): fi-bsw-cyan fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_10490 -> Patchwork_20836

  CI-20190529: 20190529
  CI_DRM_10490: 3bd74b377986fcb89cf4563629f97c5b3199ca6f @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6177: f474644e7226dd319195ca03b3cde82ad10ac54c @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20836: 724ee6ec97135c8a4fd57f8b19d9802834ad62fc @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

724ee6ec9713 dma-resv: Give the docs a do-over
8c4011c29394 drm/i915: Don't break exclusive fence ordering
ab0a3f52cac3 drm/i915: delete exclude argument from 
i915_sw_fence_await_reservation
2eb5ca447466 drm/etnaviv: Don't break exclusive fence ordering
a3c9153955a4 drm/msm: Don't break exclusive fence ordering
7650a6f74b18 drm/sched: Check locking in drm_sched_job_await_implicit
8f51a69d4e74 drm/sched: Don't store self-dependencies
e5b7840798ca drm/gem: Delete gem array fencing helpers
9f2a9ecfea9c drm/msm: Use scheduler dependency handling
e63e89cc8ac5 drm/etnaviv: Use scheduler dependency handling
2c445c2eb7fe drm/v3d: Use scheduler dependency handling
1663a9467a04 drm/v3d: Move drm_sched_job_init to v3d_job_init
ca1c5aec7cde drm/lima: use scheduler dependency tracking
6687763f729e drm/panfrost: use scheduler dependency tracking
f4b4e005b964 drm/sched: improve docs around drm_sched_entity
ebfbb6077485 drm/sched: drop entity parameter from drm_sched_push_job
255f53586a60 drm/sched: Add dependency tracking
5b5164ff17f2 drm/sched: Barriers are needed for entity->last_scheduled
27dbe1a630f0 drm/msm: Improve drm/sched point of no return rules
0d891258e40a drm/sched: Split drm_sched_job_init

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20836/index.html

Re: [Intel-gfx] [PATCH 08/22] drm/i915/guc: Don't enable scheduling on a banned context, guc_id invalid, not registered

On Tue, Aug 17, 2021 at 11:47:53AM +0200, Daniel Vetter wrote:
> On Mon, Aug 16, 2021 at 06:51:25AM -0700, Matthew Brost wrote:
> > When unblocking a context, do not enable scheduling if the context is
> > banned, guc_id invalid, or not registered.
> > 
> > Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation")
> > Signed-off-by: Matthew Brost 
> > Cc: 
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index c3b7bf7319dd..353899634fa8 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -1579,6 +1579,9 @@ static void guc_context_unblock(struct intel_context 
> > *ce)
> > > spin_lock_irqsave(&ce->guc_state.lock, flags);
> >  
> > if (unlikely(submission_disabled(guc) ||
> > +intel_context_is_banned(ce) ||
> > +context_guc_id_invalid(ce) ||
> > +!lrc_desc_registered(guc, ce->guc_id) ||
> >  !intel_context_is_pinned(ce) ||
> >  context_pending_disable(ce) ||
> >  context_blocked(ce) > 1)) {
> 
> I think this entire if condition here is screaming that our intel_context
> state machinery for guc is way too complex, and on the wrong side of
> incomprehensible.
> 
> Also some of these check state outside of the context, and we don't seem
> to hold spinlocks for those, or anything else.
> 
> In general I have no idea which of these are defensive programming and
> cannot ever happen, and which actually can happen. There's for sure way
> too many races going on given that this is all context-local stuff.

A lot of this is guarding against a full GT reset while trying to
cancel a request. Full GT resets make everything really hard and in
practice should never really happen because the GuC does per engine /
context resets. Unfortunately IGTs do weird things like turn off per
engine / context resets, leaving a full GT reset as the only way to
recover, so the IGTs will expose all the races around GT reset,
especially when we run IGTs on pre-prod HW that tends to hang for
whatever reason.

Matt 

> -Daniel
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch
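
The guard in the quoted hunk can be read as a single predicate over context state. The following is a minimal userspace sketch under stated assumptions: `struct ctx_model`, its fields, and `may_enable_scheduling()` are hypothetical stand-ins for the driver helpers named in the patch, not the real i915 data structures, and the locking questions raised above are deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical flattened view of the state the quoted condition inspects. */
struct ctx_model {
	bool submission_disabled; /* models submission_disabled(guc) */
	bool banned;              /* models intel_context_is_banned(ce) */
	bool guc_id_invalid;      /* models context_guc_id_invalid(ce) */
	bool lrc_registered;      /* models lrc_desc_registered(guc, ce->guc_id) */
	bool pinned;              /* models intel_context_is_pinned(ce) */
	bool pending_disable;     /* models context_pending_disable(ce) */
	unsigned int blocked;     /* models context_blocked(ce) */
};

/* Returns true when unblocking the context may re-enable scheduling. */
static bool may_enable_scheduling(const struct ctx_model *c)
{
	if (c->submission_disabled || c->banned || c->guc_id_invalid ||
	    !c->lrc_registered || !c->pinned || c->pending_disable ||
	    c->blocked > 1)
		return false;
	return true;
}
```

Collapsing the checks into one named predicate does not remove any of the races discussed in the thread; it only makes explicit which inputs the decision depends on.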

Re: [Intel-gfx] [PATCH 22/22] drm/i915/guc: Add GuC kernel doc

On Tue, Aug 17, 2021 at 01:11:41PM +0200, Daniel Vetter wrote:
> On Mon, Aug 16, 2021 at 06:51:39AM -0700, Matthew Brost wrote:
> > Add GuC kernel doc for all structures added thus far for GuC submission
> > and update the main GuC submission section with the new interface
> > details.
> > 
> > Signed-off-by: Matthew Brost 
> 
> There's quite a bit more, e.g. intel_guc_ct, which has its own world of
> locking design that also doesn't feel too consistent.
>

That is a different layer than GuC submission so I don't think we should
mention anything about that layer here. I didn't really write that layer
and it is super painful to touch that code, so I'm going to stay out of
any rework you think we need to do there.
 
> > ---
> >  drivers/gpu/drm/i915/gt/intel_context_types.h |  42 +---
> >  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  19 +++-
> >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 101 ++
> >  drivers/gpu/drm/i915/i915_request.h   |  18 ++--
> >  4 files changed, 131 insertions(+), 49 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
> > b/drivers/gpu/drm/i915/gt/intel_context_types.h
> > index f6989e6807f7..75d609a1bc33 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> > @@ -156,44 +156,56 @@ struct intel_context {
> > u8 wa_bb_page; /* if set, page num reserved for context workarounds */
> >  
> > struct {
> > -   /** lock: protects everything in guc_state */
> > +   /** @lock: protects everything in guc_state */
> > spinlock_t lock;
> > /**
> > -* sched_state: scheduling state of this context using GuC
> > +* @sched_state: scheduling state of this context using GuC
> >  * submission
> >  */
> > u32 sched_state;
> > /*
> > -* fences: maintains of list of requests that have a submit
> > -* fence related to GuC submission
> > +* @fences: maintains a list of requests are currently being
> > +* fenced until a GuC operation completes
> >  */
> > struct list_head fences;
> > -   /* GuC context blocked fence */
> > +   /**
> > +* @blocked_fence: fence used to signal when the blocking of a
> > +* contexts submissions is complete.
> > +*/
> > struct i915_sw_fence blocked_fence;
> > -   /* GuC committed requests */
> > +   /** @number_committed_requests: number of committed requests */
> > int number_committed_requests;
> > } guc_state;
> >  
> > struct {
> > -   /** lock: protects everything in guc_active */
> > +   /** @lock: protects everything in guc_active */
> > spinlock_t lock;
> 
> Why do we have two spinlocks to protect guc context state?
> 
> I do understand the need for a spinlock (at least for now) because of how
> i915-scheduler runs in tasklet context. But beyond that we really
> shouldn't need more than two locks to protect context state. You still
> have an entire pile here, plus some atomics, plus more.
>

Yeah, I actually thought about this after I sent it out: guc_active &
guc_state should be combined into a single lock. Originally I had two
different locks because of the old hierarchy; this is no longer needed.
Can fix.
 
> And this is on a single context, where concurrently submitting stuff
> really isn't a thing. I'd expect actual benchmarking would show a perf
> hit, since all these locks and atomics aren't free. This is at least the
> case with execbuf and the various i915_vma locks we currently have.
> 
> What I expect intel_context locking to be is roughly:
> 
> - One lock to protect all intel_context state. This probably should be a
>   dma_resv_lock for a few reasons, least so we can pin state objects
>   underneath that lock.
> 
> - A separate lock if there's anything you need to coordinate with the
>   backend scheduler while that's running, to avoid dma_fence inversions.
>   Right now this separate lock might need to be a spinlock because our
>   scheduler runs in tasklets, and that might mean we need both a mutex and
>   a spinlock here.
>
> Anything that goes beyond that is premature optimization and kills us code
> complexity wise. I'd be _extremely_ surprised if an IA core cannot keep up
> with GuC, and therefore anything that goes beyond "one lock per object",
> plus/minus execution context issues like the above tasklet issue, is
> likely just going to slow everything down.

If I combine the above spinlocks, isn't that basically what we'd have: one
lock for the context state as it relates to GuC submission?

Also thinking when we move to DRM scheduler we likely can get rid of all
the atomic contexts in the GuC submission backend.
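
The "one lock per object" idea discussed here can be sketched in userspace as below. This is an illustration under assumptions, not the driver code: `struct guc_ctx_model` and its helpers are hypothetical names, `active_requests` is a simplification of the request list, and a pthread mutex stands in for the kernel spinlock the thread is about.

```c
#include <pthread.h>

/* Hypothetical merged state: a single lock covers what the patch
 * currently splits between guc_state and guc_active. */
struct guc_ctx_model {
	pthread_mutex_t lock;          /* protects every field below */
	unsigned int sched_state;
	int number_committed_requests;
	int active_requests;
};

static void guc_ctx_model_init(struct guc_ctx_model *m)
{
	pthread_mutex_init(&m->lock, NULL);
	m->sched_state = 0;
	m->number_committed_requests = 0;
	m->active_requests = 0;
}

/* Both counters change under the same lock, so a reader holding the
 * lock never observes one updated without the other. */
static void guc_ctx_model_commit(struct guc_ctx_model *m)
{
	pthread_mutex_lock(&m->lock);
	m->number_committed_requests++;
	m->active_requests++;
	pthread_mutex_unlock(&m->lock);
}

static int guc_ctx_model_active(struct guc_ctx_model *m)
{
	int n;

	pthread_mutex_lock(&m->lock);
	n = m->active_requests;
	pthread_mutex_unlock(&m->lock);
	return n;
}
```

With a single lock there is no lock-ordering rule to document between the two halves of the state, which is the simplification being asked for in the review.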

> 
> > -   /** requests: active requests on this context */
> > +   /** @requests: list of

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/sched dependency handling and implicit sync fixes (rev4)

== Series Details ==

Series: drm/sched dependency handling and implicit sync fixes (rev4)
URL   : https://patchwork.freedesktop.org/series/93415/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
0d891258e40a drm/sched: Split drm_sched_job_init
-:240: WARNING:UNSPECIFIED_INT: Prefer 'unsigned int' to bare use of 'unsigned'
#240: FILE: drivers/gpu/drm/scheduler/sched_fence.c:173:
+   unsigned seq;

-:336: WARNING:AVOID_BUG: Avoid crashing the kernel - try using WARN_ON & 
recovery code rather than BUG() or BUG_ON()
#336: FILE: drivers/gpu/drm/scheduler/sched_main.c:623:
+   BUG_ON(!entity);

-:405: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#405: FILE: include/drm/gpu_scheduler.h:391:
+struct drm_sched_fence *drm_sched_fence_alloc(

-:413: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 3 warnings, 1 checks, 248 lines checked
27dbe1a630f0 drm/msm: Improve drm/sched point of no return rules
-:78: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 37 lines checked
5b5164ff17f2 drm/sched: Barriers are needed for entity->last_scheduled
-:88: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 43 lines checked
255f53586a60 drm/sched: Add dependency tracking
-:195: CHECK:LINE_SPACING: Please don't use multiple blank lines
#195: FILE: drivers/gpu/drm/scheduler/sched_main.c:729:
+
+

-:271: WARNING:TYPO_SPELLING: 'ommitted' may be misspelled - perhaps 'omitted'?
#271: FILE: include/drm/gpu_scheduler.h:244:
+* drm_sched_job_add_implicit_dependencies() this can be ommitted and
 

-:286: CHECK:LINE_SPACING: Please don't use multiple blank lines
#286: FILE: include/drm/gpu_scheduler.h:378:
+
+

-:289: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 2 warnings, 2 checks, 230 lines checked
ebfbb6077485 drm/sched: drop entity parameter from drm_sched_push_job
-:228: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 110 lines checked
f4b4e005b964 drm/sched: improve docs around drm_sched_entity
-:17: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit 620e762f9a98 ("drm/scheduler: 
move entity handling into separate file")'
#17: 
  move here: 620e762f9a98 ("drm/scheduler: move entity handling into

-:413: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 1 warnings, 0 checks, 346 lines checked
6687763f729e drm/panfrost: use scheduler dependency tracking
-:215: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 158 lines checked
ca1c5aec7cde drm/lima: use scheduler dependency tracking
-:119: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 75 lines checked
1663a9467a04 drm/v3d: Move drm_sched_job_init to v3d_job_init
-:344: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 288 lines checked
2c445c2eb7fe drm/v3d: Use scheduler dependency handling
-:207: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 162 lines checked
e63e89cc8ac5 drm/etnaviv: Use scheduler dependency handling
-:13: WARNING:REPEATED_WORD: Possible repeated word: 'to'
#13: 
I wanted to to in the previous round (and did, for all other drivers).

-:122: WARNING:LINE_SPACING: Missing a blank line after declarations
#122: FILE: drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:552:
+   struct dma_fence *in_fence = 
sync_file_get_fence(args->fence_fd);
+   if (!in_fence) {

-:297: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 3 warnings, 0 checks, 243 lines checked
9f2a9ecfea9c drm/msm: Use scheduler dependency handling
-:132: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1
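
Two of the checkpatch findings above (UNSPECIFIED_INT and AVOID_BUG) have mechanical fixes. The following userspace sketch shows the shape of such a fix under stated assumptions: `struct entity_model` and `job_init_model()` are hypothetical names, and `fprintf` plus an error return stands in for the kernel's `WARN_ON` with recovery code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a scheduler entity. */
struct entity_model {
	unsigned int seq; /* spelled 'unsigned int', not bare 'unsigned' */
};

/* Instead of crashing on a NULL entity (the BUG_ON pattern), report the
 * problem and return an error code that the caller can handle. */
static int job_init_model(struct entity_model *entity)
{
	if (entity == NULL) {
		fprintf(stderr, "job_init_model: entity is NULL\n");
		return -EINVAL;
	}

	entity->seq++;
	return 0;
}
```

Whether a recoverable error is appropriate at the flagged call site is a judgment for the patch author; the sketch only illustrates what checkpatch is suggesting.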

Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/adl_p: Also disable underrun recovery with MSO

2021-08-17 Thread Matt Roper

On Tue, Aug 17, 2021 at 04:02:14PM +, Patchwork wrote:
> == Series Details ==
> 
> Series: drm/i915/adl_p: Also disable underrun recovery with MSO
> URL   : https://patchwork.freedesktop.org/series/93732/
> State : failure
> 
> == Summary ==
> 
> CI Bug Log - changes from CI_DRM_10490 -> Patchwork_20835
> 
> 
> Summary
> ---
> 
>   **FAILURE**
> 
>   Serious unknown changes coming with Patchwork_20835 absolutely need to be
>   verified manually.
>   
>   If you think the reported changes have nothing to do with the changes
>   introduced in Patchwork_20835, please notify your bug team to allow them
>   to document this new failure mode, which will reduce false positives in CI.
> 
>   External URL: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/index.html
> 
> Possible new issues
> ---
> 
>   Here are the unknown changes that may have been introduced in 
> Patchwork_20835:
> 
> ### IGT changes ###
> 
>  Possible regressions 
> 
>   * igt@i915_selftest@live@hangcheck:
> - fi-ivb-3770:    [PASS][1] -> [INCOMPLETE][2]
>[1]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/fi-ivb-3770/igt@i915_selftest@l...@hangcheck.html
>[2]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-ivb-3770/igt@i915_selftest@l...@hangcheck.html

This IVB error is unrelated to the patch here (which would only affect
platforms with display version >= 13).


Matt

> 
>   
> Known issues
> 
> 
>   Here are the changes found in Patchwork_20835 that come from known issues:
> 
> ### IGT changes ###
> 
>  Issues hit 
> 
>   * igt@amdgpu/amd_basic@semaphore:
> - fi-bdw-5557u:   NOTRUN -> [SKIP][3] ([fdo#109271]) +27 similar 
> issues
>[3]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-bdw-5557u/igt@amdgpu/amd_ba...@semaphore.html
> 
>   * igt@core_hotunplug@unbind-rebind:
> - fi-bdw-5557u:   NOTRUN -> [WARN][4] ([i915#3718])
>[4]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-bdw-5557u/igt@core_hotunp...@unbind-rebind.html
> 
>   * igt@kms_chamelium@dp-crc-fast:
> - fi-bdw-5557u:   NOTRUN -> [SKIP][5] ([fdo#109271] / [fdo#111827]) 
> +8 similar issues
>[5]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-bdw-5557u/igt@kms_chamel...@dp-crc-fast.html
> 
>   * igt@runner@aborted:
> - fi-ivb-3770:    NOTRUN -> [FAIL][6] ([fdo#109271])
>[6]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-ivb-3770/igt@run...@aborted.html
> 
>   
>   [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
>   [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
>   [i915#3718]: https://gitlab.freedesktop.org/drm/intel/issues/3718
> 
> 
> Participating hosts (36 -> 34)
> --
> 
>   Missing(2): fi-bsw-cyan fi-bdw-samus 
> 
> 
> Build changes
> -
> 
>   * Linux: CI_DRM_10490 -> Patchwork_20835
> 
>   CI-20190529: 20190529
>   CI_DRM_10490: 3bd74b377986fcb89cf4563629f97c5b3199ca6f @ 
> git://anongit.freedesktop.org/gfx-ci/linux
>   IGT_6177: f474644e7226dd319195ca03b3cde82ad10ac54c @ 
> https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
>   Patchwork_20835: 4a9bac99ddffb1e355f2084d1b46465aac20b6c8 @ 
> git://anongit.freedesktop.org/gfx-ci/linux
> 
> 
> == Linux commits ==
> 
> 4a9bac99ddff drm/i915/adl_p: Also disable underrun recovery with MSO
> 
> == Logs ==
> 
> For more details see: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/index.html

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795

Re: [Intel-gfx] [PATCH 14/22] drm/i915: Allocate error capture in atomic context

On Tue, Aug 17, 2021 at 12:06:16PM +0200, Daniel Vetter wrote:
> On Mon, Aug 16, 2021 at 06:51:31AM -0700, Matthew Brost wrote:
> > Error captures can now be done in a work queue processing G2H messages.
> > These messages need to be completely done being processed in the reset
> > path, to avoid races in the missing G2H cleanup, which create a
> > dependency on memory allocations and dma fences (i915_requests).
> > Requests depend on resets, thus now we have a circular dependency. To
> > work around this, allocate the error capture in an atomic context.
> > 
> > Fixes: dc0dad365c5e ("Fix for error capture after full GPU reset with GuC")
> > Fixes: 573ba126aef3 ("Capture error state on context reset")
> > Signed-off-by: Matthew Brost 
> > ---
> >  drivers/gpu/drm/i915/i915_gpu_error.c | 37 +--
> >  1 file changed, 18 insertions(+), 19 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
> > b/drivers/gpu/drm/i915/i915_gpu_error.c
> > index 0f08bcfbe964..453376aa6d9f 100644
> > --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> > +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> > @@ -49,7 +49,6 @@
> >  #include "i915_memcpy.h"
> >  #include "i915_scatterlist.h"
> >  
> > -#define ALLOW_FAIL (GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
> >  #define ATOMIC_MAYFAIL (GFP_ATOMIC | __GFP_NOWARN)
> 
> This one doesn't make much sense. GFP_ATOMIC essentially means we're
> high-priority and failure would be a pretty bad day. Meanwhile
> __GFP_NOWARN means we can totally cope with failure, pls don't holler.
> 
> GFP_NOWAIT | __GFP_NOWARN would be the more consistent one here I think.
> 
> See gfp.h for all the docs on this.
> 
> Separate patch ofc. This one is definitely the right direction, since
> GFP_KERNEL from the reset worker is not a good idea.

Lockdep is happy with GFP_NOWAIT so this works for me.

Matt

> -Daniel
> 
> >  
> >  static void __sg_set_buf(struct scatterlist *sg,
> > @@ -79,7 +78,7 @@ static bool __i915_error_grow(struct 
> > drm_i915_error_state_buf *e, size_t len)
> > if (e->cur == e->end) {
> > struct scatterlist *sgl;
> >  
> > -   sgl = (typeof(sgl))__get_free_page(ALLOW_FAIL);
> > +   sgl = (typeof(sgl))__get_free_page(ATOMIC_MAYFAIL);
> > if (!sgl) {
> > e->err = -ENOMEM;
> > return false;
> > @@ -99,10 +98,10 @@ static bool __i915_error_grow(struct 
> > drm_i915_error_state_buf *e, size_t len)
> > }
> >  
> > e->size = ALIGN(len + 1, SZ_64K);
> > -   e->buf = kmalloc(e->size, ALLOW_FAIL);
> > +   e->buf = kmalloc(e->size, ATOMIC_MAYFAIL);
> > if (!e->buf) {
> > e->size = PAGE_ALIGN(len + 1);
> > -   e->buf = kmalloc(e->size, GFP_KERNEL);
> > +   e->buf = kmalloc(e->size, ATOMIC_MAYFAIL);
> > }
> > if (!e->buf) {
> > e->err = -ENOMEM;
> > @@ -243,12 +242,12 @@ static bool compress_init(struct i915_vma_compress *c)
> >  {
> > > struct z_stream_s *zstream = &c->zstream;
> > >  
> > > -   if (pool_init(&c->pool, ALLOW_FAIL))
> > > +   if (pool_init(&c->pool, ATOMIC_MAYFAIL))
> > return false;
> >  
> > zstream->workspace =
> > kmalloc(zlib_deflate_workspacesize(MAX_WBITS, MAX_MEM_LEVEL),
> > -   ALLOW_FAIL);
> > +   ATOMIC_MAYFAIL);
> > if (!zstream->workspace) {
> > > pool_fini(&c->pool);
> > return false;
> > @@ -256,7 +255,7 @@ static bool compress_init(struct i915_vma_compress *c)
> >  
> > c->tmp = NULL;
> > if (i915_has_memcpy_from_wc())
> > > -   c->tmp = pool_alloc(&c->pool, ALLOW_FAIL);
> > > +   c->tmp = pool_alloc(&c->pool, ATOMIC_MAYFAIL);
> >  
> > return true;
> >  }
> > @@ -280,7 +279,7 @@ static void *compress_next_page(struct 
> > i915_vma_compress *c,
> > if (dst->page_count >= dst->num_pages)
> > return ERR_PTR(-ENOSPC);
> >  
> > > -   page = pool_alloc(&c->pool, ALLOW_FAIL);
> > > +   page = pool_alloc(&c->pool, ATOMIC_MAYFAIL);
> > if (!page)
> > return ERR_PTR(-ENOMEM);
> >  
> > @@ -376,7 +375,7 @@ struct i915_vma_compress {
> >  
> >  static bool compress_init(struct i915_vma_compress *c)
> >  {
> > > -   return pool_init(&c->pool, ALLOW_FAIL) == 0;
> > > +   return pool_init(&c->pool, ATOMIC_MAYFAIL) == 0;
> >  }
> >  
> >  static bool compress_start(struct i915_vma_compress *c)
> > @@ -391,7 +390,7 @@ static int compress_page(struct i915_vma_compress *c,
> >  {
> > void *ptr;
> >  
> > > -   ptr = pool_alloc(&c->pool, ALLOW_FAIL);
> > > +   ptr = pool_alloc(&c->pool, ATOMIC_MAYFAIL);
> > if (!ptr)
> > return -ENOMEM;
> >  
> > @@ -997,7 +996,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
> >  
> > num_pages = min_t(u64, vma->size, vma->obj->base.size) >> PAGE_SHIFT;
> > num_pages = DIV_ROUND_UP(10 * num_pages, 8); /* worstcase zlib growth */
> > -   dst = kmalloc(sizeof(*dst) + num_pages * sizeof(u32 *), ALLOW_FAIL);
> > +   dst = kmalloc(sizeof(*dst) +
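
The `__i915_error_grow` hunk quoted above tries a 64K-aligned allocation first and falls back to a page-aligned one when that fails, which only makes sense if the first attempt is allowed to fail quietly. A minimal userspace sketch of that pattern, under stated assumptions: all names here (`align_up`, `grow_buf`, `alloc_small_only`, the `MODEL_*` constants) are hypothetical, the page size is assumed to be 4096, and a pluggable allocator hook replaces `kmalloc` and its GFP flags.

```c
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u          /* assumed page size */
#define MODEL_SZ_64K    (64u * 1024u)

/* Round x up to a multiple of a; a must be a power of two. */
static size_t align_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Allocator hook so a caller can inject failures; may return NULL. */
typedef void *(*alloc_fn)(size_t size);

/* Simulates an allocator under pressure: refuses anything above 8 KiB. */
static void *alloc_small_only(size_t size)
{
	return size > 8192 ? NULL : malloc(size);
}

/* Try a 64K-aligned size first, then fall back to a page-aligned size.
 * On success the chosen size is stored in *size; if both attempts fail,
 * NULL is returned and *size is set to 0. */
static void *grow_buf(size_t len, alloc_fn alloc, size_t *size)
{
	void *buf;

	*size = align_up(len + 1, MODEL_SZ_64K);
	buf = alloc(*size);
	if (!buf) {
		*size = align_up(len + 1, MODEL_PAGE_SIZE);
		buf = alloc(*size);
	}
	if (!buf)
		*size = 0;
	return buf;
}
```

For `len = 100`, the first attempt asks for 65536 bytes and the fallback for 4096, so an allocator that cannot satisfy the large request can still succeed on the small one. The caller must handle the case where both fail, which is what the `-ENOMEM` path in the quoted code does.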

[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/adl_p: Also disable underrun recovery with MSO

== Series Details ==

Series: drm/i915/adl_p: Also disable underrun recovery with MSO
URL   : https://patchwork.freedesktop.org/series/93732/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10490 -> Patchwork_20835


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20835 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20835, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20835:

### IGT changes ###

 Possible regressions 

  * igt@i915_selftest@live@hangcheck:
- fi-ivb-3770:    [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/fi-ivb-3770/igt@i915_selftest@l...@hangcheck.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-ivb-3770/igt@i915_selftest@l...@hangcheck.html

  
Known issues


  Here are the changes found in Patchwork_20835 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@semaphore:
- fi-bdw-5557u:   NOTRUN -> [SKIP][3] ([fdo#109271]) +27 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-bdw-5557u/igt@amdgpu/amd_ba...@semaphore.html

  * igt@core_hotunplug@unbind-rebind:
- fi-bdw-5557u:   NOTRUN -> [WARN][4] ([i915#3718])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-bdw-5557u/igt@core_hotunp...@unbind-rebind.html

  * igt@kms_chamelium@dp-crc-fast:
- fi-bdw-5557u:   NOTRUN -> [SKIP][5] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-bdw-5557u/igt@kms_chamel...@dp-crc-fast.html

  * igt@runner@aborted:
- fi-ivb-3770:    NOTRUN -> [FAIL][6] ([fdo#109271])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/fi-ivb-3770/igt@run...@aborted.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#3718]: https://gitlab.freedesktop.org/drm/intel/issues/3718


Participating hosts (36 -> 34)
--

  Missing(2): fi-bsw-cyan fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_10490 -> Patchwork_20835

  CI-20190529: 20190529
  CI_DRM_10490: 3bd74b377986fcb89cf4563629f97c5b3199ca6f @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6177: f474644e7226dd319195ca03b3cde82ad10ac54c @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20835: 4a9bac99ddffb1e355f2084d1b46465aac20b6c8 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

4a9bac99ddff drm/i915/adl_p: Also disable underrun recovery with MSO

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20835/index.html

Re: [Intel-gfx] [PATCH v2] drm: avoid races with modesetting rights

2021-08-17 Thread Desmond Cheong Zhi Xi

On 16/8/21 9:59 pm, Daniel Vetter wrote:

On Mon, Aug 16, 2021 at 12:31 PM Desmond Cheong Zhi Xi
wrote:

On 16/8/21 5:04 pm, Daniel Vetter wrote:

On Mon, Aug 16, 2021 at 10:53 AM Desmond Cheong Zhi Xi
wrote:

On 16/8/21 2:47 am, kernel test robot wrote:

Hi Desmond,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on next-20210813]
[also build test ERROR on v5.14-rc5]
[cannot apply to linus/master v5.14-rc5 v5.14-rc4 v5.14-rc3]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Desmond-Cheong-Zhi-Xi/drm-avoid-races-with-modesetting-rights/20210815-234145
base:   4b358aabb93a2c654cd1dcab1a25a589f6e2b153
config: i386-randconfig-a004-20210815 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
#
https://github.com/0day-ci/linux/commit/cf6d8354b7d7953cd866fad004cbb189adfa074f
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review
Desmond-Cheong-Zhi-Xi/drm-avoid-races-with-modesetting-rights/20210815-234145
git checkout cf6d8354b7d7953cd866fad004cbb189adfa074f
# save the attached .config to linux build tree
make W=1 ARCH=i386

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot

All errors (new ones prefixed by >>, old ones prefixed by <<):

ERROR: modpost: "task_work_add" [drivers/gpu/drm/drm.ko] undefined!

I'm a bit uncertain about this. Looking into the .config used, this
error seems to happen because task_work_add isn't an exported symbol,
but DRM is being compiled as a loadable kernel module (CONFIG_DRM=m).

One way to deal with this is to export the symbol, but there was a
proposed patch to do this a few months back that wasn't picked up [1],
so I'm not sure what to make of this.

I'll export the symbol as part of a v3 series, and check in with the
task-work maintainers.

Link:
https://lore.kernel.org/lkml/20210127150029.13766-3-josh...@samsung.com/ [1]

Yeah that sounds best. I have two more thoughts on the patch:
- drm_master_flush isn't used by any modules outside of drm.ko, so we
can unexport it and drop the kerneldoc (the comment is still good).
These kind of internal functions have their declaration in
drm-internal.h - there's already a few there from drm_auth.c

Sounds good, I'll do that and move the declaration from drm_auth.h to
drm_internal.h.

- We now have 3 locks for master state, that feels a bit like
overkill. The spinlock I think we need to keep due to lock inversions,
but the master_mutex and master_rwsem look like we should be able to
merge them? I.e. anywhere we currently grab the master_mutex we could
instead grab the rwsem in either write mode (when we change stuff) or
read mode (when we just check, like in master_internal_acquire).

Thoughts?
-Daniel

Using rwsem in the places where we currently hold the mutex seems pretty
doable.

There are some tricky bits once we add rwsem read locks to the ioctl
handler. Some ioctl functions like drm_authmagic need a write lock.

Ah yes, I only looked at the dropmaster/setmaster ioctl, and those
don't have the DRM_MASTER bit set.

In this particular case, it might make sense to break master_mutex down
into finer-grained locks, since the function doesn't change master
permissions. It just needs to prevent concurrent writes to the
drm_master.magic_map idr.

Yeah for authmagic we could perhaps just reuse the spinlock to protect
->magic_map?

Yup, I had to move the spinlock from struct drm_file to struct
drm_device, but I think that should work.

For other ioctls, I'll take a closer look on a case-by-case basis.

If it's too much shuffling then I think totally fine to leave things
as-is. Just feels a bit silly to have 3 locks, one of which is an
rwlock itself, for this fairly small amount of state.
-Daniel

Agreed, there's a lot of overlap between the master_mutex and rwsem so
this is a good opportunity to refactor things.

I'm cleaning up a v3 series now. There's some movement, but most of it
is fixes to potential bugs that I saw while refactoring. We can see if
the new version is a better design.

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org

Re: [Intel-gfx] [PATCH 06/22] drm/i915/execlists: Do not propagate errors to dependent fences

On Tue, Aug 17, 2021 at 5:13 PM Matthew Brost  wrote:
> On Tue, Aug 17, 2021 at 11:21:27AM +0200, Daniel Vetter wrote:
> > On Mon, Aug 16, 2021 at 06:51:23AM -0700, Matthew Brost wrote:
> > > Propagating errors to dependent fences is wrong, don't do it. Selftest
> > > in following patch exposes this bug.
> >
> > Please explain what "this bug" is, it's hard to read minds, especially at
> > a distance in spacetime :-)
> >
>
> Not a very good explanation.
>
> > > Fixes: 8e9f84cf5cac ("drm/i915/gt: Propagate change in error status to 
> > > children on unhold")
> >
> > I think it would be better to outright revert this, instead of just
> > disabling it like this.
> >
>
> I tried revert and git did some really odd things that I couldn't
> resolve, hence the new patch.

If there's any conflict git just gives you your current code, and what
was there with the revert applied, with the block markers. Then it's
your job to manually apply that change.

Occasionally (when there's been ridiculous amounts of code movement)
it gets completely lost and puts these into very non-intuitive places.
In that case just delete it, keep the current code, and check what
change you're missing that needs to be manually reverted still. Also
sometimes there's a follow-up patch that you should revert first,
which makes the revert clean. In that case it's generally the right
thing to revert the follow-up first, and then apply your revert. Often
there's subtle functional dependencies hiding.
-Daniel

>
> > Also please cite the dma_fence error propagation revert from Jason:
> >
> > commit 93a2711cddd5760e2f0f901817d71c93183c3b87
> > Author: Jason Ekstrand 
> > Date:   Wed Jul 14 14:34:16 2021 -0500
> >
> > Revert "drm/i915: Propagate errors on awaiting already signaled fences"
> >
> > Maybe in full, if you need the justification.
> >
>
> Will cite.
>
> > > Signed-off-by: Matthew Brost 
> > > Cc: 
> >
> > Unless "this bug" is some real world impact thing I wouldn't put cc:
> > stable on this.
>
> Got it.
>
> Matt
>
> > -Daniel
> > > ---
> > >  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 
> > >  1 file changed, 4 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
> > > b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > index de5f9c86b9a4..cafb0608ffb4 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > @@ -2140,10 +2140,6 @@ static void __execlists_unhold(struct i915_request 
> > > *rq)
> > > if (p->flags & I915_DEPENDENCY_WEAK)
> > > continue;
> > >
> > > -   /* Propagate any change in error status */
> > > -   if (rq->fence.error)
> > > -   i915_request_set_error_once(w, 
> > > rq->fence.error);
> > > -
> > > if (w->engine != rq->engine)
> > > continue;
> > >
> > > --
> > > 2.32.0
> > >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [PATCH 18/22] drm/i915/guc: Rework and simplify locking

On Tue, Aug 17, 2021 at 12:15:21PM +0200, Daniel Vetter wrote:
> On Mon, Aug 16, 2021 at 06:51:35AM -0700, Matthew Brost wrote:
> > Rework and simplify the locking with GuC submission. Drop
> > sched_state_no_lock and move all fields under the guc_state.sched_state
> > and protect all these fields with guc_state.lock. This requires
> > changing the locking hierarchy from guc_state.lock -> sched_engine.lock
> > to sched_engine.lock -> guc_state.lock.
> > 
> > Signed-off-by: Matthew Brost 
> 
> Yeah this is definitely going in the right direction. Especially
> sprinkling lockdep_assert_held around.
> 
> One comment below.
> 
> > ---
> >  drivers/gpu/drm/i915/gt/intel_context_types.h |   5 +-
> >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 186 --
> >  drivers/gpu/drm/i915/i915_trace.h |   6 +-
> >  3 files changed, 89 insertions(+), 108 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
> > b/drivers/gpu/drm/i915/gt/intel_context_types.h
> > index c06171ee8792..d5d643b04d54 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> > @@ -161,7 +161,7 @@ struct intel_context {
> >  * sched_state: scheduling state of this context using GuC
> >  * submission
> >  */
> > -   u16 sched_state;
> > +   u32 sched_state;
> > /*
> >  * fences: maintains of list of requests that have a submit
> >  * fence related to GuC submission
> > @@ -178,9 +178,6 @@ struct intel_context {
> > struct list_head requests;
> > } guc_active;
> >  
> > -   /* GuC scheduling state flags that do not require a lock. */
> > -   atomic_t guc_sched_state_no_lock;
> > -
> > /* GuC LRC descriptor ID */
> > u16 guc_id;
> >  
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index 7aa16371908a..ba19b99173fc 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -72,86 +72,23 @@ guc_create_virtual(struct intel_engine_cs **siblings, 
> > unsigned int count);
> >  
> >  #define GUC_REQUEST_SIZE 64 /* bytes */
> >  
> > -/*
> > - * Below is a set of functions which control the GuC scheduling state 
> > which do
> > - * not require a lock as all state transitions are mutually exclusive. 
> > i.e. It
> > - * is not possible for the context pinning code and submission, for the 
> > same
> > - * context, to be executing simultaneously. We still need an atomic as it 
> > is
> > - * possible for some of the bits to changing at the same time though.
> > - */
> > -#define SCHED_STATE_NO_LOCK_ENABLED BIT(0)
> > -#define SCHED_STATE_NO_LOCK_PENDING_ENABLE BIT(1)
> > -#define SCHED_STATE_NO_LOCK_REGISTERED BIT(2)
> > -static inline bool context_enabled(struct intel_context *ce)
> > -{
> > -   return (atomic_read(&ce->guc_sched_state_no_lock) &
> > -   SCHED_STATE_NO_LOCK_ENABLED);
> > -}
> > -
> > -static inline void set_context_enabled(struct intel_context *ce)
> > -{
> > -   atomic_or(SCHED_STATE_NO_LOCK_ENABLED, &ce->guc_sched_state_no_lock);
> > -}
> > -
> > -static inline void clr_context_enabled(struct intel_context *ce)
> > -{
> > -   atomic_and((u32)~SCHED_STATE_NO_LOCK_ENABLED,
> > -  &ce->guc_sched_state_no_lock);
> > -}
> > -
> > -static inline bool context_pending_enable(struct intel_context *ce)
> > -{
> > -   return (atomic_read(&ce->guc_sched_state_no_lock) &
> > -   SCHED_STATE_NO_LOCK_PENDING_ENABLE);
> > -}
> > -
> > -static inline void set_context_pending_enable(struct intel_context *ce)
> > -{
> > -   atomic_or(SCHED_STATE_NO_LOCK_PENDING_ENABLE,
> > - &ce->guc_sched_state_no_lock);
> > -}
> > -
> > -static inline void clr_context_pending_enable(struct intel_context *ce)
> > -{
> > -   atomic_and((u32)~SCHED_STATE_NO_LOCK_PENDING_ENABLE,
> > -  &ce->guc_sched_state_no_lock);
> > -}
> > -
> > -static inline bool context_registered(struct intel_context *ce)
> > -{
> > -   return (atomic_read(&ce->guc_sched_state_no_lock) &
> > -   SCHED_STATE_NO_LOCK_REGISTERED);
> > -}
> > -
> > -static inline void set_context_registered(struct intel_context *ce)
> > -{
> > -   atomic_or(SCHED_STATE_NO_LOCK_REGISTERED,
> > - &ce->guc_sched_state_no_lock);
> > -}
> > -
> > -static inline void clr_context_registered(struct intel_context *ce)
> > -{
> > -   atomic_and((u32)~SCHED_STATE_NO_LOCK_REGISTERED,
> > -  &ce->guc_sched_state_no_lock);
> > -}
> > -
> >  /*
> >   * Below is a set of functions which control the GuC scheduling state which
> > - * require a lock, aside from the special case where the functions are 
> > called
> > - * from guc_lrc_desc_pin(). In that case it isn't possible for any other 
> > code
> > - * path to be executing on the context.
> > + * require a

Re: [Intel-gfx] [PATCH 19/22] drm/i915/guc: Proper xarray usage for contexts_lookup

On Tue, Aug 17, 2021 at 12:27:29PM +0200, Daniel Vetter wrote:
> On Mon, Aug 16, 2021 at 06:51:36AM -0700, Matthew Brost wrote:
> > Lock the xarray and take ref to the context if needed.
> > 
> > v2:
> >  (Checkpatch)
> >   - Add new line after declaration
> > 
> > Signed-off-by: Matthew Brost 
> > ---
> >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 84 ---
> >  1 file changed, 73 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index ba19b99173fc..2ecb2f002bed 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -599,8 +599,18 @@ static void scrub_guc_desc_for_outstanding_g2h(struct 
> > intel_guc *guc)
> > unsigned long index, flags;
> > bool pending_disable, pending_enable, deregister, destroyed, banned;
> >  
> > +   xa_lock_irqsave(&guc->context_lookup, flags);
> > xa_for_each(&guc->context_lookup, index, ce) {
> > -   spin_lock_irqsave(&ce->guc_state.lock, flags);
> > +   /*
> > +* Corner case where the ref count on the object is zero but and
> > +* deregister G2H was lost. In this case we don't touch the ref
> > +* count and finish the destroy of the context.
> > +*/
> > +   bool do_put = kref_get_unless_zero(&ce->ref);
> 
> This looks really scary, because in another loop below you have an
> unconditional refcount increase. This means sometimes guc->context_lookup

Yeah, good catch, those loops need something like this too.

> xarray guarantees we hold a full reference on the context, sometimes we
> don't. So we're right back in "protect the code" O(N^2) review complexity
> instead of invariant rules about the datastructure, which is linear.
> 
> Essentially anytime you feel like you have to add a comment to explain
> what's going on about concurrent stuff you're racing with, you're
> protecting code, not data.
> 
> Since guc can't do a whole lot without the guc_id registered and all that,
> I kinda expected you'd always have a full reference here. If there's

The deregister is triggered by the ref count going to zero, and we can't
fully release the guc_id until that operation completes, which is why it is
still in the xarray. I think the solution here is to use an iterator like
the one you mention below that ref counts this correctly.

> intermediate stages (e.g. around unregister) where this is currently not
> always the case, then those should make sure a full reference is held.
> 
> Another option would be to treat ->context_lookup as a weak reference that
> we lazily clean up when the context is finalized. That works too, but
> probably not with a spinlock (since you most likely have to wait for all
> pending guc transactions to complete), but it's another option.
> 
> Either way I think standard process is needed here for locking design,
> i.e.
> 1. come up with the right invariants ("we always have a full reference
> when a context is on the guc->context_lookup xarray")
> 2. come up with the locks. From the guc side the xa_lock is maybe good
> enough, but from the context side this doesn't protect against a
> re-registering racing against a deregistering. So probably needs more
> rules on top, and then you have a nice lock inversion in a few places like
> here.
> 3. document it and roll it out.
> 
> The other thing is that this is a very tricky iterator, and there's a few
> copies of it. That is, if this is the right solution. As-is this should be
> abstracted away into guc_context_iter_begin/next_end() helpers, e.g. like
> we have for drm_connector_list_iter_begin/end_next as an example.
>

I can check this out.

Matt
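
To make the iterator idea discussed above a bit more concrete, here is a
minimal userspace sketch in C11. It is not i915 code: a fixed array stands in
for the xarray, a plain atomic counter stands in for the kref, and struct ctx,
ctx_get_unless_zero() and the ctx_iter_*() helpers are hypothetical names. In
the kernel the lookup and the conditional get would also have to happen under
the xa_lock, which the sketch leaves out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted context; not the i915 struct. */
struct ctx {
	atomic_int ref;
	int id;
};

/* Take a reference only if the count has not already dropped to zero. */
static bool ctx_get_unless_zero(struct ctx *c)
{
	int old = atomic_load(&c->ref);

	while (old > 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&c->ref, &old, old + 1))
			return true;
	}
	return false;
}

static void ctx_put(struct ctx *c)
{
	atomic_fetch_sub(&c->ref, 1);
}

/* Iterator state over a fixed table that stands in for the xarray. */
struct ctx_iter {
	struct ctx **table;
	size_t count;
	size_t pos;
	struct ctx *cur;	/* entry the iterator holds a reference on */
};

static void ctx_iter_begin(struct ctx_iter *it, struct ctx **table,
			   size_t count)
{
	it->table = table;
	it->count = count;
	it->pos = 0;
	it->cur = NULL;
}

/* Drop the previous reference, then return the next live entry or NULL. */
static struct ctx *ctx_iter_next(struct ctx_iter *it)
{
	if (it->cur) {
		ctx_put(it->cur);
		it->cur = NULL;
	}

	while (it->pos < it->count) {
		struct ctx *c = it->table[it->pos++];

		/* Skip empty slots and entries already on their way out. */
		if (c && ctx_get_unless_zero(c)) {
			it->cur = c;
			return c;
		}
	}
	return NULL;
}

/* Release whatever reference is still held, e.g. after an early break. */
static void ctx_iter_end(struct ctx_iter *it)
{
	if (it->cur) {
		ctx_put(it->cur);
		it->cur = NULL;
	}
}
```

A caller loops on ctx_iter_next() until it returns NULL and calls
ctx_iter_end() if it leaves the loop early. Entries whose count has already
hit zero are skipped rather than resurrected, so the reference handling lives
in one place instead of being repeated in every loop.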
 
> Cheers, Daniel
> 
> > +
> > +   xa_unlock(&guc->context_lookup);
> > +
> > +   spin_lock(&ce->guc_state.lock);
> >  
> > /*
> >  * Once we are at this point submission_disabled() is guaranteed
> > @@ -616,7 +626,9 @@ static void scrub_guc_desc_for_outstanding_g2h(struct 
> > intel_guc *guc)
> > banned = context_banned(ce);
> > init_sched_state(ce);
> >  
> > -   spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> > +   spin_unlock(&ce->guc_state.lock);
> > +
> > +   GEM_BUG_ON(!do_put && !destroyed);
> >  
> > if (pending_enable || destroyed || deregister) {
> > atomic_dec(&guc->outstanding_submission_g2h);
> > @@ -645,7 +657,12 @@ static void scrub_guc_desc_for_outstanding_g2h(struct 
> > intel_guc *guc)
> >  
> > intel_context_put(ce);
> > }
> > +
> > +   if (do_put)
> > +   intel_context_put(ce);
> > +   xa_lock(&guc->context_lookup);
> > }
> > +   xa_unlock_irqrestore(&guc->context_lookup, flags);
> >  }
> >  
> >  static inline bool
> > @@ -866,16 +883,26 @@ void intel_guc_submission_reset(struct intel_guc 
> > *guc, bool stalled)
> >  {
> > struct

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Ditch the i915_gem_ww_ctx loop member (rev2)

== Series Details ==

Series: drm/i915: Ditch the i915_gem_ww_ctx loop member (rev2)
URL   : https://patchwork.freedesktop.org/series/93711/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10490 -> Patchwork_20834


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/index.html

Known issues


  Here are the changes found in Patchwork_20834 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@gem_exec_suspend@basic-s3:
- fi-tgl-1115g4:  [PASS][1] -> [FAIL][2] ([i915#1888])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#3057]: https://gitlab.freedesktop.org/drm/intel/issues/3057
  [i915#3970]: https://gitlab.freedesktop.org/drm/intel/issues/3970


Participating hosts (36 -> 34)
--

  Missing(2): fi-bsw-cyan fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_10490 -> Patchwork_20834

  CI-20190529: 20190529
  CI_DRM_10490: 3bd74b377986fcb89cf4563629f97c5b3199ca6f @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6177: f474644e7226dd319195ca03b3cde82ad10ac54c @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20834: 32dc940d3e7e45a61a14938652fc5e8a24b5a923 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

32dc940d3e7e drm/i915: Ditch the i915_gem_ww_ctx loop member

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20834/index.html

Re: [Intel-gfx] [PATCH 06/22] drm/i915/execlists: Do not propagate errors to dependent fences

On Tue, Aug 17, 2021 at 11:21:27AM +0200, Daniel Vetter wrote:
> On Mon, Aug 16, 2021 at 06:51:23AM -0700, Matthew Brost wrote:
> > Propagating errors to dependent fences is wrong, don't do it. Selftest
> > in following patch exposes this bug.
> 
> Please explain what "this bug" is, it's hard to read minds, especially at
> a distance in spacetime :-)
> 

Not a very good explanation.

> > Fixes: 8e9f84cf5cac ("drm/i915/gt: Propagate change in error status to 
> > children on unhold")
> 
> I think it would be better to outright revert this, instead of just
> disabling it like this.
>

I tried revert and git did some really odd things that I couldn't
resolve, hence the new patch.
 
> Also please cite the dma_fence error propagation revert from Jason:
> 
> commit 93a2711cddd5760e2f0f901817d71c93183c3b87
> Author: Jason Ekstrand 
> Date:   Wed Jul 14 14:34:16 2021 -0500
> 
> Revert "drm/i915: Propagate errors on awaiting already signaled fences"
> 
> Maybe in full, if you need the justification.
>

Will cite.

> > Signed-off-by: Matthew Brost 
> > Cc: 
> 
> Unless "this bug" is some real world impact thing I wouldn't put cc:
> stable on this.

Got it.

Matt

> -Daniel
> > ---
> >  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 
> >  1 file changed, 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
> > b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > index de5f9c86b9a4..cafb0608ffb4 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > @@ -2140,10 +2140,6 @@ static void __execlists_unhold(struct i915_request 
> > *rq)
> > if (p->flags & I915_DEPENDENCY_WEAK)
> > continue;
> >  
> > -   /* Propagate any change in error status */
> > -   if (rq->fence.error)
> > -   i915_request_set_error_once(w, rq->fence.error);
> > -
> > if (w->engine != rq->engine)
> > continue;
> >  
> > -- 
> > 2.32.0
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

Re: [Intel-gfx] [PATCH 05/22] drm/i915/guc: Workaround reset G2H is received after schedule done G2H

On Tue, Aug 17, 2021 at 11:32:56AM +0200, Daniel Vetter wrote:
> On Mon, Aug 16, 2021 at 06:51:22AM -0700, Matthew Brost wrote:
> > If the context is reset as a result of the request cancelation the
> > context reset G2H is received after schedule disable done G2H which is
> > likely the wrong order. The schedule disable done G2H releases the
> > waiting request cancelation code which resubmits the context. This races
> > with the context reset G2H which also wants to resubmit the context but
> > in this case it really should be a NOP as request cancelation code owns
> > the resubmit. Use some clever tricks of checking the context state to
> > seal this race until if / when the GuC firmware is fixed.
> > 
> > v2:
> >  (Checkpatch)
> >   - Fix typos
> > 
> > Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation")
> > Signed-off-by: Matthew Brost 
> > Cc: 
> > ---
> >  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 43 ---
> >  1 file changed, 37 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index 3cd2da6f5c03..c3b7bf7319dd 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -826,17 +826,35 @@ __unwind_incomplete_requests(struct intel_context *ce)
> >  static void __guc_reset_context(struct intel_context *ce, bool stalled)
> >  {
> > struct i915_request *rq;
> > +   unsigned long flags;
> > u32 head;
> > +   bool skip = false;
> >  
> > intel_context_get(ce);
> >  
> > /*
> > -* GuC will implicitly mark the context as non-schedulable
> > -* when it sends the reset notification. Make sure our state
> > -* reflects this change. The context will be marked enabled
> > -* on resubmission.
> > +* GuC will implicitly mark the context as non-schedulable when it sends
> > +* the reset notification. Make sure our state reflects this change. The
> > +* context will be marked enabled on resubmission.
> > +*
> > +* XXX: If the context is reset as a result of the request cancellation
> > +* this G2H is received after the schedule disable complete G2H which is
> > +* likely wrong as this creates a race between the request cancellation
> > +* code re-submitting the context and this G2H handler. This likely
> > +* should be fixed in the GuC but until if / when that gets fixed we
> > +* need to workaround this. Convert this function to a NOP if a pending
> > +* enable is in flight as this indicates that a request cancellation has
> > +* occurred.
> >  */
> > -   clr_context_enabled(ce);
> > +   spin_lock_irqsave(&ce->guc_state.lock, flags);
> > +   if (likely(!context_pending_enable(ce))) {
> > +   clr_context_enabled(ce);
> > +   } else {
> > +   skip = true;
> > +   }
> > +   spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> > +   if (unlikely(skip))
> > +   goto out_put;
> >  
> > rq = intel_context_find_active_request(ce);
> > if (!rq) {
> > @@ -855,6 +873,7 @@ static void __guc_reset_context(struct intel_context 
> > *ce, bool stalled)
> >  out_replay:
> > guc_reset_state(ce, head, stalled);
> > __unwind_incomplete_requests(ce);
> > +out_put:
> > intel_context_put(ce);
> >  }
> >  
> > @@ -1599,6 +1618,13 @@ static void guc_context_cancel_request(struct 
> > intel_context *ce,
> > guc_reset_state(ce, intel_ring_wrap(ce->ring, rq->head),
> > true);
> > }
> > +
> > +   /*
> > +* XXX: Racey if context is reset, see comment in
> > +* __guc_reset_context().
> > +*/
> > +   flush_work(&ce_to_guc(ce)->ct.requests.worker);
> 
> This looks racy, and I think that holds in general for all the flush_work
> you're adding: This only flushes the processing of the work, it doesn't
> stop any re-queueing (as far as I can tell at least), which means it
> doesn't do a whole lot.
> 
> Worse, your task re-queues itself because it only processes one item at a
> time. That means flush_work only flushes the first invocation, and doesn't
> even drain them all. So even if you do prevent requeueing somehow, this isn't
> what you want. Two solutions.
> 
> - flush_work_sync, which flushes until self-requeues are all done too
> 
> - Or more preferred, make your worker a bit more standard for this
>   stuff: a) under the spinlock, take the entire list, not just the first
>   entry, with list_move or similar to a local list b) process that local
>   list in a loop c) don't requeue yourself.

This seems better, not sure why it currently doesn't do that as I
didn't write that code.

Also BTW, confirmed with the GuC team the order of the G2H is incorrect
and will get fixed in an upcoming release, once that happens most of
this patch can get dropped.

Matt 
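
For reference, the preferred worker shape described above (detach the whole
pending list under the lock, process the private list, never requeue) could
look roughly like this userspace sketch. It is an illustration under
assumptions, not the CT worker: struct item, struct queue, worker_drain() and
sum_handler() are hypothetical names, and an atomic_flag spin loop stands in
for the real spinlock.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical work item and queue; names are illustrative only. */
struct item {
	struct item *next;
	int payload;
};

struct queue {
	atomic_flag lock;	/* minimal spinlock standing in for the real one */
	struct item *head;
	struct item *tail;
};

static void queue_init(struct queue *q)
{
	atomic_flag_clear(&q->lock);
	q->head = NULL;
	q->tail = NULL;
}

static void queue_lock(struct queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock,
						 memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void queue_unlock(struct queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Producer side: append one item to the tail under the lock. */
static void queue_push(struct queue *q, struct item *it)
{
	it->next = NULL;
	queue_lock(q);
	if (q->tail)
		q->tail->next = it;
	else
		q->head = it;
	q->tail = it;
	queue_unlock(q);
}

/* Example handler used to observe what the worker processed. */
static int handled_sum;

static void sum_handler(struct item *it)
{
	handled_sum += it->payload;
}

/*
 * Worker body: detach the entire pending list under the lock, process the
 * now-private list without the lock, and return without requeueing.
 * Returns the number of items handled in this invocation.
 */
static int worker_drain(struct queue *q, void (*handle)(struct item *))
{
	struct item *local;
	int n = 0;

	queue_lock(q);
	local = q->head;
	q->head = NULL;
	q->tail = NULL;
	queue_unlock(q);

	while (local) {
		struct item *next = local->next;

		handle(local);
		local = next;
		n++;
	}
	return n;
}
```

With this shape a single invocation handles everything that was queued before
it took the lock, so waiting for that one invocation is enough to know those
items were processed.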

> 
> Cheers, Daniel
> > +
> >

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Ditch the i915_gem_ww_ctx loop member (rev2)

== Series Details ==

Series: drm/i915: Ditch the i915_gem_ww_ctx loop member (rev2)
URL   : https://patchwork.freedesktop.org/series/93711/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
32dc940d3e7e drm/i915: Ditch the i915_gem_ww_ctx loop member
-:70: CHECK:MACRO_ARG_REUSE: Macro argument reuse '_ww' - possible side-effects?
#70: FILE: drivers/gpu/drm/i915/i915_gem_ww.h:37:
+#define for_i915_gem_ww(_ww, _err, _intr)\
+   for (i915_gem_ww_ctx_init(_ww, _intr), (_err) = -EDEADLK; \
+(_err) == -EDEADLK;  \
+(_err) = __i915_gem_ww_fini(_ww, _err))

-:70: CHECK:MACRO_ARG_REUSE: Macro argument reuse '_err' - possible 
side-effects?
#70: FILE: drivers/gpu/drm/i915/i915_gem_ww.h:37:
+#define for_i915_gem_ww(_ww, _err, _intr)\
+   for (i915_gem_ww_ctx_init(_ww, _intr), (_err) = -EDEADLK; \
+(_err) == -EDEADLK;  \
+(_err) = __i915_gem_ww_fini(_ww, _err))

total: 0 errors, 0 warnings, 2 checks, 47 lines checked
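
Background on the MACRO_ARG_REUSE check: a for-style retry macro has to name
its arguments in more than one clause, so an argument expression with side
effects runs more than once. The sketch below is a self-contained
illustration, not the i915 macro; struct retry_ctx, for_retry(), lookup_ctx()
and the other helpers are made-up names that only mirror the shape of the loop
reported above.

```c
#include <errno.h>

/* Hypothetical retry context; not the i915 ww context. */
struct retry_ctx {
	int attempts;
	int fail_first;	/* how many attempts report contention */
};

/* Counts how often the macro evaluates its context argument. */
static int lookups;

static struct retry_ctx *lookup_ctx(struct retry_ctx *ctx)
{
	lookups++;
	return ctx;
}

static void retry_begin(struct retry_ctx *ctx)
{
	ctx->attempts = 0;
}

/* Pass the body's result through; -EDEADLK keeps the loop going. */
static int retry_end(struct retry_ctx *ctx, int err)
{
	(void)ctx;
	return err;
}

/* One attempt: reports -EDEADLK for the first fail_first calls. */
static int try_once(struct retry_ctx *ctx)
{
	return ctx->attempts++ < ctx->fail_first ? -EDEADLK : 0;
}

/* _ctx is expanded twice and _err three times per macro use. */
#define for_retry(_ctx, _err) \
	for (retry_begin(_ctx), (_err) = -EDEADLK; \
	     (_err) == -EDEADLK; \
	     (_err) = retry_end(_ctx, _err))

static int run_with_retries(struct retry_ctx *ctx)
{
	int err;

	/* lookup_ctx() has a side effect, so every expansion is visible. */
	for_retry(lookup_ctx(ctx), err)
		err = try_once(ctx);

	return err;
}
```

Passing a plain local variable, as callers of such macros normally do, makes
the repeated evaluation harmless; the check is a reminder rather than an error.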

[Intel-gfx] ✗ Fi.CI.IGT: failure for Clean up GuC CI failures, simplify locking, and kernel DOC (rev2)

== Series Details ==

Series: Clean up GuC CI failures, simplify locking, and kernel DOC (rev2)
URL   : https://patchwork.freedesktop.org/series/93704/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10490_full -> Patchwork_20833_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20833_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20833_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20833_full:

### IGT changes ###

 Possible regressions 

  * igt@kms_flip_tiling@flip-to-yf-tiled@edp-1-pipe-a:
- shard-skl:  [PASS][1] -> [FAIL][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-skl10/igt@kms_flip_tiling@flip-to-yf-ti...@edp-1-pipe-a.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/shard-skl7/igt@kms_flip_tiling@flip-to-yf-ti...@edp-1-pipe-a.html

  
New tests
-

  New tests have been introduced between CI_DRM_10490_full and 
Patchwork_20833_full:

### New IGT tests (1) ###

  * igt@i915_selftest@live@guc:
- Statuses : 5 pass(s)
- Exec time: [0.95, 4.69] s

  

Known issues


  Here are the changes found in Patchwork_20833_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_ctx_persistence@legacy-engines-queued:
- shard-snb:  NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#1099]) +2 
similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/shard-snb5/igt@gem_ctx_persiste...@legacy-engines-queued.html

  * igt@gem_eio@unwedge-stress:
- shard-tglb: [PASS][4] -> [TIMEOUT][5] ([i915#2369] / [i915#3063] 
/ [i915#3648])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-tglb7/igt@gem_...@unwedge-stress.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/shard-tglb7/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-kbl:  [PASS][6] -> [FAIL][7] ([i915#2842]) +5 similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-kbl6/igt@gem_exec_fair@basic-none-s...@rcs0.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/shard-kbl4/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-iclb: NOTRUN -> [FAIL][8] ([i915#2842]) +1 similar issue
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/shard-iclb2/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-glk:  [PASS][9] -> [FAIL][10] ([i915#2842]) +2 similar 
issues
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-glk5/igt@gem_exec_fair@basic-throt...@rcs0.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/shard-glk9/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_exec_suspend@basic-s4-devices:
- shard-glk:  NOTRUN -> [DMESG-WARN][11] ([i915#1610])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/shard-glk6/igt@gem_exec_susp...@basic-s4-devices.html

  * igt@gem_exec_whisper@basic-fds-forked:
- shard-glk:  [PASS][12] -> [DMESG-WARN][13] ([i915#118] / 
[i915#95])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-glk7/igt@gem_exec_whis...@basic-fds-forked.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/shard-glk7/igt@gem_exec_whis...@basic-fds-forked.html

  * igt@gem_huc_copy@huc-copy:
- shard-tglb: [PASS][14] -> [SKIP][15] ([i915#2190])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/shard-tglb2/igt@gem_huc_c...@huc-copy.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/shard-tglb6/igt@gem_huc_c...@huc-copy.html

  * igt@gem_pwrite@basic-exhaustion:
- shard-tglb: NOTRUN -> [WARN][16] ([i915#2658])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/shard-tglb8/igt@gem_pwr...@basic-exhaustion.html

  * igt@gem_render_copy@y-tiled-to-vebox-y-tiled:
- shard-iclb: NOTRUN -> [SKIP][17] ([i915#768])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/shard-iclb4/igt@gem_render_c...@y-tiled-to-vebox-y-tiled.html

  * igt@gem_userptr_blits@access-control:
- shard-tglb: NOTRUN -> [SKIP][18] ([i915#3297]) +1 similar issue
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/shard-tglb5/igt@gem_userptr_bl...@access-control.html
- shard-iclb: NOTRUN -> [SKIP][19] ([i915#3297]) +1 similar issue
   [19]:

[Intel-gfx] ✓ Fi.CI.BAT: success for Clean up GuC CI failures, simplify locking, and kernel DOC (rev2)

== Series Details ==

Series: Clean up GuC CI failures, simplify locking, and kernel DOC (rev2)
URL   : https://patchwork.freedesktop.org/series/93704/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10490 -> Patchwork_20833


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20833:

### IGT changes ###

 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@i915_selftest@live@gt_heartbeat:
- {fi-ehl-2}: [PASS][1] -> [DMESG-FAIL][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/fi-ehl-2/igt@i915_selftest@live@gt_heartbeat.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/fi-ehl-2/igt@i915_selftest@live@gt_heartbeat.html

  
New tests
-

  New tests have been introduced between CI_DRM_10490 and Patchwork_20833:

### New IGT tests (1) ###

  * igt@i915_selftest@live@guc:
- Statuses : 30 pass(s)
- Exec time: [0.40, 5.19] s

  

Known issues


  Here are the changes found in Patchwork_20833 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@semaphore:
- fi-bdw-5557u:   NOTRUN -> [SKIP][3] ([fdo#109271]) +27 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/fi-bdw-5557u/igt@amdgpu/amd_ba...@semaphore.html

  * igt@core_hotunplug@unbind-rebind:
- fi-bdw-5557u:   NOTRUN -> [WARN][4] ([i915#3718])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/fi-bdw-5557u/igt@core_hotunp...@unbind-rebind.html

  * igt@gem_exec_parallel@engines@userptr:
- fi-pnv-d510:    [PASS][5] -> [INCOMPLETE][6] ([i915#299])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10490/fi-pnv-d510/igt@gem_exec_parallel@engi...@userptr.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/fi-pnv-d510/igt@gem_exec_parallel@engi...@userptr.html

  * igt@kms_chamelium@dp-crc-fast:
- fi-bdw-5557u:   NOTRUN -> [SKIP][7] ([fdo#109271] / [fdo#111827]) +8 
similar issues
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/fi-bdw-5557u/igt@kms_chamel...@dp-crc-fast.html

  * igt@runner@aborted:
- fi-pnv-d510:    NOTRUN -> [FAIL][8] ([i915#2403] / [i915#2505] / 
[i915#2722])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20833/fi-pnv-d510/igt@run...@aborted.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#2403]: https://gitlab.freedesktop.org/drm/intel/issues/2403
  [i915#2505]: https://gitlab.freedesktop.org/drm/intel/issues/2505
  [i915#2722]: https://gitlab.freedesktop.org/drm/intel/issues/2722
  [i915#299]: https://gitlab.freedesktop.org/drm/intel/issues/299
  [i915#3718]: https://gitlab.freedesktop.org/drm/intel/issues/3718


Participating hosts (36 -> 34)
--

  Missing(2): fi-bsw-cyan fi-bdw-samus 


Build changes
-

  * Linux: CI_DRM_10490 -> Patchwork_20833

  CI-20190529: 20190529
  CI_DRM_10490: 3bd74b377986fcb89cf4563629f97c5b3199ca6f @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6177: f474644e7226dd319195ca03b3cde82ad10ac54c @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20833: c4c34f7bb22c9a83377812d75d8eb207a44a1b9b @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

c4c34f7bb22c drm/i915/guc: Add GuC kernel doc
bb901831764c drm/i915/guc: Move GuC priority fields in context under guc_active
492de6bdaacb drm/i915/guc: Drop pin count check trick between sched_disable and 
re-pin
69e885b61035 drm/i915/guc: Proper xarray usage for contexts_lookup
fa32fd7346d0 drm/i915/guc: Rework and simplify locking
c8b83840007d drm/i915/guc: Move guc_blocked fence to struct guc_state
dcd9725de04f drm/i915/guc: Release submit fence from an IRQ
31fbd295c9f5 drm/i915/guc: Flush G2H work queue during reset
e21f028c082e drm/i915: Allocate error capture in atomic context
48c820953477 drm/i915/guc: Reset LRC descriptor if register returns -ENODEV
738284a940e2 drm/i915/guc: Don't touch guc_state.sched_state without a lock
957737f84734 drm/i915/guc: Take context ref when cancelling request
241da61be83d drm/i915/selftests: Add initial GuC selftest for scrubbing lost G2H
221846309949 drm/i915/selftests: Fix memory corruption in live_lrc_isolation
ba1d218343a3 drm/i915/guc: Don't enable scheduling on a banned context, guc_id 
invalid, not registered
862260cf6795 drm/i915/selftests: Add a cancel request selftest that

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Clean up GuC CI failures, simplify locking, and kernel DOC (rev2)

== Series Details ==

Series: Clean up GuC CI failures, simplify locking, and kernel DOC (rev2)
URL   : https://patchwork.freedesktop.org/series/93704/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
+drivers/gpu/drm/i915/selftests/i915_syncmap.c:80:54: warning: dubious: x | !y

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Clean up GuC CI failures, simplify locking, and kernel DOC (rev2)

== Series Details ==

Series: Clean up GuC CI failures, simplify locking, and kernel DOC (rev2)
URL   : https://patchwork.freedesktop.org/series/93704/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
b9583f1b134e drm/i915/guc: Fix blocked context accounting
5703206b5f51 drm/i915/guc: Fix outstanding G2H accounting
ee0ecb333df9 drm/i915/guc: Unwind context requests in reverse order
97ee783e2b00 drm/i915/guc: Don't drop ce->guc_active.lock when unwinding context
6f42cfc0eeb4 drm/i915/guc: Workaround reset G2H is received after schedule done 
G2H
-:7: WARNING:TYPO_SPELLING: 'cancelation' may be misspelled - perhaps 
'cancellation'?
#7: 
If the context is reset as a result of the request cancelation the
   ^^^

-:10: WARNING:TYPO_SPELLING: 'cancelation' may be misspelled - perhaps 
'cancellation'?
#10: 
waiting request cancelation code which resubmits the context. This races
^^^

-:12: WARNING:TYPO_SPELLING: 'cancelation' may be misspelled - perhaps 
'cancellation'?
#12: 
in this case it really should be a NOP as request cancelation code owns
  ^^^

-:58: WARNING:BRACES: braces {} are not necessary for any arm of this statement
#58: FILE: drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c:850:
+   if (likely(!context_pending_enable(ce))) {
[...]
+   } else {
[...]

total: 0 errors, 4 warnings, 0 checks, 73 lines checked
806479ce9909 drm/i915/execlists: Do not propagate errors to dependent fences
862260cf6795 drm/i915/selftests: Add a cancel request selftest that triggers a 
reset
ba1d218343a3 drm/i915/guc: Don't enable scheduling on a banned context, guc_id 
invalid, not registered
221846309949 drm/i915/selftests: Fix memory corruption in live_lrc_isolation
241da61be83d drm/i915/selftests: Add initial GuC selftest for scrubbing lost G2H
-:104: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does 
MAINTAINERS need updating?
#104: 
new file mode 100644

total: 0 errors, 1 warnings, 0 checks, 232 lines checked
957737f84734 drm/i915/guc: Take context ref when cancelling request
738284a940e2 drm/i915/guc: Don't touch guc_state.sched_state without a lock
48c820953477 drm/i915/guc: Reset LRC descriptor if register returns -ENODEV
e21f028c082e drm/i915: Allocate error capture in atomic context
31fbd295c9f5 drm/i915/guc: Flush G2H work queue during reset
dcd9725de04f drm/i915/guc: Release submit fence from an IRQ
c8b83840007d drm/i915/guc: Move guc_blocked fence to struct guc_state
fa32fd7346d0 drm/i915/guc: Rework and simplify locking
69e885b61035 drm/i915/guc: Proper xarray usage for contexts_lookup
492de6bdaacb drm/i915/guc: Drop pin count check trick between sched_disable and 
re-pin
bb901831764c drm/i915/guc: Move GuC priority fields in context under guc_active
c4c34f7bb22c drm/i915/guc: Add GuC kernel doc

Re: [Intel-gfx] [PATCH 22/22] drm/i915/guc: Add GuC kernel doc

On Mon, Aug 16, 2021 at 06:51:39AM -0700, Matthew Brost wrote:
> Add GuC kernel doc for all structures added thus far for GuC submission
> and update the main GuC submission section with the new interface
> details.
> 
> Signed-off-by: Matthew Brost 

There's quite a bit more, e.g. intel_guc_ct, which has its own world of
locking design that also doesn't feel too consistent.

> ---
>  drivers/gpu/drm/i915/gt/intel_context_types.h |  42 +---
>  drivers/gpu/drm/i915/gt/uc/intel_guc.h|  19 +++-
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 101 ++
>  drivers/gpu/drm/i915/i915_request.h   |  18 ++--
>  4 files changed, 131 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
> b/drivers/gpu/drm/i915/gt/intel_context_types.h
> index f6989e6807f7..75d609a1bc33 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> @@ -156,44 +156,56 @@ struct intel_context {
>   u8 wa_bb_page; /* if set, page num reserved for context workarounds */
>  
>   struct {
> - /** lock: protects everything in guc_state */
> + /** @lock: protects everything in guc_state */
>   spinlock_t lock;
>   /**
> -  * sched_state: scheduling state of this context using GuC
> +  * @sched_state: scheduling state of this context using GuC
>* submission
>*/
>   u32 sched_state;
>   /*
> -  * fences: maintains of list of requests that have a submit
> -  * fence related to GuC submission
> +  * @fences: maintains a list of requests are currently being
> +  * fenced until a GuC operation completes
>*/
>   struct list_head fences;
> - /* GuC context blocked fence */
> + /**
> +  * @blocked_fence: fence used to signal when the blocking of a
> +  * contexts submissions is complete.
> +  */
>   struct i915_sw_fence blocked_fence;
> - /* GuC committed requests */
> + /** @number_committed_requests: number of committed requests */
>   int number_committed_requests;
>   } guc_state;
>  
>   struct {
> - /** lock: protects everything in guc_active */
> + /** @lock: protects everything in guc_active */
>   spinlock_t lock;

Why do we have two spinlocks to protect GuC context state?

I do understand the need for a spinlock (at least for now) because of how
i915-scheduler runs in tasklet context. But beyond that we really
shouldn't need more than two locks to protect context state. You still
have an entire pile here, plus some atomics, plus more.

And this is on a single context, where concurrently submitting stuff
really isn't a thing. I'd expect actual benchmarking would show a perf
hit, since all these locks and atomics aren't free. This is at least the
case with execbuf and the various i915_vma locks we currently have.

What I expect intel_context locking to be is roughly:

- One lock to protect all intel_context state. This probably should be a
  dma_resv_lock for a few reasons, not least so we can pin state objects
  underneath that lock.

- A separate lock if there's anything you need to coordinate with the
  backend scheduler while that's running, to avoid dma_fence inversions.
  Right now this separate lock might need to be a spinlock because our
  scheduler runs in tasklets, and that might mean we need both a mutex and
  a spinlock here.

Anything that goes beyond that is premature optimization and kills us
code-complexity-wise. I'd be _extremely_ surprised if an IA core cannot keep up
with GuC, and therefore anything that goes beyond "one lock per object",
plus/minus execution context issues like the above tasklet issue, is
likely just going to slow everything down.
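
As a rough userspace illustration of the "one lock per object" rule above
(this is not i915 code: struct ctx, its fields and the helper names are made
up for the sketch, and a pthread mutex stands in for whatever lock the driver
would actually use), all state of the object sits behind a single lock and
related fields are only ever updated together in one critical section:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for a context object; not the i915 struct. */
struct ctx {
	pthread_mutex_t lock;	/* protects every field below */
	unsigned int sched_state;
	int committed_requests;
	int prio;
};

static void ctx_init(struct ctx *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->sched_state = 0;
	c->committed_requests = 0;
	c->prio = 0;
}

/* All related fields change inside one critical section. */
static void ctx_commit_request(struct ctx *c, int prio)
{
	assert(prio >= 0);

	pthread_mutex_lock(&c->lock);
	c->committed_requests++;
	if (prio > c->prio)
		c->prio = prio;
	c->sched_state |= 1u;	/* illustrative "has work" bit */
	pthread_mutex_unlock(&c->lock);
}

/* Readers take the same lock, so they never see a half-done update. */
static int ctx_committed(struct ctx *c)
{
	int n;

	pthread_mutex_lock(&c->lock);
	n = c->committed_requests;
	pthread_mutex_unlock(&c->lock);
	return n;
}
```

With this shape there are no atomics and no second lock to reason about; a
second lock would only be introduced for a concrete execution-context reason
such as the tasklet issue mentioned above.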

> - /** requests: active requests on this context */
> + /** @requests: list of active requests on this context */
>   struct list_head requests;
> - /*
> -  * GuC priority management
> -  */
> + /** @guc_prio: the contexts current guc priority */
>   u8 guc_prio;
> + /**
> +  * @guc_prio_count: a counter of the number requests inflight in
> +  * each priority bucket
> +  */
>   u32 guc_prio_count[GUC_CLIENT_PRIORITY_NUM];
>   } guc_active;
>  
> - /* GuC LRC descriptor ID */
> + /**
> +  * @guc_id: unique handle which is used to communicate information with
> +  * the GuC about this context, protected by guc->contexts_lock
> +  */
>   u16 guc_id;
>  
> - /* GuC LRC descriptor reference count */
> + /**
> +  * @guc_id_ref: the number of references to the guc_id, when
> +  * transitioning in

Re: [Intel-gfx] [PATCH 19/22] drm/i915/guc: Proper xarray usage for contexts_lookup

On Mon, Aug 16, 2021 at 06:51:36AM -0700, Matthew Brost wrote:
> Lock the xarray and take ref to the context if needed.
> 
> v2:
>  (Checkpatch)
>   - Add new line after declaration
> 
> Signed-off-by: Matthew Brost 
> ---
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 84 ---
>  1 file changed, 73 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index ba19b99173fc..2ecb2f002bed 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -599,8 +599,18 @@ static void scrub_guc_desc_for_outstanding_g2h(struct 
> intel_guc *guc)
>   unsigned long index, flags;
>   bool pending_disable, pending_enable, deregister, destroyed, banned;
>  
> + xa_lock_irqsave(&guc->context_lookup, flags);
>   xa_for_each(&guc->context_lookup, index, ce) {
> - spin_lock_irqsave(&ce->guc_state.lock, flags);
> + /*
> +  * Corner case where the ref count on the object is zero but and
> +  * deregister G2H was lost. In this case we don't touch the ref
> +  * count and finish the destroy of the context.
> +  */
> + bool do_put = kref_get_unless_zero(&ce->ref);

This looks really scary, because in another loop below you have an
unconditional refcount increase. This means sometimes guc->context_lookup
xarray guarantees we hold a full reference on the context, sometimes we
don't. So we're right back in "protect the code" O(N^2) review complexity
instead of invariant rules about the datastructure, which is linear.

Essentially anytime you feel like you have to add a comment to explain
what's going on about concurrent stuff you're racing with, you're
protecting code, not data.

Since GuC can't do a whole lot without the guc_id registered and all that,
I kinda expected you'd always have a full reference here. If there's
intermediate stages (e.g. around unregister) where this is currently not
always the case, then those should make sure a full reference is held.

Another option would be to treat ->context_lookup as a weak reference that
we lazily clean up when the context is finalized. That works too, but
probably not with a spinlock (since you most likely have to wait for all
pending GuC transactions to complete).

Either way I think standard process is needed here for locking design,
i.e.
1. come up with the right invariants ("we always have a full reference
   when a context is on the guc->context_lookup xarray")
2. come up with the locks. From the GuC side the xa_lock is maybe good
   enough, but from the context side this doesn't protect against a
   re-registering racing against a deregistering. So it probably needs more
   rules on top, and then you have a nice lock inversion in a few places
   like here.
3. document it and roll it out.

The other thing is that this is a very tricky iterator, and there's a few
copies of it. That is, if this is the right solution. As-is this should be
abstracted away into guc_context_iter_begin/next/end() helpers, e.g. like
we have for drm_connector_list_iter_begin/next/end as an example.
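
A minimal userspace sketch of what such begin/next/end helpers could look
like, assuming the invariant from point 1 (every entry in the lookup table
owns a full reference). Everything here is hypothetical: struct table, struct
obj and the table_iter_*() names are invented for illustration, a fixed array
plus a pthread mutex stands in for the xarray and its xa_lock, and the
refcount is a plain int protected by that mutex rather than a kref:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TABLE_SIZE 8

struct obj {
	int refcount;	/* protected by table.lock in this sketch */
	int id;
};

struct table {
	pthread_mutex_t lock;
	struct obj *slots[TABLE_SIZE];	/* each non-NULL slot owns a reference */
};

struct table_iter {
	struct table *t;
	size_t next;		/* next slot index to look at */
	struct obj *cur;	/* reference currently held by the iterator */
};

static void table_init(struct table *t)
{
	pthread_mutex_init(&t->lock, NULL);
	for (size_t i = 0; i < TABLE_SIZE; i++)
		t->slots[i] = NULL;
}

static void table_iter_begin(struct table *t, struct table_iter *it)
{
	it->t = t;
	it->next = 0;
	it->cur = NULL;
}

/* Drops the previous element, returns the next one with a reference held. */
static struct obj *table_iter_next(struct table_iter *it)
{
	struct obj *found = NULL;

	pthread_mutex_lock(&it->t->lock);
	if (it->cur)
		it->cur->refcount--;
	while (it->next < TABLE_SIZE) {
		struct obj *o = it->t->slots[it->next++];

		if (o) {
			/* Invariant: the table itself holds a reference. */
			assert(o->refcount > 0);
			o->refcount++;
			found = o;
			break;
		}
	}
	it->cur = found;
	pthread_mutex_unlock(&it->t->lock);

	return found;	/* caller may sleep or take other locks here */
}

/* Releases whatever the iterator still holds (e.g. after an early break). */
static void table_iter_end(struct table_iter *it)
{
	pthread_mutex_lock(&it->t->lock);
	if (it->cur) {
		it->cur->refcount--;
		it->cur = NULL;
	}
	pthread_mutex_unlock(&it->t->lock);
}
```

Because the table always holds a reference, the iterator never needs a
"get unless zero" special case, and the lock/get/unlock/put dance lives in
one place instead of in every loop. Freeing an object when its count drops to
zero is left out to keep the sketch short.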

Cheers, Daniel

> +
> + xa_unlock(&guc->context_lookup);
> +
> + spin_lock(&ce->guc_state.lock);
>  
>   /*
>* Once we are at this point submission_disabled() is guaranteed
> @@ -616,7 +626,9 @@ static void scrub_guc_desc_for_outstanding_g2h(struct 
> intel_guc *guc)
>   banned = context_banned(ce);
>   init_sched_state(ce);
>  
> - spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> + spin_unlock(&ce->guc_state.lock);
> +
> + GEM_BUG_ON(!do_put && !destroyed);
>  
>   if (pending_enable || destroyed || deregister) {
>   atomic_dec(&guc->outstanding_submission_g2h);
> @@ -645,7 +657,12 @@ static void scrub_guc_desc_for_outstanding_g2h(struct 
> intel_guc *guc)
>  
>   intel_context_put(ce);
>   }
> +
> + if (do_put)
> + intel_context_put(ce);
> + xa_lock(&guc->context_lookup);
>   }
> + xa_unlock_irqrestore(&guc->context_lookup, flags);
>  }
>  
>  static inline bool
> @@ -866,16 +883,26 @@ void intel_guc_submission_reset(struct intel_guc *guc, 
> bool stalled)
>  {
>   struct intel_context *ce;
>   unsigned long index;
> + unsigned long flags;
>  
>   if (unlikely(!guc_submission_initialized(guc))) {
>   /* Reset called during driver load? GuC not yet initialised! */
>   return;
>   }
>  
> - xa_for_each(&guc->context_lookup, index, ce)
> + xa_lock_irqsave(&guc->context_lookup, flags);
> + xa_for_each(&guc->context_lookup, index, ce) {
> + intel_context_get(ce);
> + xa_unlock(&guc->context_lookup);
> +
>   if (intel_context_is_pinned(ce))
>

Re: [Intel-gfx] [PATCH 18/22] drm/i915/guc: Rework and simplify locking

On Mon, Aug 16, 2021 at 06:51:35AM -0700, Matthew Brost wrote:
> Rework and simplify the locking with GuC submission. Drop
> sched_state_no_lock and move all fields under the guc_state.sched_state
> and protect all these fields with guc_state.lock . This requires
> changing the locking hierarchy from guc_state.lock -> sched_engine.lock
> to sched_engine.lock -> guc_state.lock.
> 
> Signed-off-by: Matthew Brost 

Yeah this is definitely going in the right direction. Especially
sprinkling lockdep_assert_held around.

One comment below.

> ---
>  drivers/gpu/drm/i915/gt/intel_context_types.h |   5 +-
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 186 --
>  drivers/gpu/drm/i915/i915_trace.h |   6 +-
>  3 files changed, 89 insertions(+), 108 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
> b/drivers/gpu/drm/i915/gt/intel_context_types.h
> index c06171ee8792..d5d643b04d54 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> @@ -161,7 +161,7 @@ struct intel_context {
>* sched_state: scheduling state of this context using GuC
>* submission
>*/
> - u16 sched_state;
> + u32 sched_state;
>   /*
>* fences: maintains of list of requests that have a submit
>* fence related to GuC submission
> @@ -178,9 +178,6 @@ struct intel_context {
>   struct list_head requests;
>   } guc_active;
>  
> - /* GuC scheduling state flags that do not require a lock. */
> - atomic_t guc_sched_state_no_lock;
> -
>   /* GuC LRC descriptor ID */
>   u16 guc_id;
>  
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 7aa16371908a..ba19b99173fc 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -72,86 +72,23 @@ guc_create_virtual(struct intel_engine_cs **siblings, 
> unsigned int count);
>  
>  #define GUC_REQUEST_SIZE 64 /* bytes */
>  
> -/*
> - * Below is a set of functions which control the GuC scheduling state which 
> do
> - * not require a lock as all state transitions are mutually exclusive. i.e. 
> It
> - * is not possible for the context pinning code and submission, for the same
> - * context, to be executing simultaneously. We still need an atomic as it is
> - * possible for some of the bits to changing at the same time though.
> - */
> -#define SCHED_STATE_NO_LOCK_ENABLED  BIT(0)
> -#define SCHED_STATE_NO_LOCK_PENDING_ENABLE   BIT(1)
> -#define SCHED_STATE_NO_LOCK_REGISTERED   BIT(2)
> -static inline bool context_enabled(struct intel_context *ce)
> -{
> - return (atomic_read(&ce->guc_sched_state_no_lock) &
> - SCHED_STATE_NO_LOCK_ENABLED);
> -}
> -
> -static inline void set_context_enabled(struct intel_context *ce)
> -{
> - atomic_or(SCHED_STATE_NO_LOCK_ENABLED, &ce->guc_sched_state_no_lock);
> -}
> -
> -static inline void clr_context_enabled(struct intel_context *ce)
> -{
> - atomic_and((u32)~SCHED_STATE_NO_LOCK_ENABLED,
> -   &ce->guc_sched_state_no_lock);
> -}
> -
> -static inline bool context_pending_enable(struct intel_context *ce)
> -{
> - return (atomic_read(&ce->guc_sched_state_no_lock) &
> - SCHED_STATE_NO_LOCK_PENDING_ENABLE);
> -}
> -
> -static inline void set_context_pending_enable(struct intel_context *ce)
> -{
> - atomic_or(SCHED_STATE_NO_LOCK_PENDING_ENABLE,
> -   &ce->guc_sched_state_no_lock);
> -}
> -
> -static inline void clr_context_pending_enable(struct intel_context *ce)
> -{
> - atomic_and((u32)~SCHED_STATE_NO_LOCK_PENDING_ENABLE,
> -   &ce->guc_sched_state_no_lock);
> -}
> -
> -static inline bool context_registered(struct intel_context *ce)
> -{
> - return (atomic_read(&ce->guc_sched_state_no_lock) &
> - SCHED_STATE_NO_LOCK_REGISTERED);
> -}
> -
> -static inline void set_context_registered(struct intel_context *ce)
> -{
> - atomic_or(SCHED_STATE_NO_LOCK_REGISTERED,
> -   &ce->guc_sched_state_no_lock);
> -}
> -
> -static inline void clr_context_registered(struct intel_context *ce)
> -{
> - atomic_and((u32)~SCHED_STATE_NO_LOCK_REGISTERED,
> -   &ce->guc_sched_state_no_lock);
> -}
> -
>  /*
>   * Below is a set of functions which control the GuC scheduling state which
> - * require a lock, aside from the special case where the functions are called
> - * from guc_lrc_desc_pin(). In that case it isn't possible for any other code
> - * path to be executing on the context.
> + * require a lock.
>   */
>  #define SCHED_STATE_WAIT_FOR_DEREGISTER_TO_REGISTER  BIT(0)
>  #define SCHED_STATE_DESTROYEDBIT(1)
>  #define SCHED_STATE_PENDING_DISABLE  BIT(2)
>  #define SCHED_STATE_BANNED   BIT(3)

Re: [Intel-gfx] [PATCH 17/22] drm/i915/guc: Move guc_blocked fence to struct guc_state

On Mon, Aug 16, 2021 at 06:51:34AM -0700, Matthew Brost wrote:
> Move guc_blocked fence to struct guc_state as the lock which protects
> the fence lives there.
> 
> s/ce->guc_blocked/ce->guc_state.blocked_fence/g
> 
> Signed-off-by: Matthew Brost 

General comment: at the latest when you combine your count state with a wait
queue, you're very far into "reinventing a mutex/semaphore, badly" land.

I think we really need to look into why we can't just protect this all
with a mutex and make sure the awkward transition states are never visible
to anyone else.
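
A rough userspace sketch of that idea, under stated assumptions: struct ctx,
backend_drain() and the ctx_*() helpers are invented for illustration and are
not i915 functions, and a pthread mutex stands in for a kernel mutex. The
whole "disable and wait for the backend" transition runs inside one critical
section, so other callers only ever observe "enabled" or "disabled", never a
pending state:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical context; field names are illustrative only. */
struct ctx {
	pthread_mutex_t lock;	/* protects enabled and inflight */
	bool enabled;
	int inflight;
};

static void ctx_init(struct ctx *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->enabled = true;
	c->inflight = 0;
}

/* Stand-in for "ask the backend to stop and wait for its ack". */
static void backend_drain(struct ctx *c)
{
	c->inflight = 0;
}

static void ctx_block(struct ctx *c)
{
	pthread_mutex_lock(&c->lock);
	if (c->enabled) {
		backend_drain(c);	/* may sleep; fine under a mutex */
		c->enabled = false;
	}
	pthread_mutex_unlock(&c->lock);
}

static void ctx_unblock(struct ctx *c)
{
	pthread_mutex_lock(&c->lock);
	c->enabled = true;
	pthread_mutex_unlock(&c->lock);
}

/* Submission waits on the same lock instead of checking transition bits. */
static bool ctx_submit(struct ctx *c)
{
	bool ok;

	pthread_mutex_lock(&c->lock);
	assert(c->inflight >= 0);
	ok = c->enabled;
	if (ok)
		c->inflight++;
	pthread_mutex_unlock(&c->lock);
	return ok;
}
```

Whether this is workable in the driver depends on whether all the paths
involved are allowed to sleep, which is the open question raised above.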
-Daniel

> ---
>  drivers/gpu/drm/i915/gt/intel_context.c|  5 +++--
>  drivers/gpu/drm/i915/gt/intel_context_types.h  |  5 ++---
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c  | 18 +-
>  3 files changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
> b/drivers/gpu/drm/i915/gt/intel_context.c
> index 745e84c72c90..0e48939ec85f 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context.c
> +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> @@ -405,8 +405,9 @@ intel_context_init(struct intel_context *ce, struct 
> intel_engine_cs *engine)
>* Initialize fence to be complete as this is expected to be complete
>* unless there is a pending schedule disable outstanding.
>*/
> - i915_sw_fence_init(&ce->guc_blocked, sw_fence_dummy_notify);
> - i915_sw_fence_commit(&ce->guc_blocked);
> + i915_sw_fence_init(&ce->guc_state.blocked_fence,
> +sw_fence_dummy_notify);
> + i915_sw_fence_commit(&ce->guc_state.blocked_fence);
>  
>   i915_active_init(&ce->active,
>__intel_context_active, __intel_context_retire, 0);
> diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
> b/drivers/gpu/drm/i915/gt/intel_context_types.h
> index 3a73f3117873..c06171ee8792 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> @@ -167,6 +167,8 @@ struct intel_context {
>* fence related to GuC submission
>*/
>   struct list_head fences;
> + /* GuC context blocked fence */
> + struct i915_sw_fence blocked_fence;
>   } guc_state;
>  
>   struct {
> @@ -190,9 +192,6 @@ struct intel_context {
>*/
>   struct list_head guc_id_link;
>  
> - /* GuC context blocked fence */
> - struct i915_sw_fence guc_blocked;
> -
>   /*
>* GuC priority management
>*/
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 9ae4633aa7cb..7aa16371908a 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -1482,24 +1482,24 @@ static void guc_blocked_fence_complete(struct 
> intel_context *ce)
>  {
>   lockdep_assert_held(&ce->guc_state.lock);
>  
> - if (!i915_sw_fence_done(&ce->guc_blocked))
> - i915_sw_fence_complete(&ce->guc_blocked);
> + if (!i915_sw_fence_done(&ce->guc_state.blocked_fence))
> + i915_sw_fence_complete(&ce->guc_state.blocked_fence);
>  }
>  
>  static void guc_blocked_fence_reinit(struct intel_context *ce)
>  {
>   lockdep_assert_held(&ce->guc_state.lock);
> - GEM_BUG_ON(!i915_sw_fence_done(&ce->guc_blocked));
> + GEM_BUG_ON(!i915_sw_fence_done(&ce->guc_state.blocked_fence));
>  
>   /*
>* This fence is always complete unless a pending schedule disable is
>* outstanding. We arm the fence here and complete it when we receive
>* the pending schedule disable complete message.
>*/
> - i915_sw_fence_fini(&ce->guc_blocked);
> - i915_sw_fence_reinit(&ce->guc_blocked);
> - i915_sw_fence_await(&ce->guc_blocked);
> - i915_sw_fence_commit(&ce->guc_blocked);
> + i915_sw_fence_fini(&ce->guc_state.blocked_fence);
> + i915_sw_fence_reinit(&ce->guc_state.blocked_fence);
> + i915_sw_fence_await(&ce->guc_state.blocked_fence);
> + i915_sw_fence_commit(&ce->guc_state.blocked_fence);
>  }
>  
>  static u16 prep_context_pending_disable(struct intel_context *ce)
> @@ -1539,7 +1539,7 @@ static struct i915_sw_fence *guc_context_block(struct 
> intel_context *ce)
>   if (enabled)
>   clr_context_enabled(ce);
>   spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> - return &ce->guc_blocked;
> + return &ce->guc_state.blocked_fence;
>   }
>  
>   /*
> @@ -1555,7 +1555,7 @@ static struct i915_sw_fence *guc_context_block(struct 
> intel_context *ce)
>   with_intel_runtime_pm(runtime_pm, wakeref)
>   __guc_context_sched_disable(guc, ce, guc_id);
>  
> - return &ce->guc_blocked;
> + return &ce->guc_state.blocked_fence;
>  }
>  
>  static void guc_context_unblock(struct intel_context *ce)
> -- 
> 2.32.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [PATCH 16/22] drm/i915/guc: Release submit fence from an IRQ

On Mon, Aug 16, 2021 at 06:51:33AM -0700, Matthew Brost wrote:
> A subsequent patch will flip the locking hierarchy from
> ce->guc_state.lock -> sched_engine->lock to sched_engine->lock ->
> ce->guc_state.lock. As such we need to release the submit fence for a
> request from an IRQ to break a lock inversion - i.e. the fence must be
> release went holding ce->guc_state.lock and the releasing of the can
> acquire sched_engine->lock.
> 
> Signed-off-by: Matthew Brost 

Title should say "irq work", otherwise it reads a bit strange. Also these
kinds of lock nestings would be good to document in the kerneldoc too (maybe
as you go even).
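
For readers following the nesting argument, here is a small userspace sketch
of the pattern, not of the patch itself: struct engine, struct ctx and all
helper names are hypothetical, pthread mutexes stand in for the spinlocks,
and "deferred" simply means "after the inner lock is dropped" (in the patch
that deferral is done with irq_work_queue()). The lock order is engine.lock
-> ctx.state_lock, so completion work that needs engine.lock must not run
while state_lock is held:

```c
#include <assert.h>
#include <pthread.h>

struct engine {
	pthread_mutex_t lock;	/* outer lock */
	int submitted;
};

struct ctx {
	pthread_mutex_t state_lock;	/* inner lock, nests inside engine.lock */
	struct engine *engine;
	int fenced;	/* requests waiting for the context to be ready */
};

static void engine_init(struct engine *e)
{
	pthread_mutex_init(&e->lock, NULL);
	e->submitted = 0;
}

static void ctx_init(struct ctx *c, struct engine *e)
{
	pthread_mutex_init(&c->state_lock, NULL);
	c->engine = e;
	c->fenced = 0;
}

/* Normal path shows the documented order: engine.lock -> state_lock. */
static void engine_fence_request(struct ctx *c)
{
	pthread_mutex_lock(&c->engine->lock);
	pthread_mutex_lock(&c->state_lock);
	c->fenced++;
	pthread_mutex_unlock(&c->state_lock);
	pthread_mutex_unlock(&c->engine->lock);
}

/* Completion takes engine.lock, so it must never run under state_lock. */
static void complete_submit(struct engine *e)
{
	pthread_mutex_lock(&e->lock);
	e->submitted++;
	pthread_mutex_unlock(&e->lock);
}

static void ctx_signal_fences(struct ctx *c)
{
	int todo;

	pthread_mutex_lock(&c->state_lock);
	todo = c->fenced;	/* detach the pending work while locked */
	c->fenced = 0;
	pthread_mutex_unlock(&c->state_lock);

	assert(todo >= 0);
	while (todo-- > 0)	/* deferred part: state_lock is not held */
		complete_submit(c->engine);
}
```

Calling complete_submit() from inside the state_lock section would take the
two locks in the opposite order to engine_fence_request(), which is the
inversion the commit message describes.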
-Daniel

> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 15 ++-
>  drivers/gpu/drm/i915/i915_request.h   |  5 +
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 8c560ed14976..9ae4633aa7cb 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -2017,6 +2017,14 @@ static const struct intel_context_ops guc_context_ops 
> = {
>   .create_virtual = guc_create_virtual,
>  };
>  
> +static void submit_work_cb(struct irq_work *wrk)
> +{
> + struct i915_request *rq = container_of(wrk, typeof(*rq), submit_work);
> +
> + might_lock(&rq->engine->sched_engine->lock);
> + i915_sw_fence_complete(&rq->submit);
> +}
> +
>  static void __guc_signal_context_fence(struct intel_context *ce)
>  {
>   struct i915_request *rq;
> @@ -2026,8 +2034,12 @@ static void __guc_signal_context_fence(struct 
> intel_context *ce)
>   if (!list_empty(&ce->guc_state.fences))
>   trace_intel_context_fence_release(ce);
>  
> + /*
> +  * Use an IRQ to ensure locking order of sched_engine->lock ->
> +  * ce->guc_state.lock is preserved.
> +  */
>   list_for_each_entry(rq, &ce->guc_state.fences, guc_fence_link)
> - i915_sw_fence_complete(&rq->submit);
> + irq_work_queue(&rq->submit_work);
>  
>   INIT_LIST_HEAD(&ce->guc_state.fences);
>  }
> @@ -2137,6 +2149,7 @@ static int guc_request_alloc(struct i915_request *rq)
>   spin_lock_irqsave(&ce->guc_state.lock, flags);
>   if (context_wait_for_deregister_to_register(ce) ||
>   context_pending_disable(ce)) {
> + init_irq_work(&rq->submit_work, submit_work_cb);
>   i915_sw_fence_await(&rq->submit);
>  
>   list_add_tail(&rq->guc_fence_link, &ce->guc_state.fences);
> diff --git a/drivers/gpu/drm/i915/i915_request.h 
> b/drivers/gpu/drm/i915/i915_request.h
> index 1bc1349ba3c2..d818cfbfc41d 100644
> --- a/drivers/gpu/drm/i915/i915_request.h
> +++ b/drivers/gpu/drm/i915/i915_request.h
> @@ -218,6 +218,11 @@ struct i915_request {
>   };
>   struct llist_head execute_cb;
>   struct i915_sw_fence semaphore;
> + /**
> +  * @submit_work: complete submit fence from an IRQ if needed for
> +  * locking hierarchy reasons.
> +  */
> + struct irq_work submit_work;
>  
>   /*
>* A list of everyone we wait upon, and everyone who waits upon us.
> -- 
> 2.32.0
> 


Re: [Intel-gfx] [PATCH 15/22] drm/i915/guc: Flush G2H work queue during reset

On Mon, Aug 16, 2021 at 06:51:32AM -0700, Matthew Brost wrote:
> It isn't safe to scrub for missing G2H or continue with the reset until
> all G2H processing is complete. Flush the G2H work queue during reset to
> ensure it is done running.
> 
> Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC 
> interface")
> Signed-off-by: Matthew Brost 
> ---
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c  | 18 ++
>  1 file changed, 2 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 3a01743e09ea..8c560ed14976 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -707,8 +707,6 @@ static void guc_flush_submissions(struct intel_guc *guc)
>  
>  void intel_guc_submission_reset_prepare(struct intel_guc *guc)
>  {
> - int i;
> -
>   if (unlikely(!guc_submission_initialized(guc))) {
>   /* Reset called during driver load? GuC not yet initialised! */
>   return;
> @@ -724,20 +722,8 @@ void intel_guc_submission_reset_prepare(struct intel_guc 
> *guc)
>  
>   guc_flush_submissions(guc);
>  
> - /*
> -  * Handle any outstanding G2Hs before reset. Call IRQ handler directly
> -  * each pass as interrupt have been disabled. We always scrub for
> -  * outstanding G2H as it is possible for outstanding_submission_g2h to
> -  * be incremented after the context state update.
> -  */
> - for (i = 0; i < 4 && atomic_read(&guc->outstanding_submission_g2h); 
> ++i) {
> - intel_guc_to_host_event_handler(guc);
> -#define wait_for_reset(guc, wait_var) \
> - intel_guc_wait_for_pending_msg(guc, wait_var, false, (HZ / 20))
> - do {
> - wait_for_reset(guc, &guc->outstanding_submission_g2h);
> - } while (!list_empty(&guc->ct.requests.incoming));
> - }
> + flush_work(&guc->ct.requests.worker);

Same thing about flush_work as in an earlier patch.
-Daniel

> +
>   scrub_guc_desc_for_outstanding_g2h(guc);
>  }
>  
> -- 
> 2.32.0
> 


Re: [Intel-gfx] [PATCH 14/22] drm/i915: Allocate error capture in atomic context

On Mon, Aug 16, 2021 at 06:51:31AM -0700, Matthew Brost wrote:
> Error captures can now be done in a work queue processing G2H messages.
> These messages need to be completely done being processed in the reset
> path, to avoid races in the missing G2H cleanup, which create a
> dependency on memory allocations and dma fences (i915_requests).
> Requests depend on resets, thus now we have a circular dependency. To
> work around this, allocate the error capture in an atomic context.
> 
> Fixes: dc0dad365c5e ("Fix for error capture after full GPU reset with GuC")
> Fixes: 573ba126aef3 ("Capture error state on context reset")
> Signed-off-by: Matthew Brost 
> ---
>  drivers/gpu/drm/i915/i915_gpu_error.c | 37 +--
>  1 file changed, 18 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
> b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 0f08bcfbe964..453376aa6d9f 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -49,7 +49,6 @@
>  #include "i915_memcpy.h"
>  #include "i915_scatterlist.h"
>  
> -#define ALLOW_FAIL (GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
>  #define ATOMIC_MAYFAIL (GFP_ATOMIC | __GFP_NOWARN)

This one doesn't make much sense. GFP_ATOMIC essentially means we're
high-priority and failure would be a pretty bad day. Meanwhile
__GFP_NOWARN means we can totally cope with failure, pls don't holler.

GFP_NOWAIT | __GFP_NOWARN would be the more consistent one here I think.

See gfp.h for all the docs on this.

Separate patch ofc. This one is definitely the right direction, since
GFP_KERNEL from the reset worker is not a good idea.
-Daniel

>  
>  static void __sg_set_buf(struct scatterlist *sg,
> @@ -79,7 +78,7 @@ static bool __i915_error_grow(struct 
> drm_i915_error_state_buf *e, size_t len)
>   if (e->cur == e->end) {
>   struct scatterlist *sgl;
>  
> - sgl = (typeof(sgl))__get_free_page(ALLOW_FAIL);
> + sgl = (typeof(sgl))__get_free_page(ATOMIC_MAYFAIL);
>   if (!sgl) {
>   e->err = -ENOMEM;
>   return false;
> @@ -99,10 +98,10 @@ static bool __i915_error_grow(struct 
> drm_i915_error_state_buf *e, size_t len)
>   }
>  
>   e->size = ALIGN(len + 1, SZ_64K);
> - e->buf = kmalloc(e->size, ALLOW_FAIL);
> + e->buf = kmalloc(e->size, ATOMIC_MAYFAIL);
>   if (!e->buf) {
>   e->size = PAGE_ALIGN(len + 1);
> - e->buf = kmalloc(e->size, GFP_KERNEL);
> + e->buf = kmalloc(e->size, ATOMIC_MAYFAIL);
>   }
>   if (!e->buf) {
>   e->err = -ENOMEM;
> @@ -243,12 +242,12 @@ static bool compress_init(struct i915_vma_compress *c)
>  {
>   struct z_stream_s *zstream = &c->zstream;
>  
> - if (pool_init(&c->pool, ALLOW_FAIL))
> + if (pool_init(&c->pool, ATOMIC_MAYFAIL))
>   return false;
>  
>   zstream->workspace =
>   kmalloc(zlib_deflate_workspacesize(MAX_WBITS, MAX_MEM_LEVEL),
> - ALLOW_FAIL);
> + ATOMIC_MAYFAIL);
>   if (!zstream->workspace) {
>   pool_fini(&c->pool);
>   return false;
> @@ -256,7 +255,7 @@ static bool compress_init(struct i915_vma_compress *c)
>  
>   c->tmp = NULL;
>   if (i915_has_memcpy_from_wc())
> - c->tmp = pool_alloc(&c->pool, ALLOW_FAIL);
> + c->tmp = pool_alloc(&c->pool, ATOMIC_MAYFAIL);
>  
>   return true;
>  }
> @@ -280,7 +279,7 @@ static void *compress_next_page(struct i915_vma_compress 
> *c,
>   if (dst->page_count >= dst->num_pages)
>   return ERR_PTR(-ENOSPC);
>  
> - page = pool_alloc(&c->pool, ALLOW_FAIL);
> + page = pool_alloc(&c->pool, ATOMIC_MAYFAIL);
>   if (!page)
>   return ERR_PTR(-ENOMEM);
>  
> @@ -376,7 +375,7 @@ struct i915_vma_compress {
>  
>  static bool compress_init(struct i915_vma_compress *c)
>  {
> - return pool_init(&c->pool, ALLOW_FAIL) == 0;
> + return pool_init(&c->pool, ATOMIC_MAYFAIL) == 0;
>  }
>  
>  static bool compress_start(struct i915_vma_compress *c)
> @@ -391,7 +390,7 @@ static int compress_page(struct i915_vma_compress *c,
>  {
>   void *ptr;
>  
> - ptr = pool_alloc(&c->pool, ALLOW_FAIL);
> + ptr = pool_alloc(&c->pool, ATOMIC_MAYFAIL);
>   if (!ptr)
>   return -ENOMEM;
>  
> @@ -997,7 +996,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  
>   num_pages = min_t(u64, vma->size, vma->obj->base.size) >> PAGE_SHIFT;
>   num_pages = DIV_ROUND_UP(10 * num_pages, 8); /* worstcase zlib growth */
> - dst = kmalloc(sizeof(*dst) + num_pages * sizeof(u32 *), ALLOW_FAIL);
> + dst = kmalloc(sizeof(*dst) + num_pages * sizeof(u32 *), ATOMIC_MAYFAIL);
>   if (!dst)
>   return NULL;
>  
> @@ -1433,7 +1432,7 @@ capture_engine(struct intel_engine_cs *engine,
>   struct i915_request *rq = NULL;
>   unsigned long flags;
>  
> - ee =

Re: [Intel-gfx] [PATCH 08/22] drm/i915/guc: Don't enable scheduling on a banned context, guc_id invalid, not registered

On Tue, Aug 17, 2021 at 11:47:53AM +0200, Daniel Vetter wrote:
> On Mon, Aug 16, 2021 at 06:51:25AM -0700, Matthew Brost wrote:
> > When unblocking a context, do not enable scheduling if the context is
> > banned, guc_id invalid, or not registered.
> > 
> > Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation")
> > Signed-off-by: Matthew Brost 
> > Cc: 
> > ---
> >  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index c3b7bf7319dd..353899634fa8 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -1579,6 +1579,9 @@ static void guc_context_unblock(struct intel_context 
> > *ce)
> > spin_lock_irqsave(&ce->guc_state.lock, flags);
> >  
> > if (unlikely(submission_disabled(guc) ||
> > +intel_context_is_banned(ce) ||
> > +context_guc_id_invalid(ce) ||
> > +!lrc_desc_registered(guc, ce->guc_id) ||
> >  !intel_context_is_pinned(ce) ||
> >  context_pending_disable(ce) ||
> >  context_blocked(ce) > 1)) {
> 
> I think this entire if condition here is screaming that our intel_context
> state machinery for guc is way too complex, and on the wrong side of
> incomprehensible.
> 
> Also some of these check state outside of the context, and we don't seem
> to hold spinlocks for those, or anything else.
> 
> In general I have no idea which of these are defensive programming and
> cannot ever happen, and which actually can happen. There's for sure way
> too many races going on given that this is all context-local stuff.

Races here meaning that we seem to be dropping locks while the context is
in an inconsistent state, which then means that every other code path
touching contexts needs to check whether the context is in an inconsistent
state.

This is a bit of an example of protecting code vs. protecting datastructures.
Protecting code means having bits of intermediate/transitional state
leak outside of the locked section (like context_blocked), so that every
other piece of code must be aware of the transition and not screw
things up further when they race.

This means your review and validation effort scales O(N^2) with the amount
of code and features you have. Which doesn't work.

Datastructure or object oriented locking design goes differently:

1. You figure out what the invariants of your datastructure are. That
means what should hold after each state transition is finished. I have no
idea what the solution is for all of them here, but e.g. why is
context_blocked even visible to other threads? Usual approach is a) take
lock b) do whatever is necessary (we're talking about reset stuff here, so
performance really doesn't matter) c) unlock. I know that i915-gem is full
of these leaky counting things, but that's really not a good design.

2. Next up, for every piece of state you think about how it's protected with a
per-object lock. The fewer locks you have (but still per-object so it's
not becoming a mess for different reasons) the higher the chances that you
don't leak inconsistent state to other threads. This is a bit tricky when
multiple objects are involved, or if you have to split your locks for a
single object because some of it needs to be accessed from irq context
(like a tasklet).

3. Document your rules in kerneldoc, so that when new code gets added you
don't have to review everything for consistency against the rules. This
way you get overall O(N) effort for validation and review, because all you
have to do is check every function that changes state against the overall
contract, and not everything against everything else.

If you have a pile of if checks every time you grab a lock, your locking
design has too much state that leaks outside of the locked sections.
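
To illustrate the three steps above, here is a minimal userspace sketch,
assuming a made-up "struct ctx" with a single pthread mutex. None of these
names come from i915; the point is only that the transitional state lives
entirely inside the locked section, so readers need no extra checks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical context object; names are illustrative, not i915 code. */
struct ctx {
	pthread_mutex_t lock;	/* protects every field below */
	bool registered;
	bool enabled;		/* invariant: enabled implies registered */
	unsigned int resets;
};

static void ctx_init(struct ctx *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->registered = true;
	c->enabled = true;
	c->resets = 0;
}

/*
 * The whole reset transition happens under the lock. The intermediate
 * "disabled while being reset" state is never visible to other threads,
 * so no other path needs a "blocked" counter to detect it.
 */
static void ctx_reset(struct ctx *c)
{
	pthread_mutex_lock(&c->lock);
	c->enabled = false;		/* transitional, lock still held */
	c->resets++;
	c->enabled = c->registered;	/* invariant restored before unlock */
	pthread_mutex_unlock(&c->lock);
}

/* Readers only ever observe states that satisfy the invariant. */
static bool ctx_can_submit(struct ctx *c)
{
	bool ok;

	pthread_mutex_lock(&c->lock);
	assert(!c->enabled || c->registered);
	ok = c->enabled;
	pthread_mutex_unlock(&c->lock);
	return ok;
}
```

With this shape, reviewing a new function means checking it against the one
documented invariant, not against every other function that takes the lock.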
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [PATCH 08/22] drm/i915/guc: Don't enable scheduling on a banned context, guc_id invalid, not registered

On Mon, Aug 16, 2021 at 06:51:25AM -0700, Matthew Brost wrote:
> When unblocking a context, do not enable scheduling if the context is
> banned, guc_id invalid, or not registered.
> 
> Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation")
> Signed-off-by: Matthew Brost 
> Cc: 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index c3b7bf7319dd..353899634fa8 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -1579,6 +1579,9 @@ static void guc_context_unblock(struct intel_context 
> *ce)
>   spin_lock_irqsave(&ce->guc_state.lock, flags);
>  
>   if (unlikely(submission_disabled(guc) ||
> +  intel_context_is_banned(ce) ||
> +  context_guc_id_invalid(ce) ||
> +  !lrc_desc_registered(guc, ce->guc_id) ||
>  !intel_context_is_pinned(ce) ||
>  context_pending_disable(ce) ||
>  context_blocked(ce) > 1)) {

I think this entire if condition here is screaming that our intel_context
state machinery for guc is way too complex, and on the wrong side of
incomprehensible.

Also some of these check state outside of the context, and we don't seem
to hold spinlocks for those, or anything else.

In general I have no idea which of these are defensive programming and
cannot ever happen, and which actually can happen. There's for sure way
too many races going on given that this is all context-local stuff.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [PATCH 02/22] drm/i915/guc: Fix outstanding G2H accounting

On Mon, Aug 16, 2021 at 06:51:19AM -0700, Matthew Brost wrote:
> A small race that could result in incorrect accounting of the number
> of outstanding G2H. Basically prior to this patch we did not increment
> the number of outstanding G2H if we encountered a GT reset while sending
> a H2G. This was incorrect as the context state had already been updated
> to anticipate a G2H response thus the counter should be incremented.
> 
> Fixes: f4eb1f3fe946 ("drm/i915/guc: Ensure G2H response has space in buffer")
> Signed-off-by: Matthew Brost 
> Cc: 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 69faa39da178..b5d3972ae164 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -360,11 +360,13 @@ static int guc_submission_send_busy_loop(struct 
> intel_guc *guc,
>  {
>   int err;
>  
> - err = intel_guc_send_busy_loop(guc, action, len, g2h_len_dw, loop);
> -
> - if (!err && g2h_len_dw)
> + if (g2h_len_dw)
>   atomic_inc(&guc->outstanding_submission_g2h);
>  
> + err = intel_guc_send_busy_loop(guc, action, len, g2h_len_dw, loop);

I'm majorly confused by the _busy_loop naming scheme, especially here.
Like "why do we want to send a busy loop command to guc, this doesn't make
sense".

It seems like you're using _busy_loop as a suffix for "this is ok to be
called in atomic context". The linux kernel bikeshed for this is generally
_atomic() (or _in_atomic() or something like that).  Would be good to
rename to make this slightly less confusing.
-Daniel

> + if (err == -EBUSY && g2h_len_dw)
> + atomic_dec(&guc->outstanding_submission_g2h);
> +
>   return err;
>  }
>  
> -- 
> 2.32.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [PATCH 05/22] drm/i915/guc: Workaround reset G2H is received after schedule done G2H

On Mon, Aug 16, 2021 at 06:51:22AM -0700, Matthew Brost wrote:
> If the context is reset as a result of the request cancelation the
> context reset G2H is received after schedule disable done G2H which is
> likely the wrong order. The schedule disable done G2H release the
> waiting request cancelation code which resubmits the context. This races
> with the context reset G2H which also wants to resubmit the context but
> in this case it really should be a NOP as request cancelation code owns
> the resubmit. Use some clever tricks of checking the context state to
> seal this race until if / when the GuC firmware is fixed.
> 
> v2:
>  (Checkpatch)
>   - Fix typos
> 
> Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation")
> Signed-off-by: Matthew Brost 
> Cc: 
> ---
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 43 ---
>  1 file changed, 37 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> index 3cd2da6f5c03..c3b7bf7319dd 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> @@ -826,17 +826,35 @@ __unwind_incomplete_requests(struct intel_context *ce)
>  static void __guc_reset_context(struct intel_context *ce, bool stalled)
>  {
>   struct i915_request *rq;
> + unsigned long flags;
>   u32 head;
> + bool skip = false;
>  
>   intel_context_get(ce);
>  
>   /*
> -  * GuC will implicitly mark the context as non-schedulable
> -  * when it sends the reset notification. Make sure our state
> -  * reflects this change. The context will be marked enabled
> -  * on resubmission.
> +  * GuC will implicitly mark the context as non-schedulable when it sends
> +  * the reset notification. Make sure our state reflects this change. The
> +  * context will be marked enabled on resubmission.
> +  *
> +  * XXX: If the context is reset as a result of the request cancellation
> +  * this G2H is received after the schedule disable complete G2H which is
> +  * likely wrong as this creates a race between the request cancellation
> +  * code re-submitting the context and this G2H handler. This likely
> +  * should be fixed in the GuC but until if / when that gets fixed we
> +  * need to workaround this. Convert this function to a NOP if a pending
> +  * enable is in flight as this indicates that a request cancellation has
> +  * occurred.
>    */
> - clr_context_enabled(ce);
> + spin_lock_irqsave(&ce->guc_state.lock, flags);
> + if (likely(!context_pending_enable(ce))) {
> + clr_context_enabled(ce);
> + } else {
> + skip = true;
> + }
> + spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> + if (unlikely(skip))
> + goto out_put;
>  
>   rq = intel_context_find_active_request(ce);
>   if (!rq) {
> @@ -855,6 +873,7 @@ static void __guc_reset_context(struct intel_context *ce, 
> bool stalled)
>  out_replay:
>   guc_reset_state(ce, head, stalled);
>   __unwind_incomplete_requests(ce);
> +out_put:
>   intel_context_put(ce);
>  }
>  
> @@ -1599,6 +1618,13 @@ static void guc_context_cancel_request(struct 
> intel_context *ce,
>   guc_reset_state(ce, intel_ring_wrap(ce->ring, rq->head),
>   true);
>   }
> +
> + /*
> +  * XXX: Racey if context is reset, see comment in
> +  * __guc_reset_context().
> +  */
> + flush_work(&ce_to_guc(ce)->ct.requests.worker);

This looks racy, and I think that holds in general for all the flush_work
you're adding: This only flushes the processing of the work, it doesn't
stop any re-queueing (as far as I can tell at least), which means it
doesn't do a whole lot.

Worse, your work re-queues itself because it only processes one item at a
time. That means flush_work only flushes the first invocation, and doesn't
even drain them all. So even if you do prevent requeueing somehow, this isn't
what you want. Two solutions:

- flush_work_sync, which flushes until self-requeues are all done too

- Or more preferred, make your worker a bit more standard for this
  stuff: a) under the spinlock, take the entire list, not just the first
  entry, with list_move or similar to a local list b) process that local
  list in a loop c) don't requeue yourself.
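
A rough userspace sketch of that second option, assuming a hypothetical
"struct queue" protected by a pthread mutex instead of a kernel spinlock and
a hand-rolled singly linked list instead of list_head. In kernel code the
detach step would more likely be something like list_splice_init(); this
only shows the shape.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical work item and queue; not the GuC CT structures. */
struct item {
	struct item *next;
	int payload;
};

struct queue {
	pthread_mutex_t lock;
	struct item *head;	/* pending items, newest first */
};

static void queue_push(struct queue *q, struct item *it)
{
	pthread_mutex_lock(&q->lock);
	it->next = q->head;
	q->head = it;
	pthread_mutex_unlock(&q->lock);
}

/*
 * One invocation handles everything queued so far: a) detach the whole
 * list under the lock, b) process the private copy in a loop with the
 * lock dropped, c) never requeue itself. Returns the number of items
 * processed and accumulates their payloads into *sum.
 */
static int worker(struct queue *q, int *sum)
{
	struct item *local;
	int n = 0;

	pthread_mutex_lock(&q->lock);
	local = q->head;
	q->head = NULL;
	pthread_mutex_unlock(&q->lock);

	while (local) {
		struct item *next = local->next;

		*sum += local->payload;
		n++;
		local = next;
	}
	return n;
}
```

Because a single run drains the queue, flushing that one run is enough to
know that everything queued before the flush has been handled. This sketch
processes newest-first for brevity; a real worker would keep FIFO order.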

Cheers, Daniel
> +
>   guc_context_unblock(ce);
>   }
>  }
> @@ -2719,7 +2745,12 @@ static void guc_handle_context_reset(struct intel_guc 
> *guc,
>  {
>   trace_intel_context_reset(ce);
>  
> - if (likely(!intel_context_is_banned(ce))) {
> + /*
> +  * XXX: Racey if request cancellation has occurred, see comment in
> +  * __guc_reset_context().
> +  */
> + if (likely(!intel_context_is_banned(ce) &&
> +!context_blocked(ce))) {
>

Re: [Intel-gfx] [PATCH 06/22] drm/i915/execlists: Do not propagate errors to dependent fences

On Mon, Aug 16, 2021 at 06:51:23AM -0700, Matthew Brost wrote:
> Propagating errors to dependent fences is wrong, don't do it. Selftest
> in following patch exposes this bug.

Please explain what "this bug" is, it's hard to read minds, especially at
a distance in spacetime :-)

> Fixes: 8e9f84cf5cac ("drm/i915/gt: Propagate change in error status to 
> children on unhold")

I think it would be better to outright revert this, instead of just
disabling it like this.

Also please cite the dma_fence error propagation revert from Jason:

commit 93a2711cddd5760e2f0f901817d71c93183c3b87
Author: Jason Ekstrand 
Date:   Wed Jul 14 14:34:16 2021 -0500

Revert "drm/i915: Propagate errors on awaiting already signaled fences"

Maybe in full, if you need the justification.

> Signed-off-by: Matthew Brost 
> Cc: 

Unless "this bug" is some real world impact thing I wouldn't put cc:
stable on this.
-Daniel
> ---
>  drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
> b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index de5f9c86b9a4..cafb0608ffb4 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -2140,10 +2140,6 @@ static void __execlists_unhold(struct i915_request *rq)
>   if (p->flags & I915_DEPENDENCY_WEAK)
>   continue;
>  
> - /* Propagate any change in error status */
> - if (rq->fence.error)
> - i915_request_set_error_once(w, rq->fence.error);
> -
>   if (w->engine != rq->engine)
>   continue;
>  
> -- 
> 2.32.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [PATCH v6 10/15] drm/i915/pxp: interfaces for using protected objects

On Mon, Aug 16, 2021 at 08:58:49AM -0700, Daniele Ceraolo Spurio wrote:
> 
> 
> On 8/16/2021 8:15 AM, Daniel Vetter wrote:
> > On Fri, Aug 13, 2021 at 08:18:02AM -0700, Daniele Ceraolo Spurio wrote:
> > > 
> > > On 8/13/2021 7:37 AM, Daniel Vetter wrote:
> > > > On Wed, Jul 28, 2021 at 07:01:01PM -0700, Daniele Ceraolo Spurio wrote:
> > > > > This API allows user mode to create protected buffers and to mark
> > > > > contexts as making use of such objects. Only when using contexts
> > > > > marked in such a way is the execution guaranteed to work as expected.
> > > > > 
> > > > > Contexts can only be marked as using protected content at creation 
> > > > > time
> > > > > (i.e. the parameter is immutable) and they must be both bannable and 
> > > > > not
> > > > > recoverable.
> > > > > 
> > > > > All protected objects and contexts that have backing storage will be
> > > > > considered invalid when the PXP session is destroyed and all new
> > > > > submissions using them will be rejected. All intel contexts within the
> > > > > invalidated gem contexts will be marked banned. A new flag has been
> > > > > added to the RESET_STATS ioctl to report the context invalidation to
> > > > > userspace.
> > > > > 
> > > > > This patch was previously sent as 2 separate patches, which have been
> > > > > squashed following a request to have all the uapi in a single patch.
> > > > > I've retained the s-o-b from both.
> > > > > 
> > > > > v5: squash patches, rebase on proto_ctx, update kerneldoc
> > > > > 
> > > > > v6: rebase on obj create_ext changes
> > > > > 
> > > > > Signed-off-by: Daniele Ceraolo Spurio 
> > > > > 
> > > > > Signed-off-by: Bommu Krishnaiah 
> > > > > Cc: Rodrigo Vivi 
> > > > > Cc: Chris Wilson 
> > > > > Cc: Lionel Landwerlin 
> > > > > Cc: Jason Ekstrand 
> > > > > Cc: Daniel Vetter 
> > > > > Reviewed-by: Rodrigo Vivi  #v5
> > > > > ---
> > > > >drivers/gpu/drm/i915/gem/i915_gem_context.c   | 68 --
> > > > >drivers/gpu/drm/i915/gem/i915_gem_context.h   | 18 
> > > > >.../gpu/drm/i915/gem/i915_gem_context_types.h |  2 +
> > > > >drivers/gpu/drm/i915/gem/i915_gem_create.c| 75 
> > > > >.../gpu/drm/i915/gem/i915_gem_execbuffer.c| 40 -
> > > > >drivers/gpu/drm/i915/gem/i915_gem_object.c|  6 ++
> > > > >drivers/gpu/drm/i915/gem/i915_gem_object.h| 12 +++
> > > > >.../gpu/drm/i915/gem/i915_gem_object_types.h  |  9 ++
> > > > >drivers/gpu/drm/i915/pxp/intel_pxp.c  | 89 
> > > > > +++
> > > > >drivers/gpu/drm/i915/pxp/intel_pxp.h  | 15 
> > > > >drivers/gpu/drm/i915/pxp/intel_pxp_session.c  |  3 +
> > > > >drivers/gpu/drm/i915/pxp/intel_pxp_types.h|  5 ++
> > > > >include/uapi/drm/i915_drm.h   | 55 +++-
> > > > >13 files changed, 371 insertions(+), 26 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> > > > > b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > index cff72679ad7c..0cd3e2d06188 100644
> > > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > > @@ -77,6 +77,8 @@
> > > > >#include "gt/intel_gpu_commands.h"
> > > > >#include "gt/intel_ring.h"
> > > > > +#include "pxp/intel_pxp.h"
> > > > > +
> > > > >#include "i915_gem_context.h"
> > > > >#include "i915_trace.h"
> > > > >#include "i915_user_extensions.h"
> > > > > @@ -241,6 +243,25 @@ static int proto_context_set_persistence(struct 
> > > > > drm_i915_private *i915,
> > > > >   return 0;
> > > > >}
> > > > > +static int proto_context_set_protected(struct drm_i915_private *i915,
> > > > > +struct i915_gem_proto_context 
> > > > > *pc,
> > > > > +bool protected)
> > > > > +{
> > > > > + int ret = 0;
> > > > > +
> > > > > + if (!intel_pxp_is_enabled(&i915->gt.pxp))
> > > > > + ret = -ENODEV;
> > > > > + else if (!protected)
> > > > > + pc->user_flags &= ~BIT(UCONTEXT_PROTECTED);
> > > > > + else if ((pc->user_flags & BIT(UCONTEXT_RECOVERABLE)) ||
> > > > > +  !(pc->user_flags & BIT(UCONTEXT_BANNABLE)))
> > > > > + ret = -EPERM;
> > > > > + else
> > > > > + pc->user_flags |= BIT(UCONTEXT_PROTECTED);
> > > > > +
> > > > > + return ret;
> > > > > +}
> > > > > +
> > > > >static struct i915_gem_proto_context *
> > > > >proto_context_create(struct drm_i915_private *i915, unsigned int 
> > > > > flags)
> > > > >{
> > > > > @@ -686,6 +707,8 @@ static int set_proto_ctx_param(struct 
> > > > > drm_i915_file_private *fpriv,
> > > > >   ret = -EPERM;
> > > > >   else if (args->value)
> > > > >   pc->user_flags |= BIT(UCONTEXT_BANNABLE);
> > > > > + else if (pc->user_flags & BIT(UCONTEXT_PROTECTED))
> > > > > +

[Intel-gfx] [PATCH] drm/msm: Improve drm/sched point of no return rules

Originally drm_sched_job_init was the point of no return, after which
drivers really should submit a job. I've split that up, which allows
us to fix this issue pretty easily.

Only thing we have to take care of is to not skip to error paths after
that. Other drivers do this the same for out-fence and similar things.

v2: It's not really a bugfix, just an improvement, since all
drm_sched_job_arm does is reserve the fence number. And gaps should be
fine, as long as the drm_sched_job doesn't escape anywhere at all.

For robustness it's still better to align with other drivers here and
not bail out after job_arm().
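
As a sketch of the rule only (hypothetical helpers and a made-up struct job,
not the drm/sched API): everything that may bail out early runs before the
arm step, and failures after it are recorded in ret while the job is still
pushed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a scheduler job; not the drm/sched API. */
struct job {
	bool armed;
	bool pushed;
	int seqno;
};

static int next_seqno = 1;

/* Reserves a fence number; after this the job should be submitted. */
static void job_arm(struct job *j)
{
	j->seqno = next_seqno++;
	j->armed = true;
}

static void job_push(struct job *j)
{
	assert(j->armed);
	j->pushed = true;
}

static int submit(struct job *j, bool prepare_fails, bool late_step_fails)
{
	int ret = 0;

	if (prepare_fails)
		return -EINVAL;	/* bailing out is fine, nothing is armed */

	job_arm(j);		/* point of no return */

	if (late_step_fails)
		ret = -ENOMEM;	/* record the error, but do not skip ahead */

	job_push(j);		/* always reached once the job is armed */
	return ret;
}
```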

Cc: Rob Clark 
Cc: Rob Clark 
Cc: Sean Paul 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-arm-...@vger.kernel.org
Cc: dri-de...@lists.freedesktop.org
Cc: freedr...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Signed-off-by: Daniel Vetter 
---
 drivers/gpu/drm/msm/msm_gem_submit.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index 4d1c4d5f6a2a..371ed9154e58 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -52,8 +52,6 @@ static struct msm_gem_submit *submit_create(struct drm_device 
*dev,
return ERR_PTR(ret);
}
 
-   drm_sched_job_arm(&submit->base);
-
xa_init_flags(&submit->deps, XA_FLAGS_ALLOC);
 
kref_init(&submit->ref);
@@ -882,6 +880,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
 
submit->user_fence = dma_fence_get(&submit->base.s_fence->finished);
 
+   drm_sched_job_arm(&submit->base);
+
/*
 * Allocate an id which can be used by WAIT_FENCE ioctl to map back
 * to the underlying fence.
@@ -891,17 +891,16 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void 
*data,
if (submit->fence_id < 0) {
ret = submit->fence_id = 0;
submit->fence_id = 0;
-   goto out;
}
 
-   if (args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
+   if (ret == 0 && args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
struct sync_file *sync_file = 
sync_file_create(submit->user_fence);
if (!sync_file) {
ret = -ENOMEM;
-   goto out;
+   } else {
+   fd_install(out_fence_fd, sync_file->file);
+   args->fence_fd = out_fence_fd;
}
-   fd_install(out_fence_fd, sync_file->file);
-   args->fence_fd = out_fence_fd;
}
 
submit_attach_object_fences(submit);
-- 
2.32.0

[Intel-gfx] [PATCH] drm/sched: Split drm_sched_job_init