date:20210826

[Intel-gfx] ✓ Fi.CI.IGT: success for Clean up GuC CI failures, simplify locking, and kernel DOC (rev7)

2021-08-26 Thread Patchwork

== Series Details ==

Series: Clean up GuC CI failures, simplify locking, and kernel DOC (rev7)
URL   : https://patchwork.freedesktop.org/series/93704/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10526_full -> Patchwork_20907_full


Summary
---

  **SUCCESS**

  No regressions found.

  

New tests
-

  New tests have been introduced between CI_DRM_10526_full and 
Patchwork_20907_full:

### New IGT tests (1) ###

  * igt@i915_selftest@live@guc:
- Statuses : 5 pass(s)
- Exec time: [0.52, 1.48] s

  

Known issues


  Here are the changes found in Patchwork_20907_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_create@create-massive:
- shard-snb:  NOTRUN -> [DMESG-WARN][1] ([i915#3002])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-snb6/igt@gem_cre...@create-massive.html
- shard-kbl:  NOTRUN -> [DMESG-WARN][2] ([i915#3002])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-kbl6/igt@gem_cre...@create-massive.html

  * igt@gem_ctx_persistence@legacy-engines-queued:
- shard-snb:  NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#1099]) +2 
similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-snb7/igt@gem_ctx_persiste...@legacy-engines-queued.html

  * igt@gem_eio@unwedge-stress:
- shard-tglb: [PASS][4] -> [TIMEOUT][5] ([i915#2369] / [i915#3063] 
/ [i915#3648])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10526/shard-tglb8/igt@gem_...@unwedge-stress.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-tglb8/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-deadline:
- shard-apl:  NOTRUN -> [FAIL][6] ([i915#2846])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-apl1/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-tglb: NOTRUN -> [FAIL][7] ([i915#2842])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-tglb2/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-iclb: NOTRUN -> [FAIL][8] ([i915#2842])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-iclb1/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_fair@basic-pace@vecs0:
- shard-tglb: [PASS][9] -> [FAIL][10] ([i915#2842]) +1 similar issue
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10526/shard-tglb7/igt@gem_exec_fair@basic-p...@vecs0.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-tglb8/igt@gem_exec_fair@basic-p...@vecs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-iclb: [PASS][11] -> [FAIL][12] ([i915#2849])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10526/shard-iclb1/igt@gem_exec_fair@basic-throt...@rcs0.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-iclb8/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_exec_flush@basic-batch-kernel-default-cmd:
- shard-snb:  NOTRUN -> [SKIP][13] ([fdo#109271]) +267 similar 
issues
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-snb6/igt@gem_exec_fl...@basic-batch-kernel-default-cmd.html

  * igt@gem_huc_copy@huc-copy:
- shard-apl:  NOTRUN -> [SKIP][14] ([fdo#109271] / [i915#2190])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-apl6/igt@gem_huc_c...@huc-copy.html

  * igt@gem_pread@exhaustion:
- shard-apl:  NOTRUN -> [WARN][15] ([i915#2658])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-apl8/igt@gem_pr...@exhaustion.html
- shard-snb:  NOTRUN -> [WARN][16] ([i915#2658])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-snb7/igt@gem_pr...@exhaustion.html

  * igt@gem_userptr_blits@coherency-unsync:
- shard-tglb: NOTRUN -> [SKIP][17] ([i915#3297])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-tglb2/igt@gem_userptr_bl...@coherency-unsync.html

  * igt@gem_userptr_blits@input-checking:
- shard-apl:  NOTRUN -> [DMESG-WARN][18] ([i915#3002]) +1 similar 
issue
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-apl1/igt@gem_userptr_bl...@input-checking.html

  * igt@gem_userptr_blits@vma-merge:
- shard-apl:  NOTRUN -> [FAIL][19] ([i915#3318])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-apl7/igt@gem_userptr_bl...@vma-merge.html
- shard-skl:  NOTRUN -> [FAIL][20] ([i915#3318])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/shard-skl4/igt@gem_userptr_bl...@vma-merge.html

  * igt@gen9_exec_parse@allowed-single:
- shard-skl:  [PASS][21] -> [DMESG-WARN][22] ([i915#1436] / 
[i915#716])
   [21]:

Re: [Intel-gfx] [PATCH v15 10/12] swiotlb: Add restricted DMA pool initialization

2021-08-26 Thread Claire Chang

On Tue, Aug 24, 2021 at 10:26 PM Guenter Roeck  wrote:
>
> Hi Claire,
>
> On Thu, Jun 24, 2021 at 11:55:24PM +0800, Claire Chang wrote:
> > Add the initialization function to create restricted DMA pools from
> > matching reserved-memory nodes.
> >
> > Regardless of swiotlb setting, the restricted DMA pool is preferred if
> > available.
> >
> > The restricted DMA pools provide a basic level of protection against the
> > DMA overwriting buffer contents at unexpected times. However, to protect
> > against general data leakage and system memory corruption, the system
> > needs to provide a way to lock down the memory access, e.g., MPU.
> >
> > Signed-off-by: Claire Chang 
> > Reviewed-by: Christoph Hellwig 
> > Tested-by: Stefano Stabellini 
> > Tested-by: Will Deacon 
> > ---
> >  include/linux/swiotlb.h |  3 +-
> >  kernel/dma/Kconfig  | 14 
> >  kernel/dma/swiotlb.c| 76 +
> >  3 files changed, 92 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> > index 3b9454d1e498..39284ff2a6cd 100644
> > --- a/include/linux/swiotlb.h
> > +++ b/include/linux/swiotlb.h
> > @@ -73,7 +73,8 @@ extern enum swiotlb_force swiotlb_force;
> >   *   range check to see if the memory was in fact allocated by this
> >   *   API.
> >   * @nslabs:  The number of IO TLB blocks (in groups of 64) between @start 
> > and
> > - *   @end. This is command line adjustable via setup_io_tlb_npages.
> > + *   @end. For default swiotlb, this is command line adjustable via
> > + *   setup_io_tlb_npages.
> >   * @used:The number of used IO TLB block.
> >   * @list:The free list describing the number of free entries available
> >   *   from each index.
> > diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
> > index 77b405508743..3e961dc39634 100644
> > --- a/kernel/dma/Kconfig
> > +++ b/kernel/dma/Kconfig
> > @@ -80,6 +80,20 @@ config SWIOTLB
> >   bool
> >   select NEED_DMA_MAP_STATE
> >
> > +config DMA_RESTRICTED_POOL
> > + bool "DMA Restricted Pool"
> > + depends on OF && OF_RESERVED_MEM
> > + select SWIOTLB
>
> This makes SWIOTLB user configurable, which in turn results in
>
> mips64-linux-ld: arch/mips/kernel/setup.o: in function `arch_mem_init':
> setup.c:(.init.text+0x19c8): undefined reference to `plat_swiotlb_setup'
> make[1]: *** [Makefile:1280: vmlinux] Error 1
>
> when building mips:allmodconfig.
>
> Should this possibly be "depends on SWIOTLB" ?

Patch is sent here: https://lkml.org/lkml/2021/8/26/932

>
> Thanks,
> Guenter

Thanks,
Claire

Re: [Intel-gfx] [PATCH 02/27] drm/i915/guc: Fix outstanding G2H accounting

2021-08-26 Thread Matthew Brost

On Thu, Aug 26, 2021 at 04:09:59PM -0700, Daniele Ceraolo Spurio wrote:
> 
> 
> On 8/25/2021 8:23 PM, Matthew Brost wrote:
> > A small race that could result in incorrect accounting of the number
> > of outstanding G2H. Basically prior to this patch we did not increment
> > the number of outstanding G2H if we encountered a GT reset while sending
> > a H2G. This was incorrect as the context state had already been updated
> > to anticipate a G2H response thus the counter should be incremented.
> > 
> > Also always use helper when decrementing this value.
> > 
> > Fixes: f4eb1f3fe946 ("drm/i915/guc: Ensure G2H response has space in 
> > buffer")
> > Signed-off-by: Matthew Brost 
> > Cc: 
> > ---
> >   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 23 ++-
> >   1 file changed, 12 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
> > b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > index 69faa39da178..03a86da6011e 100644
> > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
> > @@ -352,6 +352,12 @@ static inline void set_lrc_desc_registered(struct 
> > intel_guc *guc, u32 id,
> > xa_unlock_irqrestore(&guc->context_lookup, flags);
> >   }
> > +static void decr_outstanding_submission_g2h(struct intel_guc *guc)
> > +{
> > +   if (atomic_dec_and_test(&guc->outstanding_submission_g2h))
> > +   wake_up_all(&guc->ct.wq);
> > +}
> > +
> >   static int guc_submission_send_busy_loop(struct intel_guc *guc,
> >  const u32 *action,
> >  u32 len,
> > @@ -360,11 +366,12 @@ static int guc_submission_send_busy_loop(struct 
> > intel_guc *guc,
> >   {
> > int err;
> > -   err = intel_guc_send_busy_loop(guc, action, len, g2h_len_dw, loop);
> > -
> > -   if (!err && g2h_len_dw)
> > +   if (g2h_len_dw)
> > atomic_inc(&guc->outstanding_submission_g2h);
> > +   err = intel_guc_send_busy_loop(guc, action, len, g2h_len_dw, loop);
> > +   GEM_BUG_ON(g2h_len_dw && err == -EBUSY);
> 
> AFAICS expecting a G2H reply is not tied to not returning EBUSY; the only way
> to avoid EBUSY seems to be for loop to be true. Maybe have instead:
> 
> GEM_BUG_ON(g2h_len_dw && !loop);
> 
> earlier on?
> 

Yep, that is better. Can you respin this for me while I'm out?

Matt

> Daniele
> 
> > +
> > return err;
> >   }
> > @@ -616,7 +623,7 @@ static void scrub_guc_desc_for_outstanding_g2h(struct 
> > intel_guc *guc)
> > init_sched_state(ce);
> > if (pending_enable || destroyed || deregister) {
> > -   atomic_dec(&guc->outstanding_submission_g2h);
> > +   decr_outstanding_submission_g2h(guc);
> > if (deregister)
> > guc_signal_context_fence(ce);
> > if (destroyed) {
> > @@ -635,7 +642,7 @@ static void scrub_guc_desc_for_outstanding_g2h(struct 
> > intel_guc *guc)
> > intel_engine_signal_breadcrumbs(ce->engine);
> > }
> > intel_context_sched_disable_unpin(ce);
> > -   atomic_dec(&guc->outstanding_submission_g2h);
> > +   decr_outstanding_submission_g2h(guc);
> > spin_lock_irqsave(&ce->guc_state.lock, flags);
> > guc_blocked_fence_complete(ce);
> > spin_unlock_irqrestore(&ce->guc_state.lock, flags);
> > @@ -2583,12 +2590,6 @@ g2h_context_lookup(struct intel_guc *guc, u32 
> > desc_idx)
> > return ce;
> >   }
> > -static void decr_outstanding_submission_g2h(struct intel_guc *guc)
> > -{
> > -   if (atomic_dec_and_test(&guc->outstanding_submission_g2h))
> > -   wake_up_all(&guc->ct.wq);
> > -}
> > -
> >   int intel_guc_deregister_done_process_msg(struct intel_guc *guc,
> >   const u32 *msg,
> >   u32 len)
>
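
Editorial sketch of the ordering discussed in this thread, written as plain
user-space C so it can be compiled on its own. Everything here is an
assumption made for illustration: struct mock_guc, mock_send_busy_loop() and
submission_send() are hypothetical stand-ins, C11 atomics replace the
kernel's atomic_t, a counter replaces wake_up_all(), and assert() replaces
GEM_BUG_ON(). It is not the respun patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct intel_guc; only what the sketch needs. */
struct mock_guc {
	atomic_int outstanding_submission_g2h;
	int wakeups;   /* counts stand-in wake_up_all() calls */
	int send_err;  /* error the mock send path should return */
};

/* Stand-in for intel_guc_send_busy_loop(): returns the configured error. */
static int mock_send_busy_loop(struct mock_guc *guc, bool loop)
{
	(void)loop;
	return guc->send_err;
}

/* Mirrors the helper in the patch: wake waiters when the count hits zero. */
static void decr_outstanding_submission_g2h(struct mock_guc *guc)
{
	/* atomic_fetch_sub() returns the old value, so 1 means "now zero". */
	if (atomic_fetch_sub(&guc->outstanding_submission_g2h, 1) == 1)
		guc->wakeups++;
}

static int submission_send(struct mock_guc *guc, unsigned int g2h_len_dw,
			   bool loop)
{
	/* Suggested check: a message expecting a G2H must be sent with loop set. */
	assert(!(g2h_len_dw && !loop));

	/* Count the expected G2H before sending, even if the send then fails. */
	if (g2h_len_dw)
		atomic_fetch_add(&guc->outstanding_submission_g2h, 1);

	return mock_send_busy_loop(guc, loop);
}
```

With send_err set to an arbitrary error such as -ENODEV (standing in for a
send that fails during a reset), a call to submission_send(&guc, 2, true)
still leaves the counter at 1, and a later
decr_outstanding_submission_g2h() brings it back to 0 and records one
wakeup.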

[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: remove unused i915->active_pipes

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/i915: remove unused i915->active_pipes
URL   : https://patchwork.freedesktop.org/series/94076/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10525_full -> Patchwork_20906_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20906_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20906_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20906_full:

### IGT changes ###

 Possible regressions 

  * igt@syncobj_timeline@wait-for-submit-snapshot:
- shard-skl:  [PASS][1] -> [FAIL][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-skl5/igt@syncobj_timel...@wait-for-submit-snapshot.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20906/shard-skl7/igt@syncobj_timel...@wait-for-submit-snapshot.html

  * igt@sysfs_heartbeat_interval@mixed@vcs0:
- shard-skl:  [PASS][3] -> [WARN][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-skl5/igt@sysfs_heartbeat_interval@mi...@vcs0.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20906/shard-skl9/igt@sysfs_heartbeat_interval@mi...@vcs0.html

  
 Suppressed 

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@gem_workarounds@suspend-resume-fd:
- {shard-rkl}:[PASS][5] -> [INCOMPLETE][6]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-rkl-2/igt@gem_workarou...@suspend-resume-fd.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20906/shard-rkl-1/igt@gem_workarou...@suspend-resume-fd.html

  

### Piglit changes ###

 Possible regressions 

  * spec@ext_packed_depth_stencil@depthstencil-render-miplevels 146 
d=z24_s8_s=z24_s8 (NEW):
- pig-skl-6260u:  NOTRUN -> [INCOMPLETE][7]
   [7]: None

  
New tests
-

  New tests have been introduced between CI_DRM_10525_full and 
Patchwork_20906_full:

### New Piglit tests (1) ###

  * spec@ext_packed_depth_stencil@depthstencil-render-miplevels 146 
d=z24_s8_s=z24_s8:
- Statuses : 1 incomplete(s)
- Exec time: [0.0] s

  

Known issues


  Here are the changes found in Patchwork_20906_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_ctx_persistence@idempotent:
- shard-snb:  NOTRUN -> [SKIP][8] ([fdo#109271] / [i915#1099])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20906/shard-snb2/igt@gem_ctx_persiste...@idempotent.html

  * igt@gem_ctx_sseu@mmap-args:
- shard-tglb: NOTRUN -> [SKIP][9] ([i915#280])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20906/shard-tglb8/igt@gem_ctx_s...@mmap-args.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
- shard-tglb: [PASS][10] -> [FAIL][11] ([i915#2842])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-tglb6/igt@gem_exec_fair@basic-none-sh...@rcs0.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20906/shard-tglb6/igt@gem_exec_fair@basic-none-sh...@rcs0.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-tglb: NOTRUN -> [FAIL][12] ([i915#2842])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20906/shard-tglb5/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vecs0:
- shard-kbl:  [PASS][13] -> [FAIL][14] ([i915#2842]) +2 similar 
issues
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-kbl4/igt@gem_exec_fair@basic-p...@vecs0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20906/shard-kbl2/igt@gem_exec_fair@basic-p...@vecs0.html

  * igt@gem_exec_params@secure-non-master:
- shard-tglb: NOTRUN -> [SKIP][15] ([fdo#112283])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20906/shard-tglb8/igt@gem_exec_par...@secure-non-master.html

  * igt@gem_pread@exhaustion:
- shard-apl:  NOTRUN -> [WARN][16] ([i915#2658])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20906/shard-apl3/igt@gem_pr...@exhaustion.html

  * igt@gem_render_copy@yf-tiled-to-vebox-linear:
- shard-skl:  NOTRUN -> [SKIP][17] ([fdo#109271]) +21 similar issues
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20906/shard-skl10/igt@gem_render_c...@yf-tiled-to-vebox-linear.html
- shard-iclb: NOTRUN -> [SKIP][18] ([i915#768])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20906/shard-iclb5/igt@gem_render_c...@yf-tiled-to-vebox-linear.html

  * igt@gem_userptr_blits@dmabuf-sync:
- shard-kbl:

[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915: Ensure wa_init_finish() is called for ctx workaround list (rev2)

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/i915: Ensure wa_init_finish() is called for ctx workaround list 
(rev2)
URL   : https://patchwork.freedesktop.org/series/94053/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10525_full -> Patchwork_20905_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Known issues


  Here are the changes found in Patchwork_20905_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@device_reset@unbind-reset-rebind:
- shard-tglb: [PASS][1] -> [INCOMPLETE][2] ([i915#750])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-tglb6/igt@device_re...@unbind-reset-rebind.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-tglb1/igt@device_re...@unbind-reset-rebind.html

  * igt@feature_discovery@psr2:
- shard-iclb: [PASS][3] -> [SKIP][4] ([i915#658])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-iclb2/igt@feature_discov...@psr2.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-iclb8/igt@feature_discov...@psr2.html

  * igt@gem_ctx_isolation@preservation-s3@vecs0:
- shard-apl:  [PASS][5] -> [DMESG-WARN][6] ([i915#180])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-apl7/igt@gem_ctx_isolation@preservation...@vecs0.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-apl2/igt@gem_ctx_isolation@preservation...@vecs0.html

  * igt@gem_ctx_persistence@idempotent:
- shard-snb:  NOTRUN -> [SKIP][7] ([fdo#109271] / [i915#1099])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-snb5/igt@gem_ctx_persiste...@idempotent.html

  * igt@gem_ctx_sseu@mmap-args:
- shard-tglb: NOTRUN -> [SKIP][8] ([i915#280])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-tglb3/igt@gem_ctx_s...@mmap-args.html

  * igt@gem_exec_fair@basic-deadline:
- shard-apl:  NOTRUN -> [FAIL][9] ([i915#2846])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-apl8/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-tglb: NOTRUN -> [FAIL][10] ([i915#2842])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-tglb5/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-tglb: [PASS][11] -> [FAIL][12] ([i915#2842])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-tglb5/igt@gem_exec_fair@basic-p...@vcs1.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-tglb3/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_params@secure-non-master:
- shard-tglb: NOTRUN -> [SKIP][13] ([fdo#112283])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-tglb3/igt@gem_exec_par...@secure-non-master.html

  * igt@gem_mmap_gtt@cpuset-big-copy:
- shard-glk:  [PASS][14] -> [FAIL][15] ([i915#1888] / [i915#307])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-glk4/igt@gem_mmap_...@cpuset-big-copy.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-glk4/igt@gem_mmap_...@cpuset-big-copy.html

  * igt@gem_pread@exhaustion:
- shard-apl:  NOTRUN -> [WARN][16] ([i915#2658]) +1 similar issue
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-apl6/igt@gem_pr...@exhaustion.html

  * igt@gem_render_copy@yf-tiled-to-vebox-linear:
- shard-skl:  NOTRUN -> [SKIP][17] ([fdo#109271]) +19 similar issues
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-skl8/igt@gem_render_c...@yf-tiled-to-vebox-linear.html
- shard-iclb: NOTRUN -> [SKIP][18] ([i915#768])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-iclb7/igt@gem_render_c...@yf-tiled-to-vebox-linear.html

  * igt@gem_userptr_blits@dmabuf-sync:
- shard-kbl:  NOTRUN -> [SKIP][19] ([fdo#109271] / [i915#3323])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-kbl3/igt@gem_userptr_bl...@dmabuf-sync.html

  * igt@gem_userptr_blits@input-checking:
- shard-snb:  NOTRUN -> [DMESG-WARN][20] ([i915#3002])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-snb5/igt@gem_userptr_bl...@input-checking.html

  * igt@gem_userptr_blits@readonly-pwrite-unsync:
- shard-tglb: NOTRUN -> [SKIP][21] ([i915#3297])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-tglb5/igt@gem_userptr_bl...@readonly-pwrite-unsync.html
- shard-iclb: NOTRUN -> [SKIP][22] ([i915#3297])
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/shard-iclb7/igt@gem_userptr_bl...@readonly-pwrite-unsync.html

  * igt@gem_userptr_blits@vma-merge:
- shard-snb:  NOTRUN -> [FAIL][23]

Re: [Intel-gfx] [PATCH 11/27] drm/i915/guc: Copy whole golden context, set engine state size of subset

2021-08-26 Thread John Harrison


On 8/25/2021 20:23, Matthew Brost wrote:

When the GuC does a media reset, it copies a golden context state back
into the corrupted context's state. The address of the golden context
and the size of the engine state restore are passed in via the GuC ADS.
The i915 had a bug where it passed in the whole size of the golden
context, not the size of the engine state to restore resulting in a
memory corruption.

Also copy the entire golden context on init rather than just the engine
state that is restored.

Fixes: 481d458caede ("drm/i915/guc: Add golden context to GuC ADS")
Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 28 +-
  1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 6926919bcac6..df2734bfe078 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -358,6 +358,11 @@ static int guc_prep_golden_context(struct intel_guc *guc,
u8 engine_class, guc_class;
struct guc_gt_system_info *info, local_info;
  
+	/* Skip execlist and PPGTT registers + HWSP */

+   const u32 lr_hw_context_size = 80 * sizeof(u32);
+   const u32 skip_size = LRC_PPHWSP_SZ * PAGE_SIZE +
+   lr_hw_context_size;
+
/*
 * Reserve the memory for the golden contexts and point GuC at it but
 * leave it empty for now. The context data will be filled in later
@@ -396,7 +401,18 @@ static int guc_prep_golden_context(struct intel_guc *guc,
if (!blob)
continue;
  
-		blob->ads.eng_state_size[guc_class] = real_size;

+   /*
+* This interface is slightly confusing. We need to pass the
+* base address of the golden context and the engine state size
+* which is not the size of the whole golden context, it is a
+* subset that the GuC uses when doing a watchdog reset. The
+* engine state size must match the size of the golden context
+* minus the first part of the golden context that the GuC does
+* not restore during reset. Currently no real way to verify this
+* other than reading the GuC spec / code and ensuring the
+* 'skip_size' below matches the value used in the GuC code.
+*/
+   blob->ads.eng_state_size[guc_class] = real_size - skip_size;
blob->ads.golden_context_lrca[guc_class] = addr_ggtt;
addr_ggtt += alloc_size;
}
@@ -437,8 +453,8 @@ static void guc_init_golden_context(struct intel_guc *guc)
u8 *ptr;
  
  	/* Skip execlist and PPGTT registers + HWSP */

-   const u32 lr_hw_context_size = 80 * sizeof(u32);
-   const u32 skip_size = LRC_PPHWSP_SZ * PAGE_SIZE +
+   __maybe_unused const u32 lr_hw_context_size = 80 * sizeof(u32);
+   __maybe_unused const u32 skip_size = LRC_PPHWSP_SZ * PAGE_SIZE +
lr_hw_context_size;

Not sure why the 'maybe unused'? The values are not used only in BUG_ONs 
or similar checks that could be compiled out.


More importantly, you now have two sets of definitions for these magic 
numbers. That seems like a very bad idea. They should be moved into a 
helper function rather than repeated.


John.


  
  	if (!intel_uc_uses_guc_submission(&gt->uc))

@@ -476,12 +492,12 @@ static void guc_init_golden_context(struct intel_guc *guc)
continue;
}
  
-		GEM_BUG_ON(blob->ads.eng_state_size[guc_class] != real_size);

+   GEM_BUG_ON(blob->ads.eng_state_size[guc_class] !=
+  real_size - skip_size);
GEM_BUG_ON(blob->ads.golden_context_lrca[guc_class] != 
addr_ggtt);
addr_ggtt += alloc_size;
  
-		shmem_read(engine->default_state, skip_size, ptr + skip_size,

-  real_size - skip_size);
+   shmem_read(engine->default_state, 0, ptr, real_size);
ptr += alloc_size;
}
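
Editorial sketch of the helper John asks for, combined with the size
arithmetic the patch describes (engine state size = whole golden context
minus the part the GuC does not restore). The function names and the
SKETCH_* constants are hypothetical; in particular a 4096-byte page and a
one-page PPHWSP are assumed values chosen for illustration, and the kernel's
own PAGE_SIZE and LRC_PPHWSP_SZ would be used in real code.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in values, assumed for illustration; the kernel provides the real ones. */
#define SKETCH_PAGE_SIZE      4096u
#define SKETCH_LRC_PPHWSP_SZ  1u   /* pages reserved for the per-process HWSP */

/* Hypothetical helper: bytes at the start of the golden context the GuC skips. */
static uint32_t golden_ctx_skip_size(void)
{
	/* Execlist and PPGTT registers, per the patch: 80 dwords. */
	const uint32_t lr_hw_context_size = 80 * sizeof(uint32_t);

	return SKETCH_LRC_PPHWSP_SZ * SKETCH_PAGE_SIZE + lr_hw_context_size;
}

/* Size reported in eng_state_size[]: whole context minus the skipped prefix. */
static uint32_t golden_ctx_engine_state_size(uint32_t real_size)
{
	assert(real_size >= golden_ctx_skip_size());
	return real_size - golden_ctx_skip_size();
}
```

Both guc_prep_golden_context() and guc_init_golden_context() could then call
one function instead of each carrying its own copy of the constants. Under
the assumed values the skipped prefix is 4096 + 320 = 4416 bytes.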

Re: [Intel-gfx] [PATCH 23/27] drm/i915/guc: Move GuC priority fields in context under guc_active

2021-08-26 Thread Daniele Ceraolo Spurio





On 8/25/2021 8:23 PM, Matthew Brost wrote:

Move GuC management fields in context under guc_active struct as this is
where the lock that protects these fields lives. Also only set guc_prio
field once during context init.

v2:
  (Daniele)
   - set CONTEXT_SET_INIT

Signed-off-by: Matthew Brost 


Reviewed-by: Daniele Ceraolo Spurio 

Daniele
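
Editorial sketch of the two ideas in the commit message: keeping the
priority fields inside the guc_active sub-struct, and setting the priority
only once, guarded by an init flag. This is user-space C under stated
assumptions: struct mock_context and guc_prio_init_once() are hypothetical,
a C11 atomic flag word replaces the kernel's bit operations, the lock itself
is omitted, and four priority levels are assumed. Only the bit number 10
comes from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CONTEXT_GUC_INIT  10  /* bit number taken from the patch */
#define SKETCH_PRIORITY_NUM      4   /* assumed count of GuC priority levels */

/* Hypothetical, trimmed-down stand-in for struct intel_context. */
struct mock_context {
	atomic_ulong flags;
	struct {
		/* a real context keeps its spinlock here, next to what it guards */
		uint8_t prio;
		uint32_t prio_count[SKETCH_PRIORITY_NUM];
	} guc_active;
};

/* Set the priority only on the first call; returns true if it did the init. */
static bool guc_prio_init_once(struct mock_context *ce, uint8_t prio)
{
	const unsigned long bit = 1ul << SKETCH_CONTEXT_GUC_INIT;

	/* atomic_fetch_or() returns the old flags, so we can tell who was first. */
	if (atomic_fetch_or(&ce->flags, bit) & bit)
		return false;

	ce->guc_active.prio = prio;
	return true;
}
```

A second call with a different priority returns false and leaves
guc_active.prio untouched, which is the "set once during context init"
behaviour the commit message describes.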


---
  drivers/gpu/drm/i915/gt/intel_context_types.h | 12 ++--
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 69 +++
  drivers/gpu/drm/i915/i915_trace.h |  2 +-
  3 files changed, 46 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 3a5d98e908f4..b56960a781da 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -112,6 +112,7 @@ struct intel_context {
  #define CONTEXT_FORCE_SINGLE_SUBMISSION   7
  #define CONTEXT_NOPREEMPT 8
  #define CONTEXT_LRCA_DIRTY9
+#define CONTEXT_GUC_INIT   10
  
  	struct {

u64 timeout_us;
@@ -178,6 +179,11 @@ struct intel_context {
spinlock_t lock;
/** requests: active requests on this context */
struct list_head requests;
+   /*
+* GuC priority management
+*/
+   u8 prio;
+   u32 prio_count[GUC_CLIENT_PRIORITY_NUM];
} guc_active;
  
  	/* GuC LRC descriptor ID */

@@ -191,12 +197,6 @@ struct intel_context {
 */
struct list_head guc_id_link;
  
-	/*

-* GuC priority management
-*/
-   u8 guc_prio;
-   u32 guc_prio_count[GUC_CLIENT_PRIORITY_NUM];
-
  #ifdef CONFIG_DRM_I915_SELFTEST
/**
 * @drop_schedule_enable: Force drop of schedule enable G2H for selftest
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 14a512533c39..bc68c0122be4 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -1367,8 +1367,6 @@ static void guc_context_policy_init(struct 
intel_engine_cs *engine,
desc->preemption_timeout = engine->props.preempt_timeout_ms * 1000;
  }
  
-static inline u8 map_i915_prio_to_guc_prio(int prio);

-
  static int guc_lrc_desc_pin(struct intel_context *ce, bool loop)
  {
struct intel_engine_cs *engine = ce->engine;
@@ -1376,8 +1374,6 @@ static int guc_lrc_desc_pin(struct intel_context *ce, 
bool loop)
struct intel_guc *guc = &engine->gt->uc.guc;
u32 desc_idx = ce->guc_id;
struct guc_lrc_desc *desc;
-   const struct i915_gem_context *ctx;
-   int prio = I915_CONTEXT_DEFAULT_PRIORITY;
bool context_registered;
intel_wakeref_t wakeref;
int ret = 0;
@@ -1394,12 +1390,6 @@ static int guc_lrc_desc_pin(struct intel_context *ce, 
bool loop)
  
  	context_registered = lrc_desc_registered(guc, desc_idx);
  
-	rcu_read_lock();

-   ctx = rcu_dereference(ce->gem_context);
-   if (ctx)
-   prio = ctx->sched.priority;
-   rcu_read_unlock();
-
reset_lrc_desc(guc, desc_idx);
set_lrc_desc_registered(guc, desc_idx, ce);
  
@@ -1408,8 +1398,7 @@ static int guc_lrc_desc_pin(struct intel_context *ce, bool loop)

desc->engine_submit_mask = adjust_engine_mask(engine->class,
  engine->mask);
desc->hw_context_desc = ce->lrc.lrca;
-   ce->guc_prio = map_i915_prio_to_guc_prio(prio);
-   desc->priority = ce->guc_prio;
+   desc->priority = ce->guc_active.prio;
desc->context_flags = CONTEXT_REGISTRATION_FLAG_KMD;
guc_context_policy_init(engine, desc);
  
@@ -1805,10 +1794,10 @@ static inline void guc_lrc_desc_unpin(struct intel_context *ce)
  
  static void __guc_context_destroy(struct intel_context *ce)

  {
-   GEM_BUG_ON(ce->guc_prio_count[GUC_CLIENT_PRIORITY_KMD_HIGH] ||
-  ce->guc_prio_count[GUC_CLIENT_PRIORITY_HIGH] ||
-  ce->guc_prio_count[GUC_CLIENT_PRIORITY_KMD_NORMAL] ||
-  ce->guc_prio_count[GUC_CLIENT_PRIORITY_NORMAL]);
+   GEM_BUG_ON(ce->guc_active.prio_count[GUC_CLIENT_PRIORITY_KMD_HIGH] ||
+  ce->guc_active.prio_count[GUC_CLIENT_PRIORITY_HIGH] ||
+  ce->guc_active.prio_count[GUC_CLIENT_PRIORITY_KMD_NORMAL] ||
+  ce->guc_active.prio_count[GUC_CLIENT_PRIORITY_NORMAL]);
GEM_BUG_ON(ce->guc_state.number_committed_requests);
  
  	lrc_fini(ce);

@@ -1918,14 +1907,17 @@ static void guc_context_set_prio(struct intel_guc *guc,
  
  	GEM_BUG_ON(prio < GUC_CLIENT_PRIORITY_KMD_HIGH ||

   prio > GUC_CLIENT_PRIORITY_NORMAL);
+   lockdep_assert_held(&ce->guc_active.lock);
  
-	if (ce->guc_prio == prio || submission_disabled(guc) ||

-   !context_registered(ce))
+   if

[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Be more gentle when exiting non-persistent contexts (rev2)

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/i915: Be more gentle when exiting non-persistent contexts (rev2)
URL   : https://patchwork.freedesktop.org/series/93420/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10525_full -> Patchwork_20903_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20903_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20903_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20903_full:

### IGT changes ###

 Possible regressions 

  * igt@i915_suspend@debugfs-reader:
- shard-skl:  [PASS][1] -> [DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-skl5/igt@i915_susp...@debugfs-reader.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/shard-skl3/igt@i915_susp...@debugfs-reader.html

  
Known issues


  Here are the changes found in Patchwork_20903_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@drm_import_export@flink:
- shard-glk:  [PASS][3] -> [INCOMPLETE][4] ([i915#2369])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-glk7/igt@drm_import_exp...@flink.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/shard-glk6/igt@drm_import_exp...@flink.html

  * igt@feature_discovery@psr2:
- shard-iclb: [PASS][5] -> [SKIP][6] ([i915#658])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-iclb2/igt@feature_discov...@psr2.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/shard-iclb4/igt@feature_discov...@psr2.html

  * igt@gem_ctx_persistence@idempotent:
- shard-snb:  NOTRUN -> [SKIP][7] ([fdo#109271] / [i915#1099])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/shard-snb7/igt@gem_ctx_persiste...@idempotent.html

  * igt@gem_ctx_sseu@mmap-args:
- shard-tglb: NOTRUN -> [SKIP][8] ([i915#280])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/shard-tglb2/igt@gem_ctx_s...@mmap-args.html

  * igt@gem_exec_fair@basic-deadline:
- shard-apl:  NOTRUN -> [FAIL][9] ([i915#2846])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/shard-apl1/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-flow@rcs0:
- shard-tglb: [PASS][10] -> [FAIL][11] ([i915#2842]) +1 similar 
issue
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-tglb6/igt@gem_exec_fair@basic-f...@rcs0.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/shard-tglb2/igt@gem_exec_fair@basic-f...@rcs0.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-tglb: NOTRUN -> [FAIL][12] ([i915#2842])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/shard-tglb7/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-none-vip@rcs0:
- shard-kbl:  [PASS][13] -> [FAIL][14] ([i915#2842]) +1 similar 
issue
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-kbl7/igt@gem_exec_fair@basic-none-...@rcs0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/shard-kbl1/igt@gem_exec_fair@basic-none-...@rcs0.html

  * igt@gem_exec_params@secure-non-master:
- shard-tglb: NOTRUN -> [SKIP][15] ([fdo#112283])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/shard-tglb2/igt@gem_exec_par...@secure-non-master.html

  * igt@gem_exec_whisper@basic-queues-priority-all:
- shard-glk:  [PASS][16] -> [DMESG-WARN][17] ([i915#118] / 
[i915#95])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-glk6/igt@gem_exec_whis...@basic-queues-priority-all.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/shard-glk9/igt@gem_exec_whis...@basic-queues-priority-all.html

  * igt@gem_pread@exhaustion:
- shard-apl:  NOTRUN -> [WARN][18] ([i915#2658]) +1 similar issue
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/shard-apl7/igt@gem_pr...@exhaustion.html

  * igt@gem_render_copy@yf-tiled-to-vebox-linear:
- shard-skl:  NOTRUN -> [SKIP][19] ([fdo#109271]) +19 similar issues
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/shard-skl5/igt@gem_render_c...@yf-tiled-to-vebox-linear.html
- shard-iclb: NOTRUN -> [SKIP][20] ([i915#768])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/shard-iclb6/igt@gem_render_c...@yf-tiled-to-vebox-linear.html

  * igt@gem_userptr_blits@input-checking:
- shard-snb:  NOTRUN -> [DMESG-WARN][21] ([i915#3002])
   [21]:

Re: [Intel-gfx] [PATCH 11/27] drm/i915/guc: Copy whole golden context, set engine state size of subset

2021-08-26 Thread Daniele Ceraolo Spurio





On 8/25/2021 8:23 PM, Matthew Brost wrote:

When the GuC does a media reset, it copies a golden context state back
into the corrupted context's state. The address of the golden context
and the size of the engine state restore are passed in via the GuC ADS.
The i915 had a bug where it passed in the whole size of the golden
context, not the size of the engine state to restore resulting in a
memory corruption.

Also copy the entire golden context on init rather than just the engine
state that is restored.

Fixes: 481d458caede ("drm/i915/guc: Add golden context to GuC ADS")
Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c | 28 +-
  1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
index 6926919bcac6..df2734bfe078 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
@@ -358,6 +358,11 @@ static int guc_prep_golden_context(struct intel_guc *guc,
u8 engine_class, guc_class;
struct guc_gt_system_info *info, local_info;
  
+	/* Skip execlist and PPGTT registers + HWSP */

+   const u32 lr_hw_context_size = 80 * sizeof(u32);
+   const u32 skip_size = LRC_PPHWSP_SZ * PAGE_SIZE +
+   lr_hw_context_size;
+
/*
 * Reserve the memory for the golden contexts and point GuC at it but
 * leave it empty for now. The context data will be filled in later
@@ -396,7 +401,18 @@ static int guc_prep_golden_context(struct intel_guc *guc,
if (!blob)
continue;
  
-		blob->ads.eng_state_size[guc_class] = real_size;

+   /*
+* This interface is slightly confusing. We need to pass the
+* base address of the golden context and the engine state size
+* which is not the size of the whole golden context, it is a
+* subset that the GuC uses when doing a watchdog reset. The
+* engine state size must match the size of the golden context
+* minus the first part of the golden context that the GuC does
+* not restore during reset. Currently no real way to verify this
+* other than reading the GuC spec / code and ensuring the
+* 'skip_size' below matches the value used in the GuC code.
+*/


This last statement is incorrect. The skipped size is the PPHWSP + the 
execlists context. The size of the execlists context is defined in the 
specs (as part of the full context layout) and it is therefore not a 
magic number only available in the GuC code. With the comment fixed:


Reviewed-by: Daniele Ceraolo Spurio 

Daniele


+   blob->ads.eng_state_size[guc_class] = real_size - skip_size;
blob->ads.golden_context_lrca[guc_class] = addr_ggtt;
addr_ggtt += alloc_size;
}
@@ -437,8 +453,8 @@ static void guc_init_golden_context(struct intel_guc *guc)
u8 *ptr;
  
  	/* Skip execlist and PPGTT registers + HWSP */

-   const u32 lr_hw_context_size = 80 * sizeof(u32);
-   const u32 skip_size = LRC_PPHWSP_SZ * PAGE_SIZE +
+   __maybe_unused const u32 lr_hw_context_size = 80 * sizeof(u32);
+   __maybe_unused const u32 skip_size = LRC_PPHWSP_SZ * PAGE_SIZE +
lr_hw_context_size;
  
  	if (!intel_uc_uses_guc_submission(&gt->uc))

@@ -476,12 +492,12 @@ static void guc_init_golden_context(struct intel_guc *guc)
continue;
}
  
-		GEM_BUG_ON(blob->ads.eng_state_size[guc_class] != real_size);

+   GEM_BUG_ON(blob->ads.eng_state_size[guc_class] !=
+  real_size - skip_size);
GEM_BUG_ON(blob->ads.golden_context_lrca[guc_class] != addr_ggtt);
addr_ggtt += alloc_size;
  
-		shmem_read(engine->default_state, skip_size, ptr + skip_size,

-  real_size - skip_size);
+   shmem_read(engine->default_state, 0, ptr, real_size);
ptr += alloc_size;
}
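
As a rough illustration of the sizes discussed in this patch, the standalone C sketch below computes the part of the golden context that the GuC skips and the engine state size that is left. It is not the i915 code: every name prefixed `sketch_` or `SKETCH_` is hypothetical, and the 4 KiB page size and one-page PPHWSP are assumptions. Only the 80-dword execlists context size is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only; the real ones live in the i915 headers. */
#define SKETCH_PAGE_SIZE      4096u
#define SKETCH_LRC_PPHWSP_SZ  1u   /* per-process HWSP size in pages (assumption) */
#define SKETCH_LR_HW_CTX_DW   80u  /* execlists context size in dwords, from the patch */

/* Bytes at the start of the golden context that the GuC does not restore. */
static uint32_t sketch_skip_size(void)
{
	return SKETCH_LRC_PPHWSP_SZ * SKETCH_PAGE_SIZE +
	       SKETCH_LR_HW_CTX_DW * (uint32_t)sizeof(uint32_t);
}

/* Engine state size handed to the GuC: the whole context minus the skipped head. */
static uint32_t sketch_eng_state_size(uint32_t real_size)
{
	assert(real_size >= sketch_skip_size());
	return real_size - sketch_skip_size();
}
```

Under these assumptions the base address still points at the start of the whole golden context, while the size covers only the tail that a reset restores.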

Re: [Intel-gfx] [PATCH 06/27] drm/i915/guc: Workaround reset G2H is received after schedule done G2H

2021-08-26 Thread Daniele Ceraolo Spurio





On 8/25/2021 8:23 PM, Matthew Brost wrote:

If the context is reset as a result of the request cancellation the
context reset G2H is received after schedule disable done G2H which is
the wrong order. The schedule disable done G2H releases the waiting
request cancellation code which resubmits the context. This races
with the context reset G2H which also wants to resubmit the context but
in this case it really should be a NOP as request cancellation code owns
the resubmit. Use some clever tricks of checking the context state to
seal this race until the GuC firmware is fixed.

v2:
  (Checkpatch)
   - Fix typos
v3:
  (Daniele)
   - State that this is a bug in the GuC firmware

Fixes: 62eaf0ae217d ("drm/i915/guc: Support request cancellation")
Signed-off-by: Matthew Brost 
Cc: 


Reviewed-by: Daniele Ceraolo Spurio 

Daniele


---
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 41 ---
  1 file changed, 35 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index d94e7e1a876f..592b421e1429 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -831,17 +831,33 @@ __unwind_incomplete_requests(struct intel_context *ce)
  static void __guc_reset_context(struct intel_context *ce, bool stalled)
  {
struct i915_request *rq;
+   unsigned long flags;
u32 head;
+   bool skip = false;
  
  	intel_context_get(ce);
  
  	/*

-* GuC will implicitly mark the context as non-schedulable
-* when it sends the reset notification. Make sure our state
-* reflects this change. The context will be marked enabled
-* on resubmission.
+* GuC will implicitly mark the context as non-schedulable when it sends
+* the reset notification. Make sure our state reflects this change. The
+* context will be marked enabled on resubmission.
+*
+* XXX: If the context is reset as a result of the request cancellation
+* this G2H is received after the schedule disable complete G2H which is
+* wrong as this creates a race between the request cancellation code
+* re-submitting the context and this G2H handler. This is a bug in the
+* GuC but can be worked around in the meantime by converting this to a
+* NOP if a pending enable is in flight as this indicates that a request
+* cancellation has occurred.
 */
-   clr_context_enabled(ce);
+   spin_lock_irqsave(&ce->guc_state.lock, flags);
+   if (likely(!context_pending_enable(ce)))
+   clr_context_enabled(ce);
+   else
+   skip = true;
+   spin_unlock_irqrestore(&ce->guc_state.lock, flags);
+   if (unlikely(skip))
+   goto out_put;
  
  	rq = intel_context_find_active_request(ce);

if (!rq) {
@@ -860,6 +876,7 @@ static void __guc_reset_context(struct intel_context *ce, bool stalled)
  out_replay:
guc_reset_state(ce, head, stalled);
__unwind_incomplete_requests(ce);
+out_put:
intel_context_put(ce);
  }
  
@@ -1604,6 +1621,13 @@ static void guc_context_cancel_request(struct intel_context *ce,

guc_reset_state(ce, intel_ring_wrap(ce->ring, rq->head),
true);
}
+
+   /*
+* XXX: Racy if context is reset, see comment in
+* __guc_reset_context().
+*/
+   flush_work(&ce_to_guc(ce)->ct.requests.worker);
+
guc_context_unblock(ce);
}
  }
@@ -2718,7 +2742,12 @@ static void guc_handle_context_reset(struct intel_guc *guc,
  {
trace_intel_context_reset(ce);
  
-	if (likely(!intel_context_is_banned(ce))) {

+   /*
+* XXX: Racy if request cancellation has occurred, see comment in
+* __guc_reset_context().
+*/
+   if (likely(!intel_context_is_banned(ce) &&
+  !context_blocked(ce))) {
capture_error_state(guc, ce);
guc_context_replay(ce);
}
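
The workaround described in this patch can be modelled in a few lines of standalone C: the reset handler becomes a NOP whenever an enable is already pending, because that indicates the request cancellation code owns the resubmit. This is a single-threaded sketch under stated assumptions, not the driver code. `struct sketch_context` and `sketch_reset_context()` are hypothetical, and the real check is made while holding the context's `guc_state.lock`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the per-context scheduling state. */
struct sketch_context {
	bool enabled;        /* context is schedulable */
	bool pending_enable; /* an enable is in flight, i.e. cancellation resubmitted */
	int replays;         /* how many times the reset path replayed the context */
};

/*
 * Model of the reset G2H handler. Returns true if the reset was processed,
 * false if it was turned into a NOP because request cancellation already
 * owns the resubmit. In the driver this check is done under a spinlock.
 */
static bool sketch_reset_context(struct sketch_context *ce)
{
	assert(ce);

	if (ce->pending_enable)
		return false;

	ce->enabled = false; /* GuC implicitly disabled scheduling on reset */
	ce->replays++;       /* stands in for unwinding and replaying requests */
	return true;
}
```

The design point is that the state flag, not the arrival order of the two G2H messages, decides who resubmits the context.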

Re: [Intel-gfx] [PATCH 02/27] drm/i915/guc: Fix outstanding G2H accounting

2021-08-26 Thread Daniele Ceraolo Spurio





On 8/25/2021 8:23 PM, Matthew Brost wrote:

A small race that could result in incorrect accounting of the number
of outstanding G2H. Basically prior to this patch we did not increment
the number of outstanding G2H if we encountered a GT reset while sending
a H2G. This was incorrect as the context state had already been updated
to anticipate a G2H response thus the counter should be incremented.

Also always use the helper when decrementing this value.

Fixes: f4eb1f3fe946 ("drm/i915/guc: Ensure G2H response has space in buffer")
Signed-off-by: Matthew Brost 
Cc: 
---
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 23 ++-
  1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 69faa39da178..03a86da6011e 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -352,6 +352,12 @@ static inline void set_lrc_desc_registered(struct intel_guc *guc, u32 id,
xa_unlock_irqrestore(&guc->context_lookup, flags);
  }
  
+static void decr_outstanding_submission_g2h(struct intel_guc *guc)

+{
+   if (atomic_dec_and_test(&guc->outstanding_submission_g2h))
+   wake_up_all(&guc->ct.wq);
+}
+
  static int guc_submission_send_busy_loop(struct intel_guc *guc,
 const u32 *action,
 u32 len,
@@ -360,11 +366,12 @@ static int guc_submission_send_busy_loop(struct intel_guc *guc,
  {
int err;
  
-	err = intel_guc_send_busy_loop(guc, action, len, g2h_len_dw, loop);

-
-   if (!err && g2h_len_dw)
+   if (g2h_len_dw)
atomic_inc(&guc->outstanding_submission_g2h);
  
+	err = intel_guc_send_busy_loop(guc, action, len, g2h_len_dw, loop);

+   GEM_BUG_ON(g2h_len_dw && err == -EBUSY);


AFAICS, expecting a G2H reply is not tied to avoiding -EBUSY; the only 
way to avoid -EBUSY seems to be for loop to be true. Maybe have this instead:


GEM_BUG_ON(g2h_len_dw && !loop);

earlier on?

Daniele


+
return err;
  }
  
@@ -616,7 +623,7 @@ static void scrub_guc_desc_for_outstanding_g2h(struct intel_guc *guc)

init_sched_state(ce);
  
  		if (pending_enable || destroyed || deregister) {

-   atomic_dec(&guc->outstanding_submission_g2h);
+   decr_outstanding_submission_g2h(guc);
if (deregister)
guc_signal_context_fence(ce);
if (destroyed) {
@@ -635,7 +642,7 @@ static void scrub_guc_desc_for_outstanding_g2h(struct intel_guc *guc)
intel_engine_signal_breadcrumbs(ce->engine);
}
intel_context_sched_disable_unpin(ce);
-   atomic_dec(&guc->outstanding_submission_g2h);
+   decr_outstanding_submission_g2h(guc);
spin_lock_irqsave(&ce->guc_state.lock, flags);
guc_blocked_fence_complete(ce);
spin_unlock_irqrestore(&ce->guc_state.lock, flags);
@@ -2583,12 +2590,6 @@ g2h_context_lookup(struct intel_guc *guc, u32 desc_idx)
return ce;
  }
  
-static void decr_outstanding_submission_g2h(struct intel_guc *guc)

-{
-   if (atomic_dec_and_test(&guc->outstanding_submission_g2h))
-   wake_up_all(&guc->ct.wq);
-}
-
  int intel_guc_deregister_done_process_msg(struct intel_guc *guc,
  const u32 *msg,
  u32 len)
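
The accounting change in this patch can be sketched in standalone C11: the counter is bumped before the send whenever a G2H reply is expected, whatever the send returns, and every decrement goes through one helper that notices when the count reaches zero. This is only a model under stated assumptions. `struct sketch_guc`, `sketch_send()` and `sketch_decr_outstanding()` are hypothetical, `send_result` stands in for whatever the H2G send returns, and the `wakeups` field replaces the real wake-up of waiters.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the GuC bookkeeping touched by the patch. */
struct sketch_guc {
	atomic_int outstanding_g2h; /* replies we are still waiting for */
	int wakeups;                /* counts "wake all waiters" events */
};

/* Single place to drop a reference; signals waiters when the count hits zero. */
static void sketch_decr_outstanding(struct sketch_guc *guc)
{
	assert(guc);
	/* atomic_fetch_sub returns the previous value, so 1 means we reached 0. */
	if (atomic_fetch_sub(&guc->outstanding_g2h, 1) == 1)
		guc->wakeups++;
}

/*
 * Count the expected reply before sending. If the send fails because of a
 * reset, the context state already anticipates a G2H, so the count must too;
 * the reset path is then expected to drop it through the helper above.
 */
static int sketch_send(struct sketch_guc *guc, bool expects_g2h, int send_result)
{
	assert(guc);
	if (expects_g2h)
		atomic_fetch_add(&guc->outstanding_g2h, 1);
	return send_result;
}
```

Routing every decrement through the helper keeps the "reached zero" notification in one place, which is the second half of what the commit message asks for.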

[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915/gt: Register the migrate contexts with their engines (rev2)

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/i915/gt: Register the migrate contexts with their engines (rev2)
URL   : https://patchwork.freedesktop.org/series/94058/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10525_full -> Patchwork_20902_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20902_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20902_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20902_full:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_exec_parallel@fds@vcs0:
- shard-iclb: [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-iclb4/igt@gem_exec_parallel@f...@vcs0.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/shard-iclb5/igt@gem_exec_parallel@f...@vcs0.html

  
Known issues
---

  Here are the changes found in Patchwork_20902_full that come from known 
issues:

### IGT changes ###

#### Issues hit ####

  * igt@feature_discovery@psr2:
- shard-iclb: [PASS][3] -> [SKIP][4] ([i915#658])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-iclb2/igt@feature_discov...@psr2.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/shard-iclb6/igt@feature_discov...@psr2.html

  * igt@gem_ctx_persistence@idempotent:
- shard-snb:  NOTRUN -> [SKIP][5] ([fdo#109271] / [i915#1099]) +1 
similar issue
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/shard-snb2/igt@gem_ctx_persiste...@idempotent.html

  * igt@gem_ctx_sseu@mmap-args:
- shard-tglb: NOTRUN -> [SKIP][6] ([i915#280])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/shard-tglb1/igt@gem_ctx_s...@mmap-args.html

  * igt@gem_exec_fair@basic-flow@rcs0:
- shard-kbl:  [PASS][7] -> [SKIP][8] ([fdo#109271])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-kbl4/igt@gem_exec_fair@basic-f...@rcs0.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/shard-kbl4/igt@gem_exec_fair@basic-f...@rcs0.html
- shard-tglb: [PASS][9] -> [FAIL][10] ([i915#2842]) +1 similar issue
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-tglb6/igt@gem_exec_fair@basic-f...@rcs0.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/shard-tglb1/igt@gem_exec_fair@basic-f...@rcs0.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-tglb: NOTRUN -> [FAIL][11] ([i915#2842])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/shard-tglb1/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-none@vecs0:
- shard-kbl:  NOTRUN -> [FAIL][12] ([i915#2842])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/shard-kbl3/igt@gem_exec_fair@basic-n...@vecs0.html

  * igt@gem_exec_fair@basic-pace-solo@rcs0:
- shard-kbl:  [PASS][13] -> [FAIL][14] ([i915#2842])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-kbl4/igt@gem_exec_fair@basic-pace-s...@rcs0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/shard-kbl1/igt@gem_exec_fair@basic-pace-s...@rcs0.html

  * igt@gem_exec_flush@basic-batch-kernel-default-cmd:
- shard-snb:  NOTRUN -> [SKIP][15] ([fdo#109271]) +376 similar 
issues
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/shard-snb2/igt@gem_exec_fl...@basic-batch-kernel-default-cmd.html

  * igt@gem_exec_params@secure-non-master:
- shard-tglb: NOTRUN -> [SKIP][16] ([fdo#112283])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/shard-tglb1/igt@gem_exec_par...@secure-non-master.html

  * igt@gem_exec_suspend@basic-s3:
- shard-skl:  [PASS][17] -> [INCOMPLETE][18] ([i915#198])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-skl2/igt@gem_exec_susp...@basic-s3.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/shard-skl10/igt@gem_exec_susp...@basic-s3.html

  * igt@gem_exec_whisper@basic-normal:
- shard-glk:  [PASS][19] -> [DMESG-WARN][20] ([i915#118] / 
[i915#95])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-glk5/igt@gem_exec_whis...@basic-normal.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/shard-glk4/igt@gem_exec_whis...@basic-normal.html

  * igt@gem_pread@exhaustion:
- shard-apl:  NOTRUN -> [WARN][21] ([i915#2658]) +1 similar issue
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/shard-apl2/igt@gem_pr...@exhaustion.html

  *

[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/sched dependency handling and implicit sync fixes (rev5)

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/sched dependency handling and implicit sync fixes (rev5)
URL   : https://patchwork.freedesktop.org/series/93415/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10525_full -> Patchwork_20901_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20901_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20901_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20901_full:

### IGT changes ###

#### Possible regressions ####

  * igt@gem_exec_async@concurrent-writes@bcs0:
- shard-tglb: [PASS][1] -> [FAIL][2] +4 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-tglb8/igt@gem_exec_async@concurrent-wri...@bcs0.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/shard-tglb6/igt@gem_exec_async@concurrent-wri...@bcs0.html

  * igt@gem_exec_async@concurrent-writes@rcs0:
- shard-snb:  [PASS][3] -> [FAIL][4] +2 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-snb2/igt@gem_exec_async@concurrent-wri...@rcs0.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/shard-snb2/igt@gem_exec_async@concurrent-wri...@rcs0.html

  * igt@gem_exec_async@concurrent-writes@vcs0:
- shard-kbl:  NOTRUN -> [FAIL][5] +3 similar issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/shard-kbl3/igt@gem_exec_async@concurrent-wri...@vcs0.html

  * igt@gem_exec_async@concurrent-writes@vecs0:
- shard-iclb: [PASS][6] -> [FAIL][7] +2 similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-iclb8/igt@gem_exec_async@concurrent-wri...@vecs0.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/shard-iclb7/igt@gem_exec_async@concurrent-wri...@vecs0.html

  * igt@i915_pm_rpm@system-suspend-modeset:
- shard-glk:  [PASS][8] -> [DMESG-WARN][9]
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-glk4/igt@i915_pm_...@system-suspend-modeset.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/shard-glk9/igt@i915_pm_...@system-suspend-modeset.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@gem_exec_async@concurrent-writes@rcs0:
- {shard-rkl}:[PASS][10] -> [FAIL][11] +2 similar issues
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-rkl-2/igt@gem_exec_async@concurrent-wri...@rcs0.html
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/shard-rkl-6/igt@gem_exec_async@concurrent-wri...@rcs0.html

  
Known issues
---

  Here are the changes found in Patchwork_20901_full that come from known 
issues:

### IGT changes ###

#### Issues hit ####

  * igt@feature_discovery@psr2:
- shard-iclb: [PASS][12] -> [SKIP][13] ([i915#658])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-iclb2/igt@feature_discov...@psr2.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/shard-iclb1/igt@feature_discov...@psr2.html

  * igt@gem_ctx_persistence@legacy-engines-hang:
- shard-snb:  NOTRUN -> [SKIP][14] ([fdo#109271] / [i915#1099])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/shard-snb6/igt@gem_ctx_persiste...@legacy-engines-hang.html

  * igt@gem_ctx_sseu@mmap-args:
- shard-tglb: NOTRUN -> [SKIP][15] ([i915#280])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/shard-tglb3/igt@gem_ctx_s...@mmap-args.html

  * igt@gem_exec_fair@basic-deadline:
- shard-apl:  NOTRUN -> [FAIL][16] ([i915#2846])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/shard-apl2/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-flow@rcs0:
- shard-tglb: [PASS][17] -> [FAIL][18] ([i915#2842])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-tglb6/igt@gem_exec_fair@basic-f...@rcs0.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/shard-tglb8/igt@gem_exec_fair@basic-f...@rcs0.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-tglb: NOTRUN -> [FAIL][19] ([i915#2842])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/shard-tglb6/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-pace@rcs0:
- shard-kbl:  [PASS][20] -> [FAIL][21] ([i915#2842]) +1 similar 
issue
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/shard-kbl4/igt@gem_exec_fair@basic-p...@rcs0.html
   [21]:

[Intel-gfx] ✓ Fi.CI.BAT: success for Clean up GuC CI failures, simplify locking, and kernel DOC (rev7)

2021-08-26 Thread Patchwork

== Series Details ==

Series: Clean up GuC CI failures, simplify locking, and kernel DOC (rev7)
URL   : https://patchwork.freedesktop.org/series/93704/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10526 -> Patchwork_20907


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/index.html

New tests
---

  New tests have been introduced between CI_DRM_10526 and Patchwork_20907:

### New IGT tests (1) ###

  * igt@i915_selftest@live@guc:
- Statuses : 30 pass(s)
- Exec time: [0.42, 3.61] s

  

Known issues
---

  Here are the changes found in Patchwork_20907 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s0:
- fi-tgl-1115g4:  NOTRUN -> [FAIL][1] ([i915#1888])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s0.html

  * igt@gem_huc_copy@huc-copy:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][2] ([i915#2190])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/fi-tgl-1115g4/igt@gem_huc_c...@huc-copy.html

  * igt@i915_pm_backlight@basic-brightness:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][3] ([i915#1155])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/fi-tgl-1115g4/igt@i915_pm_backli...@basic-brightness.html

  * igt@i915_pm_rpm@module-reload:
- fi-tgl-1115g4:  NOTRUN -> [INCOMPLETE][4] ([i915#4006])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/fi-tgl-1115g4/igt@i915_pm_...@module-reload.html

  * igt@kms_addfb_basic@too-wide:
- fi-tgl-1115g4:  NOTRUN -> [DMESG-WARN][5] ([i915#4002]) +89 similar 
issues
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/fi-tgl-1115g4/igt@kms_addfb_ba...@too-wide.html

  * igt@kms_chamelium@common-hpd-after-suspend:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][6] ([fdo#111827]) +7 similar issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/fi-tgl-1115g4/igt@kms_chamel...@common-hpd-after-suspend.html

  * igt@kms_chamelium@dp-hpd-fast:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][7] ([fdo#111827] / [i915#1385])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/fi-tgl-1115g4/igt@kms_chamel...@dp-hpd-fast.html

  * igt@kms_force_connector_basic@force-load-detect:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][8] ([fdo#109285])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/fi-tgl-1115g4/igt@kms_force_connector_ba...@force-load-detect.html

  * igt@kms_psr@primary_mmap_gtt:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][9] ([i915#1072]) +2 similar issues
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/fi-tgl-1115g4/igt@kms_psr@primary_mmap_gtt.html

  * igt@kms_psr@primary_page_flip:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][10] ([i915#1072] / [i915#1385])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/fi-tgl-1115g4/igt@kms_psr@primary_page_flip.html

  * igt@prime_vgem@basic-userptr:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][11] ([i915#3301])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/fi-tgl-1115g4/igt@prime_v...@basic-userptr.html

  * igt@runner@aborted:
- fi-tgl-1115g4:  NOTRUN -> [FAIL][12] ([i915#2722] / [i915#3834])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/fi-tgl-1115g4/igt@run...@aborted.html

  
#### Possible fixes ####

  * igt@i915_pm_rpm@basic-rte:
- fi-rkl-guc: [SKIP][13] ([fdo#109308]) -> [PASS][14]
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10526/fi-rkl-guc/igt@i915_pm_...@basic-rte.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/fi-rkl-guc/igt@i915_pm_...@basic-rte.html

  * igt@i915_selftest@live@gt_lrc:
- fi-rkl-guc: [DMESG-WARN][15] ([i915#3958]) -> [PASS][16]
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10526/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html

  * igt@i915_selftest@live@requests:
- fi-rkl-guc: [DMESG-FAIL][17] -> [PASS][18]
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10526/fi-rkl-guc/igt@i915_selftest@l...@requests.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20907/fi-rkl-guc/igt@i915_selftest@l...@requests.html

  
  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#109308]: https://bugs.freedesktop.org/show_bug.cgi?id=109308
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1155]: https://gitlab.freedesktop.org/drm/intel/issues/1155
  [i915#1385]: https://gitlab.freedesktop.org/drm/intel/issues/1385
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#2190]:

Re: [Intel-gfx] [PATCH 5/5] drm/i915/fdi: convert BUG()'s to MISSING_CASE()

2021-08-26 Thread Rodrigo Vivi

On Wed, Aug 25, 2021 at 06:47:52PM +0300, Jani Nikula wrote:
> These shouldn't happen, but in the off chance they do, we'll want a
> warning rather than panic.

looks better indeed:

Reviewed-by: Rodrigo Vivi 


> 
> Signed-off-by: Jani Nikula 
> ---
>  drivers/gpu/drm/i915/display/intel_fdi.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_fdi.c b/drivers/gpu/drm/i915/display/intel_fdi.c
> index cc83a6532a71..fc09b781f15f 100644
> --- a/drivers/gpu/drm/i915/display/intel_fdi.c
> +++ b/drivers/gpu/drm/i915/display/intel_fdi.c
> @@ -93,7 +93,8 @@ static int ilk_check_fdi_lanes(struct drm_device *dev, enum pipe pipe,
>   }
>   return 0;
>   default:
> - BUG();
> + MISSING_CASE(pipe);
> + return 0;
>   }
>  }
>  
> @@ -217,7 +218,7 @@ static void ivb_update_fdi_bc_bifurcation(const struct intel_crtc_state *crtc_st
>  
>   break;
>   default:
> - BUG();
> + MISSING_CASE(crtc->pipe);
>   }
>  }
>  
> -- 
> 2.20.1
>
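
The difference between the two behaviours can be shown with a small userspace sketch: an unexpected `switch` value is reported and execution continues, instead of aborting. `SKETCH_MISSING_CASE()` is a hypothetical stand-in written for this example and is not the kernel macro. The enum, the function name and the hit counter are likewise assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Counts how often an unexpected value was seen; exists only for this sketch. */
static int sketch_missing_case_hits;

/* Hypothetical stand-in for a warn-and-continue macro: log, do not abort. */
#define SKETCH_MISSING_CASE(x) do { \
	sketch_missing_case_hits++; \
	fprintf(stderr, "Missing case (%s == %ld)\n", #x, (long)(x)); \
} while (0)

enum sketch_pipe { SKETCH_PIPE_A, SKETCH_PIPE_B, SKETCH_PIPE_C, SKETCH_PIPE_D };

/* Simplified lane check: known pipes succeed, unknown ones warn and return 0. */
static int sketch_check_lanes(enum sketch_pipe pipe)
{
	switch (pipe) {
	case SKETCH_PIPE_A:
	case SKETCH_PIPE_B:
	case SKETCH_PIPE_C:
		return 0;
	default:
		SKETCH_MISSING_CASE(pipe);
		return 0;
	}
}
```

Because the default branch no longer ends the program, a function with a return value needs an explicit `return` there, which matches the extra line added in the first hunk.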

Re: [Intel-gfx] [PATCH 4/5] drm/i915/fdi: move fdi mphy reset and programming to intel_fdi.c

2021-08-26 Thread Rodrigo Vivi

On Wed, Aug 25, 2021 at 06:47:51PM +0300, Jani Nikula wrote:
> This is fairly detailed stuff that really has no place in
> intel_display.c. Combine the calls into one to avoid exposing both.
> 
> Signed-off-by: Jani Nikula 

Reviewed-by: Rodrigo Vivi 

> ---
>  drivers/gpu/drm/i915/display/intel_display.c | 102 +--
>  drivers/gpu/drm/i915/display/intel_fdi.c | 100 ++
>  drivers/gpu/drm/i915/display/intel_fdi.h |   1 +
>  3 files changed, 103 insertions(+), 100 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index e7b6969cd2e2..c8da214083b5 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -4916,102 +4916,6 @@ static void ilk_init_pch_refclk(struct drm_i915_private *dev_priv)
>   BUG_ON(val != final);
>  }
>  
> -static void lpt_reset_fdi_mphy(struct drm_i915_private *dev_priv)
> -{
> - u32 tmp;
> -
> - tmp = intel_de_read(dev_priv, SOUTH_CHICKEN2);
> - tmp |= FDI_MPHY_IOSFSB_RESET_CTL;
> - intel_de_write(dev_priv, SOUTH_CHICKEN2, tmp);
> -
> - if (wait_for_us(intel_de_read(dev_priv, SOUTH_CHICKEN2) &
> - FDI_MPHY_IOSFSB_RESET_STATUS, 100))
> - drm_err(&dev_priv->drm, "FDI mPHY reset assert timeout\n");
> -
> - tmp = intel_de_read(dev_priv, SOUTH_CHICKEN2);
> - tmp &= ~FDI_MPHY_IOSFSB_RESET_CTL;
> - intel_de_write(dev_priv, SOUTH_CHICKEN2, tmp);
> -
> - if (wait_for_us((intel_de_read(dev_priv, SOUTH_CHICKEN2) &
> -  FDI_MPHY_IOSFSB_RESET_STATUS) == 0, 100))
> - drm_err(&dev_priv->drm, "FDI mPHY reset de-assert timeout\n");
> -}
> -
> -/* WaMPhyProgramming:hsw */
> -static void lpt_program_fdi_mphy(struct drm_i915_private *dev_priv)
> -{
> - u32 tmp;
> -
> - tmp = intel_sbi_read(dev_priv, 0x8008, SBI_MPHY);
> - tmp &= ~(0xFF << 24);
> - tmp |= (0x12 << 24);
> - intel_sbi_write(dev_priv, 0x8008, tmp, SBI_MPHY);
> -
> - tmp = intel_sbi_read(dev_priv, 0x2008, SBI_MPHY);
> - tmp |= (1 << 11);
> - intel_sbi_write(dev_priv, 0x2008, tmp, SBI_MPHY);
> -
> - tmp = intel_sbi_read(dev_priv, 0x2108, SBI_MPHY);
> - tmp |= (1 << 11);
> - intel_sbi_write(dev_priv, 0x2108, tmp, SBI_MPHY);
> -
> - tmp = intel_sbi_read(dev_priv, 0x206C, SBI_MPHY);
> - tmp |= (1 << 24) | (1 << 21) | (1 << 18);
> - intel_sbi_write(dev_priv, 0x206C, tmp, SBI_MPHY);
> -
> - tmp = intel_sbi_read(dev_priv, 0x216C, SBI_MPHY);
> - tmp |= (1 << 24) | (1 << 21) | (1 << 18);
> - intel_sbi_write(dev_priv, 0x216C, tmp, SBI_MPHY);
> -
> - tmp = intel_sbi_read(dev_priv, 0x2080, SBI_MPHY);
> - tmp &= ~(7 << 13);
> - tmp |= (5 << 13);
> - intel_sbi_write(dev_priv, 0x2080, tmp, SBI_MPHY);
> -
> - tmp = intel_sbi_read(dev_priv, 0x2180, SBI_MPHY);
> - tmp &= ~(7 << 13);
> - tmp |= (5 << 13);
> - intel_sbi_write(dev_priv, 0x2180, tmp, SBI_MPHY);
> -
> - tmp = intel_sbi_read(dev_priv, 0x208C, SBI_MPHY);
> - tmp &= ~0xFF;
> - tmp |= 0x1C;
> - intel_sbi_write(dev_priv, 0x208C, tmp, SBI_MPHY);
> -
> - tmp = intel_sbi_read(dev_priv, 0x218C, SBI_MPHY);
> - tmp &= ~0xFF;
> - tmp |= 0x1C;
> - intel_sbi_write(dev_priv, 0x218C, tmp, SBI_MPHY);
> -
> - tmp = intel_sbi_read(dev_priv, 0x2098, SBI_MPHY);
> - tmp &= ~(0xFF << 16);
> - tmp |= (0x1C << 16);
> - intel_sbi_write(dev_priv, 0x2098, tmp, SBI_MPHY);
> -
> - tmp = intel_sbi_read(dev_priv, 0x2198, SBI_MPHY);
> - tmp &= ~(0xFF << 16);
> - tmp |= (0x1C << 16);
> - intel_sbi_write(dev_priv, 0x2198, tmp, SBI_MPHY);
> -
> - tmp = intel_sbi_read(dev_priv, 0x20C4, SBI_MPHY);
> - tmp |= (1 << 27);
> - intel_sbi_write(dev_priv, 0x20C4, tmp, SBI_MPHY);
> -
> - tmp = intel_sbi_read(dev_priv, 0x21C4, SBI_MPHY);
> - tmp |= (1 << 27);
> - intel_sbi_write(dev_priv, 0x21C4, tmp, SBI_MPHY);
> -
> - tmp = intel_sbi_read(dev_priv, 0x20EC, SBI_MPHY);
> - tmp &= ~(0xF << 28);
> - tmp |= (4 << 28);
> - intel_sbi_write(dev_priv, 0x20EC, tmp, SBI_MPHY);
> -
> - tmp = intel_sbi_read(dev_priv, 0x21EC, SBI_MPHY);
> - tmp &= ~(0xF << 28);
> - tmp |= (4 << 28);
> - intel_sbi_write(dev_priv, 0x21EC, tmp, SBI_MPHY);
> -}
> -
>  /* Implements 3 different sequences from BSpec chapter "Display iCLK
>   * Programming" based on the parameters passed:
>   * - Sequence to enable CLKOUT_DP
> @@ -5044,10 +4948,8 @@ static void lpt_enable_clkout_dp(struct 
> drm_i915_private *dev_priv,
>   tmp &= ~SBI_SSCCTL_PATHALT;
>   intel_sbi_write(dev_priv, SBI_SSCCTL, tmp, SBI_ICLK);
>  
> - if (with_fdi) {
> - lpt_reset_fdi_mphy(dev_priv);
> - lpt_program_fdi_mphy(dev_priv);
> - }
> + if (with_fdi)
> + lpt_fdi_program_mphy(dev_priv);
>   }
>  
>

Re: [Intel-gfx] [PATCH 3/5] drm/i915/fdi: move more FDI stuff to FDI link train hooks

2021-08-26 Thread Rodrigo Vivi

On Wed, Aug 25, 2021 at 06:47:50PM +0300, Jani Nikula wrote:
> Accept slight duplication in the fdi link train hooks in exchange for
> simplification in ilk_pch_enable(). This lets us make
> ivb_update_fdi_bc_bifurcation() static again, now in intel_fdi.c.

For a moment I thought the order of the calls had changed here,
but in the end it is still crtc_enable and then link_training, so it looks
okay. CI also passed, and I trust your experiments and experience with
the ordering here.

So,

Reviewed-by: Rodrigo Vivi 
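
For readers following along, the TU size write that now lives in each link
train hook can be modeled outside the driver. This is only a minimal sketch:
the sketch_* names and the fake register struct are hypothetical, and the mask
value assumes the TU size field sits in bits 30:25 of the data M register. It
shows that the hook copies the TU field and nothing else before training
starts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for driver registers; not the real i915 API. */
#define SKETCH_TU_SIZE_SHIFT 25
#define SKETCH_TU_SIZE_MASK  (0x3fu << SKETCH_TU_SIZE_SHIFT) /* assumed layout */

struct sketch_regs {
	uint32_t pipe_data_m1;   /* models PIPE_DATA_M1(pipe) */
	uint32_t fdi_rx_tusize1; /* models FDI_RX_TUSIZE1(pipe) */
};

/* Copy only the TU size field from the pipe's data M register. */
static void sketch_write_tu_size(struct sketch_regs *regs)
{
	regs->fdi_rx_tusize1 = regs->pipe_data_m1 & SKETCH_TU_SIZE_MASK;
}

/* Models a link train hook that does its own TU size setup first. */
static void sketch_fdi_link_train(struct sketch_regs *regs)
{
	sketch_write_tu_size(regs);
	/* ... the training pattern sequence would follow here ... */
}
```

Keeping the write as the first step of the hook preserves the old ordering
relative to training, which is the property the review comment above checks.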



> 
> Signed-off-by: Jani Nikula 
> ---
>  drivers/gpu/drm/i915/display/intel_display.c |  8 ---
>  drivers/gpu/drm/i915/display/intel_fdi.c | 25 +++-
>  drivers/gpu/drm/i915/display/intel_fdi.h |  1 -
>  3 files changed, 24 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
> b/drivers/gpu/drm/i915/display/intel_display.c
> index f62bbff7a6be..e7b6969cd2e2 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -2059,14 +2059,6 @@ static void ilk_pch_enable(const struct 
> intel_atomic_state *state,
>  
>   assert_pch_transcoder_disabled(dev_priv, pipe);
>  
> - if (IS_IVYBRIDGE(dev_priv))
> - ivb_update_fdi_bc_bifurcation(crtc_state);
> -
> - /* Write the TU size bits before fdi link training, so that error
> -  * detection works. */
> - intel_de_write(dev_priv, FDI_RX_TUSIZE1(pipe),
> -intel_de_read(dev_priv, PIPE_DATA_M1(pipe)) & 
> TU_SIZE_MASK);
> -
>   /* For PCH output, training FDI link */
>   dev_priv->display.fdi_link_train(crtc, crtc_state);
>  
> diff --git a/drivers/gpu/drm/i915/display/intel_fdi.c 
> b/drivers/gpu/drm/i915/display/intel_fdi.c
> index f8ffd5c032ae..f5e42985084a 100644
> --- a/drivers/gpu/drm/i915/display/intel_fdi.c
> +++ b/drivers/gpu/drm/i915/display/intel_fdi.c
> @@ -195,7 +195,7 @@ static void cpt_set_fdi_bc_bifurcation(struct 
> drm_i915_private *dev_priv, bool e
>   intel_de_posting_read(dev_priv, SOUTH_CHICKEN1);
>  }
>  
> -void ivb_update_fdi_bc_bifurcation(const struct intel_crtc_state *crtc_state)
> +static void ivb_update_fdi_bc_bifurcation(const struct intel_crtc_state 
> *crtc_state)
>  {
>   struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
>   struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
> @@ -270,6 +270,13 @@ static void ilk_fdi_link_train(struct intel_crtc *crtc,
>   i915_reg_t reg;
>   u32 temp, tries;
>  
> + /*
> +  * Write the TU size bits before fdi link training, so that error
> +  * detection works.
> +  */
> + intel_de_write(dev_priv, FDI_RX_TUSIZE1(pipe),
> +intel_de_read(dev_priv, PIPE_DATA_M1(pipe)) & 
> TU_SIZE_MASK);
> +
>   /* FDI needs bits from pipe first */
>   assert_pipe_enabled(dev_priv, crtc_state->cpu_transcoder);
>  
> @@ -373,6 +380,13 @@ static void gen6_fdi_link_train(struct intel_crtc *crtc,
>   i915_reg_t reg;
>   u32 temp, i, retry;
>  
> + /*
> +  * Write the TU size bits before fdi link training, so that error
> +  * detection works.
> +  */
> + intel_de_write(dev_priv, FDI_RX_TUSIZE1(pipe),
> +intel_de_read(dev_priv, PIPE_DATA_M1(pipe)) & 
> TU_SIZE_MASK);
> +
>   /* Train 1: umask FDI RX Interrupt symbol_lock and bit_lock bit
>  for train result */
>   reg = FDI_RX_IMR(pipe);
> @@ -510,6 +524,15 @@ static void ivb_manual_fdi_link_train(struct intel_crtc 
> *crtc,
>   i915_reg_t reg;
>   u32 temp, i, j;
>  
> + ivb_update_fdi_bc_bifurcation(crtc_state);
> +
> + /*
> +  * Write the TU size bits before fdi link training, so that error
> +  * detection works.
> +  */
> + intel_de_write(dev_priv, FDI_RX_TUSIZE1(pipe),
> +intel_de_read(dev_priv, PIPE_DATA_M1(pipe)) & 
> TU_SIZE_MASK);
> +
>   /* Train 1: umask FDI RX Interrupt symbol_lock and bit_lock bit
>  for train result */
>   reg = FDI_RX_IMR(pipe);
> diff --git a/drivers/gpu/drm/i915/display/intel_fdi.h 
> b/drivers/gpu/drm/i915/display/intel_fdi.h
> index 135802e4da68..cda9a32c25ba 100644
> --- a/drivers/gpu/drm/i915/display/intel_fdi.h
> +++ b/drivers/gpu/drm/i915/display/intel_fdi.h
> @@ -16,7 +16,6 @@ int intel_fdi_link_freq(struct drm_i915_private *i915,
>   const struct intel_crtc_state *pipe_config);
>  int ilk_fdi_compute_config(struct intel_crtc *intel_crtc,
>  struct intel_crtc_state *pipe_config);
> -void ivb_update_fdi_bc_bifurcation(const struct intel_crtc_state 
> *crtc_state);
>  void intel_fdi_normal_train(struct intel_crtc *crtc);
>  void ilk_fdi_disable(struct intel_crtc *crtc);
>  void ilk_fdi_pll_disable(struct intel_crtc *intel_crtc);
> -- 
> 2.20.1
>

Re: [Intel-gfx] [PATCH 2/4] drm/dp: use more of the extended receiver cap

2021-08-26 Thread Lyude Paul

On Thu, 2021-08-26 at 14:11 +0300, Jani Nikula wrote:
> On Wed, 25 Aug 2021, Jani Nikula  wrote:
> > On Thu, 19 Aug 2021, Ville Syrjälä  wrote:
> > > On Fri, Aug 13, 2021 at 01:43:20PM +0300, Jani Nikula wrote:
> > > > Extend the use of extended receiver cap at 0x2200 to cover
> > > > MAIN_LINK_CHANNEL_CODING_CAP in 0x2206, in case an implementation
> > > > hides
> > > > the DP 2.0 128b/132b channel encoding cap.
> > > > 
> > > > Cc: Manasi Navare 
> > > > Signed-off-by: Jani Nikula 
> > > > ---
> > > >  drivers/gpu/drm/drm_dp_helper.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/drm_dp_helper.c
> > > > b/drivers/gpu/drm/drm_dp_helper.c
> > > > index 9b2a2961fca8..9389f92cb944 100644
> > > > --- a/drivers/gpu/drm/drm_dp_helper.c
> > > > +++ b/drivers/gpu/drm/drm_dp_helper.c
> > > > @@ -608,7 +608,7 @@ static u8 drm_dp_downstream_port_count(const u8
> > > > dpcd[DP_RECEIVER_CAP_SIZE])
> > > >  static int drm_dp_read_extended_dpcd_caps(struct drm_dp_aux *aux,
> > > >   u8
> > > > dpcd[DP_RECEIVER_CAP_SIZE])
> > > >  {
> > > > -   u8 dpcd_ext[6];
> > > > +   u8 dpcd_ext[DP_MAIN_LINK_CHANNEL_CODING + 1];
> > > 
> > > Why are we even reading less of this than the normal receiver caps?
> > 
> > Good question. I forget my reasoning to only extend to what might affect
> > this use case. Should we extend to the size of the usual receiver caps?
> 
> Ah, there was a previous discussion [1] with Lyude (Cc'd).

Yeah - basically the problem is that we just need to make sure we take care to
avoid clearing info from the non-extended DPCD by accident. Extending this to
7 bytes should be fine.

JFYI, reading back over your comments, it sounds like we might actually be safe
to read the entire DPCD, but we need to make sure we take care to avoid
accidentally replacing the main DPCD with a zeroed-out DPCD, which could happen
on systems that have no support for extended DPCDs.

(Also - super bonus points if you can write a unit test to confirm we're not
overwriting the original DPCD! I don't know how much effort this would be
though so don't worry about it too much)
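
To make the concern concrete, here is a rough sketch of the kind of guard and
check being discussed. It is not the kernel's drm_dp_read_extended_dpcd_caps();
the sketch_* helper, the cap size constant, and the rule "a zero or older
revision byte means no usable extended caps" are assumptions for illustration
only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_RECEIVER_CAP_SIZE 15 /* models DP_RECEIVER_CAP_SIZE; assumed */
#define SKETCH_DPCD_REV 0           /* revision byte offset; assumed */

/*
 * Merge the first ext_len bytes of an extended capability block into dpcd.
 * Returns true only if dpcd was modified. Hypothetical helper.
 */
static bool sketch_merge_extended_caps(uint8_t dpcd[SKETCH_RECEIVER_CAP_SIZE],
				       const uint8_t *ext, size_t ext_len)
{
	if (ext_len == 0 || ext_len > SKETCH_RECEIVER_CAP_SIZE)
		return false;

	/* A zeroed or older-revision block suggests no extended caps. */
	if (ext[SKETCH_DPCD_REV] == 0 ||
	    ext[SKETCH_DPCD_REV] < dpcd[SKETCH_DPCD_REV])
		return false;

	/* Nothing to do if the overlapping bytes already match. */
	if (memcmp(dpcd, ext, ext_len) == 0)
		return false;

	/* Bytes past ext_len keep their values from the main DPCD. */
	memcpy(dpcd, ext, ext_len);
	return true;
}
```

A unit test along the lines suggested above would feed a zeroed block and
assert that the original caps survive, then feed a newer-revision block and
assert that only the first ext_len bytes change.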

> 
> BR,
> Jani.
> 
> 
> [1]
> https://patchwork.freedesktop.org/patch/msgid/20200901123226.4177-1-jani.nik...@intel.com
> 
> 
> > 
> > BR,
> > Jani.
> > 
> > 
> > > 
> > > > int ret;
> > > >  
> > > > /*
> > > > -- 
> > > > 2.20.1
> 

-- 
Cheers,
 Lyude Paul (she/her)
 Software Engineer at Red Hat

Re: [Intel-gfx] [PATCH 2/5] drm/i915/fdi: move fdi bc bifurcation functions to intel_fdi.c

2021-08-26 Thread Rodrigo Vivi

On Wed, Aug 25, 2021 at 06:47:49PM +0300, Jani Nikula wrote:
> Move FDI related functions to intel_fdi.c. Don't bother with renaming as
> we'll make the functions static shortly.
> 
> Signed-off-by: Jani Nikula 

Reviewed-by: Rodrigo Vivi 
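
As a reading aid for the moved code, the per-pipe decision made by
ivb_update_fdi_bc_bifurcation() can be sketched as a pure function. The
sketch_* names and the enum are hypothetical; the real function programs
SOUTH_CHICKEN1 through cpt_set_fdi_bc_bifurcation() and hits BUG() on an
unknown pipe, both of which are left out here.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_pipe { SKETCH_PIPE_A, SKETCH_PIPE_B, SKETCH_PIPE_C };

/*
 * Returns true if the bifurcation bit would be programmed, and stores the
 * requested state in *enable. Pipe A leaves the bit alone.
 */
static bool sketch_fdi_bc_bifurcation(enum sketch_pipe pipe, int fdi_lanes,
				      bool *enable)
{
	switch (pipe) {
	case SKETCH_PIPE_A:
		return false;
	case SKETCH_PIPE_B:
		/* More than 2 lanes on B needs the lanes otherwise used by C. */
		*enable = fdi_lanes <= 2;
		return true;
	case SKETCH_PIPE_C:
		*enable = true;
		return true;
	}
	return false;
}
```

The sketch only captures the lane-count rule; the warnings about FDI RX still
being enabled on pipes B and C belong to the register-level helper.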

> ---
>  drivers/gpu/drm/i915/display/intel_display.c | 49 
>  drivers/gpu/drm/i915/display/intel_fdi.c | 49 
>  drivers/gpu/drm/i915/display/intel_fdi.h |  1 +
>  3 files changed, 50 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
> b/drivers/gpu/drm/i915/display/intel_display.c
> index 3a9afe04ce0a..f62bbff7a6be 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -2010,55 +2010,6 @@ static void ilk_pch_transcoder_set_timings(const 
> struct intel_crtc_state *crtc_s
>  intel_de_read(dev_priv, VSYNCSHIFT(cpu_transcoder)));
>  }
>  
> -static void cpt_set_fdi_bc_bifurcation(struct drm_i915_private *dev_priv, 
> bool enable)
> -{
> - u32 temp;
> -
> - temp = intel_de_read(dev_priv, SOUTH_CHICKEN1);
> - if (!!(temp & FDI_BC_BIFURCATION_SELECT) == enable)
> - return;
> -
> - drm_WARN_ON(&dev_priv->drm,
> - intel_de_read(dev_priv, FDI_RX_CTL(PIPE_B)) &
> - FDI_RX_ENABLE);
> - drm_WARN_ON(&dev_priv->drm,
> - intel_de_read(dev_priv, FDI_RX_CTL(PIPE_C)) &
> - FDI_RX_ENABLE);
> -
> - temp &= ~FDI_BC_BIFURCATION_SELECT;
> - if (enable)
> - temp |= FDI_BC_BIFURCATION_SELECT;
> -
> - drm_dbg_kms(&dev_priv->drm, "%sabling fdi C rx\n",
> - enable ? "en" : "dis");
> - intel_de_write(dev_priv, SOUTH_CHICKEN1, temp);
> - intel_de_posting_read(dev_priv, SOUTH_CHICKEN1);
> -}
> -
> -static void ivb_update_fdi_bc_bifurcation(const struct intel_crtc_state 
> *crtc_state)
> -{
> - struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> - struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
> -
> - switch (crtc->pipe) {
> - case PIPE_A:
> - break;
> - case PIPE_B:
> - if (crtc_state->fdi_lanes > 2)
> - cpt_set_fdi_bc_bifurcation(dev_priv, false);
> - else
> - cpt_set_fdi_bc_bifurcation(dev_priv, true);
> -
> - break;
> - case PIPE_C:
> - cpt_set_fdi_bc_bifurcation(dev_priv, true);
> -
> - break;
> - default:
> - BUG();
> - }
> -}
> -
>  /*
>   * Finds the encoder associated with the given CRTC. This can only be
>   * used when we know that the CRTC isn't feeding multiple encoders!
> diff --git a/drivers/gpu/drm/i915/display/intel_fdi.c 
> b/drivers/gpu/drm/i915/display/intel_fdi.c
> index 88a78dafd54d..f8ffd5c032ae 100644
> --- a/drivers/gpu/drm/i915/display/intel_fdi.c
> +++ b/drivers/gpu/drm/i915/display/intel_fdi.c
> @@ -170,6 +170,55 @@ int ilk_fdi_compute_config(struct intel_crtc *crtc,
>   return ret;
>  }
>  
> +static void cpt_set_fdi_bc_bifurcation(struct drm_i915_private *dev_priv, 
> bool enable)
> +{
> + u32 temp;
> +
> + temp = intel_de_read(dev_priv, SOUTH_CHICKEN1);
> + if (!!(temp & FDI_BC_BIFURCATION_SELECT) == enable)
> + return;
> +
> + drm_WARN_ON(&dev_priv->drm,
> + intel_de_read(dev_priv, FDI_RX_CTL(PIPE_B)) &
> + FDI_RX_ENABLE);
> + drm_WARN_ON(&dev_priv->drm,
> + intel_de_read(dev_priv, FDI_RX_CTL(PIPE_C)) &
> + FDI_RX_ENABLE);
> +
> + temp &= ~FDI_BC_BIFURCATION_SELECT;
> + if (enable)
> + temp |= FDI_BC_BIFURCATION_SELECT;
> +
> + drm_dbg_kms(&dev_priv->drm, "%sabling fdi C rx\n",
> + enable ? "en" : "dis");
> + intel_de_write(dev_priv, SOUTH_CHICKEN1, temp);
> + intel_de_posting_read(dev_priv, SOUTH_CHICKEN1);
> +}
> +
> +void ivb_update_fdi_bc_bifurcation(const struct intel_crtc_state *crtc_state)
> +{
> + struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> + struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
> +
> + switch (crtc->pipe) {
> + case PIPE_A:
> + break;
> + case PIPE_B:
> + if (crtc_state->fdi_lanes > 2)
> + cpt_set_fdi_bc_bifurcation(dev_priv, false);
> + else
> + cpt_set_fdi_bc_bifurcation(dev_priv, true);
> +
> + break;
> + case PIPE_C:
> + cpt_set_fdi_bc_bifurcation(dev_priv, true);
> +
> + break;
> + default:
> + BUG();
> + }
> +}
> +
>  void intel_fdi_normal_train(struct intel_crtc *crtc)
>  {
>   struct drm_device *dev = crtc->base.dev;
> diff --git a/drivers/gpu/drm/i915/display/intel_fdi.h 
> b/drivers/gpu/drm/i915/display/intel_fdi.h
> index cda9a32c25ba..135802e4da68 100644
> --- a/drivers/gpu/drm/i915/display/intel_fdi.h
> +++

Re: [Intel-gfx] [PATCH 1/5] drm/i915/fdi: move intel_update_fdi_pll_freq to intel_fdi.c

2021-08-26 Thread Rodrigo Vivi

On Wed, Aug 25, 2021 at 06:47:48PM +0300, Jani Nikula wrote:
> Move FDI related functions to intel_fdi.c. Rename to have intel_fdi
> prefix while at it.
> 
> Signed-off-by: Jani Nikula 

Reviewed-by: Rodrigo Vivi 

> ---
>  drivers/gpu/drm/i915/display/intel_display.c | 18 +-
>  drivers/gpu/drm/i915/display/intel_fdi.c | 16 
>  drivers/gpu/drm/i915/display/intel_fdi.h |  1 +
>  3 files changed, 18 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
> b/drivers/gpu/drm/i915/display/intel_display.c
> index 794690c0dba5..3a9afe04ce0a 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -11564,22 +11564,6 @@ static void sanitize_watermarks(struct 
> drm_i915_private *dev_priv)
>   drm_modeset_acquire_fini(&ctx);
>  }
>  
> -static void intel_update_fdi_pll_freq(struct drm_i915_private *dev_priv)
> -{
> - if (IS_IRONLAKE(dev_priv)) {
> - u32 fdi_pll_clk =
> - intel_de_read(dev_priv, FDI_PLL_BIOS_0) & 
> FDI_PLL_FB_CLOCK_MASK;
> -
> - dev_priv->fdi_pll_freq = (fdi_pll_clk + 2) * 10000;
> - } else if (IS_SANDYBRIDGE(dev_priv) || IS_IVYBRIDGE(dev_priv)) {
> - dev_priv->fdi_pll_freq = 270000;
> - } else {
> - return;
> - }
> -
> - drm_dbg(&dev_priv->drm, "FDI PLL freq=%d\n", dev_priv->fdi_pll_freq);
> -}
> -
>  static int intel_initial_commit(struct drm_device *dev)
>  {
>   struct drm_atomic_state *state = NULL;
> @@ -11833,7 +11817,7 @@ int intel_modeset_init_nogem(struct drm_i915_private 
> *i915)
>  
>   intel_plane_possible_crtcs_init(i915);
>   intel_shared_dpll_init(dev);
> - intel_update_fdi_pll_freq(i915);
> + intel_fdi_pll_freq_update(i915);
>  
>   intel_update_czclk(i915);
>   intel_modeset_init_hw(i915);
> diff --git a/drivers/gpu/drm/i915/display/intel_fdi.c 
> b/drivers/gpu/drm/i915/display/intel_fdi.c
> index 13f8ba4c9188..88a78dafd54d 100644
> --- a/drivers/gpu/drm/i915/display/intel_fdi.c
> +++ b/drivers/gpu/drm/i915/display/intel_fdi.c
> @@ -95,6 +95,22 @@ static int ilk_check_fdi_lanes(struct drm_device *dev, 
> enum pipe pipe,
>   }
>  }
>  
> +void intel_fdi_pll_freq_update(struct drm_i915_private *i915)
> +{
> + if (IS_IRONLAKE(i915)) {
> + u32 fdi_pll_clk =
> + intel_de_read(i915, FDI_PLL_BIOS_0) & 
> FDI_PLL_FB_CLOCK_MASK;
> +
> + i915->fdi_pll_freq = (fdi_pll_clk + 2) * 10000;
> + } else if (IS_SANDYBRIDGE(i915) || IS_IVYBRIDGE(i915)) {
> + i915->fdi_pll_freq = 270000;
> + } else {
> + return;
> + }
> +
> + drm_dbg(&i915->drm, "FDI PLL freq=%d\n", i915->fdi_pll_freq);
> +}
> +
>  int intel_fdi_link_freq(struct drm_i915_private *i915,
>   const struct intel_crtc_state *pipe_config)
>  {
> diff --git a/drivers/gpu/drm/i915/display/intel_fdi.h 
> b/drivers/gpu/drm/i915/display/intel_fdi.h
> index 2c8ffd9ceaed..cda9a32c25ba 100644
> --- a/drivers/gpu/drm/i915/display/intel_fdi.h
> +++ b/drivers/gpu/drm/i915/display/intel_fdi.h
> @@ -23,5 +23,6 @@ void ilk_fdi_pll_enable(const struct intel_crtc_state 
> *crtc_state);
>  void intel_fdi_init_hook(struct drm_i915_private *dev_priv);
>  void hsw_fdi_link_train(struct intel_encoder *encoder,
>   const struct intel_crtc_state *crtc_state);
> +void intel_fdi_pll_freq_update(struct drm_i915_private *i915);
>  
>  #endif
> -- 
> 2.20.1
>

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Clean up GuC CI failures, simplify locking, and kernel DOC (rev7)

2021-08-26 Thread Patchwork

== Series Details ==

Series: Clean up GuC CI failures, simplify locking, and kernel DOC (rev7)
URL   : https://patchwork.freedesktop.org/series/93704/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1374:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1374:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1374:34: warning: incorrect type 
in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct 
i915_address_space *
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect 
type in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect 
type in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_reset.c:1392:5: warning: context imbalance in 
'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/i915_perf.c:1442:15: warning: memset with byte count of 
16777216
+drivers/gpu/drm/i915/i915_perf.c:1496:15: warning: memset with byte count of 
16777216
+drivers/gpu/drm/i915/selftests/i915_syncmap.c:80:54: warning: dubious: x | !y
+./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative 
(-262080)
+./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative 
(-262080)
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_write8' - different lock

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Clean up GuC CI failures, simplify locking, and kernel DOC (rev7)

2021-08-26 Thread Patchwork

== Series Details ==

Series: Clean up GuC CI failures, simplify locking, and kernel DOC (rev7)
URL   : https://patchwork.freedesktop.org/series/93704/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
7fe22714f5a9 drm/i915/guc: Fix blocked context accounting
4970f2849625 drm/i915/guc: Fix outstanding G2H accounting
2e610b253118 drm/i915/guc: Unwind context requests in reverse order
d105ec255b1d drm/i915/guc: Don't drop ce->guc_active.lock when unwinding context
7bd2db9a1759 drm/i915/guc: Process all G2H message at once in work queue
a2475cc1bce0 drm/i915/guc: Workaround reset G2H is received after schedule done 
G2H
36182517bd1a Revert "drm/i915/gt: Propagate change in error status to children 
on unhold"
-:8: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("<title line>")' - ie: 'commit 3761baae908a ("Revert "drm/i915: 
Propagate errors on awaiting already signaled fences"")'
#8: 
errors from one client ending up in another.  In 3761baae908a (Revert

-:11: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("<title line>")' - ie: 'commit 8e9f84cf5cac ("drm/i915/gt: 
Propagate change in error status to children on unhold")'
#11: 
added in 8e9f84cf5cac ("drm/i915/gt: Propagate change in error status

-:24: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#24: 
References: '3761baae908a ("Revert "drm/i915: Propagate errors on awaiting 
already signaled fences"")'

total: 2 errors, 1 warnings, 0 checks, 10 lines checked
4253aac3070d drm/i915/selftests: Add a cancel request selftest that triggers a 
reset
dce056059ff5 drm/i915/guc: Kick tasklet after queuing a request
-:8: WARNING:TYPO_SPELLING: 'inteface' may be misspelled - perhaps 'interface'?
#8: 
Fixes: 3a4cdf1982f0 ("drm/i915/guc: Implement GuC context operations for new 
inteface")
 


total: 0 errors, 1 warnings, 0 checks, 7 lines checked
afc29a46456f drm/i915/guc: Don't enable scheduling on a banned context, guc_id 
invalid, not registered
b663fb835571 drm/i915/guc: Copy whole golden context, set engine state size of 
subset
7fe07569686e drm/i915/selftests: Add initial GuC selftest for scrubbing lost G2H
-:108: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does 
MAINTAINERS need updating?
#108: 
new file mode 100644

total: 0 errors, 1 warnings, 0 checks, 233 lines checked
af7d6c0767f7 drm/i915/guc: Take context ref when cancelling request
8d24ad7df290 drm/i915/guc: Don't touch guc_state.sched_state without a lock
11bb5b5e2f49 drm/i915/guc: Reset LRC descriptor if register returns -ENODEV
0fba3913bf23 drm/i915: Allocate error capture in nowait context
dae3a133265a drm/i915/guc: Flush G2H work queue during reset
1308964ace8e drm/i915/guc: Release submit fence from an irq_work
c98d42a500e0 drm/i915/guc: Move guc_blocked fence to struct guc_state
0df9cd86f965 drm/i915/guc: Rework and simplify locking
0e16363d8865 drm/i915/guc: Proper xarray usage for contexts_lookup
2815870d2076 drm/i915/guc: Drop pin count check trick between sched_disable and 
re-pin
d898d084dfd7 drm/i915/guc: Move GuC priority fields in context under guc_active
515567407bed drm/i915/guc: Move fields protected by guc->contexts_lock into sub 
structure
8d7454b6d5bf drm/i915/guc: Drop guc_active move everything into guc_state
2847489a97ab drm/i915/guc: Add GuC kernel doc
d18d0c97031f drm/i915/guc: Drop static inline functions intel_guc_submission.c

Re: [Intel-gfx] [BUG - BISECTED] display not detected anymore

2021-08-26 Thread Heiko Carstens

Hi Ville,

> > > ef79d62b5ce5 ("drm/i915: Encapsulate dbuf state handling harder")
> > > 
> > > With that commit the display is not detected anymore, one commit
> > > before that it still works. So this one seems to be broken.
> > > 
> > > Ville, Stanislav, any idea how to fix this?
> > > 
> > > commit ef79d62b5ce53851901d6c1d21b74cbb9e27219b
> > > Author: Ville Syrjälä 
> > > Date:   Fri Jan 22 22:56:32 2021 +0200
> > > 
> > > drm/i915: Encapsulate dbuf state handling harder
> > 
> > That has nothing to do with display detection, so very mysterious.
> > 
> > Please file a bug at https://gitlab.freedesktop.org/drm/intel/issues/new
> > boot with drm.debug=0xe with both good and bad kernels and attach the
> > dmesg from both to the bug.
> 
> Everything (hopefully) provided here:
> https://gitlab.freedesktop.org/drm/intel/-/issues/4013
> 
> Please let me know if you need more, or if I can help otherwise to
> resolve this.

Did you have any time to look into this already?

[Intel-gfx] ✓ Fi.CI.IGT: success for drm/i915/gem: Fix the mman selftest

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/i915/gem: Fix the mman selftest
URL   : https://patchwork.freedesktop.org/series/94062/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10524_full -> Patchwork_20900_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20900_full:

### IGT changes ###

#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@kms_vblank@pipe-c-query-busy:
- {shard-rkl}:[SKIP][1] ([i915#1845]) -> [TIMEOUT][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10524/shard-rkl-5/igt@kms_vbl...@pipe-c-query-busy.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/shard-rkl-5/igt@kms_vbl...@pipe-c-query-busy.html

  * igt@syncobj_timeline@multi-wait-all-for-submit-available-signaled:
- {shard-rkl}:[PASS][3] -> [TIMEOUT][4] +2 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10524/shard-rkl-5/igt@syncobj_timel...@multi-wait-all-for-submit-available-signaled.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/shard-rkl-5/igt@syncobj_timel...@multi-wait-all-for-submit-available-signaled.html

  
Known issues


  Here are the changes found in Patchwork_20900_full that come from known 
issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_persistence@engines-hostile-preempt:
- shard-snb:  NOTRUN -> [SKIP][5] ([fdo#109271] / [i915#1099]) +1 
similar issue
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/shard-snb2/igt@gem_ctx_persiste...@engines-hostile-preempt.html

  * igt@gem_ctx_sseu@mmap-args:
- shard-tglb: NOTRUN -> [SKIP][6] ([i915#280])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/shard-tglb5/igt@gem_ctx_s...@mmap-args.html

  * igt@gem_exec_fair@basic-flow@rcs0:
- shard-tglb: [PASS][7] -> [FAIL][8] ([i915#2842]) +2 similar issues
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10524/shard-tglb7/igt@gem_exec_fair@basic-f...@rcs0.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/shard-tglb7/igt@gem_exec_fair@basic-f...@rcs0.html

  * igt@gem_exec_fair@basic-none-rrul@rcs0:
- shard-iclb: [PASS][9] -> [FAIL][10] ([i915#2842])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10524/shard-iclb7/igt@gem_exec_fair@basic-none-r...@rcs0.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/shard-iclb2/igt@gem_exec_fair@basic-none-r...@rcs0.html
- shard-glk:  [PASS][11] -> [FAIL][12] ([i915#2842])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10524/shard-glk5/igt@gem_exec_fair@basic-none-r...@rcs0.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/shard-glk3/igt@gem_exec_fair@basic-none-r...@rcs0.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-tglb: NOTRUN -> [FAIL][13] ([i915#2842])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/shard-tglb3/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-none@vcs1:
- shard-iclb: NOTRUN -> [FAIL][14] ([i915#2842]) +1 similar issue
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/shard-iclb4/igt@gem_exec_fair@basic-n...@vcs1.html
- shard-kbl:  NOTRUN -> [FAIL][15] ([i915#2842])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/shard-kbl7/igt@gem_exec_fair@basic-n...@vcs1.html

  * igt@gem_exec_fair@basic-pace@rcs0:
- shard-kbl:  [PASS][16] -> [FAIL][17] ([i915#2851])
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10524/shard-kbl4/igt@gem_exec_fair@basic-p...@rcs0.html
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/shard-kbl6/igt@gem_exec_fair@basic-p...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-kbl:  [PASS][18] -> [FAIL][19] ([i915#2842])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10524/shard-kbl4/igt@gem_exec_fair@basic-p...@vcs1.html
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/shard-kbl6/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_params@secure-non-master:
- shard-tglb: NOTRUN -> [SKIP][20] ([fdo#112283])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/shard-tglb5/igt@gem_exec_par...@secure-non-master.html

  * igt@gem_mmap_gtt@cpuset-big-copy-odd:
- shard-iclb: [PASS][21] -> [FAIL][22] ([i915#307]) +1 similar issue
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10524/shard-iclb3/igt@gem_mmap_...@cpuset-big-copy-odd.html
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/shard-iclb5/igt@gem_mmap_...@cpuset-big-copy-odd.html

  * igt@gem_pread@exhaustion:
- shard-apl:

Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for Clean up GuC CI failures, simplify locking, and kernel DOC (rev5)

2021-08-26 Thread Matthew Brost

On Thu, Aug 26, 2021 at 10:34:36AM +0000, Patchwork wrote:
> Patch Details
> 
> Series:  Clean up GuC CI failures, simplify locking, and kernel DOC (rev5)
> URL: https://patchwork.freedesktop.org/series/93704/
> State:   failure
> Details: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20896/index.html
> 
> CI Bug Log - changes from CI_DRM_10522_full -> Patchwork_20896_full
> 
> Summary
> 
> FAILURE
> 
> Serious unknown changes coming with Patchwork_20896_full absolutely need to be
> verified manually.
> 
> If you think the reported changes have nothing to do with the changes
> introduced in Patchwork_20896_full, please notify your bug team to allow them
> to document this new failure mode, which will reduce false positives in CI.
> 
> Possible new issues
> 
> Here are the unknown changes that may have been introduced in
> Patchwork_20896_full:
> 
> IGT changes
> 
> Possible regressions
> 
>   • igt@gem_exec_schedule@reorder-wide@vcs0:
>   □ shard-skl: PASS -> FAIL
> 

Not really sure what this one is about, but I don't see how it could be
related to this series, as almost all the changes in this series are in
the GuC backend while this test is running on a much older platform.

Matt 

> New tests
> 
> New tests have been introduced between CI_DRM_10522_full and
> Patchwork_20896_full:
> 
> New IGT tests (1)
> 
>   • igt@i915_selftest@live@guc:
>   □ Statuses : 8 pass(s)
>   □ Exec time: [0.47, 4.95] s
> 
> Known issues
> 
> Here are the changes found in Patchwork_20896_full that come from known 
> issues:
> 
> IGT changes
> 
> Issues hit
> 
>   • igt@gem_ctx_persistence@legacy-engines-mixed-process:
> 
>   □ shard-snb: NOTRUN -> SKIP (fdo#109271 / i915#1099) +1 similar issue
>   • igt@gem_ctx_sseu@mmap-args:
> 
>   □ shard-tglb: NOTRUN -> SKIP ([i915#280])
>   • igt@gem_eio@in-flight-10ms:
> 
>   □ shard-skl: PASS -> TIMEOUT ([i915#3063]) +1 similar issue
>   • igt@gem_exec_fair@basic-deadline:
> 
>   □ shard-kbl: PASS -> FAIL ([i915#2846])
>   • igt@gem_exec_fair@basic-none-solo@rcs0:
> 
>   □ shard-tglb: NOTRUN -> FAIL ([i915#2842])
>   • igt@gem_exec_fair@basic-pace@vcs1:
> 
>   □ shard-iclb: NOTRUN -> FAIL ([i915#2842])
>   • igt@gem_exec_fair@basic-pace@vecs0:
> 
>   □ shard-kbl: PASS -> FAIL ([i915#2842]) +1 similar issue
> 
>   □ shard-tglb: PASS -> FAIL ([i915#2842])
> 
>   • igt@gem_exec_fair@basic-throttle@rcs0:
> 
>   □ shard-glk: PASS -> FAIL ([i915#2842]) +1 similar issue
> 
>   □ shard-iclb: PASS -> FAIL ([i915#2849])
> 
>   • igt@gem_exec_params@secure-non-master:
> 
>   □ shard-tglb: NOTRUN -> SKIP (fdo#112283)
>   • igt@gem_pread@exhaustion:
> 
>   □ shard-snb: NOTRUN -> WARN ([i915#2658])
>   • igt@gem_render_copy@yf-tiled-to-vebox-linear:
> 
>   □ shard-iclb: NOTRUN -> SKIP ([i915#768])
>   • igt@gem_userptr_blits@readonly-pwrite-unsync:
> 
>   □ shard-tglb: NOTRUN -> SKIP ([i915#3297])
> 
>   □ shard-iclb: NOTRUN -> SKIP ([i915#3297])
> 
>   • igt@gen3_render_tiledy_blits:
> 
>   □ shard-tglb: NOTRUN -> SKIP (fdo#109289)
>   • igt@i915_pm_dc@dc6-psr:
> 
>   □ shard-iclb: PASS -> FAIL ([i915#454])
>   • igt@i915_pm_rpm@modeset-non-lpsp-stress-no-wait:
> 
>   □ shard-tglb: NOTRUN -> SKIP (fdo#111644 / i915#1397 / i915#2411)
>   • igt@kms_big_fb@linear-16bpp-rotate-90:
> 
>   □ shard-apl: NOTRUN -> SKIP (fdo#109271) +177 similar issues
>   • igt@kms_big_fb@linear-32bpp-rotate-0:
> 
>   □ shard-glk: PASS -> DMESG-WARN (i915#118 / [i915#95]) +2 similar issues
>   • igt@kms_big_fb@x-tiled-max-hw-stride-32bpp-rotate-0-hflip:
> 
>   □ shard-skl: NOTRUN -> SKIP (fdo#109271 / [i915#3777])
>   • igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0-hflip:
> 
>   □ shard-apl: NOTRUN -> SKIP (fdo#109271 / [i915#3777])
>   • igt@kms_big_fb@yf-tiled-max-hw-stride-64bpp-rotate-0-hflip:
> 
>   □ shard-tglb: NOTRUN -> SKIP (fdo#111615)
>   • igt@kms_ccs@pipe-a-bad-pixel-format-y_tiled_gen12_rc_ccs_cc:
> 
>   □ shard-skl: NOTRUN -> SKIP (fdo#109271 / [i915#3886]) +4 similar issues
>   • igt@kms_ccs@pipe-a-ccs-on-another-bo-y_tiled_gen12_mc_ccs:
> 
>   □ shard-apl: NOTRUN -> SKIP (fdo#109271 / [i915#3886]) +6 similar issues
>   • igt@kms_ccs@pipe-b-missing-ccs-buffer-y_tiled_gen12_rc_ccs_cc:
> 
>   □ shard-iclb: NOTRUN -> SKIP (fdo#109278 / [i915#3886])
>   • igt@kms_ccs@pipe-c-random-ccs-data-y_tiled_gen12_mc_ccs:
> 
>   □ shard-tglb: NOTRUN -> SKIP ([i915#3689] / [i915#3886])
>   • igt@kms_chamelium@dp-audio:
> 
>   □ shard-tglb: NOTRUN -> SKIP (fdo#109284 / fdo#111827) +1 similar issue
>   • igt@kms_chamelium@dp-crc-single:
> 
>   □ shard-snb: NOTRUN -> SKIP (fdo#109271 / fdo#111827) +8 similar issues
>   • igt@kms_chamelium@hdmi-hpd-fast:
> 
>   □ shard-iclb: NOTRUN -> SKIP (fdo#109284 / fdo#111827) +1 similar issue
>   • igt@kms_chamelium@vga-hpd-enable-disable-mode:
> 
>   □ shard-skl: NOTRUN -> SKIP (fdo#109271 / fdo#111827)
>   •

Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for Clean up GuC CI failures, simplify locking, and kernel DOC (rev6)

2021-08-26 Thread Matthew Brost

On Thu, Aug 26, 2021 at 04:17:07PM +, Patchwork wrote:
> Patch Details
> 
> Series:  Clean up GuC CI failures, simplify locking, and kernel DOC (rev6)
> URL: https://patchwork.freedesktop.org/series/93704/
> State:   failure
> Details: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20904/index.html
> 
> CI Bug Log - changes from CI_DRM_10525 -> Patchwork_20904
> 
> Summary
> 
> FAILURE
> 
> Serious unknown changes coming with Patchwork_20904 absolutely need to be
> verified manually.
> 
> If you think the reported changes have nothing to do with the changes
> introduced in Patchwork_20904, please notify your bug team to allow them
> to document this new failure mode, which will reduce false positives in CI.
> 
> External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20904/index.html
> 
> Possible new issues
> 
> Here are the unknown changes that may have been introduced in Patchwork_20904:
> 
> IGT changes
> 
> Possible regressions
> 
>   • igt@i915_selftest@live@hangcheck:
>   □ fi-rkl-guc: PASS -> INCOMPLETE

I've seen this locally before and after this series. I wouldn't hold off
the merge of this series because of this as I don't believe it is a
regression, just an existing instability in the stack. I haven't been
able to root cause this yet, but my initial analysis points to the GuC
losing a submission after the GuC has reset a context. Will dig into
this and hopefully get a fix after I'm back from vacation on 9/7.

Matt 

> 
> New tests
> 
> New tests have been introduced between CI_DRM_10525 and Patchwork_20904:
> 
> New IGT tests (1)
> 
>   • igt@i915_selftest@live@guc:
>   □ Statuses : 30 pass(s)
>   □ Exec time: [0.41, 5.26] s
> 
> Known issues
> 
> Here are the changes found in Patchwork_20904 that come from known issues:
> 
> IGT changes
> 
> Issues hit
> 
>   • igt@amdgpu/amd_cs_nop@sync-compute0:
> 
>   □ fi-kbl-soraka: NOTRUN -> SKIP (fdo#109271) +5 similar issues
>   • igt@runner@aborted:
> 
>   □ fi-rkl-guc: NOTRUN -> FAIL (i915#3928)
> 
> {name}: This element is suppressed. This means it is ignored when computing
> the status of the difference (SUCCESS, WARNING, or FAILURE).
> 
> Participating hosts (40 -> 33)
> 
> Missing (7): fi-ilk-m540 bat-adls-5 fi-hsw-4200u fi-tgl-1115g4 fi-bsw-cyan
> fi-bdw-samus bat-jsl-1
> 
> Build changes
> 
>   • Linux: CI_DRM_10525 -> Patchwork_20904
> 
> CI-20190529: 20190529
> CI_DRM_10525: 059309d37ac2de5d93cf6d71fd7fe33c9c2c66ea @ git://anongit.freedesktop.org/gfx-ci/linux
> IGT_6186: 250081b306c6fa8f95405fab6a7604f1968dd4ec @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
> Patchwork_20904: 0c1d27ac9fce7e231e7dddebcf56905e05302cae @ git://anongit.freedesktop.org/gfx-ci/linux
> 
> == Linux commits ==
> 
> 0c1d27ac9fce drm/i915/guc: Drop static inline functions intel_guc_submission.c
> 50ada01b3d95 drm/i915/guc: Add GuC kernel doc
> 883eccfa8221 drm/i915/guc: Drop guc_active move everything into guc_state
> fa075902c938 drm/i915/guc: Move fields protected by guc->contexts_lock into sub structure
> a1c73c8c481a drm/i915/guc: Move GuC priority fields in context under guc_active
> f16c0554ae08 drm/i915/guc: Drop pin count check trick between sched_disable and re-pin
> 42ac1b77a019 drm/i915/guc: Proper xarray usage for contexts_lookup
> 9b9222998c83 drm/i915/guc: Rework and simplify locking
> 244934484f63 drm/i915/guc: Move guc_blocked fence to struct guc_state
> ba695a58136a drm/i915/guc: Release submit fence from an irq_work
> 3bd5803d5e25 drm/i915/guc: Flush G2H work queue during reset
> b87ba9121748 drm/i915: Allocate error capture in nowait context
> adb35ad83c76 drm/i915/guc: Reset LRC descriptor if register returns -ENODEV
> 97e616063006 drm/i915/guc: Don't touch guc_state.sched_state without a lock
> 1ff99308ef88 drm/i915/guc: Take context ref when cancelling request
> ff84f14ddceb drm/i915/selftests: Add initial GuC selftest for scrubbing lost G2H
> abd6a8884cf4 drm/i915/guc: Copy whole golden context, set engine state size of subset
> a19ba1f51009 drm/i915/guc: Don't enable scheduling on a banned context, guc_id invalid, not registered
> f29b2b338002 drm/i915/guc: Kick tasklet after queuing a request
> f577a4fdeeab drm/i915/selftests: Add a cancel request selftest that triggers a reset
> da3d87dfe8c5 Revert "drm/i915/gt: Propagate change in error status to children on unhold"
> 25273a034c8d drm/i915/guc: Workaround reset G2H is received after schedule done G2H
> c00d543957c2 drm/i915/guc: Process all G2H message at once in work queue
> 5b7ff1fa9e43 drm/i915/guc: Don't drop ce->guc_active.lock when unwinding context
> 54cd904fa232 drm/i915/guc: Unwind context requests in reverse order
> 593f21493fda drm/i915/guc: Fix outstanding G2H accounting
> 6b511953d015 drm/i915/guc: Fix blocked context accounting
> 
> SECURITY NOTE: file ~/.netrc must not be accessible by others

[Intel-gfx] [PULL] drm-intel-fixes

2021-08-26 Thread Rodrigo Vivi

Hi Dave and Daniel,

I also had 2 other display patches, but I decided to keep them
out for now because CI_DIF_604 returned a bunch of link training
errors on TGL when compared to CI_DIF_603 which is based
on drm/drm-fixes.

Those patches are:
d7f213c131ad ("drm/i915/dp: Use max params for panels < eDP 1.4")
dab1b47e57e0 ("drm/i915/dp: return proper DPRX link training result")

Likely, this second one is the culprit so I will try to keep this out
and try to include the first one, but I'm not sure if CI will return
results in time, so let's try to not be late and propagate the
other 2 good patches below:

Here goes drm-intel-fixes-2021-08-26:

- Fix syncmap memory leak
- Drop redundant display port debug print

Thanks,
Rodrigo.

The following changes since commit e22ce8eb631bdc47a4a4ea7ecf4e4ba499db4f93:

  Linux 5.14-rc7 (2021-08-22 14:24:56 -0700)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-intel tags/drm-intel-fixes-2021-08-26

for you to fetch changes up to 71de496cc489b6bae2f51f89da7f28849bf2836e:

  drm/i915/dp: Drop redundant debug print (2021-08-26 07:31:52 -0400)


- Fix syncmap memory leak
- Drop redundant display port debug print


Matthew Brost (1):
  drm/i915: Fix syncmap memory leak

Swati Sharma (1):
  drm/i915/dp: Drop redundant debug print

 drivers/gpu/drm/i915/display/intel_dp.c  | 9 ++---
 drivers/gpu/drm/i915/gt/intel_timeline.c | 9 +
 2 files changed, 11 insertions(+), 7 deletions(-)

Re: [Intel-gfx] [PATCH v3] drm/i915/dp: Use max params for panels < eDP 1.4

2021-08-26 Thread Rodrigo Vivi

On Fri, Aug 20, 2021 at 08:26:14PM +0300, Ville Syrjälä wrote:
> On Fri, Aug 20, 2021 at 03:52:59PM +0800, Kai-Heng Feng wrote:
> > Users reported that after commit 2bbd6dba84d4 ("drm/i915: Try to use
> > fast+narrow link on eDP again and fall back to the old max strategy on
> > failure"), the screen starts to have wobbly effect.
> > 
> > Commit a5c936add6a2 ("drm/i915/dp: Use slow and wide link training for
> > everything") doesn't help either, that means the affected eDP 1.2 panels
> > only work with max params.
> > 
> > So use max params for panels < eDP 1.4 as Windows does to solve the
> > issue.
> > 
> > v3:
> >  - Do the eDP rev check in intel_edp_init_dpcd()
> > 
> > v2:
> >  - Check eDP 1.4 instead of DPCD 1.1 to apply max params
> > 
> > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3714
> > Fixes: 2bbd6dba84d4 ("drm/i915: Try to use fast+narrow link on eDP again 
> > and fall back to the old max strategy on failure")
> > Fixes: a5c936add6a2 ("drm/i915/dp: Use slow and wide link training for 
> > everything")
> > Suggested-by: Ville Syrjälä 
> > Signed-off-by: Kai-Heng Feng 
> 
> Slapped a cc:stable on it and pushed to drm-intel-next. Thanks.

Since I got a strange failure on CI_DIF_604 that I don't see on CI_DIF_603,
I'm avoiding the display patches. This one and also
dab1b47e57e0 ("drm/i915/dp: return proper DPRX link training result")

I know, it is probably the other one, but I had to remove both patches for
now and I'm not confident the CI will allow me to test with this one alone.

If we have -rc8 I will check again later. Otherwise we will have to send
to the stable mailing list later.

> 
> > ---
> >  drivers/gpu/drm/i915/display/intel_dp.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/display/intel_dp.c 
> > b/drivers/gpu/drm/i915/display/intel_dp.c
> > index 75d4ebc669411..e0dbd35ae7bc0 100644
> > --- a/drivers/gpu/drm/i915/display/intel_dp.c
> > +++ b/drivers/gpu/drm/i915/display/intel_dp.c
> > @@ -2445,11 +2445,14 @@ intel_edp_init_dpcd(struct intel_dp *intel_dp)
> >  */
> > if (drm_dp_dpcd_read(&intel_dp->aux, DP_EDP_DPCD_REV,
> >  intel_dp->edp_dpcd, sizeof(intel_dp->edp_dpcd)) ==
> > -sizeof(intel_dp->edp_dpcd))
> > +sizeof(intel_dp->edp_dpcd)) {
> > drm_dbg_kms(&dev_priv->drm, "eDP DPCD: %*ph\n",
> > (int)sizeof(intel_dp->edp_dpcd),
> > intel_dp->edp_dpcd);
> >  
> > +   intel_dp->use_max_params = intel_dp->edp_dpcd[0] < DP_EDP_14;
> > +   }
> > +
> > /*
> >  * This has to be called after intel_dp->edp_dpcd is filled, PSR checks
> >  * for SET_POWER_CAPABLE bit in intel_dp->edp_dpcd[1]
> > -- 
> > 2.32.0
> 
> -- 
> Ville Syrjälä
> Intel
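
The rule the patch above adds to intel_edp_init_dpcd() can be sketched outside the kernel as follows. This is only an illustrative userspace model, not the driver code: struct edp_link_policy and edp_update_link_policy() are hypothetical names, the read_ok flag stands in for drm_dp_dpcd_read() returning the full buffer size, and the 3-byte buffer length is an assumption. The DP_EDP_* values mirror the revision encodings from the kernel's DP helper header.

```c
#include <stdbool.h>
#include <stdint.h>

/* eDP DPCD revision encodings (byte 0 of the eDP DPCD block). */
#define DP_EDP_11 0x00
#define DP_EDP_12 0x01
#define DP_EDP_13 0x02
#define DP_EDP_14 0x03

/* Hypothetical stand-in for the relevant fields of struct intel_dp. */
struct edp_link_policy {
	uint8_t edp_dpcd[3];	/* assumed size, for illustration only */
	bool use_max_params;
};

/*
 * Only update the policy when the DPCD read succeeded; otherwise
 * edp_dpcd[] is not trustworthy and the previous value is kept.
 */
static void edp_update_link_policy(struct edp_link_policy *p, bool read_ok)
{
	if (read_ok)
		p->use_max_params = p->edp_dpcd[0] < DP_EDP_14;
}
```

With this model, a panel reporting DP_EDP_12 ends up with use_max_params set, while a DP_EDP_14 panel keeps the fast+narrow strategy.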

Re: [Intel-gfx] [PATCH] drm/i915/dp: return proper DPRX link training result

2021-08-26 Thread Rodrigo Vivi

On Sat, Aug 21, 2021 at 02:02:03AM +0300, Imre Deak wrote:
> On Sat, Aug 21, 2021 at 01:20:04AM +0300, Ville Syrjälä wrote:
> > On Wed, Aug 18, 2021 at 07:17:12PM +0300, Imre Deak wrote:
> > > On Wed, Aug 18, 2021 at 06:09:43PM +0300, Lee, Shawn C wrote:
> > > > On Tue, 2021-07-07, Lee Shawn C  wrote:
> > > > >On Tue, 2021-07-07, Almahallawy, Khaled  
> > > > >wrote:
> > > > >>I believe Imre's LT fallback:
> > > > >>https://github.com/ideak/linux/commits/linktraining-fallback-fix  and 
> > > > >>Chrome user space fix:
> > > > >>https://chromium-review.googlesource.com/c/chromium/src/+/3003487
> > > > >>should address Chrome concerns for LT failure and LTTPRs
> > > > >>
> > > > >
> > > > >Thanks for comment! The new fallback patch should help on this DPRX 
> > > > >problem.
> > > > >One more thing. If driver did not handle DPRX link train failed 
> > > > >properly.
> > > > >It would impact link layer compliance test case in below.
> > > > >
> > > > >400.3.1.3
> > > > >400.3.1.4
> > > > >400.3.1.6
> > > > >400.3.1.12
> > > > >400.3.1.13
> > > > >400.3.1.14
> > > > >
> > > > >Best regards,
> > > > >Shawn
> > > > >
> > > > 
> > > > Hi all, before Imre's patch series land on upstream driver. The link 
> > > > train failed
> > > > handling works for LTTPR only. But DPRX does not. Could you please 
> > > > consider to have
> > > > this change as temporary solution? Thanks!
> > > 
> > > I sent already fixing this, see
> > > https://lore.kernel.org/intel-gfx/20201027133600.3656665-1-imre.d...@intel.com/
> > > 
> > > but it fell through the cracks. Applied now your patch, thanks.
> > 
> > We seem to have a tgl that fails consistently at DPRX link training:
> > https://intel-gfx-ci.01.org/tree/drm-tip/fi-tgl-1115g4.html
> > 
> > Previously the error went unnoticed.
> 
> Yea, didn't notice this. Can't see anything obvious, besides that it's a
> DPCD rev 1.1 monitor, so maybe not compatible with LTTPRs. I follow up
> if I find something.

I opened this thread to say that I'm leaving this patch out of
this week's pull request targeting 5.14, because I saw something
strange with CI_DIF_604 on TGL that doesn't happen without this patch
(CI_DIF_603).

Since I don't know what's going on there I'm also avoiding
d7f213c131ad ("drm/i915/dp: Use max params for panels < eDP 1.4")
just in case...

> 
> > 
> > -- 
> > Ville Syrjälä
> > Intel

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: remove unused i915->active_pipes

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/i915: remove unused i915->active_pipes
URL   : https://patchwork.freedesktop.org/series/94076/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10525 -> Patchwork_20906


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20906/index.html

Known issues


  Here are the changes found in Patchwork_20906 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_cs_nop@sync-fork-compute0:
- fi-kbl-soraka:  NOTRUN -> [SKIP][1] ([fdo#109271]) +9 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20906/fi-kbl-soraka/igt@amdgpu/amd_cs_...@sync-fork-compute0.html

  * igt@kms_chamelium@hdmi-edid-read:
- fi-kbl-7500u:   [PASS][2] -> [FAIL][3] ([i915#3449])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/fi-kbl-7500u/igt@kms_chamel...@hdmi-edid-read.html
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20906/fi-kbl-7500u/igt@kms_chamel...@hdmi-edid-read.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#3449]: https://gitlab.freedesktop.org/drm/intel/issues/3449


Participating hosts (40 -> 33)
--

  Missing(7): fi-ilk-m540 bat-adls-5 fi-hsw-4200u fi-tgl-1115g4 fi-bsw-cyan 
fi-bdw-samus bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10525 -> Patchwork_20906

  CI-20190529: 20190529
  CI_DRM_10525: 059309d37ac2de5d93cf6d71fd7fe33c9c2c66ea @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6186: 250081b306c6fa8f95405fab6a7604f1968dd4ec @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20906: e8c9a5aa908f6917e49a5c2653db34346f141253 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

e8c9a5aa908f drm/i915: remove unused i915->active_pipes

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20906/index.html

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: remove unused i915->active_pipes

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/i915: remove unused i915->active_pipes
URL   : https://patchwork.freedesktop.org/series/94076/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
e8c9a5aa908f drm/i915: remove unused i915->active_pipes
-:10: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit ef79d62b5ce5 ("drm/i915: Encapsulate dbuf state handling harder")'
#10: 
ef79d62b5ce5 ("drm/i915: Encapsulate dbuf state handling harder"), and

-:35: CHECK:MULTIPLE_ASSIGNMENTS: multiple assignments should be avoided
#35: FILE: drivers/gpu/drm/i915/display/intel_display.c:12353:
+   cdclk_state->active_pipes = dbuf_state->active_pipes = active_pipes;

total: 1 errors, 0 warnings, 1 checks, 28 lines checked
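
For reference, the MULTIPLE_ASSIGNMENTS check above is usually addressed by splitting the chained assignment into one statement per destination. A minimal sketch, assuming trimmed-down stand-ins for the two state structures (cdclk_state_sketch, dbuf_state_sketch, and set_active_pipes() are hypothetical names, not the driver's types):

```c
#include <stdint.h>

/* Hypothetical, reduced versions of the two state structures. */
struct cdclk_state_sketch { uint8_t active_pipes; };
struct dbuf_state_sketch { uint8_t active_pipes; };

static void set_active_pipes(struct cdclk_state_sketch *cdclk_state,
			     struct dbuf_state_sketch *dbuf_state,
			     uint8_t active_pipes)
{
	/* One assignment per statement instead of a = b = c. */
	cdclk_state->active_pipes = active_pipes;
	dbuf_state->active_pipes = active_pipes;
}
```

The behaviour is the same as the chained form; the split only makes each store visible on its own line.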

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Ensure wa_init_finish() is called for ctx workaround list (rev2)

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/i915: Ensure wa_init_finish() is called for ctx workaround list 
(rev2)
URL   : https://patchwork.freedesktop.org/series/94053/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10525 -> Patchwork_20905


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/index.html

Known issues


  Here are the changes found in Patchwork_20905 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_cs_nop@sync-fork-compute0:
- fi-kbl-soraka:  NOTRUN -> [SKIP][1] ([fdo#109271]) +9 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/fi-kbl-soraka/igt@amdgpu/amd_cs_...@sync-fork-compute0.html

  * igt@i915_selftest@live@workarounds:
- fi-rkl-guc: [PASS][2] -> [DMESG-FAIL][3] ([i915#3928])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html

  * igt@kms_busy@basic@modeset:
- fi-tgl-1115g4:  [PASS][4] -> [DMESG-WARN][5] ([i915#4002])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/fi-tgl-1115g4/igt@kms_busy@ba...@modeset.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/fi-tgl-1115g4/igt@kms_busy@ba...@modeset.html

  * igt@runner@aborted:
- fi-rkl-guc: NOTRUN -> [FAIL][6] ([i915#3928])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/fi-rkl-guc/igt@run...@aborted.html

  
 Possible fixes 

  * igt@core_hotunplug@unbind-rebind:
- fi-tgl-1115g4:  [DMESG-WARN][7] ([i915#4002]) -> [PASS][8] +1 similar 
issue
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/fi-tgl-1115g4/igt@core_hotunp...@unbind-rebind.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/fi-tgl-1115g4/igt@core_hotunp...@unbind-rebind.html

  
 Warnings 

  * igt@gem_exec_suspend@basic-s3:
- fi-tgl-1115g4:  [FAIL][9] ([i915#1888]) -> [DMESG-WARN][10] 
([i915#4002])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html

  * igt@i915_pm_rpm@module-reload:
- fi-tgl-1115g4:  [INCOMPLETE][11] ([i915#1385] / [i915#4006]) -> 
[INCOMPLETE][12] ([i915#4006])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/fi-tgl-1115g4/igt@i915_pm_...@module-reload.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/fi-tgl-1115g4/igt@i915_pm_...@module-reload.html

  * igt@vgem_basic@unload:
- fi-tgl-1115g4:  [DMESG-WARN][13] ([i915#4002]) -> [DMESG-WARN][14] 
([i915#1385] / [i915#4002])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/fi-tgl-1115g4/igt@vgem_ba...@unload.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/fi-tgl-1115g4/igt@vgem_ba...@unload.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#1385]: https://gitlab.freedesktop.org/drm/intel/issues/1385
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#3928]: https://gitlab.freedesktop.org/drm/intel/issues/3928
  [i915#4002]: https://gitlab.freedesktop.org/drm/intel/issues/4002
  [i915#4006]: https://gitlab.freedesktop.org/drm/intel/issues/4006


Participating hosts (40 -> 34)
--

  Missing(6): fi-ilk-m540 bat-adls-5 fi-hsw-4200u fi-bsw-cyan fi-bdw-samus 
bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10525 -> Patchwork_20905

  CI-20190529: 20190529
  CI_DRM_10525: 059309d37ac2de5d93cf6d71fd7fe33c9c2c66ea @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6186: 250081b306c6fa8f95405fab6a7604f1968dd4ec @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20905: a8e610fb830482a5031a7ad8421db89552027a80 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

a8e610fb8304 drm/i915: Ensure wa_init_finish() is called for ctx workaround list

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20905/index.html

Re: [Intel-gfx] [PATCH 16/27] drm/i915: Allocate error capture in nowait context

2021-08-26 Thread Matthew Brost

On Wed, Aug 25, 2021 at 08:23:16PM -0700, Matthew Brost wrote:
> Error captures can now be done in a work queue processing G2H messages.
> These messages need to be completely done being processed in the reset
> path, to avoid races in the missing G2H cleanup, which create a
> dependency on memory allocations and dma fences (i915_requests).
> Requests depend on resets, thus now we have a circular dependency. To
> work around this, allocate the error capture in a nowait context.
> 

For completeness Daniel suggested we include the lockdep splat,
included below:

[  154.625989] ==
[  154.632195] WARNING: possible circular locking dependency detected
[  154.638393] 5.14.0-rc5-guc+ #50 Tainted: G U
[  154.643991] --
[  154.650196] i915_selftest/1673 is trying to acquire lock:
[  154.655621] 8881079cb918 ((work_completion)(>requests.worker)){+.+.}-{0:0}, at: __flush_work+0x350/0x4d0
[  154.665826]
   but task is already holding lock:
[  154.671682] 8881079cbfb8 (>reset.mutex){+.+.}-{3:3}, at: intel_gt_reset+0xf0/0x300 [i915]
[  154.680659]
   which lock already depends on the new lock.

[  154.688857]
   the existing dependency chain (in reverse order) is:
[  154.696365]
   -> #2 (>reset.mutex){+.+.}-{3:3}:
[  154.702571]lock_acquire+0xd2/0x300
[  154.706695]i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
[  154.712959]intel_gt_init_reset+0x61/0x80 [i915]
[  154.718258]intel_gt_init_early+0xe6/0x120 [i915]
[  154.723648]i915_driver_probe+0x592/0xdc0 [i915]
[  154.728942]i915_pci_probe+0x43/0x1c0 [i915]
[  154.733891]pci_device_probe+0x9b/0x110
[  154.738362]really_probe+0x1a6/0x3a0
[  154.742568]__driver_probe_device+0xf9/0x170
[  154.747468]driver_probe_device+0x19/0x90
[  154.752114]__driver_attach+0x99/0x170
[  154.756492]bus_for_each_dev+0x73/0xc0
[  154.760870]bus_add_driver+0x14b/0x1f0
[  154.765248]driver_register+0x67/0xb0
[  154.769542]i915_init+0x18/0x8c [i915]
[  154.773964]do_one_initcall+0x53/0x2e0
[  154.778343]do_init_module+0x56/0x210
[  154.782639]load_module+0x25fc/0x29f0
[  154.786934]__do_sys_finit_module+0xae/0x110
[  154.791835]do_syscall_64+0x38/0xc0
[  154.795958]entry_SYSCALL_64_after_hwframe+0x44/0xae
[  154.801558]
   -> #1 (fs_reclaim){+.+.}-{0:0}:
[  154.807241]lock_acquire+0xd2/0x300
[  154.811361]fs_reclaim_acquire+0x9e/0xd0
[  154.815914]kmem_cache_alloc_trace+0x30/0x790
[  154.820899]i915_gpu_coredump_alloc+0x53/0x1a0 [i915]
[  154.826649]i915_gpu_coredump+0x39/0x560 [i915]
[  154.831866]i915_capture_error_state+0xa/0x70 [i915]
[  154.837513]intel_guc_context_reset_process_msg+0x174/0x1f0 [i915]
[  154.844383]ct_incoming_request_worker_func+0x130/0x1b0 [i915]
[  154.850898]process_one_work+0x264/0x590
[  154.855451]worker_thread+0x4b/0x3a0
[  154.859655]kthread+0x147/0x170
[  154.863428]ret_from_fork+0x1f/0x30
[  154.867548]
   -> #0 ((work_completion)(>requests.worker)){+.+.}-{0:0}:
[  154.875747]check_prev_add+0x90/0xc30
[  154.880042]__lock_acquire+0x1643/0x2110
[  154.884595]lock_acquire+0xd2/0x300
[  154.888715]__flush_work+0x373/0x4d0
[  154.892920]intel_guc_submission_reset_prepare+0xf3/0x340 [i915]
[  154.899606]intel_uc_reset_prepare+0x40/0x50 [i915]
[  154.905166]reset_prepare+0x55/0x60 [i915]
[  154.909946]intel_gt_reset+0x11c/0x300 [i915]
[  154.914984]do_device_reset+0x13/0x20 [i915]
[  154.919936]check_whitelist_across_reset+0x166/0x250 [i915]
[  154.926212]live_reset_whitelist.cold+0x6a/0x7a [i915]
[  154.932037]__i915_subtests.cold+0x20/0x74 [i915]
[  154.937428]__run_selftests.cold+0x96/0xee [i915]
[  154.942816]i915_live_selftests+0x2c/0x60 [i915]
[  154.948125]i915_pci_probe+0x93/0x1c0 [i915]
[  154.953076]pci_device_probe+0x9b/0x110
[  154.957545]really_probe+0x1a6/0x3a0
[  154.961749]__driver_probe_device+0xf9/0x170
[  154.966653]driver_probe_device+0x19/0x90
[  154.971290]__driver_attach+0x99/0x170
[  154.975671]bus_for_each_dev+0x73/0xc0
[  154.980053]bus_add_driver+0x14b/0x1f0
[  154.984431]driver_register+0x67/0xb0
[  154.988725]i915_init+0x18/0x8c [i915]
[  154.993149]do_one_initcall+0x53/0x2e0
[  154.997527]do_init_module+0x56/0x210
[  155.001822]load_module+0x25fc/0x29f0
[  155.006118]__do_sys_finit_module+0xae/0x110
[  155.011019]do_syscall_64+0x38/0xc0
[  155.015139]entry_SYSCALL_64_after_hwframe+0x44/0xae
[  155.020729]
   other info that might help us

Re: [Intel-gfx] [PATCH 16/27] drm/i915: Allocate error capture in nowait context

2021-08-26 Thread Daniel Vetter

On Wed, Aug 25, 2021 at 08:23:16PM -0700, Matthew Brost wrote:
> Error captures can now be done in a work queue processing G2H messages.
> These messages need to be completely done being processed in the reset
> path, to avoid races in the missing G2H cleanup, which create a
> dependency on memory allocations and dma fences (i915_requests).
> Requests depend on resets, thus now we have a circular dependency. To
> work around this, allocate the error capture in a nowait context.
> 
> v2:
>  (Daniel Vetter)
>   - Use GFP_NOWAIT instead GFP_ATOMIC
> 
> Fixes: dc0dad365c5e ("Fix for error capture after full GPU reset with GuC")
> Fixes: 573ba126aef3 ("Capture error state on context reset")
> Signed-off-by: Matthew Brost 

Would be good to include an example splat here, since memory inversions
are a bit wtf due to the fake lockdep locks involved. In general it's always
good to put all the data you have into the commit message (maybe condensed
down) so it's easier to dig out things again.

With that: Reviewed-by: Daniel Vetter 
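
The buffer-growth fallback touched by this patch (try a 64K-aligned allocation first, then retry with a page-aligned size, and report -ENOMEM if both fail) can be modelled in userspace roughly as below. This is a sketch under stated assumptions, not the i915 code: error_buf_sketch, error_buf_grow(), the nowait_alloc_fn hook, and the two demo allocators are hypothetical, the 4 KiB page size is assumed, and the hook only stands in for an allocation that is not allowed to block (as with GFP_NOWAIT).

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */
#define SKETCH_SZ_64K    65536u

/* Round x up to a multiple of a (a must be a power of two). */
static size_t align_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Hypothetical allocator hook: must not block, may return NULL. */
typedef void *(*nowait_alloc_fn)(size_t size);

struct error_buf_sketch {
	void *buf;
	size_t size;
	int err;
};

/* Try the large size first, fall back to the smallest usable size. */
static bool error_buf_grow(struct error_buf_sketch *e, size_t len,
			   nowait_alloc_fn alloc)
{
	e->size = align_up(len + 1, SKETCH_SZ_64K);
	e->buf = alloc(e->size);
	if (!e->buf) {
		e->size = align_up(len + 1, SKETCH_PAGE_SIZE);
		e->buf = alloc(e->size);
	}
	if (!e->buf) {
		e->err = -ENOMEM;
		return false;
	}
	return true;
}

/* Demo allocator simulating memory pressure: large requests fail. */
static void *alloc_small_only(size_t size)
{
	return size > 16384 ? NULL : malloc(size);
}

/* Demo allocator that always fails. */
static void *alloc_never(size_t size)
{
	(void)size;
	return NULL;
}
```

Because a non-blocking allocation is more likely to fail than a GFP_KERNEL one, both attempts are treated as best effort and the caller has to cope with the capture being dropped.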

> ---
>  drivers/gpu/drm/i915/i915_gpu_error.c | 39 +--
>  1 file changed, 19 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c 
> b/drivers/gpu/drm/i915/i915_gpu_error.c
> index b9f66dbd46bb..8696ead02118 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -49,8 +49,7 @@
>  #include "i915_memcpy.h"
>  #include "i915_scatterlist.h"
>  
> -#define ALLOW_FAIL (GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN)
> -#define ATOMIC_MAYFAIL (GFP_ATOMIC | __GFP_NOWARN)
> +#define ATOMIC_MAYFAIL (GFP_NOWAIT | __GFP_NOWARN)
>  
>  static void __sg_set_buf(struct scatterlist *sg,
>void *addr, unsigned int len, loff_t it)
> @@ -79,7 +78,7 @@ static bool __i915_error_grow(struct 
> drm_i915_error_state_buf *e, size_t len)
>   if (e->cur == e->end) {
>   struct scatterlist *sgl;
>  
> - sgl = (typeof(sgl))__get_free_page(ALLOW_FAIL);
> + sgl = (typeof(sgl))__get_free_page(ATOMIC_MAYFAIL);
>   if (!sgl) {
>   e->err = -ENOMEM;
>   return false;
> @@ -99,10 +98,10 @@ static bool __i915_error_grow(struct 
> drm_i915_error_state_buf *e, size_t len)
>   }
>  
>   e->size = ALIGN(len + 1, SZ_64K);
> - e->buf = kmalloc(e->size, ALLOW_FAIL);
> + e->buf = kmalloc(e->size, ATOMIC_MAYFAIL);
>   if (!e->buf) {
>   e->size = PAGE_ALIGN(len + 1);
> - e->buf = kmalloc(e->size, GFP_KERNEL);
> + e->buf = kmalloc(e->size, ATOMIC_MAYFAIL);
>   }
>   if (!e->buf) {
>   e->err = -ENOMEM;
> @@ -243,12 +242,12 @@ static bool compress_init(struct i915_vma_compress *c)
>  {
>   struct z_stream_s *zstream = &c->zstream;
>  
> - if (pool_init(&c->pool, ALLOW_FAIL))
> + if (pool_init(&c->pool, ATOMIC_MAYFAIL))
>   return false;
>  
>   zstream->workspace =
>   kmalloc(zlib_deflate_workspacesize(MAX_WBITS, MAX_MEM_LEVEL),
> - ALLOW_FAIL);
> + ATOMIC_MAYFAIL);
>   if (!zstream->workspace) {
>   pool_fini(&c->pool);
>   return false;
> @@ -256,7 +255,7 @@ static bool compress_init(struct i915_vma_compress *c)
>  
>   c->tmp = NULL;
>   if (i915_has_memcpy_from_wc())
> - c->tmp = pool_alloc(&c->pool, ALLOW_FAIL);
> + c->tmp = pool_alloc(&c->pool, ATOMIC_MAYFAIL);
>  
>   return true;
>  }
> @@ -280,7 +279,7 @@ static void *compress_next_page(struct i915_vma_compress 
> *c,
>   if (dst->page_count >= dst->num_pages)
>   return ERR_PTR(-ENOSPC);
>  
> - page = pool_alloc(&c->pool, ALLOW_FAIL);
> + page = pool_alloc(&c->pool, ATOMIC_MAYFAIL);
>   if (!page)
>   return ERR_PTR(-ENOMEM);
>  
> @@ -376,7 +375,7 @@ struct i915_vma_compress {
>  
>  static bool compress_init(struct i915_vma_compress *c)
>  {
> - return pool_init(&c->pool, ALLOW_FAIL) == 0;
> + return pool_init(&c->pool, ATOMIC_MAYFAIL) == 0;
>  }
>  
>  static bool compress_start(struct i915_vma_compress *c)
> @@ -391,7 +390,7 @@ static int compress_page(struct i915_vma_compress *c,
>  {
>   void *ptr;
>  
> - ptr = pool_alloc(&c->pool, ALLOW_FAIL);
> + ptr = pool_alloc(&c->pool, ATOMIC_MAYFAIL);
>   if (!ptr)
>   return -ENOMEM;
>  
> @@ -1026,7 +1025,7 @@ i915_vma_coredump_create(const struct intel_gt *gt,
>  
>   num_pages = min_t(u64, vma->size, vma->obj->base.size) >> PAGE_SHIFT;
>   num_pages = DIV_ROUND_UP(10 * num_pages, 8); /* worstcase zlib growth */
> - dst = kmalloc(sizeof(*dst) + num_pages * sizeof(u32 *), ALLOW_FAIL);
> + dst = kmalloc(sizeof(*dst) + num_pages * sizeof(u32 *), ATOMIC_MAYFAIL);
>   if (!dst)
>   return NULL;
>  
> @@ -1462,7 +1461,7 @@ capture_engine(struct intel_engine_cs *engine,
>   struct i915_request *rq = NULL;
>   unsigned

[Intel-gfx] ✗ Fi.CI.BAT: failure for Clean up GuC CI failures, simplify locking, and kernel DOC (rev6)

2021-08-26 Thread Patchwork

== Series Details ==

Series: Clean up GuC CI failures, simplify locking, and kernel DOC (rev6)
URL   : https://patchwork.freedesktop.org/series/93704/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10525 -> Patchwork_20904


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20904 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20904, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20904/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20904:

### IGT changes ###

 Possible regressions 

  * igt@i915_selftest@live@hangcheck:
- fi-rkl-guc: [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/fi-rkl-guc/igt@i915_selftest@l...@hangcheck.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20904/fi-rkl-guc/igt@i915_selftest@l...@hangcheck.html

  
New tests
-

  New tests have been introduced between CI_DRM_10525 and Patchwork_20904:

### New IGT tests (1) ###

  * igt@i915_selftest@live@guc:
- Statuses : 30 pass(s)
- Exec time: [0.41, 5.26] s

  

Known issues


  Here are the changes found in Patchwork_20904 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_cs_nop@sync-compute0:
- fi-kbl-soraka:  NOTRUN -> [SKIP][3] ([fdo#109271]) +5 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20904/fi-kbl-soraka/igt@amdgpu/amd_cs_...@sync-compute0.html

  * igt@runner@aborted:
- fi-rkl-guc: NOTRUN -> [FAIL][4] ([i915#3928])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20904/fi-rkl-guc/igt@run...@aborted.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2411]: https://gitlab.freedesktop.org/drm/intel/issues/2411
  [i915#3928]: https://gitlab.freedesktop.org/drm/intel/issues/3928


Participating hosts (40 -> 33)
--

  Missing(7): fi-ilk-m540 bat-adls-5 fi-hsw-4200u fi-tgl-1115g4 fi-bsw-cyan 
fi-bdw-samus bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10525 -> Patchwork_20904

  CI-20190529: 20190529
  CI_DRM_10525: 059309d37ac2de5d93cf6d71fd7fe33c9c2c66ea @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6186: 250081b306c6fa8f95405fab6a7604f1968dd4ec @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20904: 0c1d27ac9fce7e231e7dddebcf56905e05302cae @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

0c1d27ac9fce drm/i915/guc: Drop static inline functions intel_guc_submission.c
50ada01b3d95 drm/i915/guc: Add GuC kernel doc
883eccfa8221 drm/i915/guc: Drop guc_active move everything into guc_state
fa075902c938 drm/i915/guc: Move fields protected by guc->contexts_lock into sub 
structure
a1c73c8c481a drm/i915/guc: Move GuC priority fields in context under guc_active
f16c0554ae08 drm/i915/guc: Drop pin count check trick between sched_disable and 
re-pin
42ac1b77a019 drm/i915/guc: Proper xarray usage for contexts_lookup
9b9222998c83 drm/i915/guc: Rework and simplify locking
244934484f63 drm/i915/guc: Move guc_blocked fence to struct guc_state
ba695a58136a drm/i915/guc: Release submit fence from an irq_work
3bd5803d5e25 drm/i915/guc: Flush G2H work queue during reset
b87ba9121748 drm/i915: Allocate error capture in nowait context
adb35ad83c76 drm/i915/guc: Reset LRC descriptor if register returns -ENODEV
97e616063006 drm/i915/guc: Don't touch guc_state.sched_state without a lock
1ff99308ef88 drm/i915/guc: Take context ref when cancelling request
ff84f14ddceb drm/i915/selftests: Add initial GuC selftest for scrubbing lost G2H
abd6a8884cf4 drm/i915/guc: Copy whole golden context, set engine state size of 
subset
a19ba1f51009 drm/i915/guc: Don't enable scheduling on a banned context, guc_id 
invalid, not registered
f29b2b338002 drm/i915/guc: Kick tasklet after queuing a request
f577a4fdeeab drm/i915/selftests: Add a cancel request selftest that triggers a 
reset
da3d87dfe8c5 Revert "drm/i915/gt: Propagate change in error status to children 
on unhold"
25273a034c8d drm/i915/guc: Workaround reset G2H is received after schedule done 
G2H
c00d543957c2 drm/i915/guc: Process all G2H message at once in work queue
5b7ff1fa9e43 drm/i915/guc: Don't drop ce->guc_active.lock when unwinding context
54cd904fa232 drm/i915/guc: Unwind context requests in reverse order
593f21493fda drm/i915/guc: Fix outstanding G2H

Re: [Intel-gfx] [PATCH v5 16/20] drm/msm: Don't break exclusive fence ordering

2021-08-26 Thread Rob Clark

On Thu, Aug 5, 2021 at 3:47 AM Daniel Vetter  wrote:
>
> There's only one exclusive slot, and we must not break the ordering.
>
> Adding a new exclusive fence drops all previous fences from the
> dma_resv. To avoid violating the signalling order we err on the side of
> over-synchronizing by waiting for the existing fences, even if
> userspace asked us to ignore them.
>
> A better fix would be to use a dma_fence_chain or _array like e.g.
> amdgpu now uses, but
> - msm has a synchronous dma_fence_wait for anything from another
>   context, so doesn't seem to care much,
> - and it probably makes sense to lift this into dma-resv.c code as a
>   proper concept, so that drivers don't have to hack up their own
>   solution each on their own.
>
> v2: Improve commit message per Lucas' suggestion.
>
> Cc: Lucas Stach 
> Signed-off-by: Daniel Vetter 
> Cc: Rob Clark 
> Cc: Sean Paul 
> Cc: linux-arm-...@vger.kernel.org
> Cc: freedr...@lists.freedesktop.org

a-b

> ---
>  drivers/gpu/drm/msm/msm_gem_submit.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
> b/drivers/gpu/drm/msm/msm_gem_submit.c
> index fb5a2eab27a2..66633dfd58a2 100644
> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> @@ -330,7 +330,8 @@ static int submit_fence_sync(struct msm_gem_submit 
> *submit, bool no_implicit)
> return ret;
> }
>
> -   if (no_implicit)
> +   /* exclusive fences must be ordered */
> +   if (no_implicit && !write)
> continue;
>
> ret = drm_sched_job_add_implicit_dependencies(&submit->base,
> --
> 2.32.0
>
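
The rule in this hunk can be written out as a small truth table. A minimal sketch in plain C, assuming nothing beyond the two flags visible in the diff; needs_implicit_sync() is a hypothetical name used for illustration and does not exist in msm:

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the msm driver: decide whether a
 * buffer's existing fences must still be added as dependencies.
 *
 * no_implicit: userspace asked to skip implicit sync
 * write:       the submit will install a new exclusive fence on the buffer
 */
bool needs_implicit_sync(bool no_implicit, bool write)
{
	/* exclusive fences must be ordered, so only reads may skip the wait */
	if (no_implicit && !write)
		return false;

	return true;
}
```

With the patch applied, only the (no_implicit, read) combination skips the wait; a write always over-synchronizes, as the commit message describes.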

Re: [Intel-gfx] [PATCH v5 12/20] drm/msm: Use scheduler dependency handling

2021-08-26 Thread Rob Clark

On Thu, Aug 5, 2021 at 3:47 AM Daniel Vetter  wrote:
>
> drm_sched_job_init is already at the right place, so this boils down
> to deleting code.
>
> Signed-off-by: Daniel Vetter 
> Cc: Rob Clark 
> Cc: Sean Paul 
> Cc: Sumit Semwal 
> Cc: "Christian König" 
> Cc: linux-arm-...@vger.kernel.org
> Cc: freedr...@lists.freedesktop.org
> Cc: linux-me...@vger.kernel.org
> Cc: linaro-mm-...@lists.linaro.org

r-b

> ---
>  drivers/gpu/drm/msm/msm_gem.h|  5 -
>  drivers/gpu/drm/msm/msm_gem_submit.c | 19 +--
>  drivers/gpu/drm/msm/msm_ringbuffer.c | 12 
>  3 files changed, 5 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_gem.h b/drivers/gpu/drm/msm/msm_gem.h
> index f9e3ffb2309a..8bf0ac707fd7 100644
> --- a/drivers/gpu/drm/msm/msm_gem.h
> +++ b/drivers/gpu/drm/msm/msm_gem.h
> @@ -312,11 +312,6 @@ struct msm_gem_submit {
> struct ww_acquire_ctx ticket;
> uint32_t seqno; /* Sequence number of the submit on the ring 
> */
>
> -   /* Array of struct dma_fence * to block on before submitting this job.
> -*/
> -   struct xarray deps;
> -   unsigned long last_dep;
> -
> /* Hw fence, which is created when the scheduler executes the job, and
>  * is signaled when the hw finishes (via seqno write from cmdstream)
>  */
> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
> b/drivers/gpu/drm/msm/msm_gem_submit.c
> index 96cea0ba4cfd..fb5a2eab27a2 100644
> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> @@ -52,8 +52,6 @@ static struct msm_gem_submit *submit_create(struct 
> drm_device *dev,
> return ERR_PTR(ret);
> }
>
> -   xa_init_flags(&submit->deps, XA_FLAGS_ALLOC);
> -
> kref_init(&submit->ref);
> submit->dev = dev;
> submit->aspace = queue->ctx->aspace;
> @@ -72,8 +70,6 @@ void __msm_gem_submit_destroy(struct kref *kref)
>  {
> struct msm_gem_submit *submit =
> container_of(kref, struct msm_gem_submit, ref);
> -   unsigned long index;
> -   struct dma_fence *fence;
> unsigned i;
>
> if (submit->fence_id) {
> @@ -82,12 +78,6 @@ void __msm_gem_submit_destroy(struct kref *kref)
> mutex_unlock(&submit->queue->lock);
> }
>
> -   xa_for_each (&submit->deps, index, fence) {
> -   dma_fence_put(fence);
> -   }
> -
> -   xa_destroy(&submit->deps);
> -
> dma_fence_put(submit->user_fence);
> dma_fence_put(submit->hw_fence);
>
> @@ -343,8 +333,9 @@ static int submit_fence_sync(struct msm_gem_submit 
> *submit, bool no_implicit)
> if (no_implicit)
> continue;
>
> -   ret = drm_gem_fence_array_add_implicit(&submit->deps, obj,
> -   write);
> +   ret = drm_sched_job_add_implicit_dependencies(&submit->base,
> + obj,
> + write);
> if (ret)
> break;
> }
> @@ -588,7 +579,7 @@ static struct drm_syncobj **msm_parse_deps(struct 
> msm_gem_submit *submit,
> if (ret)
> break;
>
> -   ret = drm_gem_fence_array_add(&submit->deps, fence);
> +   ret = drm_sched_job_add_dependency(&submit->base, fence);
> if (ret)
> break;
>
> @@ -798,7 +789,7 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void 
> *data,
> goto out_unlock;
> }
>
> -   ret = drm_gem_fence_array_add(&submit->deps, in_fence);
> +   ret = drm_sched_job_add_dependency(&submit->base, in_fence);
> if (ret)
> goto out_unlock;
> }
> diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c 
> b/drivers/gpu/drm/msm/msm_ringbuffer.c
> index bd54c1412649..652b1dedd7c1 100644
> --- a/drivers/gpu/drm/msm/msm_ringbuffer.c
> +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
> @@ -11,17 +11,6 @@ static uint num_hw_submissions = 8;
>  MODULE_PARM_DESC(num_hw_submissions, "The max # of jobs to write into 
> ringbuffer (default 8)");
>  module_param(num_hw_submissions, uint, 0600);
>
> -static struct dma_fence *msm_job_dependency(struct drm_sched_job *job,
> -   struct drm_sched_entity *s_entity)
> -{
> -   struct msm_gem_submit *submit = to_msm_submit(job);
> -
> -   if (!xa_empty(&submit->deps))
> -   return xa_erase(&submit->deps, submit->last_dep++);
> -
> -   return NULL;
> -}
> -
>  static struct dma_fence *msm_job_run(struct drm_sched_job *job)
>  {
> struct msm_gem_submit *submit = to_msm_submit(job);
> @@ -52,7 +41,6 @@ static void msm_job_free(struct drm_sched_job *job)
>  }
>
>  const struct drm_sched_backend_ops msm_sched_ops = {
> -   .dependency = msm_job_dependency,
> .run_job = msm_job_run,
> .free_job = msm_job_free
>  };
> --
> 2.32.0
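
For context, the per-driver callback deleted here only handed the scheduler one fence at a time until none were left. A minimal userspace sketch of that pattern, with a fixed array standing in for the xarray. All sketch_* names and the array size are assumptions made for illustration; this is not the drm/sched implementation:

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_DEPS 8	/* arbitrary limit for the sketch */

/* Stand-in for struct dma_fence. */
struct sketch_fence {
	int id;
};

/* Stand-in for a job that tracks its own dependencies. */
struct sketch_job {
	struct sketch_fence *deps[SKETCH_MAX_DEPS];
	size_t nr_deps;		/* fences added so far */
	size_t last_dep;	/* next fence to hand out */
};

/* Append a fence to wait on; returns 0 or a negative errno. */
int sketch_job_add_dependency(struct sketch_job *job, struct sketch_fence *fence)
{
	if (job->nr_deps == SKETCH_MAX_DEPS)
		return -ENOSPC;

	job->deps[job->nr_deps++] = fence;
	return 0;
}

/* Hand out the next unconsumed fence in insertion order, or NULL when done. */
struct sketch_fence *sketch_job_dependency(struct sketch_job *job)
{
	if (job->last_dep < job->nr_deps)
		return job->deps[job->last_dep++];

	return NULL;
}
```

Once the scheduler keeps this bookkeeping itself, a driver only needs the "add" half, which is why the patch is almost entirely deletions.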

Re: [Intel-gfx] [PATCH] drm/msm: Improve drm/sched point of no return rules

2021-08-26 Thread Rob Clark

On Thu, Aug 26, 2021 at 2:33 AM Daniel Vetter  wrote:
>
> Originally drm_sched_job_init was the point of no return, after which
> drivers really should submit a job. I've split that up, which allows
> us to fix this issue pretty easily.
>
> Only thing we have to take care of is to not skip to error paths after
> that. Other drivers do this the same for out-fence and similar things.
>
> v2: It's not really a bugfix, just an improvement, since all
> drm_sched_job_arm does is reserve the fence number. And gaps should be
> fine, as long as the drm_sched_job doesn't escape anywhere at all.
>
> For robustness it's still better to align with other drivers here and
> not bail out after job_arm().
>
> v3: I misplaced drm_sched_job_arm by _one_ line! Thanks to Rob for
> testing and debug help.
>
> Cc: Rob Clark 
> Cc: Rob Clark 
> Cc: Sean Paul 
> Cc: Sumit Semwal 
> Cc: "Christian König" 
> Cc: linux-arm-...@vger.kernel.org
> Cc: dri-de...@lists.freedesktop.org
> Cc: freedr...@lists.freedesktop.org
> Cc: linux-me...@vger.kernel.org
> Cc: linaro-mm-...@lists.linaro.org
> Signed-off-by: Daniel Vetter 

t-b && r-b

BR,
-R

> ---
>  drivers/gpu/drm/msm/msm_gem_submit.c | 13 ++---
>  1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
> b/drivers/gpu/drm/msm/msm_gem_submit.c
> index 4d1c4d5f6a2a..71b8c8f752a3 100644
> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> @@ -52,8 +52,6 @@ static struct msm_gem_submit *submit_create(struct 
> drm_device *dev,
> return ERR_PTR(ret);
> }
>
> -   drm_sched_job_arm(&submit->base);
> -
> xa_init_flags(&submit->deps, XA_FLAGS_ALLOC);
>
> kref_init(&submit->ref);
> @@ -880,6 +878,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void 
> *data,
>
> submit->nr_cmds = i;
>
> +   drm_sched_job_arm(&submit->base);
> +
> submit->user_fence = dma_fence_get(&submit->base.s_fence->finished);
>
> /*
> @@ -891,17 +891,16 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void 
> *data,
> if (submit->fence_id < 0) {
> ret = submit->fence_id = 0;
> submit->fence_id = 0;
> -   goto out;
> }
>
> -   if (args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> +   if (ret == 0 && args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
> struct sync_file *sync_file = 
> sync_file_create(submit->user_fence);
> if (!sync_file) {
> ret = -ENOMEM;
> -   goto out;
> +   } else {
> +   fd_install(out_fence_fd, sync_file->file);
> +   args->fence_fd = out_fence_fd;
> }
> -   fd_install(out_fence_fd, sync_file->file);
> -   args->fence_fd = out_fence_fd;
> }
>
> submit_attach_object_fences(submit);
> --
> 2.32.0
>
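
The control-flow change in the last hunk (record the error and fall through instead of jumping to an error label once the job is armed) can be sketched in isolation. This is a simplified userspace model under stated assumptions: the sketch_* names, the boolean inputs, and the -ENOMEM value for a failed id allocation are all hypothetical, and the real ioctl does considerably more:

```c
#include <errno.h>
#include <stdbool.h>

/* Outcome of the tail of a submit ioctl in this sketch. */
struct sketch_result {
	int ret;		/* error code reported to the caller */
	bool fd_installed;	/* out-fence fd was handed to userspace */
	bool pushed;		/* job was pushed to the scheduler */
};

struct sketch_result sketch_submit_tail(bool id_ok, bool want_fd,
					bool sync_file_ok)
{
	struct sketch_result r = { 0, false, false };

	/* drm_sched_job_arm() would sit here: the point of no return. */

	if (!id_ok)
		r.ret = -ENOMEM;	/* assumed error code; no early exit */

	if (r.ret == 0 && want_fd) {
		if (!sync_file_ok)
			r.ret = -ENOMEM;
		else
			r.fd_installed = true;
	}

	/* Every path falls through to here, so an armed job is always pushed. */
	r.pushed = true;
	return r;
}
```

The property of interest is that no combination of inputs leaves an armed job unpushed.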

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Clean up GuC CI failures, simplify locking, and kernel DOC (rev6)

2021-08-26 Thread Patchwork

== Series Details ==

Series: Clean up GuC CI failures, simplify locking, and kernel DOC (rev6)
URL   : https://patchwork.freedesktop.org/series/93704/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1374:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1374:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1374:34: warning: incorrect type 
in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct 
i915_address_space *
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect 
type in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect 
type in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_reset.c:1392:5: warning: context imbalance in 
'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/i915_perf.c:1442:15: warning: memset with byte count of 
16777216
+drivers/gpu/drm/i915/i915_perf.c:1496:15: warning: memset with byte count of 
16777216
+drivers/gpu/drm/i915/selftests/i915_syncmap.c:80:54: warning: dubious: x | !y
+./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative 
(-262080)
+./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative 
(-262080)
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_write8' - different lock

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for Clean up GuC CI failures, simplify locking, and kernel DOC (rev6)

2021-08-26 Thread Patchwork

== Series Details ==

Series: Clean up GuC CI failures, simplify locking, and kernel DOC (rev6)
URL   : https://patchwork.freedesktop.org/series/93704/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
6b511953d015 drm/i915/guc: Fix blocked context accounting
593f21493fda drm/i915/guc: Fix outstanding G2H accounting
54cd904fa232 drm/i915/guc: Unwind context requests in reverse order
5b7ff1fa9e43 drm/i915/guc: Don't drop ce->guc_active.lock when unwinding context
c00d543957c2 drm/i915/guc: Process all G2H message at once in work queue
25273a034c8d drm/i915/guc: Workaround reset G2H is received after schedule done 
G2H
da3d87dfe8c5 Revert "drm/i915/gt: Propagate change in error status to children 
on unhold"
-:8: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("<title line>")' - ie: 'commit 3761baae908a ("Revert "drm/i915: 
Propagate errors on awaiting already signaled fences"")'
#8: 
errors from one client ending up in another.  In 3761baae908a (Revert

-:11: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("<title line>")' - ie: 'commit 8e9f84cf5cac ("drm/i915/gt: 
Propagate change in error status to children on unhold")'
#11: 
added in 8e9f84cf5cac ("drm/i915/gt: Propagate change in error status

-:24: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description 
(prefer a maximum 75 chars per line)
#24: 
References: '3761baae908a ("Revert "drm/i915: Propagate errors on awaiting 
already signaled fences"")'

total: 2 errors, 1 warnings, 0 checks, 10 lines checked
f577a4fdeeab drm/i915/selftests: Add a cancel request selftest that triggers a 
reset
f29b2b338002 drm/i915/guc: Kick tasklet after queuing a request
-:8: WARNING:TYPO_SPELLING: 'inteface' may be misspelled - perhaps 'interface'?
#8: 
Fixes: 3a4cdf1982f0 ("drm/i915/guc: Implement GuC context operations for new 
inteface")
 


total: 0 errors, 1 warnings, 0 checks, 7 lines checked
a19ba1f51009 drm/i915/guc: Don't enable scheduling on a banned context, guc_id 
invalid, not registered
abd6a8884cf4 drm/i915/guc: Copy whole golden context, set engine state size of 
subset
ff84f14ddceb drm/i915/selftests: Add initial GuC selftest for scrubbing lost G2H
-:108: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does 
MAINTAINERS need updating?
#108: 
new file mode 100644

total: 0 errors, 1 warnings, 0 checks, 233 lines checked
1ff99308ef88 drm/i915/guc: Take context ref when cancelling request
97e616063006 drm/i915/guc: Don't touch guc_state.sched_state without a lock
adb35ad83c76 drm/i915/guc: Reset LRC descriptor if register returns -ENODEV
b87ba9121748 drm/i915: Allocate error capture in nowait context
3bd5803d5e25 drm/i915/guc: Flush G2H work queue during reset
ba695a58136a drm/i915/guc: Release submit fence from an irq_work
244934484f63 drm/i915/guc: Move guc_blocked fence to struct guc_state
9b9222998c83 drm/i915/guc: Rework and simplify locking
42ac1b77a019 drm/i915/guc: Proper xarray usage for contexts_lookup
f16c0554ae08 drm/i915/guc: Drop pin count check trick between sched_disable and 
re-pin
a1c73c8c481a drm/i915/guc: Move GuC priority fields in context under guc_active
fa075902c938 drm/i915/guc: Move fields protected by guc->contexts_lock into sub 
structure
883eccfa8221 drm/i915/guc: Drop guc_active move everything into guc_state
50ada01b3d95 drm/i915/guc: Add GuC kernel doc
0c1d27ac9fce drm/i915/guc: Drop static inline functions intel_guc_submission.c

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915: Be more gentle when exiting non-persistent contexts (rev2)

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/i915: Be more gentle when exiting non-persistent contexts (rev2)
URL   : https://patchwork.freedesktop.org/series/93420/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10525 -> Patchwork_20903


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/index.html

Known issues


  Here are the changes found in Patchwork_20903 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_cs_nop@sync-fork-compute0:
- fi-kbl-soraka:  NOTRUN -> [SKIP][1] ([fdo#109271]) +8 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/fi-kbl-soraka/igt@amdgpu/amd_cs_...@sync-fork-compute0.html

  * igt@i915_module_load@reload:
- fi-tgl-1115g4:  [PASS][2] -> [DMESG-WARN][3] ([i915#4002]) +1 similar 
issue
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/fi-tgl-1115g4/igt@i915_module_l...@reload.html
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/fi-tgl-1115g4/igt@i915_module_l...@reload.html

  
 Possible fixes 

  * igt@kms_prop_blob@basic:
- fi-tgl-1115g4:  [DMESG-WARN][4] ([i915#4002]) -> [PASS][5]
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/fi-tgl-1115g4/igt@kms_prop_b...@basic.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/fi-tgl-1115g4/igt@kms_prop_b...@basic.html

  
 Warnings 

  * igt@gem_exec_suspend@basic-s3:
- fi-tgl-1115g4:  [FAIL][6] ([i915#1888]) -> [DMESG-WARN][7] 
([i915#4002])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html

  * igt@kms_psr@primary_page_flip:
- fi-tgl-1115g4:  [SKIP][8] ([i915#1072] / [i915#1385]) -> [SKIP][9] 
([i915#1072])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/fi-tgl-1115g4/igt@kms_psr@primary_page_flip.html
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/fi-tgl-1115g4/igt@kms_psr@primary_page_flip.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1385]: https://gitlab.freedesktop.org/drm/intel/issues/1385
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#4002]: https://gitlab.freedesktop.org/drm/intel/issues/4002


Participating hosts (40 -> 34)
--

  Missing(6): fi-ilk-m540 bat-adls-5 fi-hsw-4200u fi-bsw-cyan fi-bdw-samus 
bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10525 -> Patchwork_20903

  CI-20190529: 20190529
  CI_DRM_10525: 059309d37ac2de5d93cf6d71fd7fe33c9c2c66ea @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6186: 250081b306c6fa8f95405fab6a7604f1968dd4ec @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20903: 660ff9cdf6970f83c2b76f96b4daeaeebf1a7b98 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

660ff9cdf697 drm/i915: Be more gentle when exiting non-persistent contexts

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20903/index.html

Re: [Intel-gfx] Tracing a "drm_mode_prune_invalid"

2021-08-26 Thread Adam Chasen

Ville,
It appears we are receiving some minimal information about the DP to dual-link 
DVI adapter which may be used to indicate dual-link support. Any chance we can 
use this information to augment the EDID to not filter out the higher clocks?

> EDID can't help us since it would only tell us whether the display
> supports dual-link or not. The dongle may still be single link only.

[CONNECTOR:95:DP-1]: status: connected
...
DP branch device present: yes
Type: DVI
ID: ***m2DVIa***
HW: 0.1
SW: 2.0
...

I very well may be barking up the wrong tree with the following:

The "DP-1" reports a "branch device" with an "ID" of m2DVIa which is mentioned 
in another (amd) video driver:

> +/*DP to Dual link DVI converter*/
> +static const uint8_t DP_DVI_CONVERTER_ID_4[] = "m2DVIa";
> +static const uint8_t DP_DVI_CONVERTER_ID_5[] = "3393N2";
 from https://lore.kernel.org/patchwork/patch/1338037/

This is used here which appears to do something with I2C: 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c#L309

There is mention of a small number of "external converter chips" which are 
used in the above conditional: 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/display/include/ddc_service_types.h#L30

Thanks,
Adam

On Fri, Jun 4, 2021, at 1:13 PM, Ville Syrjälä wrote:
> On Fri, Jun 04, 2021 at 12:57:25PM -0400, Adam Chasen wrote:
> > Thanks for staying with me! Still hoping I can get back to using 
> > KMS/Wayland combination with my setup.
> > 
> > I understand the current recommendation is to push the mode setting to the 
> > wayland compositor per Ville here: 
> > https://gitlab.freedesktop.org/drm/intel/-/issues/393#note_337616
> > 
> > Alas, I am using Mutter (similar to issue #393) which (historically) 
> > doesn't support mode setting (yet?).
> > 
> > There is mention of drm_dp_downstream_max_clock() in an i915 comment, which 
> > looks like could be a reference to drm_dp_downstream_max_tmds_clock(). 
> > 
> > It seems there is a hard coded 165MHz max for DP_DWN_STRM_PORT_TYPE_TMDS or 
> > (note the comment in the below code):
> > 
> > case DP_DS_PORT_TYPE_DVI:
> > if ((dpcd[DP_DOWNSTREAMPORT_PRESENT] & 
> > DP_DETAILED_CAP_INFO_AVAILABLE) == 0)
> > return 165000;
> > /* FIXME what to do about DVI dual link? */
> > return port_cap[1] * 2500;
> > 
> > Still wondering what the "one byte" format is for in configuration, but I 
> > presume it is setting DP_DETAILED_CAP_INFO_AVAILABLE to 0 which triggers 
> > this.
> > 
> > Is there a recommended approach to setting the port to support Dual-Link 
> > based on EDID response (or is it too late by the time we get the EDID)?
> 
> EDID can't help us since it would only tell us whether the display
> supports dual-link or not. The dongle may still be single link only.
> 
> > 
> > Is there a recommended approach for a "disable filter", or "manual 
> > modeset"? There are others who seem interested in overriding the filtering 
> > logic (e.g. "do what I say even though it isn't clear it will work"). 
> > https://gitlab.freedesktop.org/drm/intel/-/issues/393#note_829142
> 
> Userspace is free to force any mode it wants. But I guess mutter+wayland
> doesn't support that for some reason.
> 
> I've occasionally pondered about adding some kind of connector property
> for this, but not sure what it should look like. And it would still
> require userspace support to set it. Another idea would be to extend the
> video= cmdline with some kind of knob that lets you override these
> limits. But again it's a bit hard to come up with a decent solution since
> there are various different clock limits involved (TMDS clock for HDMI
> vs. link rate for DP, dotclock for everything). And just saying "ignore
> all limits" is not a very flexible solution since there may be some
> limit you do want to enforce, just not as low as what we would
> auto-detect.
> 
> > 
> > -Adam
> > 
> > -- Related --
> > 
> > I found these following a thread on the 165MHz clock limit in the context 
> > of DP dual mode HDMI dongles with a patch experimenting with turning off 
> > the limit: https://bugs.freedesktop.org/show_bug.cgi?id=112018#c2 (now 
> > https://gitlab.freedesktop.org/drm/intel/-/issues/511) There is even a hack 
> > for what appears to be a similar limitation
> > (using Dual mode DP): https://github.com/hansmi/fake-dp-dual-mode
> > 
> > Researching answers for previous questions: 
> > 
> > "one byte" cap:
> > /*
> >  * 0x80-0x8f describe downstream port capabilities, but there are two 
> > layouts
> >  * based on whether DP_DETAILED_CAP_INFO_AVAILABLE was set.  If it was not,
> >  * each port's descriptor is one byte wide.  If it was set, each port's is
> >  * four bytes wide, starting with the one byte from the base info.  As of
> >  * DP interop v1.1a only VGA defines additional detail.
> >  */
> > 
> >
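
To make the two pieces quoted above concrete (the one-byte vs. four-byte descriptor layout, and the DVI clock limit with its 165 MHz fallback), here is a minimal userspace sketch. The SKETCH_* constants and sketch_* functions are local stand-ins written for illustration; the bit position is an assumption, and this is not the kernel's drm_dp helper code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins; the bit position is an assumption for this sketch. */
#define SKETCH_DETAILED_CAP_INFO_AVAILABLE	(1u << 4)
#define SKETCH_SINGLE_LINK_DVI_KHZ		165000

/* Bytes per downstream port descriptor in the 0x80-0x8f range. */
size_t sketch_port_cap_size(uint8_t downstreamport_present)
{
	if (downstreamport_present & SKETCH_DETAILED_CAP_INFO_AVAILABLE)
		return 4;

	return 1;
}

/* Max TMDS clock in kHz for a DVI downstream port. */
int sketch_dvi_max_tmds_khz(uint8_t downstreamport_present,
			    const uint8_t *port_cap)
{
	assert(port_cap != NULL);

	/* One-byte layout: no clock field, so assume single link. */
	if (!(downstreamport_present & SKETCH_DETAILED_CAP_INFO_AVAILABLE))
		return SKETCH_SINGLE_LINK_DVI_KHZ;

	/* Four-byte layout: second byte carries the limit in 2.5 MHz units. */
	return port_cap[1] * 2500;
}
```

In this model a dongle reporting only the one-byte layout is always capped at 165000 kHz, while one reporting a detailed value of 132 would allow 330000 kHz. Whether the m2DVIa dongle reports detailed caps at all is not established in this thread.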

Re: [Intel-gfx] [PATCH v2] drm/i915/gt: Register the migrate contexts with their engines

2021-08-26 Thread Daniel Vetter

On Thu, Aug 26, 2021 at 03:59:30PM +0200, Thomas Hellström wrote:
> On Thu, 2021-08-26 at 14:44 +0200, Daniel Vetter wrote:
> > On Thu, Aug 26, 2021 at 12:45:14PM +0200, Thomas Hellström wrote:
> > > > Pinned contexts, like the migrate contexts, need a reset after resume
> > > since their context image may have been lost. Also the GuC needs to
> > > register pinned contexts.
> > > 
> > > Add a list to struct intel_engine_cs where we add all pinned
> > > contexts on
> > > creation, and traverse that list at resume time to reset the pinned
> > > contexts.
> > > 
> > > This fixes the kms_pipe_crc_basic@suspend-read-crc-pipe-a selftest
> > > for now,
> > > but proper LMEM backup / restore is needed for full suspend
> > > functionality.
> > > However, note that even with full LMEM backup / restore it may be
> > > desirable to keep the reset since backing up the migrate context
> > > images
> > > must happen using memcpy() after the migrate context has become
> > > inactive,
> > > and for performance- and other reasons we want to avoid memcpy()
> > > from
> > > LMEM.
> > > 
> > > Also traverse the list at guc_init_lrc_mapping() calling
> > > guc_kernel_context_pin() for the pinned contexts, like is already
> > > done
> > > for the kernel context.
> > > 
> > > v2:
> > > - Don't reset the contexts on each __engine_unpark() but rather at
> > >   resume time (Chris Wilson).
> > > 
> > > Cc: Tvrtko Ursulin 
> > > Cc: Matthew Auld 
> > > Cc: Maarten Lankhorst 
> > > Cc: Brost Matthew 
> > > Cc: Chris Wilson 
> > > Signed-off-by: Thomas Hellström 
> > 
> > > I guess it got lost, but a few weeks ago I stumbled over this and
> > wondered
> > why we're even setting up a separate context or at least why a
> > separate vm
> > compared to the gt->vm we have already?
> > 
> > Even on chips with bazillions of copy engines the plan is that we
> > only
> > reserve a single one for kernel migrations, so there's not really a
> > need
> > for quite this much generality I think. Maybe check with Jon
> > Bloomfield on
> > this.
> 
> Are you referring to the generality of the migration code itself or to
> the generality of using a list in this patch to register multiple
> pinned contexts to an engine? 
> 
> For the migration code itself, I figured reserving one copy engine for
> migration was strictly needed for recoverable page-faults? In the
> current version we're not doing that, but just tying a pinned migration
> context to the first available copy engine on the gt, to be used when
> we don't have a ww context available to pin a separate context using a
> random copy engine. Note also the ring size of the migration contexts;
> since we're populating the page-tables for each blit, it's not hard to
> fill the ring and in the end multiple contexts I guess boils down to
> avoiding priority inversion on migration, including blocking high
> priority kernel context tasks.
> 
> As for not using the gt->vm, I'm not completely sure if we can do our
> special page-table setup on that. Got to defer that question to Chris,
> but once Ram's work of supporting 64K LMEM PTEs on that has landed I
> guess we could easily reuse the gt->vm if possible and suitable.

Just on why we have gt->vm and then also the migration vm. The old mail I
typed up on this:

https://lore.kernel.org/dri-devel/CAKMK7uG6g+DQQEcjqeA6=z2enhogamuvkerdgkm5jkq3u+a...@mail.gmail.com/

-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915: Be more gentle when exiting non-persistent contexts (rev2)

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/i915: Be more gentle when exiting non-persistent contexts (rev2)
URL   : https://patchwork.freedesktop.org/series/93420/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
660ff9cdf697 drm/i915: Be more gentle when exiting non-persistent contexts
-:213: WARNING:LONG_LINE: line length of 107 exceeds 100 columns
#213: FILE: drivers/gpu/drm/i915/gt/intel_context_types.h:40:
+   void (*revoke)(struct intel_context *ce, struct i915_request *rq, 
unsigned int preempt_timeout_ms);

total: 0 errors, 1 warnings, 0 checks, 247 lines checked

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/gt: Register the migrate contexts with their engines (rev2)

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/i915/gt: Register the migrate contexts with their engines (rev2)
URL   : https://patchwork.freedesktop.org/series/94058/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10525 -> Patchwork_20902


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/index.html

Known issues


  Here are the changes found in Patchwork_20902 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_cs_nop@sync-compute0:
- fi-kbl-soraka:  NOTRUN -> [SKIP][1] ([fdo#109271]) +4 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/fi-kbl-soraka/igt@amdgpu/amd_cs_...@sync-compute0.html

  * igt@i915_selftest@live@gt_lrc:
- fi-rkl-guc: [PASS][2] -> [DMESG-WARN][3] ([i915#3958])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html

  
 Possible fixes 

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
- {fi-dg1-1}: [INCOMPLETE][4] ([i915#3717]) -> [PASS][5]
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/fi-dg1-1/igt@kms_pipe_crc_ba...@suspend-read-crc-pipe-a.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/fi-dg1-1/igt@kms_pipe_crc_ba...@suspend-read-crc-pipe-a.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#3717]: https://gitlab.freedesktop.org/drm/intel/issues/3717
  [i915#3958]: https://gitlab.freedesktop.org/drm/intel/issues/3958


Participating hosts (40 -> 33)
--

  Missing(7): fi-ilk-m540 bat-adls-5 fi-hsw-4200u fi-tgl-1115g4 fi-bsw-cyan 
fi-bdw-samus bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10525 -> Patchwork_20902

  CI-20190529: 20190529
  CI_DRM_10525: 059309d37ac2de5d93cf6d71fd7fe33c9c2c66ea @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6186: 250081b306c6fa8f95405fab6a7604f1968dd4ec @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20902: c3a760daf2a2b36d07c1ce110c988fda0bd1d16a @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

c3a760daf2a2 drm/i915/gt: Register the migrate contexts with their engines

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20902/index.html

Re: [Intel-gfx] [PATCH 24/33] drm/i915/guc: Implement banned contexts for GuC submission

2021-08-26 Thread Matthew Brost

On Thu, Aug 26, 2021 at 12:27:31PM +0100, Tvrtko Ursulin wrote:
> 
> On 26/08/2021 04:49, Matthew Brost wrote:
> > On Wed, Aug 25, 2021 at 11:39:10AM +0100, Tvrtko Ursulin wrote:
> > > 
> > > On 27/07/2021 01:23, Matthew Brost wrote:
> > > > When using GuC submission, if a context gets banned disable scheduling
> > > > and mark all inflight requests as complete.
> > > > 
> > > > Cc: John Harrison 
> > > > Signed-off-by: Matthew Brost 
> > > > Reviewed-by: John Harrison 
> > > > ---
> > > >drivers/gpu/drm/i915/gem/i915_gem_context.c   |   2 +-
> > > >drivers/gpu/drm/i915/gt/intel_context.h   |  13 ++
> > > >drivers/gpu/drm/i915/gt/intel_context_types.h |   2 +
> > > >drivers/gpu/drm/i915/gt/intel_reset.c |  32 +---
> > > >.../gpu/drm/i915/gt/intel_ring_submission.c   |  20 +++
> > > >drivers/gpu/drm/i915/gt/uc/intel_guc.h|   2 +
> > > >.../gpu/drm/i915/gt/uc/intel_guc_submission.c | 151 
> > > > --
> > > >drivers/gpu/drm/i915/i915_trace.h |  10 ++
> > > >8 files changed, 195 insertions(+), 37 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> > > > b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > index e3df01a201d7..05c3ee191710 100644
> > > > --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> > > > @@ -1084,7 +1084,7 @@ static void kill_engines(struct i915_gem_engines 
> > > > *engines, bool ban)
> > > > for_each_gem_engine(ce, engines, it) {
> > > > struct intel_engine_cs *engine;
> > > > -   if (ban && intel_context_set_banned(ce))
> > > > +   if (ban && intel_context_ban(ce, NULL))
> > > > continue;
> > > > /*
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_context.h 
> > > > b/drivers/gpu/drm/i915/gt/intel_context.h
> > > > index 2ed9bf5f91a5..814d9277096a 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_context.h
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> > > > @@ -16,6 +16,7 @@
> > > >#include "intel_engine_types.h"
> > > >#include "intel_ring_types.h"
> > > >#include "intel_timeline_types.h"
> > > > +#include "i915_trace.h"
> > > >#define CE_TRACE(ce, fmt, ...) do {  
> > > > \
> > > > const struct intel_context *ce__ = (ce);
> > > > \
> > > > @@ -243,6 +244,18 @@ static inline bool intel_context_set_banned(struct 
> > > > intel_context *ce)
> > > > > return test_and_set_bit(CONTEXT_BANNED, &ce->flags);
> > > >}
> > > > +static inline bool intel_context_ban(struct intel_context *ce,
> > > > +struct i915_request *rq)
> > > > +{
> > > > +   bool ret = intel_context_set_banned(ce);
> > > > +
> > > > +   trace_intel_context_ban(ce);
> > > > +   if (ce->ops->ban)
> > > > +   ce->ops->ban(ce, rq);
> > > > +
> > > > +   return ret;
> > > > +}
> > > > +
> > > >static inline bool
> > > >intel_context_force_single_submission(const struct intel_context *ce)
> > > >{
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
> > > > b/drivers/gpu/drm/i915/gt/intel_context_types.h
> > > > index 035108c10b2c..57c19ee3e313 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> > > > @@ -35,6 +35,8 @@ struct intel_context_ops {
> > > > int (*alloc)(struct intel_context *ce);
> > > > +   void (*ban)(struct intel_context *ce, struct i915_request *rq);
> > > > +
> > > > int (*pre_pin)(struct intel_context *ce, struct i915_gem_ww_ctx 
> > > > *ww, void **vaddr);
> > > > int (*pin)(struct intel_context *ce, void *vaddr);
> > > > void (*unpin)(struct intel_context *ce);
> > > > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
> > > > b/drivers/gpu/drm/i915/gt/intel_reset.c
> > > > index 4d281bc8a38c..91200c43951f 100644
> > > > --- a/drivers/gpu/drm/i915/gt/intel_reset.c
> > > > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
> > > > @@ -22,7 +22,6 @@
> > > >#include "intel_reset.h"
> > > >#include "uc/intel_guc.h"
> > > > -#include "uc/intel_guc_submission.h"
> > > >#define RESET_MAX_RETRIES 3
> > > > @@ -39,21 +38,6 @@ static void rmw_clear_fw(struct intel_uncore 
> > > > *uncore, i915_reg_t reg, u32 clr)
> > > > intel_uncore_rmw_fw(uncore, reg, clr, 0);
> > > >}
> > > > -static void skip_context(struct i915_request *rq)
> > > > -{
> > > > -   struct intel_context *hung_ctx = rq->context;
> > > > -
> > > > -   list_for_each_entry_from_rcu(rq, _ctx->timeline->requests, 
> > > > link) {
> > > > -   if (!i915_request_is_active(rq))
> > > > -   return;
> > > > -
> > > > -   if (rq->context == hung_ctx) {
> > > > -   i915_request_set_error_once(rq, -EIO);
> > > > -

[Intel-gfx] [PATCH] drm/i915: remove unused i915->active_pipes

2021-08-26 Thread Jani Nikula

Apparently the last reader of i915->active_pipes was removed with commit
ef79d62b5ce5 ("drm/i915: Encapsulate dbuf state handling harder"), and
now it's only ever written to. Remove it completely.

Cc: Stanislav Lisovskiy 
Cc: Ville Syrjälä 
Signed-off-by: Jani Nikula 
---
 drivers/gpu/drm/i915/display/intel_display.c | 4 +---
 drivers/gpu/drm/i915/i915_drv.h  | 6 --
 2 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c 
b/drivers/gpu/drm/i915/display/intel_display.c
index fe5ad599c218..a692971b0209 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -3781,7 +3781,6 @@ static void intel_crtc_disable_noatomic(struct intel_crtc 
*crtc,
 
intel_display_power_put_all_in_set(dev_priv, 
&crtc->enabled_power_domains);
 
-   dev_priv->active_pipes &= ~BIT(pipe);
cdclk_state->min_cdclk[pipe] = 0;
cdclk_state->min_voltage_level[pipe] = 0;
cdclk_state->active_pipes &= ~BIT(pipe);
@@ -12351,8 +12350,7 @@ static void intel_modeset_readout_hw_state(struct 
drm_device *dev)
enableddisabled(crtc_state->hw.active));
}
 
-   dev_priv->active_pipes = cdclk_state->active_pipes =
-   dbuf_state->active_pipes = active_pipes;
+   cdclk_state->active_pipes = dbuf_state->active_pipes = active_pipes;
 
readout_plane_state(dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f64ba566fe8c..033031169d74 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1015,12 +1015,6 @@ struct drm_i915_private {
 
struct list_head global_obj_list;
 
-   /*
-* For reading active_pipes holding any crtc lock is
-* sufficient, for writing must hold all of them.
-*/
-   u8 active_pipes;
-
struct i915_wa_list gt_wa_list;
 
struct i915_frontbuffer_tracking fb_tracking;
-- 
2.20.1

Re: [Intel-gfx] [PATCH 08/27] drm/i915/selftests: Add a cancel request selftest that triggers a reset

2021-08-26 Thread Matthew Brost

On Thu, Aug 26, 2021 at 10:32:54AM +0100, Tvrtko Ursulin wrote:
> 
> On 26/08/2021 04:23, Matthew Brost wrote:
> > Add a cancel request selftest that results in an engine reset to cancel
> > the request as it is non-preemptable. Also insert a NOP request after
> > the cancelled request and confirm that it completes successfully.
> 
> Which patch fixes a problem this exposes in the execlists implementation?
> 

https://patchwork.freedesktop.org/patch/451421/?series=93704&rev=6

> > Signed-off-by: Matthew Brost 
> > ---
> >   drivers/gpu/drm/i915/selftests/i915_request.c | 100 ++
> >   1 file changed, 100 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c 
> > b/drivers/gpu/drm/i915/selftests/i915_request.c
> > index d67710d10615..e2c5db77f087 100644
> > --- a/drivers/gpu/drm/i915/selftests/i915_request.c
> > +++ b/drivers/gpu/drm/i915/selftests/i915_request.c
> > @@ -772,6 +772,98 @@ static int __cancel_completed(struct intel_engine_cs 
> > *engine)
> > return err;
> >   }
> > +static int __cancel_reset(struct intel_engine_cs *engine)
> > +{
> > +   struct intel_context *ce;
> > +   struct igt_spinner spin;
> > +   struct i915_request *rq, *nop;
> > +   unsigned long preempt_timeout_ms;
> > +   int err = 0;
> > +
> 
> You may need to skip the test if preempt timeout is compiled out or if GPU
> reset is altogether disabled.
>

Yes, probably. Will fix this.
 
> > +   preempt_timeout_ms = engine->props.preempt_timeout_ms;
> > +   engine->props.preempt_timeout_ms = 100;
> > +
> > +   if (igt_spinner_init(&spin, engine->gt))
> > +   goto out_restore;
> > +
> > +   ce = intel_context_create(engine);
> > +   if (IS_ERR(ce)) {
> > +   err = PTR_ERR(ce);
> > +   goto out_spin;
> > +   }
> > +
> > +   rq = igt_spinner_create_request(&spin, ce, MI_NOOP);
> > +   if (IS_ERR(rq)) {
> > +   err = PTR_ERR(rq);
> > +   goto out_ce;
> > +   }
> > +
> > +   pr_debug("%s: Cancelling active request\n", engine->name);
> 
> "active non-preemptable" perhaps?
> 

Sure.

> > +   i915_request_get(rq);
> > +   i915_request_add(rq);
> > +   if (!igt_wait_for_spinner(&spin, rq)) {
> > +   struct drm_printer p = drm_info_printer(engine->i915->drm.dev);
> > +
> > +   pr_err("Failed to start spinner on %s\n", engine->name);
> > +   intel_engine_dump(engine, &p, "%s\n", engine->name);
> > +   err = -ETIME;
> > +   goto out_rq;
> > +   }
> > +
> > +   nop = intel_context_create_request(ce);
> > +   if (IS_ERR(nop))
> > +   goto out_nop;
> > +   i915_request_get(nop);
> > +   i915_request_add(nop);
> > +
> > +   i915_request_cancel(rq, -EINTR);
> > +
> > +   if (i915_request_wait(rq, 0, HZ) < 0) {
> > +   struct drm_printer p = drm_info_printer(engine->i915->drm.dev);
> > +
> > +   pr_err("%s: Failed to cancel hung request\n", engine->name);
> > +   intel_engine_dump(engine, &p, "%s\n", engine->name);
> > +   err = -ETIME;
> > +   goto out_nop;
> > +   }
> > +
> > +   if (rq->fence.error != -EINTR) {
> > +   pr_err("%s: fence not cancelled (%u)\n",
> > +  engine->name, rq->fence.error);
> > +   err = -EINVAL;
> > +   goto out_nop;
> > +   }
> > +
> > +   if (i915_request_wait(nop, 0, HZ) < 0) {
> > +   struct drm_printer p = drm_info_printer(engine->i915->drm.dev);
> > +
> > +   pr_err("%s: Failed to complete nop request\n", engine->name);
> > +   intel_engine_dump(engine, &p, "%s\n", engine->name);
> > +   err = -ETIME;
> > +   goto out_nop;
> > +   }
> > +
> > +   if (nop->fence.error != 0) {
> > +   pr_err("%s: Nop request errored (%u)\n",
> 
> Maybe s/nop/innocent/ in the respective log messages?
> 

I kinda prefer NOP.

> > +  engine->name, nop->fence.error);
> > +   err = -EINVAL;
> > +   }
> > +
> > +out_nop:
> > +   i915_request_put(nop);
> > +out_rq:
> > +   i915_request_put(rq);
> > +out_ce:
> > +   intel_context_put(ce);
> > +out_spin:
> > +   igt_spinner_fini(&spin);
> > +out_restore:
> > +   engine->props.preempt_timeout_ms = preempt_timeout_ms;
> > +   if (err)
> > +   pr_err("%s: %s error %d\n", __func__, engine->name, err);
> > +   return err;
> > +}
> > +
> >   static int live_cancel_request(void *arg)
> >   {
> > struct drm_i915_private *i915 = arg;
> > @@ -804,6 +896,14 @@ static int live_cancel_request(void *arg)
> > return err;
> > if (err2)
> > return err2;
> > +
> > +   /* Expects reset so call outside of igt_live_test_* */
> 
> Hm there are live tests like live_preempt_cancel which seemingly manage to
> do resets under the live test block.
>

You can increment t->reset_global if a GT reset is expected. The problem is
that only execlists do a GT reset, while GuC submission does a GuC
engine-based reset, so we'd have to put in a statement like this if it was
within the begin / end block:

if !guc

Re: [Intel-gfx] [PATCH v2] drm/i915/gt: Register the migrate contexts with their engines

2021-08-26 Thread Thomas Hellström

On Thu, 2021-08-26 at 14:44 +0200, Daniel Vetter wrote:
> On Thu, Aug 26, 2021 at 12:45:14PM +0200, Thomas Hellström wrote:
> > Pinned contexts, like the migrate contexts need reset after resume
> > since their context image may have been lost. Also the GuC needs to
> > register pinned contexts.
> > 
> > Add a list to struct intel_engine_cs where we add all pinned
> > contexts on
> > creation, and traverse that list at resume time to reset the pinned
> > contexts.
> > 
> > This fixes the kms_pipe_crc_basic@suspend-read-crc-pipe-a selftest
> > for now,
> > but proper LMEM backup / restore is needed for full suspend
> > functionality.
> > However, note that even with full LMEM backup / restore it may be
> > desirable to keep the reset since backing up the migrate context
> > images
> > must happen using memcpy() after the migrate context has become
> > inactive,
> > and for performance- and other reasons we want to avoid memcpy()
> > from
> > LMEM.
> > 
> > Also traverse the list at guc_init_lrc_mapping() calling
> > guc_kernel_context_pin() for the pinned contexts, like is already
> > done
> > for the kernel context.
> > 
> > v2:
> > - Don't reset the contexts on each __engine_unpark() but rather at
> >   resume time (Chris Wilson).
> > 
> > Cc: Tvrtko Ursulin 
> > Cc: Matthew Auld 
> > Cc: Maarten Lankhorst 
> > Cc: Brost Matthew 
> > Cc: Chris Wilson 
> > Signed-off-by: Thomas Hellström 
> 
> I guess it got lost, but a few weeks ago I stumbled over this and
> wondered
> why we're even setting up a separate context or at least why a
> separate vm
> compared to the gt->vm we have already?
> 
> Even on chips with bazillions of copy engines the plan is that we
> only
> reserve a single one for kernel migrations, so there's not really a
> need
> for quite this much generality I think. Maybe check with Jon
> Bloomfield on
> this.

Are you referring to the generality of the migration code itself or to
the generality of using a list in this patch to register multiple
pinned contexts to an engine? 

For the migration code itself, I figured reserving one copy engine for
migration was strictly needed for recoverable page-faults? In the
current version we're not doing that, but just tying a pinned migration
context to the first available copy engine on the gt, to be used when
we don't have a ww context available to pin a separate context using a
random copy engine. Note also the ring size of the migration contexts;
since we're populating the page-tables for each blit, it's not hard to
fill the ring and in the end multiple contexts I guess boils down to
avoiding priority inversion on migration, including blocking high
priority kernel context tasks.

As for not using the gt->vm, I'm not completely sure if we can do our
special page-table setup on that. Got to defer that question to Chris,
but once Ram's work of supporting 64K LMEM PTEs on that has landed I
guess we could easily reuse the gt->vm if possible and suitable.

Thanks,
/Thomas
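For readers following the thread, the mechanism described in the commit message (register every pinned context on a per-engine list at creation, then walk that list once at resume to reset them) can be sketched in plain userspace C. The structures and function names below (pinned_ctx, engine, engine_add_pinned, engine_resume_reset_pinned) are hypothetical stand-ins and not the i915 ones, and a hand-rolled singly linked list replaces the kernel's struct list_head.

```c
#include <stddef.h>

/* Hypothetical stand-in for a pinned context (kernel, migrate, ...). */
struct pinned_ctx {
	struct pinned_ctx *next; /* link in the engine's pinned list */
	int image_valid;         /* 0 once the context image may be lost */
	unsigned int resets;     /* number of resets applied so far */
};

/* Hypothetical stand-in for the engine that owns the list. */
struct engine {
	struct pinned_ctx *pinned; /* head of the pinned-context list */
};

/* Called once, when a pinned context is created. */
void engine_add_pinned(struct engine *e, struct pinned_ctx *ce)
{
	ce->next = e->pinned;
	e->pinned = ce;
}

/* Model suspend: treat every context image as lost. */
void engine_suspend(struct engine *e)
{
	for (struct pinned_ctx *ce = e->pinned; ce; ce = ce->next)
		ce->image_valid = 0;
}

/*
 * Called at resume time only, not on every unpark.
 * Resets every registered context and returns how many were reset.
 */
unsigned int engine_resume_reset_pinned(struct engine *e)
{
	unsigned int n = 0;

	for (struct pinned_ctx *ce = e->pinned; ce; ce = ce->next) {
		ce->image_valid = 1;
		ce->resets++;
		n++;
	}
	return n;
}

/* Returns 1 when both registered contexts are reset exactly once. */
int demo_pinned_resume(void)
{
	struct engine e = { NULL };
	struct pinned_ctx kernel = { NULL, 1, 0 };
	struct pinned_ctx migrate = { NULL, 1, 0 };
	unsigned int n;

	engine_add_pinned(&e, &kernel);
	engine_add_pinned(&e, &migrate);

	engine_suspend(&e);
	n = engine_resume_reset_pinned(&e);

	return n == 2 && kernel.image_valid && migrate.image_valid &&
	       kernel.resets == 1 && migrate.resets == 1;
}
```

The same list walk would be the natural place for any other per-context resume work, such as the GuC registration mentioned in the commit message, though that part is not modelled here.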

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/sched dependency handling and implicit sync fixes (rev5)

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/sched dependency handling and implicit sync fixes (rev5)
URL   : https://patchwork.freedesktop.org/series/93415/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10525 -> Patchwork_20901


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/index.html

Known issues


  Here are the changes found in Patchwork_20901 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@i915_selftest@live@gt_lrc:
- fi-bsw-n3050:   [PASS][1] -> [DMESG-FAIL][2] ([i915#2373])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/fi-bsw-n3050/igt@i915_selftest@live@gt_lrc.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/fi-bsw-n3050/igt@i915_selftest@live@gt_lrc.html

  * igt@i915_selftest@live@workarounds:
- fi-rkl-guc: [PASS][3] -> [DMESG-FAIL][4] ([i915#3928])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10525/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html

  * igt@runner@aborted:
- fi-rkl-guc: NOTRUN -> [FAIL][5] ([i915#3928])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/fi-rkl-guc/igt@run...@aborted.html

  
  [i915#2373]: https://gitlab.freedesktop.org/drm/intel/issues/2373
  [i915#3928]: https://gitlab.freedesktop.org/drm/intel/issues/3928


Participating hosts (40 -> 33)
--

  Missing(7): fi-ilk-m540 bat-adls-5 fi-hsw-4200u fi-tgl-1115g4 fi-bsw-cyan 
fi-bdw-samus bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10525 -> Patchwork_20901

  CI-20190529: 20190529
  CI_DRM_10525: 059309d37ac2de5d93cf6d71fd7fe33c9c2c66ea @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6186: 250081b306c6fa8f95405fab6a7604f1968dd4ec @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20901: d83acd41677bc9968b313704bf42b3c12ea6d018 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

d83acd41677b dma-resv: Give the docs a do-over
4f9da5ea4882 drm/i915: Don't break exclusive fence ordering
6f297c479190 drm/i915: delete exclude argument from 
i915_sw_fence_await_reservation
335ff7e05f44 drm/etnaviv: Don't break exclusive fence ordering
9578a0679e6b drm/msm: Don't break exclusive fence ordering
89463913c4ad drm/sched: Check locking in drm_sched_job_await_implicit
3d4cc3335c0e drm/sched: Don't store self-dependencies
9de878031434 drm/gem: Delete gem array fencing helpers
2e8edd04f89c drm/msm: Use scheduler dependency handling
164bbe5be15b drm/etnaviv: Use scheduler dependency handling
79d00ee75ee5 drm/v3d: Use scheduler dependency handling
563fd199edf5 drm/v3d: Move drm_sched_job_init to v3d_job_init
c8f3842d2afd drm/lima: use scheduler dependency tracking
e31bfc2e79ff drm/panfrost: use scheduler dependency tracking
ea09670e5838 drm/sched: improve docs around drm_sched_entity
fcd4fea4cae3 drm/sched: drop entity parameter from drm_sched_push_job
4bfa4c96b80c drm/sched: Add dependency tracking
b4bf87348863 drm/sched: Barriers are needed for entity->last_scheduled
b5c679582c27 drm/msm: Improve drm/sched point of no return rules
5cdf79bc4298 drm/sched: Split drm_sched_job_init

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20901/index.html

[Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915/gt: Register the migrate contexts with their engines

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/i915/gt: Register the migrate contexts with their engines
URL   : https://patchwork.freedesktop.org/series/94058/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10522_full -> Patchwork_20899_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20899_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20899_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20899_full:

### IGT changes ###

 Possible regressions 

  * igt@gem_softpin@allocator-evict-all-engines:
- shard-glk:  [PASS][1] -> [FAIL][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-glk7/igt@gem_soft...@allocator-evict-all-engines.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/shard-glk5/igt@gem_soft...@allocator-evict-all-engines.html

  
Known issues


  Here are the changes found in Patchwork_20899_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_ctx_persistence@legacy-engines-queued:
- shard-snb:  NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#1099]) +3 
similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/shard-snb7/igt@gem_ctx_persiste...@legacy-engines-queued.html

  * igt@gem_ctx_sseu@mmap-args:
- shard-tglb: NOTRUN -> [SKIP][4] ([i915#280])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/shard-tglb1/igt@gem_ctx_s...@mmap-args.html

  * igt@gem_eio@unwedge-stress:
- shard-snb:  NOTRUN -> [FAIL][5] ([i915#3354])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/shard-snb7/igt@gem_...@unwedge-stress.html

  * igt@gem_exec_fair@basic-deadline:
- shard-kbl:  [PASS][6] -> [FAIL][7] ([i915#2846])
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-kbl3/igt@gem_exec_f...@basic-deadline.html
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/shard-kbl1/igt@gem_exec_f...@basic-deadline.html
- shard-apl:  NOTRUN -> [FAIL][8] ([i915#2846])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/shard-apl8/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-tglb: NOTRUN -> [FAIL][9] ([i915#2842])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/shard-tglb2/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-iclb: NOTRUN -> [FAIL][10] ([i915#2842])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/shard-iclb4/igt@gem_exec_fair@basic-p...@vcs1.html
- shard-kbl:  [PASS][11] -> [FAIL][12] ([i915#2842])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-kbl4/igt@gem_exec_fair@basic-p...@vcs1.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/shard-kbl6/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_fair@basic-pace@vecs0:
- shard-tglb: [PASS][13] -> [FAIL][14] ([i915#2842])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-tglb8/igt@gem_exec_fair@basic-p...@vecs0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/shard-tglb1/igt@gem_exec_fair@basic-p...@vecs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-glk:  [PASS][15] -> [FAIL][16] ([i915#2842])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-glk5/igt@gem_exec_fair@basic-throt...@rcs0.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/shard-glk6/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_exec_params@secure-non-master:
- shard-tglb: NOTRUN -> [SKIP][17] ([fdo#112283])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/shard-tglb1/igt@gem_exec_par...@secure-non-master.html

  * igt@gem_pread@exhaustion:
- shard-snb:  NOTRUN -> [WARN][18] ([i915#2658])
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/shard-snb5/igt@gem_pr...@exhaustion.html

  * igt@gem_render_copy@yf-tiled-to-vebox-linear:
- shard-skl:  NOTRUN -> [SKIP][19] ([fdo#109271]) +18 similar issues
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/shard-skl2/igt@gem_render_c...@yf-tiled-to-vebox-linear.html
- shard-iclb: NOTRUN -> [SKIP][20] ([i915#768])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/shard-iclb5/igt@gem_render_c...@yf-tiled-to-vebox-linear.html

  * igt@gem_userptr_blits@readonly-pwrite-unsync:
- shard-tglb: NOTRUN -> [SKIP][21] ([i915#3297])
   [21]:

Re: [Intel-gfx] [PATCH] drm/i915: Be more gentle when exiting non-persistent contexts

2021-08-26 Thread Daniel Vetter

On Thu, Aug 26, 2021 at 11:52:14AM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin 
> 
> When a non-persistent context exits we currently mark it as banned in
> order to trigger fast termination of any outstanding GPU jobs it may have
> left running.
> 
> In doing so we apply a very strict 1ms limit in which the left over job
> has to preempt before we issue an engine reset.
> 
> Some workloads are not able to cleanly preempt in that time window and it
> can be argued that it would instead be better to give them a bit more
> grace since avoiding engine resets is generally preferable.
> 
> To achieve this the patch splits handling of banned contexts from simply
> closed non-persistent ones and then applies different timeouts for both
> and also extends the criteria which determines if a request should be
> scheduled back in after preemption or not.
> 
> A 20ms preempt timeout grace is given to exited non-persistent contexts,
> which has been empirically tested to satisfy customer requirements
> and still provides reasonably quick cleanup post exit.
> 
> v2:
>  * Streamline fast path checks.
> 
> v3:
>  * Simplify by using only schedulable status.
>  * Increase timeout to 20ms.
> 
> v4:
>  * Fix live_execlists selftest.
> 
> v5:
>  * Fix logic in kill_engines.
> 
> v6:
>  * Rebase.
> 
> v7:
>  * Add GuC support.
> 
> Signed-off-by: Tvrtko Ursulin 
> Cc: Chris Wilson 
> Cc: Zhen Han 
> Cc: Matthew Brost 
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   | 22 +++-
>  drivers/gpu/drm/i915/gt/intel_context.c   | 25 ++
>  drivers/gpu/drm/i915/gt/intel_context.h   | 26 ++-
>  drivers/gpu/drm/i915/gt/intel_context_types.h |  3 ++-
>  .../drm/i915/gt/intel_execlists_submission.c  | 13 +++---
>  .../gpu/drm/i915/gt/intel_ring_submission.c   |  7 ++---
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 13 ++
>  drivers/gpu/drm/i915/i915_request.c   |  2 +-
>  8 files changed, 84 insertions(+), 27 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> index fd169cf2f75a..6ae803cb4de3 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
> @@ -1072,7 +1072,8 @@ static struct intel_engine_cs *active_engine(struct 
> intel_context *ce)
>   return engine;
>  }
>  
> -static void kill_engines(struct i915_gem_engines *engines, bool ban)
> +static void
> +kill_engines(struct i915_gem_engines *engines, bool ban, bool persistent)
>  {
>   struct i915_gem_engines_iter it;
>   struct intel_context *ce;
> @@ -1086,8 +1087,15 @@ static void kill_engines(struct i915_gem_engines 
> *engines, bool ban)
>*/
>   for_each_gem_engine(ce, engines, it) {
>   struct intel_engine_cs *engine;
> + bool skip = false;
>  
> - if (ban && intel_context_ban(ce, NULL))
> + if (ban)
> + skip = intel_context_ban(ce, NULL);
> + else if (!persistent)
> + skip = intel_context_exit_nonpersistent(ce, NULL);
> +
> + /* Already banned or non-persistent closed. */
> + if (skip)
>   continue;
>  
>   /*
> @@ -1100,7 +1108,7 @@ static void kill_engines(struct i915_gem_engines 
> *engines, bool ban)
>   engine = active_engine(ce);
>  
>   /* First attempt to gracefully cancel the context */
> - if (engine && !__cancel_engine(engine) && ban)
> + if (engine && !__cancel_engine(engine) && (ban || !persistent))
>   /*
>* If we are unable to send a preemptive pulse to bump
>* the context from the GPU, we have to resort to a full
> @@ -1112,8 +1120,6 @@ static void kill_engines(struct i915_gem_engines 
> *engines, bool ban)
>  
>  static void kill_context(struct i915_gem_context *ctx)
>  {
> - bool ban = (!i915_gem_context_is_persistent(ctx) ||
> - !ctx->i915->params.enable_hangcheck);
>   struct i915_gem_engines *pos, *next;
>  
>   spin_lock_irq(&ctx->stale.lock);
> @@ -1126,7 +1132,8 @@ static void kill_context(struct i915_gem_context *ctx)
>  
>   spin_unlock_irq(&ctx->stale.lock);
>  
> - kill_engines(pos, ban);
> + kill_engines(pos, !ctx->i915->params.enable_hangcheck,
> +  i915_gem_context_is_persistent(ctx));
>  
>   spin_lock_irq(&ctx->stale.lock);
>   GEM_BUG_ON(i915_sw_fence_signaled(&pos->fence));
> @@ -1172,7 +1179,8 @@ static void engines_idle_release(struct 
> i915_gem_context *ctx,
>  
>  kill:
>   if (list_empty(>link)) /* raced, already closed */
> - kill_engines(engines, true);
> + kill_engines(engines, true,
> +  i915_gem_context_is_persistent(ctx));
>  
>   i915_sw_fence_commit(>fence);
>  }
> diff --git
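
As a sanity check on the kill_context() hunks above, the engine-reset decision can be written as two pure functions and compared over all inputs. This is only a sketch under stated assumptions: reset_old() and reset_new() are hypothetical names, the three booleans stand in for i915_gem_context_is_persistent(), params.enable_hangcheck and the result of __cancel_engine(), an active engine is assumed to exist, and the new intel_context_exit_nonpersistent() skip path is not modelled.

```c
#include <stdbool.h>

/* Before the patch: kill_context() folded both inputs into one "ban" flag. */
static bool reset_old(bool persistent, bool hangcheck, bool cancel_ok)
{
	bool ban = !persistent || !hangcheck;

	return !cancel_ok && ban;
}

/* After the patch: "ban" only reflects hangcheck, persistence is passed separately. */
static bool reset_new(bool persistent, bool hangcheck, bool cancel_ok)
{
	bool ban = !hangcheck;

	return !cancel_ok && (ban || !persistent);
}

/* Returns true if both versions agree for every combination of inputs. */
static bool reset_conditions_match(void)
{
	for (int i = 0; i < 8; i++) {
		bool persistent = i & 1, hangcheck = i & 2, cancel_ok = i & 4;

		if (reset_old(persistent, hangcheck, cancel_ok) !=
		    reset_new(persistent, hangcheck, cancel_ok))
			return false;
	}
	return true;
}
```

If the model is right, the fall-back-to-reset condition is unchanged for this call site and only the early "skip" handling differs.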

Re: [Intel-gfx] [PATCH v2] drm/i915/gt: Register the migrate contexts with their engines

2021-08-26 Thread Thomas Hellström

On Thu, 2021-08-26 at 14:04 +0100, Tvrtko Ursulin wrote:
> 
> On 26/08/2021 11:45, Thomas Hellström wrote:
> > Pinned contexts, like the migrate contexts need reset after resume
> > since their context image may have been lost. Also the GuC needs to
> > register pinned contexts.
> 
> So kernel context can get corrupt because we park the GPU with it 
> active. Blitter context for a different reason - which is that it is 
> used to copy itself over to smem, no?
> 
> If that is correct, then why bother copying the blitter context in
> the 
> first place and not just always re-create it on resume?
> 
> That would be along the lines of marking the backing store as
> "dontneed" 
> (however the exact mechanics of that look these days) so suspend can 
> skip them.

I think that is marking the object with I915_BO_ALLOC_VOLATILE. However,
I assume this follows the rule of the internal backend objects:
contents are valid while pinned (or locked), and these images are
indeed pinned on suspend, so we need to come up with something else.
Perhaps I915_BO_ALLOC_PM_NOSAVE for the context images (and engine
status pages?) and I915_BO_ALLOC_PM_MEMCPY for the migrate vm pagetables
only. The latter will also come in handy for supporting small apertures
where we need to pin these in the mappable area.
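
To make the flag idea a bit more concrete, here is a rough sketch of how suspend code might pick a backup method per object. Only the two flag names come from the suggestion above. The flag values, enum pm_backup, pick_backup() and the "blit by default" fallback are assumptions made up for illustration, not existing driver code.

```c
/* Hypothetical flag bits: names follow the proposal, values are arbitrary. */
#define I915_BO_ALLOC_PM_NOSAVE (1u << 0) /* contents are re-created on resume */
#define I915_BO_ALLOC_PM_MEMCPY (1u << 1) /* must be saved with memcpy() */

enum pm_backup {
	PM_BACKUP_SKIP,   /* nothing to save */
	PM_BACKUP_MEMCPY, /* CPU copy once the migrate context is idle */
	PM_BACKUP_BLIT,   /* assumed default: copy using the migrate context */
};

/* Pick how an object is backed up on suspend from its allocation flags. */
static enum pm_backup pick_backup(unsigned int flags)
{
	if (flags & I915_BO_ALLOC_PM_NOSAVE)
		return PM_BACKUP_SKIP;
	if (flags & I915_BO_ALLOC_PM_MEMCPY)
		return PM_BACKUP_MEMCPY;
	return PM_BACKUP_BLIT;
}
```

With this, context images would map to PM_BACKUP_SKIP and the migrate vm pagetables to PM_BACKUP_MEMCPY.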

> 
> > Add a list to struct intel_engine_cs where we add all pinned
> > contexts on
> > creation, and traverse that list at resume time to reset the pinned
> > contexts.
> > 
> > This fixes the kms_pipe_crc_basic@suspend-read-crc-pipe-a selftest
> > for now,
> > but proper LMEM backup / restore is needed for full suspend
> > functionality.
> > However, note that even with full LMEM backup / restore it may be
> > desirable to keep the reset since backing up the migrate context
> > images
> > must happen using memcpy() after the migrate context has become
> > inactive,
> > and for performance- and other reasons we want to avoid memcpy()
> > from
> > LMEM.
> 
> Hm I guess this talks about the issue - so are these images migrated
> at 
> all today or not?

My current WIP backs them up. But with something like the above flags,
that's easily changed. Suggestions welcome.

/Thomas

Re: [Intel-gfx] [PATCH v8 7/7] drm: remove drm_file.master_lookup_lock

2021-08-26 Thread Daniel Vetter

On Thu, Aug 26, 2021 at 10:01:22AM +0800, Desmond Cheong Zhi Xi wrote:
> Previously, master_lookup_lock was introduced in
> commit 0b0860a3cf5e ("drm: serialize drm_file.master with a new
> spinlock") to serialize accesses to drm_file.master. This then allowed
> us to write drm_file_get_master in commit 56f0729a510f ("drm: protect
> drm_master pointers in drm_lease.c").
> 
> The rationale behind introducing a new spinlock at the time was that
> the other lock that could have been used (drm_device.master_mutex) was
> the outermost lock, so embedding calls to drm_file_get_master and
> drm_is_current_master in various functions easily caused us to invert
> the lock hierarchy.
> 
> Following the conversion of master_mutex into a rwsem, and its use to
> plug races with modesetting rights, we've untangled some lock
> hierarchies and removed the need for using drm_file_get_master and the
> unlocked version of drm_is_current_master in multiple places.
> 
> Hence, we can take this opportunity to clean up the locking design by
> replacing master_lookup_lock with drm_device.master_rwsem.
> 
> Signed-off-by: Desmond Cheong Zhi Xi 
> ---
>  drivers/gpu/drm/drm_auth.c | 19 +++
>  drivers/gpu/drm/drm_file.c |  1 -
>  drivers/gpu/drm/drm_internal.h |  1 +
>  drivers/gpu/drm/drm_ioctl.c|  4 ++--
>  drivers/gpu/drm/drm_lease.c| 18 --
>  include/drm/drm_file.h |  9 +
>  6 files changed, 19 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
> index f2b2f197052a..232416119407 100644
> --- a/drivers/gpu/drm/drm_auth.c
> +++ b/drivers/gpu/drm/drm_auth.c
> @@ -61,10 +61,9 @@
>   * trusted clients.
>   */
>  
> -static bool drm_is_current_master_locked(struct drm_file *fpriv)
> +bool drm_is_current_master_locked(struct drm_file *fpriv)
>  {
> - lockdep_assert_once(lockdep_is_held(>master_lookup_lock) ||
> - lockdep_is_held(>minor->dev->master_rwsem));
> + lockdep_assert_held_once(>minor->dev->master_rwsem);
>  
>   return fpriv->is_master && drm_lease_owner(fpriv->master) == 
> fpriv->minor->dev->master;
>  }
> @@ -83,9 +82,9 @@ bool drm_is_current_master(struct drm_file *fpriv)
>  {
>   bool ret;
>  
> - spin_lock(>master_lookup_lock);
> + down_read(>minor->dev->master_rwsem);

Looking at the 3 patches and the need to have a locked version of pretty
much everything I'm wondering: Can't we just drop the spinlock completely,
and everywhere we've been taking it thus far replace it with a
lockdep_assert_held_once?

The thing is, if there's any path left that doesn't hold the rwsem in at
least read mode we have a bug. And the right way to fix such a bug is to
grab the rwsem sufficiently high up in the callchain. That way I think we
should be able to avoid all these tedious changes to everything, including
touching i915 and vmwgfx drivers.

Or am I missing something big time?
-Daniel
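
A rough userspace model of what I mean, not kernel code: the leaf helper takes no lock and only asserts that the caller holds it, and the lock is taken once at the top of the call chain. All toy_* names are made up, the "rwsem" is just a reader counter, and assert() stands in for lockdep_assert_held_once().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a rwsem: it only tracks how many readers hold it. */
struct toy_rwsem {
	int readers;
};

static void toy_down_read(struct toy_rwsem *sem) { sem->readers++; }
static void toy_up_read(struct toy_rwsem *sem)   { sem->readers--; }

struct toy_dev {
	struct toy_rwsem master_rwsem;
	int master;		/* id of the current master */
};

struct toy_file {
	struct toy_dev *dev;
	int master;
	bool is_master;
};

/* Leaf helper: takes no lock, only asserts that the caller already holds it. */
static bool toy_is_current_master(struct toy_file *f)
{
	assert(f->dev->master_rwsem.readers > 0);
	return f->is_master && f->master == f->dev->master;
}

/* Entry point high up in the call chain: the only place the lock is taken. */
static int toy_ioctl(struct toy_file *f)
{
	int ret;

	toy_down_read(&f->dev->master_rwsem);
	ret = toy_is_current_master(f) ? 0 : -1;
	toy_up_read(&f->dev->master_rwsem);
	return ret;
}
```

Any path that reaches toy_is_current_master() without the lock trips the assertion, which is exactly the kind of bug we would want to see rather than paper over.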

>   ret = drm_is_current_master_locked(fpriv);
> - spin_unlock(>master_lookup_lock);
> + up_read(>minor->dev->master_rwsem);
>  
>   return ret;
>  }
> @@ -120,7 +119,7 @@ int drm_authmagic(struct drm_device *dev, void *data,
>   DRM_DEBUG("%u\n", auth->magic);
>  
>   down_write(>master_rwsem);
> - if (unlikely(!drm_is_current_master(file_priv))) {
> + if (unlikely(!drm_is_current_master_locked(file_priv))) {
>   up_write(>master_rwsem);
>   return -EACCES;
>   }
> @@ -178,9 +177,7 @@ static int drm_new_set_master(struct drm_device *dev, 
> struct drm_file *fpriv)
>   new_master = drm_master_create(dev);
>   if (!new_master)
>   return -ENOMEM;
> - spin_lock(>master_lookup_lock);
>   fpriv->master = new_master;
> - spin_unlock(>master_lookup_lock);
>  
>   fpriv->is_master = 1;
>   fpriv->authenticated = 1;
> @@ -343,9 +340,7 @@ int drm_master_open(struct drm_file *file_priv)
>   if (!dev->master) {
>   ret = drm_new_set_master(dev, file_priv);
>   } else {
> - spin_lock(_priv->master_lookup_lock);
>   file_priv->master = drm_master_get(dev->master);
> - spin_unlock(_priv->master_lookup_lock);
>   }
>   up_write(>master_rwsem);
>  
> @@ -413,13 +408,13 @@ struct drm_master *drm_file_get_master(struct drm_file 
> *file_priv)
>   if (!file_priv)
>   return NULL;
>  
> - spin_lock(_priv->master_lookup_lock);
> + down_read(_priv->minor->dev->master_rwsem);
>   if (!file_priv->master)
>   goto unlock;
>   master = drm_master_get(file_priv->master);
>  
>  unlock:
> - spin_unlock(_priv->master_lookup_lock);
> + up_read(_priv->minor->dev->master_rwsem);
>   return master;
>  }
>  EXPORT_SYMBOL(drm_file_get_master);
> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> index 90b62f360da1..8c846e0179d7 100644
> --- a/drivers/gpu/drm/drm_file.c
> +++

[Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/sched dependency handling and implicit sync fixes (rev5)

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/sched dependency handling and implicit sync fixes (rev5)
URL   : https://patchwork.freedesktop.org/series/93415/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
5cdf79bc4298 drm/sched: Split drm_sched_job_init
-:240: WARNING:UNSPECIFIED_INT: Prefer 'unsigned int' to bare use of 'unsigned'
#240: FILE: drivers/gpu/drm/scheduler/sched_fence.c:173:
+   unsigned seq;

-:336: WARNING:AVOID_BUG: Avoid crashing the kernel - try using WARN_ON & 
recovery code rather than BUG() or BUG_ON()
#336: FILE: drivers/gpu/drm/scheduler/sched_main.c:623:
+   BUG_ON(!entity);

-:405: CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#405: FILE: include/drm/gpu_scheduler.h:391:
+struct drm_sched_fence *drm_sched_fence_alloc(

-:413: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 3 warnings, 1 checks, 248 lines checked
b5c679582c27 drm/msm: Improve drm/sched point of no return rules
-:81: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 37 lines checked
b4bf87348863 drm/sched: Barriers are needed for entity->last_scheduled
-:88: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 43 lines checked
4bfa4c96b80c drm/sched: Add dependency tracking
-:195: CHECK:LINE_SPACING: Please don't use multiple blank lines
#195: FILE: drivers/gpu/drm/scheduler/sched_main.c:729:
+
+

-:271: WARNING:TYPO_SPELLING: 'ommitted' may be misspelled - perhaps 'omitted'?
#271: FILE: include/drm/gpu_scheduler.h:244:
+* drm_sched_job_add_implicit_dependencies() this can be ommitted and
 

-:286: CHECK:LINE_SPACING: Please don't use multiple blank lines
#286: FILE: include/drm/gpu_scheduler.h:378:
+
+

-:289: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 2 warnings, 2 checks, 230 lines checked
fcd4fea4cae3 drm/sched: drop entity parameter from drm_sched_push_job
-:228: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 110 lines checked
ea09670e5838 drm/sched: improve docs around drm_sched_entity
-:17: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ 
chars of sha1> ("")' - ie: 'commit 620e762f9a98 ("drm/scheduler: 
move entity handling into separate file")'
#17: 
  move here: 620e762f9a98 ("drm/scheduler: move entity handling into

-:413: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 1 errors, 1 warnings, 0 checks, 346 lines checked
e31bfc2e79ff drm/panfrost: use scheduler dependency tracking
-:215: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 158 lines checked
c8f3842d2afd drm/lima: use scheduler dependency tracking
-:119: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 75 lines checked
563fd199edf5 drm/v3d: Move drm_sched_job_init to v3d_job_init
-:344: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 288 lines checked
79d00ee75ee5 drm/v3d: Use scheduler dependency handling
-:207: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1 warnings, 0 checks, 162 lines checked
164bbe5be15b drm/etnaviv: Use scheduler dependency handling
-:13: WARNING:REPEATED_WORD: Possible repeated word: 'to'
#13: 
I wanted to to in the previous round (and did, for all other drivers).

-:122: WARNING:LINE_SPACING: Missing a blank line after declarations
#122: FILE: drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c:552:
+   struct dma_fence *in_fence = 
sync_file_get_fence(args->fence_fd);
+   if (!in_fence) {

-:297: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 3 warnings, 0 checks, 243 lines checked
2e8edd04f89c drm/msm: Use scheduler dependency handling
-:132: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address 
mismatch: 'From: Daniel Vetter ' != 'Signed-off-by: 
Daniel Vetter '

total: 0 errors, 1

Re: [Intel-gfx] [PATCH v8 5/7] drm: avoid circular locks in drm_mode_object_find

2021-08-26 Thread Daniel Vetter

On Thu, Aug 26, 2021 at 10:01:20AM +0800, Desmond Cheong Zhi Xi wrote:
> __drm_mode_object_find checks if the given drm file holds the required
> lease on an object by calling _drm_lease_held. _drm_lease_held in turn
> uses drm_file_get_master to access drm_file.master.
> 
> However, in a future patch, the drm_file.master_lookup_lock in
> drm_file_get_master will be replaced by drm_device.master_rwsem. This
> is an issue for two reasons:
> 
> 1. master_rwsem is sometimes already held when __drm_mode_object_find
> is called, which leads to recursive locks on master_rwsem
> 
> 2. drm_mode_object_find is sometimes called with the modeset_mutex
> held, which leads to an inversion of the master_rwsem -->
> modeset_mutex lock hierarchy
> 
> To fix this, we make __drm_mode_object_find the locked version of
> drm_mode_object_find, and wrap calls to __drm_mode_object_find with
> locks on master_rwsem. This allows us to safely access drm_file.master
> in _drm_lease_held (__drm_mode_object_find is its only caller) without
> the use of drm_file_get_master.
> 
> Functions that already lock master_rwsem are modified to call
> __drm_mode_object_find, whereas functions that haven't locked
> master_rwsem should call drm_mode_object_find. These two options
> allow us to grab master_rwsem before modeset_mutex (such as in
> drm_mode_get_obj_get_properties_ioctl).
> 
> This new rule requires more extensive changes to three functions:
> drm_connector_lookup, drm_crtc_find, and drm_plane_find. These
> functions are only sometimes called with master_rwsem held. Hence, we
> have to further split them into locked and unlocked versions that call
> __drm_mode_object_find and drm_mode_object_find respectively.

I think the approach looks good, but the naming isn't so great. Usually a __
prefix means "do not call directly, this is only exported for static
inline and other helpers". For these the usual rule is to add a _locked or
_unlocked suffix. I'd leave the normal _find functions as-is (since those
take the lock themselves), and annotate the _locked ones.

Also same for the other lookup helpers.
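
To spell out the naming I have in mind, a small userspace sketch. The toy_* names and the counter used as a lock are made up for illustration, and the real lookups obviously do more than scan an array.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_OBJECTS 4

struct toy_registry {
	int lock_depth;			/* toy lock: > 0 means held */
	int ids[TOY_MAX_OBJECTS];
	size_t count;
};

/* _locked suffix: the caller must already hold the registry lock. */
static int *toy_object_find_locked(struct toy_registry *reg, int id)
{
	assert(reg->lock_depth > 0);
	for (size_t i = 0; i < reg->count; i++)
		if (reg->ids[i] == id)
			return &reg->ids[i];
	return NULL;
}

/* Plain name: takes the lock itself, so most callers use this one. */
static int *toy_object_find(struct toy_registry *reg, int id)
{
	int *obj;

	reg->lock_depth++;		/* lock */
	obj = toy_object_find_locked(reg, id);
	reg->lock_depth--;		/* unlock */
	return obj;
}
```

Existing callers keep using the plain name unchanged, and only the few call sites that already hold the lock switch to the _locked variant.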

> 
> Signed-off-by: Desmond Cheong Zhi Xi 
> ---
>  drivers/gpu/drm/drm_atomic_uapi.c|  7 ++---
>  drivers/gpu/drm/drm_color_mgmt.c |  2 +-
>  drivers/gpu/drm/drm_crtc.c   |  5 ++--
>  drivers/gpu/drm/drm_framebuffer.c|  2 +-
>  drivers/gpu/drm/drm_lease.c  | 21 +--
>  drivers/gpu/drm/drm_mode_object.c| 28 +---
>  drivers/gpu/drm/drm_plane.c  |  8 +++---
>  drivers/gpu/drm/drm_property.c   |  6 ++---
>  drivers/gpu/drm/i915/display/intel_overlay.c |  2 +-
>  drivers/gpu/drm/i915/display/intel_sprite.c  |  2 +-
>  drivers/gpu/drm/vmwgfx/vmwgfx_kms.c  |  2 +-
>  include/drm/drm_connector.h  | 23 
>  include/drm/drm_crtc.h   | 22 +++
>  include/drm/drm_mode_object.h|  3 +++
>  include/drm/drm_plane.h  | 20 ++
>  15 files changed, 118 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
> b/drivers/gpu/drm/drm_atomic_uapi.c
> index 909f31833181..cda9a501cf74 100644
> --- a/drivers/gpu/drm/drm_atomic_uapi.c
> +++ b/drivers/gpu/drm/drm_atomic_uapi.c
> @@ -557,7 +557,7 @@ static int drm_atomic_plane_set_property(struct drm_plane 
> *plane,
>   return -EINVAL;
>  
>   } else if (property == config->prop_crtc_id) {
> - struct drm_crtc *crtc = drm_crtc_find(dev, file_priv, val);
> + struct drm_crtc *crtc = __drm_crtc_find(dev, file_priv, val);
>  
>   if (val && !crtc)
>   return -EACCES;
> @@ -709,7 +709,7 @@ static int drm_atomic_connector_set_property(struct 
> drm_connector *connector,
>   int ret;
>  
>   if (property == config->prop_crtc_id) {
> - struct drm_crtc *crtc = drm_crtc_find(dev, file_priv, val);
> + struct drm_crtc *crtc = __drm_crtc_find(dev, file_priv, val);
>  
>   if (val && !crtc)
>   return -EACCES;
> @@ -1385,7 +1385,8 @@ int drm_mode_atomic_ioctl(struct drm_device *dev,
>   goto out;
>   }
>  
> - obj = drm_mode_object_find(dev, file_priv, obj_id, 
> DRM_MODE_OBJECT_ANY);
> + obj = __drm_mode_object_find(dev, file_priv, obj_id,
> +  DRM_MODE_OBJECT_ANY);
>   if (!obj) {
>   ret = -ENOENT;
>   goto out;
> diff --git a/drivers/gpu/drm/drm_color_mgmt.c 
> b/drivers/gpu/drm/drm_color_mgmt.c
> index bb14f488c8f6..9dcb2ccca3ab 100644
> --- a/drivers/gpu/drm/drm_color_mgmt.c
> +++ b/drivers/gpu/drm/drm_color_mgmt.c
> @@ -365,7 +365,7 @@ int drm_mode_gamma_set_ioctl(struct drm_device *dev,
>   if (!drm_core_check_feature(dev, DRIVER_MODESET))
>   return

Re: [Intel-gfx] [PATCH v2] drm/i915/gt: Register the migrate contexts with their engines

2021-08-26 Thread Tvrtko Ursulin




On 26/08/2021 11:45, Thomas Hellström wrote:
> Pinned contexts, like the migrate contexts need reset after resume
> since their context image may have been lost. Also the GuC needs to
> register pinned contexts.

So kernel context can get corrupt because we park the GPU with it
active. Blitter context for a different reason - which is that it is
used to copy itself over to smem, no?

If that is correct, then why bother copying the blitter context in the
first place and not just always re-create it on resume?

That would be along the lines of marking the backing store as "dontneed"
(however the exact mechanics of that look these days) so suspend can
skip them.

> Add a list to struct intel_engine_cs where we add all pinned contexts on
> creation, and traverse that list at resume time to reset the pinned
> contexts.
>
> This fixes the kms_pipe_crc_basic@suspend-read-crc-pipe-a selftest for now,
> but proper LMEM backup / restore is needed for full suspend functionality.
> However, note that even with full LMEM backup / restore it may be
> desirable to keep the reset since backing up the migrate context images
> must happen using memcpy() after the migrate context has become inactive,
> and for performance- and other reasons we want to avoid memcpy() from
> LMEM.

Hm I guess this talks about the issue - so are these images migrated at
all today or not?

Regards,

Tvrtko

> Also traverse the list at guc_init_lrc_mapping() calling
> guc_kernel_context_pin() for the pinned contexts, like is already done
> for the kernel context.
>
> v2:
> - Don't reset the contexts on each __engine_unpark() but rather at
>   resume time (Chris Wilson).
>
> Cc: Tvrtko Ursulin 
> Cc: Matthew Auld 
> Cc: Maarten Lankhorst 
> Cc: Brost Matthew 
> Cc: Chris Wilson 
> Signed-off-by: Thomas Hellström 
> ---
  drivers/gpu/drm/i915/gt/intel_context_types.h |  8 +++
  drivers/gpu/drm/i915/gt/intel_engine_cs.c |  4 
  drivers/gpu/drm/i915/gt/intel_engine_pm.c | 23 +++
  drivers/gpu/drm/i915/gt/intel_engine_pm.h |  2 ++
  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  7 ++
  drivers/gpu/drm/i915/gt/intel_gt_pm.c |  3 +++
  drivers/gpu/drm/i915/gt/mock_engine.c |  1 +
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 10 +---
  8 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index e54351a170e2..a63631ea0ec4 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -152,6 +152,14 @@ struct intel_context {
/** sseu: Control eu/slice partitioning */
struct intel_sseu sseu;
  
+	/**
+* pinned_contexts_link: List link for the engine's pinned contexts.
+* This is only used if this is a perma-pinned kernel context and
+* the list is assumed to only be manipulated during driver load
+* or unload time so no mutex protection currently.
+*/
+   struct list_head pinned_contexts_link;
+
u8 wa_bb_page; /* if set, page num reserved for context workarounds */
  
  	struct {

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 332efea696a5..c606a4714904 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -320,6 +320,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
  
  	BUILD_BUG_ON(BITS_PER_TYPE(engine->mask) < I915_NUM_ENGINES);
  
+	INIT_LIST_HEAD(>pinned_contexts_list);
engine->id = id;
engine->legacy_idx = INVALID_ENGINE;
engine->mask = BIT(id);
@@ -875,6 +876,8 @@ intel_engine_create_pinned_context(struct intel_engine_cs 
*engine,
return ERR_PTR(err);
}
  
+	list_add_tail(>pinned_contexts_link, >pinned_contexts_list);
+
/*
 * Give our perma-pinned kernel timelines a separate lockdep class,
 * so that we can use them from within the normal user timelines
@@ -897,6 +900,7 @@ void intel_engine_destroy_pinned_context(struct 
intel_context *ce)
list_del(>timeline->engine_link);
mutex_unlock(>vm->mutex);
  
+	list_del(>pinned_contexts_link);
intel_context_unpin(ce);
intel_context_put(ce);
  }
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 1f07ac4e0672..dacd62773735 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -298,6 +298,29 @@ void intel_engine_init__pm(struct intel_engine_cs *engine)
intel_engine_init_heartbeat(engine);
  }
  
+/**
+ * intel_engine_reset_pinned_contexts - Reset the pinned contexts of
+ * an engine.
+ * @engine: The engine whose pinned contexts we want to reset.
+ *
+ * Typically the pinned context LMEM images lose or get their content
+ * corrupted on suspend. This function resets their images.
+ */
+void

Re: [Intel-gfx] [PATCH v8 4/7] drm: avoid races with modesetting rights

2021-08-26 Thread Daniel Vetter

On Thu, Aug 26, 2021 at 10:01:19AM +0800, Desmond Cheong Zhi Xi wrote:
> In drm_client_modeset.c and drm_fb_helper.c,
> drm_master_internal_{acquire,release} are used to avoid races with DRM
> userspace. These functions hold onto drm_device.master_rwsem while
> committing, and bail if there's already a master.
> 
> However, there are other places where modesetting rights can race. A
> time-of-check-to-time-of-use error can occur if an ioctl that changes
> the modeset has its rights revoked after it validates its permissions,
> but before it completes.
> 
> There are four places where modesetting permissions can change:
> 
> - DROP_MASTER ioctl removes rights for a master and its leases
> 
> - REVOKE_LEASE ioctl revokes rights for a specific lease
> 
> - SET_MASTER ioctl sets the device master if the master role hasn't
> been acquired yet
> 
> - drm_open which can create a new master for a device if one does not
> currently exist
> 
> These races can be avoided using drm_device.master_rwsem: users that
> perform modesetting should hold a read lock on the new
> drm_device.master_rwsem, and users that change these permissions
> should hold a write lock.
> 
> To avoid deadlocks with master_rwsem, for ioctls that need to check
> for modesetting permissions, but also need to hold a write lock on
> master_rwsem to protect some other attribute (or recurses to some
> function that holds a write lock, like drm_mode_create_lease_ioctl
> which eventually calls drm_master_open), we remove the DRM_MASTER flag
> and push the master_rwsem lock and permissions check into the ioctl.
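
The "check inside the lock" pattern described here can be sketched in plain C. This is a simplified model and not the DRM code: struct toy_master_state and toy_setversion() are invented names, and a plain integer stands in for holding master_rwsem for write.

```c
#include <errno.h>
#include <stdbool.h>

struct toy_master_state {
	int write_locked;	/* toy lock: 1 while held for write */
	bool is_master;
	int version;
};

/* The permission check and the update happen inside one locked section. */
static int toy_setversion(struct toy_master_state *st, int version)
{
	int ret = 0;

	st->write_locked = 1;		/* models down_write(master_rwsem) */
	if (!st->is_master) {
		ret = -EACCES;
		goto unlock;
	}
	st->version = version;

unlock:
	st->write_locked = 0;		/* models up_write(master_rwsem) */
	return ret;
}
```

Because nothing can revoke is_master between the check and the write while the lock is held, the time-of-check-to-time-of-use window is closed for this path.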
> 
> Reported-by: Daniel Vetter 
> Signed-off-by: Desmond Cheong Zhi Xi 
> ---
>  drivers/gpu/drm/drm_auth.c  |  4 
>  drivers/gpu/drm/drm_ioctl.c | 20 +++-
>  drivers/gpu/drm/drm_lease.c | 35 ---
>  include/drm/drm_device.h|  5 +
>  4 files changed, 48 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
> index 73ade0513ccb..65065f7e1499 100644
> --- a/drivers/gpu/drm/drm_auth.c
> +++ b/drivers/gpu/drm/drm_auth.c
> @@ -120,6 +120,10 @@ int drm_authmagic(struct drm_device *dev, void *data,
>   DRM_DEBUG("%u\n", auth->magic);
>  
>   down_write(>master_rwsem);
> + if (unlikely(!drm_is_current_master(file_priv))) {
> + up_write(>master_rwsem);
> + return -EACCES;
> + }
>   file = idr_find(_priv->master->magic_map, auth->magic);
>   if (file) {
>   file->authenticated = 1;
> diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
> index 158629d88319..8bea39ffc5c0 100644
> --- a/drivers/gpu/drm/drm_ioctl.c
> +++ b/drivers/gpu/drm/drm_ioctl.c
> @@ -386,6 +386,10 @@ static int drm_setversion(struct drm_device *dev, void 
> *data, struct drm_file *f
>   int if_version, retcode = 0;
>  
>   down_write(>master_rwsem);
> + if (unlikely(!drm_is_current_master(file_priv))) {
> + retcode = -EACCES;
> + goto unlock;
> + }
>   if (sv->drm_di_major != -1) {
>   if (sv->drm_di_major != DRM_IF_MAJOR ||
>   sv->drm_di_minor < 0 || sv->drm_di_minor > DRM_IF_MINOR) {
> @@ -420,8 +424,9 @@ static int drm_setversion(struct drm_device *dev, void 
> *data, struct drm_file *f
>   sv->drm_di_minor = DRM_IF_MINOR;
>   sv->drm_dd_major = dev->driver->major;
>   sv->drm_dd_minor = dev->driver->minor;
> - up_write(>master_rwsem);
>  
> +unlock:
> + up_write(>master_rwsem);
>   return retcode;
>  }
>  
> @@ -574,12 +579,12 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
>   DRM_IOCTL_DEF(DRM_IOCTL_GET_STATS, drm_getstats, 0),
>   DRM_IOCTL_DEF(DRM_IOCTL_GET_CAP, drm_getcap, DRM_RENDER_ALLOW),
>   DRM_IOCTL_DEF(DRM_IOCTL_SET_CLIENT_CAP, drm_setclientcap, 0),
> - DRM_IOCTL_DEF(DRM_IOCTL_SET_VERSION, drm_setversion, DRM_MASTER),
> + DRM_IOCTL_DEF(DRM_IOCTL_SET_VERSION, drm_setversion, 0),

Random bikeshed, if you're bored: In newer code we've given ioctl
callbacks an _ioctl suffix, so they're easier to spot. Could do that in a
follow-up if you want.

>  
>   DRM_IOCTL_DEF(DRM_IOCTL_SET_UNIQUE, drm_invalid_op, 
> DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
>   DRM_IOCTL_DEF(DRM_IOCTL_BLOCK, drm_noop, 
> DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
>   DRM_IOCTL_DEF(DRM_IOCTL_UNBLOCK, drm_noop, 
> DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
> - DRM_IOCTL_DEF(DRM_IOCTL_AUTH_MAGIC, drm_authmagic, DRM_MASTER),
> + DRM_IOCTL_DEF(DRM_IOCTL_AUTH_MAGIC, drm_authmagic, 0),
>  
>   DRM_LEGACY_IOCTL_DEF(DRM_IOCTL_ADD_MAP, drm_legacy_addmap_ioctl, 
> DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
>   DRM_LEGACY_IOCTL_DEF(DRM_IOCTL_RM_MAP, drm_legacy_rmmap_ioctl, 
> DRM_AUTH),
> @@ -706,10 +711,10 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
> DRM_RENDER_ALLOW),
>   DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, 
> 0),
>

Re: [Intel-gfx] [PATCH v8 1/7] drm: fix null ptr dereference in drm_master_release

2021-08-26 Thread Daniel Vetter

On Thu, Aug 26, 2021 at 07:53:58PM +0800, Desmond Cheong Zhi Xi wrote:
> On 26/8/21 5:53 pm, Daniel Vetter wrote:
> > On Thu, Aug 26, 2021 at 10:01:16AM +0800, Desmond Cheong Zhi Xi wrote:
> > > drm_master_release can be called on a drm_file without a master, which
> > > results in a null ptr dereference of file_priv->master->magic_map. The
> > > three cases are:
> > > 
> > > 1. Error path in drm_open_helper
> > >drm_open():
> > >  drm_open_helper():
> > >drm_master_open():
> > >  drm_new_set_master(); <--- returns -ENOMEM,
> > > drm_file.master not set
> > >drm_file_free():
> > >  drm_master_release(); <--- NULL ptr dereference
> > > (file_priv->master->magic_map)
> > > 
> > > 2. Error path in mock_drm_getfile
> > >mock_drm_getfile():
> > >  anon_inode_getfile(); <--- returns error, drm_file.master not set
> > >  drm_file_free():
> > >drm_master_release(); <--- NULL ptr dereference
> > >   (file_priv->master->magic_map)
> > > 
> > > 3. In drm_client_close, as drm_client_open doesn't set up a master
> > > 
> > > drm_file.master is set up in drm_open_helper through the call to
> > > drm_master_open, so we mirror it with a call to drm_master_release in
> > > drm_close_helper, and remove drm_master_release from drm_file_free to
> > > avoid the null ptr dereference.
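
For illustration, the resulting open/close symmetry looks roughly like this in a userspace model. The toy_* types and functions are hypothetical stand-ins and the real teardown does far more; the point is only that the common free path never touches the master pointer.

```c
#include <stdbool.h>
#include <stdlib.h>

struct toy_master {
	int refcount;
};

struct toy_drm_file {
	struct toy_master *master;	/* stays NULL if open failed early */
	bool is_primary;
};

/* Only reached from the close path, which mirrors a successful open. */
static void toy_master_release(struct toy_drm_file *f)
{
	f->master->refcount--;		/* would crash if master were NULL */
	f->master = NULL;
}

/* Common teardown: never touches f->master, so it is safe on error paths. */
static void toy_file_free(struct toy_drm_file *f)
{
	free(f);
}

/* Close path: release the master first, then do the common teardown. */
static void toy_close_helper(struct toy_drm_file *f)
{
	if (f->is_primary)
		toy_master_release(f);
	toy_file_free(f);
}
```

Error paths and in-kernel clients call only toy_file_free(), so a file that never got a master is freed without dereferencing it.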
> > > 
> > > Signed-off-by: Desmond Cheong Zhi Xi 
> > 
> > Reviewed-by: Daniel Vetter 
> > 
> > I guess we should also have a cc: stable on this one? I think this bug
> > existed since pretty much forever, but maybe more prominent with the
> > drm_client stuff added a while ago.
> > -Daniel
> > 
> 
> Thanks for the reviews, Daniel.
> 
> Took a closer look. I think if we cc: stable, this fix should accompany
> commit 7eeaeb90a6a5 ("drm/file: Don't set master on in-kernel clients")
> which moves the drm_master_open out from drm_file_alloc into
> drm_open_helper.

Ah right, please reference that commit with a Fixes: line.
-Daniel

> 
> > > ---
> > >   drivers/gpu/drm/drm_file.c | 6 +++---
> > >   1 file changed, 3 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> > > index ed25168619fc..90b62f360da1 100644
> > > --- a/drivers/gpu/drm/drm_file.c
> > > +++ b/drivers/gpu/drm/drm_file.c
> > > @@ -282,9 +282,6 @@ void drm_file_free(struct drm_file *file)
> > >   drm_legacy_ctxbitmap_flush(dev, file);
> > > - if (drm_is_primary_client(file))
> > > - drm_master_release(file);
> > > -
> > >   if (dev->driver->postclose)
> > >   dev->driver->postclose(dev, file);
> > > @@ -305,6 +302,9 @@ static void drm_close_helper(struct file *filp)
> > >   list_del(_priv->lhead);
> > >   mutex_unlock(>filelist_mutex);
> > > + if (drm_is_primary_client(file_priv))
> > > + drm_master_release(file_priv);
> > > +
> > >   drm_file_free(file_priv);
> > >   }
> > > -- 
> > > 2.25.1
> > > 
> > 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [PATCH v2] drm/i915/gt: Register the migrate contexts with their engines

2021-08-26 Thread Daniel Vetter

On Thu, Aug 26, 2021 at 12:45:14PM +0200, Thomas Hellström wrote:
> Pinned contexts, like the migrate contexts need reset after resume
> since their context image may have been lost. Also the GuC needs to
> register pinned contexts.
> 
> Add a list to struct intel_engine_cs where we add all pinned contexts on
> creation, and traverse that list at resume time to reset the pinned
> contexts.
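
As a mental model of the list described above, here is a minimal userspace sketch. It is not the patch itself: the kernel code uses struct list_head, while this uses a hand-rolled singly linked list, and the toy_* names and the integer "image" are invented for illustration.

```c
#include <stddef.h>

struct toy_context {
	int image;				/* stands in for the context image */
	struct toy_context *next_pinned;	/* link in the engine's pinned list */
};

struct toy_engine {
	struct toy_context *pinned_head;
	struct toy_context **pinned_tail;	/* points at the last next_pinned slot */
};

static void toy_engine_init(struct toy_engine *e)
{
	e->pinned_head = NULL;
	e->pinned_tail = &e->pinned_head;
}

/* Called once when a pinned context is created. */
static void toy_engine_add_pinned(struct toy_engine *e, struct toy_context *ce)
{
	ce->next_pinned = NULL;
	*e->pinned_tail = ce;
	e->pinned_tail = &ce->next_pinned;
}

/* Called at resume: restore every pinned context to a known-good image. */
static int toy_engine_reset_pinned(struct toy_engine *e, int default_image)
{
	int count = 0;

	for (struct toy_context *ce = e->pinned_head; ce; ce = ce->next_pinned) {
		ce->image = default_image;
		count++;
	}
	return count;
}
```

The list is only modified at creation and destruction time, which is why the patch gets away without extra locking around it.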
> 
> This fixes the kms_pipe_crc_basic@suspend-read-crc-pipe-a selftest for now,
> but proper LMEM backup / restore is needed for full suspend functionality.
> However, note that even with full LMEM backup / restore it may be
> desirable to keep the reset since backing up the migrate context images
> must happen using memcpy() after the migrate context has become inactive,
> and for performance- and other reasons we want to avoid memcpy() from
> LMEM.
> 
> Also traverse the list at guc_init_lrc_mapping() calling
> guc_kernel_context_pin() for the pinned contexts, like is already done
> for the kernel context.
> 
> v2:
> - Don't reset the contexts on each __engine_unpark() but rather at
>   resume time (Chris Wilson).
> 
> Cc: Tvrtko Ursulin 
> Cc: Matthew Auld 
> Cc: Maarten Lankhorst 
> Cc: Brost Matthew 
> Cc: Chris Wilson 
> Signed-off-by: Thomas Hellström 

I guess it got lost, but a few weeks ago I stumbled over this and wondered
why we're even setting up a separate context, or at least why a separate vm
compared to the gt->vm we have already?

Even on chips with bazillions of copy engines the plan is that we only
reserve a single one for kernel migrations, so there's not really a need
for quite this much generality I think. Maybe check with Jon Bloomfield on
this.

IIRC I also had a few other questions on simplifying this area.
-Daniel


> ---
>  drivers/gpu/drm/i915/gt/intel_context_types.h |  8 +++
>  drivers/gpu/drm/i915/gt/intel_engine_cs.c |  4 
>  drivers/gpu/drm/i915/gt/intel_engine_pm.c | 23 +++
>  drivers/gpu/drm/i915/gt/intel_engine_pm.h |  2 ++
>  drivers/gpu/drm/i915/gt/intel_engine_types.h  |  7 ++
>  drivers/gpu/drm/i915/gt/intel_gt_pm.c |  3 +++
>  drivers/gpu/drm/i915/gt/mock_engine.c |  1 +
>  .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 10 +---
>  8 files changed, 55 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
> b/drivers/gpu/drm/i915/gt/intel_context_types.h
> index e54351a170e2..a63631ea0ec4 100644
> --- a/drivers/gpu/drm/i915/gt/intel_context_types.h
> +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
> @@ -152,6 +152,14 @@ struct intel_context {
>   /** sseu: Control eu/slice partitioning */
>   struct intel_sseu sseu;
>  
> + /**
> +  * pinned_contexts_link: List link for the engine's pinned contexts.
> +  * This is only used if this is a perma-pinned kernel context and
> +  * the list is assumed to only be manipulated during driver load
> +  * or unload time so no mutex protection currently.
> +  */
> + struct list_head pinned_contexts_link;
> +
>   u8 wa_bb_page; /* if set, page num reserved for context workarounds */
>  
>   struct {
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
> b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> index 332efea696a5..c606a4714904 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
> @@ -320,6 +320,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
> intel_engine_id id)
>  
>   BUILD_BUG_ON(BITS_PER_TYPE(engine->mask) < I915_NUM_ENGINES);
>  
> + INIT_LIST_HEAD(&engine->pinned_contexts_list);
>   engine->id = id;
>   engine->legacy_idx = INVALID_ENGINE;
>   engine->mask = BIT(id);
> @@ -875,6 +876,8 @@ intel_engine_create_pinned_context(struct intel_engine_cs 
> *engine,
>   return ERR_PTR(err);
>   }
>  
> + list_add_tail(&ce->pinned_contexts_link, &engine->pinned_contexts_list);
> +
>   /*
>* Give our perma-pinned kernel timelines a separate lockdep class,
>* so that we can use them from within the normal user timelines
> @@ -897,6 +900,7 @@ void intel_engine_destroy_pinned_context(struct 
> intel_context *ce)
>   list_del(>timeline->engine_link);
>   mutex_unlock(>vm->mutex);
>  
> + list_del(&ce->pinned_contexts_link);
>   intel_context_unpin(ce);
>   intel_context_put(ce);
>  }
> diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
> b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> index 1f07ac4e0672..dacd62773735 100644
> --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
> @@ -298,6 +298,29 @@ void intel_engine_init__pm(struct intel_engine_cs 
> *engine)
>   intel_engine_init_heartbeat(engine);
>  }
>  
> +/**
> + * intel_engine_reset_pinned_contexts - Reset the pinned contexts of
> + * an engine.
> + * @engine: The engine whose pinned contexts we want to reset.
> + *
> + * Typically the pinned context LMEM images

Re: [Intel-gfx] [PATCH 4/7] drm/i915/bios: use alternate aux channel directly from child data

2021-08-26 Thread Nautiyal, Ankit K




On 8/24/2021 7:04 PM, Jani Nikula wrote:

Avoid extra caching of the data.

Cc: José Roberto de Souza 
Signed-off-by: Jani Nikula 
---
  drivers/gpu/drm/i915/display/intel_bios.c | 26 +++
  drivers/gpu/drm/i915/i915_drv.h   |  1 -
  2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_bios.c 
b/drivers/gpu/drm/i915/display/intel_bios.c
index 10b2beddc121..674f1424fcc2 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -1565,28 +1565,29 @@ static enum port get_port_by_aux_ch(struct 
drm_i915_private *i915, u8 aux_ch)
for_each_port(port) {
info = &i915->vbt.ddi_port_info[port];
  
-		if (info->devdata && aux_ch == info->alternate_aux_channel)

+   if (info->devdata && aux_ch == info->devdata->child.aux_channel)
return port;
}
  
  	return PORT_NONE;

  }
  
-static void sanitize_aux_ch(struct drm_i915_private *i915,

+static void sanitize_aux_ch(struct intel_bios_encoder_data *devdata,
enum port port)
  {
-   struct ddi_vbt_port_info *info = >vbt.ddi_port_info[port];
+   struct drm_i915_private *i915 = devdata->i915;
+   struct ddi_vbt_port_info *info;
struct child_device_config *child;
enum port p;
  
-	p = get_port_by_aux_ch(i915, info->alternate_aux_channel);

+   p = get_port_by_aux_ch(i915, devdata->child.aux_channel);
if (p == PORT_NONE)
return;
  
  	drm_dbg_kms(&i915->drm,

"port %c trying to use the same AUX CH (0x%x) as port %c, "
"disabling port %c DP support\n",
-   port_name(port), info->alternate_aux_channel,
+   port_name(port), devdata->child.aux_channel,
port_name(p), port_name(p));
  
  	/*

@@ -1602,7 +1603,7 @@ static void sanitize_aux_ch(struct drm_i915_private *i915,
child = >devdata->child;
  
  	child->device_type &= ~DEVICE_TYPE_DISPLAYPORT_OUTPUT;

-   info->alternate_aux_channel = 0;
+   child->aux_channel = 0;
  }
  
  static const u8 cnp_ddc_pin_map[] = {

@@ -1980,11 +1981,8 @@ static void parse_ddi_port(struct drm_i915_private *i915,
}
}
  
-	if (is_dp) {

-   info->alternate_aux_channel = child->aux_channel;
-
-   sanitize_aux_ch(i915, port);
-   }
+   if (is_dp)
+   sanitize_aux_ch(devdata, port);
  
  	hdmi_level_shift = _intel_bios_hdmi_level_shift(devdata);

if (hdmi_level_shift >= 0) {
@@ -2863,7 +2861,7 @@ enum aux_ch intel_bios_port_aux_ch(struct 
drm_i915_private *i915,
>vbt.ddi_port_info[port];
enum aux_ch aux_ch;
  
-	if (!info->alternate_aux_channel) {

+   if (!info->devdata->child.aux_channel) {


Hi Jani,

The series and the change make sense to me.

From the CI results, it seems that configurations with an LVDS panel 
connected are hitting issues here.


Apparently info->devdata is not set in this case. I guess that 
parse_ddi_port() returns early, before info->devdata gets set.


I think without the patch, this situation is not encountered because 
'info->alternate_aux_channel' is initialized to 0.


With this change, perhaps we should check for 'info->devdata' before 
checking for info->devdata->child.aux_channel.


(This will translate to checking for 'devdata' in the final patch as it 
removes ddi_port_info).
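
As a rough illustration of the check suggested above, here is a minimal
sketch in plain C. The struct definitions are cut-down stand-ins rather
than the real i915 types, and port_vbt_aux_channel() is a hypothetical
helper name that does not exist in the driver. It assumes that reporting
0 for a port without devdata is acceptable, which matches the old
behaviour where alternate_aux_channel was zero-initialized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in types for illustration only; not the real i915 definitions. */
struct child_device_config {
	uint8_t aux_channel;
};

struct intel_bios_encoder_data {
	struct child_device_config child;
};

struct ddi_vbt_port_info {
	/* NULL if the port has no VBT child device. */
	const struct intel_bios_encoder_data *devdata;
};

/*
 * Hypothetical helper: return the VBT AUX channel for a port, or 0 when
 * the port has no child device data, instead of dereferencing NULL.
 */
static uint8_t port_vbt_aux_channel(const struct ddi_vbt_port_info *info)
{
	if (!info->devdata)
		return 0;

	return info->devdata->child.aux_channel;
}
```

With a helper like this, the "!info->devdata->child.aux_channel" test in
intel_bios_port_aux_ch() could become "!port_vbt_aux_channel(info)", so
the LVDS case would take the default mapping path as before.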


Hope it helps.

Regards,

Ankit



aux_ch = (enum aux_ch)port;
  
  		drm_dbg_kms(>drm,

@@ -2879,7 +2877,7 @@ enum aux_ch intel_bios_port_aux_ch(struct 
drm_i915_private *i915,
 * ADL-S VBT uses PHY based mapping. Combo PHYs A,B,C,D,E
 * map to DDI A,TC1,TC2,TC3,TC4 respectively.
 */
-   switch (info->alternate_aux_channel) {
+   switch (info->devdata->child.aux_channel) {
case DP_AUX_A:
aux_ch = AUX_CH_A;
break;
@@ -2940,7 +2938,7 @@ enum aux_ch intel_bios_port_aux_ch(struct 
drm_i915_private *i915,
aux_ch = AUX_CH_I;
break;
default:
-   MISSING_CASE(info->alternate_aux_channel);
+   MISSING_CASE(info->devdata->child.aux_channel);
aux_ch = AUX_CH_A;
break;
}
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a0dead9f9222..91097526cd96 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -640,7 +640,6 @@ struct ddi_vbt_port_info {
/* Non-NULL if port present. */
struct intel_bios_encoder_data *devdata;
  
-	u8 alternate_aux_channel;

u8 alternate_ddc_pin;
  };

[Intel-gfx] ✓ Fi.CI.IGT: success for Enable mipi dsi on XELPD (rev3)

2021-08-26 Thread Patchwork

== Series Details ==

Series: Enable mipi dsi on XELPD (rev3)
URL   : https://patchwork.freedesktop.org/series/93917/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10522_full -> Patchwork_20898_full


Summary
---

  **SUCCESS**

  No regressions found.

  

Known issues


  Here are the changes found in Patchwork_20898_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_ctx_isolation@preservation-s3@bcs0:
- shard-apl:  [PASS][1] -> [DMESG-WARN][2] ([i915#180]) +1 similar 
issue
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-apl6/igt@gem_ctx_isolation@preservation...@bcs0.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/shard-apl6/igt@gem_ctx_isolation@preservation...@bcs0.html

  * igt@gem_ctx_persistence@legacy-engines-mixed-process:
- shard-snb:  NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#1099]) +1 
similar issue
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/shard-snb6/igt@gem_ctx_persiste...@legacy-engines-mixed-process.html

  * igt@gem_ctx_sseu@mmap-args:
- shard-tglb: NOTRUN -> [SKIP][4] ([i915#280])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/shard-tglb8/igt@gem_ctx_s...@mmap-args.html

  * igt@gem_eio@in-flight-contexts-1us:
- shard-tglb: [PASS][5] -> [TIMEOUT][6] ([i915#3063])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-tglb1/igt@gem_...@in-flight-contexts-1us.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/shard-tglb1/igt@gem_...@in-flight-contexts-1us.html

  * igt@gem_exec_fair@basic-deadline:
- shard-kbl:  [PASS][7] -> [FAIL][8] ([i915#2846])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-kbl3/igt@gem_exec_f...@basic-deadline.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/shard-kbl7/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-flow@rcs0:
- shard-tglb: [PASS][9] -> [FAIL][10] ([i915#2842]) +1 similar issue
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-tglb6/igt@gem_exec_fair@basic-f...@rcs0.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/shard-tglb1/igt@gem_exec_fair@basic-f...@rcs0.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-tglb: NOTRUN -> [FAIL][11] ([i915#2842])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/shard-tglb5/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-none@vecs0:
- shard-kbl:  [PASS][12] -> [FAIL][13] ([i915#2842])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-kbl2/igt@gem_exec_fair@basic-n...@vecs0.html
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/shard-kbl1/igt@gem_exec_fair@basic-n...@vecs0.html
- shard-apl:  [PASS][14] -> [FAIL][15] ([i915#2842] / [i915#3468])
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-apl6/igt@gem_exec_fair@basic-n...@vecs0.html
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/shard-apl6/igt@gem_exec_fair@basic-n...@vecs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-iclb: NOTRUN -> [FAIL][16] ([i915#2842]) +1 similar issue
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/shard-iclb2/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_fair@basic-pace@vecs0:
- shard-kbl:  [PASS][17] -> [SKIP][18] ([fdo#109271]) +1 similar 
issue
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-kbl4/igt@gem_exec_fair@basic-p...@vecs0.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/shard-kbl1/igt@gem_exec_fair@basic-p...@vecs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-glk:  [PASS][19] -> [FAIL][20] ([i915#2842])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-glk5/igt@gem_exec_fair@basic-throt...@rcs0.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/shard-glk7/igt@gem_exec_fair@basic-throt...@rcs0.html
- shard-iclb: [PASS][21] -> [FAIL][22] ([i915#2849])
   [21]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-iclb4/igt@gem_exec_fair@basic-throt...@rcs0.html
   [22]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/shard-iclb8/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_exec_params@secure-non-master:
- shard-tglb: NOTRUN -> [SKIP][23] ([fdo#112283])
   [23]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/shard-tglb8/igt@gem_exec_par...@secure-non-master.html

  * igt@gem_pread@exhaustion:
- shard-snb:  NOTRUN -> [WARN][24] ([i915#2658])
   [24]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/shard-snb7/igt@gem_pr...@exhaustion.html

  * igt@gem_render_copy@yf-tiled-to-vebox-linear:
-

Re: [Intel-gfx] [PATCH v8 1/7] drm: fix null ptr dereference in drm_master_release

2021-08-26 Thread Desmond Cheong Zhi Xi


On 26/8/21 5:53 pm, Daniel Vetter wrote:

On Thu, Aug 26, 2021 at 10:01:16AM +0800, Desmond Cheong Zhi Xi wrote:

drm_master_release can be called on a drm_file without a master, which
results in a null ptr dereference of file_priv->master->magic_map. The
three cases are:

1. Error path in drm_open_helper
   drm_open():
 drm_open_helper():
   drm_master_open():
 drm_new_set_master(); <--- returns -ENOMEM,
drm_file.master not set
   drm_file_free():
 drm_master_release(); <--- NULL ptr dereference
(file_priv->master->magic_map)

2. Error path in mock_drm_getfile
   mock_drm_getfile():
 anon_inode_getfile(); <--- returns error, drm_file.master not set
 drm_file_free():
   drm_master_release(); <--- NULL ptr dereference
  (file_priv->master->magic_map)

3. In drm_client_close, as drm_client_open doesn't set up a master

drm_file.master is set up in drm_open_helper through the call to
drm_master_open, so we mirror it with a call to drm_master_release in
drm_close_helper, and remove drm_master_release from drm_file_free to
avoid the null ptr dereference.

Signed-off-by: Desmond Cheong Zhi Xi 


Reviewed-by: Daniel Vetter 

I guess we should also have a cc: stable on this one? I think this bug
existed since pretty much forever, but maybe more prominent with the
drm_client stuff added a while ago.
-Daniel



Thanks for the reviews, Daniel.

Took a closer look. I think if we cc: stable, this fix should accompany 
commit 7eeaeb90a6a5 ("drm/file: Don't set master on in-kernel clients") 
which moves the drm_master_open out from drm_file_alloc into 
drm_open_helper.
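
To make the pairing concrete, here is a minimal user-space sketch of the
idea, assuming simplified stand-in types. The function names only echo
the DRM ones, this is not the actual drm_file.c code, and the
drm_is_primary_client() check is left out for brevity. The point is
that the release lives in the close path that mirrors the open path, so
the shared free path never has to assume a master was attached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; field and function names are illustrative. */
struct master {
	int refs;
};

struct file_priv {
	struct master *master;	/* only set by open_helper() */
	bool freed;
};

static struct master the_master;

/* Plays the role of drm_open_helper(): the only place a master is attached. */
static void open_helper(struct file_priv *f)
{
	f->master = &the_master;
	the_master.refs++;
}

/* Plays the role of drm_file_free(): must not assume f->master is set. */
static void file_free(struct file_priv *f)
{
	f->freed = true;
}

/* Plays the role of drm_close_helper(): releases what open_helper() set up. */
static void close_helper(struct file_priv *f)
{
	f->master->refs--;
	f->master = NULL;
	file_free(f);
}
```

An in-kernel client or an error path that never went through
open_helper() calls file_free() directly and never touches the master
pointer.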



---
  drivers/gpu/drm/drm_file.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index ed25168619fc..90b62f360da1 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -282,9 +282,6 @@ void drm_file_free(struct drm_file *file)
  
  	drm_legacy_ctxbitmap_flush(dev, file);
  
-	if (drm_is_primary_client(file))

-   drm_master_release(file);
-
if (dev->driver->postclose)
dev->driver->postclose(dev, file);
  
@@ -305,6 +302,9 @@ static void drm_close_helper(struct file *filp)

list_del(&file_priv->lhead);
mutex_unlock(&dev->filelist_mutex);
  
+	if (drm_is_primary_client(file_priv))

+   drm_master_release(file_priv);
+
drm_file_free(file_priv);
  }
  
--

2.25.1

Re: [Intel-gfx] [GIT PULL] drm-misc + drm-intel: Add support for out-of-band hotplug notification

2021-08-26 Thread Vivi, Rodrigo

On Thu, 2021-08-26 at 10:23 +0200, Maxime Ripard wrote:
> On Wed, Aug 25, 2021 at 04:03:43PM +, Vivi, Rodrigo wrote:
> > On Tue, 2021-08-24 at 18:48 +0200, Hans de Goede wrote:
> > > Hi,
> > > 
> > > On 8/24/21 10:45 AM, Jani Nikula wrote:
> > > > On Fri, 20 Aug 2021, Hans de Goede  wrote:
> > > > > Hello drm-misc and drm-intel maintainers,
> > > > > 
> > > > > My "Add support for out-of-band hotplug notification"
> > > > > patchset:
> > > > > https://patchwork.freedesktop.org/series/93763/
> > > > > 
> > > > > Is ready for merging now, as discussed on IRC I based this
> > > > > series
> > > > > on top drm-tip and when trying to apply the i915 parts on top
> > > > > of drm-misc this fails due to conflict.
> > > > > 
> > > > > So as Jani suggested here is a pull-req for a topic-branch
> > > > > with
> > > > > the
> > > > > entire set, minus the troublesome i915 bits. Once this has
> > > > > been
> > > > > merged
> > > > > into both drm-misc-next and drm-intel-next I can push the 2
> > > > > i915
> > > > > > patches to drm-intel-next on top of the merge.
> > > > > 
> > > > > Note there are also 2 drivers/usb/typec patches in here these
> > > > > have Greg KH's Reviewed-by for merging through the drm tree,
> > > > > Since this USB code does not change all that much. I also
> > > > > checked
> > > > > and the drm-misc-next-2021-08-12 base of this tree contains
> > > > > the
> > > > > same last commit to the modified file as usb-next.
> > > > > 
> > > > > Daniel Vetter mentioned on IRC that it might be better for
> > > > > you to
> > > > > simply
> > > > > pick-up the series directly from patchwork, that is fine too
> > > > > in
> > > > > that
> > > > > case don't forget to add:
> > > > > 
> > > > > Reviewed-by: Lyude Paul 
> > > > > 
> > > > > To the entire series (given in a reply to the cover-letter)
> > > > > 
> > > > > And:
> > > > > 
> > > > > Reviewed-by: Greg Kroah-Hartman 
> > > > > 
> > > > > To the usb/typec patches (patch 7/8), this was given in reply
> > > > > to a previous posting of the series and I forgot to add this
> > > > > in the resend.
> > > > 
> > > > Since this is mostly touching drm core, I think it should be
> > > > merged
> > > > to
> > > > drm-misc-next first, and drm-intel-next after. Please let us
> > > > know.
> > > 
> > > I agree this should go to drm-misc-next first.
> > > 
> > > (I was planning on pushing this to drm-misc-next myself,
> > > but then ended up going with the topic branch because of the
> > > conflict in the i915 bits.)
> > 
> > Just to be clear and avoid confusion: This pull request does apply
> > cleanly on drm-misc-next and drm-intel-next right now.
> > 
> > I'm just waiting for drm-misc-next maintainers to pull this to drm-
> > misc-next so I can pull it to drm-intel-next.
> > 
> > Maxime, is that your round now?
> > or Thomas?
> 
> That's me, I just pushed it to drm-misc-next

Thank you!
I also pushed to drm-intel-next.

> 
> Thanks!
> Maxime

Re: [Intel-gfx] [PATCH 24/33] drm/i915/guc: Implement banned contexts for GuC submission

2021-08-26 Thread Tvrtko Ursulin




On 26/08/2021 04:49, Matthew Brost wrote:

On Wed, Aug 25, 2021 at 11:39:10AM +0100, Tvrtko Ursulin wrote:


On 27/07/2021 01:23, Matthew Brost wrote:

When using GuC submission, if a context gets banned disable scheduling
and mark all inflight requests as complete.

Cc: John Harrison 
Signed-off-by: Matthew Brost 
Reviewed-by: John Harrison 
---
   drivers/gpu/drm/i915/gem/i915_gem_context.c   |   2 +-
   drivers/gpu/drm/i915/gt/intel_context.h   |  13 ++
   drivers/gpu/drm/i915/gt/intel_context_types.h |   2 +
   drivers/gpu/drm/i915/gt/intel_reset.c |  32 +---
   .../gpu/drm/i915/gt/intel_ring_submission.c   |  20 +++
   drivers/gpu/drm/i915/gt/uc/intel_guc.h|   2 +
   .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 151 --
   drivers/gpu/drm/i915/i915_trace.h |  10 ++
   8 files changed, 195 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index e3df01a201d7..05c3ee191710 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1084,7 +1084,7 @@ static void kill_engines(struct i915_gem_engines 
*engines, bool ban)
for_each_gem_engine(ce, engines, it) {
struct intel_engine_cs *engine;
-   if (ban && intel_context_set_banned(ce))
+   if (ban && intel_context_ban(ce, NULL))
continue;
/*
diff --git a/drivers/gpu/drm/i915/gt/intel_context.h 
b/drivers/gpu/drm/i915/gt/intel_context.h
index 2ed9bf5f91a5..814d9277096a 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.h
+++ b/drivers/gpu/drm/i915/gt/intel_context.h
@@ -16,6 +16,7 @@
   #include "intel_engine_types.h"
   #include "intel_ring_types.h"
   #include "intel_timeline_types.h"
+#include "i915_trace.h"
   #define CE_TRACE(ce, fmt, ...) do {  \
const struct intel_context *ce__ = (ce);\
@@ -243,6 +244,18 @@ static inline bool intel_context_set_banned(struct 
intel_context *ce)
return test_and_set_bit(CONTEXT_BANNED, &ce->flags);
   }
+static inline bool intel_context_ban(struct intel_context *ce,
+struct i915_request *rq)
+{
+   bool ret = intel_context_set_banned(ce);
+
+   trace_intel_context_ban(ce);
+   if (ce->ops->ban)
+   ce->ops->ban(ce, rq);
+
+   return ret;
+}
+
   static inline bool
   intel_context_force_single_submission(const struct intel_context *ce)
   {
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index 035108c10b2c..57c19ee3e313 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -35,6 +35,8 @@ struct intel_context_ops {
int (*alloc)(struct intel_context *ce);
+   void (*ban)(struct intel_context *ce, struct i915_request *rq);
+
int (*pre_pin)(struct intel_context *ce, struct i915_gem_ww_ctx *ww, 
void **vaddr);
int (*pin)(struct intel_context *ce, void *vaddr);
void (*unpin)(struct intel_context *ce);
diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
b/drivers/gpu/drm/i915/gt/intel_reset.c
index 4d281bc8a38c..91200c43951f 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -22,7 +22,6 @@
   #include "intel_reset.h"
   #include "uc/intel_guc.h"
-#include "uc/intel_guc_submission.h"
   #define RESET_MAX_RETRIES 3
@@ -39,21 +38,6 @@ static void rmw_clear_fw(struct intel_uncore *uncore, 
i915_reg_t reg, u32 clr)
intel_uncore_rmw_fw(uncore, reg, clr, 0);
   }
-static void skip_context(struct i915_request *rq)
-{
-   struct intel_context *hung_ctx = rq->context;
-
-   list_for_each_entry_from_rcu(rq, &hung_ctx->timeline->requests, link) {
-   if (!i915_request_is_active(rq))
-   return;
-
-   if (rq->context == hung_ctx) {
-   i915_request_set_error_once(rq, -EIO);
-   __i915_request_skip(rq);
-   }
-   }
-}
-
   static void client_mark_guilty(struct i915_gem_context *ctx, bool banned)
   {
struct drm_i915_file_private *file_priv = ctx->file_priv;
@@ -88,10 +72,8 @@ static bool mark_guilty(struct i915_request *rq)
bool banned;
int i;
-   if (intel_context_is_closed(rq->context)) {
-   intel_context_set_banned(rq->context);
+   if (intel_context_is_closed(rq->context))
return true;
-   }
rcu_read_lock();
ctx = rcu_dereference(rq->context->gem_context);
@@ -123,11 +105,9 @@ static bool mark_guilty(struct i915_request *rq)
banned = !i915_gem_context_is_recoverable(ctx);
if (time_before(jiffies, prev_hang + CONTEXT_FAST_HANG_JIFFIES))
banned = true;
-   if (banned) {
+   if (banned)

[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/gem: Fix the mman selftest

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/i915/gem: Fix the mman selftest
URL   : https://patchwork.freedesktop.org/series/94062/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10524 -> Patchwork_20900


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/index.html

Known issues


  Here are the changes found in Patchwork_20900 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@gem_huc_copy@huc-copy:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][1] ([i915#2190])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/fi-tgl-1115g4/igt@gem_huc_c...@huc-copy.html

  * igt@i915_pm_backlight@basic-brightness:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][2] ([i915#1155])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/fi-tgl-1115g4/igt@i915_pm_backli...@basic-brightness.html

  * igt@i915_pm_rpm@module-reload:
- fi-tgl-1115g4:  NOTRUN -> [INCOMPLETE][3] ([i915#4006])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/fi-tgl-1115g4/igt@i915_pm_...@module-reload.html

  * igt@i915_selftest@live@gt_lrc:
- fi-rkl-guc: [PASS][4] -> [DMESG-WARN][5] ([i915#3958])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10524/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html

  * igt@kms_addfb_basic@too-wide:
- fi-tgl-1115g4:  NOTRUN -> [DMESG-WARN][6] ([i915#4002]) +91 similar 
issues
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/fi-tgl-1115g4/igt@kms_addfb_ba...@too-wide.html

  * igt@kms_chamelium@common-hpd-after-suspend:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][7] ([fdo#111827]) +8 similar issues
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/fi-tgl-1115g4/igt@kms_chamel...@common-hpd-after-suspend.html

  * igt@kms_force_connector_basic@force-load-detect:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][8] ([fdo#109285])
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/fi-tgl-1115g4/igt@kms_force_connector_ba...@force-load-detect.html

  * igt@kms_psr@primary_mmap_gtt:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][9] ([i915#1072]) +2 similar issues
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/fi-tgl-1115g4/igt@kms_psr@primary_mmap_gtt.html

  * igt@kms_psr@primary_page_flip:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][10] ([i915#1072] / [i915#1385])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/fi-tgl-1115g4/igt@kms_psr@primary_page_flip.html

  * igt@prime_vgem@basic-userptr:
- fi-tgl-1115g4:  NOTRUN -> [SKIP][11] ([i915#3301])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/fi-tgl-1115g4/igt@prime_v...@basic-userptr.html

  * igt@runner@aborted:
- fi-tgl-1115g4:  NOTRUN -> [FAIL][12] ([i915#2722] / [i915#3834])
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/fi-tgl-1115g4/igt@run...@aborted.html

  
 Possible fixes 

  * igt@kms_chamelium@hdmi-hpd-fast:
- fi-icl-u2:  [DMESG-WARN][13] ([i915#2203] / [i915#2868]) -> 
[PASS][14]
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10524/fi-icl-u2/igt@kms_chamel...@hdmi-hpd-fast.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20900/fi-icl-u2/igt@kms_chamel...@hdmi-hpd-fast.html

  
  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1155]: https://gitlab.freedesktop.org/drm/intel/issues/1155
  [i915#1385]: https://gitlab.freedesktop.org/drm/intel/issues/1385
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2203]: https://gitlab.freedesktop.org/drm/intel/issues/2203
  [i915#2722]: https://gitlab.freedesktop.org/drm/intel/issues/2722
  [i915#2868]: https://gitlab.freedesktop.org/drm/intel/issues/2868
  [i915#3301]: https://gitlab.freedesktop.org/drm/intel/issues/3301
  [i915#3834]: https://gitlab.freedesktop.org/drm/intel/issues/3834
  [i915#3958]: https://gitlab.freedesktop.org/drm/intel/issues/3958
  [i915#4002]: https://gitlab.freedesktop.org/drm/intel/issues/4002
  [i915#4006]: https://gitlab.freedesktop.org/drm/intel/issues/4006


Participating hosts (40 -> 34)
--

  Additional (1): fi-tgl-1115g4 
  Missing(7): fi-ilk-m540 bat-adls-5 fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 
fi-bdw-samus bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10524 -> Patchwork_20900

  CI-20190529: 20190529
  CI_DRM_10524: 5833b7a8b6de23b9b4862f74e257bde02e5c08f4 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6186: 250081b306c6fa8f95405fab6a7604f1968dd4ec @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git

Re: [Intel-gfx] [PATCH 2/4] drm/dp: use more of the extended receiver cap

2021-08-26 Thread Jani Nikula

On Wed, 25 Aug 2021, Jani Nikula  wrote:
> On Thu, 19 Aug 2021, Ville Syrjälä  wrote:
>> On Fri, Aug 13, 2021 at 01:43:20PM +0300, Jani Nikula wrote:
>>> Extend the use of extended receiver cap at 0x2200 to cover
>>> MAIN_LINK_CHANNEL_CODING_CAP in 0x2206, in case an implementation hides
>>> the DP 2.0 128b/132b channel encoding cap.
>>> 
>>> Cc: Manasi Navare 
>>> Signed-off-by: Jani Nikula 
>>> ---
>>>  drivers/gpu/drm/drm_dp_helper.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>> 
>>> diff --git a/drivers/gpu/drm/drm_dp_helper.c 
>>> b/drivers/gpu/drm/drm_dp_helper.c
>>> index 9b2a2961fca8..9389f92cb944 100644
>>> --- a/drivers/gpu/drm/drm_dp_helper.c
>>> +++ b/drivers/gpu/drm/drm_dp_helper.c
>>> @@ -608,7 +608,7 @@ static u8 drm_dp_downstream_port_count(const u8 
>>> dpcd[DP_RECEIVER_CAP_SIZE])
>>>  static int drm_dp_read_extended_dpcd_caps(struct drm_dp_aux *aux,
>>>   u8 dpcd[DP_RECEIVER_CAP_SIZE])
>>>  {
>>> -   u8 dpcd_ext[6];
>>> +   u8 dpcd_ext[DP_MAIN_LINK_CHANNEL_CODING + 1];
>>
>> Why are we even reading less of this than the normal receiver caps?
>
> Good question. I forget my reasoning to only extend to what might affect
> this use case. Should we extend to the size of the usual receiver caps?

Ah, there was a previous discussion [1] with Lyude (Cc'd).

BR,
Jani.


[1] 
https://patchwork.freedesktop.org/patch/msgid/20200901123226.4177-1-jani.nik...@intel.com


>
> BR,
> Jani.
>
>
>>
>>> int ret;
>>>  
>>> /*
>>> -- 
>>> 2.20.1

-- 
Jani Nikula, Intel Open Source Graphics Center
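
For reference, the buffer sizing in the quoted hunk can be sketched as
below. This is an assumption-laden illustration: the two constants are
written from memory of the kernel's DP helper header
(DP_MAIN_LINK_CHANNEL_CODING as offset 0x006, DP_RECEIVER_CAP_SIZE as
0xf), and overlay_extended_caps() is a hypothetical name. The real
drm_dp_read_extended_dpcd_caps() also checks whether the extended field
is present and reads it over AUX, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed to match the kernel header; offsets within the receiver caps. */
#define DP_MAIN_LINK_CHANNEL_CODING	0x006
#define DP_RECEIVER_CAP_SIZE		0xf

/* Seven bytes: offsets 0x000..0x006, read from 0x2200..0x2206. */
#define DPCD_EXT_LEN	(DP_MAIN_LINK_CHANNEL_CODING + 1)

/*
 * Hypothetical helper: overlay the extended receiver caps onto the base
 * caps. With the old 6-byte buffer, offset 0x006 was never overwritten.
 */
static void overlay_extended_caps(uint8_t dpcd[DP_RECEIVER_CAP_SIZE],
				  const uint8_t ext[DPCD_EXT_LEN])
{
	memcpy(dpcd, ext, DPCD_EXT_LEN);
}
```

Bytes past offset 0x006 keep their values from the base receiver caps,
which is what Ville's question about reading the full size is about.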

[Intel-gfx] [PATCH] drm/i915: Be more gentle when exiting non-persistent contexts

2021-08-26 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

When a non-persistent context exits we currently mark it as banned in
order to trigger fast termination of any outstanding GPU jobs it may have
left running.

In doing so we apply a very strict 1ms limit in which the leftover job
has to preempt before we issue an engine reset.

Some workloads are not able to cleanly preempt in that time window and it
can be argued that it would instead be better to give them a bit more
grace since avoiding engine resets is generally preferable.

To achieve this the patch splits handling of banned contexts from simply
closed non-persistent ones, applies a different timeout to each, and
extends the criteria which determine whether a request should be
scheduled back in after preemption.

A 20ms preempt timeout grace is given to exited non-persistent contexts,
which has been empirically tested to satisfy customer requirements
while still providing reasonably quick cleanup post exit.
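
The timeout split described above can be sketched roughly as follows.
This is a standalone illustration, not the code from the patch: the
struct, the macro names and preempt_timeout_ms() are all hypothetical,
and only the 1ms and 20ms values come from the description.

```c
#include <assert.h>
#include <stdbool.h>

/* Values taken from the description; the macro names are made up. */
#define BANNED_PREEMPT_TIMEOUT_MS		1
#define NONPERSISTENT_PREEMPT_TIMEOUT_MS	20

struct ctx_state {
	bool banned;
	bool closed_nonpersistent;
};

/*
 * Hypothetical helper: pick the preempt timeout for a context, falling
 * back to the engine default when it is neither banned nor an exited
 * non-persistent context.
 */
static unsigned int preempt_timeout_ms(const struct ctx_state *ctx,
				       unsigned int engine_default_ms)
{
	if (ctx->banned)
		return BANNED_PREEMPT_TIMEOUT_MS;
	if (ctx->closed_nonpersistent)
		return NONPERSISTENT_PREEMPT_TIMEOUT_MS;
	return engine_default_ms;
}
```

Banned takes priority in this sketch, so a context that is both banned
and closed still gets the strict limit.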

v2:
 * Streamline fast path checks.

v3:
 * Simplify by using only schedulable status.
 * Increase timeout to 20ms.

v4:
 * Fix live_execlists selftest.

v5:
 * Fix logic in kill_engines.

v6:
 * Rebase.

v7:
 * Add GuC support.

Signed-off-by: Tvrtko Ursulin 
Cc: Chris Wilson 
Cc: Zhen Han 
Cc: Matthew Brost 
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 22 +++-
 drivers/gpu/drm/i915/gt/intel_context.c   | 25 ++
 drivers/gpu/drm/i915/gt/intel_context.h   | 26 ++-
 drivers/gpu/drm/i915/gt/intel_context_types.h |  3 ++-
 .../drm/i915/gt/intel_execlists_submission.c  | 13 +++---
 .../gpu/drm/i915/gt/intel_ring_submission.c   |  7 ++---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 13 ++
 drivers/gpu/drm/i915/i915_request.c   |  2 +-
 8 files changed, 84 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c 
b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index fd169cf2f75a..6ae803cb4de3 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1072,7 +1072,8 @@ static struct intel_engine_cs *active_engine(struct 
intel_context *ce)
return engine;
 }
 
-static void kill_engines(struct i915_gem_engines *engines, bool ban)
+static void
+kill_engines(struct i915_gem_engines *engines, bool ban, bool persistent)
 {
struct i915_gem_engines_iter it;
struct intel_context *ce;
@@ -1086,8 +1087,15 @@ static void kill_engines(struct i915_gem_engines 
*engines, bool ban)
 */
for_each_gem_engine(ce, engines, it) {
struct intel_engine_cs *engine;
+   bool skip = false;
 
-   if (ban && intel_context_ban(ce, NULL))
+   if (ban)
+   skip = intel_context_ban(ce, NULL);
+   else if (!persistent)
+   skip = intel_context_exit_nonpersistent(ce, NULL);
+
+   /* Already banned or non-persistent closed. */
+   if (skip)
continue;
 
/*
@@ -1100,7 +1108,7 @@ static void kill_engines(struct i915_gem_engines 
*engines, bool ban)
engine = active_engine(ce);
 
/* First attempt to gracefully cancel the context */
-   if (engine && !__cancel_engine(engine) && ban)
+   if (engine && !__cancel_engine(engine) && (ban || !persistent))
/*
 * If we are unable to send a preemptive pulse to bump
 * the context from the GPU, we have to resort to a full
@@ -1112,8 +1120,6 @@ static void kill_engines(struct i915_gem_engines 
*engines, bool ban)
 
 static void kill_context(struct i915_gem_context *ctx)
 {
-   bool ban = (!i915_gem_context_is_persistent(ctx) ||
-   !ctx->i915->params.enable_hangcheck);
struct i915_gem_engines *pos, *next;
 
spin_lock_irq(>stale.lock);
@@ -1126,7 +1132,8 @@ static void kill_context(struct i915_gem_context *ctx)
 
spin_unlock_irq(>stale.lock);
 
-   kill_engines(pos, ban);
+   kill_engines(pos, !ctx->i915->params.enable_hangcheck,
+i915_gem_context_is_persistent(ctx));
 
spin_lock_irq(&ctx->stale.lock);
GEM_BUG_ON(i915_sw_fence_signaled(&pos->fence));
@@ -1172,7 +1179,8 @@ static void engines_idle_release(struct i915_gem_context 
*ctx,
 
 kill:
if (list_empty(&engines->link)) /* raced, already closed */
-   kill_engines(engines, true);
+   kill_engines(engines, true,
+i915_gem_context_is_persistent(ctx));
 
i915_sw_fence_commit(&engines->fence);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 745e84c72c90..b9880ffe5da7 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -533,6 +533,31 @@ struct

[Intel-gfx] [PATCH v2] drm/i915/gt: Register the migrate contexts with their engines

2021-08-26 Thread Thomas Hellström

Pinned contexts, like the migrate contexts, need a reset after resume
since their context image may have been lost. The GuC also needs to
register pinned contexts.

Add a list to struct intel_engine_cs where we add all pinned contexts on
creation, and traverse that list at resume time to reset the pinned
contexts.

This fixes the kms_pipe_crc_basic@suspend-read-crc-pipe-a selftest for now,
but proper LMEM backup / restore is needed for full suspend functionality.
However, note that even with full LMEM backup / restore it may be
desirable to keep the reset since backing up the migrate context images
must happen using memcpy() after the migrate context has become inactive,
and for performance and other reasons we want to avoid memcpy() from
LMEM.

Also traverse the list in guc_init_lrc_mapping(), calling
guc_kernel_context_pin() for the pinned contexts, as is already done
for the kernel context.

v2:
- Don't reset the contexts on each __engine_unpark() but rather at
  resume time (Chris Wilson).
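
The per-engine list described above can be sketched in plain userspace C.
This is only an illustration under stated assumptions, not the i915 code:
struct list_node, struct context, struct engine and every helper below are
hypothetical stand-ins (the kernel uses struct list_head and
list_for_each_entry()), and "reset" is reduced to bumping a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_node_init(struct list_node *head)
{
	head->prev = head;
	head->next = head;
}

static void list_node_add_tail(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

struct context {
	int resets;                    /* number of times the image was reset */
	struct list_node pinned_link;  /* link in engine.pinned_contexts */
};

struct engine {
	struct context *kernel_context;
	struct list_node pinned_contexts;
};

/* Recover the context that embeds a given list link. */
static struct context *context_from_link(struct list_node *link)
{
	return (struct context *)((char *)link -
				  offsetof(struct context, pinned_link));
}

/* Called once per pinned context at creation time. */
static void engine_add_pinned(struct engine *engine, struct context *ce)
{
	assert(engine && ce);
	ce->resets = 0;
	list_node_add_tail(&ce->pinned_link, &engine->pinned_contexts);
}

/*
 * Called at resume time: walk the list and reset every pinned context
 * except the kernel context, which is assumed to be reset elsewhere.
 * Returns the number of contexts that were reset.
 */
static int engine_reset_pinned_contexts(struct engine *engine)
{
	struct list_node *pos;
	int count = 0;

	for (pos = engine->pinned_contexts.next;
	     pos != &engine->pinned_contexts; pos = pos->next) {
		struct context *ce = context_from_link(pos);

		if (ce == engine->kernel_context)
			continue;

		ce->resets++;
		count++;
	}
	return count;
}
```

As in the patch, the sketch takes no lock around the list, which is only
reasonable if the list is touched at load, unload and resume time alone.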

Cc: Tvrtko Ursulin 
Cc: Matthew Auld 
Cc: Maarten Lankhorst 
Cc: Brost Matthew 
Cc: Chris Wilson 
Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/gt/intel_context_types.h |  8 +++
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  4 
 drivers/gpu/drm/i915/gt/intel_engine_pm.c | 23 +++
 drivers/gpu/drm/i915/gt/intel_engine_pm.h |  2 ++
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  7 ++
 drivers/gpu/drm/i915/gt/intel_gt_pm.c |  3 +++
 drivers/gpu/drm/i915/gt/mock_engine.c |  1 +
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 10 +---
 8 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h 
b/drivers/gpu/drm/i915/gt/intel_context_types.h
index e54351a170e2..a63631ea0ec4 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -152,6 +152,14 @@ struct intel_context {
/** sseu: Control eu/slice partitioning */
struct intel_sseu sseu;
 
+   /**
+* pinned_contexts_link: List link for the engine's pinned contexts.
+* This is only used if this is a perma-pinned kernel context and
+* the list is assumed to only be manipulated during driver load
+* or unload time so no mutex protection currently.
+*/
+   struct list_head pinned_contexts_link;
+
u8 wa_bb_page; /* if set, page num reserved for context workarounds */
 
struct {
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 332efea696a5..c606a4714904 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -320,6 +320,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum 
intel_engine_id id)
 
BUILD_BUG_ON(BITS_PER_TYPE(engine->mask) < I915_NUM_ENGINES);
 
+   INIT_LIST_HEAD(&engine->pinned_contexts_list);
engine->id = id;
engine->legacy_idx = INVALID_ENGINE;
engine->mask = BIT(id);
@@ -875,6 +876,8 @@ intel_engine_create_pinned_context(struct intel_engine_cs 
*engine,
return ERR_PTR(err);
}
 
+   list_add_tail(&ce->pinned_contexts_link, &engine->pinned_contexts_list);
+
/*
 * Give our perma-pinned kernel timelines a separate lockdep class,
 * so that we can use them from within the normal user timelines
@@ -897,6 +900,7 @@ void intel_engine_destroy_pinned_context(struct 
intel_context *ce)
list_del(&ce->timeline->engine_link);
mutex_unlock(&ce->vm->mutex);
 
+   list_del(&ce->pinned_contexts_link);
intel_context_unpin(ce);
intel_context_put(ce);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 1f07ac4e0672..dacd62773735 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -298,6 +298,29 @@ void intel_engine_init__pm(struct intel_engine_cs *engine)
intel_engine_init_heartbeat(engine);
 }
 
+/**
+ * intel_engine_reset_pinned_contexts - Reset the pinned contexts of
+ * an engine.
+ * @engine: The engine whose pinned contexts we want to reset.
+ *
+ * Typically the pinned context LMEM images lose or get their content
+ * corrupted on suspend. This function resets their images.
+ */
+void intel_engine_reset_pinned_contexts(struct intel_engine_cs *engine)
+{
+   struct intel_context *ce;
+
+   list_for_each_entry(ce, &engine->pinned_contexts_list,
+   pinned_contexts_link) {
+   /* kernel context gets reset at __engine_unpark() */
+   if (ce == engine->kernel_context)
+   continue;
+
+   dbg_poison_ce(ce);
+   ce->ops->reset(ce);
+   }
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftest_engine_pm.c"
 #endif
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.h

Re: [Intel-gfx] [PATCH 2/2] drm/i915/debugfs: hook up ttm_resource_manager_debug

2021-08-26 Thread Daniel Vetter

On Thu, Aug 26, 2021 at 12:03:29PM +0200, Daniel Vetter wrote:
> On Thu, Aug 26, 2021 at 11:51:44AM +0200, Thomas Hellström wrote:
> > On Thu, 2021-08-26 at 11:16 +0200, Daniel Vetter wrote:
> > > On Thu, Aug 19, 2021 at 09:32:20AM +0200, Thomas Hellström wrote:
> > > > On Wed, 2021-08-18 at 15:58 +0100, Matthew Auld wrote:
> > > > > This should give a more complete view of the various bits of
> > > > > internal
> > > > > resource manager state, for device local-memory.
> > > > > 
> > > > > Signed-off-by: Matthew Auld 
> > > > > Cc: Thomas Hellström 
> > > > > ---
> > > > >  drivers/gpu/drm/i915/i915_debugfs.c | 12 +---
> > > > >  1 file changed, 9 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> > > > > b/drivers/gpu/drm/i915/i915_debugfs.c
> > > > > index eec0d349ea6a..109e6feed6be 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > > > > @@ -238,6 +238,7 @@ i915_debugfs_describe_obj(struct seq_file *m,
> > > > > struct drm_i915_gem_object *obj)
> > > > >  static int i915_gem_object_info(struct seq_file *m, void *data)
> > > > >  {
> > > > > struct drm_i915_private *i915 = node_to_i915(m->private);
> > > > > +   struct drm_printer p = drm_seq_file_printer(m);
> > > > > struct intel_memory_region *mr;
> > > > > enum intel_region_id id;
> > > > >  
> > > > > @@ -245,9 +246,14 @@ static int i915_gem_object_info(struct
> > > > > seq_file
> > > > > *m, void *data)
> > > > >    i915->mm.shrink_count,
> > > > >    atomic_read(&i915->mm.free_count),
> > > > >    i915->mm.shrink_memory);
> > > > > -   for_each_memory_region(mr, i915, id)
> > > > > -   seq_printf(m, "%s: total:%pa, available:%pa
> > > > > bytes\n",
> > > > > -  mr->name, &mr->total, &mr->avail);
> > > > > +   for_each_memory_region(mr, i915, id) {
> > > > > +   seq_printf(m, "%s: ", mr->name);
> > > > > +   if (mr->region_private)
> > > > > +   ttm_resource_manager_debug(mr->region_private, &p);
> > > > > +   else
> > > > > +   seq_printf(m, "total:%pa, available:%pa
> > > > > bytes\n",
> > > > > +  &mr->total, &mr->avail);
> > > > 
> > > > Hm. Shouldn't we make the above intel_memory_region_debug() or
> > > > perhaps
> > > > intel_memory_region_info() to avoid using memory region internals
> > > > directly here?
> > > 
> > > Imo we should just embed ttm_resource_manager into our own and not try
> > > to
> > > abstract this all away that much. At least in upstream there is just
> > > not
> > > going to be another memory region implementation, and for backporting
> > > I'm
> > > not sure these abstractions really help that much - we're touching
> > > all the
> > > same code still in the end.
> > 
> > Hmm, yes. Here I was seeing the separation between the debugfs code and
> > the intel_memory_region code, rather than between the latter and TTM.
> > 
> > The i915 driver is currently very much "everything uses everything", which
> > IMHO is not really good for code understanding and maintenance.
> 
> Ah yes I agree, we don't have a clear separation of concerns really, and
> debugfs is all over. I got confused a bit with the ->region_private
> pointer and thought you'd be talking about that.
> 
> My experience has been that going over the interface functions and trying
> to kerneldoc helps a lot with this, because instead of documenting some
> major confusion you can just clean it up first. We should definitely try
> to componentize stuff more and not leak internal details all over the
> place.

While we discuss this: For debugfs functions I recommend using drm_printer
and not seq_file directly, it's a nice bit of abstraction so that you can
also dump to debugfs. Or anything else really.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
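
The drm_printer recommendation above comes down to routing all output
through one callback, so the same dump routine can feed different sinks.
A minimal userspace sketch of that idea follows. It is not the DRM API:
struct printer, emit_to_buf(), emit_to_file(), printer_printf() and
describe_region() are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* A printer is just an output callback plus an opaque sink argument. */
struct printer {
	void (*emit)(struct printer *p, const char *text);
	void *arg;
};

struct strbuf {
	char data[256];
	size_t len;
};

/* Sink 1: append to a fixed-size buffer, truncating when it is full. */
static void emit_to_buf(struct printer *p, const char *text)
{
	struct strbuf *b = p->arg;
	size_t room = sizeof(b->data) - 1 - b->len;
	size_t n = strlen(text);

	if (n > room)
		n = room;
	memcpy(b->data + b->len, text, n);
	b->len += n;
	b->data[b->len] = '\0';
}

/* Sink 2: write to a stdio stream passed in p->arg. */
static void emit_to_file(struct printer *p, const char *text)
{
	fputs(text, p->arg);
}

static void printer_printf(struct printer *p, const char *fmt, ...)
{
	char line[128];
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(line, sizeof(line), fmt, ap);
	va_end(ap);
	p->emit(p, line);
}

/* The dump routine only knows about the printer, never about the sink. */
static void describe_region(struct printer *p, const char *name,
			    unsigned long total, unsigned long avail)
{
	printer_printf(p, "%s: total:%lu, available:%lu bytes\n",
		       name, total, avail);
}
```

The point of the design is that describe_region() needs no change when a
new sink is added; only a new emit callback is written.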

[Intel-gfx] ✗ Fi.CI.IGT: failure for Clean up GuC CI failures, simplify locking, and kernel DOC (rev5)

2021-08-26 Thread Patchwork

== Series Details ==

Series: Clean up GuC CI failures, simplify locking, and kernel DOC (rev5)
URL   : https://patchwork.freedesktop.org/series/93704/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10522_full -> Patchwork_20896_full


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20896_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20896_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_20896_full:

### IGT changes ###

 Possible regressions 

  * igt@gem_exec_schedule@reorder-wide@vcs0:
- shard-skl:  [PASS][1] -> [FAIL][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-skl3/igt@gem_exec_schedule@reorder-w...@vcs0.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20896/shard-skl3/igt@gem_exec_schedule@reorder-w...@vcs0.html

  
New tests
-

  New tests have been introduced between CI_DRM_10522_full and 
Patchwork_20896_full:

### New IGT tests (1) ###

  * igt@i915_selftest@live@guc:
- Statuses : 8 pass(s)
- Exec time: [0.47, 4.95] s

  

Known issues


  Here are the changes found in Patchwork_20896_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_ctx_persistence@legacy-engines-mixed-process:
- shard-snb:  NOTRUN -> [SKIP][3] ([fdo#109271] / [i915#1099]) +1 
similar issue
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20896/shard-snb2/igt@gem_ctx_persiste...@legacy-engines-mixed-process.html

  * igt@gem_ctx_sseu@mmap-args:
- shard-tglb: NOTRUN -> [SKIP][4] ([i915#280])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20896/shard-tglb7/igt@gem_ctx_s...@mmap-args.html

  * igt@gem_eio@in-flight-10ms:
- shard-skl:  [PASS][5] -> [TIMEOUT][6] ([i915#3063]) +1 similar 
issue
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-skl9/igt@gem_...@in-flight-10ms.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20896/shard-skl5/igt@gem_...@in-flight-10ms.html

  * igt@gem_exec_fair@basic-deadline:
- shard-kbl:  [PASS][7] -> [FAIL][8] ([i915#2846])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-kbl3/igt@gem_exec_f...@basic-deadline.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20896/shard-kbl7/igt@gem_exec_f...@basic-deadline.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
- shard-tglb: NOTRUN -> [FAIL][9] ([i915#2842])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20896/shard-tglb2/igt@gem_exec_fair@basic-none-s...@rcs0.html

  * igt@gem_exec_fair@basic-pace@vcs1:
- shard-iclb: NOTRUN -> [FAIL][10] ([i915#2842])
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20896/shard-iclb1/igt@gem_exec_fair@basic-p...@vcs1.html

  * igt@gem_exec_fair@basic-pace@vecs0:
- shard-kbl:  [PASS][11] -> [FAIL][12] ([i915#2842]) +1 similar 
issue
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-kbl4/igt@gem_exec_fair@basic-p...@vecs0.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20896/shard-kbl4/igt@gem_exec_fair@basic-p...@vecs0.html
- shard-tglb: [PASS][13] -> [FAIL][14] ([i915#2842])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-tglb8/igt@gem_exec_fair@basic-p...@vecs0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20896/shard-tglb7/igt@gem_exec_fair@basic-p...@vecs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
- shard-glk:  [PASS][15] -> [FAIL][16] ([i915#2842]) +1 similar 
issue
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-glk5/igt@gem_exec_fair@basic-throt...@rcs0.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20896/shard-glk1/igt@gem_exec_fair@basic-throt...@rcs0.html
- shard-iclb: [PASS][17] -> [FAIL][18] ([i915#2849])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/shard-iclb4/igt@gem_exec_fair@basic-throt...@rcs0.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20896/shard-iclb5/igt@gem_exec_fair@basic-throt...@rcs0.html

  * igt@gem_exec_params@secure-non-master:
- shard-tglb: NOTRUN -> [SKIP][19] ([fdo#112283])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20896/shard-tglb7/igt@gem_exec_par...@secure-non-master.html

  * igt@gem_pread@exhaustion:
- shard-snb:  NOTRUN -> [WARN][20] ([i915#2658])
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20896/shard-snb7/igt@gem_pr...@exhaustion.html

  * igt@gem_render_copy@yf-tiled-to-vebox-linear:
-

Re: [Intel-gfx] [PATCH 2/2] drm/i915/debugfs: hook up ttm_resource_manager_debug

2021-08-26 Thread Daniel Vetter

On Thu, Aug 26, 2021 at 11:51:44AM +0200, Thomas Hellström wrote:
> On Thu, 2021-08-26 at 11:16 +0200, Daniel Vetter wrote:
> > On Thu, Aug 19, 2021 at 09:32:20AM +0200, Thomas Hellström wrote:
> > > On Wed, 2021-08-18 at 15:58 +0100, Matthew Auld wrote:
> > > > This should give a more complete view of the various bits of
> > > > internal
> > > > resource manager state, for device local-memory.
> > > > 
> > > > Signed-off-by: Matthew Auld 
> > > > Cc: Thomas Hellström 
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_debugfs.c | 12 +---
> > > >  1 file changed, 9 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> > > > b/drivers/gpu/drm/i915/i915_debugfs.c
> > > > index eec0d349ea6a..109e6feed6be 100644
> > > > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > > > @@ -238,6 +238,7 @@ i915_debugfs_describe_obj(struct seq_file *m,
> > > > struct drm_i915_gem_object *obj)
> > > >  static int i915_gem_object_info(struct seq_file *m, void *data)
> > > >  {
> > > > struct drm_i915_private *i915 = node_to_i915(m->private);
> > > > +   struct drm_printer p = drm_seq_file_printer(m);
> > > > struct intel_memory_region *mr;
> > > > enum intel_region_id id;
> > > >  
> > > > @@ -245,9 +246,14 @@ static int i915_gem_object_info(struct
> > > > seq_file
> > > > *m, void *data)
> > > >    i915->mm.shrink_count,
> > > >    atomic_read(&i915->mm.free_count),
> > > >    i915->mm.shrink_memory);
> > > > -   for_each_memory_region(mr, i915, id)
> > > > -   seq_printf(m, "%s: total:%pa, available:%pa
> > > > bytes\n",
> > > > -  mr->name, &mr->total, &mr->avail);
> > > > +   for_each_memory_region(mr, i915, id) {
> > > > +   seq_printf(m, "%s: ", mr->name);
> > > > +   if (mr->region_private)
> > > > +   ttm_resource_manager_debug(mr->region_private, &p);
> > > > +   else
> > > > +   seq_printf(m, "total:%pa, available:%pa
> > > > bytes\n",
> > > > +  &mr->total, &mr->avail);
> > > 
> > > Hm. Shouldn't we make the above intel_memory_region_debug() or
> > > perhaps
> > > intel_memory_region_info() to avoid using memory region internals
> > > directly here?
> > 
> > Imo we should just embed ttm_resource_manager into our own and not try
> > to
> > abstract this all away that much. At least in upstream there is just
> > not
> > going to be another memory region implementation, and for backporting
> > I'm
> > not sure these abstractions really help that much - we're touching
> > all the
> > same code still in the end.
> 
> Hmm, yes. Here I was seeing the separation between the debugfs code and
> the intel_memory_region code, rather than between the latter and TTM.
> 
> The i915 driver is currently very much "everything uses everything", which
> IMHO is not really good for code understanding and maintenance.

Ah yes I agree, we don't have a clear separation of concerns really, and
debugfs is all over. I got confused a bit with the ->region_private
pointer and thought you'd be talking about that.

My experience has been that going over the interface functions and trying
to kerneldoc helps a lot with this, because instead of documenting some
major confusion you can just clean it up first. We should definitely try
to componentize stuff more and not leak internal details all over the
place.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [PATCH 2/2] drm/i915/debugfs: hook up ttm_resource_manager_debug

2021-08-26 Thread Thomas Hellström

On Thu, 2021-08-26 at 11:51 +0200, Thomas Hellström wrote:
> On Thu, 2021-08-26 at 11:16 +0200, Daniel Vetter wrote:
> > On Thu, Aug 19, 2021 at 09:32:20AM +0200, Thomas Hellström wrote:
> > > On Wed, 2021-08-18 at 15:58 +0100, Matthew Auld wrote:
> > > > This should give a more complete view of the various bits of
> > > > internal
> > > > resource manager state, for device local-memory.
> > > > 
> > > > Signed-off-by: Matthew Auld 
> > > > Cc: Thomas Hellström 
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_debugfs.c | 12 +---
> > > >  1 file changed, 9 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> > > > b/drivers/gpu/drm/i915/i915_debugfs.c
> > > > index eec0d349ea6a..109e6feed6be 100644
> > > > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > > > @@ -238,6 +238,7 @@ i915_debugfs_describe_obj(struct seq_file
> > > > *m,
> > > > struct drm_i915_gem_object *obj)
> > > >  static int i915_gem_object_info(struct seq_file *m, void
> > > > *data)
> > > >  {
> > > > struct drm_i915_private *i915 = node_to_i915(m-
> > > > >private);
> > > > +   struct drm_printer p = drm_seq_file_printer(m);
> > > > struct intel_memory_region *mr;
> > > > enum intel_region_id id;
> > > >  
> > > > @@ -245,9 +246,14 @@ static int i915_gem_object_info(struct
> > > > seq_file
> > > > *m, void *data)
> > > >    i915->mm.shrink_count,
> > > >    atomic_read(&i915->mm.free_count),
> > > >    i915->mm.shrink_memory);
> > > > -   for_each_memory_region(mr, i915, id)
> > > > -   seq_printf(m, "%s: total:%pa, available:%pa
> > > > bytes\n",
> > > > -  mr->name, &mr->total, &mr->avail);
> > > > +   for_each_memory_region(mr, i915, id) {
> > > > +   seq_printf(m, "%s: ", mr->name);
> > > > +   if (mr->region_private)
> > > > +   ttm_resource_manager_debug(mr->region_private, &p);
> > > > +   else
> > > > +   seq_printf(m, "total:%pa, available:%pa
> > > > bytes\n",
> > > > +  &mr->total, &mr->avail);
> > > 
> > > Hm. Shouldn't we make the above intel_memory_region_debug() or
> > > perhaps
> > > intel_memory_region_info() to avoid using memory region internals
> > > directly here?
> > 
> > Imo we should just embed ttm_resource_manager into our own and not
> > try
> > to
> > abstract this all away that much. At least in upstream there is
> > just
> > not
> > going to be another memory region implementation, and for
> > backporting
> > I'm
> > not sure these abstractions really help that much - we're touching
> > all the
> > same code still in the end.
> 
> Hmm, yes. Here I was seeing the separation between the debugfs code
> and
> the intel_memory_region code, rather than between the latter and TTM.
> 
> The i915 driver is currently very much "everything uses everything", which
> IMHO is not really good for code understanding and maintenance.
> 
> /Thomas
> 
> > -Daniel
> 
But yes, agreed, on the memory region backends it doesn't make much
sense. It was helpful during bringup but yes we probably won't be
adding another backend hopefully.

/Thomas

Re: [Intel-gfx] [PATCH v8 3/7] drm: lock drm_global_mutex earlier in the ioctl handler

2021-08-26 Thread Daniel Vetter

On Thu, Aug 26, 2021 at 10:01:18AM +0800, Desmond Cheong Zhi Xi wrote:
> In a future patch, a read lock on drm_device.master_rwsem is
> held in the ioctl handler before the check for ioctl
> permissions. However, this inverts the lock hierarchy of
> drm_global_mutex --> master_rwsem.
> 
> To avoid this, we do some prep work to grab the drm_global_mutex
> before checking for ioctl permissions.
> 
> Signed-off-by: Desmond Cheong Zhi Xi 
> ---
>  drivers/gpu/drm/drm_ioctl.c | 18 +-
>  1 file changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
> index d25713b09b80..158629d88319 100644
> --- a/drivers/gpu/drm/drm_ioctl.c
> +++ b/drivers/gpu/drm/drm_ioctl.c
> @@ -772,19 +772,19 @@ long drm_ioctl_kernel(struct file *file, drm_ioctl_t 
> *func, void *kdata,
>   if (drm_dev_is_unplugged(dev))
>   return -ENODEV;
>  
> + /* Enforce sane locking for modern driver ioctls. */
> + if (unlikely(drm_core_check_feature(dev, DRIVER_LEGACY)) && !(flags & 
> DRM_UNLOCKED))

Maybe have a local bool locked_ioctl for this so it's extremely clear it's
the same condition in both?

Either way: Reviewed-by: Daniel Vetter 
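
A sketch of the local-bool suggestion above, written as self-contained
userspace C rather than against drm_ioctl.c. Everything here is an
assumption made for illustration: struct device, FLAG_UNLOCKED and the
global_lock()/global_unlock() helpers are hypothetical stand-ins for the
DRM device, DRM_UNLOCKED and drm_global_mutex, and the "lock" is only a
depth counter so that the pairing can be checked.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define FLAG_UNLOCKED 0x1 /* hypothetical stand-in for DRM_UNLOCKED */

struct device {
	bool legacy;     /* stand-in for the DRIVER_LEGACY feature check */
	int lock_depth;  /* 1 while the fake global lock is held */
	int lock_count;  /* how often the fake global lock was taken */
};

static void global_lock(struct device *dev)
{
	assert(dev->lock_depth == 0);
	dev->lock_depth++;
	dev->lock_count++;
}

static void global_unlock(struct device *dev)
{
	assert(dev->lock_depth == 1);
	dev->lock_depth--;
}

static int ioctl_kernel(struct device *dev, unsigned int flags,
			bool permitted, int (*func)(struct device *))
{
	/* Evaluate the condition once so lock and unlock cannot diverge. */
	const bool locked_ioctl = dev->legacy && !(flags & FLAG_UNLOCKED);
	int ret;

	if (locked_ioctl)
		global_lock(dev);

	if (!permitted) {
		ret = -EACCES;
		goto out;
	}

	ret = func(dev);

out:
	if (locked_ioctl)
		global_unlock(dev);
	return ret;
}

/* Example handler: reports whether it ran under the fake global lock. */
static int report_lock_depth(struct device *dev)
{
	return dev->lock_depth;
}
```

With the condition held in one variable, a later change to the locking
rule only has to be made in a single place.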

> + mutex_lock(&drm_global_mutex);
> +
>   retcode = drm_ioctl_permit(flags, file_priv);
>   if (unlikely(retcode))
> - return retcode;
> + goto out;
>  
> - /* Enforce sane locking for modern driver ioctls. */
> - if (likely(!drm_core_check_feature(dev, DRIVER_LEGACY)) ||
> - (flags & DRM_UNLOCKED))
> - retcode = func(dev, kdata, file_priv);
> - else {
> - mutex_lock(&drm_global_mutex);
> - retcode = func(dev, kdata, file_priv);
> + retcode = func(dev, kdata, file_priv);
> +
> +out:
> + if (unlikely(drm_core_check_feature(dev, DRIVER_LEGACY)) && !(flags & 
> DRM_UNLOCKED))
>   mutex_unlock(&drm_global_mutex);
> - }
>   return retcode;
>  }
>  EXPORT_SYMBOL(drm_ioctl_kernel);
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [PATCH v8 2/7] drm: convert drm_device.master_mutex into a rwsem

2021-08-26 Thread Daniel Vetter

On Thu, Aug 26, 2021 at 10:01:17AM +0800, Desmond Cheong Zhi Xi wrote:
> drm_device.master_mutex currently protects the following:
> - drm_device.master
> - drm_file.master
> - drm_file.was_master
> - drm_file.is_master
> - drm_master.unique
> - drm_master.unique_len
> - drm_master.magic_map
> 
> There is a clear separation between functions that read or change
> these attributes. Hence, convert master_mutex into a rwsem to enable
> concurrent readers.
> 
> Signed-off-by: Desmond Cheong Zhi Xi 

Reviewed-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/drm_auth.c| 35 ++-
>  drivers/gpu/drm/drm_debugfs.c |  4 ++--
>  drivers/gpu/drm/drm_drv.c |  3 +--
>  drivers/gpu/drm/drm_ioctl.c   | 10 +-
>  include/drm/drm_auth.h|  6 +++---
>  include/drm/drm_device.h  | 10 ++
>  include/drm/drm_file.h| 12 ++--
>  7 files changed, 41 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_auth.c b/drivers/gpu/drm/drm_auth.c
> index 60a6b21474b1..73ade0513ccb 100644
> --- a/drivers/gpu/drm/drm_auth.c
> +++ b/drivers/gpu/drm/drm_auth.c
> @@ -64,7 +64,7 @@
>  static bool drm_is_current_master_locked(struct drm_file *fpriv)
>  {
>   lockdep_assert_once(lockdep_is_held(&fpriv->master_lookup_lock) ||
> - lockdep_is_held(&fpriv->minor->dev->master_mutex));
> + lockdep_is_held(&fpriv->minor->dev->master_rwsem));
>  
>   return fpriv->is_master && drm_lease_owner(fpriv->master) == 
> fpriv->minor->dev->master;
>  }
> @@ -96,7 +96,7 @@ int drm_getmagic(struct drm_device *dev, void *data, struct 
> drm_file *file_priv)
>   struct drm_auth *auth = data;
>   int ret = 0;
>  
> - mutex_lock(&dev->master_mutex);
> + down_write(&dev->master_rwsem);
>   if (!file_priv->magic) {
>   ret = idr_alloc(&file_priv->master->magic_map, file_priv,
>   1, 0, GFP_KERNEL);
> @@ -104,7 +104,7 @@ int drm_getmagic(struct drm_device *dev, void *data, 
> struct drm_file *file_priv)
>   file_priv->magic = ret;
>   }
>   auth->magic = file_priv->magic;
> - mutex_unlock(&dev->master_mutex);
> + up_write(&dev->master_rwsem);
>  
>   DRM_DEBUG("%u\n", auth->magic);
>  
> @@ -119,13 +119,13 @@ int drm_authmagic(struct drm_device *dev, void *data,
>  
>   DRM_DEBUG("%u\n", auth->magic);
>  
> - mutex_lock(&dev->master_mutex);
> + down_write(&dev->master_rwsem);
>   file = idr_find(&file_priv->master->magic_map, auth->magic);
>   if (file) {
>   file->authenticated = 1;
>   idr_replace(&file_priv->master->magic_map, NULL, auth->magic);
>   }
> - mutex_unlock(&dev->master_mutex);
> + up_write(&dev->master_rwsem);
>  
>   return file ? 0 : -EINVAL;
>  }
> @@ -167,7 +167,7 @@ static int drm_new_set_master(struct drm_device *dev, 
> struct drm_file *fpriv)
>   struct drm_master *old_master;
>   struct drm_master *new_master;
>  
> - lockdep_assert_held_once(&dev->master_mutex);
> + lockdep_assert_held_once(&dev->master_rwsem);
>  
>   WARN_ON(fpriv->is_master);
>   old_master = fpriv->master;
> @@ -249,7 +249,7 @@ int drm_setmaster_ioctl(struct drm_device *dev, void 
> *data,
>  {
>   int ret;
>  
> - mutex_lock(&dev->master_mutex);
> + down_write(&dev->master_rwsem);
>  
>   ret = drm_master_check_perm(dev, file_priv);
>   if (ret)
> @@ -281,7 +281,7 @@ int drm_setmaster_ioctl(struct drm_device *dev, void 
> *data,
>  
>   drm_set_master(dev, file_priv, false);
>  out_unlock:
> - mutex_unlock(&dev->master_mutex);
> + up_write(&dev->master_rwsem);
>   return ret;
>  }
>  
> @@ -298,7 +298,7 @@ int drm_dropmaster_ioctl(struct drm_device *dev, void 
> *data,
>  {
>   int ret;
>  
> - mutex_lock(&dev->master_mutex);
> + down_write(&dev->master_rwsem);
>  
>   ret = drm_master_check_perm(dev, file_priv);
>   if (ret)
> @@ -321,8 +321,9 @@ int drm_dropmaster_ioctl(struct drm_device *dev, void 
> *data,
>   }
>  
>   drm_drop_master(dev, file_priv);
> +
>  out_unlock:
> - mutex_unlock(&dev->master_mutex);
> + up_write(&dev->master_rwsem);
>   return ret;
>  }
>  
> @@ -334,7 +335,7 @@ int drm_master_open(struct drm_file *file_priv)
>   /* if there is no current master make this fd it, but do not create
>* any master object for render clients
>*/
> - mutex_lock(&dev->master_mutex);
> + down_write(&dev->master_rwsem);
>   if (!dev->master) {
>   ret = drm_new_set_master(dev, file_priv);
>   } else {
> @@ -342,7 +343,7 @@ int drm_master_open(struct drm_file *file_priv)
>   file_priv->master = drm_master_get(dev->master);
>   spin_unlock(&file_priv->master_lookup_lock);
>   }
> - mutex_unlock(&dev->master_mutex);
> + up_write(&dev->master_rwsem);
>  
>   return ret;
>  }
> @@ -352,7 +353,7 @@ void drm_master_release(struct drm_file *file_priv)
>   struct drm_device *dev = file_priv->minor->dev;
>   struct drm_master *master;
>

Re: [Intel-gfx] [PATCH v8 1/7] drm: fix null ptr dereference in drm_master_release

2021-08-26 Thread Daniel Vetter

On Thu, Aug 26, 2021 at 10:01:16AM +0800, Desmond Cheong Zhi Xi wrote:
> drm_master_release can be called on a drm_file without a master, which
> results in a null ptr dereference of file_priv->master->magic_map. The
> three cases are:
> 
> 1. Error path in drm_open_helper
>   drm_open():
> drm_open_helper():
>   drm_master_open():
> drm_new_set_master(); <--- returns -ENOMEM,
>drm_file.master not set
>   drm_file_free():
> drm_master_release(); <--- NULL ptr dereference
>(file_priv->master->magic_map)
> 
> 2. Error path in mock_drm_getfile
>   mock_drm_getfile():
> anon_inode_getfile(); <--- returns error, drm_file.master not set
> drm_file_free():
>   drm_master_release(); <--- NULL ptr dereference
>  (file_priv->master->magic_map)
> 
> 3. In drm_client_close, as drm_client_open doesn't set up a master
> 
> drm_file.master is set up in drm_open_helper through the call to
> drm_master_open, so we mirror it with a call to drm_master_release in
> drm_close_helper, and remove drm_master_release from drm_file_free to
> avoid the null ptr dereference.
> 
> Signed-off-by: Desmond Cheong Zhi Xi 

Reviewed-by: Daniel Vetter 

I guess we should also have a cc: stable on this one? I think this bug
existed since pretty much forever, but maybe more prominent with the
drm_client stuff added a while ago.
-Daniel
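
The pairing the patch establishes (master set up in the open helper,
released in the close helper, never in the generic free path) can be
sketched in userspace C. All names below are hypothetical stand-ins for
the drm_file/drm_master functions, and the -ENOMEM failure is simulated
through a parameter.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct master {
	int magic_entries; /* stand-in for drm_master.magic_map */
};

struct file_priv {
	struct master *master;
};

static int masters_released;

static int master_open(struct file_priv *f, int simulate_enomem)
{
	if (simulate_enomem)
		return -ENOMEM; /* f->master stays NULL */

	f->master = calloc(1, sizeof(*f->master));
	return f->master ? 0 : -ENOMEM;
}

/* Dereferences f->master, so it must only run after a successful open. */
static void master_release(struct file_priv *f)
{
	assert(f->master);
	f->master->magic_entries = 0;
	free(f->master);
	f->master = NULL;
	masters_released++;
}

/* Generic free path: shared with error paths, so no master release here. */
static void file_free(struct file_priv *f)
{
	free(f);
}

static struct file_priv *open_helper(int simulate_enomem)
{
	struct file_priv *f = calloc(1, sizeof(*f));

	if (!f)
		return NULL;

	if (master_open(f, simulate_enomem)) {
		file_free(f); /* safe: never touches f->master */
		return NULL;
	}
	return f;
}

/* Mirror of open_helper(): release the master, then free the file. */
static void close_helper(struct file_priv *f)
{
	master_release(f);
	file_free(f);
}
```

Because file_free() no longer assumes a master exists, the failed-open
path cannot reach the dereference at all.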

> ---
>  drivers/gpu/drm/drm_file.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> index ed25168619fc..90b62f360da1 100644
> --- a/drivers/gpu/drm/drm_file.c
> +++ b/drivers/gpu/drm/drm_file.c
> @@ -282,9 +282,6 @@ void drm_file_free(struct drm_file *file)
>  
>   drm_legacy_ctxbitmap_flush(dev, file);
>  
> - if (drm_is_primary_client(file))
> - drm_master_release(file);
> -
>   if (dev->driver->postclose)
>   dev->driver->postclose(dev, file);
>  
> @@ -305,6 +302,9 @@ static void drm_close_helper(struct file *filp)
>   list_del(&file_priv->lhead);
>   mutex_unlock(&dev->filelist_mutex);
>  
> + if (drm_is_primary_client(file_priv))
> + drm_master_release(file_priv);
> +
>   drm_file_free(file_priv);
>  }
>  
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [PATCH 2/2] drm/i915/debugfs: hook up ttm_resource_manager_debug

2021-08-26 Thread Thomas Hellström

On Thu, 2021-08-26 at 11:16 +0200, Daniel Vetter wrote:
> On Thu, Aug 19, 2021 at 09:32:20AM +0200, Thomas Hellström wrote:
> > On Wed, 2021-08-18 at 15:58 +0100, Matthew Auld wrote:
> > > This should give a more complete view of the various bits of
> > > internal
> > > resource manager state, for device local-memory.
> > > 
> > > Signed-off-by: Matthew Auld 
> > > Cc: Thomas Hellström 
> > > ---
> > >  drivers/gpu/drm/i915/i915_debugfs.c | 12 +---
> > >  1 file changed, 9 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> > > b/drivers/gpu/drm/i915/i915_debugfs.c
> > > index eec0d349ea6a..109e6feed6be 100644
> > > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > > @@ -238,6 +238,7 @@ i915_debugfs_describe_obj(struct seq_file *m,
> > > struct drm_i915_gem_object *obj)
> > >  static int i915_gem_object_info(struct seq_file *m, void *data)
> > >  {
> > > struct drm_i915_private *i915 = node_to_i915(m->private);
> > > +   struct drm_printer p = drm_seq_file_printer(m);
> > > struct intel_memory_region *mr;
> > > enum intel_region_id id;
> > >  
> > > @@ -245,9 +246,14 @@ static int i915_gem_object_info(struct
> > > seq_file
> > > *m, void *data)
> > >    i915->mm.shrink_count,
> > >    atomic_read(&i915->mm.free_count),
> > >    i915->mm.shrink_memory);
> > > -   for_each_memory_region(mr, i915, id)
> > > -   seq_printf(m, "%s: total:%pa, available:%pa
> > > bytes\n",
> > > -  mr->name, &mr->total, &mr->avail);
> > > +   for_each_memory_region(mr, i915, id) {
> > > +   seq_printf(m, "%s: ", mr->name);
> > > +   if (mr->region_private)
> > > +   ttm_resource_manager_debug(mr->region_private, &p);
> > > +   else
> > > +   seq_printf(m, "total:%pa, available:%pa
> > > bytes\n",
> > > +  &mr->total, &mr->avail);
> > 
> > Hm. Shouldn't we make the above intel_memory_region_debug() or
> > perhaps
> > intel_memory_region_info() to avoid using memory region internals
> > directly here?
> 
> Imo we should just embed ttm_resource_manager into our own and not try
> to
> abstract this all away that much. At least in upstream there is just
> not
> going to be another memory region implementation, and for backporting
> I'm
> not sure these abstractions really help that much - we're touching
> all the
> same code still in the end.

Hmm, yes. Here I was seeing the separation between the debugfs code and
the intel_memory_region code, rather than between the latter and TTM.

The i915 driver is currently very much "everything uses everything", which
IMHO is not really good for code understanding and maintenance.

/Thomas

> -Daniel

[Intel-gfx] [PATCH] drm/msm: Improve drm/sched point of no return rules

2021-08-26 Thread Daniel Vetter

Originally drm_sched_job_init was the point of no return, after which
drivers really should submit a job. I've split that up, which allows
us to fix this issue pretty easily.

Only thing we have to take care of is to not skip to error paths after
that. Other drivers do this the same for out-fence and similar things.

v2: It's not really a bugfix, just an improvement, since all
drm_sched_job_arm does is reserve the fence number. And gaps should be
fine, as long as the drm_sched_job doesn't escape anywhere at all.

For robustness it's still better to align with other drivers here and
not bail out after job_arm().

v3: I misplaced drm_sched_job_arm by _one_ line! Thanks to Rob for
testing and debug help.

Cc: Rob Clark 
Cc: Sean Paul 
Cc: Sumit Semwal 
Cc: "Christian König" 
Cc: linux-arm-...@vger.kernel.org
Cc: dri-de...@lists.freedesktop.org
Cc: freedr...@lists.freedesktop.org
Cc: linux-me...@vger.kernel.org
Cc: linaro-mm-...@lists.linaro.org
Signed-off-by: Daniel Vetter 
---
 drivers/gpu/drm/msm/msm_gem_submit.c | 13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c 
b/drivers/gpu/drm/msm/msm_gem_submit.c
index 4d1c4d5f6a2a..71b8c8f752a3 100644
--- a/drivers/gpu/drm/msm/msm_gem_submit.c
+++ b/drivers/gpu/drm/msm/msm_gem_submit.c
@@ -52,8 +52,6 @@ static struct msm_gem_submit *submit_create(struct drm_device 
*dev,
return ERR_PTR(ret);
}
 
-   drm_sched_job_arm(&submit->base);
-
xa_init_flags(&submit->deps, XA_FLAGS_ALLOC);
 
kref_init(&submit->ref);
@@ -880,6 +878,8 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void *data,
 
submit->nr_cmds = i;
 
+   drm_sched_job_arm(&submit->base);
+
submit->user_fence = dma_fence_get(&submit->base.s_fence->finished);
 
/*
@@ -891,17 +891,16 @@ int msm_ioctl_gem_submit(struct drm_device *dev, void 
*data,
if (submit->fence_id < 0) {
ret = submit->fence_id = 0;
submit->fence_id = 0;
-   goto out;
}
 
-   if (args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
+   if (ret == 0 && args->flags & MSM_SUBMIT_FENCE_FD_OUT) {
struct sync_file *sync_file = 
sync_file_create(submit->user_fence);
if (!sync_file) {
ret = -ENOMEM;
-   goto out;
+   } else {
+   fd_install(out_fence_fd, sync_file->file);
+   args->fence_fd = out_fence_fd;
}
-   fd_install(out_fence_fd, sync_file->file);
-   args->fence_fd = out_fence_fd;
}
 
submit_attach_object_fences(submit);
-- 
2.32.0
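
The point-of-no-return rule described in the commit message above can be sketched in plain C. This is a minimal illustration, not the msm driver code: `struct job`, `job_arm()`, `job_push()` and `submit()` are hypothetical stand-ins, and the fd value is arbitrary. The idea is that once the job is armed, a later failure is recorded in `ret` but the job is still pushed, instead of jumping to an error label.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a scheduler job; not the real drm_sched_job. */
struct job {
	bool armed;   /* fence number has been reserved */
	bool pushed;  /* job was handed to the scheduler */
	int out_fd;   /* -1 when no fence fd was installed */
};

static void job_arm(struct job *j)
{
	j->armed = true;
}

static void job_push(struct job *j)
{
	assert(j->armed); /* pushing an unarmed job would be a bug */
	j->pushed = true;
}

/*
 * Returns 0 or a negative errno. fd_fails simulates a failed fence-fd
 * allocation that happens after arming. The job is pushed either way.
 */
static int submit(struct job *j, bool want_fd, bool fd_fails)
{
	int ret = 0;

	j->out_fd = -1;
	job_arm(j); /* point of no return: no early exit below this line */

	if (want_fd) {
		if (fd_fails)
			ret = -ENOMEM; /* remember the error, keep going */
		else
			j->out_fd = 3; /* arbitrary illustrative fd */
	}

	job_push(j);
	return ret;
}
```

A caller would still see the error code from `submit()`, but the armed job never leaks out unsubmitted.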

Re: [Intel-gfx] [PATCH 08/27] drm/i915/selftests: Add a cancel request selftest that triggers a reset

2021-08-26 Thread Tvrtko Ursulin




On 26/08/2021 04:23, Matthew Brost wrote:

Add a cancel request selftest that results in an engine reset to cancel
the request as it is non-preemptable. Also insert a NOP request after
the cancelled request and confirm that it completes successfully.


Which patch fixes a problem this exposes in the execlists implementation?


Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/i915/selftests/i915_request.c | 100 ++
  1 file changed, 100 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c 
b/drivers/gpu/drm/i915/selftests/i915_request.c
index d67710d10615..e2c5db77f087 100644
--- a/drivers/gpu/drm/i915/selftests/i915_request.c
+++ b/drivers/gpu/drm/i915/selftests/i915_request.c
@@ -772,6 +772,98 @@ static int __cancel_completed(struct intel_engine_cs 
*engine)
return err;
  }
  
+static int __cancel_reset(struct intel_engine_cs *engine)

+{
+   struct intel_context *ce;
+   struct igt_spinner spin;
+   struct i915_request *rq, *nop;
+   unsigned long preempt_timeout_ms;
+   int err = 0;
+


You may need to skip the test if preempt timeout is compiled out or if 
GPU reset is altogether disabled.



+   preempt_timeout_ms = engine->props.preempt_timeout_ms;
+   engine->props.preempt_timeout_ms = 100;
+
+   if (igt_spinner_init(&spin, engine->gt))
+   goto out_restore;
+
+   ce = intel_context_create(engine);
+   if (IS_ERR(ce)) {
+   err = PTR_ERR(ce);
+   goto out_spin;
+   }
+
+   rq = igt_spinner_create_request(&spin, ce, MI_NOOP);
+   if (IS_ERR(rq)) {
+   err = PTR_ERR(rq);
+   goto out_ce;
+   }
+
+   pr_debug("%s: Cancelling active request\n", engine->name);


"active non-preemptable" perhaps?


+   i915_request_get(rq);
+   i915_request_add(rq);
+   if (!igt_wait_for_spinner(&spin, rq)) {
+   struct drm_printer p = drm_info_printer(engine->i915->drm.dev);
+
+   pr_err("Failed to start spinner on %s\n", engine->name);
+   intel_engine_dump(engine, &p, "%s\n", engine->name);
+   err = -ETIME;
+   goto out_rq;
+   }
+
+   nop = intel_context_create_request(ce);
+   if (IS_ERR(nop))
+   goto out_nop;
+   i915_request_get(nop);
+   i915_request_add(nop);
+
+   i915_request_cancel(rq, -EINTR);
+
+   if (i915_request_wait(rq, 0, HZ) < 0) {
+   struct drm_printer p = drm_info_printer(engine->i915->drm.dev);
+
+   pr_err("%s: Failed to cancel hung request\n", engine->name);
+   intel_engine_dump(engine, &p, "%s\n", engine->name);
+   err = -ETIME;
+   goto out_nop;
+   }
+
+   if (rq->fence.error != -EINTR) {
+   pr_err("%s: fence not cancelled (%u)\n",
+  engine->name, rq->fence.error);
+   err = -EINVAL;
+   goto out_nop;
+   }
+
+   if (i915_request_wait(nop, 0, HZ) < 0) {
+   struct drm_printer p = drm_info_printer(engine->i915->drm.dev);
+
+   pr_err("%s: Failed to complete nop request\n", engine->name);
+   intel_engine_dump(engine, &p, "%s\n", engine->name);
+   err = -ETIME;
+   goto out_nop;
+   }
+
+   if (nop->fence.error != 0) {
+   pr_err("%s: Nop request errored (%u)\n",


Maybe s/nop/innocent/ in the respective log messages?


+  engine->name, nop->fence.error);
+   err = -EINVAL;
+   }
+
+out_nop:
+   i915_request_put(nop);
+out_rq:
+   i915_request_put(rq);
+out_ce:
+   intel_context_put(ce);
+out_spin:
+   igt_spinner_fini(&spin);
+out_restore:
+   engine->props.preempt_timeout_ms = preempt_timeout_ms;
+   if (err)
+   pr_err("%s: %s error %d\n", __func__, engine->name, err);
+   return err;
+}
+
  static int live_cancel_request(void *arg)
  {
struct drm_i915_private *i915 = arg;
@@ -804,6 +896,14 @@ static int live_cancel_request(void *arg)
return err;
if (err2)
return err2;
+
+   /* Expects reset so call outside of igt_live_test_* */


Hm there are live tests like live_preempt_cancel which seemingly manage 
to do resets under the live test block.


Regards,

Tvrtko


+   err = __cancel_reset(engine);
+   if (err)
+   return err;
+
+   if (igt_flush_test(i915))
+   return -EIO;
}
  
  	return 0;

Re: [Intel-gfx] [PATCH] drm/i915/snps: constify struct intel_mpllb_state arrays harder

2021-08-26 Thread Jani Nikula

On Wed, 25 Aug 2021, Matt Roper  wrote:
> On Wed, Aug 25, 2021 at 05:58:11PM +0300, Jani Nikula wrote:
>> The tables should be const arrays of const pointers, not just arrays of
>> const pointers.
>> 
>> Cc: Matt Roper 
>> Signed-off-by: Jani Nikula 
>
> Reviewed-by: Matt Roper 

Thanks, pushed.

BR,
Jani.

>
>> ---
>>  drivers/gpu/drm/i915/display/intel_snps_phy.c | 14 +++---
>>  1 file changed, 7 insertions(+), 7 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/display/intel_snps_phy.c 
>> b/drivers/gpu/drm/i915/display/intel_snps_phy.c
>> index d81f71296297..58ec2467ad66 100644
>> --- a/drivers/gpu/drm/i915/display/intel_snps_phy.c
>> +++ b/drivers/gpu/drm/i915/display/intel_snps_phy.c
>> @@ -171,7 +171,7 @@ static const struct intel_mpllb_state dg2_dp_hbr3_100 = {
>>  REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_DEN, 1),
>>  };
>>  
>> -static const struct intel_mpllb_state *dg2_dp_100_tables[] = {
>> +static const struct intel_mpllb_state * const dg2_dp_100_tables[] = {
>>  &dg2_dp_rbr_100,
>>  &dg2_dp_hbr1_100,
>>  &dg2_dp_hbr2_100,
>> @@ -284,7 +284,7 @@ static const struct intel_mpllb_state dg2_dp_hbr3_38_4 = 
>> {
>>  REG_FIELD_PREP(SNPS_PHY_MPLLB_FRACN_QUOT, 61440),
>>  };
>>  
>> -static const struct intel_mpllb_state *dg2_dp_38_4_tables[] = {
>> +static const struct intel_mpllb_state * const dg2_dp_38_4_tables[] = {
>>  &dg2_dp_rbr_38_4,
>>  &dg2_dp_hbr1_38_4,
>>  &dg2_dp_hbr2_38_4,
>> @@ -421,7 +421,7 @@ static const struct intel_mpllb_state dg2_edp_r432 = {
>>  REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_STEPSIZE, 65752),
>>  };
>>  
>> -static const struct intel_mpllb_state *dg2_edp_tables[] = {
>> +static const struct intel_mpllb_state * const dg2_edp_tables[] = {
>>  &dg2_dp_rbr_100,
>>  &dg2_edp_r216,
>>  &dg2_edp_r243,
>> @@ -584,7 +584,7 @@ static const struct intel_mpllb_state dg2_hdmi_594 = {
>>  REG_FIELD_PREP(SNPS_PHY_MPLLB_SSC_UP_SPREAD, 1),
>>  };
>>  
>> -static const struct intel_mpllb_state *dg2_hdmi_tables[] = {
>> +static const struct intel_mpllb_state * const dg2_hdmi_tables[] = {
>>  &dg2_hdmi_25_175,
>>  &dg2_hdmi_27_0,
>>  &dg2_hdmi_74_25,
>> @@ -593,7 +593,7 @@ static const struct intel_mpllb_state *dg2_hdmi_tables[] 
>> = {
>>  NULL,
>>  };
>>  
>> -static const struct intel_mpllb_state **
>> +static const struct intel_mpllb_state * const *
>>  intel_mpllb_tables_get(struct intel_crtc_state *crtc_state,
>> struct intel_encoder *encoder)
>>  {
>> @@ -627,7 +627,7 @@ intel_mpllb_tables_get(struct intel_crtc_state 
>> *crtc_state,
>>  int intel_mpllb_calc_state(struct intel_crtc_state *crtc_state,
>> struct intel_encoder *encoder)
>>  {
>> -const struct intel_mpllb_state **tables;
>> +const struct intel_mpllb_state * const *tables;
>>  int i;
>>  
>>  if (intel_crtc_has_type(crtc_state, INTEL_OUTPUT_HDMI)) {
>> @@ -823,7 +823,7 @@ void intel_mpllb_readout_hw_state(struct intel_encoder 
>> *encoder,
>>  
>>  int intel_snps_phy_check_hdmi_link_rate(int clock)
>>  {
>> -const struct intel_mpllb_state **tables = dg2_hdmi_tables;
>> +const struct intel_mpllb_state * const *tables = dg2_hdmi_tables;
>>  int i;
>>  
>>  for (i = 0; tables[i]; i++) {
>> -- 
>> 2.20.1
>> 

-- 
Jani Nikula, Intel Open Source Graphics Center
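
The distinction the patch above draws, between an array of pointers to const data and a const array of such pointers, can be shown with a small standalone sketch. The type `struct pll_state`, the two entries, and `find_state()` are hypothetical names used only for illustration, and the clock values are arbitrary; this is not the intel_snps_phy.c code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state type; stands in for struct intel_mpllb_state. */
struct pll_state {
	int clock;
};

static const struct pll_state rate_a = { .clock = 162000 };
static const struct pll_state rate_b = { .clock = 270000 };

/*
 * "const struct pll_state *tables[]" would protect only the pointed-to
 * data; the slots themselves could still be reassigned at run time.
 * Adding the second const makes the slots read-only as well, so the
 * whole table can live in read-only data.
 */
static const struct pll_state * const tables[] = {
	&rate_a,
	&rate_b,
	NULL, /* terminator, as in the quoted tables */
};

/* Walk a NULL-terminated table and return the entry matching clock. */
static const struct pll_state *
find_state(const struct pll_state * const *t, int clock)
{
	assert(t != NULL);

	for (int i = 0; t[i]; i++)
		if (t[i]->clock == clock)
			return t[i];

	return NULL;
}
```

With the extra qualifier, an assignment such as `tables[0] = &rate_b;` is rejected at compile time, which is why the function return type and the local `tables` variables in the patch need the same `* const *` spelling.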

Re: [Intel-gfx] [PATCH 2/2] drm/i915/debugfs: hook up ttm_resource_manager_debug

2021-08-26 Thread Daniel Vetter

On Thu, Aug 19, 2021 at 09:32:20AM +0200, Thomas Hellström wrote:
> On Wed, 2021-08-18 at 15:58 +0100, Matthew Auld wrote:
> > This should give a more complete view of the various bits of internal
> > resource manager state, for device local-memory.
> > 
> > Signed-off-by: Matthew Auld 
> > Cc: Thomas Hellström 
> > ---
> >  drivers/gpu/drm/i915/i915_debugfs.c | 12 +---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c
> > b/drivers/gpu/drm/i915/i915_debugfs.c
> > index eec0d349ea6a..109e6feed6be 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -238,6 +238,7 @@ i915_debugfs_describe_obj(struct seq_file *m,
> > struct drm_i915_gem_object *obj)
> >  static int i915_gem_object_info(struct seq_file *m, void *data)
> >  {
> > struct drm_i915_private *i915 = node_to_i915(m->private);
> > +   struct drm_printer p = drm_seq_file_printer(m);
> > struct intel_memory_region *mr;
> > enum intel_region_id id;
> >  
> > @@ -245,9 +246,14 @@ static int i915_gem_object_info(struct seq_file
> > *m, void *data)
> >    i915->mm.shrink_count,
> >    atomic_read(&i915->mm.free_count),
> >    i915->mm.shrink_memory);
> > -   for_each_memory_region(mr, i915, id)
> > -   seq_printf(m, "%s: total:%pa, available:%pa bytes\n",
> > -  mr->name, &mr->total, &mr->avail);
> > +   for_each_memory_region(mr, i915, id) {
> > +   seq_printf(m, "%s: ", mr->name);
> > +   if (mr->region_private)
> > +   ttm_resource_manager_debug(mr->region_private, &p);
> > +   else
> > +   seq_printf(m, "total:%pa, available:%pa
> > bytes\n",
> > +  &mr->total, &mr->avail);
> 
> Hm. Shouldn't we make the above intel_memory_region_debug() or perhaps
> intel_memory_region_info() to avoid using memory region internals
> directly here?

Imo we should just embed ttm_resource_manager into our own and not try to
abstract this all away that much. At least in upstream there is just not
going to be another memory region implementation, and for backporting I'm
not sure these abstractions really help that much - we're touching all the
same code still in the end.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

[Intel-gfx] [PULL] drm-intel-next-fixes

2021-08-26 Thread Jani Nikula



Hi Dave & Daniel -

Some pretty straightforward fixes for the merge window.

drm-intel-next-fixes-2021-08-26:
drm/i915 fixes for v5.15-rc1:
- Disable underrun recovery with eDP MSO panels on ADL-P
- Use designated initializers for init/exit table
- Fix some error pointer usages

BR,
Jani.

The following changes since commit 397ab98e2d69cede8a28eab77a171983d14e:

  Merge tag 'drm-msm-next-2021-08-12' of https://gitlab.freedesktop.org/drm/msm 
into drm-next (2021-08-17 10:53:52 +1000)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-intel 
tags/drm-intel-next-fixes-2021-08-26

for you to fetch changes up to fb43ebc83e069625cfeeb2490efc3ffa0013bfa4:

  drm/i915/selftest: Fix use of err in igt_reset_{fail, nop}_engine() 
(2021-08-24 17:23:10 +0300)


drm/i915 fixes for v5.15-rc1:
- Disable underrun recovery with eDP MSO panels on ADL-P
- Use designated initializers for init/exit table
- Fix some error pointer usages


Dan Carpenter (1):
  drm/i915/gt: Potential error pointer dereference in pinned_context()

Kees Cook (1):
  drm/i915: Use designated initializers for init/exit table

Matt Roper (1):
  drm/i915/adl_p: Also disable underrun recovery with MSO

Nathan Chancellor (1):
  drm/i915/selftest: Fix use of err in igt_reset_{fail, nop}_engine()

 drivers/gpu/drm/i915/display/intel_display.c |  3 +++
 drivers/gpu/drm/i915/gt/intel_migrate.c  |  2 +-
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c |  4 +--
 drivers/gpu/drm/i915/i915_module.c   | 37 ++--
 4 files changed, 30 insertions(+), 16 deletions(-)

-- 
Jani Nikula, Intel Open Source Graphics Center

Re: [Intel-gfx] [PATCH 21/33] drm/i915/guc: Connect reset modparam updates to GuC policy flags

2021-08-26 Thread Jani Nikula

On Mon, 26 Jul 2021, Matthew Brost  wrote:
> From: John Harrison 
>
> Changing the reset module parameter has no effect on a running GuC.
> The corresponding entry in the ADS must be updated and then the GuC
> informed via a Host2GuC message.
>
> The new debugfs interface to module parameters allows this to happen.
> However, connecting the parameter data address back to anything useful
> is messy. One option would be to pass a new private data structure
> address through instead of just the parameter pointer. However, that
> means having a new (and different) data structure for each parameter
> and a new (and different) write function for each parameter. This
> method keeps everything generic by instead using a string lookup on
> the directory entry name.
>
> Signed-off-by: John Harrison 
> Signed-off-by: Matthew Brost 
> Reviewed-by: Matthew Brost 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c |  2 +-
>  drivers/gpu/drm/i915/i915_debugfs_params.c | 32 ++
>  2 files changed, 33 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> index 60b73625f686..7797766c56a9 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c
> @@ -99,7 +99,7 @@ static int guc_action_policies_update(struct intel_guc 
> *guc, u32 policy_offset)
>   policy_offset
>   };
>  
> - return intel_guc_send(guc, action, ARRAY_SIZE(action));
> + return intel_guc_send_busy_loop(guc, action, ARRAY_SIZE(action), 0, 
> true);
>  }
>  
>  int intel_guc_global_policies_update(struct intel_guc *guc)
> diff --git a/drivers/gpu/drm/i915/i915_debugfs_params.c 
> b/drivers/gpu/drm/i915/i915_debugfs_params.c
> index 4e2b077692cb..20424275d41e 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs_params.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs_params.c
> @@ -6,9 +6,21 @@
>  #include 
>  
>  #include "i915_debugfs_params.h"
> +#include "gt/intel_gt.h"
> +#include "gt/uc/intel_guc.h"
>  #include "i915_drv.h"
>  #include "i915_params.h"
>  
> +#define MATCH_DEBUGFS_NODE_NAME(_file, _name) \
> + (strcmp((_file)->f_path.dentry->d_name.name, (_name)) == 0)
> +
> +#define GET_I915(i915, name, ptr)\
> + do {\
> + struct i915_params *params; \
> + params = container_of(((void *)(ptr)), typeof(*params), name);  
> \
> + (i915) = container_of(params, typeof(*(i915)), params); \
> + } while (0)
> +
>  /* int param */
>  static int i915_param_int_show(struct seq_file *m, void *data)
>  {
> @@ -24,6 +36,16 @@ static int i915_param_int_open(struct inode *inode, struct 
> file *file)
>   return single_open(file, i915_param_int_show, inode->i_private);
>  }
>  
> +static int notify_guc(struct drm_i915_private *i915)
> +{
> + int ret = 0;
> +
> + if (intel_uc_uses_guc_submission(&i915->gt.uc))
> + ret = intel_guc_global_policies_update(&i915->gt.uc.guc);
> +
> + return ret;
> +}
> +
>  static ssize_t i915_param_int_write(struct file *file,
>   const char __user *ubuf, size_t len,
>   loff_t *offp)
> @@ -81,8 +103,10 @@ static ssize_t i915_param_uint_write(struct file *file,
>const char __user *ubuf, size_t len,
>loff_t *offp)
>  {
> + struct drm_i915_private *i915;
>   struct seq_file *m = file->private_data;
>   unsigned int *value = m->private;
> + unsigned int old = *value;
>   int ret;
>  
>   ret = kstrtouint_from_user(ubuf, len, 0, value);
> @@ -95,6 +119,14 @@ static ssize_t i915_param_uint_write(struct file *file,
>   *value = b;
>   }
>  
> + if (!ret && MATCH_DEBUGFS_NODE_NAME(file, "reset")) {
> + GET_I915(i915, reset, value);
> +
> + ret = notify_guc(i915);
> + if (ret)
> + *value = old;
> + }

Only stumbled on this now. It was never the idea to add this kind of
checks in the middle of the generic functions. What if the type was bool
or ulong, where the generic function is a debugfs helper outside of
i915?

See the comment in i915_debugfs_params() that I added there exactly
because I envisioned someone was going to need this facility:

/*
 * Note: We could create files for params needing special handling
 * here. Set mode in params to 0 to skip the generic create file, or
 * just let the generic create file fail silently with -EEXIST.
 */

The idea was that you create your own handlers for params that need
special handling.


BR,
Jani.


> +
>   return ret ?: len;
>  }

-- 
Jani Nikula, Intel Open Source Graphics Center
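
As a rough illustration of the suggestion above, a parameter that needs special handling can get its own write handler rather than a name check inside the generic one. The sketch below is standalone C under stated assumptions: `struct params`, `struct device`, `notify()` and `reset_param_write()` are hypothetical names, and `container_of` is redefined locally with `offsetof`. It also shows the two-step `container_of` walk that the quoted `GET_I915` macro performs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Local definition for the sketch; the kernel has its own container_of. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical parameter block embedded in a hypothetical device. */
struct params {
	int enable_foo;
	unsigned int reset;
};

struct device {
	struct params params;
	int notify_fails; /* test knob: make notify() fail */
	int notified;     /* number of successful notifications */
};

/* Walk from the parameter field back to the device that embeds it. */
static struct device *device_from_reset(unsigned int *value)
{
	struct params *p = container_of(value, struct params, reset);

	return container_of(p, struct device, params);
}

static int notify(struct device *dev)
{
	if (dev->notify_fails)
		return -EIO;
	dev->notified++;
	return 0;
}

/* Dedicated handler for "reset": update, notify, roll back on failure. */
static int reset_param_write(unsigned int *value, unsigned int new_value)
{
	struct device *dev = device_from_reset(value);
	unsigned int old = *value;
	int ret;

	*value = new_value;
	ret = notify(dev);
	if (ret)
		*value = old;

	return ret;
}
```

Because the handler is specific to one parameter, it knows which field it was given and needs no string lookup on the file name; the generic int/uint/bool handlers stay untouched.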

Re: [Intel-gfx] [PATCH] drm/i915/pci: rename functions to have i915_pci prefix

2021-08-26 Thread Jani Nikula

On Wed, 25 Aug 2021, Rodrigo Vivi  wrote:
> On Wed, Aug 25, 2021 at 06:06:23PM +0300, Jani Nikula wrote:
>> Follow the usual naming conventions. While at it, fix i915_pci.h SPDX
>> license comment format and add header include guards.
>> 
>> Cc: Daniel Vetter 
>> Signed-off-by: Jani Nikula 
>
> Reviewed-by: Rodrigo Vivi 

Thanks, pushed to drm-intel-gt-next.

BR,
Jani.


>
>> ---
>>  drivers/gpu/drm/i915/i915_module.c |  4 ++--
>>  drivers/gpu/drm/i915/i915_pci.c|  4 ++--
>>  drivers/gpu/drm/i915/i915_pci.h| 12 
>>  3 files changed, 12 insertions(+), 8 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/i915_module.c 
>> b/drivers/gpu/drm/i915/i915_module.c
>> index d8b4482c69d0..ab2295dd4500 100644
>> --- a/drivers/gpu/drm/i915/i915_module.c
>> +++ b/drivers/gpu/drm/i915/i915_module.c
>> @@ -67,8 +67,8 @@ static const struct {
>>  { .init = i915_mock_selftests },
>>  { .init = i915_pmu_init,
>>.exit = i915_pmu_exit },
>> -{ .init = i915_register_pci_driver,
>> -  .exit = i915_unregister_pci_driver },
>> +{ .init = i915_pci_register_driver,
>> +  .exit = i915_pci_unregister_driver },
>>  { .init = i915_perf_sysctl_register,
>>.exit = i915_perf_sysctl_unregister },
>>  };
>> diff --git a/drivers/gpu/drm/i915/i915_pci.c 
>> b/drivers/gpu/drm/i915/i915_pci.c
>> index 96cfd6427cec..146f7e39182a 100644
>> --- a/drivers/gpu/drm/i915/i915_pci.c
>> +++ b/drivers/gpu/drm/i915/i915_pci.c
>> @@ -1235,12 +1235,12 @@ static struct pci_driver i915_pci_driver = {
>>  .driver.pm = &i915_pm_ops,
>>  };
>>  
>> -int i915_register_pci_driver(void)
>> +int i915_pci_register_driver(void)
>>  {
>>  return pci_register_driver(&i915_pci_driver);
>>  }
>>  
>> -void i915_unregister_pci_driver(void)
>> +void i915_pci_unregister_driver(void)
>>  {
>>  pci_unregister_driver(&i915_pci_driver);
>>  }
>> diff --git a/drivers/gpu/drm/i915/i915_pci.h 
>> b/drivers/gpu/drm/i915/i915_pci.h
>> index b386f319f52e..ee048c238174 100644
>> --- a/drivers/gpu/drm/i915/i915_pci.h
>> +++ b/drivers/gpu/drm/i915/i915_pci.h
>> @@ -1,8 +1,12 @@
>> +/* SPDX-License-Identifier: MIT */
>>  /*
>> - * SPDX-License-Identifier: MIT
>> - *
>>   * Copyright © 2021 Intel Corporation
>>   */
>>  
>> -int i915_register_pci_driver(void);
>> -void i915_unregister_pci_driver(void);
>> +#ifndef __I915_PCI_H__
>> +#define __I915_PCI_H__
>> +
>> +int i915_pci_register_driver(void);
>> +void i915_pci_unregister_driver(void);
>> +
>> +#endif /* __I915_PCI_H__ */
>> -- 
>> 2.20.1
>> 

-- 
Jani Nikula, Intel Open Source Graphics Center

Re: [Intel-gfx] [PATCH] drm/i915: Ensure wa_init_finish() is called for ctx workaround list

2021-08-26 Thread Tvrtko Ursulin




On 26/08/2021 04:35, Matt Roper wrote:

A recent restructuring of our context workaround list initialization
added an early return for non-render engines; this caused us to
potentially miss the wa_init_finish() call at the end of the function.
The mistake is pretty harmless --- the only impact is that for non-render
engines on graphics version 12.50+ platforms we don't trim down the
workaround list to reclaim some memory, and we don't print the usual
"Initialized 1 context workaround" message in dmesg.  Let's change the
early return to a jump down to the wa_init_finish() call at the bottom
of the function.

Reported-by: Tvrtko Ursulin 
Fixes: 9e9dfd080201 ("drm/i915/dg2: Maintain backward-compatible nested batch 
behavior")
Signed-off-by: Matt Roper 
---
  drivers/gpu/drm/i915/gt/intel_workarounds.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 688ed04edbf6..94e1937f8d29 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -689,7 +689,7 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
fakewa_disable_nestedbb_mode(engine, wal);
  
  	if (engine->class != RENDER_CLASS)

-   return;
+   goto done;
  
  	if (IS_DG1(i915))

dg1_ctx_workarounds_init(engine, wal);
@@ -720,6 +720,7 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
else
MISSING_CASE(GRAPHICS_VER(i915));
  
+done:

wa_init_finish(wal);
  }
  



Reviewed-by: Tvrtko Ursulin 

Regards,

Tvrtko
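
The fix reviewed above replaces an early `return` with a jump to a label placed just before the finalization call. A minimal standalone sketch of that pattern follows; `struct wa_list`, `wa_add()`, `wa_finish()` and `init_ctx_wa()` are hypothetical stand-ins, not the intel_workarounds.c functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical workaround list; only tracks what the sketch needs. */
struct wa_list {
	int count;     /* workarounds added so far */
	bool finished; /* set once the list has been finalized */
};

static void wa_add(struct wa_list *wal)
{
	wal->count++;
}

static void wa_finish(struct wa_list *wal)
{
	assert(!wal->finished); /* finalize exactly once */
	wal->finished = true;
}

static void init_ctx_wa(struct wa_list *wal, bool is_render)
{
	wa_add(wal); /* applies to every engine class */

	if (!is_render)
		goto done; /* a bare "return" here would skip wa_finish() */

	wa_add(wal); /* render-only workaround */

done:
	wa_finish(wal);
}
```

Both paths now reach the finalization step, so the non-render case gets the same trimming and logging as the render case.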

Re: [Intel-gfx] [GIT PULL] drm-misc + drm-intel: Add support for out-of-band hotplug notification

2021-08-26 Thread Maxime Ripard

On Wed, Aug 25, 2021 at 04:03:43PM +, Vivi, Rodrigo wrote:
> On Tue, 2021-08-24 at 18:48 +0200, Hans de Goede wrote:
> > Hi,
> > 
> > On 8/24/21 10:45 AM, Jani Nikula wrote:
> > > On Fri, 20 Aug 2021, Hans de Goede  wrote:
> > > > Hello drm-misc and drm-intel maintainers,
> > > > 
> > > > My "Add support for out-of-band hotplug notification" patchset:
> > > > https://patchwork.freedesktop.org/series/93763/
> > > > 
> > > > Is ready for merging now, as discussed on IRC I based this series
> > > > on top of drm-tip and when trying to apply the i915 parts on top
> > > > of drm-misc this fails due to conflict.
> > > > 
> > > > So as Jani suggested here is a pull-req for a topic-branch with
> > > > the
> > > > entire set, minus the troublesome i915 bits. Once this has been
> > > > merged
> > > > into both drm-misc-next and drm-intel-next I can push the 2 i915
> > > > patches to drm-intel-next on top of the merge.
> > > > 
> > > > Note there are also 2 drivers/usb/typec patches in here; these
> > > > have Greg KH's Reviewed-by for merging through the drm tree,
> > > > since this USB code does not change all that much. I also checked
> > > > and the drm-misc-next-2021-08-12 base of this tree contains the
> > > > same last commit to the modified file as usb-next.
> > > > 
> > > > Daniel Vetter mentioned on IRC that it might be better for you to
> > > > simply
> > > > pick-up the series directly from patchwork, that is fine too in
> > > > that
> > > > case don't forget to add:
> > > > 
> > > > Reviewed-by: Lyude Paul 
> > > > 
> > > > To the entire series (given in a reply to the cover-letter)
> > > > 
> > > > And:
> > > > 
> > > > Reviewed-by: Greg Kroah-Hartman 
> > > > 
> > > > To the usb/typec patches (patch 7/8), this was given in reply
> > > > to a previous posting of the series and I forgot to add this
> > > > in the resend.
> > > 
> > > Since this is mostly touching drm core, I think it should be merged
> > > to
> > > drm-misc-next first, and drm-intel-next after. Please let us know.
> > 
> > I agree this should go to drm-misc-next first.
> > 
> > (I was planning on pushing this to drm-misc-next myself,
> > but then ended up going with the topic branch because of the
> > conflict in the i915 bits.)
> 
> Just to be clear and avoid confusion: This pull request does apply
> cleanly on drm-misc-next and drm-intel-next right now.
> 
> I'm just waiting for drm-misc-next maintainers to pull this to
> drm-misc-next so I can pull it to drm-intel-next.
> 
> Maxime, is that your round now?
> or Thomas?

That's me, I just pushed it to drm-misc-next

Thanks!
Maxime



[Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/gt: Register the migrate contexts with their engines

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/i915/gt: Register the migrate contexts with their engines
URL   : https://patchwork.freedesktop.org/series/94058/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10522 -> Patchwork_20899


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/index.html

Known issues


  Here are the changes found in Patchwork_20899 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@cs-gfx:
- fi-rkl-guc: NOTRUN -> [SKIP][1] ([fdo#109315]) +17 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/fi-rkl-guc/igt@amdgpu/amd_ba...@cs-gfx.html

  * igt@amdgpu/amd_basic@memory-alloc:
- fi-kbl-soraka:  NOTRUN -> [SKIP][2] ([fdo#109271])
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/fi-kbl-soraka/igt@amdgpu/amd_ba...@memory-alloc.html

  * igt@kms_force_connector_basic@force-connector-state:
- fi-rkl-11600:   [PASS][3] -> [FAIL][4] ([i915#3983])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/fi-rkl-11600/igt@kms_force_connector_ba...@force-connector-state.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/fi-rkl-11600/igt@kms_force_connector_ba...@force-connector-state.html

  
 Possible fixes 

  * igt@i915_selftest@live@workarounds:
- fi-rkl-guc: [INCOMPLETE][5] ([i915#3920]) -> [PASS][6]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html

  * igt@kms_busy@basic@modeset:
- fi-tgl-1115g4:  [DMESG-WARN][7] ([i915#4002]) -> [PASS][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/fi-tgl-1115g4/igt@kms_busy@ba...@modeset.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/fi-tgl-1115g4/igt@kms_busy@ba...@modeset.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a:
- {fi-dg1-1}: [INCOMPLETE][9] ([i915#3717]) -> [PASS][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/fi-dg1-1/igt@kms_pipe_crc_ba...@suspend-read-crc-pipe-a.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/fi-dg1-1/igt@kms_pipe_crc_ba...@suspend-read-crc-pipe-a.html

  
 Warnings 

  * igt@core_hotunplug@unbind-rebind:
- fi-tgl-1115g4:  [DMESG-WARN][11] ([i915#4002]) -> [DMESG-WARN][12] 
([i915#1982] / [i915#4002])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/fi-tgl-1115g4/igt@core_hotunp...@unbind-rebind.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/fi-tgl-1115g4/igt@core_hotunp...@unbind-rebind.html

  * igt@gem_exec_suspend@basic-s0:
- fi-tgl-1115g4:  [DMESG-WARN][13] ([i915#4002]) -> [FAIL][14] 
([i915#1888])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s0.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s0.html

  * igt@gem_exec_suspend@basic-s3:
- fi-tgl-1115g4:  [DMESG-WARN][15] ([i915#4002]) -> [DMESG-FAIL][16] 
([i915#1888])
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/fi-tgl-1115g4/igt@gem_exec_susp...@basic-s3.html

  * igt@kms_psr@primary_page_flip:
- fi-tgl-1115g4:  [SKIP][17] ([i915#1072]) -> [SKIP][18] ([i915#1072] / 
[i915#1385])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/fi-tgl-1115g4/igt@kms_psr@primary_page_flip.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20899/fi-tgl-1115g4/igt@kms_psr@primary_page_flip.html

  
  {name}: This element is suppressed. This means it is ignored when computing
  the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1385]: https://gitlab.freedesktop.org/drm/intel/issues/1385
  [i915#1888]: https://gitlab.freedesktop.org/drm/intel/issues/1888
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#3717]: https://gitlab.freedesktop.org/drm/intel/issues/3717
  [i915#3920]: https://gitlab.freedesktop.org/drm/intel/issues/3920
  [i915#3983]: https://gitlab.freedesktop.org/drm/intel/issues/3983
  [i915#4002]: https://gitlab.freedesktop.org/drm/intel/issues/4002


Participating hosts (40 -> 34)
--

  Missing(6): fi-ilk-m540 bat-adls-5 fi-hsw-4200u fi-bsw-cyan fi-bdw-samus 
bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10522

Re: [Intel-gfx] [PATCH 0/3] drm/i915: better backlight & panel abstractions

2021-08-26 Thread Jani Nikula

On Wed, 25 Aug 2021, Jani Nikula  wrote:
> On Wed, 25 Aug 2021, Lyude Paul  wrote:
>> Reviewed-by: Lyude Paul  (assuming this still applies)
>>
>> As I mentioned on IRC pretty much all of the DPCD backlight helpers already
>> made it upstream. There are some changes I'm working on right now for VESA
>> backlights that use PWM for controlling the brightness level (so we can
>> hopefully fix https://gitlab.freedesktop.org/drm/intel/-/issues/3680 ,
>> otherwise I've gotta do some more poking with the backlight folks from Intel 
>> I
>> got in touch with), but I have no problem with rebasing this when the time
>> comes.
>
> Thanks!

And pushed.

BR,
Jani.

-- 
Jani Nikula, Intel Open Source Graphics Center

[Intel-gfx] [PATCH] drm/i915/gem: Fix the mman selftest

2021-08-26 Thread Thomas Hellström

Using the I915_MMAP_TYPE_FIXED mmap type requires the TTM backend, so
for that mmap type, use __i915_gem_object_create_user() instead of
i915_gem_object_create_internal(), as we really want to test objects
mmap-able by user-space.

This also means that the out-of-space error happens at object creation
and returns -ENXIO rather than -ENOSPC, so fix the code up to expect
that on out-of-offset-space errors.

Finally, only use I915_MMAP_TYPE_FIXED for LMEM and SMEM for now if
testing on LMEM-capable devices. For stolen LMEM, we still take the
same path as for integrated, as that hasn't been moved over to TTM yet,
and user-space should not be able to create out of stolen LMEM anyway.

Fixes: 7961c5b60f23 ("drm/i915: Add TTM offset argument to mmap.")
Cc: Maarten Lankhorst 
Signed-off-by: Thomas Hellström 
---
 .../drm/i915/gem/selftests/i915_gem_mman.c| 26 +++
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
index b20f5621f62b..68da25e66b69 100644
--- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
@@ -581,6 +581,20 @@ static enum i915_mmap_type default_mapping(struct drm_i915_private *i915)
return I915_MMAP_TYPE_GTT;
 }
 
+static struct drm_i915_gem_object *
+create_sys_or_internal(struct drm_i915_private *i915,
+  unsigned long size)
+{
+   if (HAS_LMEM(i915)) {
+   struct intel_memory_region *sys_region =
+   i915->mm.regions[INTEL_REGION_SMEM];
+
+   return __i915_gem_object_create_user(i915, size, &sys_region, 1);
+   }
+
+   return i915_gem_object_create_internal(i915, size);
+}
+
 static bool assert_mmap_offset(struct drm_i915_private *i915,
   unsigned long size,
   int expected)
@@ -589,7 +603,7 @@ static bool assert_mmap_offset(struct drm_i915_private *i915,
u64 offset;
int ret;
 
-   obj = i915_gem_object_create_internal(i915, size);
+   obj = create_sys_or_internal(i915, size);
if (IS_ERR(obj))
return expected && expected == PTR_ERR(obj);
 
@@ -633,6 +647,7 @@ static int igt_mmap_offset_exhaustion(void *arg)
struct drm_mm_node *hole, *next;
int loop, err = 0;
u64 offset;
+   int enospc = HAS_LMEM(i915) ? -ENXIO : -ENOSPC;
 
/* Disable background reaper */
disable_retire_worker(i915);
@@ -683,14 +698,14 @@ static int igt_mmap_offset_exhaustion(void *arg)
}
 
/* Too large */
-   if (!assert_mmap_offset(i915, 2 * PAGE_SIZE, -ENOSPC)) {
+   if (!assert_mmap_offset(i915, 2 * PAGE_SIZE, enospc)) {
pr_err("Unexpectedly succeeded in inserting too large object into single page hole\n");
err = -EINVAL;
goto out;
}
 
/* Fill the hole, further allocation attempts should then fail */
-   obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
+   obj = create_sys_or_internal(i915, PAGE_SIZE);
if (IS_ERR(obj)) {
err = PTR_ERR(obj);
pr_err("Unable to create object for reclaimed hole\n");
@@ -703,7 +718,7 @@ static int igt_mmap_offset_exhaustion(void *arg)
goto err_obj;
}
 
-   if (!assert_mmap_offset(i915, PAGE_SIZE, -ENOSPC)) {
+   if (!assert_mmap_offset(i915, PAGE_SIZE, enospc)) {
pr_err("Unexpectedly succeeded in inserting object into no holes!\n");
err = -EINVAL;
goto err_obj;
@@ -842,7 +857,8 @@ static bool can_mmap(struct drm_i915_gem_object *obj, enum i915_mmap_type type)
struct drm_i915_private *i915 = to_i915(obj->base.dev);
bool no_map;
 
-   if (HAS_LMEM(i915))
+   if (HAS_LMEM(i915) && (obj->mm.region->id == INTEL_REGION_SMEM ||
+  obj->mm.region->id == INTEL_REGION_LMEM))
return type == I915_MMAP_TYPE_FIXED;
else if (type == I915_MMAP_TYPE_FIXED)
return false;
-- 
2.31.1
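
For readers skimming the diff, the expectation change can be sketched outside the kernel. The following is a minimal userspace model, not the selftest itself: struct fake_device and both helper names are hypothetical, and only the error-code selection and the early-return comparison are taken from the patch above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device; only the LMEM capability matters here. */
struct fake_device {
	bool has_lmem;
};

/*
 * Error expected once the mmap offset space is exhausted. Mirrors the
 * "enospc" local added by the patch: -ENXIO on LMEM-capable devices,
 * -ENOSPC otherwise.
 */
static int expected_exhaustion_error(const struct fake_device *dev)
{
	return dev->has_lmem ? -ENXIO : -ENOSPC;
}

/*
 * Mirrors the early-return check in assert_mmap_offset(): when object
 * creation itself fails with create_err (a negative errno), the check
 * passes only if a failure was expected and the codes match.
 */
static bool creation_failure_matches(int expected, int create_err)
{
	assert(create_err < 0);
	return expected && expected == create_err;
}
```

With this model, a caller that still expects -ENOSPC on an LMEM-capable device would see creation_failure_matches() return false, which is the mismatch the patch addresses.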

[Intel-gfx] ✓ Fi.CI.BAT: success for Enable mipi dsi on XELPD (rev3)

2021-08-26 Thread Patchwork

== Series Details ==

Series: Enable mipi dsi on XELPD (rev3)
URL   : https://patchwork.freedesktop.org/series/93917/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10522 -> Patchwork_20898


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/index.html

Known issues


  Here are the changes found in Patchwork_20898 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@cs-gfx:
- fi-rkl-guc: NOTRUN -> [SKIP][1] ([fdo#109315]) +17 similar issues
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/fi-rkl-guc/igt@amdgpu/amd_ba...@cs-gfx.html
- fi-kbl-soraka:  NOTRUN -> [SKIP][2] ([fdo#109271]) +14 similar issues
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/fi-kbl-soraka/igt@amdgpu/amd_ba...@cs-gfx.html

  * igt@kms_chamelium@dp-crc-fast:
- fi-kbl-7500u:   [PASS][3] -> [FAIL][4] ([i915#1372])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/fi-kbl-7500u/igt@kms_chamel...@dp-crc-fast.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/fi-kbl-7500u/igt@kms_chamel...@dp-crc-fast.html

  
 Possible fixes 

  * igt@i915_module_load@reload:
- fi-tgl-1115g4:  [DMESG-WARN][5] ([i915#4002]) -> [PASS][6] +1 similar 
issue
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/fi-tgl-1115g4/igt@i915_module_l...@reload.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/fi-tgl-1115g4/igt@i915_module_l...@reload.html

  * igt@i915_selftest@live@workarounds:
- fi-rkl-guc: [INCOMPLETE][7] ([i915#3920]) -> [PASS][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html

  
 Warnings 

  * igt@kms_psr@primary_page_flip:
- fi-tgl-1115g4:  [SKIP][9] ([i915#1072]) -> [SKIP][10] ([i915#1072] / 
[i915#1385])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/fi-tgl-1115g4/igt@kms_psr@primary_page_flip.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/fi-tgl-1115g4/igt@kms_psr@primary_page_flip.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [i915#1072]: https://gitlab.freedesktop.org/drm/intel/issues/1072
  [i915#1372]: https://gitlab.freedesktop.org/drm/intel/issues/1372
  [i915#1385]: https://gitlab.freedesktop.org/drm/intel/issues/1385
  [i915#3920]: https://gitlab.freedesktop.org/drm/intel/issues/3920
  [i915#4002]: https://gitlab.freedesktop.org/drm/intel/issues/4002


Participating hosts (40 -> 34)
--

  Missing(6): fi-ilk-m540 bat-adls-5 fi-hsw-4200u fi-bsw-cyan fi-bdw-samus 
bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10522 -> Patchwork_20898

  CI-20190529: 20190529
  CI_DRM_10522: b9b50258869989a477e7c04ac6d21a6e3660048e @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6186: 250081b306c6fa8f95405fab6a7604f1968dd4ec @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20898: b68c166da558fa7fb56f92588e7ffde8d0d851ff @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

b68c166da558 drm/i915/dsi/xelpd: Enable mipi dsi support.
e060d617f06f drm/i915/dsi/xelpd: Add WA to program LP to HS wakeup guardband

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20898/index.html

[Intel-gfx] [PATCH] drm/i915/gt: Register the migrate contexts with their engines

2021-08-26 Thread Thomas Hellström

Pinned contexts, like the migrate contexts, need a reset after resume
since their context image may have been lost. Also, the GuC needs to
register pinned contexts.

Add a list to struct intel_engine_cs where we add all pinned contexts on
creation, and traverse that list at __engine_unpark() time to reset the
pinned contexts.

This fixes the kms_pipe_crc_basic@suspend-read-crc-pipe-a selftest for now,
but proper LMEM backup / restore is needed for full suspend functionality.
However, note that even with full LMEM backup / restore it may be
desirable to keep the reset since backing up the migrate context images
must happen using memcpy() after the migrate context has become inactive,
and for performance and other reasons we want to avoid memcpy() from
LMEM.

Also traverse the list in guc_init_lrc_mapping(), calling
guc_kernel_context_pin() for the pinned contexts, as is already done
for the kernel context.

Cc: Tvrtko Ursulin 
Cc: Matthew Auld 
Cc: Maarten Lankhorst 
Cc: Brost Matthew 
Signed-off-by: Thomas Hellström 
---
 drivers/gpu/drm/i915/gt/intel_context_types.h |  8 
 drivers/gpu/drm/i915/gt/intel_engine_cs.c |  4 
 drivers/gpu/drm/i915/gt/intel_engine_pm.c |  9 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |  7 +++
 drivers/gpu/drm/i915/gt/mock_engine.c |  1 +
 drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 10 +++---
 6 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h
index e54351a170e2..a63631ea0ec4 100644
--- a/drivers/gpu/drm/i915/gt/intel_context_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_context_types.h
@@ -152,6 +152,14 @@ struct intel_context {
/** sseu: Control eu/slice partitioning */
struct intel_sseu sseu;
 
+   /**
+* pinned_contexts_link: List link for the engine's pinned contexts.
+* This is only used if this is a perma-pinned kernel context and
+* the list is assumed to only be manipulated during driver load
+* or unload time so no mutex protection currently.
+*/
+   struct list_head pinned_contexts_link;
+
u8 wa_bb_page; /* if set, page num reserved for context workarounds */
 
struct {
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index 332efea696a5..c606a4714904 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -320,6 +320,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
 
BUILD_BUG_ON(BITS_PER_TYPE(engine->mask) < I915_NUM_ENGINES);
 
+   INIT_LIST_HEAD(&engine->pinned_contexts_list);
engine->id = id;
engine->legacy_idx = INVALID_ENGINE;
engine->mask = BIT(id);
@@ -875,6 +876,8 @@ intel_engine_create_pinned_context(struct intel_engine_cs *engine,
return ERR_PTR(err);
}
 
+   list_add_tail(&ce->pinned_contexts_link, &engine->pinned_contexts_list);
+
/*
 * Give our perma-pinned kernel timelines a separate lockdep class,
 * so that we can use them from within the normal user timelines
@@ -897,6 +900,7 @@ void intel_engine_destroy_pinned_context(struct intel_context *ce)
list_del(&ce->timeline->engine_link);
mutex_unlock(&hwsp->vm->mutex);
 
+   list_del(&ce->pinned_contexts_link);
intel_context_unpin(ce);
intel_context_put(ce);
 }
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
index 1f07ac4e0672..3a5cbbf3e3fe 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c
@@ -72,6 +72,15 @@ static int __engine_unpark(struct intel_wakeref *wf)
   READ_ONCE(*ce->timeline->hwsp_seqno));
}
 
+   list_for_each_entry(ce, &engine->pinned_contexts_list,
+   pinned_contexts_link) {
+   if (ce == engine->kernel_context)
+   continue;
+
+   dbg_poison_ce(ce);
+   ce->ops->reset(ce);
+   }
+
if (engine->unpark)
engine->unpark(engine);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index bfbfe53c23dd..5ae1207c363b 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -307,6 +307,13 @@ struct intel_engine_cs {
 
struct intel_context *kernel_context; /* pinned */
 
+   /**
+* pinned_contexts_list: List of pinned contexts. This list is only
+* assumed to be manipulated during driver load- or unload time and
+* does therefore not have any additional protection.
+*/
+   struct list_head pinned_contexts_list;
+
intel_engine_mask_t saturated; /* submitting semaphores too late? */
 
struct {
diff --git
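
The list handling described in the commit message can be modelled in plain userspace C. This is only a sketch under stated assumptions: struct list_node is a hand-rolled stand-in for the kernel's struct list_head, fake_engine, fake_context and every helper name are hypothetical, and a counter stands in for ce->ops->reset(). The part taken from the patch is the shape of the loop: walk the engine's pinned contexts on unpark and reset each one except the kernel context.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal intrusive list, a userspace stand-in for struct list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

static void list_del(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = node->next = node;
}

/* Hypothetical context: reset_count stands in for ce->ops->reset(). */
struct fake_context {
	int reset_count;
	struct list_node pinned_link;
};

/* Hypothetical engine holding the list of its pinned contexts. */
struct fake_engine {
	struct fake_context *kernel_context;
	struct list_node pinned_contexts;
};

/* Recover the context from its embedded list node. */
#define context_of(node) \
	((struct fake_context *)((char *)(node) - \
				 offsetof(struct fake_context, pinned_link)))

static void engine_init(struct fake_engine *engine,
			struct fake_context *kernel_context)
{
	list_init(&engine->pinned_contexts);
	engine->kernel_context = kernel_context;
}

/* Called once per pinned context at creation time. */
static void engine_add_pinned(struct fake_engine *engine,
			      struct fake_context *ce)
{
	ce->reset_count = 0;
	list_add_tail(&ce->pinned_link, &engine->pinned_contexts);
}

/* Called when a pinned context is destroyed. */
static void engine_remove_pinned(struct fake_context *ce)
{
	list_del(&ce->pinned_link);
}

/*
 * Model of the unpark step: reset every pinned context except the
 * kernel context. Returns how many contexts were reset.
 */
static int engine_unpark(struct fake_engine *engine)
{
	struct list_node *pos;
	int resets = 0;

	assert(engine != NULL);
	for (pos = engine->pinned_contexts.next;
	     pos != &engine->pinned_contexts; pos = pos->next) {
		struct fake_context *ce = context_of(pos);

		if (ce == engine->kernel_context)
			continue;

		ce->reset_count++;
		resets++;
	}
	return resets;
}
```

As in the patch, the list has no locking of its own; the model relies on the same assumption that it is only modified at load or unload time.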

Re: [Intel-gfx] refactor the i915 GVT support

2021-08-26 Thread Zhenyu Wang

On 2021.08.20 12:56:34 -0700, Luis Chamberlain wrote:
> On Fri, Aug 20, 2021 at 04:17:24PM +0200, Christoph Hellwig wrote:
> > On Thu, Aug 19, 2021 at 04:29:29PM +0800, Zhenyu Wang wrote:
> > > I'm working on below patch to resolve this. But I met a weird issue in
> > > case when building i915 as module and also kvmgt module, it caused
> > > busy wait on request_module("kvmgt") when boot, it doesn't happen if
> > > building i915 into kernel. I'm not sure what could be the reason?
> > 
> > Luis, do you know if there is a problem with a request_module from
> > a driver ->probe routine that is probably called by a module_init
> > function itself?
> 
> Generally no, but you can easily shoot yourself in the foot by creating
> cross dependencies and not dealing with them properly. I'd make sure
> to keep module initialization as simple as possible, and run whatever
> takes more time asynchronously, then use a state machine to allow
> you to verify where you are in the initialization phase or query it
> or wait for a completion with a timeout.
> 
> It seems the code in question is getting some spring cleaning, and it's
> unclear where the code I can inspect is. If there's a tree somewhere I
> can take a peek at, I'd be happy to review possible oddities that may stick
> out.

I put the current patches under test here:
https://github.com/intel/gvt-linux/tree/gvt-staging
The issue can be reproduced with CONFIG_DRM_I915=m and
CONFIG_DRM_I915_GVT_KVMGT=m.

> 
> My go-to model for these sorts of problems is to abstract the issue
> *outside* of the driver in question and implement new selftests to
> try to reproduce. This serves two purposes, 1) helps with testing
> 2) may allow you to see the problem more clearly.
> 

I'll see if I can abstract that.

Thanks, Luis.



Re: [Intel-gfx] refactor the i915 GVT support

2021-08-26 Thread Zhenyu Wang

On 2021.08.20 16:17:24 +0200, Christoph Hellwig wrote:
> On Thu, Aug 19, 2021 at 04:29:29PM +0800, Zhenyu Wang wrote:
> > I'm working on below patch to resolve this. But I met a weird issue in
> > case when building i915 as module and also kvmgt module, it caused
> > busy wait on request_module("kvmgt") when boot, it doesn't happen if
> > building i915 into kernel. I'm not sure what could be the reason?
> 
> Luis, do you know if there is a problem with a request_module from
> a driver ->probe routine that is probably called by a module_init
> function itself?
> 
> In the meantime I'll try to reproduce it locally, but I always had a
> hard time getting useful results out of a modular i915, especially
> when combined with module parameters. (no blame on i915, just the problem
> with modules needed early on).
> 
> > 
> > > But the problem I see is that after moving gvt device model (gvt/*.c
> > > except kvmgt.c) into kvmgt module, we'll have issue with initial mmio
> > > state which current gvt relies on, that is in design supposed to get
> > > initial HW state before i915 driver has taken any operation.  Before
> > > we can ensure that, I think we may only remove MPT part first but
> > > still keep gvt device model as part of i915 with config. I'll try to
> > > split that out.
> > 
> > Sorry I misread the code that as we always request kvmgt module when
> > gvt init, so it should still apply original method that this isn't a
> > problem. Our current validation result has shown no regression as well.
> 
> What does initial mmio state mean?  This is something new to me.  But
> as you said in this mail unless I missed something very big it should
> work the same as before.
>

It's gvt's internal tracking of all gfx mmio state, and yes, with your current
change it should still work as before.

> > -static inline void intel_context_unpin(struct intel_context *ce)
> > +static inline void _intel_context_unpin(struct intel_context *ce)
> >  {
> > if (!ce->ops->sched_disable) {
> > __intel_context_do_unpin(ce, 1);
> > @@ -150,6 +150,7 @@ static inline void intel_context_unpin(struct intel_context *ce)
> > }
> > }
> >  }
> > +void intel_context_unpin(struct intel_context *ce);
> 
> Looking at intel_context_unpin/_intel_context_unpin is there really
> a need to have this inline to start with?  I don't see much the compiler
> could optimize by inlining it.

I'll send a patch to i915 for this, and get more comments there.

thanks



Re: [Intel-gfx] refactor the i915 GVT support

2021-08-26 Thread Zhenyu Wang

On 2021.08.19 17:43:43 +0300, Joonas Lahtinen wrote:
> Quoting Zhenyu Wang (2021-08-19 11:29:29)
> > On 2021.08.17 13:22:03 +0800, Zhenyu Wang wrote:
> > > > On 2021.08.16 19:34:58 +0200, Christoph Hellwig wrote:
> > > > > Any updates on this?  I'd really hate to miss this merge window.
> > > > 
> > > > I'm still waiting for our validation team's report on this. I'm afraid
> > > > it might be missing for next version as i915 merge window is mostly
> > > > till rc5...and for any change outside of gvt, it still needs to be
> > > > acked by i915 maintainers.
> > > 
> > > Looks like our validation team did have a problem with a recent i915 change.
> > > If you'd like to try, we have a gvt-staging branch on
> > > https://github.com/intel/gvt-linux which is generated against drm-tip
> > > with gvt changes for testing, currently it's broken.
> > > 
> > > One issue is with i915 export that intel_context_unpin has been
> > > changed into static inline function. Another is that intel_gvt.c
> > > should be part of i915 for gvt interface instead of depending on KVMGT
> > > config.
> > 
> > I'm working on below patch to resolve this. But I met a weird issue in
> > case when building i915 as module and also kvmgt module, it caused
> > busy wait on request_module("kvmgt") when boot, it doesn't happen if
> > building i915 into kernel. I'm not sure what could be the reason?
> > 
> > > But the problem I see is that after moving gvt device model (gvt/*.c
> > > except kvmgt.c) into kvmgt module, we'll have issue with initial mmio
> > > state which current gvt relies on, that is in design supposed to get
> > > initial HW state before i915 driver has taken any operation.
> 
> As mentioned in some past discussions, I think it would be best to rely on
> golden MMIO located in /lib/firmware or elsewhere. This way we will better
> isolate the guest system from host system updates/changes.
> 
> This should also hopefully allow enabling kvmgt module after i915 has
> already loaded, as the initialization would not be conditional to
> capture the MMIO.
> 

I think the concern is that even for the same GEN hw there could be many
variant platforms, e.g. APL with gen9, etc. Verifying golden states for
them all might take too much effort...

> 
> > > Before
> > > we can ensure that, I think we may only remove MPT part first but
> > > still keep gvt device model as part of i915 with config. I'll try to
> > > split that out.
> > 
> > Sorry I misread the code that as we always request kvmgt module when
> > gvt init, so it should still apply original method that this isn't a
> > problem. Our current validation result has shown no regression as well.
> > 
> > ---8<---
> > From 58ff84572f1a0f9d79ca1d7ec0cff5ecbe78d280 Mon Sep 17 00:00:00 2001
> > From: Zhenyu Wang 
> > Date: Thu, 19 Aug 2021 16:36:33 +0800
> > Subject: [PATCH] TESTONLY:drm/i915/gvt: potential fix for refactor against
> >  current tip
> > 
> > ---
> >  drivers/gpu/drm/i915/Makefile   | 4 +++-
> >  drivers/gpu/drm/i915/gt/intel_context.c | 5 +
> >  drivers/gpu/drm/i915/gt/intel_context.h | 3 ++-
> >  drivers/gpu/drm/i915/i915_trace.h   | 1 +
> >  4 files changed, 11 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> > index c4f953837f72..2248574428a1 100644
> > --- a/drivers/gpu/drm/i915/Makefile
> > +++ b/drivers/gpu/drm/i915/Makefile
> > @@ -296,7 +296,9 @@ i915-$(CONFIG_DRM_I915_SELFTEST) += \
> >  
> >  # virtual gpu code
> >  i915-y += i915_vgpu.o
> > -i915-$(CONFIG_DRM_I915_GVT_KVMGT) += intel_gvt.o
> > +ifneq ($(CONFIG_DRM_I915_GVT_KVMGT),)
> > +i915-y += intel_gvt.o
> > +endif
> >  
> >  kvmgt-y += gvt/kvmgt.o \
> > gvt/gvt.o \
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
> > index 745e84c72c90..20e7522fed84 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_context.c
> > @@ -328,6 +328,11 @@ void __intel_context_do_unpin(struct intel_context *ce, int sub)
> > intel_context_put(ce);
> >  }
> >  
> > +void intel_context_unpin(struct intel_context *ce)
> > +{
> > +   _intel_context_unpin(ce);
> > +}
> > +
> >  static void __intel_context_retire(struct i915_active *active)
> >  {
> > struct intel_context *ce = container_of(active, typeof(*ce), active);
> > diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h
> > index c41098950746..f942cbf6300a 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_context.h
> > +++ b/drivers/gpu/drm/i915/gt/intel_context.h
> > @@ -131,7 +131,7 @@ static inline void intel_context_sched_disable_unpin(struct intel_context *ce)
> > __intel_context_do_unpin(ce, 2);
> >  }
> >  
> > -static inline void intel_context_unpin(struct intel_context *ce)
> > +static inline void _intel_context_unpin(struct intel_context *ce)
> >  {
> > if (!ce->ops->sched_disable) {
> >

[Intel-gfx] ✗ Fi.CI.SPARSE: warning for Enable mipi dsi on XELPD (rev3)

2021-08-26 Thread Patchwork

== Series Details ==

Series: Enable mipi dsi on XELPD (rev3)
URL   : https://patchwork.freedesktop.org/series/93917/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
-
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1374:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1374:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/i915_gem_context.c:1374:34: warning: incorrect type 
in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:expected struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25:got struct 
i915_address_space *
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:43:25: warning: incorrect 
type in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:expected struct 
i915_address_space *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34:got struct 
i915_address_space [noderef] __rcu *vm
+drivers/gpu/drm/i915/gem/selftests/mock_context.c:60:34: warning: incorrect 
type in argument 1 (different address spaces)
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy 
expression type 31
+drivers/gpu/drm/i915/gt/intel_reset.c:1392:5: warning: context imbalance in 
'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/i915_perf.c:1442:15: warning: memset with byte count of 
16777216
+drivers/gpu/drm/i915/i915_perf.c:1496:15: warning: memset with byte count of 
16777216
+./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative 
(-262080)
+./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative 
(-262080)
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 
'gen12_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read16' 
-

[Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915: Ensure wa_init_finish() is called for ctx workaround list

2021-08-26 Thread Patchwork

== Series Details ==

Series: drm/i915: Ensure wa_init_finish() is called for ctx workaround list
URL   : https://patchwork.freedesktop.org/series/94053/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10522 -> Patchwork_20897


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_20897 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20897, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20897/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_20897:

### IGT changes ###

 Possible regressions 

  * igt@i915_selftest@live@gt_heartbeat:
- fi-rkl-11600:   [PASS][1] -> [DMESG-FAIL][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/fi-rkl-11600/igt@i915_selftest@live@gt_heartbeat.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20897/fi-rkl-11600/igt@i915_selftest@live@gt_heartbeat.html

  
Known issues


  Here are the changes found in Patchwork_20897 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@amdgpu/amd_basic@cs-gfx:
- fi-rkl-guc: NOTRUN -> [SKIP][3] ([fdo#109315]) +17 similar issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20897/fi-rkl-guc/igt@amdgpu/amd_ba...@cs-gfx.html

  * igt@i915_selftest@live@gt_lrc:
- fi-rkl-guc: NOTRUN -> [DMESG-WARN][4] ([i915#3958])
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20897/fi-rkl-guc/igt@i915_selftest@live@gt_lrc.html

  * igt@kms_cursor_legacy@basic-flip-after-cursor-legacy:
- fi-rkl-11600:   [PASS][5] -> [SKIP][6] ([fdo#111825])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/fi-rkl-11600/igt@kms_cursor_leg...@basic-flip-after-cursor-legacy.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20897/fi-rkl-11600/igt@kms_cursor_leg...@basic-flip-after-cursor-legacy.html

  
 Possible fixes 

  * igt@i915_selftest@live@workarounds:
- fi-rkl-guc: [INCOMPLETE][7] ([i915#3920]) -> [PASS][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10522/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20897/fi-rkl-guc/igt@i915_selftest@l...@workarounds.html

  
  [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315
  [fdo#111825]: https://bugs.freedesktop.org/show_bug.cgi?id=111825
  [i915#3920]: https://gitlab.freedesktop.org/drm/intel/issues/3920
  [i915#3958]: https://gitlab.freedesktop.org/drm/intel/issues/3958


Participating hosts (40 -> 33)
--

  Missing(7): fi-ilk-m540 bat-adls-5 fi-hsw-4200u fi-tgl-1115g4 fi-bsw-cyan 
fi-bdw-samus bat-jsl-1 


Build changes
-

  * Linux: CI_DRM_10522 -> Patchwork_20897

  CI-20190529: 20190529
  CI_DRM_10522: b9b50258869989a477e7c04ac6d21a6e3660048e @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6186: 250081b306c6fa8f95405fab6a7604f1968dd4ec @ 
https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20897: 62aa62c0a572e2f0d442321e5dc197c51e896e1c @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

62aa62c0a572 drm/i915: Ensure wa_init_finish() is called for ctx workaround list

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20897/index.html
