date:20200321

Re: [Intel-gfx] [PATCH 1/3] drm/i915/perf: rework aging tail workaround

2020-03-21 Thread Dixit, Ashutosh

On Sat, 21 Mar 2020 16:26:42 -0700, Dixit, Ashutosh wrote:
>
> On Thu, 19 Mar 2020 15:52:01 -0700, Umesh Nerlige Ramappa wrote:
> >
> > From: Lionel Landwerlin 
>
> > @@ -477,16 +468,6 @@ static bool oa_buffer_check_unlocked(struct 
> > i915_perf_stream *stream)
> >  */
> > spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
> >
> > hw_tail = stream->perf->ops.oa_hw_tail_read(stream);
> >
> > hw_tail &= ~(report_size - 1);
> >
> > @@ -496,64 +477,64 @@ static bool oa_buffer_check_unlocked(struct 
> > i915_perf_stream *stream)
> >
> > now = ktime_get_mono_fast_ns();
> >
> > +   if (hw_tail == stream->oa_buffer.aging_tail &&
> > +  (now - stream->oa_buffer.aging_timestamp) > OA_TAIL_MARGIN_NSEC) {
> > +   /* If the HW tail hasn't move since the last check and the HW
> > +* tail has been aging for long enough, declare it the new
> > +* tail.
> > +*/
> > +   stream->oa_buffer.tail = stream->oa_buffer.aging_tail;
> > +   } else {
> > +   u32 head, tail;
> >
> > +   /* NB: The head we observe here might effectively be a little
> > +* out of date. If a read() is in progress, the head could be
> > +* anywhere between this head and stream->oa_buffer.tail.
> > +*/
> > +   head = stream->oa_buffer.head - gtt_offset;
> >
> > +   hw_tail -= gtt_offset;
> > +   tail = hw_tail;
> >
> > +   /* Walk the stream backward until we find a report with dword 0
> > +* & 1 not at 0. Since the circular buffer pointers progress by
> > +* increments of 64 bytes and that reports can be up to 256
> > +* bytes long, we can't tell whether a report has fully landed
> > +* in memory before the first 2 dwords of the following report
> > +* have effectively landed.
> > +*
> > +* This is assuming that the writes of the OA unit land in
> > +* memory in the order they were written to.
> > +* If not : (╯°□°）╯︵ ┻━┻
> >  */
> > +   while (OA_TAKEN(tail, head) >= report_size) {
> > +   u32 previous_tail = (tail - report_size) & 
> > (OA_BUFFER_SIZE - 1);
> > +   u32 *report32 = (void *)(stream->oa_buffer.vaddr + 
> > previous_tail);
>
> Sorry, this is wrong. This should just be:
>
>   tail = (tail - report_size) & (OA_BUFFER_SIZE - 1);
>   report32 = (void *)(stream->oa_buffer.vaddr + tail);
>
> Otherwise when we break out of the loop below tail is still set one
> report_size ahead. previous_tail is not needed. (In the previous version of
> the patch this used to work out correctly).
>
> > +
> > +   /* Head of the report indicated by the HW tail register 
> > has
> > +* indeed landed into memory.
> > +*/
> > +   if (report32[0] != 0 || report32[1] != 0)
> > +   break;
> > +
> > +   tail = previous_tail;
> > }

Actually a couple of further improvements to the loop above are
possible. First there is no reason to start at previous_tail, we can just
start at the aligned hw_tail itself. Therefore the loop becomes:

while (OA_TAKEN(tail, head) >= report_size) {
u32 *report32 = (void *)(stream->oa_buffer.vaddr + 
tail);

if (report32[0] != 0 || report32[1] != 0)
break;

tail = (tail - report_size) & (OA_BUFFER_SIZE - 1);
}

Further, there is no reason to go back to the head but only to the old
tail. Therefore:

head = stream->oa_buffer.head - gtt_offset;
old_tail = stream->oa_buffer.tail - gtt_offset;

hw_tail -= gtt_offset;
tail = hw_tail;

while (OA_TAKEN(tail, old_tail) >= report_size) {
u32 *report32 = (void *)(stream->oa_buffer.vaddr + 
tail);

if (report32[0] != 0 || report32[1] != 0)
break;

tail = (tail - report_size) & (OA_BUFFER_SIZE - 1);
}

Please review and see if these two improvements are possible. Thanks!
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 1/3] drm/i915/perf: rework aging tail workaround

2020-03-21 Thread Dixit, Ashutosh

On Thu, 19 Mar 2020 15:52:01 -0700, Umesh Nerlige Ramappa wrote:
>
> From: Lionel Landwerlin 
>
> We're about to introduce an options to open the perf stream, giving
> the user ability to configure how often it wants the kernel to poll
> the OA registers for available data.
>
> Right now the workaround against the OA tail pointer race condition
> requires at least twice the internal kernel polling timer to make any
> data available.
>
> This changes introduce checks on the OA data written into the circular
> buffer to make as much data as possible available on the first
> iteration of the polling timer.

/snip/

> diff --git a/drivers/gpu/drm/i915/i915_perf.c 
> b/drivers/gpu/drm/i915/i915_perf.c
> index 3222f6cd8255..c1429d3acaf9 100644
> --- a/drivers/gpu/drm/i915/i915_perf.c
> +++ b/drivers/gpu/drm/i915/i915_perf.c
> @@ -223,26 +223,17 @@
>   *
>   * Although this can be observed explicitly while copying reports to 
> userspace
>   * by checking for a zeroed report-id field in tail reports, we want to 
> account
> - * for this earlier, as part of the oa_buffer_check to avoid lots of 
> redundant
> - * read() attempts.
> - *
> - * In effect we define a tail pointer for reading that lags the real tail
> - * pointer by at least %OA_TAIL_MARGIN_NSEC nanoseconds, which gives enough
> - * time for the corresponding reports to become visible to the CPU.
> - *
> - * To manage this we actually track two tail pointers:
> - *  1) An 'aging' tail with an associated timestamp that is tracked until we
> - * can trust the corresponding data is visible to the CPU; at which point
> - * it is considered 'aged'.
> - *  2) An 'aged' tail that can be used for read()ing.
> - *
> - * The two separate pointers let us decouple read()s from tail pointer aging.
> - *
> - * The tail pointers are checked and updated at a limited rate within a 
> hrtimer
> - * callback (the same callback that is used for delivering EPOLLIN events)
> - *
> - * Initially the tails are marked invalid with %INVALID_TAIL_PTR which
> - * indicates that an updated tail pointer is needed.
> + * for this earlier, as part of the oa_buffer_check_unlocked to avoid lots of
> + * redundant read() attempts.
> + *
> + * We workaround this issue in oa_buffer_check_unlocked() by reading the 
> reports
> + * in the OA buffer, starting from the tail reported by the HW until we find 
> 2
> + * consecutive reports with their first 2 dwords of not at 0. Those dwords 
> are

until we find a report with its first 2 dwords not 0 meaning its previous
report is completely in memory and ready to be read.

> + * also set to 0 once read and the whole buffer is cleared upon OA buffer
> + * initialization. The first dword is the reason for this report while the
> + * second is the timestamp, making the chances of having those 2 fields at 0
> + * fairly unlikely. A more detailed explanation is available in
> + * oa_buffer_check_unlocked().

> @@ -477,16 +468,6 @@ static bool oa_buffer_check_unlocked(struct 
> i915_perf_stream *stream)
>*/
>   spin_lock_irqsave(&stream->oa_buffer.ptr_lock, flags);
>
>   hw_tail = stream->perf->ops.oa_hw_tail_read(stream);
>
>   hw_tail &= ~(report_size - 1);
>
> @@ -496,64 +477,64 @@ static bool oa_buffer_check_unlocked(struct 
> i915_perf_stream *stream)
>
>   now = ktime_get_mono_fast_ns();
>
> + if (hw_tail == stream->oa_buffer.aging_tail &&
> +(now - stream->oa_buffer.aging_timestamp) > OA_TAIL_MARGIN_NSEC) {
> + /* If the HW tail hasn't move since the last check and the HW
> +  * tail has been aging for long enough, declare it the new
> +  * tail.
> +  */
> + stream->oa_buffer.tail = stream->oa_buffer.aging_tail;
> + } else {
> + u32 head, tail;
>
> + /* NB: The head we observe here might effectively be a little
> +  * out of date. If a read() is in progress, the head could be
> +  * anywhere between this head and stream->oa_buffer.tail.
> +  */
> + head = stream->oa_buffer.head - gtt_offset;
>
> + hw_tail -= gtt_offset;
> + tail = hw_tail;
>
> + /* Walk the stream backward until we find a report with dword 0
> +  * & 1 not at 0. Since the circular buffer pointers progress by
> +  * increments of 64 bytes and that reports can be up to 256
> +  * bytes long, we can't tell whether a report has fully landed
> +  * in memory before the first 2 dwords of the following report
> +  * have effectively landed.
> +  *
> +  * This is assuming that the writes of the OA unit land in
> +  * memory in the order they were written to.
> +  * If not : (╯°□°）╯︵ ┻━┻
>*/
> - if (hw_tail >= gtt_offset &&
> - hw_tail < (gtt_offset + OA_BUFFER_SIZE)) {
> - stream->oa_buffer.tails[!aged_id

[Intel-gfx] ✓ Fi.CI.IGT: success for series starting with [1/4] drm/i915: Extend intel_wakeref to support delayed puts (rev2)

2020-03-21 Thread Patchwork

== Series Details ==

Series: series starting with [1/4] drm/i915: Extend intel_wakeref to support 
delayed puts (rev2)
URL   : https://patchwork.freedesktop.org/series/74938/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8167_full -> Patchwork_17043_full


Summary
---

  **WARNING**

  Minor unknown changes coming with Patchwork_17043_full need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_17043_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
---

  Here are the unknown changes that may have been introduced in 
Patchwork_17043_full:

### IGT changes ###

 Warnings 

  * igt@gem_ctx_persistence@close-replace-race:
- shard-apl:  [INCOMPLETE][1] ([fdo#103927] / [i915#1402]) -> 
[DMESG-WARN][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/shard-apl6/igt@gem_ctx_persiste...@close-replace-race.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17043/shard-apl2/igt@gem_ctx_persiste...@close-replace-race.html

  
New tests
-

  New tests have been introduced between CI_DRM_8167_full and 
Patchwork_17043_full:

### New IGT tests (3) ###

  * igt@gem_mmap@pf-nonblock:
- Statuses : 8 pass(s)
- Exec time: [0.00, 0.01] s

  * igt@gem_mmap_gtt@pf-nonblock:
- Statuses : 7 pass(s)
- Exec time: [0.00, 0.02] s

  * igt@gem_mmap_wc@pf-nonblock:
- Statuses : 8 pass(s)
- Exec time: [0.00, 0.01] s

  

Known issues


  Here are the changes found in Patchwork_17043_full that come from known 
issues:

### IGT changes ###

 Issues hit 

  * igt@gem_busy@busy-vcs1:
- shard-iclb: [PASS][3] -> [SKIP][4] ([fdo#112080]) +21 similar 
issues
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/shard-iclb2/igt@gem_b...@busy-vcs1.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17043/shard-iclb6/igt@gem_b...@busy-vcs1.html

  * igt@gem_ctx_isolation@rcs0-s3:
- shard-kbl:  [PASS][5] -> [DMESG-WARN][6] ([i915#180]) +1 similar 
issue
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/shard-kbl2/igt@gem_ctx_isolat...@rcs0-s3.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17043/shard-kbl7/igt@gem_ctx_isolat...@rcs0-s3.html

  * igt@gem_ctx_shared@exec-single-timeline-bsd:
- shard-iclb: [PASS][7] -> [SKIP][8] ([fdo#110841])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/shard-iclb5/igt@gem_ctx_sha...@exec-single-timeline-bsd.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17043/shard-iclb2/igt@gem_ctx_sha...@exec-single-timeline-bsd.html

  * igt@gem_exec_balancer@hang:
- shard-tglb: [PASS][9] -> [FAIL][10] ([i915#1277])
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/shard-tglb6/igt@gem_exec_balan...@hang.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17043/shard-tglb6/igt@gem_exec_balan...@hang.html

  * igt@gem_exec_schedule@implicit-read-write-bsd2:
- shard-iclb: [PASS][11] -> [SKIP][12] ([fdo#109276] / [i915#677])
   [11]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/shard-iclb2/igt@gem_exec_sched...@implicit-read-write-bsd2.html
   [12]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17043/shard-iclb6/igt@gem_exec_sched...@implicit-read-write-bsd2.html

  * igt@gem_exec_schedule@implicit-write-read-bsd:
- shard-iclb: [PASS][13] -> [SKIP][14] ([i915#677])
   [13]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/shard-iclb8/igt@gem_exec_sched...@implicit-write-read-bsd.html
   [14]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17043/shard-iclb4/igt@gem_exec_sched...@implicit-write-read-bsd.html

  * igt@gem_exec_schedule@in-order-bsd:
- shard-iclb: [PASS][15] -> [SKIP][16] ([fdo#112146]) +7 similar 
issues
   [15]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/shard-iclb5/igt@gem_exec_sched...@in-order-bsd.html
   [16]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17043/shard-iclb2/igt@gem_exec_sched...@in-order-bsd.html

  * igt@i915_pm_dc@dc6-psr:
- shard-skl:  [PASS][17] -> [FAIL][18] ([i915#454])
   [17]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/shard-skl6/igt@i915_pm...@dc6-psr.html
   [18]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17043/shard-skl10/igt@i915_pm...@dc6-psr.html

  * igt@i915_selftest@live@mman:
- shard-apl:  [PASS][19] -> [INCOMPLETE][20] ([fdo#103927])
   [19]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/shard-apl4/igt@i915_selftest@l...@mman.html
   [20]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17043/shard-apl7/igt@i915_selftest@l...@mman.html

  * igt@kms_cursor_crc@pipe-a-cursor-suspend:
- shard-skl:  [PASS][21] -> [INCOMPL

[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/4] drm/i915: Extend intel_wakeref to support delayed puts (rev2)

2020-03-21 Thread Patchwork

== Series Details ==

Series: series starting with [1/4] drm/i915: Extend intel_wakeref to support 
delayed puts (rev2)
URL   : https://patchwork.freedesktop.org/series/74938/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_8167 -> Patchwork_17043


Summary
---

  **SUCCESS**

  No regressions found.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17043/index.html

Known issues


  Here are the changes found in Patchwork_17043 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@i915_selftest@live@execlists:
- fi-bxt-dsi: [PASS][1] -> [INCOMPLETE][2] ([fdo#103927] / 
[i915#656])
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/fi-bxt-dsi/igt@i915_selftest@l...@execlists.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17043/fi-bxt-dsi/igt@i915_selftest@l...@execlists.html

  
 Possible fixes 

  * igt@kms_chamelium@hdmi-crc-fast:
- fi-icl-u2:  [FAIL][3] ([fdo#109635] / [i915#217]) -> [PASS][4]
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/fi-icl-u2/igt@kms_chamel...@hdmi-crc-fast.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17043/fi-icl-u2/igt@kms_chamel...@hdmi-crc-fast.html

  * igt@kms_chamelium@hdmi-hpd-fast:
- fi-kbl-7500u:   [FAIL][5] ([fdo#111407]) -> [PASS][6]
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/fi-kbl-7500u/igt@kms_chamel...@hdmi-hpd-fast.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17043/fi-kbl-7500u/igt@kms_chamel...@hdmi-hpd-fast.html

  
 Warnings 

  * igt@gem_exec_suspend@basic-s4-devices:
- fi-tgl-y:   [FAIL][7] ([CI#94]) -> [INCOMPLETE][8] ([CI#94] / 
[i915#460])
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/fi-tgl-y/igt@gem_exec_susp...@basic-s4-devices.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17043/fi-tgl-y/igt@gem_exec_susp...@basic-s4-devices.html

  
  [CI#94]: https://gitlab.freedesktop.org/gfx-ci/i915-infra/issues/94
  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#109635]: https://bugs.freedesktop.org/show_bug.cgi?id=109635
  [fdo#111407]: https://bugs.freedesktop.org/show_bug.cgi?id=111407
  [i915#217]: https://gitlab.freedesktop.org/drm/intel/issues/217
  [i915#460]: https://gitlab.freedesktop.org/drm/intel/issues/460
  [i915#656]: https://gitlab.freedesktop.org/drm/intel/issues/656


Participating hosts (49 -> 42)
--

  Additional (1): fi-kbl-7560u 
  Missing(8): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan 
fi-ctg-p8600 fi-pnv-d510 fi-byt-clapper fi-bdw-samus 


Build changes
-

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_8167 -> Patchwork_17043

  CI-20190529: 20190529
  CI_DRM_8167: b51a7e7f4f72cf780661a1e4b479e2b27ddbafc8 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5526: f49ebeee9f54d6f23c60a842f75f65561d452ab0 @ 
git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_17043: bed27875a49f92e4f6f2c7d301067b7d3cb3e86c @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

bed27875a49f drm/i915/execlists: Pull tasklet interrupt-bh local to direct 
submission
1aa7fb03cf0c drm/i915: Rely on direct submission to the queue
a3349fcb8e49 drm/i915/gt: Delay release of engine-pm after last retirement
381c9adb228f drm/i915: Extend intel_wakeref to support delayed puts

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17043/index.html
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/4] drm/i915: Extend intel_wakeref to support delayed puts

2020-03-21 Thread Patchwork

== Series Details ==

Series: series starting with [1/4] drm/i915: Extend intel_wakeref to support 
delayed puts
URL   : https://patchwork.freedesktop.org/series/74938/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_8167 -> Patchwork_17042


Summary
---

  **FAILURE**

  Serious unknown changes coming with Patchwork_17042 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_17042, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17042/index.html

Possible new issues
---

  Here are the unknown changes that may have been introduced in Patchwork_17042:

### IGT changes ###

 Possible regressions 

  * igt@i915_selftest@live@mman:
- fi-icl-dsi: [PASS][1] -> [INCOMPLETE][2]
   [1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/fi-icl-dsi/igt@i915_selftest@l...@mman.html
   [2]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17042/fi-icl-dsi/igt@i915_selftest@l...@mman.html

  
Known issues


  Here are the changes found in Patchwork_17042 that come from known issues:

### IGT changes ###

 Issues hit 

  * igt@i915_selftest@live@gem_contexts:
- fi-cml-s:   [PASS][3] -> [DMESG-FAIL][4] ([i915#877])
   [3]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/fi-cml-s/igt@i915_selftest@live@gem_contexts.html
   [4]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17042/fi-cml-s/igt@i915_selftest@live@gem_contexts.html
- fi-cfl-guc: [PASS][5] -> [INCOMPLETE][6] ([fdo#106070] / 
[i915#424])
   [5]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/fi-cfl-guc/igt@i915_selftest@live@gem_contexts.html
   [6]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17042/fi-cfl-guc/igt@i915_selftest@live@gem_contexts.html

  
 Possible fixes 

  * igt@gem_exec_suspend@basic-s4-devices:
- fi-tgl-y:   [FAIL][7] ([CI#94]) -> [PASS][8]
   [7]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/fi-tgl-y/igt@gem_exec_susp...@basic-s4-devices.html
   [8]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17042/fi-tgl-y/igt@gem_exec_susp...@basic-s4-devices.html

  * igt@kms_chamelium@hdmi-crc-fast:
- fi-icl-u2:  [FAIL][9] ([fdo#109635] / [i915#217]) -> [PASS][10]
   [9]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_8167/fi-icl-u2/igt@kms_chamel...@hdmi-crc-fast.html
   [10]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17042/fi-icl-u2/igt@kms_chamel...@hdmi-crc-fast.html

  
  [CI#94]: https://gitlab.freedesktop.org/gfx-ci/i915-infra/issues/94
  [fdo#106070]: https://bugs.freedesktop.org/show_bug.cgi?id=106070
  [fdo#109635]: https://bugs.freedesktop.org/show_bug.cgi?id=109635
  [i915#217]: https://gitlab.freedesktop.org/drm/intel/issues/217
  [i915#424]: https://gitlab.freedesktop.org/drm/intel/issues/424
  [i915#877]: https://gitlab.freedesktop.org/drm/intel/issues/877


Participating hosts (49 -> 36)
--

  Missing(13): fi-ilk-m540 fi-hsw-4200u fi-byt-j1900 fi-skl-guc 
fi-byt-squawks fi-bsw-cyan fi-kbl-7500u fi-ctg-p8600 fi-gdg-551 fi-cfl-8109u 
fi-byt-clapper fi-bsw-nick fi-bdw-samus 


Build changes
-

  * CI: CI-20190529 -> None
  * Linux: CI_DRM_8167 -> Patchwork_17042

  CI-20190529: 20190529
  CI_DRM_8167: b51a7e7f4f72cf780661a1e4b479e2b27ddbafc8 @ 
git://anongit.freedesktop.org/gfx-ci/linux
  IGT_5526: f49ebeee9f54d6f23c60a842f75f65561d452ab0 @ 
git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_17042: ad93345a88ebe797bf424c9cd6a32375228d30a3 @ 
git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

ad93345a88eb drm/i915/execlists: Pull tasklet interrupt-bh local to direct 
submission
87403a98caa1 drm/i915: Rely on direct submission to the queue
a4eb01fe1100 drm/i915/gt: Delay release of engine-pm after last retirement
ba10302d1256 drm/i915: Extend intel_wakeref to support delayed puts

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_17042/index.html
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH 4/4] drm/i915/execlists: Pull tasklet interrupt-bh local to direct submission

2020-03-21 Thread Chris Wilson

We dropped calling process_csb prior to handling direct submission in
order to avoid the nesting of spinlocks and lift process_csb() and the
majority of the tasklet out of irq-off. However, we do want to avoid
ksoftirqd latency in the fast path, so try and pull the interrupt-bh
local to direct submission if we can acquire the tasklet's lock.

Signed-off-by: Chris Wilson 
Cc: Tvrtko Ursulin 
---
 drivers/gpu/drm/i915/gt/intel_lrc.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 210f60e14ef4..82dee2141b46 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2891,6 +2891,12 @@ static void __submit_queue_imm(struct intel_engine_cs 
*engine)
if (reset_in_progress(execlists))
return; /* defer until we restart the engine following reset */
 
+   /* Hopefully we clear execlists->pending[] to let us through */
+   if (execlists->pending[0] && tasklet_trylock(&execlists->tasklet)) {
+   process_csb(engine);
+   tasklet_unlock(&execlists->tasklet);
+   }
+
__execlists_submission_tasklet(engine);
 }
 
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH 1/4] drm/i915: Extend intel_wakeref to support delayed puts

2020-03-21 Thread Chris Wilson

In some cases we want to hold onto the wakeref for a little after the
last user so that we can avoid having to drop and then immediately
reacquire it. Allow the last user to specify if they would like to keep
the wakeref alive for a short hysteresis.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_engine_pm.h   |  6 ++
 drivers/gpu/drm/i915/gt/intel_gt_requests.c |  2 +-
 drivers/gpu/drm/i915/intel_wakeref.c| 11 ++-
 drivers/gpu/drm/i915/intel_wakeref.h| 10 --
 4 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.h 
b/drivers/gpu/drm/i915/gt/intel_engine_pm.h
index e52c2b0cb245..418df0a13145 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_pm.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.h
@@ -37,6 +37,12 @@ static inline void intel_engine_pm_put_async(struct 
intel_engine_cs *engine)
intel_wakeref_put_async(&engine->wakeref);
 }
 
+static inline void intel_engine_pm_put_delay(struct intel_engine_cs *engine,
+unsigned long delay)
+{
+   intel_wakeref_put_delay(&engine->wakeref, delay);
+}
+
 static inline void intel_engine_pm_flush(struct intel_engine_cs *engine)
 {
intel_wakeref_unlock_wait(&engine->wakeref);
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c 
b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index 24c99d0838af..835ec184763e 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -38,7 +38,7 @@ static bool flush_submission(struct intel_gt *gt)
for_each_engine(engine, gt, id) {
intel_engine_flush_submission(engine);
active |= flush_work(&engine->retire_work);
-   active |= flush_work(&engine->wakeref.work);
+   active |= flush_delayed_work(&engine->wakeref.work);
}
 
return active;
diff --git a/drivers/gpu/drm/i915/intel_wakeref.c 
b/drivers/gpu/drm/i915/intel_wakeref.c
index 8fbf6f4d3f26..2977bc0428e2 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.c
+++ b/drivers/gpu/drm/i915/intel_wakeref.c
@@ -70,11 +70,11 @@ static void intel_wakeref_put_last(struct intel_wakeref 
*wf)
 
 void __intel_wakeref_put_last(struct intel_wakeref *wf, unsigned long flags)
 {
-   INTEL_WAKEREF_BUG_ON(work_pending(&wf->work));
+   INTEL_WAKEREF_BUG_ON(delayed_work_pending(&wf->work));
 
/* Assume we are not in process context and so cannot sleep. */
if (flags & INTEL_WAKEREF_PUT_ASYNC || !mutex_trylock(&wf->mutex)) {
-   schedule_work(&wf->work);
+   mod_delayed_work(system_wq, &wf->work, flags >> 1);
return;
}
 
@@ -83,7 +83,7 @@ void __intel_wakeref_put_last(struct intel_wakeref *wf, 
unsigned long flags)
 
 static void __intel_wakeref_put_work(struct work_struct *wrk)
 {
-   struct intel_wakeref *wf = container_of(wrk, typeof(*wf), work);
+   struct intel_wakeref *wf = container_of(wrk, typeof(*wf), work.work);
 
if (atomic_add_unless(&wf->count, -1, 1))
return;
@@ -104,8 +104,9 @@ void __intel_wakeref_init(struct intel_wakeref *wf,
atomic_set(&wf->count, 0);
wf->wakeref = 0;
 
-   INIT_WORK(&wf->work, __intel_wakeref_put_work);
-   lockdep_init_map(&wf->work.lockdep_map, "wakeref.work", &key->work, 0);
+   INIT_DELAYED_WORK(&wf->work, __intel_wakeref_put_work);
+   lockdep_init_map(&wf->work.work.lockdep_map,
+"wakeref.work", &key->work, 0);
 }
 
 int intel_wakeref_wait_for_idle(struct intel_wakeref *wf)
diff --git a/drivers/gpu/drm/i915/intel_wakeref.h 
b/drivers/gpu/drm/i915/intel_wakeref.h
index 7d1e676b71ef..e87532e282d2 100644
--- a/drivers/gpu/drm/i915/intel_wakeref.h
+++ b/drivers/gpu/drm/i915/intel_wakeref.h
@@ -41,7 +41,7 @@ struct intel_wakeref {
struct intel_runtime_pm *rpm;
const struct intel_wakeref_ops *ops;
 
-   struct work_struct work;
+   struct delayed_work work;
 };
 
 struct intel_wakeref_lockclass {
@@ -154,6 +154,12 @@ intel_wakeref_put_async(struct intel_wakeref *wf)
__intel_wakeref_put(wf, INTEL_WAKEREF_PUT_ASYNC);
 }
 
+static inline void
+intel_wakeref_put_delay(struct intel_wakeref *wf, unsigned long delay)
+{
+   __intel_wakeref_put(wf, INTEL_WAKEREF_PUT_ASYNC | delay << 1);
+}
+
 /**
  * intel_wakeref_lock: Lock the wakeref (mutex)
  * @wf: the wakeref
@@ -194,7 +200,7 @@ intel_wakeref_unlock_wait(struct intel_wakeref *wf)
 {
mutex_lock(&wf->mutex);
mutex_unlock(&wf->mutex);
-   flush_work(&wf->work);
+   flush_delayed_work(&wf->work);
 }
 
 /**
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH 2/4] drm/i915/gt: Delay release of engine-pm after last retirement

2020-03-21 Thread Chris Wilson

Keep the engine-pm awake until the next jiffie, to avoid immediate
ping-pong under moderate load. (Forcing the idle barrier excerbates the
moderate load, dramatically increasing the driver overhead.) On the
other hand, delaying the idle-barrier slightly incurs longer rc6-off
and so more power consumption.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/848
Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gt/intel_context.c | 2 +-
 drivers/gpu/drm/i915/i915_active.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_context.c 
b/drivers/gpu/drm/i915/gt/intel_context.c
index 7132bf616cc4..be4bb62425ca 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -350,7 +350,7 @@ void intel_context_enter_engine(struct intel_context *ce)
 void intel_context_exit_engine(struct intel_context *ce)
 {
intel_timeline_exit(ce->timeline);
-   intel_engine_pm_put(ce->engine);
+   intel_engine_pm_put_delay(ce->engine, 1);
 }
 
 int intel_context_prepare_remote_request(struct intel_context *ce,
diff --git a/drivers/gpu/drm/i915/i915_active.c 
b/drivers/gpu/drm/i915/i915_active.c
index c4048628188a..a0d31f7bfb42 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -818,7 +818,7 @@ void i915_active_acquire_barrier(struct i915_active *ref)
 
GEM_BUG_ON(!intel_engine_pm_is_awake(engine));
llist_add(barrier_to_ll(node), &engine->barrier_tasks);
-   intel_engine_pm_put(engine);
+   intel_engine_pm_put_delay(engine, 1);
}
 }
 
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

[Intel-gfx] [PATCH 3/4] drm/i915: Rely on direct submission to the queue

2020-03-21 Thread Chris Wilson

Drop the pretense of kicking the tasklet (used only for the defunct guc
submission backend, it should just take ownership of the submit!) and so
remove the bh-kicking from around submission.

Signed-off-by: Chris Wilson 
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 2 --
 drivers/gpu/drm/i915/gt/intel_lrc.c| 5 +
 drivers/gpu/drm/i915/i915_request.c| 2 --
 3 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c 
b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 36d069504836..c2bd5accde0c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2348,9 +2348,7 @@ static void eb_request_add(struct i915_execbuffer *eb)
__i915_request_skip(rq);
}
 
-   local_bh_disable();
__i915_request_queue(rq, &attr);
-   local_bh_enable(); /* Kick the execlists tasklet if just scheduled */
 
/* Try to clean up the client's timeline after submitting the request */
if (prev)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index f09dd87324b9..210f60e14ef4 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2891,10 +2891,7 @@ static void __submit_queue_imm(struct intel_engine_cs 
*engine)
if (reset_in_progress(execlists))
return; /* defer until we restart the engine following reset */
 
-   if (execlists->tasklet.func == execlists_submission_tasklet)
-   __execlists_submission_tasklet(engine);
-   else
-   tasklet_hi_schedule(&execlists->tasklet);
+   __execlists_submission_tasklet(engine);
 }
 
 static void submit_queue(struct intel_engine_cs *engine,
diff --git a/drivers/gpu/drm/i915/i915_request.c 
b/drivers/gpu/drm/i915/i915_request.c
index c0df71d7d0ff..3388c5b610c5 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -1444,9 +1444,7 @@ void i915_request_add(struct i915_request *rq)
if (list_empty(&rq->sched.signalers_list))
attr.priority |= I915_PRIORITY_WAIT;
 
-   local_bh_disable();
__i915_request_queue(rq, &attr);
-   local_bh_enable(); /* Kick the execlists tasklet if just scheduled */
 
mutex_unlock(&tl->mutex);
 }
-- 
2.20.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] drm/i915/execlists: Pull tasklet interrupt-bh local to direct submission

2020-03-21 Thread Chris Wilson

Quoting Chris Wilson (2020-03-20 17:47:45)
> We dropped calling process_csb prior to handling direct submission in
> order to avoid the nesting of spinlocks and lift process_csb() and the
> majority of the tasklet out of irq-off. However, we do want to avoid
> ksoftirqd latency in the fast path, so try and pull the interrupt-bh
> local to direct submission if we can acquire the tasklet's lock.
> 
> v2: Tweak the balance to avoid over submitting lite-restores

Sigh. It's a driver problem with lock contention, mostly from aggressive
idling.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH v10 00/12] Convert PWM period and duty cycle to u64

2020-03-21 Thread Dan Carpenter

This is a giant CC list.

There was one version where you CC'd me on patch 6/12 but after that you
just CC'd me on the cover page.  Something is messed up in your scripts
because Cc'ing me on just the cover is pointless.

regards,
dan carpenter

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 3/3] drm/i915/perf: add new open param to configure polling of OA buffer

2020-03-21 Thread Dixit, Ashutosh

On Thu, 19 Mar 2020 15:52:03 -0700, Umesh Nerlige Ramappa wrote:
>
> From: Lionel Landwerlin 
>
> This new parameter let's the application choose how often the OA
> buffer should be checked on the CPU side for data availability. Longer
> polling period tend to reduce CPU overhead if the application does not
> care about somewhat real time data collection.

/snip/

> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index db649d03ab52..c781b9f31b3c 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -1980,6 +1980,19 @@ enum drm_i915_perf_property_id {
>*/
>   DRM_I915_PERF_PROP_GLOBAL_SSEU,
>
> + /**
> +  * This optional parameter specifies the timer interval in nanoseconds
> +  * at which the i915 driver will check the OA buffer for available data.
> +  * Minimum allowed value is 100 microseconds. A default value is used by
> +  * the driver if this parameter is not specified. Note that a large
> +  * value may reduce cpu consumption during OA perf captures, but it
> +  * would also potentially result in OA buffer overwrite as the captures
> +  * reach end of the OA buffer.

I would like to change the wording here a bit because the wording above
seems to say that the /same/ timer period which reduces cpu consumption
will also potentially result in OA buffer overwrites. This is misleading
and not true. Instead we should say something like: "Note that larger timer
values will reduce cpu consumption during OA perf captures. However,
excessively large values would potentially result in OA buffer overwrites
as captures reach end of the OA buffer".

Maybe some guidance to selecting the value, say based on sampling rate and
OA buffer size, could also be added here but at the minimum we should fix
up the wording as indicated above before pushing. Otherwise:

Reviewed-by: Ashutosh Dixit 

> +  *
> +  * This property is available in perf revision 5.
> +  */
> + DRM_I915_PERF_PROP_POLL_OA_PERIOD,
> +
>   DRM_I915_PERF_PROP_MAX /* non-ABI */
>  };
>
> --
> 2.20.1
>
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH 1/3] drm/i915/perf: rework aging tail workaround

Re: [Intel-gfx] [PATCH 1/3] drm/i915/perf: rework aging tail workaround

[Intel-gfx] ✓ Fi.CI.IGT: success for series starting with [1/4] drm/i915: Extend intel_wakeref to support delayed puts (rev2)

[Intel-gfx] ✓ Fi.CI.BAT: success for series starting with [1/4] drm/i915: Extend intel_wakeref to support delayed puts (rev2)

[Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/4] drm/i915: Extend intel_wakeref to support delayed puts

[Intel-gfx] [PATCH 4/4] drm/i915/execlists: Pull tasklet interrupt-bh local to direct submission

[Intel-gfx] [PATCH 1/4] drm/i915: Extend intel_wakeref to support delayed puts

[Intel-gfx] [PATCH 2/4] drm/i915/gt: Delay release of engine-pm after last retirement

[Intel-gfx] [PATCH 3/4] drm/i915: Rely on direct submission to the queue

Re: [Intel-gfx] [PATCH] drm/i915/execlists: Pull tasklet interrupt-bh local to direct submission

Re: [Intel-gfx] [PATCH v10 00/12] Convert PWM period and duty cycle to u64

Re: [Intel-gfx] [PATCH 3/3] drm/i915/perf: add new open param to configure polling of OA buffer

12 matches

Site Navigation

Mail list logo

Footer information