Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Per-crtc/connector DRRS debugfs (rev2)

2022-10-03 Thread Sarvela, Tomi P

> On Sat, Oct 01, 2022 at 11:23:17PM -, Patchwork wrote:
> >   * igt@gem_exec_balancer@parallel-balancer:
> > - shard-iclb: [PASS][58] -> [SKIP][59] ([i915#4525])
> >[58]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12204/shard-
> iclb2/igt@gem_exec_balan...@parallel-balancer.html
> >[59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_109175v2/shard-
> iclb5/igt@gem_exec_balan...@parallel-balancer.html
> 
> shard-iclb skips most DRRS tests, but does execute a few which is
> weird.
> 
> I spot checked a few of the logs and saw at least three different panels
> being used:
> 1. using preferred EDID fixed mode: "2560x1440": 60 241750 2560 2608 2640
> 2720 1440 1443 1448 1481 0x48 0xa
> 2. using preferred EDID fixed mode: "1920x1080": 60 141000 1920 1936 1952
> 2104 1080 1083 1097 1116 0x48 0xa
> 3. using preferred EDID fixed mode: "1920x1080": 60 138780 1920 1966 1996
> 2080 1080 1082 1086 1112 0x48 0xa
>using alternate EDID fixed mode: "1920x1080": 40 92520 1920 1966 1996 2080
> 1080 1082 1086 1112 0x40 0xa
> 
> So the DRRS tests were only executed when they ended up on a machine with
> panel 3.
> 
> Having different panels between the machines in the shard pool is not
> great. We can get all kinds of pingpongs depending on how tests get
> scheduled to individual machines.
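
The refresh rates in the mode lines quoted above can be recomputed from the pixel clock and the total timings. The sketch below assumes the lines follow the kernel's DRM mode dump layout (name, refresh, clock in kHz, four horizontal and four vertical timings, type, flags). The names `MODE_RE` and `refresh_hz` are made up for illustration and are not part of any CI tooling.

```python
import re

# Assumed layout of a DRM mode dump line:
# "name": vrefresh clock_khz hdisp hss hse htotal vdisp vss vse vtotal type flags
MODE_RE = re.compile(
    r'"(?P<name>[^"]+)":\s+(?P<fields>(?:\d+\s+){10})0x[0-9a-fA-F]+\s+0x[0-9a-fA-F]+'
)


def refresh_hz(mode_line: str) -> float:
    """Recompute the refresh rate in Hz from the pixel clock and total timings."""
    match = MODE_RE.search(mode_line)
    if match is None:
        raise ValueError(f"not a recognised mode line: {mode_line!r}")
    numbers = [int(field) for field in match.group("fields").split()]
    # Only the clock and the horizontal/vertical totals are needed here.
    _vrefresh, clock_khz, _hd, _hss, _hse, htotal, _vd, _vss, _vse, vtotal = numbers
    return clock_khz * 1000 / (htotal * vtotal)


# The two fixed modes reported for panel 3 in the quoted logs.
PANEL_3_PREFERRED = '"1920x1080": 60 138780 1920 1966 1996 2080 1080 1082 1086 1112 0x48 0xa'
PANEL_3_ALTERNATE = '"1920x1080": 40 92520 1920 1966 1996 2080 1080 1082 1086 1112 0x40 0xa'

if __name__ == "__main__":
    print(refresh_hz(PANEL_3_PREFERRED))
    print(refresh_hz(PANEL_3_ALTERNATE))
```

Of the three panels listed, only panel 3 reports an alternate fixed mode with the same timings and a lower clock, which is consistent with the DRRS tests running only on that machine.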

The ICL shard is, sadly, a heterogeneous bunch. In addition to the different
panels, the CPUs themselves differ, even though fusing is done to make them
look almost the same.

Considering that there are not many ICLs on the market and we still don't
have good choices for production CI hardware (as opposed to e.g. TGL,
where both pre-prod and prod hardware are readily available),
we should consider taking shard-iclb out and leaving a couple of the machines
for BAT runs.

Tomi


Re: [Intel-gfx] [CI] Maintenance: Intel-GFX-CI down on weekend 27.5.-29.5.

2022-05-27 Thread Sarvela, Tomi P
GFX-CI will be quieting down today for hardware migration.
New builds will be stopped soon to drain the shard queue.
Existing results on https://intel-gfx-ci.01.org/ will be available
during migration.

CI data migration is estimated to take two days. After syncing,
functional testing will proceed. If tests pass, partial functionality
(DRM-Tip) is restored on Sunday and full functionality on Monday.

Tomi


From: Sarvela, Tomi P
Sent: Friday, May 20, 2022 11:38 AM

This is an early warning that GFX-CI will go down for maintenance
next weekend, from Fri 27.5. to Sun 29.5.

GFX-CI will be migrating to new hardware starting Friday 27.5., and the
estimated downtime is at least 48h (CIBuglog database dump/import).
During the downtime, no new builds are done and no testing is performed.

If all goes well, CI will be running on the new server on Monday 30.5.
If not, we roll back to the old server by changing the IP, and the services
will be restarted on the old hardware.

The visible improvement will be a doubling of disk space, which
allows us to store more history.

Weekends are usually quiet on the mailing lists, so I hope this
won't be too much of an inconvenience.

Best Regards,

Tomi Sarvela
--
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo



Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/edid: expand on struct drm_edid usage

2022-05-24 Thread Sarvela, Tomi P
> From: Nikula, Jani 
> 
> On Tue, 24 May 2022, Patchwork 
> wrote:
> > == Series Details ==
> >
> > Series: drm/edid: expand on struct drm_edid usage
> > URL   : https://patchwork.freedesktop.org/series/104309/
> > State : failure
> >
> > == Summary ==
> >
> >
> >  Possible regressions 
> >
> >   * igt@i915_pm_rpm@module-reload:
> > - fi-skl-guc: [PASS][1] -> [FAIL][2]
> >[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11693/fi-skl-
> guc/igt@i915_pm_...@module-reload.html
> >[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_104309v1/fi-skl-
> guc/igt@i915_pm_...@module-reload.html
> 
> Tomi, this link is giving me access denied.

Both links work. Patchwork posting is done before results are synced to
upstream service. This is noticeable with 404 if the transfer is slow.
The issue is known and will be fixed when priorities allow.
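
On the reader's side, the simplest way to cope with this is to retry the link after a while. A minimal sketch of such a retry loop follows, assuming the only symptom is a non-200 HTTP status that clears once the sync has finished. `wait_for_result`, `fetch_status` and the default timings are hypothetical and not part of any CI tooling.

```python
import time
from typing import Callable


def wait_for_result(
    fetch_status: Callable[[], int],
    attempts: int = 5,
    delay_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True once fetch_status() reports HTTP 200, False if all attempts fail.

    fetch_status is any callable returning an HTTP status code for the result
    page; it is injected so the loop can be exercised without network access.
    """
    for attempt in range(attempts):
        if fetch_status() == 200:
            return True
        # Do not sleep after the final attempt.
        if attempt < attempts - 1:
            sleep(delay_s)
    return False
```

Passing the status check in as a callable keeps the loop independent of how the page is actually fetched.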

Tomi

> >
> >
> > Known issues
> > 
> >
> >   Here are the changes found in Patchwork_104309v1 that come from
> known issues:
> >
> > ### IGT changes ###
> >
> >  Issues hit 
> >
> >   * igt@fbdev@write:
> > - bat-dg1-5:  NOTRUN -> [SKIP][3] ([i915#2582]) +4 similar 
> > issues
> >[3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_104309v1/bat-
> dg1-5/igt@fb...@write.html
> >
> >   * igt@gem_huc_copy@huc-copy:
> > - fi-icl-u2:  NOTRUN -> [SKIP][4] ([i915#2190])
> >[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_104309v1/fi-icl-
> u2/igt@gem_huc_c...@huc-copy.html
> >
> >   * igt@gem_lmem_swapping@parallel-random-engines:
> > - fi-icl-u2:  NOTRUN -> [SKIP][5] ([i915#4613]) +3 similar 
> > issues
> >[5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_104309v1/fi-icl-
> u2/igt@gem_lmem_swapp...@parallel-random-engines.html
> >
> >   * igt@gem_mmap@basic:
> > - bat-dg1-5:  NOTRUN -> [SKIP][6] ([i915#4083])
> >[6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_104309v1/bat-
> dg1-5/igt@gem_m...@basic.html
> >
> >   * igt@gem_tiled_fence_blits@basic:
> > - bat-dg1-5:  NOTRUN -> [SKIP][7] ([i915#4077]) +2 similar 
> > issues
> >[7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_104309v1/bat-
> dg1-5/igt@gem_tiled_fence_bl...@basic.html
> >
> >   * igt@gem_tiled_pread_basic:
> > - bat-dg1-5:  NOTRUN -> [SKIP][8] ([i915#4079]) +1 similar issue
> >[8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_104309v1/bat-
> dg1-5/igt@gem_tiled_pread_basic.html
> >
> >   * igt@i915_pm_backlight@basic-brightness:
> > - bat-dg1-5:  NOTRUN -> [SKIP][9] ([i915#1155])
> >[9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_104309v1/bat-
> dg1-5/igt@i915_pm_backli...@basic-brightness.html
> >
> >   * igt@i915_pm_rpm@module-reload:
> > - fi-cfl-8109u:   [PASS][10] -> [DMESG-WARN][11] ([i915#62]) +16 
> > similar
> issues
> >[10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11693/fi-cfl-
> 8109u/igt@i915_pm_...@module-reload.html
> >[11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_104309v1/fi-cfl-
> 8109u/igt@i915_pm_...@module-reload.html
> >
> >   * igt@i915_selftest@live@hangcheck:
> > - bat-dg1-5:  NOTRUN -> [DMESG-FAIL][12] ([i915#4494] /
> [i915#4957])
> >[12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_104309v1/bat-
> dg1-5/igt@i915_selftest@l...@hangcheck.html
> > - bat-dg1-6:  [PASS][13] -> [DMESG-FAIL][14] ([i915#4494] /
> [i915#4957])
> >[13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11693/bat-dg1-
> 6/igt@i915_selftest@l...@hangcheck.html
> >[14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_104309v1/bat-
> dg1-6/igt@i915_selftest@l...@hangcheck.html
> > - fi-snb-2600:[PASS][15] -> [INCOMPLETE][16] ([i915#3921])
> >[15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11693/fi-snb-
> 2600/igt@i915_selftest@l...@hangcheck.html
> >[16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_104309v1/fi-
> snb-2600/igt@i915_selftest@l...@hangcheck.html
> >
> >   * igt@i915_selftest@live@late_gt_pm:
> > - fi-bsw-nick:[PASS][17] -> [DMESG-FAIL][18] ([i915#3428])
> >[17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11693/fi-bsw-
> nick/igt@i915_selftest@live@late_gt_pm.html
> >[18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_104309v1/fi-
> bsw-nick/igt@i915_selftest@live@late_gt_pm.html
> >
> >   * igt@i915_suspend@basic-s2idle-without-i915:
> > - bat-dg1-5:  NOTRUN -> [INCOMPLETE][19] ([i915#6011])
> >[19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_104309v1/bat-
> dg1-5/igt@i915_susp...@basic-s2idle-without-i915.html
> >
> >   * igt@i915_suspend@basic-s3-without-i915:
> > - fi-icl-u2:  NOTRUN -> [SKIP][20] ([i915#5903])
> >[20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_104309v1/fi-icl-
> u2/igt@i915_susp...@basic-s3-without-i915.html
> >
> >   * igt@kms_addfb_basic@basic-y-tiled-legacy:
> > - bat-dg1-5:  

Re: [Intel-gfx] ✓ Fi.CI.BAT: success for drm/i915/dmc: Load DMC on DG2 (rev4)

2022-04-16 Thread Sarvela, Tomi P
This week I've been working to get the public CI to work more reliably
on the bat-* machines, which are hosted at a different site.

BAT for IGT, IGTPW, CI_DRM and Patchwork seems to have stabilized
pretty well, and Trybot/TrybotIGT will probably be added soon.
This isn't a promise that even one DG2 is always present in testing: the
pre-production platforms are as fickle as ever. If DG2 results are needed
and none of the DG2 machines rebooted successfully with the
dGPU present, the best I can do is 're-test the series'.

For FULL, we can't enable all the builds on all shards because of a lack of
machine hours. DG2 has hundreds more tests than the next worst shard,
the tests take more time, and we have fewer DG2s than the next worst shard.
In the end, we will probably end up with a workaround: a DG2-specific blacklist
for the most commonly hanging and longest-running tests.
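
As an illustration of the idea only, a per-platform blacklist can be thought of as a set of regular expressions that is applied to the test list before scheduling. The sketch below assumes that model. `filter_tests` and the example patterns are hypothetical, and the tests named in them are picked from this archive purely as examples, not because they are known to hang on DG2.

```python
import re
from typing import Iterable, List


def filter_tests(tests: Iterable[str], blocklist_patterns: Iterable[str]) -> List[str]:
    """Return the tests that match none of the blocklist regular expressions."""
    compiled = [re.compile(pattern) for pattern in blocklist_patterns]
    return [t for t in tests if not any(c.search(t) for c in compiled)]


# Hypothetical DG2-only blocklist; the patterns are placeholders.
DG2_BLOCKLIST = [
    r"^igt@i915_selftest@live@hangcheck$",
    r"^igt@kms_flip@.*suspend",
]

ALL_TESTS = [
    "igt@gem_exec_balancer@parallel-balancer",
    "igt@i915_selftest@live@hangcheck",
    "igt@kms_flip@flip-vs-suspend@b-dp1",
]

if __name__ == "__main__":
    for name in filter_tests(ALL_TESTS, DG2_BLOCKLIST):
        print(name)
```

Keeping the list per platform means the other shards continue to run the full set.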

Regards,

Tomi


> From: De Marchi, Lucas 
> 
> On Thu, Apr 14, 2022 at 08:54:49PM +, Anusha Srivatsa wrote:
> >HI,
> >
> >The result here says SUCCESS but closer look  shows that it never ran on any
> DG2. Wanted to know if something went wrong at the system side and if it
> needs to be revived. The patch is simply loading DMC and shouldn’t cause
> the system to not boot up at all.
> >
> >Any info in this regard will be very useful.
> 
> I think the issue is less about the feedback saying "SUCCESS" and more
> about "why was it not tested on DG2 and we have no clue what happened?".
> 
> Not gating a "SUCCESS" message is needed for platforms that are unstable
> due to not being completed yet: otherwise almost all patch series would
> return "FAIL" and it would be even more useless.
> 
> So, as DG2 is one of those platforms, I think it's ok to have this
> behavior. But it's not very good to simply have no results and no
> feedback on what really happened.
> 
> Lucas De Marchi
> 
> >
> >Thanks,
> >Anusha
> >
> >From: Patchwork 
> >Sent: Thursday, April 14, 2022 11:04 AM
> >To: Srivatsa, Anusha 
> >Cc: intel-gfx@lists.freedesktop.org
> >Subject: ✓ Fi.CI.BAT: success for drm/i915/dmc: Load DMC on DG2 (rev4)
> >
> >Patch Details
> >Series:
> >
> >drm/i915/dmc: Load DMC on DG2 (rev4)
> >
> >URL:
> >
> >https://patchwork.freedesktop.org/series/102630/
> >
> >State:
> >
> >success
> >
> >Details:
> >
> >https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_102630v4/index.html
> >
> >CI Bug Log - changes from CI_DRM_11500 -> Patchwork_102630v4
> >Summary
> >
> >SUCCESS
> >
> >No regressions found.
> >
> >External URL: https://intel-gfx-ci.01.org/tree/drm-
> tip/Patchwork_102630v4/index.html
> >
> >Participating hosts (48 -> 45)
> >
> >Additional (2): bat-adlm-1 fi-pnv-d510
> >Missing (5): bat-dg2-8 fi-bsw-cyan fi-icl-u2 bat-dg2-9 fi-bdw-samus
> >
> >Known issues
> >
> >Here are the changes found in Patchwork_102630v4 that come from known
> issues:
> >
> >IGT changes
> >Issues hit
> >
> >  *   igt@i915_selftest@live@execlists:
> >
> > *   fi-bsw-nick: PASS tip/CI_DRM_11500/fi-bsw-nick/igt@i915_selftest@l...@execlists.html> ->
> INCOMPLETE tip/Patchwork_102630v4/fi-bsw-
> nick/igt@i915_selftest@l...@execlists.html>
> (i915#2940)
> >
> >  *   igt@i915_selftest@live@hangcheck:
> >
> > *   fi-snb-2600: PASS tip/CI_DRM_11500/fi-snb-2600/igt@i915_selftest@l...@hangcheck.html> ->
> INCOMPLETE tip/Patchwork_102630v4/fi-snb-
> 2600/igt@i915_selftest@l...@hangcheck.html>
> (i915#3921)
> >
> >  *   igt@i915_selftest@live@requests:
> >
> > *   fi-blb-e6850: PASS tip/CI_DRM_11500/fi-blb-e6850/igt@i915_selftest@l...@requests.html> ->
> DMESG-FAIL tip/Patchwork_102630v4/fi-blb-
> e6850/igt@i915_selftest@l...@requests.html>
> (i915#4528)
> >
> >  *   igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-c:
> >
> > *   fi-pnv-d510: NOTRUN -> SKIP tip/Patchwork_102630v4/fi-pnv-d510/igt@kms_pipe_crc_basic@compare-
> crc-sanitycheck-pipe-c.html>
> (fdo#109271 /
> i915#5341)
> >
> >  *   igt@prime_vgem@basic-userptr:
> >
> > *   fi-pnv-d510: NOTRUN -> SKIP tip/Patchwork_102630v4/fi-pnv-d510/igt@prime_vgem@basic-
> userptr.html>
> (fdo#109271) +39
> similar issues
> >
> >  *   igt@runner@aborted:
> >
> > *   fi-bsw-nick: NOTRUN -> FAIL tip/Patchwork_102630v4/fi-bsw-nick/igt@run...@aborted.html>
> (fdo#109271 /
> 

Re: [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [RESEND,1/3] drm/i915/dmc: abstract GPU error state dump

2022-03-31 Thread Sarvela, Tomi P
After the latest CI_DRM has been built, a re-test should be enough.

Tomi

> From: De Marchi, Lucas 
> On Thu, Mar 31, 2022 at 08:28:09AM +, Tomi Sarvela wrote:
> >The latest CI_DRM built is 11416; after that, there is build error:
> >drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c: In function
> 'amdgpu_gtt_mgr_recover':
> >drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c:200:31: error: 'struct
> ttm_range_mgr_node' has no member named 'tbo'
> >   amdgpu_ttm_recover_gart(node->tbo);
> >   ^~
> >  CC [M]  drivers/net/ethernet/intel/igb/e1000_mbx.o
> >scripts/Makefile.build:288: recipe for target
> 'drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.o' failed
> 
> just fixed that
> 
> >
> >The patch is applied against latest working build commit. Can you try your
> patch against
> > CI_DRM_11416 1dc2c6953e2689a0e5b7cca8450da14059d35f03
> >and see if you get the same error?
> 
> so maybe just a re-trigger should work?
> 
> Lucas De Marchi
> 
> >
> >Tomi
> >
> >> From: Nikula, Jani 
> >>
> >> On Wed, 30 Mar 2022, Patchwork 
> >> wrote:
> >> > == Series Details ==
> >> >
> >> > Series: series starting with [RESEND,1/3] drm/i915/dmc: abstract GPU
> error
> >> state dump
> >> > URL   : https://patchwork.freedesktop.org/series/101957/
> >> > State : failure
> >>
> >> I don't get why this doesn't apply.
> >>
> >> It applies for me.
> >>
> >>
> >> BR,
> >> Jani.
> >>
> >>
> >> >
> >> > == Summary ==
> >> >
> >> > Applying: drm/i915/dmc: abstract GPU error state dump
> >> > Using index info to reconstruct a base tree...
> >> > M   drivers/gpu/drm/i915/display/intel_dmc.c
> >> > M   drivers/gpu/drm/i915/display/intel_dmc.h
> >> > M   drivers/gpu/drm/i915/i915_gpu_error.c
> >> > Falling back to patching base and 3-way merge...
> >> > Auto-merging drivers/gpu/drm/i915/i915_gpu_error.c
> >> > Auto-merging drivers/gpu/drm/i915/display/intel_dmc.h
> >> > CONFLICT (content): Merge conflict in
> >> drivers/gpu/drm/i915/display/intel_dmc.h
> >> > Auto-merging drivers/gpu/drm/i915/display/intel_dmc.c
> >> > CONFLICT (content): Merge conflict in
> >> drivers/gpu/drm/i915/display/intel_dmc.c
> >> > error: Failed to merge in the changes.
> >> > hint: Use 'git am --show-current-patch=diff' to see the failed patch
> >> > Patch failed at 0001 drm/i915/dmc: abstract GPU error state dump
> >> > When you have resolved this problem, run "git am --continue".
> >> > If you prefer to skip this patch, run "git am --skip" instead.
> >> > To restore the original branch and stop patching, run "git am --abort".
> >> >
> >> >
> >>
> >> --
> >> Jani Nikula, Intel Open Source Graphics Center


Re: [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [RESEND,1/3] drm/i915/dmc: abstract GPU error state dump

2022-03-31 Thread Sarvela, Tomi P
The latest CI_DRM built is 11416; after that, there is a build error:
drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c: In function 
'amdgpu_gtt_mgr_recover':
drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c:200:31: error: 'struct 
ttm_range_mgr_node' has no member named 'tbo'
   amdgpu_ttm_recover_gart(node->tbo);
   ^~
  CC [M]  drivers/net/ethernet/intel/igb/e1000_mbx.o
scripts/Makefile.build:288: recipe for target 
'drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.o' failed

The patch is applied against the latest working build commit. Can you try your
patch against
 CI_DRM_11416 1dc2c6953e2689a0e5b7cca8450da14059d35f03
and see if you get the same error?

Tomi

> From: Nikula, Jani 
> 
> On Wed, 30 Mar 2022, Patchwork 
> wrote:
> > == Series Details ==
> >
> > Series: series starting with [RESEND,1/3] drm/i915/dmc: abstract GPU error
> state dump
> > URL   : https://patchwork.freedesktop.org/series/101957/
> > State : failure
> 
> I don't get why this doesn't apply.
> 
> It applies for me.
> 
> 
> BR,
> Jani.
> 
> 
> >
> > == Summary ==
> >
> > Applying: drm/i915/dmc: abstract GPU error state dump
> > Using index info to reconstruct a base tree...
> > M   drivers/gpu/drm/i915/display/intel_dmc.c
> > M   drivers/gpu/drm/i915/display/intel_dmc.h
> > M   drivers/gpu/drm/i915/i915_gpu_error.c
> > Falling back to patching base and 3-way merge...
> > Auto-merging drivers/gpu/drm/i915/i915_gpu_error.c
> > Auto-merging drivers/gpu/drm/i915/display/intel_dmc.h
> > CONFLICT (content): Merge conflict in
> drivers/gpu/drm/i915/display/intel_dmc.h
> > Auto-merging drivers/gpu/drm/i915/display/intel_dmc.c
> > CONFLICT (content): Merge conflict in
> drivers/gpu/drm/i915/display/intel_dmc.c
> > error: Failed to merge in the changes.
> > hint: Use 'git am --show-current-patch=diff' to see the failed patch
> > Patch failed at 0001 drm/i915/dmc: abstract GPU error state dump
> > When you have resolved this problem, run "git am --continue".
> > If you prefer to skip this patch, run "git am --skip" instead.
> > To restore the original branch and stop patching, run "git am --abort".
> >
> >
> 
> --
> Jani Nikula, Intel Open Source Graphics Center


Re: [Intel-gfx] Intel-GFX-CI is halted due to power issue

2022-03-08 Thread Sarvela, Tomi P
The incident is over: power has been restored, but CI is still quiet.

I'll check the affected hosts and start re-testing the missed
Patchwork series.

Best Regards,

Tomi Sarvela

From: Sarvela, Tomi P
Sent: Tuesday, March 8, 2022 11:26 AM
To: intel-gfx@lists.freedesktop.org

Intel-GFX-CI has an ongoing issue, assumed to be with AC power delivery.

At around 21:00 EET yesterday, power was lost to several parts
of our CI lab, most notably to the PDUs that control the testhosts. As the PDUs
cannot be reached or controlled, the testhosts are by and large without power.

There have been builds but no results since the incident.

I'll send an update with an ETA when I know more about the situation.

Best Regards,

Tomi Sarvela

--
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo



Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/display/vrr: Reset VRR capable property on a long hpd (rev4)

2022-02-25 Thread Sarvela, Tomi P
> From: Saarinen, Jani 
> > From: Navare, Manasi D 
> > Subject: RE: ✗ Fi.CI.BAT: failure for drm/i915/display/vrr: Reset VRR
> capable
> > property on a long hpd (rev4)
> >
> > Hi,
> >
> >
> >
> > I fixed the regression in this patch and resent it, it still has BAT 
> > failures, I
> wanted
> > to understand if it failed to boot some of the machines again or the errors
> flagged
> > here are the known errors.
> >
> >
> >
> > Regards
> >
> > Manasi
> >
> >
> >
> > From: Patchwork 
> > Sent: Thursday, February 24, 2022 10:45 AM
> > To: Navare, Manasi D 
> > Cc: intel-gfx@lists.freedesktop.org
> > Subject: ✗ Fi.CI.BAT: failure for drm/i915/display/vrr: Reset VRR capable
> property
> > on a long hpd (rev4)
> >
> >
> >
> > Patch Details
> >
> > Series:
> >
> > drm/i915/display/vrr: Reset VRR capable property on a long hpd (rev4)
> >
> > URL:
> >
> > https://patchwork.freedesktop.org/series/98801/
> >
> > State:
> >
> > failure
> >
> > Details:
> >
> > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22394/index.html
> >
> >
> > CI Bug Log - changes from CI_DRM_11279 -> Patchwork_22394
> >
> >
> > Summary
> >
> >
> > FAILURE
> >
> > Serious unknown changes coming with Patchwork_22394 absolutely need
> to be
> > verified manually.
> >
> > If you think the reported changes have nothing to do with the changes
> > introduced in Patchwork_22394, please notify your bug team to allow them
> to
> > document this new failure mode, which will reduce false positives in CI.
> >
> > External URL: https://intel-gfx-ci.01.org/tree/drm-
> > tip/Patchwork_22394/index.html
> >
> >
> > Participating hosts (43 -> 32)
> >
> >
> > Missing (11): fi-kbl-soraka fi-cml-u2 fi-bsw-cyan fi-ilk-650 fi-apl-guc 
> > fi-kbl-
> 7500u fi-
> > kbl-x1275 fi-cfl-8109u fi-bsw-kefka fi-bdw-samus fi-skl-6600u
> It would be good to understand why this many systems are still down. Also,
> are these the same as in the previous series?
> Previous was missing:
> --
> Missing (29): fi-kbl-soraka fi-bdw-gvtdvm fi-apl-guc fi-snb-2520m fi-skl-6600u
> fi-snb-2600 fi-cml-u2 fi-bxt-dsi fi-bdw-5557u shard-tglu fi-bsw-n3050 
> fi-glk-dsi
> fi-ilk-650 fi-kbl-7500u fi-hsw-4770 fi-ivb-3770 fi-elk-e7500 fi-bsw-nick 
> fi-skl-
> 6700k2 fi-kbl-7567u fi-skl-guc fi-cfl-8700k fi-bsw-cyan fi-cfl-guc fi-kbl-guc 
> fi-
> kbl-x1275 fi-cfl-8109u fi-kbl-8809g fi-bsw-kefka
> --
> So there are same systems. Tomi, what is threshold how many systems need
> to have boot issues and having
> Just looking some same systems on both...:
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22394/filelist.html
> eg. https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22394/fi-kbl-
> soraka/run0.txt
> and https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22394/fi-kbl-
> 7500u/run0.txt , there is also oops https://intel-gfx-ci.01.org/tree/drm-
> tip/Patchwork_22394/fi-kbl-7500u/pstore0-1645726566_Oops_1.txt
> I would say not clean really yet

The threshold is 50% of hosts. A couple of those now missing are problem
children, but there are also some that should not be down.

Please re-test the series, and if the result looks the same, then there
is probably a real issue with the patch.
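
To put the numbers from this run against that threshold, here is a small sketch. It assumes the fraction is taken against the host count before the run and that reaching exactly 50% counts as crossing the threshold; both are assumptions, and `too_many_missing` is a made-up name.

```python
def too_many_missing(hosts_before: int, hosts_missing: int, threshold: float = 0.5) -> bool:
    """Return True if the share of missing hosts reaches the threshold."""
    if hosts_before <= 0:
        raise ValueError("hosts_before must be positive")
    return hosts_missing / hosts_before >= threshold


if __name__ == "__main__":
    # "Participating hosts (43 -> 32)" with "Missing (11)" from the report above.
    print(too_many_missing(43, 11))
```

With 11 of 43 hosts missing, roughly a quarter of the pool, this run stays under the 50% mark.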

Tomi


> 
> 
> >
> >
> > Possible new issues
> >
> >
> > Here are the unknown changes that may have been introduced in
> > Patchwork_22394:
> >
> >
> > IGT changes
> >
> >
> > Possible regressions
> >
> >
> > *   igt@gem_exec_suspend@basic-s0@smem:
> >
> > *   fi-skl-6700k2: PASS  ci.01.org/tree/drm-
> > tip/CI_DRM_11279/fi-skl-6700k2/igt@gem_exec_suspend@basic-
> > s...@smem.html>  -> DMESG-WARN  > tip/Patchwork_22394/fi-skl-6700k2/igt@gem_exec_suspend@basic-
> > s...@smem.html>
> >
> >
> > Known issues
> >
> >
> > Here are the changes found in Patchwork_22394 that come from known
> issues:
> >
> >
> > IGT changes
> >
> >
> > Issues hit
> >
> >
> > *   igt@amdgpu/amd_basic@cs-multi-fence:
> >
> > *   fi-blb-e6850: NOTRUN -> SKIP  > ci.01.org/tree/drm-tip/Patchwork_22394/fi-blb-
> > e6850/igt@amdgpu/amd_ba...@cs-multi-fence.html>  (fdo#109271
> >  ) +17 similar
> issues
> >
> > *   igt@runner@aborted:
> >
> > *   fi-skl-6700k2: NOTRUN -> FAIL  gfx-
> > ci.01.org/tree/drm-tip/Patchwork_22394/fi-skl-
> > 6700k2/igt@run...@aborted.html>  (i915#4312
> >  )
> >
> >
> > Possible fixes
> >
> >
> > *   igt@i915_selftest@live@hangcheck:
> >
> > *   bat-dg1-6: DMESG-FAIL  ci.01.org/tree/drm-
> > tip/CI_DRM_11279/bat-dg1-6/igt@i915_selftest@l...@hangcheck.html>
> > (i915#4494   /
> i915#4957
> >  ) -> PASS
>  > gfx-ci.01.org/tree/drm-tip/Patchwork_22394/bat-dg1-
> > 6/igt@i915_selftest@l...@hangcheck.html>
> >
> > *   

Re: [Intel-gfx] [PATCH v2 1/4] drm/i915/fbc: Parametrize FBC register offsets

2021-12-15 Thread Sarvela, Tomi P
> From: Sarvela, Tomi P
> > From: Ville Syrjälä 
> > On Wed, Dec 15, 2021 at 09:05:03AM +, Sarvela, Tomi P wrote:
> > > > From: Ville Syrjälä 
> > > >
> > > > On Tue, Dec 14, 2021 at 06:25:43PM +0200, Ville Syrjälä wrote:
> > > > > On Mon, Dec 13, 2021 at 09:54:04PM +0200, Jani Nikula wrote:
> > > > > > On Mon, 13 Dec 2021, Ville Syrjala 
> > wrote:
> > > > > >
> > > > > > This one is only used in gvt, anyway. And that actually makes me
> > wonder
> > > > > > if this should be breaking the build. Does CI not have gvt enabled?
> > > > >
> > > > > Hmm. I thought it was enabled in CI, but maybe not. I've often broken
> > > > > gvt with register define changes but I've always caught it before
> > > > > pushing. I think I have gvt enabled in my "make sure all commits build
> > > > > before I push" test config, so maybe that's where I caught most of
> > them.
> > > > >
> > > > > Tomi, can we enable gvt in ci builds to make sure it at least still
> > > > > builds?
> > > >
> > > > Actually cc Tomi..
> > >
> > > GVT-d is enabled and tested by fi-bdw-gvtdvm.
> >
> > We're talking about the other gvt (whatever it was called), ie.
> > CONFIG_DRM_I915_GVT.
> 
> This kconfig entry doesn't exist in default CI kconfig, even as 'is not set'
> placeholder:
> https://gitlab.freedesktop.org/gfx-ci/i915-infra/-
> /blob/master/kconfig/debug
> 
> If the config entry is exact, I'll probably need to upgrade the default config
> from 5.13 and add it with requirements. Not today, but maybe soon.

The kconfigs debug, debug-kasan and debug-gcov have been updated to v5.15
with 'make olddefconfig', and CONFIG_DRM_I915_GVT=y has been set.

The first CI_DRM to use this kconfig will be CI_DRM_11005.
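
For anyone who wants to check a kconfig file for this, a symbol can be in one of three states: set to a value, present as an 'is not set' placeholder, or absent altogether. The sketch below distinguishes the three. `kconfig_status` is a hypothetical helper and the sample text is invented for the example, not copied from the CI kconfig.

```python
import re


def kconfig_status(config_text: str, symbol: str) -> str:
    """Return the symbol's value (e.g. 'y' or 'm'), 'not set', or 'absent'."""
    set_re = re.compile(rf"^{re.escape(symbol)}=(.*)$", re.MULTILINE)
    unset_re = re.compile(rf"^# {re.escape(symbol)} is not set$", re.MULTILINE)
    match = set_re.search(config_text)
    if match:
        return match.group(1)
    if unset_re.search(config_text):
        return "not set"
    return "absent"


# Invented sample fragments for illustration only.
UPDATED = "CONFIG_DRM_I915=m\nCONFIG_DRM_I915_GVT=y\n"
PLACEHOLDER = "CONFIG_DRM_I915=m\n# CONFIG_DRM_I915_GVT is not set\n"
MISSING = "CONFIG_DRM_I915=m\n"

if __name__ == "__main__":
    for text in (UPDATED, PLACEHOLDER, MISSING):
        print(kconfig_status(text, "CONFIG_DRM_I915_GVT"))
```

The 'absent' case corresponds to the situation described earlier in the thread, where the entry did not exist in the default CI kconfig even as a placeholder.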

Tomi


Re: [Intel-gfx] [PATCH v2 1/4] drm/i915/fbc: Parametrize FBC register offsets

2021-12-15 Thread Sarvela, Tomi P
> From: Ville Syrjälä 
> On Wed, Dec 15, 2021 at 09:05:03AM +, Sarvela, Tomi P wrote:
> > > From: Ville Syrjälä 
> > >
> > > On Tue, Dec 14, 2021 at 06:25:43PM +0200, Ville Syrjälä wrote:
> > > > On Mon, Dec 13, 2021 at 09:54:04PM +0200, Jani Nikula wrote:
> > > > > On Mon, 13 Dec 2021, Ville Syrjala 
> wrote:
> > > > >
> > > > > This one is only used in gvt, anyway. And that actually makes me
> wonder
> > > > > if this should be breaking the build. Does CI not have gvt enabled?
> > > >
> > > > Hmm. I thought it was enabled in CI, but maybe not. I've often broken
> > > > gvt with register define changes but I've always caught it before
> > > > pushing. I think I have gvt enabled in my "make sure all commits build
> > > > before I push" test config, so maybe that's where I caught most of
> them.
> > > >
> > > > Tomi, can we enable gvt in ci builds to make sure it at least still
> > > > builds?
> > >
> > > Actually cc Tomi..
> >
> > GVT-d is enabled and tested by fi-bdw-gvtdvm.
> 
> We're talking about the other gvt (whatever it was called), ie.
> CONFIG_DRM_I915_GVT.

This kconfig entry doesn't exist in the default CI kconfig, even as an
'is not set' placeholder:
https://gitlab.freedesktop.org/gfx-ci/i915-infra/-/blob/master/kconfig/debug

If the config entry is exact, I'll probably need to upgrade the default config
from 5.13 and add it with its requirements. Not today, but maybe soon.

Tomi


Re: [Intel-gfx] [PATCH v2 1/4] drm/i915/fbc: Parametrize FBC register offsets

2021-12-15 Thread Sarvela, Tomi P
> From: Ville Syrjälä 
> 
> On Tue, Dec 14, 2021 at 06:25:43PM +0200, Ville Syrjälä wrote:
> > On Mon, Dec 13, 2021 at 09:54:04PM +0200, Jani Nikula wrote:
> > > On Mon, 13 Dec 2021, Ville Syrjala  wrote:
> > >
> > > This one is only used in gvt, anyway. And that actually makes me wonder
> > > if this should be breaking the build. Does CI not have gvt enabled?
> >
> > Hmm. I thought it was enabled in CI, but maybe not. I've often broken
> > gvt with register define changes but I've always caught it before
> > pushing. I think I have gvt enabled in my "make sure all commits build
> > before I push" test config, so maybe that's where I caught most of them.
> >
> > Tomi, can we enable gvt in ci builds to make sure it at least still
> > builds?
> 
> Actually cc Tomi..

GVT-d is enabled and tested by fi-bdw-gvtdvm.

Tomi


Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915: Fix DPT suspend/resume on !HAS_DISPLAY platforms

2021-11-26 Thread Sarvela, Tomi P
The trace is full of ext4, so I'm leaning toward this being an -rc2 issue.

You can try re-testing the series to see if the same failure happens again.

Tomi

> From: Deak, Imre 
> 
> Hi,
> 
> On Thu, Nov 25, 2021 at 10:38:35PM +, Patchwork wrote:
> > == Series Details ==
> >
> > Series: drm/i915: Fix DPT suspend/resume on !HAS_DISPLAY platforms
> > URL   : https://patchwork.freedesktop.org/series/97291/
> > State : failure
> >
> > == Summary ==
> >
> > CI Bug Log - changes from CI_DRM_10928_full -> Patchwork_21682_full
> > 
> >
> > Summary
> > ---
> >
> >   **FAILURE**
> >
> >   Serious unknown changes coming with Patchwork_21682_full absolutely
> need to be
> >   verified manually.
> >
> >   If you think the reported changes have nothing to do with the changes
> >   introduced in Patchwork_21682_full, please notify your bug team to allow
> them
> >   to document this new failure mode, which will reduce false positives in 
> > CI.
> >
> >
> >
> > Participating hosts (11 -> 11)
> > --
> >
> >   No changes in participating hosts
> >
> > Possible new issues
> > ---
> >
> >   Here are the unknown changes that may have been introduced in
> Patchwork_21682_full:
> >
> > ### IGT changes ###
> >
> >  Possible regressions 
> >
> >   * igt@kms_flip@flip-vs-suspend@b-dp1:
> > - shard-kbl:  [PASS][1] -> [INCOMPLETE][2]
> >[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10928/shard-
> kbl4/igt@kms_flip@flip-vs-susp...@b-dp1.html
> >[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21682/shard-
> kbl6/igt@kms_flip@flip-vs-susp...@b-dp1.html
> 
> This is (happy that we have pstore logs!):
> 
> <3>[  121.347224] INFO: task kworker/u8:17:1044 blocked for more than 30
> seconds.
> <3>[  121.347231]   Tainted: GW 
> 5.16.0-rc2-CI-Patchwork_21682+ #1
> <3>[  121.347236] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> <6>[  121.347241] task:kworker/u8:17   state:D stack:13456 pid: 1044 ppid:
>  2
> flags:0x4000
> <6>[  121.347250] Workqueue: writeback wb_workfn (flush-259:0)
> <6>[  121.346993]  schedule+0x3f/0xc0
> <6>[  121.346998]  __bio_queue_enter+0x3a4/0x450
> <6>[  121.347006]  ? finish_wait+0x80/0x80
> <6>[  121.347018]  blk_mq_submit_bio+0x171/0xa30
> <6>[  121.347025]  ? mpage_release_unused_pages+0x27b/0x290
> <6>[  121.347036]  ? do_writepages+0xd3/0x1a0
> <6>[  121.347043]  submit_bio_noacct+0x254/0x2a0
> <6>[  121.347055]  ext4_io_submit+0x44/0x50
> <6>[  121.347060]  ext4_writepages+0x32c/0x1070
> <6>[  121.347070]  ? __lock_acquire+0x5c0/0xb70
> <6>[  121.347099]  do_writepages+0xd3/0x1a0
> <6>[  121.347103]  ? filemap_fdatawrite_wbc+0x4b/0x80
> <6>[  121.347117]  filemap_fdatawrite_wbc+0x56/0x80
> <6>[  121.347124]  file_write_and_wait_range+0x97/0xd0
> <6>[  121.347144]  ext4_sync_file+0x166/0x460
> 
> Any idea if this could be an -rc2 related problem, fs corruption or
> related to the storage device on shard-kbl4 (if you've seen already
> similar reports)?
> 
> In any case the issue is not related, since on KBL the change doesn't
> have any effect.
> 
> >
> > Known issues
> > 
> >
> >   Here are the changes found in Patchwork_21682_full that come from known issues:
> >
> > ### CI changes ###
> >
> >  Possible fixes 
> >
> >   * boot:
> > - shard-apl:  ([PASS][3], [PASS][4], [PASS][5], [PASS][6], [PASS][7],
> >   [PASS][8], [PASS][9], [PASS][10], [PASS][11], [PASS][12], [PASS][13],
> >   [PASS][14], [PASS][15], [PASS][16], [PASS][17], [PASS][18], [PASS][19],
> >   [PASS][20], [PASS][21], [FAIL][22], [PASS][23], [PASS][24], [PASS][25],
> >   [PASS][26], [PASS][27]) ([i915#4386]) -> ([PASS][28], [PASS][29], [PASS][30],
> >   [PASS][31], [PASS][32], [PASS][33], [PASS][34], [PASS][35], [PASS][36],
> >   [PASS][37], [PASS][38], [PASS][39], [PASS][40], [PASS][41], [PASS][42],
> >   [PASS][43], [PASS][44], [PASS][45], [PASS][46], [PASS][47], [PASS][48],
> >   [PASS][49], [PASS][50], [PASS][51], [PASS][52])
> >[3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10928/shard-apl8/boot.html
> >[4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10928/shard-apl8/boot.html
> >[5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10928/shard-apl8/boot.html
> >[6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10928/shard-apl8/boot.html
> >[7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10928/shard-apl8/boot.html
> >[8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10928/shard-apl7/boot.html
> >[9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10928/shard-apl7/boot.html
> >[10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10928/shard-apl7/boot.html
> >[11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10928/shard-apl6/boot.html
> >[12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10928/shard-apl6/boot.html
> >[13]: 

Re: [Intel-gfx] [PATCH 2/3] drm/i915/dg2: Add initial gt/ctx/engine workarounds

2021-11-12 Thread Sarvela, Tomi P
This issue was not caught by CI because of a series of unfortunate events.

Previously, CI rebooted without a module blocklist, caught the boot-time
dmesg correctly, and marked it as a 'ci@boot' test failure if there was a
taint.

I've been making changes so that i915 can be blocklisted at boot and loaded
as the first IGT test: that would make it possible to remove some workarounds
and integrate the result better into our framework.

The check deciding whether i915 should be modprobed was slightly off: on
the runs where i915 failed to load at boot, it was modprobed again, and
modprobe hung because of the existing i915. Results were not collected.

I've added the missing condition to the conditional modprobe, and results
from a failed boot-time modprobe should soon be available as before,
e.g. in the later CI_DRM_10873 shards with SNB.
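
For illustration only, the kind of guard described above could look roughly
like the sketch below. This is not the actual CI code: the function names are
made up, and it assumes the decision is based on /proc/modules-style text
(module name in the first column), treating a module in any state, even a
half-loaded one, as "already there".

```python
def is_module_present(proc_modules_text, name):
    """Return True if `name` is listed in /proc/modules-style text.

    Each line starts with the module name; the state column (Live,
    Loading, Unloading) is deliberately ignored, so a module stuck
    mid-load still counts as present.
    """
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


def should_modprobe(proc_modules_text, name="i915"):
    """Hypothetical guard: only modprobe when no instance exists at all."""
    return not is_module_present(proc_modules_text, name)
```

On a real host the text would come from reading /proc/modules; the point of
the guard is simply that a second modprobe is never attempted on top of an
existing i915.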

Regards,

Tomi

> From: Latvala, Petri 
> On Tue, Nov 02, 2021 at 03:25:10PM -0700, Matt Roper wrote:
> > Bspec: 54077,68173,54833
> > Cc: Anusha Srivatsa 
> > Signed-off-by: Matt Roper 
> > ---
> >  drivers/gpu/drm/i915/gt/intel_workarounds.c | 278 +++-
> >  drivers/gpu/drm/i915/i915_reg.h |  94 +--
> >  drivers/gpu/drm/i915/intel_pm.c |  21 +-
> >  3 files changed, 372 insertions(+), 21 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > index 4aaa210fc003..37fd541a9719 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > @@ -644,6 +644,42 @@ static void dg1_ctx_workarounds_init(struct intel_engine_cs *engine,
> >                 DG1_HZ_READ_SUPPRESSION_OPTIMIZATION_DISABLE);
> >  }
> >
> > +static void dg2_ctx_workarounds_init(struct intel_engine_cs *engine,
> > +                                     struct i915_wa_list *wal)
> > +{
> > +   gen12_ctx_gt_tuning_init(engine, wal);
> > +
> > +   /* Wa_16011186671:dg2_g11 */
> > +   if (IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0)) {
> > +           wa_masked_dis(wal, VFLSKPD, DIS_MULT_MISS_RD_SQUASH);
> > +           wa_masked_en(wal, VFLSKPD, DIS_OVER_FETCH_CACHE);
> > +   }
> > +
> > +   if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_B0)) {
> > +           /* Wa_14010469329:dg2_g10 */
> > +           wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
> > +                        XEHP_DUAL_SIMD8_SEQ_MERGE_DISABLE);
> > +
> > +           /*
> > +            * Wa_22010465075:dg2_g10
> > +            * Wa_22010613112:dg2_g10
> > +            * Wa_14010698770:dg2_g10
> > +            */
> > +           wa_masked_en(wal, GEN11_COMMON_SLICE_CHICKEN3,
> > +                        GEN12_DISABLE_CPS_AWARE_COLOR_PIPE);
> > +   }
> > +
> > +   /* Wa_16013271637:dg2 */
> > +   wa_masked_en(wal, SLICE_COMMON_ECO_CHICKEN1,
> > +                MSC_MSAA_REODER_BUF_BYPASS_DISABLE);
> > +
> > +   /* Wa_22012532006:dg2 */
> > +   if (IS_DG2_GRAPHICS_STEP(engine->i915, G10, STEP_A0, STEP_C0) ||
> > +       IS_DG2_GRAPHICS_STEP(engine->i915, G11, STEP_A0, STEP_B0))
> > +           wa_masked_en(wal, GEN9_HALF_SLICE_CHICKEN7,
> > +                        DG2_DISABLE_ROUND_ENABLE_ALLOW_FOR_SSLA);
> > +}
> > +
> >  static void fakewa_disable_nestedbb_mode(struct intel_engine_cs *engine,
> >                                           struct i915_wa_list *wal)
> >  {
> > @@ -730,7 +766,9 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
> >     if (engine->class != RENDER_CLASS)
> >             goto done;
> >
> > -   if (IS_XEHPSDV(i915))
> > +   if (IS_DG2(i915))
> > +           dg2_ctx_workarounds_init(engine, wal);
> > +   else if (IS_XEHPSDV(i915))
> >             ; /* noop; none at this time */
> >     else if (IS_DG1(i915))
> >             dg1_ctx_workarounds_init(engine, wal);
> > @@ -1343,12 +1381,117 @@ xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> >                 GLOBAL_INVALIDATION_MODE);
> >  }
> >
> > +static void
> > +dg2_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
> > +{
> > +   struct intel_engine_cs *engine;
> > +   int id;
> > +
> > +   xehp_init_mcr(gt, wal);
> > +
> > +   /* Wa_14011060649:dg2 */
> > +   wa_14011060649(gt, wal);
> > +
> > +   /*
> > +    * Although there are per-engine instances of these registers,
> > +    * they technically exist outside the engine itself and are not
> > +    * impacted by engine resets.  Furthermore, they're part of the
> > +    * GuC blacklist so trying to treat them as engine workarounds
> > +    * will result in GuC initialization failure and a wedged GPU.
> > +    */
> > +   for_each_engine(engine, gt, id) {
> > +           if (engine->class != VIDEO_DECODE_CLASS)
> > +                   continue;
> > +
> > +           /* Wa_16010515920:dg2_g10 */
> > +           if (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_B0))
> > +                   wa_write_or(wal, VDBOX_CGCTL3F18(engine->mmio_base),
> > +                               ALNUNIT_CLKGATE_DIS);
> > +   }
> > +
> > +   if (IS_DG2_G10(gt->i915)) {
> > +           /* Wa_22010523718:dg2 */
> > +   

Re: [Intel-gfx] [PATCH 4/4] drm/i915: Fix oops on platforms w/o hpd support

2021-10-14 Thread Sarvela, Tomi P
> From: Ville Syrjälä 
> On Thu, Oct 14, 2021 at 09:31:40AM +, Sarvela, Tomi P wrote:
> > > From: Ville Syrjälä 
> > > On Thu, Oct 14, 2021 at 12:18:23PM +0300, Jani Nikula wrote:
> > > > On Thu, 14 Oct 2021, Ville Syrjala wrote:
> > > > > From: Ville Syrjälä 
> > > > >
> > > > > We don't have hpd support on i8xx/i915 which means hotplug_funcs==NULL.
> > > > > Let's not oops when loading the driver on one of those machines.
> > > >
> > > > D'oh!
> > > >
> > > > Lemme guess, CI just casually dropped the machines from the results
> > > > because they didn't boot?
> > >
> > > Dunno where the gdg has gone actually. Tomi?
> >
> > Both GDGs have died of old age (PSU / power delivery).
> 
> We don't have spare PSUs to throw at them? Or are the boards also
> semi-dead due to rotted caps etc.?

It could be MB caps, PSU caps, or anything else in the PSU. Nothing comes on
when power is turned on: no fans, no LEDs, nothing. Same issue on both
hosts. No surprises there, they're identical models. It could be the CPU,
but IIRC I already tried changing that.

The PSU part is vendor-specific. A standard PSU could maybe be retrofitted,
but that'd need some dedicated time.

Tomi


Re: [Intel-gfx] [PATCH 4/4] drm/i915: Fix oops on platforms w/o hpd support

2021-10-14 Thread Sarvela, Tomi P
> From: Ville Syrjälä 
> On Thu, Oct 14, 2021 at 12:18:23PM +0300, Jani Nikula wrote:
> > On Thu, 14 Oct 2021, Ville Syrjala  wrote:
> > > From: Ville Syrjälä 
> > >
> > > We don't have hpd support on i8xx/i915 which means hotplug_funcs==NULL.
> > > Let's not oops when loading the driver on one of those machines.
> >
> > D'oh!
> >
> > Lemme guess, CI just casually dropped the machines from the results
> > because they didn't boot?
> 
> Dunno where the gdg has gone actually. Tomi?

Both GDGs have died of old age (PSU / power delivery).

Tomi


Re: [Intel-gfx] ✗ Fi.CI.SPARSE: warning for series starting with [1/2] drm/dp: add drm_dp_phy_name() for getting DP PHY name

2021-10-05 Thread Sarvela, Tomi P
There was an issue with an expired fd.o root cert, which caused problems
during the weekend and yesterday, mostly with git fetches. I wonder if this
is related. Can you re-test the patchset and see if the issue persists?

Other patchsets from around the same time seem to be unaffected by spurious
sparse warnings.

Tomi

> From: Nikula, Jani 
> 
> 
> I wonder what's going on here?!
> 
> BR,
> Jani.
> 
> 
> On Tue, 05 Oct 2021, Patchwork wrote:
> > == Series Details ==
> >
> > Series: series starting with [1/2] drm/dp: add drm_dp_phy_name() for getting DP PHY name
> > URL   : https://patchwork.freedesktop.org/series/95447/
> > State : warning
> >
> > == Summary ==
> >
> > $ dim sparse --fast origin/drm-tip
> > Sparse version: v0.6.2
> > Fast mode used, each commit won't be checked separately.

Re: [Intel-gfx] [PATCH 01/24] drm/i915/uncore: split the fw get function into separate vfunc

2021-10-04 Thread Sarvela, Tomi P
> From: Ville Syrjälä 
> On Wed, Sep 29, 2021 at 01:57:45AM +0300, Jani Nikula wrote:
> > From: Dave Airlie 
> >
> > constify it while here. drop the put function since it was never
> > overloaded and always has done the same thing, no point in
> > indirecting it for show.
> >
> > Reviewed-by: Jani Nikula 
> > Signed-off-by: Dave Airlie 
> > Signed-off-by: Jani Nikula 
> 
> This has totally broken snb and ivb machines. Total death
> ensues somewhere in uncore init after some backtraces fly by.
> Didn't get any logs out to disk unfortunately. Please revert.
> 
> Sadly CI is still afraid to report when machines disappear.
> For the bat report you at least get a list of machines that
> were awol, but the shard run seems to not even mention that
> all snbs suddenly vanished.
> 
> I've said it before and I'll say it again. We really should
> *not* be loading i915 when the machine boots. That way we'd
> at least get the machine up and running and can report that
> loading i915 is the thing that killed it...

Added Petri Latvala

The best way to handle i915 loading in BAT would be to blacklist
i915 at boot and have igt@i915_module_load@load as the first
thing in fast-feedback.testlist. This would tie any i915 load issue
to a test, and we wouldn't need to do tricks with the ci@boot
pseudotest.

Most of the CI parts are already in place. The IGT commit to
change fast-feedback needs to be coordinated.
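
As a rough illustration of the ordering requirement above (a sketch only, not
part of IGT or the CI scripts), a check that the module-load test really is
the first entry of a testlist could look like this. The helper names are
hypothetical, and the assumption is that the testlist is plain text with one
test per line, where blank lines and '#' comments can be ignored.

```python
MODULE_LOAD_TEST = "igt@i915_module_load@load"


def first_test(testlist_text):
    """Return the first real entry of a testlist, or None if it is empty.

    Assumes one test name per line; blank lines and anything after '#'
    are skipped.
    """
    for line in testlist_text.splitlines():
        entry = line.split("#", 1)[0].strip()
        if entry:
            return entry
    return None


def loads_module_first(testlist_text):
    """True when the module-load test runs before any other test."""
    return first_test(testlist_text) == MODULE_LOAD_TEST
```

With i915 blacklisted at boot, a testlist that fails this check would have
every earlier test running without the driver loaded.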

Tomi


Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for drm/i915/adlp: Add support for remapping CCS FBs (rev3)

2021-09-07 Thread Sarvela, Tomi P
This looks very much like a SATA drive issue.

I'll replace the drive for shard-iclb3 and we'll see if this happens again.

Tomi

> -Original Message-
> From: Deak, Imre 
> Subject: Re: ✗ Fi.CI.IGT: failure for drm/i915/adlp: Add support for remapping CCS FBs (rev3)
> 
> Hi Lakshmi, Tomi,
> 
> could you check the failure below, looks like a storage device issue.
> 
> On Tue, Sep 07, 2021 at 05:25:30AM +, Patchwork wrote:
> > == Series Details ==
> >
> > Series: drm/i915/adlp: Add support for remapping CCS FBs (rev3)
> > URL   : https://patchwork.freedesktop.org/series/94108/
> > State : failure
> >
> > == Summary ==
> >
> > CI Bug Log - changes from CI_DRM_10553_full -> Patchwork_20971_full
> > 
> >
> > Summary
> > ---
> >
> >   **FAILURE**
> >
> >   Serious unknown changes coming with Patchwork_20971_full absolutely need to be
> >   verified manually.
> >
> >   If you think the reported changes have nothing to do with the changes
> >   introduced in Patchwork_20971_full, please notify your bug team to allow them
> >   to document this new failure mode, which will reduce false positives in CI.
> >
> >
> >
> > Possible new issues
> > ---
> >
> >   Here are the unknown changes that may have been introduced in Patchwork_20971_full:
> >
> > ### IGT changes ###
> >
> >  Possible regressions 
> >
> >   * igt@i915_suspend@sysfs-reader:
> > - shard-iclb: [PASS][1] -> [INCOMPLETE][2]
> >[1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10553/shard-iclb2/igt@i915_susp...@sysfs-reader.html
> >[2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20971/shard-iclb3/igt@i915_susp...@sysfs-reader.html
> 
> In
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20971/shard-iclb3/pstore6-1630987738_Panic_1.txt
> 
> <4>[  186.690964] ata3.00: qc timeout (cmd 0x27)
> 
> followed by ata3/sda2 device errors.
> 
> This happened on the same machine already at:
> https://lore.kernel.org/intel-gfx/20201003134854.GA1278041@ideak-desk.fi.intel.com/
> 
> and recently on the same machine also in:
> 
> Trybot_7967/shard-iclb3
> Patchwork_20943/shard-iclb3
> IGTPW_6178/shard-iclb3
> Patchwork_20963/shard-iclb3
> Patchwork_20962/shard-iclb3
> 
> as well as on:
> shard-iclb4
> shard-iclb7
> shard-kbl6
> shard-tglb1
> shard-tglb7
> 
> I can't see this related to the changes, as no CCS tests were run.
> 
> > Known issues
> > 
> >
> >   Here are the changes found in Patchwork_20971_full that come from known issues:
> >
> > ### IGT changes ###
> >
> >  Issues hit 
> >
> >   * igt@feature_discovery@chamelium:
> > - shard-tglb: NOTRUN -> [SKIP][3] ([fdo#111827])
> >[3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20971/shard-tglb8/igt@feature_discov...@chamelium.html
> >
> >   * igt@feature_discovery@display-2x:
> > - shard-tglb: NOTRUN -> [SKIP][4] ([i915#1839])
> >[4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20971/shard-tglb3/igt@feature_discov...@display-2x.html
> >
> >   * igt@gem_ctx_persistence@smoketest:
> > - shard-snb:  NOTRUN -> [SKIP][5] ([fdo#109271] / [i915#1099]) +6 similar issues
> >[5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20971/shard-snb5/igt@gem_ctx_persiste...@smoketest.html
> >
> >   * igt@gem_eio@in-flight-contexts-1us:
> > - shard-iclb: [PASS][6] -> [TIMEOUT][7] ([i915#3070])
> >[6]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10553/shard-iclb3/igt@gem_...@in-flight-contexts-1us.html
> >[7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20971/shard-iclb3/igt@gem_...@in-flight-contexts-1us.html
> >
> >   * igt@gem_exec_fair@basic-deadline:
> > - shard-kbl:  [PASS][8] -> [FAIL][9] ([i915#2846])
> >[8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10553/shard-kbl1/igt@gem_exec_f...@basic-deadline.html
> >[9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20971/shard-kbl1/igt@gem_exec_f...@basic-deadline.html
> > - shard-glk:  [PASS][10] -> [FAIL][11] ([i915#2846])
> >[10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10553/shard-glk5/igt@gem_exec_f...@basic-deadline.html
> >[11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20971/shard-glk2/igt@gem_exec_f...@basic-deadline.html
> >
> >   * igt@gem_exec_fair@basic-none-rrul@rcs0:
> > - shard-glk:  NOTRUN -> [FAIL][12] ([i915#2842]) +1 similar issue
> >[12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20971/shard-glk6/igt@gem_exec_fair@basic-none-r...@rcs0.html
> >
> >   * igt@gem_exec_fair@basic-none@vcs0:
> > - shard-kbl:  [PASS][13] -> [FAIL][14] ([i915#2842]) +1 similar issue
> >[13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10553/shard-kbl3/igt@gem_exec_fair@basic-n...@vcs0.html

Re: [Intel-gfx] Public i915 CI shardruns are disabled

2021-03-09 Thread Sarvela, Tomi P
It seems that the chops to be built have been redefined several times in the
pipelines. Fixed.

https://github.com/intel-innersource/drivers.gpu.i915.ci.pipelines/commit/89d2f8174a15585c082b2f714551225ba6cafe08

Tomi

--
Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] Public i915 CI shardruns are disabled

2021-03-02 Thread Sarvela, Tomi P
The regression has been identified; Chris Wilson found commits touching
swapfile.c, and with them reverted the issue could no longer be reproduced.

https://patchwork.freedesktop.org/series/87549/

This revert will be applied to the core-for-CI branch. Once a new CI_DRM has
been built, shard testing will be enabled again.

Regards,

Tomi Sarvela


Re: [Intel-gfx] Public i915 CI shardruns are disabled

2021-03-02 Thread Sarvela, Tomi P
More information (excuse my top-posting):

- The issue happens in the mlocking phase of
igt@gem_tiled_swapping@non-threaded, before "starting subtest" appears.

- The filesystem that gets trashed is the one containing the swapfile

- If swap is a partition, it seems that the swap signature is correct even
after running the test, so for now I'm assuming that the issue has to do
with the swapfile

- Bisection between 20210129 and 20210215 proved to be challenging,
because the kernels hang before init, leave no dmesg, and I have no
console on the testing host. Petri's suggestion to bisect between
CI_DRM_9817 and 9818 might work better
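
For reference, the bisection idea itself is just a binary search over an
ordered list of candidates. The sketch below is generic and makes
assumptions not stated in this thread: the candidates are ordered oldest to
newest, the last one is known bad, and once a candidate is bad every later
one is bad too. The is_bad callback is a placeholder for "boot this kernel,
run the test, check the filesystem".

```python
def first_bad(candidates, is_bad):
    """Return the earliest candidate for which is_bad() is true.

    Assumes candidates[-1] is bad and that badness never goes away once
    it has appeared, as with a regression in linear history.
    """
    lo, hi = 0, len(candidates) - 1  # invariant: candidates[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(candidates[mid]):
            hi = mid          # first bad one is at mid or earlier
        else:
            lo = mid + 1      # first bad one is after mid
    return candidates[lo]
```

Each round halves the range, so a short window such as the commits between
two consecutive CI_DRM builds needs far fewer test boots than a two-week
linux-next range.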

Regards,

Tomi Sarvela


[Intel-gfx] Public i915 CI shardruns are disabled

2021-03-02 Thread Sarvela, Tomi P
Hello,

The Linux i915 CI shardruns have been disabled. This is due to the unfortunate
filesystem-corrupting bug first seen in linux-next 20210215, which has now
been merged to Linus' 5.12-rc1 and further on to DRM-Tip, first instance seen
in CI_DRM_9818. Last changes coming in were:

fb3b93df7979 drm-tip: 2021y-03m-01d-09h-36m-57s UTC integration manifest
3b3c4086295b drm-tip: 2021y-03m-01d-08h-49m-06s UTC integration manifest
fe07bfda2fb9 Linux 5.12-rc1

More information can be seen at:
https://phoronix.com/scan.php?page=news_item=Linux-5.12-Early-Buggy-Issue

I've seen this bug happen regularly with (but not limited to) IGT test:
igt@gem_tiled_swapping@non-threaded

The range for bisection is linux-next 20210129 to 20210215, because the kernels
in between taint the kernel and our i915 testing was not done on them. Hitting
the bug corrupts the underlying filesystem very thoroughly, wiping out a large
amount of data from the beginning of the partition, which leaves fsck sad with
thousands of items lost. Bisection of the IGT testlist was done with two root
filesystems: the testable kernel booted from the 2nd partition, and a copy of
the 2nd partition was stored on the 1st partition and could be restored at will.

I'll continue bisecting this bug on the linux-next tree. If someone has more
information on where this issue originates, help would be appreciated.

Regards,

Tomi Sarvela



Re: [Intel-gfx] [PATCH] drm/i915/mst: fix pipe and vblank enable

2020-02-14 Thread Sarvela, Tomi P
> From: Jani Nikula 
> 
> On Mon, 10 Feb 2020, Arkadiusz Hiler  wrote:
> > As of the 3 days worth of queued shards:
> >
> > I agree that this is unacceptable, but we can do only so much from the
> > CI/infra side. The time has been creeping up steadily over the last year
> > or so and the machines are not getting any faster.
> 
> I am *not* trying to say that it's all your fault and you need to
> provide all results faster for the ever-increasing firehose of incoming
> patches.
> 
> I'd like to pose the question, what would all this look like if we made
> it a hard requirement that we need a go/no-go decision on every patch
> series within 24 hours? I emphasize that I don't mean full results in 24
> hours. Given all the other constraints, how could we provide as much
> useful information as possible within 24 hours to make a decision?
> 
> In another thread I said, we've shifted a bit from review being the
> bottle neck to shard runs being the bottle neck. It's still much more
> likely that a patch will change due to review feedback instead of shard
> run results. Half a dozen rounds of review ping pong directly leads to
> half a dozen rounds of mostly unnecessary testing. I would not outright
> dismiss only running full igt on reviewed/acked patches.

This is actually a good idea. In practice, the shards are swamped by the
number of builds today, and throughput has been close to 1/h for a long
time, even with ongoing work to prune or tighten the stupidest IGT tests.

We could make the shard run requirements stricter: in addition to passing
BAT, a series would need some number of Acks. Patchwork already collects them.
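
A minimal sketch of such a gate, purely as an illustration: the tag names
counted, the min_acks threshold, and the function names are all assumptions,
and a real version would take the counts from Patchwork rather than from
raw mail text.

```python
import re

# Count unquoted Acked-by:/Reviewed-by: lines; quoted ones ("> Acked-by:")
# do not match because the tag must start the line.
TAG_RE = re.compile(r"^(?:Acked-by|Reviewed-by):\s*\S", re.MULTILINE)


def count_acks(thread_text):
    """Number of ack/review tags found at the start of a line."""
    return len(TAG_RE.findall(thread_text))


def eligible_for_shards(bat_passed, thread_text, min_acks=1):
    """Hypothetical gate: a full shard run needs a BAT pass plus enough acks."""
    return bat_passed and count_acks(thread_text) >= min_acks
```

A series that keeps changing in review would then only reach the shards once
someone has actually signed off on it.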

Another idea has been moving the serialized shard run queue to something
that can handle reordering: trybots can be moved after everything else. This
doesn't affect the shard queue length, though, if we still want to test
everything.
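
As a sketch of that reordering (an illustration only, not how the CI queue is
implemented), a stable sort on "is this a trybot build" is enough. The
assumption here is that trybot builds can be recognised by a "Trybot_" name
prefix, as in the build names that show up in CI reports.

```python
def reorder_queue(queue):
    """Move trybot builds behind everything else, keeping relative order.

    sorted() is stable, and False sorts before True, so non-trybot builds
    keep their order at the front and trybots keep theirs at the back.
    """
    return sorted(queue, key=lambda build: build.startswith("Trybot_"))
```

The result always has the same number of entries as the input, which is the
point made above: reordering changes who waits, not how much work is queued.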

> Additionally, there are smaller optimizations to be made (obviously all
> depending on developer bandwidth to implement this stuff), such as
> identifying patches that don't change the resulting binary
> (comment/documentation/whitespace changes), and only running build
> testing on them.

This idea has been floating around, and would help with 5% of changes or so
(which is still noticeable: 1-2 more builds / day tested instead of queued).

Just need a good diff checker that says "text changes only, skip it".
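
A very rough sketch of such a checker, assuming the input is plain `git diff` output. The documentation path rules and the comment-line heuristic are assumptions and deliberately conservative; they do not catch whitespace-only changes in code, and comparing the built objects would be the safer test.

```python
DOC_SUFFIXES = (".rst", ".txt", ".md")


def _is_doc_path(path):
    return path.startswith("Documentation/") or path.endswith(DOC_SUFFIXES)


def _is_comment_or_blank(line):
    """Heuristic: blank lines and whole-line C comments only."""
    s = line.strip()
    if not s or s.startswith("//"):
        return True
    if s == "*" or s.startswith("* ") or s == "*/":
        return True
    if s.startswith("/*"):
        # "/* ..." (comment continues) or "/* ... */" with nothing after it
        return "*/" not in s or (s.endswith("*/") and s.count("*/") == 1)
    return False


def text_changes_only(diff_text):
    """Return True if every added/removed line is documentation or comment."""
    path = None
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff "):
            in_hunk, path = False, None
            continue
        if line.startswith("@@"):
            in_hunk = True
            continue
        if not in_hunk:
            # File headers live between "diff" and the first hunk.
            if line.startswith("+++ "):
                path = line[4:].strip()
                if path.startswith("b/"):
                    path = path[2:]
            continue
        if line.startswith(("+", "-")):
            if path is not None and _is_doc_path(path):
                continue
            if not _is_comment_or_blank(line[1:]):
                return False
    return True
```

Anything the heuristic is unsure about counts as a real change, so the worst case is an unnecessary shard run rather than a skipped one.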

Tomi
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 0/5] drm/i915/skl: drop pre-production stepping workarounds

2016-09-22 Thread Sarvela, Tomi P
> From: Zanoni, Paulo R
> Em Sex, 2016-09-16 às 16:59 +0300, Jani Nikula escreveu:
> > Only production steppings are supported, drop workarounds for
> > anything
> > else. The series is split by revision so we can bikeshed if there are
> > steppings some people still need to use for some reason.
> 
> Bikeshed: in patches 2 and 3 you could have added platform tags to the
> workaround tags, while also adding the missing space to a
> /* comment*/.
> 
> Jani S., Yann: perhaps we could try to check if our CI/QA systems still
> have these machines? Just "lspci -nn | grep VGA" on the SKL systems and
> check whether rev <= 5.

The CI system doesn't have any pre-production SKL machines. They were dropped 
as soon as we got production machines.
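
For anyone who wants to repeat the check Paulo suggests on their own machines, here is a small sketch that parses `lspci -nn` output. It assumes the usual "(rev NN)" suffix with a hexadecimal revision; the function names are made up, and the rev <= 5 threshold is taken from Paulo's mail.

```python
import re

# lspci prints the revision as two hex digits, e.g. "(rev 06)".
REV_RE = re.compile(r"\(rev ([0-9a-fA-F]{2})\)")


def vga_revisions(lspci_output):
    """Return the revision of every VGA device found in `lspci -nn` output."""
    revs = []
    for line in lspci_output.splitlines():
        if "VGA" not in line:
            continue
        match = REV_RE.search(line)
        if match:
            revs.append(int(match.group(1), 16))
    return revs


def has_preproduction_gpu(lspci_output, max_preprod_rev=5):
    """True if any VGA device has a revision at or below the threshold."""
    return any(rev <= max_preprod_rev for rev in vga_revisions(lspci_output))
```

The output of `lspci -nn` can be fed in as a string, for example from subprocess.run(["lspci", "-nn"], capture_output=True, text=True).stdout.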

Tomi

> If we conclude our CI system doesn't include these machines:
> Reviewed-by: Paulo Zanoni 

> > Jani Nikula (5):
> >   drm/i915/skl: drop workarounds for A0 and B0 revisions
> >   drm/i915/skl: drop workarounds for C0 revision
> >   drm/i915/skl: drop workarounds for D0 revision
> >   drm/i915/skl: drop workarounds for E0 revision
> >   drm/i915/skl: drop workarounds for F0 revision
> >
> >  drivers/gpu/drm/i915/intel_dp.c   |  4 --
> >  drivers/gpu/drm/i915/intel_dp_link_training.c |  3 --
> >  drivers/gpu/drm/i915/intel_guc_loader.c   |  8 ++--
> >  drivers/gpu/drm/i915/intel_lrc.c  | 23 +--
> >  drivers/gpu/drm/i915/intel_pm.c   |  3 +-
> >  drivers/gpu/drm/i915/intel_ringbuffer.c   | 58 +--
> > 
> >  6 files changed, 23 insertions(+), 76 deletions(-)


Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/skl: Add missing SKL GT3 id (rev2)

2016-02-01 Thread Sarvela, Tomi P
> From: Mika Kuoppala [mailto:mika.kuopp...@linux.intel.com]
> 
> "Sarvela, Tomi P" <tomi.p.sarv...@intel.com> writes:
> >
> > In the result box there is no SKL-I5K column at all. You're looking at HSW-GT2?
> >
> 
> Yes I was looking at wrong column. My bad.

I can see that this might be confusing. The reason it looks like this
is that there is a workaround for IGT results.json on the CI_IGT side which
has not yet been applied to the CI_Patchwork side. This makes the
visualization and comparison logic slightly different.

Tomi


Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/skl: Add missing SKL GT3 id (rev2)

2016-02-01 Thread Sarvela, Tomi P
> From: Mika Kuoppala [mailto:mika.kuopp...@linux.intel.com]
>
> Patchwork  writes:
> 
> > Series 2919v2 drm/i915/skl: Add missing SKL GT3 id
> > http://patchwork.freedesktop.org/api/1.0/series/2919/revisions/2/mbox/
> >
> > Test kms_pipe_crc_basic:
> > Subgroup suspend-read-crc-pipe-a:
> > incomplete -> PASS   (hsw-gt2)
> > Subgroup suspend-read-crc-pipe-c:
> > dmesg-warn -> PASS   (bsw-nuc-2)
> >
> > bdw-nuci7    total:156  pass:147  dwarn:0   dfail:0   fail:0   skip:9
> > bdw-ultra    total:159  pass:147  dwarn:0   dfail:0   fail:0   skip:12
> > bsw-nuc-2    total:159  pass:129  dwarn:0   dfail:0   fail:0   skip:30
> > byt-nuc      total:159  pass:136  dwarn:0   dfail:0   fail:0   skip:23
> > hsw-brixbox  total:159  pass:146  dwarn:0   dfail:0   fail:0   skip:13
> > hsw-gt2      total:159  pass:149  dwarn:0   dfail:0   fail:0   skip:10
> > ilk-hp8440p  total:159  pass:111  dwarn:0   dfail:0   fail:0   skip:48
> > ivb-t430s    total:159  pass:145  dwarn:0   dfail:0   fail:0   skip:14
> > snb-dellxps  total:159  pass:137  dwarn:0   dfail:0   fail:0   skip:22
> >
> > HANGED skl-i5k-2 in igt@gem_sync@basic-blt
> >
> This seems to be the Nightly base which has hung, not the run with the patch
> applied. Still, the run was a failure. Tomi, what's up with this?

No, the hang is with the patch applied. It might be unrelated to the patch, but
it is a hang nonetheless. From the igt/piglit log:

Linux skl-i5k-2 4.5.0-rc1-gfxbench+ #1 SMP PREEMPT Mon Feb 1 10:06:18 EET 2016 x86_64 x86_64 x86_64 GNU/Linux
Nightly_445 Patchwork_1328

[...]

[053/159] skip: 7, pass: 45, dmesg-warn: 1 /
skip: igt/gem_storedw_loop/basic-bsd2   

[054/159] skip: 8, pass: 45, dmesg-warn: 1 /
running: igt/gem_sync/basic-blt 

[054/159] skip: 8, pass: 45, dmesg-warn: 1 -
Build timed out (after 10 minutes). Marking the build as aborted.
ERROR: remote file operation failed: /opt/jenkins/workspace/CI_IGT_test at hudson.remoting.Channel@5906ab1e:igt-skl-i5k-2: hudson.remoting.ChannelClosedException: channel is already closed
Build was aborted
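
To pick the hanging test out of a log like the one above without reading it by eye, a sketch along these lines should do. It assumes only the "running: <test>" lines seen in this excerpt; the function name is made up.

```python
def last_running_test(log_text):
    """Return the test named on the last 'running:' line, or None."""
    last = None
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("running:"):
            last = line[len("running:"):].strip()
    return last
```

If the build is aborted before the runner reports a result, the last test it announced as running is the most likely culprit.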

> > Results at /archive/results/CI_IGT_test/Patchwork_1328/
> >
> > 6b1049b84dcd979f631d15b2ada325d8e5b2c4e1 drm-intel-nightly: 2016y-01m-29d-22h-50m-57s UTC integration manifest
> > 4e52a4b9960b2761c6daef4374b8a2695212413a drm/i915/skl: Add missing SKL ids

Here you can see the relevant commit ids.

Tomi


Re: [Intel-gfx] ✗ Fi.CI.BAT: failure for drm/i915/skl: Add missing SKL GT3 id (rev2)

2016-02-01 Thread Sarvela, Tomi P
> From: Mika Kuoppala [mailto:mika.kuopp...@linux.intel.com]
> 
> Ok. Then the rendering of the Patchwork_1328 report is somehow wrong as
> in that picture the HANG box is in the nightly base column.

In the result box there is no SKL-I5K column at all. You're looking at HSW-GT2?

Tomi

