Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [CI,1/4] drm/i915/guc: Tidy guc_log_control

2018-03-10 Thread Chris Wilson
Quoting Michał Winiarski (2018-03-10 11:07:03)
> [   59.708020] [drm:error_state_write [i915]] Resetting error state
> [   59.708508] [IGT] gem_exec_capture: starting subtest capture-vebox
> [   59.718849] [drm] GPU HANG: ecode 9:0:0xfff7fffe, reason: Manually set wedged engine mask = , action: reset
> [   59.719421] i915 :00:02.0: Resetting vecs0 after gpu hang
> [   59.720276] [drm:i915_gem_reset_engine [i915]] resetting vecs0 to restart from tail of request 0x1
> [   59.721008] [drm:i915_reset_device [i915]] resetting chip
> [   59.721226] i915 :00:02.0: Resetting chip after gpu hang
> [   59.721575] i915 :00:02.0: GPU recovery failed

Full device reset doesn't handle being called from a failed per-engine
reset. Whoops. It doesn't look like there's any reason for the per-engine
reset to have failed either.

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 828f3104488c..44eef355e12c 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2985,6 +2985,7 @@ void i915_handle_error(struct drm_i915_private *dev_priv,
 	 */
 	intel_runtime_pm_get(dev_priv);
 
+	engine_mask &= INTEL_INFO(dev_priv)->ring_mask;
 	i915_capture_error_state(dev_priv, engine_mask, error_msg);
 	i915_clear_error_registers(dev_priv);
 
should fix the immediate problem; but there's no reason afaict for this
to vary between test runs. As to how to properly ignore left-over state
from per-engine reset when doing the full-reset fallback... ugh.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [CI,1/4] drm/i915/guc: Tidy guc_log_control

2018-03-10 Thread Michał Winiarski
On Fri, Mar 09, 2018 at 10:48:29PM +, Patchwork wrote:
> == Series Details ==
> 
> Series: series starting with [CI,1/4] drm/i915/guc: Tidy guc_log_control
> URL   : https://patchwork.freedesktop.org/series/39710/
> State : failure
> 
> == Summary ==
> 
>  Possible new issues:
> 
> Test drv_missed_irq:
>     pass       -> SKIP       (shard-apl)
> Test drv_selftest:
>     Subgroup live_guc:
>         pass       -> DMESG-WARN (shard-apl)
> Test gem_exec_capture:
>     Subgroup capture-vebox:
>         pass       -> FAIL       (shard-apl)
> Test perf:
>     Subgroup gen8-unprivileged-single-ctx-counters:
>         pass       -> FAIL       (shard-apl)

Note that drv_missed_irq, drv_selftest and perf are also failing without this
series (probably a good time to take a closer look at that).

I'm not sure what happened on gem_exec_capture (works for me):

[   59.708020] [drm:error_state_write [i915]] Resetting error state
[   59.708508] [IGT] gem_exec_capture: starting subtest capture-vebox
[   59.718849] [drm] GPU HANG: ecode 9:0:0xfff7fffe, reason: Manually set wedged engine mask = , action: reset
[   59.719421] i915 :00:02.0: Resetting vecs0 after gpu hang
[   59.720276] [drm:i915_gem_reset_engine [i915]] resetting vecs0 to restart from tail of request 0x1
[   59.721008] [drm:i915_reset_device [i915]] resetting chip
[   59.721226] i915 :00:02.0: Resetting chip after gpu hang
[   59.721575] i915 :00:02.0: GPU recovery failed

-Michał

> 
>  Known issues:
> 
> Test gem_eio:
>     Subgroup in-flight-contexts:
>         incomplete -> PASS       (shard-apl) fdo#105341 +1
> Test kms_flip:
>     Subgroup 2x-flip-vs-expired-vblank:
>         pass       -> FAIL       (shard-hsw) fdo#102887
>     Subgroup plain-flip-ts-check-interruptible:
>         fail       -> PASS       (shard-hsw) fdo#100368 +1
> Test kms_plane_multiple:
>     Subgroup atomic-pipe-c-tiling-y:
>         pass       -> FAIL       (shard-apl) fdo#103166
> 
> fdo#105341 https://bugs.freedesktop.org/show_bug.cgi?id=105341
> fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
> fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
> fdo#103166 https://bugs.freedesktop.org/show_bug.cgi?id=103166
> 
> shard-apl        total:3262 pass:1716 dwarn:2   dfail:0   fail:17  skip:1524 time:11509s
> shard-hsw        total:3467 pass:1770 dwarn:1   dfail:0   fail:3   skip:1692 time:11547s
> shard-snb        total:3467 pass:1365 dwarn:1   dfail:0   fail:1   skip:2100 time:6868s
> Blacklisted hosts:
> shard-kbl        total:3168 pass:1779 dwarn:2   dfail:1   fail:18  skip:1364 time:8065s
> 
> == Logs ==
> 
> For more details see: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8295/shards.html


[Intel-gfx] ✗ Fi.CI.IGT: failure for series starting with [CI,1/4] drm/i915/guc: Tidy guc_log_control

2018-03-09 Thread Patchwork
== Series Details ==

Series: series starting with [CI,1/4] drm/i915/guc: Tidy guc_log_control
URL   : https://patchwork.freedesktop.org/series/39710/
State : failure

== Summary ==

 Possible new issues:

Test drv_missed_irq:
    pass       -> SKIP       (shard-apl)
Test drv_selftest:
    Subgroup live_guc:
        pass       -> DMESG-WARN (shard-apl)
Test gem_exec_capture:
    Subgroup capture-vebox:
        pass       -> FAIL       (shard-apl)
Test perf:
    Subgroup gen8-unprivileged-single-ctx-counters:
        pass       -> FAIL       (shard-apl)

 Known issues:

Test gem_eio:
    Subgroup in-flight-contexts:
        incomplete -> PASS       (shard-apl) fdo#105341 +1
Test kms_flip:
    Subgroup 2x-flip-vs-expired-vblank:
        pass       -> FAIL       (shard-hsw) fdo#102887
    Subgroup plain-flip-ts-check-interruptible:
        fail       -> PASS       (shard-hsw) fdo#100368 +1
Test kms_plane_multiple:
    Subgroup atomic-pipe-c-tiling-y:
        pass       -> FAIL       (shard-apl) fdo#103166

fdo#105341 https://bugs.freedesktop.org/show_bug.cgi?id=105341
fdo#102887 https://bugs.freedesktop.org/show_bug.cgi?id=102887
fdo#100368 https://bugs.freedesktop.org/show_bug.cgi?id=100368
fdo#103166 https://bugs.freedesktop.org/show_bug.cgi?id=103166

shard-apl        total:3262 pass:1716 dwarn:2   dfail:0   fail:17  skip:1524 time:11509s
shard-hsw        total:3467 pass:1770 dwarn:1   dfail:0   fail:3   skip:1692 time:11547s
shard-snb        total:3467 pass:1365 dwarn:1   dfail:0   fail:1   skip:2100 time:6868s
Blacklisted hosts:
shard-kbl        total:3168 pass:1779 dwarn:2   dfail:1   fail:18  skip:1364 time:8065s

== Logs ==

For more details see: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_8295/shards.html