I think I'm hitting this problem too, on focal with kernel
5.4.0-14-generic #17-Ubuntu.

Kernel log says:

----8<----
Feb 24 15:27:53 vesho kernel: Asynchronous wait on fence i915:Xorg[2401]:1d16c timed out (hint:intel_atomic_commit_ready+0x0/0x54 [i915])
Feb 24 15:27:53 vesho kernel: Asynchronous wait on fence i915:Xorg[2401]:1d170 timed out (hint:intel_atomic_commit_ready+0x0/0x54 [i915])
Feb 24 15:27:53 vesho kernel: Asynchronous wait on fence i915:Xorg[2401]:1d16e timed out (hint:intel_atomic_commit_ready+0x0/0x54 [i915])
Feb 24 15:27:57 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:05 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:07 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:09 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:11 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:13 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:15 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:17 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:19 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:21 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:23 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:25 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:27 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:29 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:31 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:33 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:35 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:37 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:39 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:41 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:42 vesho kernel: GpuWatchdog[9598]: segfault at 0 ip 0000560157086e32 sp 00007f17aaa944c0 error 6 in chrome[560153140000+7287000]
Feb 24 15:28:42 vesho kernel: Code: 83 c3 e8 75 e9 41 8b 85 00 01 00 00 85 c0 0f 84 99 00 00 00 48 8d 3d 63 61 4b fb be 01 00 00 00 ba 03 00 00 00 e8 fe 17 a6 fe <c7> 04 25 00>
Feb 24 15:28:43 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:45 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:47 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:49 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:51 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:53 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:55 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:57 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:28:59 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Feb 24 15:29:01 vesho kernel: i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
Feb 24 15:29:01 vesho kernel: i915 0000:00:02.0: Resetting chip for hang on rcs0
Feb 24 15:29:02 vesho kernel: fbcon: Taking over console
Feb 24 15:29:02 vesho kernel: Console: switching to colour frame buffer device 240x67
Feb 24 15:29:03 vesho kernel: i915 0000:00:02.0: GPU recovery timed out, cancelling all in-flight rendering.
Feb 24 15:29:03 vesho kernel: i915 0000:00:02.0: Resetting chip for hang on rcs0
Feb 24 15:29:13 vesho kernel: GpuWatchdog[17771]: segfault at 0 ip 000055f174f88e32 sp 00007f5f101404c0 error 6 in chrome[55f171042000+7287000]
Feb 24 15:29:13 vesho kernel: Code: 83 c3 e8 75 e9 41 8b 85 00 01 00 00 85 c0 0f 84 99 00 00 00 48 8d 3d 63 61 4b fb be 01 00 00 00 ba 03 00 00 00 e8 fe 17 a6 fe <c7> 04 25 00>
Feb 24 15:29:23 vesho kernel: GpuWatchdog[18031]: segfault at 0 ip 000055e12a17ee32 sp 00007f2bfbe2a4c0 error 6 in chrome[55e126238000+7287000]
Feb 24 15:29:23 vesho kernel: Code: 83 c3 e8 75 e9 41 8b 85 00 01 00 00 85 c0 0f 84 99 00 00 00 48 8d 3d 63 61 4b fb be 01 00 00 00 ba 03 00 00 00 e8 fe 17 a6 fe <c7> 04 25 00>
----8<----

(The "fbcon" lines appear because I switched to a text VT to recover.)
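
For anyone comparing their own logs against this one, here is a rough
sketch that counts the "Resetting ... for hang on ..." messages and the
time they span. It assumes syslog-style lines like the ones above; the
function name summarize_resets and the year default are my own
assumptions (syslog timestamps carry no year), not anything from i915.

```python
import re
from collections import Counter
from datetime import datetime

# Matches lines such as:
# "Feb 24 15:27:57 vesho kernel: i915 0000:00:02.0: Resetting rcs0 for hang on rcs0"
RESET_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: i915 \S+ "
    r"Resetting (?P<target>\S+) for hang on (?P<engine>\S+)"
)


def summarize_resets(lines, year=2020):
    """Return (Counter of reset targets, seconds between first and last reset).

    `year` is an assumption: syslog timestamps omit it, so it is supplied here.
    """
    counts = Counter()
    stamps = []
    for line in lines:
        m = RESET_RE.match(line)
        if not m:
            continue  # not an i915 reset message
        counts[m.group("target")] += 1
        stamps.append(
            datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
        )
    span = (max(stamps) - min(stamps)).total_seconds() if stamps else 0.0
    return counts, span


if __name__ == "__main__":
    import sys

    counts, span = summarize_resets(sys.stdin)
    print(dict(counts), f"over {span:.0f} s")
```

Feeding it the saved kernel log on stdin prints how many engine resets
("rcs0") and full chip resets ("chip") occurred and over how many
seconds.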

`/sys/class/drm/card0/error` has:

----8<----
GPU HANG: ecode 9:1:0x00000000, hang on rcs0
Kernel: 5.4.0-14-generic x86_64
Driver: 20190822
Time: 1582550869 s 450292 us
Boottime: 917 s 861996 us
Uptime: 916 s 761786 us
Epoch: 4295121736 jiffies (250 HZ)
Capture: 4295121736 jiffies; 2568 ms ago, 0 ms after epoch
Reset count: 0
Suspend count: 0
Platform: KABYLAKE
Subplatform: 0x0
PCI ID: 0x591b
PCI Revision: 0x04
PCI Subsystem: 1028:07bf
IOMMU enabled?: 1
DMC loaded: yes
DMC fw version: 1.4
GT awake: yes
RPM wakelock: yes
PM suspended: no
EIR: 0x00000000
IER: 0x08080000
GTIER[0]: 0x01010101
GTIER[1]: 0x01010101
GTIER[2]: 0x00000070
GTIER[3]: 0x00000101
PGTBL_ER: 0x00000000
FORCEWAKE: 0x00010001
DERRMR: 0x2077efef
CCID: 0x00000000
  fence[0] = 8e702f00858001
  fence[1] = e4302f009c4001
  fence[2] = ae7f09707f00001
  fence[3] = 4f7f09702000001
  fence[4] = ecbf0970bd40001
  fence[5] = fbbf02f0ecc0001
  fence[6] = 00000000
  fence[7] = 00000000
  fence[8] = 00000000
  fence[9] = 00000000
  fence[10] = 00000000
  fence[11] = 00000000
  fence[12] = 00000000
  fence[13] = 00000000
  fence[14] = 00000000
  fence[15] = 00000000
  fence[16] = 00000000
  fence[17] = 00000000
  fence[18] = 00000000
  fence[19] = 00000000
  fence[20] = 00000000
  fence[21] = 00000000
  fence[22] = 00000000
  fence[23] = 00000000
  fence[24] = 00000000
  fence[25] = 00000000
  fence[26] = 00000000
  fence[27] = 00000000
  fence[28] = 00000000
  fence[29] = 00000000
  fence[30] = 00000000
  fence[31] = 00000000
ERROR: 0x00000000
DONE_REG: 0xffffffff
FAULT_TLB_DATA: 0x00000009 0xf4bff213
Num Pipes: 3
Pipe [0]:
  Power: on
  SRC: 077f0437
  STAT: 00000000
Plane [0]:
  CNTR: c4042400
  STRIDE: 00000026
  SURF: 08f40000
  TILEOFF: 000d0240
Cursor [0]:
  CNTR: 00000000
  POS: 00000000
  BASE: 00000000
Pipe [1]:
  Power: on
  SRC: 09ff059f
  STAT: 00000000
Plane [1]:
  CNTR: c4043000
  STRIDE: 00000050
  SURF: 05000000
  TILEOFF: 00000000
Cursor [1]:
  CNTR: 04000027
  POS: 005a007b
  BASE: 00840000
Pipe [2]:
  Power: on
  SRC: 09ff059f
  STAT: 00000000
Plane [2]:
  CNTR: c4043000
  STRIDE: 00000050
  SURF: 06e00000
  TILEOFF: 00000000
Cursor [2]:
  CNTR: 00000000
  POS: 00000000
  BASE: 00000000
CPU transcoder: A
  Power: on
  CONF: 00000000
  HTOTAL: 00000000
  HBLANK: 00000000
  HSYNC: 00000000
  VTOTAL: 00000000
  VBLANK: 00000000
  VSYNC: 00000000
CPU transcoder: B
  Power: on
  CONF: c0000000
  HTOTAL: 0a9f09ff
  HBLANK: 0a9f09ff
  HSYNC: 0a4f0a2f
  VTOTAL: 05c8059f
  VBLANK: 05c8059f
  VSYNC: 05a705a2
CPU transcoder: C
  Power: on
  CONF: c0000000
  HTOTAL: 0a9f09ff
  HBLANK: 0a9f09ff
  HSYNC: 0a4f0a2f
  VTOTAL: 05c8059f
  VBLANK: 05c8059f
  VSYNC: 05a705a2
CPU transcoder: EDP
  Power: on
  CONF: c0000000
  HTOTAL: 081f077f
  HBLANK: 081f077f
  HSYNC: 07cf07af
  VTOTAL: 04560437
  VBLANK: 04560437
  VSYNC: 043f043a
is_mobile: no
is_lp: no
require_force_probe: no
has_64bit_reloc: yes
gpu_reset_clobbers_display: no
has_reset_engine: yes
has_fpga_dbg: yes
has_global_mocs: no
has_gt_uc: yes
has_l3_dpf: no
has_llc: yes
has_logical_ring_contexts: yes
has_logical_ring_elsq: no
has_logical_ring_preemption: yes
has_pooled_eu: no
has_rc6: yes
has_rc6p: no
has_rps: yes
has_runtime_pm: yes
has_snoop: no
has_coherent_ggtt: yes
unfenced_needs_alignment: no
hws_needs_physical: no
cursor_needs_physical: no
has_csr: yes
has_ddi: yes
has_dp_mst: yes
has_fbc: yes
has_gmch: no
has_hotplug: yes
has_ipc: yes
has_modular_fia: no
has_overlay: no
has_psr: yes
overlay_needs_physical: no
supports_tv: no
Has logical contexts? yes
scheduler: 1f
slice0: 3 subslice(s) (0x7):
        subslice0: 8 EUs (0xff)
        subslice1: 8 EUs (0xff)
        subslice2: 8 EUs (0xff)
        subslice3: 0 EUs (0x0)
slice1: 0 subslice(s) (0x0):
        subslice0: 0 EUs (0x0)
        subslice1: 0 EUs (0x0)
        subslice2: 0 EUs (0x0)
        subslice3: 0 EUs (0x0)
slice2: 0 subslice(s) (0x0):
        subslice0: 0 EUs (0x0)
        subslice1: 0 EUs (0x0)
        subslice2: 0 EUs (0x0)
        subslice3: 0 EUs (0x0)
i915.vbt_firmware=(null)
i915.modeset=-1
i915.lvds_channel_mode=0
i915.panel_use_ssc=-1
i915.vbt_sdvo_panel_type=-1
i915.enable_dc=-1
i915.enable_fbc=1
i915.enable_psr=-1
i915.disable_power_well=1
i915.enable_ips=1
i915.invert_brightness=0
i915.enable_guc=0
i915.guc_log_level=-1
i915.guc_firmware_path=(null)
i915.huc_firmware_path=(null)
i915.dmc_firmware_path=(null)
i915.mmio_debug=0
i915.edp_vswing=0
i915.reset=2
i915.inject_load_failure=0
i915.fastboot=-1
i915.enable_dpcd_backlight=-1
i915.force_probe=
i915.alpha_support=no
i915.enable_hangcheck=yes
i915.prefault_disable=no
i915.load_detect_test=no
i915.force_reset_modeset_test=no
i915.error_capture=yes
i915.disable_display=no
i915.verbose_state_checks=yes
i915.nuclear_pageflip=no
i915.enable_dp_mst=yes
i915.enable_gvt=no
GuC firmware:
        status: DISABLED
        version: wanted 33.0, found 0.0
        uCode: 0 bytes
        RSA: 0 bytes
HuC firmware: (null)
        status: N/A
        version: wanted 0.0, found 0.0
        uCode: 0 bytes
        RSA: 0 bytes
----8<----
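
In case it helps when reading the dump, a small sketch that decodes two
of the fields: the "Time:" line as a UTC timestamp, and the per-pipe
"SRC:" values as resolutions. The SRC decoding is an assumption on my
part (width minus one in the upper 16 bits, height minus one in the
lower 16 bits); the helper names are hypothetical as well.

```python
import re
from datetime import datetime, timezone


def parse_capture_time(dump):
    """Return the 'Time: <sec> s <usec> us' line as an aware UTC datetime."""
    m = re.search(r"^Time: (\d+) s (\d+) us$", dump, re.MULTILINE)
    if m is None:
        return None
    seconds, micros = int(m.group(1)), int(m.group(2))
    stamp = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return stamp.replace(microsecond=micros)


def decode_size(reg_hex):
    """Decode a SRC value into (width, height).

    Assumed layout: (width - 1) << 16 | (height - 1).
    """
    value = int(reg_hex, 16)
    return ((value >> 16) & 0xFFFF) + 1, (value & 0xFFFF) + 1


def pipe_sizes(dump):
    """Return the decoded (width, height) for every 'SRC:' line, in order."""
    return [
        decode_size(h)
        for h in re.findall(r"^[ \t]*SRC: ([0-9a-f]{8})$", dump, re.MULTILINE)
    ]


if __name__ == "__main__":
    with open("/sys/class/drm/card0/error") as fh:  # usually needs root
        text = fh.read()
    print(parse_capture_time(text), pipe_sizes(text))
```

If the assumed layout is right, the three SRC values in this dump would
correspond to one 1920x1080 pipe and two 2560x1440 pipes, and the Time
field falls about two hours before the local timestamps in the kernel
log above.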

Next I will try the vanilla 5.4.22 build that I see available in the
mainline builds PPA (note that the proposed hangfix kernel above is
older than what I'm using).

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1861395

Title:
  system hang: i915 Resetting rcs0 for hang on rcs0

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1861395/+subscriptions
