https://bugs.freedesktop.org/show_bug.cgi?id=97369
Bug ID: 97369
Summary: AMDGPU/Iceland hangs kernel 4.8-rc2
Product: DRI
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
Severity: major
Priority: medium
Component: DRM/AMDgpu
Assignee: dri-devel at lists.freedesktop.org
Reporter: krejzi at email.com

Since updating to the 4.8 series, the kernel hangs when Xorg tries to start up. I've tracked the issue to the amdgpu driver. Blacklisting the driver makes the hang go away.

systemd-journald has recorded the following:

Aug 16 19:50:19 krejzi kernel: amdgpu 0000:01:00.0: Refused to change power state, currently in D3
Aug 16 19:50:19 krejzi kernel: amdgpu 0000:01:00.0: Refused to change power state, currently in D3
Aug 16 19:50:19 krejzi kernel: amdgpu 0000:01:00.0: Refused to change power state, currently in D3
Aug 16 19:50:21 krejzi kernel: amdgpu 0000:01:00.0: Wait for MC idle timedout !
Aug 16 19:50:22 krejzi kernel: amdgpu 0000:01:00.0: Wait for MC idle timedout !
Aug 16 19:50:22 krejzi kernel: [drm] PCIE GART of 2048M enabled (table at 0x0000000000040000).
Aug 16 19:50:24 krejzi kernel: BUG: unable to handle kernel paging request at ffffc91c00763fec
Aug 16 19:50:24 krejzi kernel: IP: [<ffffffffa0084aad>] iceland_smu_populate_single_firmware_entry+0x4d/0x100 [amdgpu]
Aug 16 19:50:24 krejzi kernel: PGD 23700e067 PUD 0
Aug 16 19:50:24 krejzi kernel: Oops: 0002 [#1] PREEMPT SMP
Aug 16 19:50:24 krejzi kernel: Modules linked in: iwlmvm amdkfd amdgpu ttm iwlwifi intel_vbtn
Aug 16 19:50:24 krejzi kernel: CPU: 3 PID: 129 Comm: kworker/3:2 Not tainted 4.8.0-rc2-krejzi #1
Aug 16 19:50:24 krejzi kernel: Hardware name: HP HP ProBook 470 G3/8102, BIOS N78 Ver. 01.11 05/09/2016
Aug 16 19:50:24 krejzi kernel: Workqueue: pm pm_runtime_work
Aug 16 19:50:24 krejzi kernel: task: ffff8802365b9b00 task.stack: ffff88023627c000
Aug 16 19:50:24 krejzi kernel: RIP: 0010:[<ffffffffa0084aad>] [<ffffffffa0084aad>] iceland_smu_populate_single_firmware_entry+0x4d/0x100 [amdgpu]
Aug 16 19:50:24 krejzi kernel: RSP: 0018:ffff88023627fc40 EFLAGS: 00010202
Aug 16 19:50:24 krejzi kernel: RAX: 000000008002e000 RBX: ffff880231500000 RCX: 000000000000007e
Aug 16 19:50:24 krejzi kernel: RDX: ffffc91c00763fec RSI: 0000000000000003 RDI: 0000000000002180
Aug 16 19:50:24 krejzi kernel: RBP: ffffc90000764000 R08: 0000000000000002 R09: ffff88023627fc34
Aug 16 19:50:24 krejzi kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff880231717500
Aug 16 19:50:24 krejzi kernel: R13: 0000000000008000 R14: 0000000000000000 R15: 0000000000000246
Aug 16 19:50:24 krejzi kernel: FS: 0000000000000000(0000) GS:ffff8802404c0000(0000) knlGS:0000000000000000
Aug 16 19:50:24 krejzi kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 16 19:50:24 krejzi kernel: CR2: ffffc91c00763fec CR3: 0000000001e07000 CR4: 00000000003406e0
Aug 16 19:50:24 krejzi kernel: Stack:
Aug 16 19:50:24 krejzi kernel: 0000000000000000 ffff8802315005bc ffff880231500000 ffffffffa0085156
Aug 16 19:50:24 krejzi kernel: ffff0040ffffffff ffff880231500000 ffff8802315037b8 0000000000000048
Aug 16 19:50:24 krejzi kernel: 0000000000000002 ffff88023627fdd0 ffff880231500000 ffffffffa008578c
Aug 16 19:50:24 krejzi kernel: Call Trace:
Aug 16 19:50:24 krejzi kernel: [<ffffffffa0085156>] ? iceland_smu_start+0x336/0x5d0 [amdgpu]
Aug 16 19:50:24 krejzi kernel: [<ffffffffa008578c>] ? iceland_dpm_resume+0x1c/0x40 [amdgpu]
Aug 16 19:50:24 krejzi kernel: [<ffffffffa004e5a8>] ? amdgpu_resume+0x58/0xa0 [amdgpu]
Aug 16 19:50:24 krejzi kernel: [<ffffffffa00513e3>] ? amdgpu_resume_kms+0xa3/0x370 [amdgpu]
Aug 16 19:50:24 krejzi kernel: [<ffffffffa004e12c>] ? amdgpu_pmops_runtime_resume+0x6c/0xa0 [amdgpu]
Aug 16 19:50:24 krejzi kernel: [<ffffffff81362023>] ? pci_pm_runtime_resume+0x73/0xa0
Aug 16 19:50:24 krejzi kernel: [<ffffffff81509760>] ? vga_switcheroo_set_dynamic_switch+0x80/0x80
Aug 16 19:50:24 krejzi kernel: [<ffffffff81518688>] ? __rpm_callback+0x28/0x60
Aug 16 19:50:24 krejzi kernel: [<ffffffff815186da>] ? rpm_callback+0x1a/0x70
Aug 16 19:50:24 krejzi kernel: [<ffffffff81509760>] ? vga_switcheroo_set_dynamic_switch+0x80/0x80
Aug 16 19:50:24 krejzi kernel: [<ffffffff81519733>] ? rpm_resume+0x3e3/0x5f0
Aug 16 19:50:24 krejzi kernel: [<ffffffff81061427>] ? __switch_to+0x37/0x580
Aug 16 19:50:24 krejzi kernel: [<ffffffff818c4cbe>] ? _raw_spin_unlock_irq+0xe/0x20
Aug 16 19:50:24 krejzi kernel: [<ffffffff8151a06e>] ? pm_runtime_work+0x4e/0xa0
Aug 16 19:50:24 krejzi kernel: [<ffffffff810d51d2>] ? process_one_work+0x1c2/0x400
Aug 16 19:50:24 krejzi kernel: [<ffffffff810d5452>] ? worker_thread+0x42/0x4c0
Aug 16 19:50:24 krejzi kernel: [<ffffffff818c071b>] ? __schedule+0x2db/0x6a0
Aug 16 19:50:24 krejzi kernel: [<ffffffff810d5410>] ? process_one_work+0x400/0x400
Aug 16 19:50:24 krejzi kernel: [<ffffffff810da758>] ? kthread+0xc8/0xe0
Aug 16 19:50:24 krejzi kernel: [<ffffffff818c533f>] ? ret_from_fork+0x1f/0x40
Aug 16 19:50:24 krejzi kernel: [<ffffffff810da690>] ? kthread_worker_fn+0x170/0x170
Aug 16 19:50:24 krejzi kernel: Code: 05 48 8b 8c 0b e0 65 00 00 48 85 c9 0f 84 b7 00 00 00 48 8b 49 08 48 05 2f 03 00 00 48 c1 e0 05 48 8b 44 03 08 8b 79 14 8b 49 10 <66> 89 32 c7 42 0c 00 00 00 00 66 89 4a 02 48 89 c1 48 c1 e9 20
Aug 16 19:50:24 krejzi kernel: RIP [<ffffffffa0084aad>] iceland_smu_populate_single_firmware_entry+0x4d/0x100 [amdgpu]
Aug 16 19:50:24 krejzi kernel: RSP <ffff88023627fc40>
Aug 16 19:50:24 krejzi kernel: CR2: ffffc91c00763fec
Aug 16 19:50:24 krejzi kernel: ---[ end trace 176c593915795723 ]---
Aug 16 19:50:24 krejzi kernel: general protection fault: 0000 [#2] PREEMPT SMP
Aug 16 19:50:24 krejzi kernel: Modules linked in: iwlmvm amdkfd amdgpu ttm iwlwifi intel_vbtn
Aug 16 19:50:24 krejzi kernel: CPU: 3 PID: 129 Comm: kworker/3:2 Tainted: G D 4.8.0-rc2-krejzi #1
Aug 16 19:50:24 krejzi kernel: Hardware name: HP HP ProBook 470 G3/8102, BIOS N78 Ver. 01.11 05/09/2016
Aug 16 19:50:24 krejzi kernel: task: ffff8802365b9b00 task.stack: ffff88023627c000
Aug 16 19:50:24 krejzi kernel: RIP: 0010:[<ffffffff810fe430>] [<ffffffff810fe430>] queued_spin_lock_slowpath+0x150/0x190
Aug 16 19:50:24 krejzi kernel: RSP: 0018:ffff88023627fe70 EFLAGS: 00010082
Aug 16 19:50:24 krejzi kernel: RAX: ccccccccccce3fbc RBX: ffff88023627ff18 RCX: ffff8802404d72c0
Aug 16 19:50:24 krejzi kernel: RDX: 0000000000002d47 RSI: ffffffff81c4c914 RDI: 0000000000100000
Aug 16 19:50:24 krejzi kernel: RBP: 0000000000000000 R08: 00000000000000ff R09: 0000000000000000
Aug 16 19:50:24 krejzi kernel: R10: 00000000b98c6018 R11: ffff880237003200 R12: ffff8802404d72c0
Aug 16 19:50:24 krejzi kernel: R13: 0000000000000046 R14: ffffc91c00763fec R15: 0000000000000009
Aug 16 19:50:24 krejzi kernel: FS: 0000000000000000(0000) GS:ffff8802404c0000(0000) knlGS:0000000000000000
Aug 16 19:50:24 krejzi kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 16 19:50:24 krejzi kernel: CR2: 0000000000000028 CR3: 00000002240c9000 CR4: 00000000003406e0
Aug 16 19:50:24 krejzi kernel: Stack:
Aug 16 19:50:24 krejzi kernel: ffff88023627ff18 0000000000000282 0000000000000000 ffffffff818c4e61
Aug 16 19:50:24 krejzi kernel: ffff88023627ff18 ffff88023627ff10 ffffffff810f7103 ffff8802365ba210
Aug 16 19:50:24 krejzi kernel: ffff8802365b9b00 0000000000000000 ffffffff810bcf9c ffff8802365b9b00
Aug 16 19:50:24 krejzi kernel: Call Trace:
Aug 16 19:50:24 krejzi kernel: [<ffffffff818c4e61>] ? _raw_spin_lock_irqsave+0x31/0x40
Aug 16 19:50:24 krejzi kernel: [<ffffffff810f7103>] ? complete+0x13/0x40
Aug 16 19:50:24 krejzi kernel: [<ffffffff810bcf9c>] ? mm_release+0x9c/0x120
Aug 16 19:50:24 krejzi kernel: [<ffffffff810c28ea>] ? do_exit+0x70a/0xad0
Aug 16 19:50:24 krejzi kernel: [<ffffffff818c6a47>] ? rewind_stack_do_exit+0x17/0x20
Aug 16 19:50:24 krejzi kernel: [<ffffffff810da690>] ? kthread_worker_fn+0x170/0x170
Aug 16 19:50:24 krejzi kernel: Code: b8 01 00 00 00 66 89 03 5b 5d 41 5c c3 c1 ea 12 83 e0 03 83 ea 01 48 c1 e0 04 48 63 d2 48 05 c0 72 01 00 48 03 04 d5 60 03 ea 81 <48> 89 08 8b 41 08 85 c0 75 09 f3 90 8b 41 08 85 c0 74 f7 4c 8b
Aug 16 19:50:24 krejzi kernel: RIP [<ffffffff810fe430>] queued_spin_lock_slowpath+0x150/0x190
Aug 16 19:50:24 krejzi kernel: RSP <ffff88023627fe70>
Aug 16 19:50:24 krejzi kernel: ---[ end trace 176c593915795724 ]---
Aug 16 19:50:24 krejzi kernel: Fixing recursive fault but reboot is needed!
Aug 16 19:50:24 krejzi kernel: BUG: scheduling while atomic: kworker/3:2/129/0x00000003
Aug 16 19:50:24 krejzi kernel: Modules linked in: iwlmvm amdkfd amdgpu ttm iwlwifi intel_vbtn
Aug 16 19:50:24 krejzi kernel: Preemption disabled at:[< (null)>] (null)
Aug 16 19:50:24 krejzi kernel:
Aug 16 19:50:24 krejzi kernel: CPU: 3 PID: 129 Comm: kworker/3:2 Tainted: G D 4.8.0-rc2-krejzi #1
Aug 16 19:50:24 krejzi kernel: Hardware name: HP HP ProBook 470 G3/8102, BIOS N78 Ver. 01.11 05/09/2016
Aug 16 19:50:24 krejzi kernel: 0000000000000086 00000000b523baa8 ffffffff8132745c ffff8802404d67c0
Aug 16 19:50:24 krejzi kernel: 00000000000167c0 ffffffff810de8a7 ffffffff818c09bf ffff8802365b9b00
Aug 16 19:50:24 krejzi kernel: 00000000b523baa8 ffff880236280000 000000000000000b 0000000000000000
Aug 16 19:50:24 krejzi kernel: Call Trace:
Aug 16 19:50:24 krejzi kernel: [<ffffffff8132745c>] ? dump_stack+0x46/0x5a
Aug 16 19:50:24 krejzi kernel: [<ffffffff810de8a7>] ? __schedule_bug+0x57/0xb0
Aug 16 19:50:24 krejzi kernel: [<ffffffff818c09bf>] ? __schedule+0x57f/0x6a0
Aug 16 19:50:24 krejzi kernel: [<ffffffff818c0b16>] ? schedule+0x36/0x90
Aug 16 19:50:24 krejzi kernel: [<ffffffff810c2a96>] ? do_exit+0x8b6/0xad0
Aug 16 19:50:24 krejzi kernel: [<ffffffff818c6a47>] ? rewind_stack_do_exit+0x17/0x20
Aug 16 19:50:24 krejzi kernel: [<ffffffff810da690>] ? kthread_worker_fn+0x170/0x170
Aug 16 19:50:24 krejzi kernel: BUG: unable to handle kernel paging request at ffffffffffffffd8
Aug 16 19:50:24 krejzi kernel: IP: [<ffffffff810dab57>] kthread_data+0x7/0x10

Relevant dmesg output from amdgpu load time (also obtained from the journal):

Aug 16 19:49:49 krejzi kernel: [drm] amdgpu kernel modesetting enabled.
Aug 16 19:49:49 krejzi kernel: vga_switcheroo: detected switching method \_SB_.PCI0.GFX0.ATPX handle
Aug 16 19:49:49 krejzi kernel: ATPX version 1, functions 0x00000003
Aug 16 19:49:49 krejzi kernel: ATPX Hybrid Graphics
Aug 16 19:49:49 krejzi kernel: iwlwifi 0000:03:00.0: loaded firmware version 22.361476.0 op_mode iwlmvm
Aug 16 19:49:49 krejzi kernel: CRAT table not found
Aug 16 19:49:49 krejzi kernel: Finished initializing topology ret=0
Aug 16 19:49:49 krejzi kernel: kfd kfd: Initialized module
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: enabling device (0006 -> 0007)
Aug 16 19:49:49 krejzi kernel: [drm] initializing kernel modesetting (TOPAZ 0x1002:0x6900 0x103C:0x811C 0x83).
Aug 16 19:49:49 krejzi kernel: [drm] register mmio base: 0xE2000000
Aug 16 19:49:49 krejzi kernel: [drm] register mmio size: 262144
Aug 16 19:49:49 krejzi kernel: [drm] doorbell mmio base: 0xE0000000
Aug 16 19:49:49 krejzi kernel: [drm] doorbell mmio size: 2097152
Aug 16 19:49:49 krejzi kernel: [drm] probing gen 2 caps for device 8086:9d10 = 1724843/e
Aug 16 19:49:49 krejzi kernel: [drm] probing mlw for device 8086:9d10 = 1724843
Aug 16 19:49:49 krejzi kernel: vga_switcheroo: enabled
Aug 16 19:49:49 krejzi kernel: ATOM BIOS: HP/Quanta
Aug 16 19:49:49 krejzi kernel: [drm] GPU not posted. posting now...
Aug 16 19:49:49 krejzi kernel: [drm] Changing default dispclk from 0Mhz to 600Mhz
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: VRAM: 2048M 0x0000000000000000 - 0x000000007FFFFFFF (2048M used)
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: GTT: 2048M 0x0000000080000000 - 0x00000000FFFFFFFF
Aug 16 19:49:49 krejzi kernel: [drm] Detected VRAM RAM=2048M, BAR=256M
Aug 16 19:49:49 krejzi kernel: [drm] RAM width 64bits DDR3
Aug 16 19:49:49 krejzi kernel: [TTM] Zone kernel: Available graphics memory: 4027936 kiB
Aug 16 19:49:49 krejzi kernel: [TTM] Zone dma32: Available graphics memory: 2097152 kiB
Aug 16 19:49:49 krejzi kernel: [TTM] Initializing pool allocator
Aug 16 19:49:49 krejzi kernel: [TTM] Initializing DMA pool allocator
Aug 16 19:49:49 krejzi kernel: [drm] amdgpu: 2048M of VRAM memory ready
Aug 16 19:49:49 krejzi kernel: [drm] amdgpu: 2048M of GTT memory ready.
Aug 16 19:49:49 krejzi kernel: [drm] GART: num cpu pages 524288, num gpu pages 524288
Aug 16 19:49:49 krejzi kernel: [drm] PCIE GART of 2048M enabled (table at 0x0000000000040000).
Aug 16 19:49:49 krejzi kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
Aug 16 19:49:49 krejzi kernel: [drm] Driver supports precise vblank timestamp query.
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: amdgpu: using MSI.
Aug 16 19:49:49 krejzi kernel: [drm] amdgpu: irq initialized.
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000010, cpu addr 0xffff880231523010
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000080000020, cpu addr 0xffff880231523020
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000080000030, cpu addr 0xffff880231523030
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000040, cpu addr 0xffff880231523040
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000080000050, cpu addr 0xffff880231523050
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000080000060, cpu addr 0xffff880231523060
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 6 use gpu addr 0x0000000080000070, cpu addr 0xffff880231523070
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 7 use gpu addr 0x0000000080000080, cpu addr 0xffff880231523080
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 8 use gpu addr 0x0000000080000090, cpu addr 0xffff880231523090
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 9 use gpu addr 0x00000000800000a0, cpu addr 0xffff8802315230a0
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 10 use gpu addr 0x00000000800000b0, cpu addr 0xffff8802315230b0
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 0 succeeded in 15 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 1 succeeded in 19 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 2 succeeded in 15 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 3 succeeded in 5 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 4 succeeded in 2 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 5 succeeded in 2 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 6 succeeded in 2 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 7 succeeded in 3 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 8 succeeded in 2 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 9 succeeded in 6 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 10 succeeded in 4 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 0 succeeded
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 1 succeeded
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 2 succeeded
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 3 succeeded
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 4 succeeded
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 5 succeeded
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 6 succeeded
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 7 succeeded
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 8 succeeded
Aug 16 19:49:49 krejzi kernel: [drm:sdma_v2_4_ring_test_ib [amdgpu]] *ERROR* amdgpu: fence wait failed (1000).
Aug 16 19:49:49 krejzi kernel: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu: failed testing IB on ring 9 (1000).
Aug 16 19:49:49 krejzi kernel: [drm:sdma_v2_4_ring_test_ib [amdgpu]] *ERROR* amdgpu: fence wait failed (1000).
Aug 16 19:49:49 krejzi kernel: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* amdgpu: failed testing IB on ring 10 (1000).
Aug 16 19:49:49 krejzi kernel: [drm] Initialized amdgpu 3.3.0 20150101 for 0000:01:00.0 on minor 1

Radeon R7 M340 Hybrid Graphics, Topaz/Iceland.