https://bugs.freedesktop.org/show_bug.cgi?id=97369

            Bug ID: 97369
           Summary: AMDGPU/Iceland hangs kernel 4.8-rc2
           Product: DRI
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: major
          Priority: medium
         Component: DRM/AMDgpu
          Assignee: dri-devel at lists.freedesktop.org
          Reporter: krejzi at email.com

Since updating to the 4.8 series, the kernel hangs when Xorg tries to start up.
I've tracked the issue to the amdgpu driver: blacklisting the driver makes the
hang go away.
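
For readers unfamiliar with the workaround, blacklisting a module is typically
done with a modprobe configuration fragment along these lines. This is only a
sketch: the file name is hypothetical, and the report does not say which method
was actually used on this machine.

```
# /etc/modprobe.d/blacklist-amdgpu.conf   (hypothetical file name)
# Stops amdgpu from being loaded automatically via its device aliases.
# Assumed method for illustration; not confirmed by the report.
blacklist amdgpu
```

If the module is also packed into the initramfs, that image may need to be
regenerated for the change to take effect. Passing modprobe.blacklist=amdgpu on
the kernel command line is a common alternative.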

systemd-journald has recorded the following:

Aug 16 19:50:19 krejzi kernel: amdgpu 0000:01:00.0: Refused to change power
state, currently in D3
Aug 16 19:50:19 krejzi kernel: amdgpu 0000:01:00.0: Refused to change power
state, currently in D3
Aug 16 19:50:19 krejzi kernel: amdgpu 0000:01:00.0: Refused to change power
state, currently in D3
Aug 16 19:50:21 krejzi kernel: amdgpu 0000:01:00.0: Wait for MC idle timedout !
Aug 16 19:50:22 krejzi kernel: amdgpu 0000:01:00.0: Wait for MC idle timedout !
Aug 16 19:50:22 krejzi kernel: [drm] PCIE GART of 2048M enabled (table at
0x0000000000040000).
Aug 16 19:50:24 krejzi kernel: BUG: unable to handle kernel paging request at
ffffc91c00763fec
Aug 16 19:50:24 krejzi kernel: IP: [<ffffffffa0084aad>]
iceland_smu_populate_single_firmware_entry+0x4d/0x100 [amdgpu]
Aug 16 19:50:24 krejzi kernel: PGD 23700e067 PUD 0 
Aug 16 19:50:24 krejzi kernel: Oops: 0002 [#1] PREEMPT SMP
Aug 16 19:50:24 krejzi kernel: Modules linked in: iwlmvm amdkfd amdgpu ttm
iwlwifi intel_vbtn
Aug 16 19:50:24 krejzi kernel: CPU: 3 PID: 129 Comm: kworker/3:2 Not tainted
4.8.0-rc2-krejzi #1
Aug 16 19:50:24 krejzi kernel: Hardware name: HP HP ProBook 470 G3/8102, BIOS
N78 Ver. 01.11 05/09/2016
Aug 16 19:50:24 krejzi kernel: Workqueue: pm pm_runtime_work
Aug 16 19:50:24 krejzi kernel: task: ffff8802365b9b00 task.stack:
ffff88023627c000
Aug 16 19:50:24 krejzi kernel: RIP: 0010:[<ffffffffa0084aad>] 
[<ffffffffa0084aad>] iceland_smu_populate_single_firmware_entry+0x4d/0x100
[amdgpu]
Aug 16 19:50:24 krejzi kernel: RSP: 0018:ffff88023627fc40  EFLAGS: 00010202
Aug 16 19:50:24 krejzi kernel: RAX: 000000008002e000 RBX: ffff880231500000 RCX:
000000000000007e
Aug 16 19:50:24 krejzi kernel: RDX: ffffc91c00763fec RSI: 0000000000000003 RDI:
0000000000002180
Aug 16 19:50:24 krejzi kernel: RBP: ffffc90000764000 R08: 0000000000000002 R09:
ffff88023627fc34
Aug 16 19:50:24 krejzi kernel: R10: 0000000000000000 R11: 0000000000000001 R12:
ffff880231717500
Aug 16 19:50:24 krejzi kernel: R13: 0000000000008000 R14: 0000000000000000 R15:
0000000000000246
Aug 16 19:50:24 krejzi kernel: FS:  0000000000000000(0000)
GS:ffff8802404c0000(0000) knlGS:0000000000000000
Aug 16 19:50:24 krejzi kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Aug 16 19:50:24 krejzi kernel: CR2: ffffc91c00763fec CR3: 0000000001e07000 CR4:
00000000003406e0
Aug 16 19:50:24 krejzi kernel: Stack:
Aug 16 19:50:24 krejzi kernel:  0000000000000000 ffff8802315005bc
ffff880231500000 ffffffffa0085156
Aug 16 19:50:24 krejzi kernel:  ffff0040ffffffff ffff880231500000
ffff8802315037b8 0000000000000048
Aug 16 19:50:24 krejzi kernel:  0000000000000002 ffff88023627fdd0
ffff880231500000 ffffffffa008578c
Aug 16 19:50:24 krejzi kernel: Call Trace:
Aug 16 19:50:24 krejzi kernel:  [<ffffffffa0085156>] ?
iceland_smu_start+0x336/0x5d0 [amdgpu]
Aug 16 19:50:24 krejzi kernel:  [<ffffffffa008578c>] ?
iceland_dpm_resume+0x1c/0x40 [amdgpu]
Aug 16 19:50:24 krejzi kernel:  [<ffffffffa004e5a8>] ? amdgpu_resume+0x58/0xa0
[amdgpu]
Aug 16 19:50:24 krejzi kernel:  [<ffffffffa00513e3>] ?
amdgpu_resume_kms+0xa3/0x370 [amdgpu]
Aug 16 19:50:24 krejzi kernel:  [<ffffffffa004e12c>] ?
amdgpu_pmops_runtime_resume+0x6c/0xa0 [amdgpu]
Aug 16 19:50:24 krejzi kernel:  [<ffffffff81362023>] ?
pci_pm_runtime_resume+0x73/0xa0
Aug 16 19:50:24 krejzi kernel:  [<ffffffff81509760>] ?
vga_switcheroo_set_dynamic_switch+0x80/0x80
Aug 16 19:50:24 krejzi kernel:  [<ffffffff81518688>] ? __rpm_callback+0x28/0x60
Aug 16 19:50:24 krejzi kernel:  [<ffffffff815186da>] ? rpm_callback+0x1a/0x70
Aug 16 19:50:24 krejzi kernel:  [<ffffffff81509760>] ?
vga_switcheroo_set_dynamic_switch+0x80/0x80
Aug 16 19:50:24 krejzi kernel:  [<ffffffff81519733>] ? rpm_resume+0x3e3/0x5f0
Aug 16 19:50:24 krejzi kernel:  [<ffffffff81061427>] ? __switch_to+0x37/0x580
Aug 16 19:50:24 krejzi kernel:  [<ffffffff818c4cbe>] ?
_raw_spin_unlock_irq+0xe/0x20
Aug 16 19:50:24 krejzi kernel:  [<ffffffff8151a06e>] ?
pm_runtime_work+0x4e/0xa0
Aug 16 19:50:24 krejzi kernel:  [<ffffffff810d51d2>] ?
process_one_work+0x1c2/0x400
Aug 16 19:50:24 krejzi kernel:  [<ffffffff810d5452>] ? worker_thread+0x42/0x4c0
Aug 16 19:50:24 krejzi kernel:  [<ffffffff818c071b>] ? __schedule+0x2db/0x6a0
Aug 16 19:50:24 krejzi kernel:  [<ffffffff810d5410>] ?
process_one_work+0x400/0x400
Aug 16 19:50:24 krejzi kernel:  [<ffffffff810da758>] ? kthread+0xc8/0xe0
Aug 16 19:50:24 krejzi kernel:  [<ffffffff818c533f>] ? ret_from_fork+0x1f/0x40
Aug 16 19:50:24 krejzi kernel:  [<ffffffff810da690>] ?
kthread_worker_fn+0x170/0x170
Aug 16 19:50:24 krejzi kernel: Code: 05 48 8b 8c 0b e0 65 00 00 48 85 c9 0f 84
b7 00 00 00 48 8b 49 08 48 05 2f 03 00 00 48 c1 e0 05 48 8b 44 03 08 8b 79 14
8b 49 10 <66> 89 32 c7 42 0c 00 00 00 00 66 89 4a 02 48 89 c1 48 c1 e9 20 
Aug 16 19:50:24 krejzi kernel: RIP  [<ffffffffa0084aad>]
iceland_smu_populate_single_firmware_entry+0x4d/0x100 [amdgpu]
Aug 16 19:50:24 krejzi kernel:  RSP <ffff88023627fc40>
Aug 16 19:50:24 krejzi kernel: CR2: ffffc91c00763fec
Aug 16 19:50:24 krejzi kernel: ---[ end trace 176c593915795723 ]---
Aug 16 19:50:24 krejzi kernel: general protection fault: 0000 [#2] PREEMPT SMP
Aug 16 19:50:24 krejzi kernel: Modules linked in: iwlmvm amdkfd amdgpu ttm
iwlwifi intel_vbtn
Aug 16 19:50:24 krejzi kernel: CPU: 3 PID: 129 Comm: kworker/3:2 Tainted: G    
 D         4.8.0-rc2-krejzi #1
Aug 16 19:50:24 krejzi kernel: Hardware name: HP HP ProBook 470 G3/8102, BIOS
N78 Ver. 01.11 05/09/2016
Aug 16 19:50:24 krejzi kernel: task: ffff8802365b9b00 task.stack:
ffff88023627c000
Aug 16 19:50:24 krejzi kernel: RIP: 0010:[<ffffffff810fe430>] 
[<ffffffff810fe430>] queued_spin_lock_slowpath+0x150/0x190
Aug 16 19:50:24 krejzi kernel: RSP: 0018:ffff88023627fe70  EFLAGS: 00010082
Aug 16 19:50:24 krejzi kernel: RAX: ccccccccccce3fbc RBX: ffff88023627ff18 RCX:
ffff8802404d72c0
Aug 16 19:50:24 krejzi kernel: RDX: 0000000000002d47 RSI: ffffffff81c4c914 RDI:
0000000000100000
Aug 16 19:50:24 krejzi kernel: RBP: 0000000000000000 R08: 00000000000000ff R09:
0000000000000000
Aug 16 19:50:24 krejzi kernel: R10: 00000000b98c6018 R11: ffff880237003200 R12:
ffff8802404d72c0
Aug 16 19:50:24 krejzi kernel: R13: 0000000000000046 R14: ffffc91c00763fec R15:
0000000000000009
Aug 16 19:50:24 krejzi kernel: FS:  0000000000000000(0000)
GS:ffff8802404c0000(0000) knlGS:0000000000000000
Aug 16 19:50:24 krejzi kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Aug 16 19:50:24 krejzi kernel: CR2: 0000000000000028 CR3: 00000002240c9000 CR4:
00000000003406e0
Aug 16 19:50:24 krejzi kernel: Stack:
Aug 16 19:50:24 krejzi kernel:  ffff88023627ff18 0000000000000282
0000000000000000 ffffffff818c4e61
Aug 16 19:50:24 krejzi kernel:  ffff88023627ff18 ffff88023627ff10
ffffffff810f7103 ffff8802365ba210
Aug 16 19:50:24 krejzi kernel:  ffff8802365b9b00 0000000000000000
ffffffff810bcf9c ffff8802365b9b00
Aug 16 19:50:24 krejzi kernel: Call Trace:
Aug 16 19:50:24 krejzi kernel:  [<ffffffff818c4e61>] ?
_raw_spin_lock_irqsave+0x31/0x40
Aug 16 19:50:24 krejzi kernel:  [<ffffffff810f7103>] ? complete+0x13/0x40
Aug 16 19:50:24 krejzi kernel:  [<ffffffff810bcf9c>] ? mm_release+0x9c/0x120
Aug 16 19:50:24 krejzi kernel:  [<ffffffff810c28ea>] ? do_exit+0x70a/0xad0
Aug 16 19:50:24 krejzi kernel:  [<ffffffff818c6a47>] ?
rewind_stack_do_exit+0x17/0x20
Aug 16 19:50:24 krejzi kernel:  [<ffffffff810da690>] ?
kthread_worker_fn+0x170/0x170
Aug 16 19:50:24 krejzi kernel: Code: b8 01 00 00 00 66 89 03 5b 5d 41 5c c3 c1
ea 12 83 e0 03 83 ea 01 48 c1 e0 04 48 63 d2 48 05 c0 72 01 00 48 03 04 d5 60
03 ea 81 <48> 89 08 8b 41 08 85 c0 75 09 f3 90 8b 41 08 85 c0 74 f7 4c 8b 
Aug 16 19:50:24 krejzi kernel: RIP  [<ffffffff810fe430>]
queued_spin_lock_slowpath+0x150/0x190
Aug 16 19:50:24 krejzi kernel:  RSP <ffff88023627fe70>
Aug 16 19:50:24 krejzi kernel: ---[ end trace 176c593915795724 ]---
Aug 16 19:50:24 krejzi kernel: Fixing recursive fault but reboot is needed!
Aug 16 19:50:24 krejzi kernel: BUG: scheduling while atomic:
kworker/3:2/129/0x00000003
Aug 16 19:50:24 krejzi kernel: Modules linked in: iwlmvm amdkfd amdgpu ttm
iwlwifi intel_vbtn
Aug 16 19:50:24 krejzi kernel: Preemption disabled at:[<          (null)>]     
     (null)
Aug 16 19:50:24 krejzi kernel: 
Aug 16 19:50:24 krejzi kernel: CPU: 3 PID: 129 Comm: kworker/3:2 Tainted: G    
 D         4.8.0-rc2-krejzi #1
Aug 16 19:50:24 krejzi kernel: Hardware name: HP HP ProBook 470 G3/8102, BIOS
N78 Ver. 01.11 05/09/2016
Aug 16 19:50:24 krejzi kernel:  0000000000000086 00000000b523baa8
ffffffff8132745c ffff8802404d67c0
Aug 16 19:50:24 krejzi kernel:  00000000000167c0 ffffffff810de8a7
ffffffff818c09bf ffff8802365b9b00
Aug 16 19:50:24 krejzi kernel:  00000000b523baa8 ffff880236280000
000000000000000b 0000000000000000
Aug 16 19:50:24 krejzi kernel: Call Trace:
Aug 16 19:50:24 krejzi kernel:  [<ffffffff8132745c>] ? dump_stack+0x46/0x5a
Aug 16 19:50:24 krejzi kernel:  [<ffffffff810de8a7>] ? __schedule_bug+0x57/0xb0
Aug 16 19:50:24 krejzi kernel:  [<ffffffff818c09bf>] ? __schedule+0x57f/0x6a0
Aug 16 19:50:24 krejzi kernel:  [<ffffffff818c0b16>] ? schedule+0x36/0x90
Aug 16 19:50:24 krejzi kernel:  [<ffffffff810c2a96>] ? do_exit+0x8b6/0xad0
Aug 16 19:50:24 krejzi kernel:  [<ffffffff818c6a47>] ?
rewind_stack_do_exit+0x17/0x20
Aug 16 19:50:24 krejzi kernel:  [<ffffffff810da690>] ?
kthread_worker_fn+0x170/0x170
Aug 16 19:50:24 krejzi kernel: BUG: unable to handle kernel paging request at
ffffffffffffffd8
Aug 16 19:50:24 krejzi kernel: IP: [<ffffffff810dab57>] kthread_data+0x7/0x10



Relevant dmesg output from amdgpu load time (also obtained from the journal):

Aug 16 19:49:49 krejzi kernel: [drm] amdgpu kernel modesetting enabled.
Aug 16 19:49:49 krejzi kernel: vga_switcheroo: detected switching method
\_SB_.PCI0.GFX0.ATPX handle
Aug 16 19:49:49 krejzi kernel: ATPX version 1, functions 0x00000003
Aug 16 19:49:49 krejzi kernel: ATPX Hybrid Graphics
Aug 16 19:49:49 krejzi kernel: iwlwifi 0000:03:00.0: loaded firmware version
22.361476.0 op_mode iwlmvm
Aug 16 19:49:49 krejzi kernel: CRAT table not found
Aug 16 19:49:49 krejzi kernel: Finished initializing topology ret=0
Aug 16 19:49:49 krejzi kernel: kfd kfd: Initialized module
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: enabling device (0006 ->
0007)
Aug 16 19:49:49 krejzi kernel: [drm] initializing kernel modesetting (TOPAZ
0x1002:0x6900 0x103C:0x811C 0x83).
Aug 16 19:49:49 krejzi kernel: [drm] register mmio base: 0xE2000000
Aug 16 19:49:49 krejzi kernel: [drm] register mmio size: 262144
Aug 16 19:49:49 krejzi kernel: [drm] doorbell mmio base: 0xE0000000
Aug 16 19:49:49 krejzi kernel: [drm] doorbell mmio size: 2097152
Aug 16 19:49:49 krejzi kernel: [drm] probing gen 2 caps for device 8086:9d10 =
1724843/e
Aug 16 19:49:49 krejzi kernel: [drm] probing mlw for device 8086:9d10 = 1724843
Aug 16 19:49:49 krejzi kernel: vga_switcheroo: enabled
Aug 16 19:49:49 krejzi kernel: ATOM BIOS: HP/Quanta
Aug 16 19:49:49 krejzi kernel: [drm] GPU not posted. posting now...
Aug 16 19:49:49 krejzi kernel: [drm] Changing default dispclk from 0Mhz to
600Mhz
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: VRAM: 2048M
0x0000000000000000 - 0x000000007FFFFFFF (2048M used)
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: GTT: 2048M
0x0000000080000000 - 0x00000000FFFFFFFF
Aug 16 19:49:49 krejzi kernel: [drm] Detected VRAM RAM=2048M, BAR=256M
Aug 16 19:49:49 krejzi kernel: [drm] RAM width 64bits DDR3
Aug 16 19:49:49 krejzi kernel: [TTM] Zone  kernel: Available graphics memory:
4027936 kiB
Aug 16 19:49:49 krejzi kernel: [TTM] Zone   dma32: Available graphics memory:
2097152 kiB
Aug 16 19:49:49 krejzi kernel: [TTM] Initializing pool allocator
Aug 16 19:49:49 krejzi kernel: [TTM] Initializing DMA pool allocator
Aug 16 19:49:49 krejzi kernel: [drm] amdgpu: 2048M of VRAM memory ready
Aug 16 19:49:49 krejzi kernel: [drm] amdgpu: 2048M of GTT memory ready.
Aug 16 19:49:49 krejzi kernel: [drm] GART: num cpu pages 524288, num gpu pages
524288
Aug 16 19:49:49 krejzi kernel: [drm] PCIE GART of 2048M enabled (table at
0x0000000000040000).
Aug 16 19:49:49 krejzi kernel: [drm] Supports vblank timestamp caching Rev 2
(21.10.2013).
Aug 16 19:49:49 krejzi kernel: [drm] Driver supports precise vblank timestamp
query.
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: amdgpu: using MSI.
Aug 16 19:49:49 krejzi kernel: [drm] amdgpu: irq initialized.
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 0 use
gpu addr 0x0000000080000010, cpu addr 0xffff880231523010
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 1 use
gpu addr 0x0000000080000020, cpu addr 0xffff880231523020
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 2 use
gpu addr 0x0000000080000030, cpu addr 0xffff880231523030
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 3 use
gpu addr 0x0000000080000040, cpu addr 0xffff880231523040
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 4 use
gpu addr 0x0000000080000050, cpu addr 0xffff880231523050
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 5 use
gpu addr 0x0000000080000060, cpu addr 0xffff880231523060
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 6 use
gpu addr 0x0000000080000070, cpu addr 0xffff880231523070
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 7 use
gpu addr 0x0000000080000080, cpu addr 0xffff880231523080
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 8 use
gpu addr 0x0000000080000090, cpu addr 0xffff880231523090
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 9 use
gpu addr 0x00000000800000a0, cpu addr 0xffff8802315230a0
Aug 16 19:49:49 krejzi kernel: amdgpu 0000:01:00.0: fence driver on ring 10 use
gpu addr 0x00000000800000b0, cpu addr 0xffff8802315230b0
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 0 succeeded in 15 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 1 succeeded in 19 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 2 succeeded in 15 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 3 succeeded in 5 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 4 succeeded in 2 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 5 succeeded in 2 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 6 succeeded in 2 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 7 succeeded in 3 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 8 succeeded in 2 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 9 succeeded in 6 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ring test on 10 succeeded in 4 usecs
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 0 succeeded
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 1 succeeded
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 2 succeeded
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 3 succeeded
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 4 succeeded
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 5 succeeded
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 6 succeeded
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 7 succeeded
Aug 16 19:49:49 krejzi kernel: [drm] ib test on ring 8 succeeded
Aug 16 19:49:49 krejzi kernel: [drm:sdma_v2_4_ring_test_ib [amdgpu]] *ERROR*
amdgpu: fence wait failed (1000).
Aug 16 19:49:49 krejzi kernel: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR*
amdgpu: failed testing IB on ring 9 (1000).
Aug 16 19:49:49 krejzi kernel: [drm:sdma_v2_4_ring_test_ib [amdgpu]] *ERROR*
amdgpu: fence wait failed (1000).
Aug 16 19:49:49 krejzi kernel: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR*
amdgpu: failed testing IB on ring 10 (1000).
Aug 16 19:49:49 krejzi kernel: [drm] Initialized amdgpu 3.3.0 20150101 for
0000:01:00.0 on minor 1


The GPU is a Radeon R7 M340 (Topaz/Iceland) in a hybrid graphics configuration.
