[Bug 109403] amdgpu randomly hangs while streaming or when CPU is busy on X399 with TR 1950X

2019-11-19 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109403

Martin Peres  changed:

      What |Removed |Added
-----------+--------+---------
    Status |NEW     |RESOLVED
Resolution |---     |MOVED

--- Comment #4 from Martin Peres  ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been
closed from further activity.

You can subscribe and participate further through the new bug through this link
to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/682.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[Bug 109403] amdgpu randomly hangs while streaming or when CPU is busy on X399 with TR 1950X

2019-01-31 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109403

--- Comment #3 from Andrey Grodzovsky  ---
Hey, can you check whether adding amdgpu.vm_debug=1 makes the VMC page faults
disappear?

Regarding the hang you see during GPU reset: please provide dmesg for it,
captured with the kernel command-line parameter drm.debug=0xff. You should
probably also open a separate ticket for that specific issue.
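
The check requested above can be scripted so that runs with and without the parameter are compared the same way. The following is a minimal sketch, not part of the original comment: the helper names cmdline_params and count_vmc_faults are hypothetical, the sample command line is made up for illustration, and the sample log lines are shortened stand-ins modelled on the messages in this report. On a real system the inputs would come from /proc/cmdline and the saved dmesg output.

```python
def cmdline_params(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}.

    Bare flags such as 'quiet' map to an empty string. Quoted values that
    contain spaces are not handled; this is a simplification.
    """
    params = {}
    for token in cmdline.split():
        name, _, value = token.partition("=")
        params[name] = value
    return params


def count_vmc_faults(dmesg_text: str) -> int:
    """Count log lines that report a VMC page fault."""
    return sum(1 for line in dmesg_text.splitlines() if "VMC page fault" in line)


# Hypothetical inputs for illustration only.
sample_cmdline = "quiet splash amdgpu.vm_debug=1 drm.debug=0xff"
sample_dmesg = """\
amdgpu: [gfxhub] VMC page fault (src_id:0 ring:158 vmid:6 pasid:32771)
amdgpu:   in page starting at address 0x800010607000 from 27
amdgpu: [gfxhub] VMC page fault (src_id:0 ring:158 vmid:6 pasid:32771)
"""

params = cmdline_params(sample_cmdline)
print("vm_debug set:", params.get("amdgpu.vm_debug") == "1")
print("drm.debug value:", params.get("drm.debug"))
print("VMC page faults:", count_vmc_faults(sample_dmesg))
```

Confirming first that the parameters actually appear on the running kernel's command line avoids comparing logs from a boot where the option was silently not applied.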



[Bug 109403] amdgpu randomly hangs while streaming or when CPU is busy on X399 with TR 1950X

2019-01-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109403

--- Comment #2 from Chris  ---
I wonder if this is related to your motherboard. I have an ASUS Zenith with a
1950X, 128GB RAM and a Vega 64 LC that has been on kernels 4.20 through
5.0-rc4, the latter of which I'm currently on. I have no kernel parameters in
my grub file, only the default 'splash quiet'. I have never run into hangs
while gaming, using YouTube or OBS, or compiling the Linux kernel. Just thought
I would share my similar configuration. I can only suggest trying to update or
downgrade your BIOS.



[Bug 109403] amdgpu randomly hangs while streaming or when CPU is busy on X399 with TR 1950X

2019-01-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109403

Ivan Avdeev <1...@provod.gl> changed:

What |Removed |Added
-----+--------+---------------
  CC |        |1...@provod.gl

--- Comment #1 from Ivan Avdeev <1...@provod.gl> ---
Created attachment 143176
  --> https://bugs.freedesktop.org/attachment.cgi?id=143176&action=edit
dmesg-5.0-rc2



[Bug 109403] amdgpu randomly hangs while streaming or when CPU is busy on X399 with TR 1950X

2019-01-21 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=109403

    Bug ID: 109403
   Summary: amdgpu randomly hangs while streaming or when CPU is
            busy on X399 with TR 1950X
   Product: DRI
   Version: unspecified
  Hardware: x86-64 (AMD64)
        OS: Linux (All)
    Status: NEW
  Severity: normal
  Priority: medium
 Component: DRM/AMDgpu
  Assignee: dri-devel@lists.freedesktop.org
  Reporter: 1...@provod.gl

I've been experiencing random GPU hangs since I upgraded to Threadripper about
a year ago.

Specs:
- Motherboard: ASUS Prime X399-A, all BIOS versions from stock through the
  current 0808
- CPU: Threadripper 1950X, 32 threads
- GPU: MSI Radeon RX Vega 64 Air Boost 8G OC (the hangs also happened with an
  ASUS R9 Fury X in the same machine; that GPU was generally stable in the
  previous box)
- Displays:
   - 2x DELL U2412M 1920x1200x60 (DP)
   - 1x ASUS MG279Q 2560x1440x144 (DP)
- Kernel versions: 4.20, 5.0-rc2 (has been happening since at least 4.14;
  earlier versions weren't tried)
- linux-firmware: 20181218
- Mesa: 18.3.1
- X: 1.20.3
- libdrm: 2.4.96
- Possibly relevant kernel options: amd_iommu=on
  vfio-pci.ids=10de:1005,10de:0e1a,1912:0014,1106:3483 iommu=pt
  vfio-pci.disable_vga=1 hpet=disable nohpet amdgpu.ppfeaturemask=0xfffd7fff
  amdgpu.gpu_recovery=1 pcie_aspm=off
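
For readers comparing configurations, it can help to see which bits the amdgpu.ppfeaturemask value above turns off relative to an all-ones 32-bit mask. The sketch below is only arithmetic on the value from the report; the helper name cleared_bits is hypothetical, and the mapping from bit positions to power features is deliberately not shown because it is defined in the kernel's amdgpu headers and may differ between kernel versions.

```python
def cleared_bits(mask: int, width: int = 32) -> list:
    """Return the positions (0 = least significant) of zero bits in mask."""
    return [bit for bit in range(width) if not (mask >> bit) & 1]


mask = 0xfffd7fff  # value of amdgpu.ppfeaturemask from the report

print(cleared_bits(mask))        # [15, 17]
print(hex(0xffffffff & ~mask))   # 0x28000, the same two bits as a mask
```

So this setting leaves every feature bit enabled except bits 15 and 17.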

The problem usually manifests like this:
1. The screen suddenly freezes (sometimes it is possible to move the mouse
cursor for a few seconds, but it eventually freezes too).
2. The GPU fan speeds up and remains high.
3. Every process that talks to the GPU freezes and becomes impossible to kill.
4. I can still SSH into the machine, and everything besides the GPU works fine.
5. dmesg contains a message like this:
[Jan21 00:03] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx timeout, signaled seq=17188686, emitted seq=17188689
[  +0.32] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process X pid 9315 thread X:cs0 pid 9335
or with a bit more stuff happening before:
[Jan18 19:43] amdgpu :44:00.0: [gfxhub] VMC page fault (src_id:0 ring:158 vmid:6 pasid:32771, for process superposition pid 11225 thread superposit:cs0 pid 11308)
[  +0.03] amdgpu :44:00.0:   in page starting at address 0x800010607000 from 27
[  +0.02] amdgpu :44:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x0060153D
[  +0.05] amdgpu :44:00.0: [gfxhub] VMC page fault (src_id:0 ring:158 vmid:6 pasid:32771, for process superposition pid 11225 thread superposit:cs0 pid 11308)
[  +0.02] amdgpu :44:00.0:   in page starting at address 0x800010609000 from 27
[  +0.01] amdgpu :44:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x
[  +0.04] amdgpu :44:00.0: [gfxhub] VMC page fault (src_id:0 ring:158 vmid:6 pasid:32771, for process superposition pid 11225 thread superposit:cs0 pid 11308)
[  +0.01] amdgpu :44:00.0:   in page starting at address 0x800010607000 from 27
[  +0.02] amdgpu :44:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x
[  +0.04] amdgpu :44:00.0: [gfxhub] VMC page fault (src_id:0 ring:158 vmid:6 pasid:32771, for process superposition pid 11225 thread superposit:cs0 pid 11308)
[  +0.01] amdgpu :44:00.0:   in page starting at address 0x800010609000 from 27
[  +0.01] amdgpu :44:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x
[  +0.04] amdgpu :44:00.0: [gfxhub] VMC page fault (src_id:0 ring:158 vmid:6 pasid:32771, for process superposition pid 11225 thread superposit:cs0 pid 11308)
[  +0.02] amdgpu :44:00.0:   in page starting at address 0x800010607000 from 27
[  +0.01] amdgpu :44:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x
[  +0.04] amdgpu :44:00.0: [gfxhub] VMC page fault (src_id:0 ring:158 vmid:6 pasid:32771, for process superposition pid 11225 thread superposit:cs0 pid 11308)
[  +0.01] amdgpu :44:00.0:   in page starting at address 0x800010609000 from 27
[  +0.01] amdgpu :44:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x
[  +0.04] amdgpu :44:00.0: [gfxhub] VMC page fault (src_id:0 ring:158 vmid:6 pasid:32771, for process superposition pid 11225 thread superposit:cs0 pid 11308)
[  +0.01] amdgpu :44:00.0:   in page starting at address 0x800010607000 from 27
[  +0.01] amdgpu :44:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x
[  +0.04] amdgpu :44:00.0: [gfxhub] VMC page fault (src_id:0 ring:158 vmid:6 pasid:32771, for process superposition pid 11225 thread superposit:cs0 pid 11308)
[