I have similar messages in journalctl:

Package: linux-firmware
Version: 1.197.3

Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:0 vmid:1 pasid:32778, for process vivaldi-bin pid 1673 thread vivaldi-bi:cs0 pid 1699)
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800101140000 from client 0x12 (VMC)
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00105631
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VCN0 (0x2b)
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:0 vmid:1 pasid:32778, for process vivaldi-bin pid 1673 thread vivaldi-bi:cs0 pid 1699)
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800101188000 from client 0x12 (VMC)
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00105631
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VCN0 (0x2b)
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:0 vmid:1 pasid:32778, for process vivaldi-bin pid 1673 thread vivaldi-bi:cs0 pid 1699)
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x0000800101189000 from client 0x12 (VMC)
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00105631
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          Faulty UTCL2 client ID: VCN0 (0x2b)
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          MORE_FAULTS: 0x1
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          WALKER_ERROR: 0x0
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          MAPPING_ERROR: 0x0
Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:          RW: 0x0
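
For anyone comparing reports on this bug, here is a rough sketch of a script that pulls the hub, process, and faulting address out of fault messages like the ones above. It assumes each kernel message sits on a single line (as journalctl or dmesg emits it, not as re-wrapped by email). The function name `summarize_faults`, the regular expressions, and the `SAMPLE_LINES` list are my own illustration, not part of any amdgpu tooling.

```python
import re

# First line of a fault report, e.g.
# "amdgpu: [mmhub] page fault (src_id:0 ring:0 vmid:1 pasid:32778, for process ...".
# The optional "retry " covers the "[gfxhub0] retry page fault" variant.
FAULT_RE = re.compile(
    r"amdgpu: \[(?P<hub>\w+)\] (?:retry )?page fault "
    r"\(src_id:\d+ ring:\d+ vmid:\d+ pasid:\d+, "
    r"for process (?P<process>\S+) pid (?P<pid>\d+)"
)
# Follow-up line carrying the faulting page address.
ADDR_RE = re.compile(r"in page starting at address (?P<addr>0x[0-9a-fA-F]+)")


def summarize_faults(lines):
    """Return one dict per fault: hub, process, pid, address (int or None)."""
    faults = []
    for line in lines:
        m = FAULT_RE.search(line)
        if m:
            faults.append({
                "hub": m["hub"],
                "process": m["process"],
                "pid": int(m["pid"]),
                "address": None,
            })
            continue
        m = ADDR_RE.search(line)
        # Attach the address to the most recent fault that still lacks one.
        if m and faults and faults[-1]["address"] is None:
            faults[-1]["address"] = int(m["addr"], 16)
    return faults


# Sample input copied from this bug's logs, rejoined to one message per line.
SAMPLE_LINES = [
    "Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault "
    "(src_id:0 ring:0 vmid:1 pasid:32778, for process vivaldi-bin pid 1673 thread "
    "vivaldi-bi:cs0 pid 1699)",
    "Aug 29 16:58:44 dagon kernel: amdgpu 0000:03:00.0: amdgpu:   in page starting "
    "at address 0x0000800101140000 from client 0x12 (VMC)",
    "[20061.061069] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault "
    "(src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 1141 thread Xorg:cs0 "
    "pid 1236)",
    "[20061.061103] amdgpu 0000:03:00.0: amdgpu:   in page starting at address "
    "0x800000401000 from client 27",
]

if __name__ == "__main__":
    for fault in summarize_faults(SAMPLE_LINES):
        addr = f"{fault['address']:#x}" if fault["address"] is not None else "?"
        print(fault["hub"], fault["process"], fault["pid"], addr)
```

Grouping the result by hub and process makes it easier to see whether a given report matches the original gfxhub0/Xorg fault or, as in my case, an mmhub fault from the VCN0 client.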

-- 
You received this bug notification because you are a member of Desktop
Packages, which is subscribed to mesa in Ubuntu.
https://bugs.launchpad.net/bugs/1928393

Title:
  linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0]
  retry page fault"

Status in amd:
  New
Status in linux-firmware package in Ubuntu:
  Incomplete
Status in mesa package in Ubuntu:
  Confirmed

Bug description:
  After upgrading linux-firmware from 1.190.5 to 1.197 (as part of the
  upgrade from Ubuntu 20.10 to 21.04), I started experiencing frequent
  and severe GPU instability. When this happens, I see this error in
  dmesg:

  [20061.061069] amdgpu 0000:03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 1141 thread Xorg:cs0 pid 1236)
  [20061.061103] amdgpu 0000:03:00.0: amdgpu:   in page starting at address 0x800000401000 from client 27
  [20061.061135] amdgpu 0000:03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
  [20061.061147] amdgpu 0000:03:00.0: amdgpu:      Faulty UTCL2 client ID: TCP (0x8)
  [20061.061157] amdgpu 0000:03:00.0: amdgpu:      MORE_FAULTS: 0x1
  [20061.061167] amdgpu 0000:03:00.0: amdgpu:      WALKER_ERROR: 0x0
  [20061.061174] amdgpu 0000:03:00.0: amdgpu:      PERMISSION_FAULTS: 0x3
  [20061.061183] amdgpu 0000:03:00.0: amdgpu:      MAPPING_ERROR: 0x0
  [20061.061189] amdgpu 0000:03:00.0: amdgpu:      RW: 0x0

  I'll attach a couple of full dmesgs that I collected.

  Often when this happens, the screen and keyboard freeze
  irreversibly (I tried waiting for more than 30 minutes, but it didn't
  help). I can still log in via ssh though. When there's no freeze, I
  can continue using the computer normally, but the laptop fans are
  always running and the battery depletes fast. There's probably
  something stuck in a permanent loop either in the kernel or in the
  GPU.

  This bug happens several times a day, rendering the machine so
  unstable as to be almost unusable. It is a severe regression and I'm
  aghast that it passed AMD's Quality Assurance.

  After downgrading back to linux-firmware 1.190.5, the machine is back
  to the previous, mostly-reliable state. That is, this bug is gone and
  I'm left only with the other amdgpu suspend bug I've learned to live
  with since I bought this computer.

  Please revert the amdgpu firmware in this package as soon as possible.
  This is unbearable.

  Relevant information:
  Ubuntu version: 21.04
  Linux kernel: 5.11.0-17-generic x86_64
  CPU model: AMD Ryzen 7 3700U with Radeon Vega Mobile Gfx
  GPU: 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Picasso (rev c1)
  Laptop model: Lenovo Ideapad S145

To manage notifications about this bug go to:
https://bugs.launchpad.net/amd/+bug/1928393/+subscriptions


-- 
Mailing list: https://launchpad.net/~desktop-packages
Post to     : desktop-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~desktop-packages
More help   : https://help.launchpad.net/ListHelp