[Kernel-packages] [Bug 2039926] Re: Error UBSAN: array-index-out-of-bounds amdgpu

2023-10-27 Thread Alex Deucher
This updates the rest of them:
https://patchwork.freedesktop.org/patch/564786/

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2039926

Title:
  Error UBSAN: array-index-out-of-bounds amdgpu

Status in Linux:
  New
Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Error in boot:

  [8.597520] UBSAN: array-index-out-of-bounds in /build/linux-D15vQj/linux-6.5.0/drivers/gpu/drm/amd/amdgpu/../pm/powerplay/hwmgr/smu7_hwmgr.c:3676:4
  [8.597527] index 7 is out of range for type 'ATOM_Polaris_SCLK_Dependency_Record [1]'

  ProblemType: Bug
  DistroRelease: Ubuntu 23.10
  Package: linux-image-generic 6.5.0.9.11
  ProcVersionSignature: Ubuntu 6.5.0-9.9-generic 6.5.3
  Uname: Linux 6.5.0-9-generic x86_64
  ApportVersion: 2.27.0-0ubuntu5
  Architecture: amd64
  CasperMD5CheckResult: pass
  CurrentDesktop: ubuntu:GNOME
  Date: Fri Oct 20 09:28:16 2023
  InstallationDate: Installed on 2022-10-12 (373 days ago)
  InstallationMedia: Ubuntu 22.04.1 LTS "Jammy Jellyfish" - Release amd64 (20220809.1)
  MachineType: {report['dmi.sys.vendor']} {report['dmi.product.name']}
  ProcFB: 0 amdgpudrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-6.5.0-9-generic root=UUID=9edc5478-c6c2-4cf3-9de8-01ccb697fb9e ro quiet splash audit=0 mitigations=off amdgpu.ppfeaturemask=0x vt.global_cursor_default=0 loglevel=2 rd.systemd.show_status=false rd.udev.log-prority=3 sysrq_always_enabled=1 audit=0 vt.handoff=7
  RelatedPackageVersions:
   linux-restricted-modules-6.5.0-9-generic N/A
   linux-backports-modules-6.5.0-9-generic  N/A
   linux-firmware   20230919.git3672ccab-0ubuntu2.1
  SourcePackage: linux
  UpgradeStatus: Upgraded to mantic on 2023-10-13 (6 days ago)
  dmi.bios.date: 07/11/2014
  dmi.bios.release: 4.6
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: D3EMW08.110
  dmi.board.asset.tag: To be filled by O.E.M.
  dmi.board.name: D3F3-EM
  dmi.board.vendor: MEDION
  dmi.board.version: 1.0
  dmi.chassis.type: 3
  dmi.chassis.vendor: MEDION
  dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrD3EMW08.110:bd07/11/2014:br4.6:svnMEDION:pnD3F3-EM:pvr1.0:rvnMEDION:rnD3F3-EM:rvr1.0:cvnMEDION:ct3:cvr:skuTobefilledbyO.E.M.:
  dmi.product.family: To be filled by O.E.M.
  dmi.product.name: D3F3-EM
  dmi.product.sku: To be filled by O.E.M.
  dmi.product.version: 1.0
  dmi.sys.vendor: MEDION

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/2039926/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 2039926] Re: Error UBSAN: array-index-out-of-bounds amdgpu

2023-10-27 Thread Alex Deucher
FWIW, none of these are actually out-of-bounds accesses.  These just
happen to use the old nomenclature for variable-sized arrays.
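
To illustrate the nomenclature point, here is a minimal C sketch. The
struct and function names (sclk_record, table_old, table_new,
table_new_alloc, table_new_last_sclk) are hypothetical and are not the
driver's real definitions. A trailing array declared with [1] was a common
pre-C99 way to mark a variable-length tail; a bounds checker that takes the
declared size literally will flag index 7 even when the allocation is large
enough. The C99 flexible array member ([]) describes the same layout
without a fixed bound.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record type standing in for a real table entry. */
struct sclk_record {
    unsigned int sclk;
    unsigned short vddc;
};

/* Old nomenclature: a one-element array used as a variable-sized tail.
 * Shown only for contrast; indexing past [0] trips strict bounds checks. */
struct table_old {
    unsigned char num_entries;
    struct sclk_record entries[1];
};

/* C99 flexible array member: same intent, no declared bound. */
struct table_new {
    unsigned char num_entries;
    struct sclk_record entries[];
};

/* Allocate a table with room for n entries and fill in sample values. */
struct table_new *table_new_alloc(unsigned char n)
{
    struct table_new *t = malloc(sizeof(*t) + (size_t)n * sizeof(t->entries[0]));
    if (!t)
        return NULL;
    t->num_entries = n;
    for (unsigned char i = 0; i < n; i++) {
        t->entries[i].sclk = 100u * (i + 1u);
        t->entries[i].vddc = 0;
    }
    return t;
}

/* Read the last entry; valid whenever num_entries is at least 1. */
unsigned int table_new_last_sclk(const struct table_new *t)
{
    return t->entries[t->num_entries - 1].sclk;
}
```

With table_new_alloc(8), reading entries[7] stays inside both the
allocation and the declared type, so there is nothing for the checker to
report.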


[Kernel-packages] [Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-10-27 Thread Alex Deucher
The reverts are in the latest firmware tree:
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/amdgpu?id=d7b50e61669dc137924337d03d09b8986eb752a3
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/amdgpu?id=d843e520a4b0d92b986645548d11ade3b9b239a4
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/amdgpu?id=99d72504bff7ab40c261b8509c0b9d8abf98b296

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-firmware in Ubuntu.
https://bugs.launchpad.net/bugs/1928393

Title:
  linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0]
  retry page fault"

Status in amd:
  New
Status in linux-firmware package in Ubuntu:
  Confirmed
Status in mesa package in Ubuntu:
  Confirmed

Bug description:
  After upgrading linux-firmware from 1.190.5 to 1.197 (as part of the
  upgrade from Ubuntu 20.10 to 21.04), I started experiencing frequent
  and severe GPU instability. When this happens, I see this error in
  dmesg:

  [20061.061069] amdgpu :03:00.0: amdgpu: [gfxhub0] retry page fault (src_id:0 ring:0 vmid:1 pasid:32769, for process Xorg pid 1141 thread Xorg:cs0 pid 1236)
  [20061.061103] amdgpu :03:00.0: amdgpu:   in page starting at address 0x80401000 from client 27
  [20061.061135] amdgpu :03:00.0: amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00101031
  [20061.061147] amdgpu :03:00.0: amdgpu:  Faulty UTCL2 client ID: TCP (0x8)
  [20061.061157] amdgpu :03:00.0: amdgpu:  MORE_FAULTS: 0x1
  [20061.061167] amdgpu :03:00.0: amdgpu:  WALKER_ERROR: 0x0
  [20061.061174] amdgpu :03:00.0: amdgpu:  PERMISSION_FAULTS: 0x3
  [20061.061183] amdgpu :03:00.0: amdgpu:  MAPPING_ERROR: 0x0
  [20061.061189] amdgpu :03:00.0: amdgpu:  RW: 0x0
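
  The decoded fields in the log above can be recovered from the raw
  status word. The following C sketch is only an illustration: the bit
  positions are an assumption (chosen because they reproduce the values
  printed in this log), and the names l2_fault_status, field and
  decode_l2_fault_status are hypothetical, not the driver's.

```c
#include <stdint.h>

/* Decoded view of the status word; field names mirror the log output. */
struct l2_fault_status {
    unsigned more_faults;
    unsigned walker_error;
    unsigned permission_faults;
    unsigned mapping_error;
    unsigned client_id;
    unsigned rw;
    unsigned vmid;
};

/* Extract `width` bits starting at bit `shift`. */
static unsigned field(uint32_t v, unsigned shift, unsigned width)
{
    return (v >> shift) & ((1u << width) - 1u);
}

/* Bit layout is assumed, not taken from the register specification. */
struct l2_fault_status decode_l2_fault_status(uint32_t status)
{
    struct l2_fault_status s;
    s.more_faults       = field(status, 0, 1);
    s.walker_error      = field(status, 1, 3);
    s.permission_faults = field(status, 4, 4);
    s.mapping_error     = field(status, 8, 1);
    s.client_id         = field(status, 9, 9);
    s.rw                = field(status, 18, 1);
    s.vmid              = field(status, 20, 4);
    return s;
}
```

  Applied to 0x00101031, this layout gives MORE_FAULTS 0x1, WALKER_ERROR
  0x0, PERMISSION_FAULTS 0x3, MAPPING_ERROR 0x0, client ID 0x8, RW 0x0
  and vmid 1, matching the lines above.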

  I'll attach a couple of full dmesgs that I collected.

  Often when this happens, the screen and keyboard freeze irreversibly
  (I tried waiting for more than 30 minutes, but it doesn't help). I can
  still log in via ssh though. When there's no freeze, I can continue
  using the computer normally, but the laptop fans are always running
  and the battery depletes fast. There's probably something stuck in a
  permanent loop either in the kernel or in the GPU.

  This bug happens several times a day, rendering the machine so
  unstable as to be almost unusable. It is a severe regression and I'm
  aghast that it passed AMD's Quality Assurance.

  After downgrading back to linux-firmware 1.190.5, the machine is back
  to the previous, mostly-reliable state. Which is to say, this bug is
  gone and I'm just left with the other amdgpu suspend bug I've learned
  to live with since I bought this computer.

  Please revert the amdgpu firmware in this package as soon as possible.
  This is unbearable.

  Relevant information:
  Ubuntu version: 21.04
  Linux kernel: 5.11.0-17-generic x86_64
  CPU model: AMD Ryzen 7 3700U with Radeon Vega Mobile Gfx
  GPU: 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Picasso (rev c1)
  Laptop model: Lenovo Ideapad S145

To manage notifications about this bug go to:
https://bugs.launchpad.net/amd/+bug/1928393/+subscriptions


[Kernel-packages] [Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-07-12 Thread Alex Deucher
Does the latest firmware in the firmware git tree help?
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/log/amdgpu


Title:
  linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0]
  retry page fault"

Status in amd:
  New
Status in linux-firmware package in Ubuntu:
  Incomplete
Status in mesa package in Ubuntu:
  Confirmed


[Kernel-packages] [Bug 1928393] Re: linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0] retry page fault"

2021-06-08 Thread Alex Deucher
Can you narrow down which specific firmware file causes the problem?  We
haven't been able to repro this.

I think it may be related to a change in mesa, specifically mesa commit
820dec3f7c7.  For more info see
https://gitlab.freedesktop.org/mesa/mesa/-/issues/4866


** Bug watch added: gitlab.freedesktop.org/mesa/mesa/-/issues #4866
   https://gitlab.freedesktop.org/mesa/mesa/-/issues/4866

Title:
  linux-firmware 1.197 causes kernel to report error "amdgpu: [gfxhub0]
  retry page fault"

Status in amd:
  New
Status in linux-firmware package in Ubuntu:
  Incomplete


[Kernel-packages] [Bug 1203127] Re: AMD 12.04.4 LTS Sea Islands GPU Support

2013-08-13 Thread Alex Deucher
The list is missing a Bonaire PCI ID: 0x665d
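
As a rough sketch of what the complete Bonaire list would look like as a
lookup table: the array holds the Bonaire/Saturn IDs from the bug
description plus 0x665d. The names bonaire_ids and is_bonaire_id are
hypothetical and are not taken from the kernel source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bonaire/Saturn device IDs from the report, with 0x665d appended. */
static const uint16_t bonaire_ids[] = {
    0x6640, 0x6641, 0x6649, 0x6650,
    0x6651, 0x6658, 0x665c, 0x665d,
};

/* Return true if device_id appears in the table above. */
bool is_bonaire_id(uint16_t device_id)
{
    for (size_t i = 0; i < sizeof(bonaire_ids) / sizeof(bonaire_ids[0]); i++) {
        if (bonaire_ids[i] == device_id)
            return true;
    }
    return false;
}
```

A device whose ID is absent from such a table would simply not be matched,
which is why the missing entry matters for the backport.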

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-firmware in Ubuntu.
https://bugs.launchpad.net/bugs/1203127

Title:
  AMD 12.04.4 LTS Sea Islands GPU Support

Status in amd:
  New
Status in “linux” package in Ubuntu:
  Incomplete
Status in “linux-firmware” package in Ubuntu:
  Confirmed

Bug description:
  AMD feature request to backport the Sea Islands GPU generation driver
  from the 3.11 kernel to 12.04.x for Kyoto and Berlin APU support.

  Relevant PCI IDs for Sea Islands:

  Bonaire/Saturn:
  0x6640
  0x6641
  0x6649
  0x6650
  0x6651
  0x6658
  0x665c

  Kabini/Temash:
  0x9830
  0x9831
  0x9832
  0x9833
  0x9834
  0x9835
  0x9836
  0x9837
  0x9838
  0x9839
  0x983a
  0x983b
  0x983c
  0x983d
  0x983e
  0x983f

  Oland/Mars:
  0x6600
  0x6601
  0x6602
  0x6603
  0x6606
  0x6607
  0x6610
  0x6611
  0x6613
  0x6620
  0x6621
  0x6623
  0x6632

  Hainan/Sun:
  0x6660
  0x6663
  0x6664
  0x6665
  0x6667
  0x666f

To manage notifications about this bug go to:
https://bugs.launchpad.net/amd/+bug/1203127/+subscriptions
