[Kernel-packages] [Bug 2051636] Re: AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading

2024-03-11 Thread Launchpad Bug Tracker
This bug was fixed in the package linux-firmware -
20230919.git3672ccab-0ubuntu2.9

---
linux-firmware (20230919.git3672ccab-0ubuntu2.9) mantic; urgency=medium

  * Update firmware for MT7921 in order to fix Framework 13 AMD 7040 (LP: #2049220)
    - linux-firmware: update firmware for MT7922 WiFi device
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7922)

linux-firmware (20230919.git3672ccab-0ubuntu2.8) mantic; urgency=medium

  * DP connection swap to break eDP behavior on AMD 7735U (LP: #2049758)
    - SAUCE: Update DCN312 DMCUB firmware

linux-firmware (20230919.git3672ccab-0ubuntu2.7) mantic; urgency=medium

  * Miscellaneous Ubuntu changes
    - [Packaging] scripts: Fix shellcheck warnings
    - [Workflow] Add initial gitea workflow file
    - SAUCE: [Workflow] Disable markdownlint pre-commit hook
    - SAUCE: [Workflow] check_whence.py: Update list of known files
    - [Packaging] scripts/generate-changelog: Fix array initialization
    - [Packaging] control: Add XSBC-Original-Maintainer field
    - [Packaging] scripts/install-firmware: Fix installation of license files
  * AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading (LP: #2051636)
    - amdgpu: update PSP 13.0.4 firmware from 5.7 branch
    - amdgpu: update GC 11.0.1 firmware from 5.7 branch
    - amdgpu: update GC 11.0.4 firmware from 5.7 branch
    - amdgpu: update PSP 13.0.11 firmware from 5.7 branch
    - amdgpu: update GC 11.0.1 firmware
    - amdgpu: update PSP 13.0.4 firmware
    - amdgpu: update VCN 4.0.2 firmware
    - amdgpu: update GC 11.0.4 firmware
    - amdgpu: update PSP 13.0.11 firmware
  * Update firmware for MT7921 in order to fix Framework 13 AMD 7040 (LP: #2049220)
    - linux-firmware: update firmware for MT7921 WiFi device
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7921)
  * WCN6856 Wi-FI Unavailable and no function during suspend stress (LP: #2048977)
    - ath11k: WCN6855 hw2.0: update to WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.37

linux-firmware (20230919.git3672ccab-0ubuntu2.6) mantic; urgency=medium

  * occasional wifi firmware loading failures: wiwlwifi: BE200: Failed to start RT ucode: -110 (LP: #2048853)
    - iwlwifi: add new FWs from core83-55 release
    - iwlwifi: fix for the new FWs from core83-55 release
    - iwlwifi: update gl FW for core80-165 release
  * WCN6856 Wi-FI Unavailable and no function during suspend stress (LP: #2048977)
    - ath11k: WCN6855 hw2.0: update board-2.bin
    - ath11k: WCN6855 hw2.0: update to WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.36

 -- Juerg Haefliger  Wed, 21 Feb 2024 10:41:18 +0100
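
To see which upload in a changelog excerpt like the one above first carried the fix for a given bug, the entries can be scanned for their "LP: #NNNN" references. The following is only a rough sketch: the function name versions_fixing and the SAMPLE excerpt are illustrative and not part of any Ubuntu tooling, and the header pattern assumes the usual "package (version) series; urgency=..." first line.

```python
import re

# First line of a Debian changelog entry, e.g.
# "linux-firmware (20230919.git3672ccab-0ubuntu2.9) mantic; urgency=medium"
ENTRY_HEADER = re.compile(
    r"^(?P<pkg>[a-z0-9][a-z0-9+.-]*) \((?P<version>[^)]+)\) (?P<series>[\w-]+);",
    re.MULTILINE,
)

# "\s*" lets the bug number sit on the next line, as in wrapped mail bodies.
LP_REF = re.compile(r"LP:\s*#(\d+)")


def versions_fixing(changelog: str, bug: int) -> list[tuple[str, str]]:
    """Return (version, series) for every entry that references the bug."""
    headers = list(ENTRY_HEADER.finditer(changelog))
    hits = []
    for i, header in enumerate(headers):
        # An entry's body runs until the next header or the end of the text.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(changelog)
        body = changelog[header.end():end]
        if str(bug) in LP_REF.findall(body):
            hits.append((header.group("version"), header.group("series")))
    return hits


# Hypothetical excerpt, shortened from the changelog quoted in this message.
SAMPLE = """\
linux-firmware (20230919.git3672ccab-0ubuntu2.8) mantic; urgency=medium

  * DP connection swap to break eDP behavior on AMD 7735U (LP: #2049758)

linux-firmware (20230919.git3672ccab-0ubuntu2.7) mantic; urgency=medium

  * AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress
loading (LP: #2051636)
  * WCN6856 Wi-FI Unavailable and no function during suspend stress (LP:
#2048977)
"""

if __name__ == "__main__":
    for version, series in versions_fixing(SAMPLE, 2051636):
        print(f"{series}: {version}")
```

Later uploads are cumulative, so any version at or above the one reported here should contain the same firmware files.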

** Changed in: linux-firmware (Ubuntu Mantic)
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-firmware in Ubuntu.
https://bugs.launchpad.net/bugs/2051636

Title:
  AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress
  loading

Status in HWE Next:
  New
Status in linux-firmware package in Ubuntu:
  Fix Released
Status in linux-firmware source package in Jammy:
  Fix Released
Status in linux-firmware source package in Mantic:
  Fix Released
Status in linux-firmware source package in Noble:
  Fix Released

Bug description:
  [SRU Justification]

  [Impact]

  When running a stress tool such as 3DMark or GravityMark, amdgpu (PHX)
  hangs within a few minutes, sometimes sooner.

  [Fix]

  Upstream firmware fixes for Phoenix (GC 11.0.1)/Phoenix 2 (GC 11.0.4), and
  other prerequisites:
  * amdgpu/gc_11_0_1_* up to commit 56c0e7e ("amdgpu: update GC 11.0.1 firmware")
  * amdgpu/psp_13_0_4_ta.bin up to commit ed7ddfb ("amdgpu: update PSP 13.0.4 firmware")
  * amdgpu/vcn_4_0_2.bin up to commit 34ccb75 ("amdgpu: update VCN 4.0.2 firmware")
  * amdgpu/gc_11_0_4_* up to commit 680d98c ("amdgpu: update GC 11.0.4 firmware")
  * amdgpu/psp_13_0_11_ta.bin up to commit 72227fe ("amdgpu: update PSP 13.0.11 firmware")
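
As a quick sanity check on a test machine, the file patterns from the list above can be matched against the installed firmware directory. This is a minimal sketch under stated assumptions: /lib/firmware/amdgpu is assumed as the install location, the trailing "*" is there in case the files are shipped compressed, and find_fix_files is a made-up helper name. It only shows which files are present, not which firmware revision they contain; the package version remains the authoritative check.

```python
from pathlib import Path

# Glob patterns taken from the [Fix] list. The trailing "*" on the .bin names
# also covers a compressed suffix such as ".bin.zst" (an assumption about how
# the package ships the files).
FIX_PATTERNS = [
    "gc_11_0_1_*",
    "psp_13_0_4_ta.bin*",
    "vcn_4_0_2.bin*",
    "gc_11_0_4_*",
    "psp_13_0_11_ta.bin*",
]


def find_fix_files(firmware_dir: Path) -> dict[str, list[str]]:
    """Map each pattern to the sorted file names it matches in firmware_dir."""
    return {
        pattern: sorted(p.name for p in firmware_dir.glob(pattern))
        for pattern in FIX_PATTERNS
    }


if __name__ == "__main__":
    # Assumed location of the amdgpu firmware; adjust for the system at hand.
    for pattern, names in find_fix_files(Path("/lib/firmware/amdgpu")).items():
        print(f"{pattern}: {len(names)} file(s)")
```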

  [Test Case]

  Run a stress tool such as 3DMark or GravityMark.

  [Where problems could occur]

  This is a binary firmware update recommended by the chip vendor. No known
  issues so far.

  [Other Info]

  Phoenix is supported in linux-oem-6.5/jammy, so linux-firmware/jammy
  is also nominated for fix.

  == original bug report ==

  When running a stress tool such as 3DMark or GravityMark, amdgpu (PHX)
  hangs within a few minutes, sometimes sooner. The hang also occurs with
  mantic + v6.7, so the firmware needs to be updated to fix this issue.

  PHX series
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=680d98c62b13bd441949280c77ca31efb021b68a
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=72227fe463af85648523300543287a68e6c6de5f
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=56c0e7e688427270729fce6e85ecd98f1fe2a6e1

[Kernel-packages] [Bug 2051636] Re: AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading

2024-03-07 Thread Launchpad Bug Tracker
This bug was fixed in the package linux-firmware -
20220329.git681281e4-0ubuntu3.29

---
linux-firmware (20220329.git681281e4-0ubuntu3.29) jammy; urgency=medium

  * Update firmware for MT7921 in order to fix Framework 13 AMD 7040 (LP: #2049220)
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7922)
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7922)
    - linux-firmware: update firmware for MT7922 WiFi device
    - linux-firmware: update firmware for MT7922 WiFi device
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7922)
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7922)
    - linux-firmware: update firmware for MT7922 WiFi device
    - linux-firmware: update firmware for MT7922 WiFi device
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7922)
    - linux-firmware: update firmware for MT7922 WiFi device
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7922)

linux-firmware (20220329.git681281e4-0ubuntu3.28) jammy; urgency=medium

  * Missing firmware for AMD GPU GC 11.0.3 (LP: #2034103)
    - amdgpu: update VCN 4.0.0 firmware for amd.5.5 release
    - amdgpu: update VCN 4.0.0 firmware
  * DP connection swap to break eDP behavior on AMD 7735U (LP: #2049758)
    - SAUCE: Update DCN312 DMCUB firmware

linux-firmware (20220329.git681281e4-0ubuntu3.27) jammy; urgency=medium

  * AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading (LP: #2051636)
    - amdgpu: update PSP 13.0.4 firmware for amd.5.5 release
    - amdgpu: update PSP 13.0.11 firmware for amd.5.5 release
    - amdgpu: update PSP 13.0.4 firmware from 5.7 branch
    - amdgpu: update GC 11.0.1 firmware from 5.7 branch
    - amdgpu: update GC 11.0.4 firmware from 5.7 branch
    - amdgpu: update PSP 13.0.11 firmware from 5.7 branch
    - amdgpu: update GC 11.0.1 firmware
    - amdgpu: update PSP 13.0.4 firmware
    - amdgpu: update VCN 4.0.2 firmware
    - amdgpu: update GC 11.0.4 firmware
    - amdgpu: update PSP 13.0.11 firmware
  * Update firmware for MT7921 in order to fix Framework 13 AMD 7040 (LP: #2049220)
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7921)
    - linux-firmware: update firmware for MT7921 WiFi device
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7921)
    - linux-firmware: update firmware for MT7921 WiFi device
    - linux-firmware: update firmware for MT7921 WiFi device
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7921)
    - linux-firmware: update firmware for MT7921 WiFi device
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7921)
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7921)
    - linux-firmware: update firmware for MT7921 WiFi device
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7921)
    - linux-firmware: update firmware for MT7921 WiFi device
    - linux-firmware: update firmware for MT7921 WiFi device
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7921)
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7921)
    - linux-firmware: update firmware for MT7921 WiFi device
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7921)
    - linux-firmware: update firmware for MT7921 WiFi device
    - linux-firmware: update firmware for MT7921 WiFi device
    - linux-firmware: update firmware for mediatek bluetooth chip (MT7921)
  * WCN6856 Wi-FI Unavailable and no function during suspend stress (LP: #2048977)
    - ath11k: WCN6855 hw2.0: update board-2.bin
    - ath11k: WCN6855 hw2.0: update board-2.bin
    - ath11k: WCN6855 hw2.0: update to WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.36
    - ath11k: WCN6855 hw2.0: update to WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.37

 -- Juerg Haefliger  Wed, 21 Feb 2024 10:32:57 +0100

** Changed in: linux-firmware (Ubuntu Jammy)
   Status: Fix Committed => Fix Released

Status in HWE Next:
  New
Status in linux-firmware package in Ubuntu:
  Fix Released
Status in linux-firmware source package in Jammy:
  Fix Released
Status in linux-firmware source package in Mantic:
  Fix Committed
Status in linux-firmware source package in Noble:
  Fix Released


[Kernel-packages] [Bug 2051636] Re: AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading

2024-03-04 Thread You-Sheng Yang
verified:
* linux-firmware/jammy-proposed version 20220329.git681281e4-0ubuntu3.29
* linux-firmware/mantic-proposed version 20230919.git3672ccab-0ubuntu2.9

** Tags removed: verification-needed-mantic
** Tags added: verification-done-mantic

Status in HWE Next:
  New
Status in linux-firmware package in Ubuntu:
  Fix Released
Status in linux-firmware source package in Jammy:
  Fix Committed
Status in linux-firmware source package in Mantic:
  Fix Committed
Status in linux-firmware source package in Noble:
  Fix Released


[Kernel-packages] [Bug 2051636] Re: AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading

2024-02-28 Thread Timo Aaltonen
** Tags added: verification-needed-mantic

Status in HWE Next:
  New
Status in linux-firmware package in Ubuntu:
  Fix Released
Status in linux-firmware source package in Jammy:
  Fix Committed
Status in linux-firmware source package in Mantic:
  Fix Committed
Status in linux-firmware source package in Noble:
  Fix Released

Bug description:
  [SRU Justification]

  [Impact]

  When running a stress tool such as 3DMark or GravityMark, amdgpu (PHX)
  hangs within a few minutes, sometimes sooner.

  [Fix]

  Upstream firmware fixes for Phoenix (GC 11.0.1)/Phoenix 2 (GC 11.0.4), and
  other prerequisites:
  * amdgpu/gc_11_0_1_* up to commit 56c0e7e ("amdgpu: update GC 11.0.1 firmware")
  * amdgpu/psp_13_0_4_ta.bin up to commit ed7ddfb ("amdgpu: update PSP 13.0.4 firmware")
  * amdgpu/vcn_4_0_2.bin up to commit 34ccb75 ("amdgpu: update VCN 4.0.2 firmware")
  * amdgpu/gc_11_0_4_* up to commit 680d98c ("amdgpu: update GC 11.0.4 firmware")
  * amdgpu/psp_13_0_11_ta.bin up to commit 72227fe ("amdgpu: update PSP 13.0.11 firmware")

  [Test Case]

  Run a stress tool such as 3DMark or GravityMark.

  [Where problems could occur]

  This is a binary firmware update recommended by the chip vendor. No known
  issues so far.

  [Other Info]

  Phoenix is supported in linux-oem-6.5/jammy, so linux-firmware/jammy
  is also nominated for fix.

  == original bug report ==

  When running a stress tool such as 3DMark or GravityMark, amdgpu (PHX)
  hangs within a few minutes, sometimes sooner. The hang also occurs with
  mantic + v6.7, so the firmware needs to be updated to fix this issue.

  PHX series
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=680d98c62b13bd441949280c77ca31efb021b68a
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=72227fe463af85648523300543287a68e6c6de5f
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=56c0e7e688427270729fce6e85ecd98f1fe2a6e1
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=ed7ddfb5d136c3b9b1eeb48f7568550c0e5d99da
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=34ccb7502e075607682f0f0984a83022bfa0da85

  [ 415.782623] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=27035, emitted seq=27037
  [ 415.782833] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 1361 thread gnome-shel:cs0 pid 1421
  [ 415.783004] amdgpu :0d:00.0: amdgpu: GPU reset begin!
  [ 415.944129] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 415.944317] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.074161] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.074327] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.204184] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.204356] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.334204] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.334377] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.464226] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.464398] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.594247] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.594418] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.724265] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.724432] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.854275] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.854437] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.984284] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.984456] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.996743] amdgpu :0d:00.0: amdgpu: MODE2 reset
  [ 417.026498] amdgpu :0d:00.0: amdgpu: GPU reset succeeded, trying to resume
  [ 417.026909] [drm] 
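
When repeating the test case, a saved kernel log can be scanned for the same signature as the excerpt above: a ring timeout, MES errors, and a GPU reset. The sketch below is one possible way to do that, not an established tool. The function name summarize and the SAMPLE text are illustrative, and the patterns are taken only from the messages quoted in this report, so other kernel versions may word them differently.

```python
import re

# "\s+" tolerates the line wrapping seen when logs are pasted into mail.
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+)\s+timeout,\s+signaled seq=(?P<signaled>\d+),"
    r"\s+emitted seq=(?P<emitted>\d+)"
)


def summarize(log: str) -> dict:
    """Collect the hang indicators that appear in the quoted kernel log."""
    timeouts = [
        {
            "ring": m["ring"],
            "signaled": int(m["signaled"]),
            "emitted": int(m["emitted"]),
        }
        for m in RING_TIMEOUT.finditer(log)
    ]
    return {
        "ring_timeouts": timeouts,
        # Message text copied verbatim from the log, including its wording.
        "mes_failures": len(re.findall(r"MES failed to response", log)),
        "reset_begun": "GPU reset begin!" in log,
        "reset_succeeded": "GPU reset succeeded" in log,
    }


# Shortened, wrapped excerpt of the log quoted in this report.
SAMPLE = """\
[ 415.782623] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0
timeout, signaled seq=27035, emitted seq=27037
[ 415.783004] amdgpu :0d:00.0: amdgpu: GPU reset begin!
[ 415.944129] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0
[amdgpu]] *ERROR* MES failed to response msg=3
[ 417.026498] amdgpu :0d:00.0: amdgpu: GPU reset succeeded, trying to
resume
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

A run of the stress tool that leaves "ring_timeouts" empty is consistent with the fix working, though it does not prove the absence of other failures.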

[Kernel-packages] [Bug 2051636] Re: AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading

2024-02-15 Thread Mario Limonciello
An internal team at AMD has tested this across a number of different OEM
PHX systems using the OEM 6.5-1014 kernel.  It is testing well.

** Tags added: verification-done-jammy

Status in HWE Next:
  New
Status in linux-firmware package in Ubuntu:
  Fix Released
Status in linux-firmware source package in Jammy:
  Fix Committed
Status in linux-firmware source package in Mantic:
  Fix Committed
Status in linux-firmware source package in Noble:
  Fix Released


[Kernel-packages] [Bug 2051636] Re: AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading

2024-02-09 Thread Timo Aaltonen
Hello You-Sheng, or anyone else affected,

Accepted linux-firmware into mantic-proposed. The package will build now
and be available at
https://launchpad.net/ubuntu/+source/linux-firmware/20230919.git3672ccab-0ubuntu2.8
in a few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how
to enable and use -proposed.  Your feedback will aid us in getting this
update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested and what testing has been
performed on the package, and change the tag from
verification-needed-mantic to verification-done-mantic. If it does not fix
the bug for you, please add a comment stating that, and change the tag to
verification-failed-mantic. In either case, without details of your testing
we will not be able to proceed.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance for helping!

N.B. The updated package will be released to -updates after the bug(s)
fixed by this package have been verified and the package has been in
-proposed for a minimum of 7 days.

** Changed in: linux-firmware (Ubuntu Mantic)
   Status: In Progress => Fix Committed

** Changed in: linux-firmware (Ubuntu Jammy)
   Status: In Progress => Fix Committed

Status in HWE Next:
  New
Status in linux-firmware package in Ubuntu:
  Fix Released
Status in linux-firmware source package in Jammy:
  Fix Committed
Status in linux-firmware source package in Mantic:
  Fix Committed
Status in linux-firmware source package in Noble:
  Fix Released


[Kernel-packages] [Bug 2051636] Re: AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading

2024-02-08 Thread Timo Aaltonen
** Changed in: linux-firmware (Ubuntu Noble)
   Status: Triaged => Fix Released

Status in HWE Next:
  New
Status in linux-firmware package in Ubuntu:
  Fix Released
Status in linux-firmware source package in Jammy:
  In Progress
Status in linux-firmware source package in Mantic:
  In Progress
Status in linux-firmware source package in Noble:
  Fix Released

Bug description:
  [SRU Justification]

  [Impact]

  With stress tool like 3DMark or GravityMark, facing amdgpu(PHX) hangs
  within a few minutes or sometimes even quicker

  [Fix]

  Upstream firmware fixes for Phoenix (GC 11.0.1)/Phoenix 2 (GC 11.0.4), and 
other prerequisites:
  * amdgpu/gc_11_0_1_* up to commit 56c0e7e ("amdgpu: update GC 11.0.1 
firmware")
  * amdgpu/psp_13_0_4_ta.bin up to commit ed7ddfb ("amdgpu: update PSP 13.0.4 
firmware")
  * amdgpu/vcn_4_0_2.bin up to commit 34ccb75 ("amdgpu: update VCN 4.0.2 
firmware")
  * amdgpu/gc_11_0_4_* up to commit 680d98c ("amdgpu: update GC 11.0.4 
firmware")
  * amdgpu/psp_13_0_11_ta.bin up to commit 72227fe ("amdgpu: update PSP 13.0.11 
firmware")

  [Test Case]

  Run stress tool like 3DMark or GravityMark.

  [Where problems could occur]

  This is a binary firmware update recommended by the chip vendor. No
  known issues so far.

  [Other Info]

  Phoenix is supported in linux-oem-6.5/jammy, so linux-firmware/jammy
  is also nominated for the fix.

  == original bug report ==

  With a stress tool such as 3DMark or GravityMark, amdgpu(PHX) hangs
  within a few minutes, sometimes sooner. The hang also occurs on
  mantic with kernel v6.7, so the firmware needs to be updated to fix
  this issue.

  PHX series:
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=680d98c62b13bd441949280c77ca31efb021b68a
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=72227fe463af85648523300543287a68e6c6de5f
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=56c0e7e688427270729fce6e85ecd98f1fe2a6e1
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=ed7ddfb5d136c3b9b1eeb48f7568550c0e5d99da
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=34ccb7502e075607682f0f0984a83022bfa0da85
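
The abbreviated commit IDs in the [Fix] section should each be a prefix of one of the full IDs in the URLs listed here. A minimal sketch of that cross-check follows. The helper names `commit_id` and `match_short_ids` are hypothetical, and the only inputs are the IDs and URLs quoted in this report.

```python
from urllib.parse import urlparse, parse_qs

# Abbreviated commit IDs from the [Fix] list.
SHORT_IDS = ["56c0e7e", "ed7ddfb", "34ccb75", "680d98c", "72227fe"]

BASE = (
    "https://git.kernel.org/pub/scm/linux/kernel/git/firmware/"
    "linux-firmware.git/commit/"
)
# Commit URLs listed under "PHX series".
URLS = [
    BASE + "?id=680d98c62b13bd441949280c77ca31efb021b68a",
    BASE + "?id=72227fe463af85648523300543287a68e6c6de5f",
    BASE + "?id=56c0e7e688427270729fce6e85ecd98f1fe2a6e1",
    BASE + "?id=ed7ddfb5d136c3b9b1eeb48f7568550c0e5d99da",
    BASE + "?id=34ccb7502e075607682f0f0984a83022bfa0da85",
]


def commit_id(url):
    """Return the value of the `id` query parameter of a commit URL."""
    return parse_qs(urlparse(url).query)["id"][0]


def match_short_ids(short_ids, urls):
    """Map each abbreviated ID to the full ID it prefixes, or None."""
    full_ids = [commit_id(u) for u in urls]
    return {
        s: next((f for f in full_ids if f.startswith(s)), None)
        for s in short_ids
    }


if __name__ == "__main__":
    for short, full in match_short_ids(SHORT_IDS, URLS).items():
        print(short, "->", full)
```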

  [ 415.782623] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=27035, emitted seq=27037
  [ 415.782833] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 1361 thread gnome-shel:cs0 pid 1421
  [ 415.783004] amdgpu :0d:00.0: amdgpu: GPU reset begin!
  [ 415.944129] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 415.944317] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.074161] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.074327] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.204184] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.204356] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.334204] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.334377] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.464226] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.464398] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.594247] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.594418] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.724265] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.724432] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.854275] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.854437] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.984284] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.984456] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.996743] amdgpu :0d:00.0: amdgpu: MODE2 reset
  [ 417.026498] amdgpu :0d:00.0: amdgpu: GPU reset succeeded,
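
When checking a test run, it can help to scan kernel log output for the same signature as the excerpt above: a ring timeout, repeated MES failures, then a GPU reset. The following is a minimal sketch of such a scan, assuming each kernel message arrives on a single line (not wrapped as in this email). The `summarize` helper and its result keys are hypothetical. The matched strings are copied from the log excerpt.

```python
import re

# Pattern based on the "ring ... timeout" message in the log excerpt.
RING_TIMEOUT = re.compile(
    r"\*ERROR\* ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
# Literal message fragments, spelled exactly as the log prints them.
MES_FAILURE = "MES failed to response"
RESET_OK = "GPU reset succeeded"


def summarize(lines):
    """Collect ring timeouts, count MES failures, note a successful reset."""
    summary = {"timeouts": [], "mes_failures": 0, "reset_succeeded": False}
    for line in lines:
        m = RING_TIMEOUT.search(line)
        if m:
            summary["timeouts"].append(
                (m["ring"], int(m["signaled"]), int(m["emitted"]))
            )
        if MES_FAILURE in line:
            summary["mes_failures"] += 1
        if RESET_OK in line:
            summary["reset_succeeded"] = True
    return summary
```

A run with no hang would be expected to produce an empty `timeouts` list. The sketch does not judge whether a reset fully recovered the session.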

[Kernel-packages] [Bug 2051636] Re: AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading

2024-01-30 Thread Juerg Haefliger
** Tags added: kern-9038

Status in HWE Next:
  New
Status in linux-firmware package in Ubuntu:
  Triaged
Status in linux-firmware source package in Jammy:
  In Progress
Status in linux-firmware source package in Mantic:
  In Progress
Status in linux-firmware source package in Noble:
  Triaged

[Kernel-packages] [Bug 2051636] Re: AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading

2024-01-30 Thread You-Sheng Yang
** Description changed:

+ [SRU Justification]
+ 
+ [Impact]
+ 
+ With stress tool like 3DMark or GravityMark, facing amdgpu(PHX) hangs
+ within a few minutes or sometimes even quicker
+ 
+ [Fix]
+ 
+ Upstream firmware fixes for Phoenix (GC 11.0.1)/Phoenix 2 (GC 11.0.4), and other prerequisites:
+ * amdgpu/gc_11_0_1_* up to commit 56c0e7e ("amdgpu: update GC 11.0.1 firmware")
+ * amdgpu/psp_13_0_4_ta.bin up to commit ed7ddfb ("amdgpu: update PSP 13.0.4 firmware")
+ * amdgpu/vcn_4_0_2.bin up to commit 34ccb75 ("amdgpu: update VCN 4.0.2 firmware")
+ * amdgpu/gc_11_0_4_* up to commit 680d98c ("amdgpu: update GC 11.0.4 firmware")
+ * amdgpu/psp_13_0_11_ta.bin up to commit 72227fe ("amdgpu: update PSP 13.0.11 firmware")
+ 
+ [Test Case]
+ 
+ Run stress tool like 3DMark or GravityMark.
+ 
+ [Where problems could occur]
+ 
+ Binary firmware update recommended by chip vendor. No known issue so
+ far.
+ 
+ [Other Info]
+ 
+ Phoenix is supported in linux-oem-6.5/jammy, so linux-firmware/jammy is
+ also nominated for fix.
+ 
+ == original bug report ==
+ 
  With stress tool like 3DMark or GravityMark, facing amdgpu(PHX) hangs
  within a few minutes or sometimes even quicker. Also using mantic + v6.7
  hit the hang, so need to update new FWs to fix this issue.
  
  PHX series
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=680d98c62b13bd441949280c77ca31efb021b68a
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=72227fe463af85648523300543287a68e6c6de5f
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=56c0e7e688427270729fce6e85ecd98f1fe2a6e1
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=ed7ddfb5d136c3b9b1eeb48f7568550c0e5d99da
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=34ccb7502e075607682f0f0984a83022bfa0da85
  
  [ 415.782623] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=27035, emitted seq=27037
  [ 415.782833] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 1361 thread gnome-shel:cs0 pid 1421
  [ 415.783004] amdgpu :0d:00.0: amdgpu: GPU reset begin!
  [ 415.944129] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 415.944317] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.074161] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.074327] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.204184] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.204356] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.334204] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.334377] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.464226] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.464398] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.594247] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.594418] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.724265] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.724432] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.854275] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.854437] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.984284] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.984456] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.996743] amdgpu :0d:00.0: amdgpu: MODE2 reset
  [ 417.026498] amdgpu :0d:00.0: amdgpu: GPU reset succeeded, trying to resume
  [ 417.026909] [drm] PCIE GART of 512M enabled (table at 0x00801FD0).
  [ 417.027149] amdgpu :0d:00.0: amdgpu: SMU is resuming...
  [ 417.029520] amdgpu :0d:00.0: amdgpu: SMU is resumed successfully!
  [ 417.032154] [drm] DMUB hardware initialized: version=0x08003000
  [ 417.190837] [drm] kiq ring mec 3 pipe 1 q 0
  [ 417.192870] [drm] VCN decode and encode initialized successfully(under DPG Mode).
  [ 417.193037] amdgpu :0d:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
  [ 417.193447] amdgpu :0d:00.0: amdgpu: ring

[Kernel-packages] [Bug 2051636] Re: AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading

2024-01-30 Thread You-Sheng Yang
SRU:
* https://kernel.ubuntu.com/gitea/kernel/linux-firmware/pulls/6 (jammy)
* https://kernel.ubuntu.com/gitea/kernel/linux-firmware/pulls/8 (mantic)

Status in linux-firmware package in Ubuntu:
  Triaged
Status in linux-firmware source package in Jammy:
  In Progress
Status in linux-firmware source package in Mantic:
  In Progress
Status in linux-firmware source package in Noble:
  Triaged

Bug description:
  With a stress tool such as 3DMark or GravityMark, amdgpu(PHX) hangs
  within a few minutes, sometimes sooner. The hang also occurs on
  mantic with kernel v6.7, so the firmware needs to be updated to fix
  this issue.

  PHX series:
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=680d98c62b13bd441949280c77ca31efb021b68a
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=72227fe463af85648523300543287a68e6c6de5f
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=56c0e7e688427270729fce6e85ecd98f1fe2a6e1
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=ed7ddfb5d136c3b9b1eeb48f7568550c0e5d99da
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=34ccb7502e075607682f0f0984a83022bfa0da85

  [ 415.782623] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=27035, emitted seq=27037
  [ 415.782833] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 1361 thread gnome-shel:cs0 pid 1421
  [ 415.783004] amdgpu :0d:00.0: amdgpu: GPU reset begin!
  [ 415.944129] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 415.944317] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.074161] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.074327] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.204184] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.204356] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.334204] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.334377] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.464226] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.464398] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.594247] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.594418] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.724265] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.724432] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.854275] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.854437] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.984284] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.984456] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.996743] amdgpu :0d:00.0: amdgpu: MODE2 reset
  [ 417.026498] amdgpu :0d:00.0: amdgpu: GPU reset succeeded, trying to resume
  [ 417.026909] [drm] PCIE GART of 512M enabled (table at 0x00801FD0).
  [ 417.027149] amdgpu :0d:00.0: amdgpu: SMU is resuming...
  [ 417.029520] amdgpu :0d:00.0: amdgpu: SMU is resumed successfully!
  [ 417.032154] [drm] DMUB hardware initialized: version=0x08003000
  [ 417.190837] [drm] kiq ring mec 3 pipe 1 q 0
  [ 417.192870] [drm] VCN decode and encode initialized successfully(under DPG Mode).
  [ 417.193037] amdgpu :0d:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
  [ 417.193447] amdgpu :0d:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
  [ 417.193449] amdgpu :0d:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
  [ 417.193451] amdgpu :0d:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
  [ 417.193452] amdgpu :0d:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
  [ 417.193453] amdgpu :0d:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
  [ 417.193454]

[Kernel-packages] [Bug 2051636] Re: AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading

2024-01-30 Thread You-Sheng Yang
All the fixes are in the upstream repository, so there should be no work
to do for Noble once it migrates to upstream HEAD.

Status in linux-firmware package in Ubuntu:
  Triaged
Status in linux-firmware source package in Jammy:
  In Progress
Status in linux-firmware source package in Mantic:
  In Progress
Status in linux-firmware source package in Noble:
  Triaged

[Kernel-packages] [Bug 2051636] Re: AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading

2024-01-30 Thread You-Sheng Yang
** Changed in: linux-firmware (Ubuntu Jammy)
   Status: New => In Progress

** Changed in: linux-firmware (Ubuntu Mantic)
   Status: New => In Progress

** Changed in: linux-firmware (Ubuntu Noble)
   Status: New => Incomplete

** Changed in: linux-firmware (Ubuntu Noble)
   Status: Incomplete => Triaged

** Changed in: linux-firmware (Ubuntu Jammy)
   Importance: Undecided => High

** Changed in: linux-firmware (Ubuntu Mantic)
   Importance: Undecided => High

** Changed in: linux-firmware (Ubuntu Jammy)
 Assignee: (unassigned) => You-Sheng Yang (vicamo)

** Changed in: linux-firmware (Ubuntu Mantic)
 Assignee: (unassigned) => You-Sheng Yang (vicamo)

** Changed in: linux-firmware (Ubuntu Noble)
 Assignee: (unassigned) => You-Sheng Yang (vicamo)

Status in linux-firmware package in Ubuntu:
  Triaged
Status in linux-firmware source package in Jammy:
  In Progress
Status in linux-firmware source package in Mantic:
  In Progress
Status in linux-firmware source package in Noble:
  Triaged

[Kernel-packages] [Bug 2051636] Re: AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading

2024-01-29 Thread You-Sheng Yang
** Summary changed:

- AMD phenix/phenix2 platforms facing amdgpu(PHX) hangs during stress loading
+ AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress loading

Status in linux-firmware package in Ubuntu:
  New
Status in linux-firmware source package in Jammy:
  New
Status in linux-firmware source package in Mantic:
  New
Status in linux-firmware source package in Noble:
  New
