The verification of the Stable Release Update for linux-firmware has
completed successfully and the package is now being released to
-updates.  Subsequently, the Ubuntu Stable Release Updates Team is being
unsubscribed and will not receive messages about this bug report.  In
the event that you encounter a regression using the package from
-updates please report a new bug using ubuntu-bug and tag the bug report
regression-update so we can easily find any regressions.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-firmware in Ubuntu.
https://bugs.launchpad.net/bugs/2051636

Title:
  AMD phoenix/phoenix2 platforms facing amdgpu(PHX) hangs during stress
  loading

Status in HWE Next:
  New
Status in linux-firmware package in Ubuntu:
  Fix Released
Status in linux-firmware source package in Jammy:
  Fix Committed
Status in linux-firmware source package in Mantic:
  Fix Committed
Status in linux-firmware source package in Noble:
  Fix Released

Bug description:
  [SRU Justification]

  [Impact]

  With a stress tool such as 3DMark or GravityMark, amdgpu (PHX) hangs
  within a few minutes, sometimes sooner.

  [Fix]

  Upstream firmware fixes for Phoenix (GC 11.0.1)/Phoenix 2 (GC 11.0.4), and other prerequisites:
  * amdgpu/gc_11_0_1_* up to commit 56c0e7e ("amdgpu: update GC 11.0.1 firmware")
  * amdgpu/psp_13_0_4_ta.bin up to commit ed7ddfb ("amdgpu: update PSP 13.0.4 firmware")
  * amdgpu/vcn_4_0_2.bin up to commit 34ccb75 ("amdgpu: update VCN 4.0.2 firmware")
  * amdgpu/gc_11_0_4_* up to commit 680d98c ("amdgpu: update GC 11.0.4 firmware")
  * amdgpu/psp_13_0_11_ta.bin up to commit 72227fe ("amdgpu: update PSP 13.0.11 firmware")
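
  As a rough aid for checking a test machine, the file patterns in the
  list above can be matched against an installed firmware directory. This
  is only a sketch: the function name find_firmware is hypothetical, the
  /lib/firmware/amdgpu path and the possibility of compressed file names
  (for example a .zst suffix) are assumptions, and the check confirms
  presence only, not that a file corresponds to a given upstream commit.

```python
from pathlib import Path

# Glob patterns taken from the [Fix] list. The trailing "*" on the .bin
# names is an assumption so that compressed variants also match.
FIRMWARE_PATTERNS = [
    "gc_11_0_1_*",
    "psp_13_0_4_ta.bin*",
    "vcn_4_0_2.bin*",
    "gc_11_0_4_*",
    "psp_13_0_11_ta.bin*",
]


def find_firmware(firmware_dir):
    """Return {pattern: sorted list of matching file names} for firmware_dir."""
    root = Path(firmware_dir)
    return {
        pattern: sorted(p.name for p in root.glob(pattern))
        for pattern in FIRMWARE_PATTERNS
    }


if __name__ == "__main__":
    # Assumed install location; adjust for the system under test.
    for pattern, names in find_firmware("/lib/firmware/amdgpu").items():
        print(f"{pattern}: {', '.join(names) if names else 'no match'}")
```

  A pattern that reports no match suggests the package does not ship that
  file on the system being checked.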

  [Test Case]

  Run a stress tool such as 3DMark or GravityMark and check that amdgpu
  no longer hangs.

  [Where problems could occur]

  This is a binary firmware update recommended by the chip vendor. No
  known issues so far.

  [Other Info]

  Phoenix is supported in linux-oem-6.5/jammy, so linux-firmware/jammy
  is also nominated for the fix.

  ========== original bug report ==========

  With a stress tool such as 3DMark or GravityMark, amdgpu (PHX) hangs
  within a few minutes, sometimes sooner. The hang also occurs on mantic
  with a v6.7 kernel, so the firmware needs to be updated to fix this
  issue.

  PHX series
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=680d98c62b13bd441949280c77ca31efb021b68a
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=72227fe463af85648523300543287a68e6c6de5f
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=56c0e7e688427270729fce6e85ecd98f1fe2a6e1
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=ed7ddfb5d136c3b9b1eeb48f7568550c0e5d99da
  https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=34ccb7502e075607682f0f0984a83022bfa0da85

  [ 415.782623] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=27035, emitted seq=27037
  [ 415.782833] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 1361 thread gnome-shel:cs0 pid 1421
  [ 415.783004] amdgpu 0000:0d:00.0: amdgpu: GPU reset begin!
  [ 415.944129] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 415.944317] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.074161] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.074327] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.204184] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.204356] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.334204] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.334377] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.464226] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.464398] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.594247] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.594418] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.724265] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.724432] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.854275] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.854437] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.984284] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
  [ 416.984456] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
  [ 416.996743] amdgpu 0000:0d:00.0: amdgpu: MODE2 reset
  [ 417.026498] amdgpu 0000:0d:00.0: amdgpu: GPU reset succeeded, trying to resume
  [ 417.026909] [drm] PCIE GART of 512M enabled (table at 0x000000801FD00000).
  [ 417.027149] amdgpu 0000:0d:00.0: amdgpu: SMU is resuming...
  [ 417.029520] amdgpu 0000:0d:00.0: amdgpu: SMU is resumed successfully!
  [ 417.032154] [drm] DMUB hardware initialized: version=0x08003000
  [ 417.190837] [drm] kiq ring mec 3 pipe 1 q 0
  [ 417.192870] [drm] VCN decode and encode initialized successfully(under DPG Mode).
  [ 417.193037] amdgpu 0000:0d:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
  [ 417.193447] amdgpu 0000:0d:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
  [ 417.193449] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
  [ 417.193451] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
  [ 417.193452] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
  [ 417.193453] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
  [ 417.193454] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
  [ 417.193455] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
  [ 417.193456] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
  [ 417.193458] amdgpu 0000:0d:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
  [ 417.193459] amdgpu 0000:0d:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
  [ 417.193460] amdgpu 0000:0d:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
  [ 417.193461] amdgpu 0000:0d:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
  [ 417.193462] amdgpu 0000:0d:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
  [ 417.195893] amdgpu 0000:0d:00.0: amdgpu: recover vram bo from shadow start
  [ 417.195894] amdgpu 0000:0d:00.0: amdgpu: recover vram bo from shadow done
  [ 417.195904] amdgpu 0000:0d:00.0: amdgpu: GPU reset(2) succeeded!
  [ 417.197048] [drm] Skip scheduling IBs!
  [ 417.197057] [drm] Skip scheduling IBs!
  [ 417.197063] [drm] Skip scheduling IBs!
  [ 443.578688] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
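
  When re-running the stress test, the hang signature in the log above
  (a ring timeout followed by "GPU reset begin!") can be searched for
  automatically. The following is a minimal sketch, not part of the
  original report: the function names find_gpu_hangs and count_gpu_resets
  are hypothetical, and the patterns are taken only from this one log
  excerpt, so other kernel versions may word the messages differently.

```python
import re

# Message shapes copied from the log excerpt in this report.
RING_TIMEOUT = re.compile(
    r"\*ERROR\* ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
RESET_BEGIN = "GPU reset begin!"


def _flatten(log_text):
    # Collapse all whitespace runs so that hard-wrapped log lines
    # (as produced by mail clients) still match.
    return re.sub(r"\s+", " ", log_text)


def find_gpu_hangs(log_text):
    """Return one dict per ring timeout found in the kernel log text."""
    return [
        {
            "ring": m.group("ring"),
            "signaled": int(m.group("signaled")),
            "emitted": int(m.group("emitted")),
        }
        for m in RING_TIMEOUT.finditer(_flatten(log_text))
    ]


def count_gpu_resets(log_text):
    """Count how many times a GPU reset was started."""
    return _flatten(log_text).count(RESET_BEGIN)


if __name__ == "__main__":
    sample = (
        "[ 415.782623] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 \n"
        "timeout, signaled seq=27035, emitted seq=27037\n"
        "[ 415.783004] amdgpu 0000:0d:00.0: amdgpu: GPU reset begin!\n"
    )
    print(find_gpu_hangs(sample))
    print(count_gpu_resets(sample))
```

  The text passed in would typically be saved kernel log output captured
  after the stress run. An empty result only means these two messages
  were not seen, not that the GPU is healthy.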

To manage notifications about this bug go to:
https://bugs.launchpad.net/hwe-next/+bug/2051636/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
