** Description changed:

+ Impact:
+ This is being noticed frequently. It only affects 5.4 kernels and is
+ fixed in subsequent releases.
+ 
+ The offending patch was removed in 20.10 and later kernels (it was
+ reverted upstream not long after being merged into mainline, but we
+ never reverted it).
+ 
+ 
  The following error messages are observed on reboot:
  
  [  146.429212] shutdown[1]: Rebooting.
  [  146.435151] kvm: exiting hardware virtualization
  [  146.575319] megaraid_sas 0000:67:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
  [  148.088133] [qede_unload:2236(eno12409)]Link is down
  [  148.183618] qede 0000:31:00.1: Ending qede_remove successfully
  [  148.518541] [qede_unload:2236(eno12399)]Link is down
  [  148.625066] qede 0000:31:00.0: Ending qede_remove successfully
  [  148.762067] ACPI: Preparing to enter system sleep state S5
  [  148.794638] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 5
  [  148.803731] {1}[Hardware Error]: event severity: recoverable
  [  148.810191] {1}[Hardware Error]:  Error 0, type: fatal
  [  148.816088] {1}[Hardware Error]:   section_type: PCIe error
  [  148.822391] {1}[Hardware Error]:   port_type: 0, PCIe end point
  [  148.829026] {1}[Hardware Error]:   version: 3.0
  [  148.834266] {1}[Hardware Error]:   command: 0x0006, status: 0x0010
  [  148.841140] {1}[Hardware Error]:   device_id: 0000:04:00.0
  [  148.847309] {1}[Hardware Error]:   slot: 0
  [  148.852077] {1}[Hardware Error]:   secondary_bus: 0x00
  [  148.857876] {1}[Hardware Error]:   vendor_id: 0x14e4, device_id: 0x165f
  [  148.865145] {1}[Hardware Error]:   class_code: 020000
  [  148.870845] {1}[Hardware Error]:   aer_uncor_status: 0x00100000, aer_uncor_mask: 0x00010000
  [  148.879842] {1}[Hardware Error]:   aer_uncor_severity: 0x000ef030
  [  148.886575] {1}[Hardware Error]:   TLP Header: 40000001 0000030f 90028090 00000000
  [  148.894823] tg3 0000:04:00.0: AER: aer_status: 0x00100000, aer_mask: 0x00010000
  [  148.902795] tg3 0000:04:00.0: AER:    [20] UnsupReq               (First)
  [  148.910234] tg3 0000:04:00.0: AER: aer_layer=Transaction Layer, aer_agent=Requester ID
  [  148.918806] tg3 0000:04:00.0: AER: aer_uncor_severity: 0x000ef030
  [  148.925558] tg3 0000:04:00.0: AER:   TLP Header: 40000001 0000030f 90028090 00000000
  [  148.933984] reboot: Restarting system
  [  148.938319] reboot: machine restart
  
- 
- I observed the following when testing older kernels:
- 
+ I observed the following when testing older kernels:
  
  Kernel version   Fatal Error
  5.4.0-42.46      No
  5.4.0-45.49      No
  5.4.0-47.51      No
  5.4.0-48.52      No
  5.4.0-51.56      No
  5.4.0-52.57      No
  5.4.0-53.59      No
  5.4.0-54.60      No
  5.4.0-58.64      No
  5.4.0-59.65      Yes
  5.4.0-60.67      Yes
  
- 
  I then bisected the kernel between 5.4.0-58.64 and 5.4.0-59.65.
  
  The issue appears to be caused by the following patch. The driver
  does not handle the D3 state properly.
  
  PCI/ACPI: Whitelist hotplug ports for D3 if power managed by ACPI
  
  https://kernel.ubuntu.com/git/ubuntu/ubuntu-focal.git/commit/?id=b9319dd02269593911403dd5d684368bcef3261d

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1917471

Title:
  [Regression] Bus Fatal Error observed when reboot on BCM5720

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Focal:
  In Progress
Status in linux source package in Impish:
  Fix Released
Status in linux source package in Jammy:
  Fix Released

Bug description:
  Impact:
  This is being noticed frequently. It only affects 5.4 kernels and is
  fixed in subsequent releases.

  The offending patch was removed in 20.10 and later kernels (it was
  reverted upstream not long after being merged into mainline, but we
  never reverted it).


  The following error messages are observed on reboot:

  [  146.429212] shutdown[1]: Rebooting.
  [  146.435151] kvm: exiting hardware virtualization
  [  146.575319] megaraid_sas 0000:67:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
  [  148.088133] [qede_unload:2236(eno12409)]Link is down
  [  148.183618] qede 0000:31:00.1: Ending qede_remove successfully
  [  148.518541] [qede_unload:2236(eno12399)]Link is down
  [  148.625066] qede 0000:31:00.0: Ending qede_remove successfully
  [  148.762067] ACPI: Preparing to enter system sleep state S5
  [  148.794638] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 5
  [  148.803731] {1}[Hardware Error]: event severity: recoverable
  [  148.810191] {1}[Hardware Error]:  Error 0, type: fatal
  [  148.816088] {1}[Hardware Error]:   section_type: PCIe error
  [  148.822391] {1}[Hardware Error]:   port_type: 0, PCIe end point
  [  148.829026] {1}[Hardware Error]:   version: 3.0
  [  148.834266] {1}[Hardware Error]:   command: 0x0006, status: 0x0010
  [  148.841140] {1}[Hardware Error]:   device_id: 0000:04:00.0
  [  148.847309] {1}[Hardware Error]:   slot: 0
  [  148.852077] {1}[Hardware Error]:   secondary_bus: 0x00
  [  148.857876] {1}[Hardware Error]:   vendor_id: 0x14e4, device_id: 0x165f
  [  148.865145] {1}[Hardware Error]:   class_code: 020000
  [  148.870845] {1}[Hardware Error]:   aer_uncor_status: 0x00100000, aer_uncor_mask: 0x00010000
  [  148.879842] {1}[Hardware Error]:   aer_uncor_severity: 0x000ef030
  [  148.886575] {1}[Hardware Error]:   TLP Header: 40000001 0000030f 90028090 00000000
  [  148.894823] tg3 0000:04:00.0: AER: aer_status: 0x00100000, aer_mask: 0x00010000
  [  148.902795] tg3 0000:04:00.0: AER:    [20] UnsupReq               (First)
  [  148.910234] tg3 0000:04:00.0: AER: aer_layer=Transaction Layer, aer_agent=Requester ID
  [  148.918806] tg3 0000:04:00.0: AER: aer_uncor_severity: 0x000ef030
  [  148.925558] tg3 0000:04:00.0: AER:   TLP Header: 40000001 0000030f 90028090 00000000
  [  148.933984] reboot: Restarting system
  [  148.938319] reboot: machine restart
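
The register values in the log above can be decoded by hand. The following is a minimal sketch, not part of the original report: the helper names `decode_bits` and `decode_mem_tlp_header` are hypothetical, the bit labels are the short abbreviations the Linux AER driver prints, and the header layout assumes the logged TLP is a 3DW memory request as laid out in the PCIe specification.

```python
# Short labels for the AER Uncorrectable Error Status bits, matching the
# abbreviations the Linux AER driver prints (e.g. "[20] UnsupReq").
AER_UNCOR_BITS = {
    4: "DLP", 5: "SDES", 12: "TLP", 13: "FCP", 14: "CmpltTO",
    15: "CmpltAbrt", 16: "UnxCmplt", 17: "RxOF", 18: "MalfTLP",
    19: "ECRC", 20: "UnsupReq", 21: "ACSViol",
}


def decode_bits(value):
    """Return (bit, label) pairs for every bit set in a 32-bit AER register."""
    return [(bit, AER_UNCOR_BITS.get(bit, "unknown"))
            for bit in range(32) if value & (1 << bit)]


def decode_mem_tlp_header(dw0, dw1, dw2):
    """Split the first three header DWORDs, assuming a 3DW memory request."""
    return {
        "fmt": (dw0 >> 29) & 0x7,              # header format
        "type": (dw0 >> 24) & 0x1F,            # TLP type
        "length_dw": dw0 & 0x3FF,              # payload length in DWORDs
        "requester_id": (dw1 >> 16) & 0xFFFF,  # bus/device/function of sender
        "tag": (dw1 >> 8) & 0xFF,
        "last_be": (dw1 >> 4) & 0xF,           # last DWORD byte enables
        "first_be": dw1 & 0xF,                 # first DWORD byte enables
        "address": dw2 & 0xFFFFFFFC,           # low two bits are not address
    }


if __name__ == "__main__":
    print(decode_bits(0x00100000))  # aer_uncor_status from the log
    print(decode_bits(0x00010000))  # aer_uncor_mask from the log
    hdr = decode_mem_tlp_header(0x40000001, 0x0000030F, 0x90028090)
    print({name: hex(val) for name, val in hdr.items()})
```

Under these assumptions, the status value has only bit 20 set, which agrees with the `[20] UnsupReq` line, and the header fields (fmt 0x2, type 0x0, length 1) read as a one-DWORD 32-bit memory write to address 0x90028090.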

  I observed the following when testing older kernels:

  Kernel version   Fatal Error
  5.4.0-42.46      No
  5.4.0-45.49      No
  5.4.0-47.51      No
  5.4.0-48.52      No
  5.4.0-51.56      No
  5.4.0-52.57      No
  5.4.0-53.59      No
  5.4.0-54.60      No
  5.4.0-58.64      No
  5.4.0-59.65      Yes
  5.4.0-60.67      Yes

  I then bisected the kernel between 5.4.0-58.64 and 5.4.0-59.65.
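
How the test results narrow the regression to one pair of adjacent versions can be sketched as a binary search. This is an illustration only: the function name `last_good_first_bad` is hypothetical, the data is copied from the results listed in this report, and the search assumes that once a version shows the error every later version does too.

```python
# (version, fatal_error_seen) pairs, oldest first, as listed in this report.
RESULTS = [
    ("5.4.0-42.46", False), ("5.4.0-45.49", False), ("5.4.0-47.51", False),
    ("5.4.0-48.52", False), ("5.4.0-51.56", False), ("5.4.0-52.57", False),
    ("5.4.0-53.59", False), ("5.4.0-54.60", False), ("5.4.0-58.64", False),
    ("5.4.0-59.65", True), ("5.4.0-60.67", True),
]


def last_good_first_bad(results):
    """Binary-search for the adjacent (last good, first bad) versions.

    Requires the oldest entry to be good and the newest to be bad.
    """
    lo, hi = 0, len(results) - 1
    if results[lo][1] or not results[hi][1]:
        raise ValueError("need a good oldest version and a bad newest version")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if results[mid][1]:
            hi = mid  # error seen: the regression is at or before mid
        else:
            lo = mid  # no error: the regression is after mid
    return results[lo][0], results[hi][0]


if __name__ == "__main__":
    print(last_good_first_bad(RESULTS))
```

With the data above the search ends on 5.4.0-58.64 and 5.4.0-59.65, the same range that was bisected at the commit level.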

  The issue appears to be caused by the following patch. The driver
  does not handle the D3 state properly.

  PCI/ACPI: Whitelist hotplug ports for D3 if power managed by ACPI

  https://kernel.ubuntu.com/git/ubuntu/ubuntu-focal.git/commit/?id=b9319dd02269593911403dd5d684368bcef3261d

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1917471/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
