Public bug reported:

Verified on multiple DL360 Gen9 servers with up-to-date firmware. Just
before reboot or shutdown, the kernel panics:

[  289.093083] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[  289.093085] {1}[Hardware Error]: event severity: fatal
[  289.093087] {1}[Hardware Error]:  Error 0, type: fatal
[  289.093088] {1}[Hardware Error]:   section_type: PCIe error
[  289.093090] {1}[Hardware Error]:   port_type: 4, root port
[  289.093091] {1}[Hardware Error]:   version: 1.16
[  289.093093] {1}[Hardware Error]:   command: 0x6010, status: 0x0143
[  289.093094] {1}[Hardware Error]:   device_id: 0000:00:01.0
[  289.093095] {1}[Hardware Error]:   slot: 0
[  289.093096] {1}[Hardware Error]:   secondary_bus: 0x03
[  289.093097] {1}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x2f02
[  289.093098] {1}[Hardware Error]:   class_code: 040600
[  289.093378] {1}[Hardware Error]:   bridge: secondary_status: 0x2000, control: 0x0003
[  289.093380] {1}[Hardware Error]:  Error 1, type: fatal
[  289.093381] {1}[Hardware Error]:   section_type: PCIe error
[  289.093382] {1}[Hardware Error]:   port_type: 4, root port
[  289.093383] {1}[Hardware Error]:   version: 1.16
[  289.093384] {1}[Hardware Error]:   command: 0x6010, status: 0x0143
[  289.093386] {1}[Hardware Error]:   device_id: 0000:00:01.0
[  289.093386] {1}[Hardware Error]:   slot: 0
[  289.093387] {1}[Hardware Error]:   secondary_bus: 0x03
[  289.093388] {1}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x2f02
[  289.093674] {1}[Hardware Error]:   class_code: 040600
[  289.093676] {1}[Hardware Error]:   bridge: secondary_status: 0x2000, control: 0x0003
[  289.093678] Kernel panic - not syncing: Fatal hardware error!
[  289.093745] Kernel Offset: 0x1cc00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  289.105835] ERST: [Firmware Warn]: Firmware does not respond in time.

It does eventually restart after this.  Then during the subsequent POST,
the following warning appears:

Embedded RAID 1 : Smart Array P440ar Controller - (2048 MB, V6.30) 7 Logical
Drive(s) - Operation Failed
 - 1719-Slot 0 Drive Array - A controller failure event occurred prior
   to this power-up.  (Previous lock up code = 0x13) Action: Install the
   latest controller firmware. If the problem persists, replace the
   controller.

The symptoms of the latter are described in
https://support.hpe.com/hpsc/doc/public/display?docId=emr_na-c04805565
but the storage controller firmware running here is much newer than the
version that document gives as the resolution.

Neither of these problems occurs during shutdown/reboot on the xenial
kernel.

FWIW, when running the older P89 BIOS (1.50 (07/20/2015), versus the
current 2.56 (01/22/2018)), the shutdown failure mode was a loop like so:

[529151.035267] NMI: IOCK error (debug interrupt?) for reason 75 on CPU 0.
[529153.222883] Uhhuh. NMI received for unknown reason 25 on CPU 0.
[529153.222884] Do you have a strange power saving mode enabled?
[529153.222884] Dazed and confused, but trying to continue
[529153.554447] Uhhuh. NMI received for unknown reason 25 on CPU 0.
[529153.554448] Do you have a strange power saving mode enabled?
[529153.554449] Dazed and confused, but trying to continue
[529153.554450] Uhhuh. NMI received for unknown reason 25 on CPU 0.
[529153.554451] Do you have a strange power saving mode enabled?
[529153.554452] Dazed and confused, but trying to continue
[529153.554452] Uhhuh. NMI received for unknown reason 25 on CPU 0.
[529153.554453] Do you have a strange power saving mode enabled?
[529153.554454] Dazed and confused, but trying to continue
[529153.554454] Uhhuh. NMI received for unknown reason 35 on CPU 0.
[529153.554455] Do you have a strange power saving mode enabled?
[529153.554456] Dazed and confused, but trying to continue
[529153.554457] Uhhuh. NMI received for unknown reason 25 on CPU 0.
[529153.554458] Do you have a strange power saving mode enabled?
[529153.554458] Dazed and confused, but trying to continue
[529153.554459] Uhhuh. NMI received for unknown reason 25 on CPU 0.
[529153.554460] Do you have a strange power saving mode enabled?
[529153.554460] Dazed and confused, but trying to continue
[529154.953916] Uhhuh. NMI received for unknown reason 25 on CPU 0.
[529154.953917] Do you have a strange power saving mode enabled?
[529154.953918] Dazed and confused, but trying to continue

Upgrading to 2.56 changed that failure mode to a kernel panic.
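
For anyone comparing captures from several affected machines, here is a
rough sketch of how the "[Hardware Error]" lines from the panic log could
be grouped into one record per "Error N" entry. The helper name
parse_hardware_errors and the SAMPLE variable are made up for illustration
and are not part of any kernel or Ubuntu tooling. It assumes the wrapped
log lines have been rejoined first, and it keeps fields as an ordered list
of pairs because the log prints two different values under the name
device_id.

```python
import re

# Hypothetical helper for illustration only; not part of any existing tool.
# Matches e.g. "[  289.093088] {1}[Hardware Error]:   section_type: PCIe error"
LINE_RE = re.compile(r"^\[\s*\d+\.\d+\]\s+\{\d+\}\[Hardware Error\]:\s*(.*)$")
START_RE = re.compile(r"^Error (\d+), type: (\w+)$")


def parse_hardware_errors(text):
    """Group '[Hardware Error]' lines into one record per 'Error N' entry."""
    records = []
    current = None
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # not a hardware error line
        body = m.group(1).strip()
        start = START_RE.match(body)
        if start:
            current = {"index": int(start.group(1)),
                       "type": start.group(2),
                       "fields": []}
            records.append(current)
            continue
        if current is None:
            continue  # header lines printed before the first "Error N"
        if body.startswith("bridge: "):
            body = body[len("bridge: "):]
        # One line may carry several "key: value" pairs separated by ", ".
        for part in body.split(", "):
            key, sep, value = part.partition(": ")
            if sep:
                current["fields"].append((key, value))
    return records


# Lines copied from the panic log in this report (wrapped line rejoined).
SAMPLE = """\
[  289.093083] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[  289.093085] {1}[Hardware Error]: event severity: fatal
[  289.093087] {1}[Hardware Error]:  Error 0, type: fatal
[  289.093088] {1}[Hardware Error]:   section_type: PCIe error
[  289.093090] {1}[Hardware Error]:   port_type: 4, root port
[  289.093094] {1}[Hardware Error]:   device_id: 0000:00:01.0
[  289.093097] {1}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x2f02
[  289.093378] {1}[Hardware Error]:   bridge: secondary_status: 0x2000, control: 0x0003
[  289.093380] {1}[Hardware Error]:  Error 1, type: fatal
[  289.093381] {1}[Hardware Error]:   section_type: PCIe error
"""

records = parse_hardware_errors(SAMPLE)
for rec in records:
    print(rec["index"], rec["type"], rec["fields"])
```

Running the same function over dmesg captures from different servers or
BIOS versions would make it easier to check whether the reported root
port and bridge status fields are identical across machines.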

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-signed-image-generic 4.15.0.21.22
ProcVersionSignature: Ubuntu 4.15.0-21.22-generic 4.15.17
Uname: Linux 4.15.0-21-generic x86_64
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116,  1 May 15 23:11 seq
 crw-rw---- 1 root audio 116, 33 May 15 23:11 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
Date: Wed May 16 00:17:53 2018
HibernationDevice: RESUME=UUID=696e8063-c668-4c89-a478-bfc23a450369
InstallationDate: Installed on 2016-06-01 (713 days ago)
InstallationMedia: Ubuntu-Server 14.04.5 LTS "Trusty Tahr" - Beta amd64 (20160527)
MachineType: HP ProLiant DL360 Gen9
PciMultimedia:
 
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 mgadrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-21-generic root=UUID=6e6d422d-8ffb-4db3-b8c7-6c81e320b1b2 ro console=tty0 console=ttyS1,38400 nosplash console=ttyS1,38400 console=tty0 nosplash
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-21-generic N/A
 linux-backports-modules-4.15.0-21-generic  N/A
 linux-firmware                             1.173
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
SourcePackage: linux
UpgradeStatus: Upgraded to bionic on 2018-05-09 (6 days ago)
dmi.bios.date: 01/22/2018
dmi.bios.vendor: HP
dmi.bios.version: P89
dmi.board.name: ProLiant DL360 Gen9
dmi.board.vendor: HP
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrP89:bd01/22/2018:svnHP:pnProLiantDL360Gen9:pvr:rvnHP:rnProLiantDL360Gen9:rvr:cvnHP:ct23:cvr:
dmi.product.family: ProLiant
dmi.product.name: ProLiant DL360 Gen9
dmi.sys.vendor: HP

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: Confirmed


** Tags: amd64 apport-bug bionic package-from-proposed third-party-packages

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1771467

Title:
  Reboot/shutdown kernel panic on HP DL360 Gen9 w/ bionic 4.15.0

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1771467/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
