** Description changed:

  [Impact]
  When an ARM-specific platform error (CPER) occurs, the kernel emits an
  error with an opaque hex error type. The user then has to consult the UEFI
  specification to decode it. It is far easier for the kernel to do the
  decoding itself and tell the user what the problem is.
  
  [Test Case]
  On a server that supports EINJ, inject a fake correctable error (CE).
  Thanks to Tyler Baicar for this example:
  
  modprobe einj
  echo 0x12345000 > /sys/kernel/debug/apei/einj/param1
  echo $((-1 << 12)) > /sys/kernel/debug/apei/einj/param2
  echo 5 > /sys/kernel/debug/apei/einj/param3
  echo 0x3 > /sys/kernel/debug/apei/einj/flags
  echo 0x1 > /sys/kernel/debug/apei/einj/error_type
  echo 1 > /sys/kernel/debug/apei/einj/error_inject
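
  For readers unfamiliar with the param2 expression above: it is the address
  mask that pairs with the address written to param1. The sketch below only
  prints that mask so the value can be inspected; einj_mask is a hypothetical
  helper name, and the output assumes a shell with 64-bit arithmetic.

```shell
#!/bin/bash
# Print the address mask that clears the low `bits` bits.
# einj_mask is a hypothetical helper, not part of the EINJ interface.
einj_mask() {
    local bits=$1
    # Same expression as the param2 line in the test case, shown in hex.
    printf '0x%x\n' "$(( -1 << bits ))"
}

einj_mask 12   # prints 0xfffffffffffff000 (4 KiB granule)
```

  With a shift of 12, the injected address 0x12345000 is matched at 4 KiB
  granularity.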
  
  In the output, look for a decoded message. Without this fix, you'll see:
  [  388.094304] {2}[Hardware Error]:    error_info: 0x0000000004c6007f
  [  388.094341] {2}[Hardware Error]:    physical fault address: 0x0000000012345000
  
  But with the fix, you'll see:
  
  [  388.094304] {2}[Hardware Error]:    error_info: 0x0000000004c6007f
  [  388.094317] {2}[Hardware Error]:     transaction type: Generic
  [  388.094322] {2}[Hardware Error]:     operation type: Generic read (type of instruction or data request cannot be determined)
  [  388.094326] {2}[Hardware Error]:     cache level: 3
  [  388.094330] {2}[Hardware Error]:     processor context not corrupted
  [  388.094333] {2}[Hardware Error]:     the error has been corrected
  [  388.094337] {2}[Hardware Error]:     PC is imprecise
  [  388.094341] {2}[Hardware Error]:    physical fault address: 0x0000000012345000
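
  The comparison of the two outputs can be automated. The following is a
  minimal sketch, assuming the decoded output always includes a
  "transaction type:" line as in the sample above; cper_is_decoded is a
  hypothetical helper name.

```shell
#!/bin/bash
# Succeed if the log text on stdin contains a decoded ARM CPER field.
# cper_is_decoded is a hypothetical helper used only for this sketch.
cper_is_decoded() {
    grep -q 'Hardware Error\]: *transaction type:'
}

# Typical use on the test machine (needs permission to read the kernel log):
#   dmesg | cper_is_decoded && echo "decoded" || echo "not decoded"
```

  A record whose transaction type field is not marked valid would not match,
  so treat a negative result as a prompt to read the log by hand.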
  
  [Fix]
  These upstream fixes add ARM decoding support:
  
  c6d8c8ef1d0d94fdae9f5d72982963db89f9cdad      efi: Move ARM CPER code to new file
  301f55b1a9177132d2b9ce8a90bf0ae4b37bb850      efi: Parse ARM error information value
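
  To illustrate what decoding the error information value involves, here is
  a rough sketch that unpacks the sample value by hand. It is not the kernel
  code from the commits above. The bit positions are an assumption (validity
  flags in the low bits, fields starting at bit 16) chosen to agree with the
  sample output in the test case, and should be checked against the UEFI
  specification. decode_error_info is a hypothetical name.

```shell
#!/bin/bash
# Unpack a few fields of an ARM cache error_info value.
# ASSUMPTION: the bit layout below is illustrative; verify against the UEFI spec.
decode_error_info() {
    local v=$(( $1 ))
    # Each low bit says whether the matching field further up is valid.
    if [ $(( v & 0x01 )) -ne 0 ]; then echo "transaction type: $(( (v >> 16) & 0x3 ))"; fi
    if [ $(( v & 0x02 )) -ne 0 ]; then echo "operation type: $(( (v >> 18) & 0xf ))"; fi
    if [ $(( v & 0x04 )) -ne 0 ]; then echo "level: $(( (v >> 22) & 0x7 ))"; fi
    if [ $(( v & 0x08 )) -ne 0 ]; then echo "processor context corrupt: $(( (v >> 25) & 0x1 ))"; fi
    if [ $(( v & 0x10 )) -ne 0 ]; then echo "corrected: $(( (v >> 26) & 0x1 ))"; fi
    if [ $(( v & 0x20 )) -ne 0 ]; then echo "precise PC: $(( (v >> 27) & 0x1 ))"; fi
}

decode_error_info 0x0000000004c6007f
```

  For the sample value this yields level 3, corrected 1, context corrupt 0
  and precise PC 0, which lines up with the "cache level: 3", "the error has
  been corrected", "processor context not corrupted" and "PC is imprecise"
  lines in the test case output.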
  
  [Regression Risk]
+ The changed code is specific to ARM and has been tested there. There is a
  small change to arch-independent code, but it only renames an array and
  adds a const attribute that is clearly correct.

https://bugs.launchpad.net/bugs/1770244

Title:
  Decode ARM CPER records in kernel

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Bionic:
  In Progress

