** Tags added: kernel-daily-bug

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2121858

Title:
  [Ubuntu 22.04.5 BUG] Ubuntu 22.04.5 OS kernel logging not showing the
  TLP header properly.

Status in linux package in Ubuntu:
  New

Bug description:
  On a Dell PowerEdge system running Ubuntu 22.04.5 with a Mellanox
  network card, we inject a MalfTLP error by using the setpci tool to
  set the MPS value on the EP smaller than the MPS on the RC/Switch.
  The OS kernel log then shows the first 3 bytes of TLP header DW0,
  DW1, and DW2 zeroed out.

  However, we noticed that the TLP headers in the SEL and lspci logs are
  identical.

  
  Steps to Reproduce: -

  1. Install Ubuntu 22.04.5 OS.
  2. Inject a MalfTLP error, for example by setting the MPS value on the
  EP smaller than the MPS on the RC/Switch, using the following commands.

  Check the Original MPS value using:

  sh# setpci -s 1b:00.0 68.b

  If the MPS value is 3f, change it to 1f using one of the following
  commands.

  sh# setpci -s 1b:. 68.w=591f

  or

  sh# setpci -s 1b:. 68.b=1f

  to change the MPS from 256 to 128 bytes.

  3. Check the SEL log, the lspci -vvv output, and the OS kernel log.

  4. The dmesg log shows the following error.

  
------------------------------------------------------------------------------------------

  sh# dmesg | grep -i "Hardware Error"

  [526427.538510] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 5
  [526427.546890] {1}[Hardware Error]: event severity: recoverable
  [526427.552654] {1}[Hardware Error]:  Error 0, type: fatal
  [526427.557898] {1}[Hardware Error]:   section_type: PCIe error
  [526427.563572] {1}[Hardware Error]:   port_type: 0, PCIe end point
  [526427.569597] {1}[Hardware Error]:   version: 3.0
  [526427.574232] {1}[Hardware Error]:   command: 0x0406, status: 0x0010
  [526427.580517] {1}[Hardware Error]:   device_id: 0000:1b:00.0
  [526427.586106] {1}[Hardware Error]:   slot: 40
  [526427.590396] {1}[Hardware Error]:   secondary_bus: 0x00
  [526427.595641] {1}[Hardware Error]:   vendor_id: 0x15b3, device_id: 0x101d
  [526427.602358] {1}[Hardware Error]:   class_code: 020000
  [526427.607516] {1}[Hardware Error]:   aer_uncor_status: 0x00040000, aer_uncor_mask: 0x00010000
  [526427.615968] {1}[Hardware Error]:   aer_uncor_severity: 0x004ef010
  [526427.622172] {1}[Hardware Error]:   TLP Header: 00000040 00000000 00000000 00000000

  
-----------------------------------------------------------------------------------------------------
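
  For readers who want to check the register values quoted in step 2 and
  in the log above, the following is a minimal Python sketch, not part of
  the original report. It assumes that offset 0x68 on this device is the
  PCIe Device Control register (MPS in bits 7:5) and that the AER bit
  positions follow the standard Uncorrectable Error Status layout. The
  names mps_bytes and decode_aer_uncor are hypothetical helpers.

```python
# Assumption: bit positions follow the standard PCIe AER
# Uncorrectable Error Status register layout.
AER_UNCOR_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request",
    21: "ACS Violation",
    22: "Uncorrectable Internal Error",
}


def decode_aer_uncor(value):
    """Return the names of the known error bits set in an AER uncorrectable register value."""
    return [name for bit, name in sorted(AER_UNCOR_BITS.items())
            if value & (1 << bit)]


def mps_bytes(devctl_low_byte):
    """Return Max Payload Size in bytes from the low byte of Device Control.

    Assumption: MPS is encoded in bits 7:5, and the size is 128 << field.
    """
    return 128 << ((devctl_low_byte >> 5) & 0x7)


if __name__ == "__main__":
    # Values taken from the setpci commands in step 2.
    print(hex(0x3f), "->", mps_bytes(0x3f), "bytes")
    print(hex(0x1f), "->", mps_bytes(0x1f), "bytes")
    # Values taken from the dmesg output above.
    print("aer_uncor_status:", decode_aer_uncor(0x00040000))
    print("aer_uncor_mask:  ", decode_aer_uncor(0x00010000))
    print("fatal if set:    ", decode_aer_uncor(0x004ef010))
```

  Under these assumptions, 3f decodes to 256 bytes and 1f to 128 bytes,
  and the reported aer_uncor_status of 0x00040000 corresponds to the
  Malformed TLP bit, which is also set in aer_uncor_severity.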

   
  Expected Results: - The TLP header in the OS kernel log should be
  identical to the one listed in the SEL and in the output of
  lspci -s 1b:00.0 -vvv.

  Actual Results: - The first three bytes of DW0, DW1 and DW2 are zeroed
  out in the OS kernel log.
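
  The comparison described in the expected and actual results above could
  be automated along the following lines. This is only a sketch: it
  assumes the kernel prints the header after "TLP Header:" as four 8-digit
  hex words (as in the log above) and that lspci -vvv prints it after
  "HeaderLog:" in the same form. The function names are hypothetical, and
  no lspci output is shown here because none was included in the report.

```python
import re

# Matches four 8-digit hex dwords after either label. The "HeaderLog:"
# label for lspci output is an assumption, not taken from the report.
HEADER_RE = re.compile(
    r"(?:TLP Header|HeaderLog):\s*((?:[0-9a-fA-F]{8}\s+){3}[0-9a-fA-F]{8})"
)


def tlp_header_dwords(text):
    """Return the four TLP header dwords found in text, or None if absent."""
    match = HEADER_RE.search(text)
    if match is None:
        return None
    return tuple(int(dw, 16) for dw in match.group(1).split())


def headers_match(dmesg_text, lspci_text):
    """True only if both texts contain a header and the dwords are equal."""
    kernel = tlp_header_dwords(dmesg_text)
    device = tlp_header_dwords(lspci_text)
    return kernel is not None and kernel == device


# The TLP Header line from the dmesg output in this report.
dmesg_line = ("[526427.622172] {1}[Hardware Error]:   TLP Header: "
              "00000040 00000000 00000000 00000000")

if __name__ == "__main__":
    print([f"{dw:08x}" for dw in tlp_header_dwords(dmesg_line)])
```

  Passing the matching line from lspci -s 1b:00.0 -vvv as the second
  argument of headers_match would then flag the mismatch reported here.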

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2121858/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
