This bug is missing log files that will aid in diagnosing the problem.
While running an Ubuntu kernel (not a mainline or third-party kernel),
please enter the following command in a terminal window:

apport-collect 1915403

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable
to run this command, please add a comment stating that fact and change
the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the
Ubuntu Kernel Team.

** Changed in: linux (Ubuntu)
       Status: New => Incomplete

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1915403

Title:
  devlink: don't do reporter recovery if the state is healthy

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Hi,

  [Impact]
  Currently in focal, devlink performs reporter recovery on a device even
  if the reporter state is healthy.

  [Test Case]

  1) Display the devlink health status:
  # devlink health show  pci/0000:05:00.0 reporter fw_fatal
  pci/0000:05:00.0:
    reporter fw_fatal
      state healthy error 0 recover 0 grace_period 1200000 auto_recover true
  2) Perform reporter recovery using devlink:
  # devlink health recover pci/0000:05:00.0 reporter fw_fatal

  3) See that recovery was performed:
  # dmesg
  [776733.438708] mlx5_core 0000:05:00.0: mlx5_health_try_recover:316:(pid 563178): handling bad device here
  [776733.438717] mlx5_core 0000:05:00.0: mlx5_handle_bad_state:278:(pid 563178): Expected to see disabled NIC but it is full driver
  [776735.591522] mlx5_core 0000:05:00.0: mlx5_health_try_recover:328:(pid 563178): starting health recovery flow
  ...
  # devlink health show  pci/0000:05:00.0 reporter fw_fatal
  pci/0000:05:00.0:
    reporter fw_fatal
      state healthy error 0 recover 1 grace_period 1200000 auto_recover true
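
  The before/after comparison in the test case can also be checked
  mechanically. The following is a minimal Python sketch, assuming the
  plain-text output layout shown above; the helper name parse_reporter is
  hypothetical and is not part of devlink.

```python
def parse_reporter(output: str) -> dict:
    """Parse the 'state ...' line of `devlink health show` text output.

    Hypothetical helper; assumes the key/value layout shown in this report.
    """
    for line in output.splitlines():
        tokens = line.split()
        if tokens and tokens[0] == "state":
            # Tokens alternate key, value: state healthy error 0 recover 0 ...
            return dict(zip(tokens[0::2], tokens[1::2]))
    raise ValueError("no reporter state line found")


# Sample outputs copied from steps 1 and 3 of the test case.
before = parse_reporter("""pci/0000:05:00.0:
  reporter fw_fatal
    state healthy error 0 recover 0 grace_period 1200000 auto_recover true
""")
after = parse_reporter("""pci/0000:05:00.0:
  reporter fw_fatal
    state healthy error 0 recover 1 grace_period 1200000 auto_recover true
""")

# The bug is visible when the counter grows although the state was healthy.
recovery_ran = int(after["recover"]) > int(before["recover"])
print(before["state"], recovery_ran)  # healthy True
```

  On a kernel with the fix, the same comparison would be expected to show
  an unchanged counter.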

  [Fix]
  402818205c9e devlink: don't do reporter recovery if the state is healthy
  This upstream commit, from kernel v5.5-rc1, applies cleanly to the focal
  tree.
  The commit prevents reporter recovery when the device is in a healthy
  state.
  With it applied, issuing
  # devlink health recover pci/0000:05:00.0 reporter fw_fatal
  on a reporter in the healthy state returns successfully, but dmesg stays
  clean and the recover counter does not change.
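
  The behaviour change can be illustrated with a toy model. This is a
  Python sketch, not the kernel's C code: the Reporter class, its fields,
  and the skip_if_healthy flag are hypothetical, and only the early
  successful return for a healthy reporter reflects what the commit
  description states.

```python
from dataclasses import dataclass


@dataclass
class Reporter:
    """Toy stand-in for a devlink health reporter (hypothetical, not kernel code)."""
    state: str = "healthy"
    recover_count: int = 0


def recover(reporter: Reporter, skip_if_healthy: bool) -> int:
    """Return 0 on success, mirroring the command's successful exit."""
    if skip_if_healthy and reporter.state == "healthy":
        # Behaviour with the fix: report success without running recovery.
        return 0
    # Behaviour without the fix: recovery runs regardless of state.
    reporter.recover_count += 1
    reporter.state = "healthy"
    return 0


unpatched = Reporter()
patched = Reporter()
recover(unpatched, skip_if_healthy=False)
recover(patched, skip_if_healthy=True)
print(unpatched.recover_count, patched.recover_count)  # 1 0
```

  A reporter in a non-healthy state still goes through recovery in both
  variants of the model.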

  [Regression Potential]
  Very small, as the change is minor: it only skips recovery for a
  reporter that is already in the healthy state.

  Thanks,
  Amir
