This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:
apport-collect 1915403

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

** Changed in: linux (Ubuntu)
       Status: New => Incomplete

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1915403

Title:
  devlink: don't do reporter recovery if the state is healthy

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Hi,

  [Impact]
  Currently in focal, a device's reporter recovery runs even if the reporter state is healthy.

  [test case]
  1) display devlink health status

  # devlink health show pci/0000:05:00.0 reporter fw_fatal
  pci/0000:05:00.0:
    reporter fw_fatal
      state healthy error 0 recover 0 grace_period 1200000 auto_recover true

  2) perform reporter recovery using devlink

  # devlink health recover pci/0000:05:00.0 reporter fw_fatal

  3) see that recovery was performed

  # dmesg
  [776733.438708] mlx5_core 0000:05:00.0: mlx5_health_try_recover:316:(pid 563178): handling bad device here
  [776733.438717] mlx5_core 0000:05:00.0: mlx5_handle_bad_state:278:(pid 563178): Expected to see disabled NIC but it is full driver
  [776735.591522] mlx5_core 0000:05:00.0: mlx5_health_try_recover:328:(pid 563178): starting health recovery flow
  ...

  # devlink health show pci/0000:05:00.0 reporter fw_fatal
  pci/0000:05:00.0:
    reporter fw_fatal
      state healthy error 0 recover 1 grace_period 1200000 auto_recover true

  [fix]
  402818205c9e devlink: don't do reporter recovery if the state is healthy

  This upstream commit, from kernel v5.5-rc1, applies cleanly to the focal tree. It prevents reporter recovery when the device is in a healthy state.
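The test case above could be scripted roughly as follows. This is only a sketch: `field_after` and `recovery_was_skipped` are hypothetical helper names, not part of devlink, and the parsing assumes the `devlink health show` text layout quoted in this report.

```shell
#!/bin/sh
# Hypothetical helpers (not part of devlink): pull single fields out of the
# text printed by `devlink health show`, as quoted in the test case.

# field_after KEY TEXT: print the token that follows the exact word KEY.
field_after() {
    printf '%s\n' "$2" | awk -v key="$1" '
        { for (i = 1; i < NF; i++) if ($i == key) { print $(i + 1); exit } }'
}

# recovery_was_skipped BEFORE AFTER: succeed only if the reporter was healthy
# beforehand and its recover counter is unchanged afterwards.
recovery_was_skipped() {
    [ "$(field_after state "$1")" = "healthy" ] &&
        [ "$(field_after recover "$1")" = "$(field_after recover "$2")" ]
}

# Usage on a live system (assumes root and the device from this report):
#   dev=pci/0000:05:00.0
#   before=$(devlink health show "$dev" reporter fw_fatal)
#   devlink health recover "$dev" reporter fw_fatal
#   after=$(devlink health show "$dev" reporter fw_fatal)
#   recovery_was_skipped "$before" "$after" && echo "recovery skipped"
```

The helpers work on captured text only, so they can be tried against the outputs pasted in this report without touching a device.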
  When applied, issuing

  # devlink health recover pci/0000:05:00.0 reporter fw_fatal

  on a reporter in the healthy state returns successfully, but dmesg is clean and the recover counter does not change.

  [Regression Potential]
  Very small, as it is a very minor change.

  Thanks,
  Amir

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1915403/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp