Hi Jeff,

Upstream commit
50b2412b7e78 net/mlx5: Avoid possible free of command entry while timeout comp handler
was picked into the Ubuntu-5.4.0-56.62 kernel
(hash bcd6e98bef76cc8a49a1b736b0fefffbffb75c30)
(v5.4.71 upstream stable release,
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1902110 )

Now a new issue arises:
reloading the mlx5 modules produces an error message in the kernel log buffer:
"cmd_work_handler:887:(pid 292): failed to allocate command entry"

reproduction:
# modprobe -r mlx5_ib mlx5_core
# modprobe mlx5_core mlx5_ib
# dmesg
[  142.638490] mlx5_core 0000:08:00.1: E-Switch: cleanup
[  143.734339] mlx5_core 0000:08:00.0: E-Switch: cleanup
[  164.171511] mlx5_core: unknown parameter 'mlx5_ib' ignored
[  164.173501] mlx5_core 0000:08:00.0: firmware version: 16.28.1002
[  164.173576] mlx5_core 0000:08:00.0: 126.016 Gb/s available PCIe bandwidth (8 GT/s x16 link)
[  164.457342] mlx5_core 0000:08:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[  164.457365] mlx5_core 0000:08:00.0: E-Switch: Total vports 2, per vport: max uc(1024) max mc(16384)
[  164.484659] port_module: 5 callbacks suppressed
[  164.484665] mlx5_core 0000:08:00.0: Port module event: module 0, Cable plugged
[  164.485112] mlx5_core 0000:08:00.0: mlx5_pcie_event:294:(pid 8): PCIe slot advertised sufficient power (75W).
[  164.494771] mlx5_core 0000:08:00.1: firmware version: 16.28.1002
[  164.494844] mlx5_core 0000:08:00.1: 126.016 Gb/s available PCIe bandwidth (8 GT/s x16 link)
[  164.779534] mlx5_core 0000:08:00.1: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
[  164.779552] mlx5_core 0000:08:00.1: E-Switch: Total vports 2, per vport: max uc(1024) max mc(16384)
[  164.808886] mlx5_core 0000:08:00.1: Port module event: module 1, Cable plugged
[  164.809228] mlx5_core 0000:08:00.1: mlx5_pcie_event:294:(pid 292): PCIe slot advertised sufficient power (75W).
[  164.840667] mlx5_core 0000:08:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[  165.081342] mlx5_core 0000:08:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[  165.282793] mlx5_ib: Mellanox Connect-IB Infiniband driver v5.0-0
[  165.438226] mlx5_core 0000:08:00.0: cmd_work_handler:887:(pid 292): failed to allocate command entry
[  165.442506] infiniband rocep8s0f0: reg_mr_callback:104:(pid 292): async reg mr failed. status -11
#  
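
For anyone who wants to script the check, here is a minimal sketch. The
function name has_cmd_entry_failure is hypothetical and not part of any
tool; it only tests whether kernel log text on stdin contains the failure
message quoted above. Note that modprobe treats arguments after the first
module name as module parameters, which is why the log shows
"unknown parameter 'mlx5_ib' ignored"; the usage comment therefore assumes
modprobe -a to load both modules.

```shell
#!/bin/sh
# Hypothetical helper: exit status 0 if the kernel log text on stdin
# contains the command-entry allocation failure reported in this mail.
has_cmd_entry_failure() {
    # Join lines first, because pasted dmesg output may be wrapped
    # in the middle of the message.
    tr '\n' ' ' | grep -q 'failed *to allocate command entry'
}

# Example usage (needs root; assumes the same modules as in this report):
#   modprobe -r mlx5_ib mlx5_core
#   modprobe -a mlx5_core mlx5_ib
#   dmesg | has_cmd_entry_failure && echo "issue reproduced"
```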
 
The following commits fix this issue:
410bd754cd73 net/mlx5: Add retry mechanism to the command entry index allocation (upstream 5.9)
1d5558b1f0de net/mlx5: poll cmd EQ in case of command timeout (upstream 5.9)
d43b7007dbd1 net/mlx5: Fix a race when moving command interface to events mode (upstream 5.7-rc7)
3ed879965cc4 net/mlx5: Use async EQ setup cleanup helpers for multiple EQs (upstream 5.6-rc1)

They are already on the master-next branch of the focal tree, synced from linux stable
(v5.4.79 upstream stable release,
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1907151 )

# git log --oneline  Ubuntu-5.4.0-59.65..master-next
....
400ec5bb2816 net/mlx5: Add retry mechanism to the command entry index allocation
2bd608898edd net/mlx5: Fix a race when moving command interface to events mode
bec07c488db0 net/mlx5: poll cmd EQ in case of command timeout
0c9bfdf598e1 net/mlx5: Use async EQ setup cleanup helpers for multiple EQs
.....
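
Since the backported commits carry different hashes than the upstream
ones, a check by subject line is more robust than a check by hash. A
minimal sketch, assuming a hypothetical helper named missing_mlx5_fixes
that reads "git log --oneline" output on stdin and prints every expected
subject it does not find:

```shell
#!/bin/sh
# Hypothetical helper: reads `git log --oneline` output on stdin and
# prints each of the four fix subjects that is NOT present in it.
missing_mlx5_fixes() {
    log=$(cat)
    for subject in \
        'net/mlx5: Add retry mechanism to the command entry index allocation' \
        'net/mlx5: Fix a race when moving command interface to events mode' \
        'net/mlx5: poll cmd EQ in case of command timeout' \
        'net/mlx5: Use async EQ setup cleanup helpers for multiple EQs'
    do
        # -F matches the subject as a fixed string, not as a regex.
        printf '%s\n' "$log" | grep -qF -- "$subject" || printf '%s\n' "$subject"
    done
}

# Example usage, run inside the focal tree:
#   git log --oneline Ubuntu-5.4.0-59.65..master-next | missing_mlx5_fixes
```

Empty output means all four subjects were found in the given range.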

I compiled master-next, booted the system with it and the issue is
resolved.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1905574

Title:
  Ubuntu 20.10 four needed fixes to 'Add driver for Mellanox Connect-IB
  adapters'

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Focal:
  Won't Fix

Bug description:
  [Impact]

  commit
  d43b7007dbd1 net/mlx5: Fix a race when moving command interface to events mode
  from upstream v5.7-rc1 (and in groovy) fixes 
  e126ba97dba9 mlx5: Add driver for Mellanox Connect-IB adapters
  this fix should come with four more patches from v5.9.

  410bd754cd73 net/mlx5: Add retry mechanism to the command entry index allocation
  1d5558b1f0de net/mlx5: poll cmd EQ in case of command timeout
  50b2412b7e78 net/mlx5: Avoid possible free of command entry while timeout comp handler
  432161ea26d6 net/mlx5: Fix a race when moving command interface to polling mode

  All four patches apply cleanly on the groovy tree, and we ask that they
  be pulled into groovy.

  please also see this discussion
  https://www.spinics.net/lists/stable/msg428620.html
     
    
  Thanks

