>-----Original Message-----
>From: Kubiak, Michal <michal.kub...@intel.com>
>Sent: Monday, June 24, 2024 8:42 PM
>To: intel-wired-...@lists.osuosl.org
>Cc: Fijalkowski, Maciej <maciej.fijalkow...@intel.com>;
>net...@vger.kernel.org; Drewek, Wojciech <wojciech.dre...@intel.com>;
>Kuruvinakunnel, George <george.kuruvinakun...@intel.com>; Keller, Jacob E
><jacob.e.kel...@intel.com>; k...@kernel.org; Kubiak, Michal
><michal.kub...@intel.com>
>Subject: [PATCH iwl-net v2] i40e: Fix XDP program unloading while removing
>the driver
>
>The commit 6533e558c650 ("i40e: Fix reset path while removing the driver")
>introduced a new PF state "__I40E_IN_REMOVE" to block modifying the XDP
>program while the driver is being removed.
>Unfortunately, such a change is useful only if the ".ndo_bpf()"
>callback was called out of the rmmod context because unloading the existing
>XDP program is also a part of driver removing procedure.
>In other words, from the rmmod context the driver is expected to unload the
>XDP program without reporting any errors. Otherwise, the kernel warning with
>callstack is printed out to dmesg.
>
>Example failing scenario:
> 1. Load the i40e driver.
> 2. Load the XDP program.
> 3. Unload the i40e driver (using "rmmod" command).
>
>The example kernel warning log:
>
>[  +0.004646] WARNING: CPU: 94 PID: 10395 at net/core/dev.c:9290
>unregister_netdevice_many_notify+0x7a9/0x870
>[...]
>[  +0.010959] RIP: 0010:unregister_netdevice_many_notify+0x7a9/0x870
>[...]
>[  +0.002726] Call Trace:
>[  +0.002457]  <TASK>
>[  +0.002119]  ? __warn+0x80/0x120
>[  +0.003245]  ? unregister_netdevice_many_notify+0x7a9/0x870
>[  +0.005586]  ? report_bug+0x164/0x190
>[  +0.003678]  ? handle_bug+0x3c/0x80
>[  +0.003503]  ? exc_invalid_op+0x17/0x70 [  +0.003846]  ?
>asm_exc_invalid_op+0x1a/0x20 [  +0.004200]  ?
>unregister_netdevice_many_notify+0x7a9/0x870
>[  +0.005579]  ? unregister_netdevice_many_notify+0x3cc/0x870
>[  +0.005586]  unregister_netdevice_queue+0xf7/0x140
>[  +0.004806]  unregister_netdev+0x1c/0x30 [  +0.003933]
>i40e_vsi_release+0x87/0x2f0 [i40e] [  +0.004604]  i40e_remove+0x1a1/0x420
>[i40e] [  +0.004220]  pci_device_remove+0x3f/0xb0 [  +0.003943]
>device_release_driver_internal+0x19f/0x200
>[  +0.005243]  driver_detach+0x48/0x90
>[  +0.003586]  bus_remove_driver+0x6d/0xf0 [  +0.003939]
>pci_unregister_driver+0x2e/0xb0 [  +0.004278]  i40e_exit_module+0x10/0x5f0
>[i40e] [  +0.004570]  __do_sys_delete_module.isra.0+0x197/0x310
>[  +0.005153]  do_syscall_64+0x85/0x170
>[  +0.003684]  ? syscall_exit_to_user_mode+0x69/0x220
>[  +0.004886]  ? do_syscall_64+0x95/0x170 [  +0.003851]  ?
>exc_page_fault+0x7e/0x180 [  +0.003932]
>entry_SYSCALL_64_after_hwframe+0x71/0x79
>[  +0.005064] RIP: 0033:0x7f59dc9347cb
>[  +0.003648] Code: 73 01 c3 48 8b 0d 65 16 0c 00 f7 d8 64 89 01 48 83
>c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f
>05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 35 16 0c 00 f7 d8 64 89 01 48 [
>+0.018753] RSP: 002b:00007ffffac99048 EFLAGS: 00000206 ORIG_RAX:
>00000000000000b0 [  +0.007577] RAX: ffffffffffffffda RBX:
>0000559b9bb2f6e0 RCX: 00007f59dc9347cb [  +0.007140] RDX:
>0000000000000000 RSI: 0000000000000800 RDI: 0000559b9bb2f748 [
>+0.007146] RBP: 00007ffffac99070 R08: 1999999999999999 R09:
>0000000000000000 [  +0.007133] R10: 00007f59dc9a5ac0 R11:
>0000000000000206 R12: 0000000000000000 [  +0.007141] R13:
>00007ffffac992d8 R14: 0000559b9bb2f6e0 R15: 0000000000000000 [
>+0.007151]  </TASK> [  +0.002204] ---[ end trace 0000000000000000 ]---
>
>Fix this by checking if the XDP program is being loaded or unloaded.
>Then, block only loading a new program while "__I40E_IN_REMOVE" is set.
>Also, move testing "__I40E_IN_REMOVE" flag to the beginning of XDP_SETUP
>callback to avoid unnecessary operations and checks.
>
>Fixes: 6533e558c650 ("i40e: Fix reset path while removing the driver")
>Signed-off-by: Michal Kubiak <michal.kub...@intel.com>
>
>---
>
>v1 -> v2 changes:
> - simplify the fix according to Kuba's suggestions, i.e. remove
>   checking 'NETREG_UNREGISTERING' flag directly from ndo_bpf
>   and remove a separate handling for 'unregister' context.
> - update the commit message accordingly
> - add an example of the kernel warning for the issue being fixed.
>
>See:
>v1: <https://lore.kernel.org/netdev/20240528-net-2024-05-28-intel-net-
>fixes-v1-4-dc8593d2b...@intel.com/>
>---
> drivers/net/ethernet/intel/i40e/i40e_main.c | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)
>

Tested-by: Chandan Kumar Rout <chandanx.r...@intel.com> (A Contingent Worker at 
Intel)

Reply via email to