> -----Original Message-----
> From: Intel-wired-lan <[email protected]> On Behalf
> Of Petr Oros
> Sent: Monday, February 2, 2026 9:48 AM
> To: [email protected]
> Cc: Vecera, Ivan <[email protected]>; Kitszel, Przemyslaw
> <[email protected]>; Eric Dumazet <[email protected]>;
> [email protected]; Andrew Lunn <[email protected]>;
> Stanislav Fomichev <[email protected]>; Nguyen, Anthony L
> <[email protected]>; [email protected];
> Jakub Kicinski <[email protected]>; Paolo Abeni <[email protected]>;
> David S. Miller <[email protected]>
> Subject: [Intel-wired-lan] [PATCH net] iavf: fix deadlock in reset
> handling
> 
> Three driver callbacks schedule a reset and wait for its completion:
> ndo_change_mtu(), ethtool set_ringparam(), and ethtool
> set_channels().
> 
> Waiting for reset in ndo_change_mtu() and set_ringparam() was added
> by commit c2ed2403f12c74 ("iavf: Wait for reset in callbacks which
> trigger
> it") to fix a race condition where adding an interface to bonding
> immediately after MTU or ring parameter change failed because the
> interface was still in __RESETTING state. The same commit also added
> waiting in iavf_set_priv_flags(), which was later removed by commit
> 53844673d55529 ("iavf: kill "legacy-rx" for good").
> 
> Waiting in set_channels() was introduced earlier by commit
> 4e5e6b5d9d13
> ("iavf: Fix return of set the new channel count") to ensure the PF
> has enough time to complete the VF reset when changing channel
> count, and to return correct error codes to userspace.
> 
> Commit ef490bbb226702 ("iavf: Add net_shaper_ops support") added
> net_shaper_ops to iavf, which required reset_task to use _locked
> NAPI variants (napi_enable_locked, napi_disable_locked) that need
> the netdev instance lock.
> 
> Later, commit 7e4d784f5810 ("net: hold netdev instance lock during
> rtnetlink operations") and commit 2bcf4772e45adb ("net: ethtool: try
> to protect all callback with netdev instance lock") started holding
> the netdev instance lock during ndo and ethtool callbacks for
> drivers with net_shaper_ops.
> 
> The combination of waiting for reset and the new locking
> requirements creates a deadlock: the callback holds the lock and
> waits for reset_task, but reset_task is blocked waiting for the same
> lock:
> 
>   Thread 1 (callback)               Thread 2 (reset_task)
>   -------------------               ---------------------
>   netdev_lock()
>   ndo_change_mtu() or ethtool op
>     iavf_schedule_reset()
>     iavf_wait_for_reset()           iavf_reset_task()
>       waiting...                      netdev_lock() <- DEADLOCK
> 
> Reproducer:
> 
>   # echo 16 > /sys/class/net/$PF/device/sriov_numvfs
>   # ip link set $VF up
>   # ip link set $VF mtu 5000
>   RTNETLINK answers: Device or resource busy
> 
>   # dmesg | tail -1
>   iavf: MTU change interrupted waiting for reset
> 
> Fix this by temporarily releasing the lock while waiting for reset
> to complete. This follows the pattern used elsewhere in the kernel
> (e.g.,
> do_set_master() releases rtnl_lock before calling ndo_add_slave()).
> 
> Fixes: 7e4d784f5810 ("net: hold netdev instance lock during
> rtnetlink operations")
> Signed-off-by: Petr Oros <[email protected]>
> ---
>  drivers/net/ethernet/intel/iavf/iavf_main.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c
> b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index 8aa6e92c16431f..d7738fb8fa60bc 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> @@ -189,13 +189,22 @@ static bool iavf_is_reset_in_progress(struct
> iavf_adapter *adapter)
>   * iavf_wait_for_reset - Wait for reset to finish.
>   * @adapter: board private structure
>   *
> + * The iavf driver selects NET_SHAPER, so callbacks that trigger
> reset
> + are
> + * always called with netdev instance lock held, while reset_task
> also
> + needs
> + * this lock. Release the lock while waiting to avoid deadlock.
> + *
>   * Returns 0 if reset finished successfully, negative on timeout or
> interrupt.
>   */
>  int iavf_wait_for_reset(struct iavf_adapter *adapter)  {
> -     int ret = wait_event_interruptible_timeout(adapter-
> >reset_waitqueue,
> -
>       !iavf_is_reset_in_progress(adapter),
> -                                     msecs_to_jiffies(5000));
> +     struct net_device *netdev = adapter->netdev;
> +     int ret;
> +
> +     netdev_unlock(netdev);
> +     ret = wait_event_interruptible_timeout(adapter-
> >reset_waitqueue,
> +
> !iavf_is_reset_in_progress(adapter),
> +                                            msecs_to_jiffies(5000));
> +     netdev_lock(netdev);
> 
>       /* If ret < 0 then it means wait was interrupted.
>        * If ret == 0 then it means we got a timeout while waiting
> --
> 2.52.0

Reviewed-by: Aleksandr Loktionov <[email protected]>

Reply via email to