Re: [Intel-wired-lan] [PATCH 5/5] idpf: skip stopping/opening vport if it is NULL during HW reset

2026-01-08 Thread Loktionov, Aleksandr


> -----Original Message-----
> From: Intel-wired-lan  On Behalf
> Of Loktionov, Aleksandr
> Sent: Friday, January 9, 2026 7:07 AM
> To: Li Li ; Nguyen, Anthony L
> ; Kitszel, Przemyslaw
> ; David S. Miller ;
> Jakub Kicinski ; Eric Dumazet ;
> [email protected]
> Cc: [email protected]; [email protected]; David
> Decotigny ; Singhai, Anjali
> ; Samudrala, Sridhar
> ; Brian Vazquez ;
> Tantilov, Emil S 
> Subject: Re: [Intel-wired-lan] [PATCH 5/5] idpf: skip stopping/opening
> vport if it is NULL during HW reset
> 
> 
> 
> > -----Original Message-----
> > From: Intel-wired-lan  On Behalf
> > Of Li Li via Intel-wired-lan
> > Sent: Wednesday, January 7, 2026 2:05 AM
> > To: Nguyen, Anthony L ; Kitszel,
> > Przemyslaw ; David S. Miller
> > ; Jakub Kicinski ; Eric Dumazet
> > ; [email protected]
> > Cc: [email protected]; [email protected]; David
> > Decotigny ; Singhai, Anjali
> > ; Samudrala, Sridhar
> > ; Brian Vazquez ; Li
> > Li ; Tantilov, Emil S 
> > Subject: [Intel-wired-lan] [PATCH 5/5] idpf: skip stopping/opening
> > vport if it is NULL during HW reset
> >
> > When an idpf HW reset is triggered, it clears the vport but does not
> > clear the netdev held by vport:
> >
> > // In idpf_vport_dealloc() called by idpf_init_hard_reset(),
> > // idpf_init_hard_reset() sets IDPF_HR_RESET_IN_PROG, so
> > // idpf_decfg_netdev() doesn't get called.
> > if (!test_bit(IDPF_HR_RESET_IN_PROG, adapter->flags))
> > idpf_decfg_netdev(vport);
> > // idpf_decfg_netdev() would clear netdev but it isn't called:
> > unregister_netdev(vport->netdev);
> > free_netdev(vport->netdev);
> > vport->netdev = NULL;
> > // Later in idpf_init_hard_reset(), the vport is cleared:
> > kfree(adapter->vports);
> > adapter->vports = NULL;
> >
> > During an idpf HW reset, when userspace restarts the network service,
> > the vport associated with the netdev is NULL, and so a kernel panic
> > would happen:
> >
> > [ 1791.669339] BUG: kernel NULL pointer dereference, address: 0070
> > ...
> > [ 1791.717130] RIP: 0010:idpf_vport_stop+0x16/0x1c0
> >
> > This can be reproduced reliably by injecting a TX timeout to cause an
> > idpf HW reset, and injecting a virtchnl error to cause the HW reset to
> > fail and retry, while running "service network restart" in userspace.
> >
> > With this patch applied, we see the following error but no kernel
> > panics anymore:
> >
> > [  181.409483] idpf :05:00.0 eth1: mtu not changed due to no vport innetdev
> > RTNETLINK answers: Bad address
> > ...
"innetdev" -> "in netdev"

> > [  181.913644] idpf :05:00.0 eth1: not stopping vport because it is NULL
> > [  181.938675] idpf :05:00.0 eth1: mtu not changed due to no vport in netdev
> > ...
> > [  242.849499] idpf :05:00.0 eth1: not opening vport because it is NULL
> > ...
> > [  304.289364] idpf :05:00.0 eth0: not opening vport because it is NULL
> >
> > Signed-off-by: Li Li 
> > ---
> >  drivers/net/ethernet/intel/idpf/idpf_lib.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
> > index 53b31989722a7..a9a556499262b 100644
> > --- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
> > +++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
> > @@ -1021,6 +1021,8 @@ static void idpf_vport_stop(struct idpf_vport *vport, bool rtnl)
> >   */
> >  static int idpf_stop(struct net_device *netdev)
> >  {
> > +   if (!netdev)
> > +   return 0;
> > struct idpf_netdev_priv *np = netdev_priv(netdev);
> > struct idpf_vport *vport;
> >
> > @@ -1029,9 +1031,14 @@ static int idpf_stop(struct net_device *netdev)
> >
> > idpf_vport_ctrl_lock(netdev);
> > vport = idpf_netdev_to_vport(netdev);
> > +   if (!vport) {
> > +   netdev_err(netdev, "not stopping vport because it is NULL");
> Please don't forget to add trailing '\n'.
> 
> > +   goto unlock;
> > +   }
> >
> > idpf_vport_stop(vport, false);
> >
> > +unlock:
> > idpf_vport_ctrl_unlock(netdev);
> >
> > return 0;
> > @@ -2301,6 +2308,11 @@ static int idpf_open(struct net_device *netdev)
> >
> > idpf_vport_ctrl_lock(netdev);
> > vport = idpf_netdev_to_vport(netdev);
> > +   if (!vport) {
> > +   netdev_err(netdev, "not opening vport because it is NULL");
> Please don't forget to add trailing '\n', here too.
> 
> > +   err = -EFAULT;
> > +   goto unlock;
> > +   }
> >
> > err = idpf_set_real_num_queues(vport);
> > if (err)
> > --
> > 2.52.0.351.gbe84eed79e-goog
> 
> Reviewed-by: Aleksandr Loktionov 



Re: [Intel-wired-lan] [PATCH 5/5] idpf: skip stopping/opening vport if it is NULL during HW reset

2026-01-08 Thread Loktionov, Aleksandr


> -----Original Message-----
> From: Intel-wired-lan  On Behalf
> Of Li Li via Intel-wired-lan
> Sent: Wednesday, January 7, 2026 2:05 AM
> To: Nguyen, Anthony L ; Kitszel,
> Przemyslaw ; David S. Miller
> ; Jakub Kicinski ; Eric
> Dumazet ; [email protected]
> Cc: [email protected]; [email protected]; David
> Decotigny ; Singhai, Anjali
> ; Samudrala, Sridhar
> ; Brian Vazquez ;
> Li Li ; Tantilov, Emil S
> 
> Subject: [Intel-wired-lan] [PATCH 5/5] idpf: skip stopping/opening
> vport if it is NULL during HW reset
> 
> When an idpf HW reset is triggered, it clears the vport but does not
> clear the netdev held by vport:
> 
> // In idpf_vport_dealloc() called by idpf_init_hard_reset(),
> // idpf_init_hard_reset() sets IDPF_HR_RESET_IN_PROG, so
> // idpf_decfg_netdev() doesn't get called.
> if (!test_bit(IDPF_HR_RESET_IN_PROG, adapter->flags))
> idpf_decfg_netdev(vport);
> // idpf_decfg_netdev() would clear netdev but it isn't called:
> unregister_netdev(vport->netdev);
> free_netdev(vport->netdev);
> vport->netdev = NULL;
> // Later in idpf_init_hard_reset(), the vport is cleared:
> kfree(adapter->vports);
> adapter->vports = NULL;
> 
> During an idpf HW reset, when userspace restarts the network
> service, the vport associated with the netdev is NULL, and so a
> kernel panic would happen:
> 
> [ 1791.669339] BUG: kernel NULL pointer dereference, address: 0070
> ...
> [ 1791.717130] RIP: 0010:idpf_vport_stop+0x16/0x1c0
> 
> This can be reproduced reliably by injecting a TX timeout to cause
> an idpf HW reset, and injecting a virtchnl error to cause the HW
> reset to fail and retry, while running "service network restart" in
> userspace.
> 
> With this patch applied, we see the following error but no kernel
> panics anymore:
> 
> [  181.409483] idpf :05:00.0 eth1: mtu not changed due to no vport innetdev
> RTNETLINK answers: Bad address
> ...
> [  181.913644] idpf :05:00.0 eth1: not stopping vport because it is NULL
> [  181.938675] idpf :05:00.0 eth1: mtu not changed due to no vport in netdev
> ...
> [  242.849499] idpf :05:00.0 eth1: not opening vport because it is NULL
> ...
> [  304.289364] idpf :05:00.0 eth0: not opening vport because it is NULL
> 
> Signed-off-by: Li Li 
> ---
>  drivers/net/ethernet/intel/idpf/idpf_lib.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
> index 53b31989722a7..a9a556499262b 100644
> --- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
> +++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
> @@ -1021,6 +1021,8 @@ static void idpf_vport_stop(struct idpf_vport *vport, bool rtnl)
>   */
>  static int idpf_stop(struct net_device *netdev)
>  {
> + if (!netdev)
> + return 0;
>   struct idpf_netdev_priv *np = netdev_priv(netdev);
>   struct idpf_vport *vport;
> 
> @@ -1029,9 +1031,14 @@ static int idpf_stop(struct net_device *netdev)
> 
>   idpf_vport_ctrl_lock(netdev);
>   vport = idpf_netdev_to_vport(netdev);
> + if (!vport) {
> + netdev_err(netdev, "not stopping vport because it is NULL");
Please don't forget to add trailing '\n'.

> + goto unlock;
> + }
> 
>   idpf_vport_stop(vport, false);
> 
> +unlock:
>   idpf_vport_ctrl_unlock(netdev);
> 
>   return 0;
> @@ -2301,6 +2308,11 @@ static int idpf_open(struct net_device *netdev)
> 
>   idpf_vport_ctrl_lock(netdev);
>   vport = idpf_netdev_to_vport(netdev);
> + if (!vport) {
> + netdev_err(netdev, "not opening vport because it is NULL");
Please don't forget to add trailing '\n', here too.

> + err = -EFAULT;
> + goto unlock;
> + }
> 
>   err = idpf_set_real_num_queues(vport);
>   if (err)
> --
> 2.52.0.351.gbe84eed79e-goog

Reviewed-by: Aleksandr Loktionov 
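
The two review comments above ask for a trailing '\n' on the new netdev_err() strings. As a rough user-space illustration only: model_log() and fmt_has_trailing_newline() below are hypothetical helpers, not kernel APIs, and the sketch does not model how the kernel stores log records. It only shows that a stream-style logger runs unterminated messages together, and how a format string can be checked for the terminator.

```c
/* Illustrative user-space sketch; model_log() is a hypothetical
 * stand-in and does not reproduce how the kernel stores log records. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if a log format string ends with '\n', else 0. */
static int fmt_has_trailing_newline(const char *fmt)
{
	size_t len = strlen(fmt);

	return len > 0 && fmt[len - 1] == '\n';
}

/* Appends "<dev>: <msg>" to buf, exactly as given (no newline added). */
static void model_log(char *buf, size_t cap, const char *dev, const char *msg)
{
	size_t used = strlen(buf);

	if (used < cap)
		snprintf(buf + used, cap - used, "%s: %s", dev, msg);
}

/* Compare the strings as posted with the strings the review asks for. */
static void newline_demo(void)
{
	char posted[256] = "";
	char fixed[256] = "";

	model_log(posted, sizeof(posted), "eth1", "not stopping vport because it is NULL");
	model_log(posted, sizeof(posted), "eth1", "not opening vport because it is NULL");
	assert(strchr(posted, '\n') == NULL);	/* both messages share one line */

	model_log(fixed, sizeof(fixed), "eth1", "not stopping vport because it is NULL\n");
	model_log(fixed, sizeof(fixed), "eth1", "not opening vport because it is NULL\n");
	assert(fmt_has_trailing_newline(fixed));	/* last message is terminated */
}
```

Calling newline_demo() from a test harness exercises both cases; in the patch itself the change would simply be appending "\n" to the two string literals.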



[Intel-wired-lan] [PATCH 5/5] idpf: skip stopping/opening vport if it is NULL during HW reset

2026-01-06 Thread Li Li via Intel-wired-lan

When an idpf HW reset is triggered, it clears the vport but does
not clear the netdev held by vport:

    // In idpf_vport_dealloc() called by idpf_init_hard_reset(),
    // idpf_init_hard_reset() sets IDPF_HR_RESET_IN_PROG, so
    // idpf_decfg_netdev() doesn't get called.
    if (!test_bit(IDPF_HR_RESET_IN_PROG, adapter->flags))
        idpf_decfg_netdev(vport);
    // idpf_decfg_netdev() would clear netdev but it isn't called:
    unregister_netdev(vport->netdev);
    free_netdev(vport->netdev);
    vport->netdev = NULL;
    // Later in idpf_init_hard_reset(), the vport is cleared:
    kfree(adapter->vports);
    adapter->vports = NULL;

During an idpf HW reset, when userspace restarts the network service,
the vport associated with the netdev is NULL, and so a kernel panic would
happen:

[ 1791.669339] BUG: kernel NULL pointer dereference, address: 0070
...
[ 1791.717130] RIP: 0010:idpf_vport_stop+0x16/0x1c0

This can be reproduced reliably by injecting a TX timeout to cause
an idpf HW reset, and injecting a virtchnl error to cause the HW
reset to fail and retry, while running "service network restart" in
userspace.

With this patch applied, we see the following error but no kernel
panics anymore:

[  181.409483] idpf :05:00.0 eth1: mtu not changed due to no vport innetdev
RTNETLINK answers: Bad address
...
[  181.913644] idpf :05:00.0 eth1: not stopping vport because it is NULL
[  181.938675] idpf :05:00.0 eth1: mtu not changed due to no vport in netdev
...
[  242.849499] idpf :05:00.0 eth1: not opening vport because it is NULL
...
[  304.289364] idpf :05:00.0 eth0: not opening vport because it is NULL

Signed-off-by: Li Li 
---
 drivers/net/ethernet/intel/idpf/idpf_lib.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
index 53b31989722a7..a9a556499262b 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
@@ -1021,6 +1021,8 @@ static void idpf_vport_stop(struct idpf_vport *vport, bool rtnl)
  */
 static int idpf_stop(struct net_device *netdev)
 {
+	if (!netdev)
+		return 0;
 	struct idpf_netdev_priv *np = netdev_priv(netdev);
 	struct idpf_vport *vport;
 
@@ -1029,9 +1031,14 @@ static int idpf_stop(struct net_device *netdev)
 
 	idpf_vport_ctrl_lock(netdev);
 	vport = idpf_netdev_to_vport(netdev);
+	if (!vport) {
+		netdev_err(netdev, "not stopping vport because it is NULL");
+		goto unlock;
+	}
 
 	idpf_vport_stop(vport, false);
 
+unlock:
 	idpf_vport_ctrl_unlock(netdev);
 
 	return 0;
@@ -2301,6 +2308,11 @@ static int idpf_open(struct net_device *netdev)
 
 	idpf_vport_ctrl_lock(netdev);
 	vport = idpf_netdev_to_vport(netdev);
+	if (!vport) {
+		netdev_err(netdev, "not opening vport because it is NULL");
+		err = -EFAULT;
+		goto unlock;
+	}
 
 	err = idpf_set_real_num_queues(vport);
 	if (err)
-- 
2.52.0.351.gbe84eed79e-goog
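
For readers following the control flow, the guard added by this patch can be sketched as a small user-space model. Everything below is an assumption made for illustration: struct model_netdev, struct model_vport and the model_*() functions are hypothetical stand-ins, not the idpf driver's types or helpers, and the lock is reduced to a counter. The sketch only shows the shape of the change: take the control lock, check for a missing vport, log, and leave through a single unlock label so the lock is always released.

```c
/* User-space model of the NULL-vport guard; all names below are
 * hypothetical stand-ins, not the idpf driver's real types. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

struct model_vport { int up; };

struct model_netdev {
	const char *name;
	struct model_vport *vport;	/* NULL while a reset is in progress */
	int lock_depth;			/* models the vport control lock */
};

static void model_lock(struct model_netdev *nd)   { nd->lock_depth++; }
static void model_unlock(struct model_netdev *nd) { nd->lock_depth--; }

/* Models the patched stop path: nothing to stop if the vport is gone. */
static int model_stop(struct model_netdev *nd)
{
	model_lock(nd);
	if (!nd->vport) {
		fprintf(stderr, "%s: not stopping vport because it is NULL\n", nd->name);
		goto unlock;
	}
	nd->vport->up = 0;
unlock:
	model_unlock(nd);
	return 0;
}

/* Models the patched open path: fail with -EFAULT instead of
 * dereferencing a NULL vport. */
static int model_open(struct model_netdev *nd)
{
	int err = 0;

	model_lock(nd);
	if (!nd->vport) {
		fprintf(stderr, "%s: not opening vport because it is NULL\n", nd->name);
		err = -EFAULT;
		goto unlock;
	}
	nd->vport->up = 1;
unlock:
	model_unlock(nd);
	return err;
}

/* Exercise both paths: vport missing (mid-reset) and vport present. */
static void model_demo(void)
{
	struct model_vport vp = { .up = 0 };
	struct model_netdev nd = { .name = "eth1", .vport = NULL, .lock_depth = 0 };

	assert(model_open(&nd) == -EFAULT);
	assert(model_stop(&nd) == 0);
	assert(nd.lock_depth == 0);	/* every early exit released the lock */

	nd.vport = &vp;			/* reset finished, vport is back */
	assert(model_open(&nd) == 0 && vp.up == 1);
	assert(model_stop(&nd) == 0 && vp.up == 0);
}
```

The -EFAULT return in the open path matches the "RTNETLINK answers: Bad address" line in the log above, while the stop path returns 0 because there is nothing left to tear down.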