On Thu, May 22, 2025 at 03:13:46PM -0700, Jakub Kicinski wrote:
> On Wed, 21 May 2025 03:25:03 -0700 Saurabh Sengar wrote:
> > The MANA driver's probe registers netdevice via the following call chain:
> > 
> > mana_probe()
> >   register_netdev()
> >     register_netdevice()
> > 
> > register_netdevice() calls notifier callback for netvsc driver,
> > holding the netdev mutex via netdev_lock_ops().
> > 
> > Further this netvsc notifier callback end up attempting to acquire the
> > same lock again in dev_xdp_propagate() leading to deadlock.
> > 
> > netvsc_netdev_event()
> >   netvsc_vf_setxdp()
> >     dev_xdp_propagate()
> > 
> > This deadlock was not observed so far because net_shaper_ops was never set,
> 
> The lock is on the VF, I think you meant to say that no device you use
> in Azure is ops locked?
> 
> There's also the call to netvsc_register_vf() on probe path, please
> fix or explain why it doesn't need locking in the commit message.

This patch addresses only the netvsc_register_vf() path.
I left netvsc_register_vf() out of the call chain in the commit message
to keep it short. The full stack trace is below:

[   92.542180]  dev_xdp_propagate+0x2c/0x1b0
[   92.542185]  netvsc_vf_setxdp+0x10d/0x180 [hv_netvsc]
[   92.542192]  netvsc_register_vf.part.0+0x179/0x200 [hv_netvsc]
[   92.542196]  netvsc_netdev_event+0x267/0x340 [hv_netvsc]
[   92.542200]  notifier_call_chain+0x5f/0xc0
[   92.542203]  raw_notifier_call_chain+0x16/0x20
[   92.542205]  call_netdevice_notifiers_info+0x52/0xa0
[   92.542209]  register_netdevice+0x7c8/0xaa0
[   92.542211]  register_netdev+0x1f/0x40
[   92.542214]  mana_probe+0x6e2/0x8e0 [mana]
[   92.542220]  mana_gd_probe+0x187/0x220 [mana]

If you prefer, I can update the call chain in the commit message.
From:

netvsc_netdev_event()
  netvsc_vf_setxdp()
    dev_xdp_propagate()

To:

netvsc_netdev_event()
  netvsc_register_vf()
    netvsc_vf_setxdp()
      dev_xdp_propagate()

- Saurabh

> -- 
> pw-bot: cr
