On Thu, May 22, 2025 at 03:13:46PM -0700, Jakub Kicinski wrote:
> On Wed, 21 May 2025 03:25:03 -0700 Saurabh Sengar wrote:
> > The MANA driver's probe registers netdevice via the following call chain:
> >
> > mana_probe()
> > register_netdev()
> > register_netdevice()
> >
> > register_netdevice() calls notifier callback for netvsc driver,
> > holding the netdev mutex via netdev_lock_ops().
> >
> > Further this netvsc notifier callback end up attempting to acquire the
> > same lock again in dev_xdp_propagate() leading to deadlock.
> >
> > netvsc_netdev_event()
> > netvsc_vf_setxdp()
> > dev_xdp_propagate()
> >
> > This deadlock was not observed so far because net_shaper_ops was never set,
>
> The lock is on the VF, I think you meant to say that no device you use
> in Azure is ops locked?
>
> There's also the call to netvsc_register_vf() on probe path, please
> fix or explain why it doesn't need locking in the commit message.

This patch addresses only the netvsc_register_vf() path. I left
netvsc_register_vf() out of the commit message to keep the call chain
short. The full stack trace is below:

[ 92.542180] dev_xdp_propagate+0x2c/0x1b0
[ 92.542185] netvsc_vf_setxdp+0x10d/0x180 [hv_netvsc]
[ 92.542192] netvsc_register_vf.part.0+0x179/0x200 [hv_netvsc]
[ 92.542196] netvsc_netdev_event+0x267/0x340 [hv_netvsc]
[ 92.542200] notifier_call_chain+0x5f/0xc0
[ 92.542203] raw_notifier_call_chain+0x16/0x20
[ 92.542205] call_netdevice_notifiers_info+0x52/0xa0
[ 92.542209] register_netdevice+0x7c8/0xaa0
[ 92.542211] register_netdev+0x1f/0x40
[ 92.542214] mana_probe+0x6e2/0x8e0 [mana]
[ 92.542220] mana_gd_probe+0x187/0x220 [mana]
If you prefer, I can update the call chain in the commit message
From:
netvsc_netdev_event()
netvsc_vf_setxdp()
dev_xdp_propagate()
To:
netvsc_netdev_event()
netvsc_register_vf()
netvsc_vf_setxdp()
dev_xdp_propagate()
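
For illustration only, the self-deadlock pattern in the chain above can be
sketched in userspace. This is not kernel code: it assumes a POSIX
error-checking mutex as a stand-in for the netdev instance lock, so the
second lock attempt returns EDEADLK instead of hanging. The names
dev_lock, init_lock(), propagate(), notifier() and register_dev() are
hypothetical and only mirror the roles of dev_xdp_propagate(), the netvsc
notifier callback and register_netdevice().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the per-netdev instance lock. */
static pthread_mutex_t dev_lock;

/* Use an error-checking mutex so a relock by the owner is reported
 * as EDEADLK rather than blocking forever. */
static int init_lock(void)
{
	pthread_mutexattr_t attr;
	int err;

	err = pthread_mutexattr_init(&attr);
	if (err)
		return err;
	err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!err)
		err = pthread_mutex_init(&dev_lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return err;
}

/* Plays the role of dev_xdp_propagate(): takes the device lock itself. */
static int propagate(void)
{
	int err = pthread_mutex_lock(&dev_lock);

	if (err)
		return err;	/* EDEADLK when the caller already owns the lock */
	pthread_mutex_unlock(&dev_lock);
	return 0;
}

/* Plays the role of the notifier callback that ends up in propagate(). */
static int notifier(void)
{
	return propagate();
}

/* Plays the role of register_netdevice(): runs the notifier while
 * holding the device lock. */
static int register_dev(void)
{
	int err;

	err = pthread_mutex_lock(&dev_lock);
	if (err)
		return err;
	err = notifier();
	pthread_mutex_unlock(&dev_lock);
	return err;
}
```

After init_lock(), calling register_dev() reports EDEADLK because
propagate() is reached with the lock already held, while calling
propagate() on its own succeeds.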
- Saurabh
> --
> pw-bot: cr