On Tue, Nov 07, 2017 at 07:11:56AM +0100, Florian Westphal wrote:
> Peter Zijlstra <pet...@infradead.org> wrote:
> > On Mon, Nov 06, 2017 at 11:51:07AM +0100, Florian Westphal wrote:
> > > @@ -180,6 +164,12 @@ int __rtnl_register(int protocol, int msgtype,
> > >           rcu_assign_pointer(rtnl_msg_handlers[protocol], tab);
> > >   }
> > >  
> > > + WARN_ON(tab[msgindex].owner && tab[msgindex].owner != owner);
> > > +
> > > + tab[msgindex].owner = owner;
> > > + /* make sure owner is always visible first */
> > > + smp_wmb();
> > > +
> > >   if (doit)
> > >           tab[msgindex].doit = doit;
> > >   if (dumpit)
> > 
> > > @@ -235,6 +279,9 @@ int rtnl_unregister(int protocol, int msgtype)
> > >   handlers[msgindex].doit = NULL;
> > >   handlers[msgindex].dumpit = NULL;
> > >   handlers[msgindex].flags = 0;
> > > + /* make sure we clear owner last */
> > > + smp_wmb();
> > > + handlers[msgindex].owner = NULL;
> > >   rtnl_unlock();
> > >  
> > >   return 0;
> > 
> > These wmb()'s don't make sense; and the comments are incomplete. What do
> > they pair with? Who cares about this ordering?
> 
> rtnetlink_rcv_msg:
> 
> 			dumpit = READ_ONCE(handlers[type].dumpit);
> 			if (!dumpit)
> 				goto err_unlock;
> 			owner = READ_ONCE(handlers[type].owner);

So what stops the CPU from hoisting this load before the dumpit load?
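
To make the pairing question concrete, here is a minimal userspace sketch
using C11 atomics as a stand-in for the kernel primitives. The release
fence plays roughly the role of smp_wmb() and the acquire fence roughly
that of smp_rmb(). The names handler_slot, slot_set(), slot_get() and
example_dumpit() are hypothetical and not kernel code. The point is only
that a write-side barrier orders nothing for a reader unless the reader
has a matching barrier between its two loads.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real rtnetlink types. */
typedef int (*dumpit_fn)(void);

struct handler_slot {
    _Atomic(const void *) owner;   /* stand-in for struct module * */
    _Atomic(dumpit_fn) dumpit;
};

/* Hypothetical callback, only here so callers have something to register. */
static int example_dumpit(void)
{
    return 0;
}

/* Writer: store owner, then a release fence, then dumpit. */
static void slot_set(struct handler_slot *s, const void *owner, dumpit_fn fn)
{
    atomic_store_explicit(&s->owner, owner, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);   /* smp_wmb() analogue */
    atomic_store_explicit(&s->dumpit, fn, memory_order_relaxed);
}

/*
 * Reader: load dumpit, then an acquire fence, then owner.
 * Without the fence the owner load may be satisfied before the
 * dumpit load, and the writer's ordering buys nothing.
 */
static int slot_get(struct handler_slot *s, const void **owner, dumpit_fn *fn)
{
    dumpit_fn d = atomic_load_explicit(&s->dumpit, memory_order_relaxed);

    if (!d)
        return -1;
    atomic_thread_fence(memory_order_acquire);   /* smp_rmb() analogue */
    *owner = atomic_load_explicit(&s->owner, memory_order_relaxed);
    *fn = d;
    return 0;
}
```

If slot_get() returns 0, the owner it reports is the one stored before
that dumpit value was written.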

> 		}
> ..
> 		if (!try_module_get(owner))
> 			err = -EPROTONOSUPPORT;
> 
> I don't want dumpit function address to be visible before owner.
> Does that make sense?

And no. That's insane; how can it ever observe an incomplete tab in the
first place?

The problem is that __rtnl_register() and rtnl_unregister() are broken.

__rtnl_register() publishes the tab before it initializes it; allowing
people to observe the thing incomplete.
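
A minimal userspace sketch of the initialize-then-publish order, again
with C11 atomics as a stand-in: the release store plays the role of
rcu_assign_pointer() and the acquire load that of rcu_dereference(). All
names (link_entry, register_entry(), lookup_entry(), example_dumpit())
are hypothetical. It handles only the first registration and models
neither rtnl_lock() nor RCU freeing.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the real rtnetlink types. */
typedef int (*dumpit_fn)(void);

struct link_entry {
    const void *owner;   /* stand-in for struct module * */
    dumpit_fn dumpit;
};

#define NR_MSGTYPES 4

/* The published table pointer; NULL until the first registration. */
static _Atomic(struct link_entry *) handlers;

/* Hypothetical callback used for illustration. */
static int example_dumpit(void)
{
    return 7;
}

/* Writer: allocate, fill in every field, and only then publish. */
static int register_entry(int idx, const void *owner, dumpit_fn dumpit)
{
    struct link_entry *tab;

    /* Sketch limitation: refuse to replace an already published table. */
    if (atomic_load_explicit(&handlers, memory_order_relaxed))
        return -1;

    tab = calloc(NR_MSGTYPES, sizeof(*tab));
    if (!tab)
        return -1;

    tab[idx].owner = owner;
    tab[idx].dumpit = dumpit;

    /* Release store: a reader that sees tab also sees the stores above. */
    atomic_store_explicit(&handlers, tab, memory_order_release);
    return 0;
}

/* Reader: one acquire load of the pointer, then plain field reads. */
static int lookup_entry(int idx, struct link_entry *out)
{
    struct link_entry *tab =
        atomic_load_explicit(&handlers, memory_order_acquire);

    if (!tab)
        return -1;
    *out = tab[idx];
    return 0;
}
```

With this order a reader sees either no table at all or a fully
initialized one, so no barriers are needed between the individual
fields.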

Also, are we required to hold rtnl_lock() across __rtnl_register()? I'd
hope so, otherwise what stops concurrent allocations and leaking of tab?

Also, rtnl_register() doesn't seem to employ rtnl_lock(), and it calls
panic() on failure. WTF?!

rtnl_unregister() should then RCU free the tab.

None of that is happening, so what is that RCU stuff supposed to do?
