On Sat, Apr 11, 2020 at 11:54:04PM +1000, David Gwynne wrote:
> On Sat, Apr 11, 2020 at 01:43:20PM +0000, Visa Hankala wrote:
> > On Sat, Apr 11, 2020 at 11:09:54PM +1000, David Gwynne wrote:
> > > On Sat, Apr 11, 2020 at 03:21:49AM +0000, Visa Hankala wrote:
> > > > On Fri, Apr 10, 2020 at 01:30:47PM -0600, Theo de Raadt wrote:
> > > > > Why did it take almost a year to find this?
> > > > > 
> > > > > Or is this bug due to ioctl(2) becoming UNLOCKED on 2020/02/22?
> > > > 
> > > > This is not related to ioctl(2) becoming UNLOCKED. Lower-layer ioctl
> > > > code, soo_ioctl() included, lock the kernel when needed. However, most
> > > > .if_ioctl backends need NET_LOCK() in addition to KERNEL_LOCK(). In
> > > > most cases, that is satisfied by ifioctl() which acquires the lock
> > > > before invoking .if_ioctl(). bridge_ioctl() nullifies this by
> > > > releasing NET_LOCK().
> > > 
> > > yes.
> > > 
> > > i came up with the following diff before i read the thread here. it's
> > > largely identical to what you (visa) already came up with, but it adds
> > > some extra checks to ifpromisc based on the doco in around struct ifnet
> > > members in src/sys/net/if_var.h. i audited the rest of the ifpromisc
> > > calls and found another one in if_aggr that i was able to trigger.
> > > 
> > > i think the only other call to ifpromisc outside src/sys/net is in carp,
> > > and i managed to convinced myself that all those calls hold NET_LOCK
> > > already.
> > > 
> > > Index: if.c
> > > ===================================================================
> > > RCS file: /cvs/src/sys/net/if.c,v
> > > retrieving revision 1.601
> > > diff -u -p -r1.601 if.c
> > > --- if.c  10 Mar 2020 09:11:55 -0000      1.601
> > > +++ if.c  11 Apr 2020 13:08:46 -0000
> > > @@ -3031,7 +3031,9 @@ ifpromisc(struct ifnet *ifp, int pswitch
> > >   unsigned short oif_flags;
> > >   int oif_pcount, error;
> > >  
> > > + NET_ASSERT_LOCKED(); /* modifying if_flags */
> > >   oif_flags = ifp->if_flags;
> > > + KERNEL_ASSERT_LOCKED(); /* modifying if_pcount */
> > >   oif_pcount = ifp->if_pcount;
> > >   if (pswitch) {
> > >           if (ifp->if_pcount++ != 0)
> > > Index: if_aggr.c
> > > ===================================================================
> > > RCS file: /cvs/src/sys/net/if_aggr.c,v
> > > retrieving revision 1.28
> > > diff -u -p -r1.28 if_aggr.c
> > > --- if_aggr.c     11 Mar 2020 07:01:42 -0000      1.28
> > > +++ if_aggr.c     11 Apr 2020 13:08:46 -0000
> > > @@ -589,8 +589,10 @@ aggr_clone_destroy(struct ifnet *ifp)
> > >   if_detach(ifp);
> > >  
> > >   /* last ref, no need to lock. aggr_p_dtor locks anyway */
> > > + NET_LOCK();
> > >   while ((p = TAILQ_FIRST(&sc->sc_ports)) != NULL)
> > >           aggr_p_dtor(sc, p, "destroy");
> > > + NET_UNLOCK();
> > >  
> > >   free(sc, M_DEVBUF, sizeof(*sc));
> > >  
> > > Index: if_bridge.c
> > > ===================================================================
> > > RCS file: /cvs/src/sys/net/if_bridge.c,v
> > > retrieving revision 1.338
> > > diff -u -p -r1.338 if_bridge.c
> > > --- if_bridge.c   6 Nov 2019 03:51:26 -0000       1.338
> > > +++ if_bridge.c   11 Apr 2020 13:08:46 -0000
> > > @@ -313,7 +313,9 @@ bridge_ioctl(struct ifnet *ifp, u_long c
> > >                   break;
> > >           }
> > >  
> > > +         NET_LOCK();
> > >           error = ifpromisc(ifs, 1);
> > > +         NET_UNLOCK();
> > >           if (error != 0) {
> > >                   free(bif, M_DEVBUF, sizeof(*bif));
> > >                   break;
> > > @@ -558,7 +560,9 @@ bridge_ifremove(struct bridge_iflist *bi
> > >   }
> > >  
> > >   bif->ifp->if_bridgeidx = 0;
> > > + NET_LOCK();
> > >   error = ifpromisc(bif->ifp, 0);
> > > + NET_UNLOCK();
> > >  
> > >   bridge_rtdelete(sc, bif->ifp, 0);
> > >   bridge_flushrule(bif);
> > > Index: if_tpmr.c
> > > ===================================================================
> > > RCS file: /cvs/src/sys/net/if_tpmr.c,v
> > > retrieving revision 1.9
> > > diff -u -p -r1.9 if_tpmr.c
> > > --- if_tpmr.c     11 Apr 2020 11:01:03 -0000      1.9
> > > +++ if_tpmr.c     11 Apr 2020 13:08:46 -0000
> > > @@ -201,12 +201,14 @@ tpmr_clone_destroy(struct ifnet *ifp)
> > >  
> > >   if_detach(ifp);
> > >  
> > > + NET_LOCK();
> > >   for (i = 0; i < nitems(sc->sc_ports); i++) {
> > >           struct tpmr_port *p = SMR_PTR_GET_LOCKED(&sc->sc_ports[i]);
> > >           if (p == NULL)
> > >                   continue;
> > >           tpmr_p_dtor(sc, p, "destroy");
> > >   }
> > > + NET_UNLOCK();
> > 
> > This makes tpmr_p_dtor() call smr_barrier() with NET_LOCK() held.
> 
> tpmr_p_dtor is already called with NET_LOCK held, this just does it
> consistently now.
> 
> > There is a risk of a deadlock if some SMR callback wants to take that
> > lock. Currently all the callbacks look NET_LOCK()-free, though.
> > 
> > A similar point applies to the smr_barrier() in aggr(4).
> 
> Similarly, the smr_barrier in aggr(4) was usually called with NET_LOCK
> held, and is now consistently called with it.
> 
> > Could tpmr_p_dtor() and aggr_map() release the lock momentarily when
> > calling the barrier?
> 
> That looks straightforward to do in tpmr_p_dtor(). aggr_map() is going
> to take a lot more thought before I can say. aggr_map() is called
> all over the place, and I'll have to fit a lot more code back into my
> head to figure out what's going on.
> 
> However, you say that currently there are no SMR callbacks that take the
> NET_LOCK. Can we take advantage of that and put the ifpromisc diffs in,
> and work on the smr_barrier bits separately?

I think so. However, now that I looked into aggr_p_dtor(),
timeout_del_barrier() is another source of trouble if NET_LOCK() is
held (like is anything barrier-like that can get blocked if another
thread is prevented from making progress because of the lock). That
needs to be fixed sooner or later as well. Luckily there already is
an XXX next to the call. :)

Reply via email to