On 13/09/16(Tue) 12:23, Martin Pieuchot wrote: > Here's the big scary diff I've been using for some months now to stop > grabbing the KERNEL_LOCK() in bpf_mtap(9). This has been originally > written to prevent lock ordering inside pf_test(). Now that we're > heading toward using a rwlock, we won't have this problem, but fewer > usages of KERNEL_LOCK() is still interesting. > > I'm going to split this diff in small chunks to ease the review. But > I'd appreciate if people could try to break it, test & report back. > > Some notes: > > - Now that selwakeup() is called in a thread context (task) we only > rely on the KERNEL_LOCK() to serialize access to kqueue(9) data. > > - The reference counting is here to make sure a descriptor is not > freed during a sleep. That's why the KERNEL_LOCK() is still > necessary in the slow path. On the other hand bpf_catchpacket() > relies on the reference guaranteed by the SRP list. > > - A mutex now protects the rotating buffers and their associated > fields. It is dropped before calling ifpromisc() because USB > devices sleep. > > - The dance around uiomove(9) is here to check that buffers aren't > rotated while data is copied to userland. Setting ``b->bd_fbuf'' > to NULL should be enough to let bpf_catchpacket() to drop the > patcket. But I added ``__in_uiomove'' to be able to have usable > panic if something weird happen.
Next extract diff that tweaks bpf_detachd(). The goal here is to remove ``d'' from the list and NULLify ``d->bd_bif'' before calling ifpromisc(). The reason is that ifpromisc() can sleep. Think USB ;) So we'll have to release the mutex before calling it. So we want to make sure ``d'' is no longer in the interface descriptor list at this point. ok? Index: net/bpf.c =================================================================== RCS file: /cvs/src/sys/net/bpf.c,v retrieving revision 1.154 diff -u -p -r1.154 bpf.c --- net/bpf.c 21 Nov 2016 09:15:40 -0000 1.154 +++ net/bpf.c 22 Nov 2016 10:47:05 -0000 @@ -288,6 +288,23 @@ bpf_detachd(struct bpf_d *d) struct bpf_if *bp; bp = d->bd_bif; + /* Not attached. */ + if (bp == NULL) + return; + + /* Remove ``d'' from the interface's descriptor list. */ + KERNEL_ASSERT_LOCKED(); + SRPL_REMOVE_LOCKED(&bpf_d_rc, &bp->bif_dlist, d, bpf_d, bd_next); + + if (SRPL_EMPTY_LOCKED(&bp->bif_dlist)) { + /* + * Let the driver know that there are no more listeners. + */ + *bp->bif_driverp = NULL; + } + + d->bd_bif = NULL; + /* * Check if this descriptor had requested promiscuous mode. * If so, turn it off. @@ -305,19 +322,6 @@ bpf_detachd(struct bpf_d *d) */ panic("bpf: ifpromisc failed"); } - - /* Remove d from the interface's descriptor list. */ - KERNEL_ASSERT_LOCKED(); - SRPL_REMOVE_LOCKED(&bpf_d_rc, &bp->bif_dlist, d, bpf_d, bd_next); - - if (SRPL_EMPTY_LOCKED(&bp->bif_dlist)) { - /* - * Let the driver know that there are no more listeners. - */ - *d->bd_bif->bif_driverp = 0; - } - - d->bd_bif = NULL; } void @@ -372,8 +376,7 @@ bpfclose(dev_t dev, int flag, int mode, d = bpfilter_lookup(minor(dev)); s = splnet(); - if (d->bd_bif) - bpf_detachd(d); + bpf_detachd(d); bpf_wakeup(d); LIST_REMOVE(d, bd_list); bpf_put(d); @@ -1033,12 +1036,10 @@ bpf_setif(struct bpf_d *d, struct ifreq goto out; } if (candidate != d->bd_bif) { - if (d->bd_bif) - /* - * Detach if attached to something else. - */ - bpf_detachd(d); - + /* + * Detach if attached to something else. + */ + bpf_detachd(d); bpf_attachd(d, candidate); } bpf_resetd(d);