On Thu, Apr 22, 2021 at 01:13:50PM +1000, David Gwynne wrote:
> On Wed, Apr 21, 2021 at 01:15:53PM +0000, Visa Hankala wrote:
> > On Wed, Apr 21, 2021 at 11:04:20AM +1000, David Gwynne wrote:
> > > On Wed, Apr 21, 2021 at 10:21:32AM +1000, David Gwynne wrote:
> > > > if you have a program that uses kq (or libevent) to wait for bytes to
> > > > read off an idle network interface via /dev/bpf and that interface
> > > > goes away, the program doesn't get woken up. this is because the kq
> > > > read filter in bpf only checks if there are bytes available. because a
> > > > detached interface never gets packets (how very zen), this condition
> > > > never changes and the program will never know something happened.
> > > >
> > > > this has the bpf filter check if the interface is detached too. with
> > > > this change my test program wakes up, tries to read, and gets EIO. which
> > > > is great.
> > > >
> > > > note that in the middle of this is the vdevgone machinery. when an
> > > > interface is detached, bpfdetach gets called, which ends up calling
> > > > vdevgone. vdevgone sort of swaps out bpf on the currently open vdev with
> > > > some dead operations, part of which involves calling bpfclose() to try
> > > > and clean up the existing state associated with the vdev. bpfclose tries
> > > > to wake up any waiting listeners, which includes kq handlers. that's how
> > > > the kernel goes from an interface being detached to the bpf kq filter
> > > > being run. the bpf kq filter just has to check that the interface is
> > > > still attached.
> > >
> > > I thought tun(4) had this same problem, but I wrote a test and couldn't
> > > reproduce it. tun works because it addresses the problem in a different
> > > way. Instead of having its own kq filter check if the device is dead or
> > > not, it calls klist_invalidate, which switches things around like the
> > > vdevgone/vop_revoke stuff does with the vdev.
> > >
> > > So an alternative way to solve this problem in bpf(4) would be the
> > > following:
> > >
> > > Index: bpf.c
> > > ===================================================================
> > > RCS file: /cvs/src/sys/net/bpf.c,v
> > > retrieving revision 1.203
> > > diff -u -p -r1.203 bpf.c
> > > --- bpf.c	21 Jan 2021 12:33:14 -0000	1.203
> > > +++ bpf.c	21 Apr 2021 00:54:30 -0000
> > > @@ -401,6 +401,7 @@ bpfclose(dev_t dev, int flag, int mode,
> > >  	bpf_wakeup(d);
> > >  	LIST_REMOVE(d, bd_list);
> > >  	mtx_leave(&d->bd_mtx);
> > > +	klist_invalidate(&d->bd_sel.si_note);
> > >  	bpf_put(d);
> > >
> > >  	return (0);
> >
> > I think bpf should call klist_invalidate() from the detach path.
> > bpfsdetach() might be the right place. This would make the code pattern
> > similar to the existing uses of klist_invalidate().
> >
> > Calling klist_invalidate() from the close function twists the logic,
> > at least in my opinion. When a file descriptor is closed, the file
> > descriptor layer will remove the knotes automatically. This is why close
> > functions usually do not have to manage knotes themselves. However, the
> > automatic removal does not happen when a device is revoked, which is
> > mended by klist_invalidate().
>
> yep, makes sense to me. how's this look? it works as well as my previous
> diff did in my tests.
Looks good.

OK visa@

> Index: bpf.c
> ===================================================================
> RCS file: /cvs/src/sys/net/bpf.c,v
> retrieving revision 1.203
> diff -u -p -r1.203 bpf.c
> --- bpf.c	21 Jan 2021 12:33:14 -0000	1.203
> +++ bpf.c	22 Apr 2021 03:09:27 -0000
> @@ -1690,8 +1690,10 @@ bpfsdetach(void *p)
>  		if (cdevsw[maj].d_open == bpfopen)
>  			break;
>
> -	while ((bd = SMR_SLIST_FIRST_LOCKED(&bp->bif_dlist)))
> +	while ((bd = SMR_SLIST_FIRST_LOCKED(&bp->bif_dlist))) {
>  		vdevgone(maj, bd->bd_unit, bd->bd_unit, VCHR);
> +		klist_invalidate(&bd->bd_sel.si_note);
> +	}
>
>  	for (tbp = bpf_iflist; tbp; tbp = tbp->bif_next) {
>  		if (tbp->bif_next == bp) {
>