On Thu, Feb 24, 2022 at 07:39:54PM +1000, David Gwynne wrote: > claudio and i came up with the following, which is to have tun_dev_open > check the state of the vnode associated with the current open call > after all the sleeping and potential tun_clone_destroy and > tun_clone_create calls. if the vnode has been made bad/dead after > all the sleeping, it returns with ENXIO. > > this appears to fix the issue. i had a test program that would trigger > the KASSERT in a matter of seconds and it's been running for an hour > now. > > here's the diff.
The diff looks reasonable. OK visa@ I guess a similar issue can arise with some other detachable devices as well. However, the problem is most acute with things like tun(4) that can be created and destroyed programmatically on a whim at high speed. spec_open() could check for vnode revocation after completing d_open. Of course, that would not fix the problem with tun. The check would allow error reporting to happen earlier, but there would be no fundamental shift in behaviour, I think. > Index: if_tun.c > =================================================================== > RCS file: /cvs/src/sys/net/if_tun.c,v > retrieving revision 1.234 > diff -u -p -r1.234 if_tun.c > --- if_tun.c 16 Feb 2022 02:22:39 -0000 1.234 > +++ if_tun.c 24 Feb 2022 08:08:38 -0000 > @@ -374,10 +374,19 @@ tun_dev_open(dev_t dev, const struct if_ > struct ifnet *ifp; > int error; > u_short stayup = 0; > + struct vnode *vp; > > char name[IFNAMSIZ]; > unsigned int rdomain; > > + /* > + * Find the vnode associated with this open before we sleep > + * and let something else revoke it. Our caller has a reference > + * to it so we don't need to account for it. > + */ > + if (!vfinddev(dev, VCHR, &vp)) > + panic("%s vfinddev failed", __func__); > + > snprintf(name, sizeof(name), "%s%u", ifc->ifc_name, minor(dev)); > rdomain = rtable_l2(p->p_p->ps_rtableid); > > @@ -412,6 +421,12 @@ tun_dev_open(dev_t dev, const struct if_ > /* XXX if_clone_destroy if stayup? */ > goto done; > } > + } > + > + /* Has tun_clone_destroy torn the rug out under us? */ > + if (vp->v_type == VBAD) { > + error = ENXIO; > + goto done; > } > > if (sc->sc_dev != 0) { >