On 19/06/18(Tue) 15:08, Mark Kettenis wrote: > > Date: Tue, 19 Jun 2018 12:51:35 +0200 > > From: Martin Pieuchot <m...@openbsd.org> > > > > There's one place in our kernel where `f_data' is overwritten while the > > descriptor sits in multiple shared data structures. It is in diskmap. > > > > We want to avoid this situation to be able to treat f_data as immutable. > > > > So the diff below remove `fp' from the shared data structures before it > > gets modified. It is not a complete solution to the reference grabbing > > problem of vnode. There's still a window where a race is possible > > between the moment where we release the mutex and increase the vnode's > > reference. The KERNEL_LOCK() is what protects us for now. However this > > problem is the next one to solve to be able to unlock syscalls messing > > with vnodes. > > > > Still this diff is a step towards that. > > > > Note that this change only apply to vnodes, so it doesn't matter for > > socket-related syscalls. > > > > Ok? > > I think this is the wrong approach. Instead of modifying the struct > file, this code should create a new struct file. Then replace the > pointer in the file descriptor table with that new file and then drop > the reference to the old struct file. That way f_data *is* immutable.
I tried that first... well it didn't work for me. The problem is that once you swapped `fp' you still need to free the old one. But you don't want to close the associated file, so you end up NULLing `f_data' before calling closef()... On top of that this approach introduces a new error condition if we reach `maxfiles', yes this is a degenerated case but I'd prefer not have to deal with it. The approach below returns `fp' into the "larval" or "reserved" state, change it, then insert it again. > > > Index: dev/diskmap.c > > =================================================================== > > RCS file: /cvs/src/sys/dev/diskmap.c,v > > retrieving revision 1.20 > > diff -u -p -r1.20 diskmap.c > > --- dev/diskmap.c 9 May 2018 08:42:02 -0000 1.20 > > +++ dev/diskmap.c 19 Jun 2018 10:38:37 -0000 > > @@ -59,7 +59,7 @@ diskmapioctl(dev_t dev, u_long cmd, cadd > > struct filedesc *fdp = p->p_fd; > > struct file *fp = NULL; > > struct vnode *vp = NULL, *ovp; > > - char *devname; > > + char *devname, flags; > > int fd, error = EINVAL; > > > > if (cmd != DIOCMAP) > > @@ -106,6 +106,15 @@ diskmapioctl(dev_t dev, u_long cmd, cadd > > vput(ovp); > > } > > > > + /* Remove the file while we're modifying it to prevent races. */ > > + flags = fdp->fd_ofileflags[fd]; > > + mtx_enter(&fhdlk); > > + fdp->fd_ofiles[fd] = NULL; > > + fdp->fd_ofileflags[fd] = 0; > > + if (fp->f_iflags & FIF_INSERTED) > > + LIST_REMOVE(fp, f_list); > > + mtx_leave(&fhdlk); > > + > > fp->f_data = (caddr_t)vp; > > fp->f_offset = 0; > > mtx_enter(&fp->f_mtx); > > @@ -115,6 +124,9 @@ diskmapioctl(dev_t dev, u_long cmd, cadd > > fp->f_rbytes = 0; > > fp->f_wbytes = 0; > > mtx_leave(&fp->f_mtx); > > + > > + /* Insert it back where it was. */ > > + fdinsert(fdp, fd, flags, fp); > > > > VOP_UNLOCK(vp); > > > > > >