On Feb 28, 2014, at 4:41 PM, Taylor R Campbell <campbell+netbsd-tech-k...@mumble.net> wrote:

<snip>

> First, using an explicit vnode state, v_interlock to inspect or modify,
> looks good.
>
> Actually, I think I may want to change that, so that vget can be done
> under a pserialized read -- i.e., atomics only, no locks or waits.  I
> took a look at dholland's message from September and started to write
> a generic inode->vnode cache with pserialize, and realized vget needs
> to happen under it.  But at least the explicit states will help to
> clarify the model for now, and will be a good fallback if I give up
> before making a pserialized inode->vnode cache.
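For readers following along, a minimal sketch of what a vget() with an
explicit, interlock-protected vnode state might look like.  The state
names (VS_ACTIVE, VS_REVOKING, VS_DESTROYING) and the fail-instead-of-
panic overflow check are taken from the fragments quoted in this
thread; the reduced state set, the omitted wait path for in-flux
vnodes, and the error values are assumptions, not the actual patch:

int
vget(struct vnode *vp)
{
	int error = 0;

	mutex_enter(vp->v_interlock);
	switch (vp->v_state) {
	case VS_ACTIVE:
		/* Fail instead of panicking on use count overflow. */
		if (vp->v_usecount == UINT_MAX)
			error = EBUSY;
		else
			vp->v_usecount++;
		break;
	case VS_REVOKING:
	case VS_DESTROYING:
		/*
		 * The identity is about to change to deadfs, or the
		 * vnode is already being destroyed: the caller has to
		 * create a fresh vnode for the inode it looked up.
		 */
		error = ENOENT;
		break;
	default:
		/* Loading/inactive states omitted in this sketch. */
		error = EBUSY;
		break;
	}
	mutex_exit(vp->v_interlock);

	return error;
}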
I don't see a many-reader / rare-writer case here.  Pserialize is very
expensive for the writer case (at the moment you even get a minimum of
2 HZ for a pserialize_perform -- but this is a bug).

<snip>

> I wanted to ensure it returns failure in case of overflow instead of
> panicking (or silently proceeding, as it always does currently).
>
>> 	case VS_REVOKING:
>> 	case VS_DESTROYING:
>> 		return ENOENT;
>
> Return without waiting looks bad.
>
> I think this is actually OK.  If you're a file system, and you're
> trying to grab one of your own vnodes, it's no good to get one whose
> identity will soon be changed to deadfs -- you need to make a new
> vnode for the inode you were looking for.  Nor is it any good to try
> to work with one that vfs has chosen to destroy -- it's too late to
> recover that one.  So in both cases immediately returning ENOENT is
> the right thing.

It is better not to have two vnodes (one dying, one creating)
representing the same inode in parallel.

<snip>

>> int
>> vn_lock(struct vnode *vp, int flags)
>> {
>> 	int error = 0;
>>
>> 	VOP_LOCK(vp, flags);
>>
>> 	mutex_enter(vp->v_interlock);
>> 	if (__predict_false(vp->v_state == VS_REVOKING) ||
>> 	    (vp->v_op == dead_vnodeop_p))
>> 		error = ENOENT;
>> 	else
>> 		KASSERT(vp->v_state == VS_ACTIVE);
>> 	mutex_exit(vp->v_interlock);
>>
>> 	if (error)
>> 		VOP_UNLOCK(vp);
>
> This doesn't work as VOP_UNLOCK() may or will use a different
> operations vector than VOP_LOCK() above.  There was a reason for
> my change to VOP_LOCK() some days ago.
>
> Hmm.  This vn_lock code may be broken.  But revocation is currently
> also broken because it doesn't exclude other vnode operations, and if
> we fixed it in the obvious way so that VOP_REVOKE or vrevoke took the
> vnode lock, then xyzfs and deadfs would have to use genfs_lock anyway
> in order for it to all work out.  So even if we use the LK_RETRY hack
> I believe there's more that we'll need to fix to enable revocation to
> mix with non-genfs_lock VOP_LOCKs.

Remember, VOP_LOCK is a file system operation.  Every file system is
free to use its own locking -- all file systems using genfs_lock() is
just the status quo.

<snip>

> Vrecycle() has to change -- I'm currently working on it.
>
> What is it supposed to do?  I don't understand from the man page
> description or the uses in-tree.

While the name is bad, it gets rid of inactive vnodes when cleaning up
a file system during unmount or reload after fsck.

<snip>

> Hold counts are still broken.  They were introduced to prevent the
> vnode from recycling.  I suspect most users of vhold are more
> or less wrong.
>
> If so, I totally misunderstood them -- but I believe that is how the
> current vfs_vnode.c interprets the hold count.  How would it be
> different from the usecount as you describe?

This is a difficult story.  If I remember right, 4.4BSD had them but
didn't take the hold count into account.  Later we got a freelist
holding held vnodes -- this list was never recycled from.  Later yamt
changed it to recycle held vnodes after trying to recycle free vnodes.

So I suppose we are free to (re-)define hold counts or simply remove
them if they are not needed.

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)
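To make the VOP_LOCK/VOP_UNLOCK concern above concrete, a rough
illustration of the hazard (an assumption-laden sketch, not code from
the patch): both operations dispatch through vp->v_op, so a revoke
that switches the vector between the two calls can pair one file
system's lock with deadfs's unlock.

	VOP_LOCK(vp, flags);	/* dispatches through the current v_op,
				 * i.e. the file system's own lock op */

	/* a concurrent revoke may now switch vp->v_op to dead_vnodeop_p */

	VOP_UNLOCK(vp);		/* dispatches through the new v_op --
				 * unless both vectors use genfs_lock(),
				 * this does not match the lock above */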