On Feb 28, 2014, at 4:41 PM, Taylor R Campbell <campbell+netbsd-tech-k...@mumble.net> wrote:

<snip>

> First, using an explicit vnode state, v_interlock to inspect or modify,
> looks good.
>
> Actually, I think I may want to change that, so that vget can be done
> under a pserialized read -- i.e., atomics only, no locks or waits.  I
> took a look at dholland's message from September and started to write
> a generic inode->vnode cache with pserialize, and realized vget needs
> to happen under it.  But at least the explicit states will help to
> clarify the model for now, and will be a good fallback if I give up
> before making a pserialized inode->vnode cache.
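For readers following along, a minimal sketch of what a vget() with an
explicit, interlock-protected vnode state might look like.  The state
names (VS_ACTIVE, VS_REVOKING, VS_DESTROYING) and the fail-instead-of-
panic overflow check are taken from the fragments quoted in this
thread; the reduced state set, the omitted wait path for in-flux
vnodes, and the error values are assumptions, not the actual patch:

int
vget(struct vnode *vp)
{
	int error = 0;

	mutex_enter(vp->v_interlock);
	switch (vp->v_state) {
	case VS_ACTIVE:
		/* Fail instead of panicking on use count overflow. */
		if (vp->v_usecount == UINT_MAX)
			error = EBUSY;
		else
			vp->v_usecount++;
		break;
	case VS_REVOKING:
	case VS_DESTROYING:
		/*
		 * The identity is about to change to deadfs, or the
		 * vnode is already being destroyed: the caller has to
		 * create a fresh vnode for the inode it looked up.
		 */
		error = ENOENT;
		break;
	default:
		/* Loading/inactive states omitted in this sketch. */
		error = EBUSY;
		break;
	}
	mutex_exit(vp->v_interlock);

	return error;
}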
I don't see a many-reader / rare-writer case here.  Pserialize is very
expensive for the writer case (at the moment you even get a minimum of
2 HZ for a pserialize_perform -- but this is a bug).

<snip>

> I wanted to ensure it returns failure in case of overflow instead of
> panicking (or silently proceeding, as it always does currently).
>
>> 	case VS_REVOKING:
>> 	case VS_DESTROYING:
>> 		return ENOENT;
>
> Return without waiting looks bad.
>
> I think this is actually OK.  If you're a file system, and you're
> trying to grab one of your own vnodes, it's no good to get one whose
> identity will soon be changed to deadfs -- you need to make a new
> vnode for the inode you were looking for.  Nor is it any good to try
> to work with one that vfs has chosen to destroy -- it's too late to
> recover that one.  So in both cases immediately returning ENOENT is
> the right thing.

It is better not to have two vnodes (one dying, one creating)
representing the same inode in parallel.

<snip>

>> int
>> vn_lock(struct vnode *vp, int flags)
>> {
>> 	int error = 0;
>>
>> 	VOP_LOCK(vp, flags);
>>
>> 	mutex_enter(vp->v_interlock);
>> 	if (__predict_false(vp->v_state == VS_REVOKING) ||
>> 	    (vp->v_op == dead_vnodeop_p))
>> 		error = ENOENT;
>> 	else
>> 		KASSERT(vp->v_state == VS_ACTIVE);
>> 	mutex_exit(vp->v_interlock);
>>
>> 	if (error)
>> 		VOP_UNLOCK(vp);
>
> This doesn't work as VOP_UNLOCK() may or will use a different
> operations vector than VOP_LOCK() above.  There was a reason for
> my change to VOP_LOCK() some days ago.
>
> Hmm.  This vn_lock code may be broken.  But revocation is currently
> also broken because it doesn't exclude other vnode operations, and if
> we fixed it in the obvious way so that VOP_REVOKE or vrevoke took the
> vnode lock, then xyzfs and deadfs would have to use genfs_lock anyway
> in order for it to all work out.  So even if we use the LK_RETRY hack
> I believe there's more that we'll need to fix to enable revocation to
> mix with non-genfs_lock VOP_LOCKs.

Remember, VOP_LOCK is a file system operation.  Every file system is
free to use its own locking -- all file systems using genfs_lock() is
just the status quo.

<snip>

> Vrecycle() has to change -- I'm currently working on it.
>
> What is it supposed to do?  I don't understand from the man page
> description or the uses in-tree.

While the name is bad, it gets rid of inactive vnodes when cleaning up
a file system during unmount or reload after fsck.

<snip>

> Hold counts are still broken.  They were introduced to prevent the
> vnode from recycling.  I suspect most users of vhold are more
> or less wrong.
>
> If so, I totally misunderstood them -- but I believe that is how the
> current vfs_vnode.c interprets the hold count.  How would it be
> different from the usecount as you describe?

This is a difficult story.  If I remember right, 4.4BSD had them but
didn't take the hold count into account.  Later we got a freelist
holding held vnodes -- this list was never recycled from.  Later yamt
changed it to recycle held vnodes after trying to recycle free vnodes.

So I suppose we are free to (re-)define hold counts or simply remove
them if they are not needed.

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)
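To make the VOP_LOCK/VOP_UNLOCK concern above concrete, a rough
illustration of the hazard (an assumption-laden sketch, not code from
the patch): both operations dispatch through vp->v_op, so a revoke
that switches the vector between the two calls can pair one file
system's lock with deadfs's unlock.

	VOP_LOCK(vp, flags);	/* dispatches through the current v_op,
				 * i.e. the file system's own lock op */

	/* a concurrent revoke may now switch vp->v_op to dead_vnodeop_p */

	VOP_UNLOCK(vp);		/* dispatches through the new v_op --
				 * unless both vectors use genfs_lock(),
				 * this does not match the lock above */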