Re: nfs_inactive() bug? -> panic: lockmgr: locking against myself

2002-09-12 Thread Don Lewis

On 13 Sep, Ian Dowse wrote:

> For example, if you hold the reference count at 1 while calling the
> cleanup function, it allows that function to safely add and drop
> references, but if that cleanup function has a bug that drops one
> too many references then you end up recursing instead of detecting
> it as a negative reference count. I've found in some other code
> that it works reasonably well to leave the reference count at zero,
> but set a flag to stop further 1->0 transitions from retriggering
> the cleanup. Obviously other approaches will work too.

The cleanup function shouldn't be mucking with the reference count,
which means that the present implementation of nfs_inactive() is broken,
but I think there is already general agreement on that point.  The only
possible exception would be to increase the reference count to pass a
reference to another thread, but that would be a silly thing for a
cleanup function to do, since it would no longer be cleaning up.

We could add a flag that would cause an immediate panic if the cleanup
function fiddled with the reference count as an aid to tracking down
broken code ;-)
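
A minimal user-space sketch of such a debugging aid might look like the
following; the names obj_ref(), obj_release() and the in_cleanup flag are
invented for the example, and this is not proposed kernel code:

#include <assert.h>
#include <stdio.h>

struct obj {
	int	refcount;
	int	in_cleanup;	/* set while the cleanup function runs */
};

static void cleanup(struct obj *o);

static void
obj_ref(struct obj *o)
{
	assert(!o->in_cleanup);		/* cleanup must not take references */
	o->refcount++;
}

static void
obj_release(struct obj *o)
{
	assert(!o->in_cleanup);		/* cleanup must not drop references */
	assert(o->refcount > 0);
	if (--o->refcount == 0) {
		o->in_cleanup = 1;
		cleanup(o);
		o->in_cleanup = 0;
	}
}

static void
cleanup(struct obj *o)
{
	/*
	 * Any obj_ref()/obj_release() call made from here would trip
	 * the assertions above immediately, pointing straight at the
	 * broken cleanup code.
	 */
	printf("cleaning up, refcount=%d\n", o->refcount);
}

int
main(void)
{
	struct obj o = { 1, 0 };

	obj_ref(&o);		/* refcount 1 -> 2 */
	obj_release(&o);	/* 2 -> 1 */
	obj_release(&o);	/* 1 -> 0, runs cleanup() */
	return (0);
}

With assertions enabled, a cleanup function that fiddled with the count
would abort at the exact offending call rather than recursing or going
negative later.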





Re: nfs_inactive() bug? -> panic: lockmgr: locking against myself

2002-09-12 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Terry Lambert writes:
>Ian Dowse wrote:
>> And I've just remembered a fifth :-) I think the old BSD code had
>> both an `open' count and a reference count. The open count is a
>> count of the real users of the vnode (it is what ufs_inactive really
>> wants to compare against 0), and the reference count is just for
>> places that you don't want the vnode to be recycled or destroyed.
>> This was probably lost when the encumbered BSD sources were rewritten.
>
>No, this went away with the vnode locking changes; it was in the
>4.4 code, for sure.  I think references are the correct thing here,
>and SunOS seems to agree, since that's how they implement, too.  8-).

I seem to have mis-remembered the details anyway :-) It doesn't
look as if there ever was the `open' count that I mentioned
above. Maybe I was just thinking that it would be a good way to
solve the issues of matching VOP_CLOSEs with VOP_OPENs, since there
are many cases in the kernel that do not guarantee to do a VOP_CLOSE
for each VOP_OPEN that was performed.

Handling the dropping of a last reference is always tricky to get
right when complex operations can be performed from the reference
dropping function (especially where that includes adding and then
removing references multiple times). It's even harder to do it in
a way that continues to catch missing or extra references caused
by bugs in the functions called when the reference count hits zero.

For example, if you hold the reference count at 1 while calling the
cleanup function, it allows that function to safely add and drop
references, but if that cleanup function has a bug that drops one
too many references then you end up recursing instead of detecting
it as a negative reference count. I've found in some other code
that it works reasonably well to leave the reference count at zero,
but set a flag to stop further 1->0 transitions from retriggering
the cleanup. Obviously other approaches will work too.
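
As a small user-space model of the leave-it-at-zero approach (the field
and function names are invented for illustration and are not kernel
interfaces):

#include <assert.h>
#include <stdio.h>

struct obj {
	int	refcount;
	int	doing_cleanup;	/* blocks re-entry into the 1->0 handling */
};

static void
obj_ref(struct obj *o)
{
	o->refcount++;
}

static void
obj_release(struct obj *o)
{
	assert(o->refcount > 0);	/* one drop too many is caught here */
	if (--o->refcount > 0)
		return;
	if (o->doing_cleanup)		/* a nested 1->0 drop does not retrigger */
		return;
	o->doing_cleanup = 1;
	/*
	 * The cleanup work runs with refcount already at zero, but it
	 * may still take and drop temporary references, e.g. across
	 * some I/O, without recursing back into itself.
	 */
	obj_ref(o);
	printf("cleanup I/O, refcount=%d\n", o->refcount);
	obj_release(o);
	o->doing_cleanup = 0;
}

int
main(void)
{
	struct obj o = { 1, 0 };

	obj_release(&o);
	return (0);
}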

Ian




Re: nfs_inactive() bug? -> panic: lockmgr: locking against myself

2002-09-12 Thread Terry Lambert

Ian Dowse wrote:
> And I've just remembered a fifth :-) I think the old BSD code had
> both an `open' count and a reference count. The open count is a
> count of the real users of the vnode (it is what ufs_inactive really
> wants to compare against 0), and the reference count is just for
> places that you don't want the vnode to be recycled or destroyed.
> This was probably lost when the encumbered BSD sources were rewritten.

No, this went away with the vnode locking changes; it was in the
4.4 code, for sure.  I think references are the correct thing here,
and SunOS seems to agree, since that's how they implement, too.  8-).


> At the time I was looking at it last, I remember thinking that the
> open count would allow vrele/vput to keep the reference count at 1
> during the VOP_INACTIVE() call, which is what you were proposing.
> It would also allow us to fix the problem of many places not matching
> each VOP_OPEN() with a VOP_CLOSE(). I suspect we could clean up a
> lot of related problems if the open count was brought back.

Yes.  It was murdered for good reason.

-- Terry




Re: nfs_inactive() bug? -> panic: lockmgr: locking against myself

2002-09-12 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Don Lewis writes:
>After looking at ufs_inactive(), I'd like to add a fourth proposal

And I've just remembered a fifth :-) I think the old BSD code had
both an `open' count and a reference count. The open count is a
count of the real users of the vnode (it is what ufs_inactive really
wants to compare against 0), and the reference count is just for
places that you don't want the vnode to be recycled or destroyed.
This was probably lost when the encumbered BSD sources were rewritten.

At the time I was looking at it last, I remember thinking that the
open count would allow vrele/vput to keep the reference count at 1
during the VOP_INACTIVE() call, which is what you were proposing.
It would also allow us to fix the problem of many places not matching
each VOP_OPEN() with a VOP_CLOSE(). I suspect we could clean up a
lot of related problems if the open count was brought back.
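
A very rough sketch of how the two counts might relate; the field and
function names here are entirely made up for illustration and this is
not the old BSD code:

/*
 * Hypothetical vnode with two counts: v_opencount tracks the real
 * users (VOP_OPEN/VOP_CLOSE pairs), while v_usecount only keeps the
 * vnode from being recycled or destroyed.
 */
struct xvnode {
	int	v_opencount;	/* real users; what foo_inactive() would test */
	int	v_usecount;	/* recycle/destroy protection only */
};

static void xvop_inactive(struct xvnode *vp);

static void
xvn_open(struct xvnode *vp)
{
	vp->v_opencount++;
	vp->v_usecount++;
}

static void
xvn_close(struct xvnode *vp)
{
	/*
	 * The last real user is going away: run the inactive routine
	 * while v_usecount is still at least 1, so the filesystem can
	 * do its I/O without lying about references, then drop the
	 * use count.
	 */
	if (--vp->v_opencount == 0)
		xvop_inactive(vp);
	vp->v_usecount--;
}

static void
xvop_inactive(struct xvnode *vp)
{
	/* Filesystem cleanup and I/O; v_usecount is still held here. */
	(void)vp;
}

int
main(void)
{
	struct xvnode vn = { 0, 0 };

	xvn_open(&vn);
	xvn_close(&vn);		/* calls xvop_inactive() with v_usecount == 1 */
	return (0);
}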

Ian




Re: nfs_inactive() bug? -> panic: lockmgr: locking against myself

2002-09-12 Thread Don Lewis

On 11 Sep, Ian Dowse wrote:
> In message <[EMAIL PROTECTED]>, Don Lewis writes:
>>
>>A potentially better solution just occurred to me.  It looks like it
>>would be better if vrele() waited to decrement v_usecount until *after*
>>the call to VOP_INACTIVE() (and after the call to VI_LOCK()).  If that
>>were done, nfs_inactive() wouldn't need to muck with v_usecount at all.
> 
> I've looked at this before; I think some filesystems (ufs anyway)
> depend on v_usecount being 0 when VOP_INACTIVE() is called.

Yup, though it would be easy to tweak ufs_inactive() to expect the
v_usecount to be one greater.  A bigger problem is the call to
vrecycle() in ufs_inactive(), since vrecycle() only recycles a vnode
whose use count is zero.

> The
> patch I have had lying around for quite a while is below. It adds
> a vnode flag to avoid recursion into the last-reference handling
> code in vrele/vput, which is the real problem.
> 
> It also guarantees that a vnode will not be recycled during
> VOP_INACTIVE(), so the nfs code no longer needs to hold an extra
> reference in the first place. The flag manipulation code got a bit
> messy after Jeff's vnode flag splitting work, so the patch could
> probably be improved.

If the need for nfs_inactive() to manipulate the reference count is
eliminated, that should be sufficient to eliminate the vrele/vput
recursion problem as well.
 
> Whatever way this is done, we should try to avoid adding more hacks
> to the nfs_inactive() code anyway.

Tweaking nfs_inactive() has the advantage of being the least intrusive
fix for this problem, but it isn't architecturally clean.

I'd also argue that decrementing the vnode reference count to zero, but
holding on to the reference for an extended period of time while doing
I/O on the vnode, is itself a kludge.  It causes problems that require
kludgey fixes, like the reference count manipulation in nfs_inactive().

For the most part the VI_INACTIVE flag is just another way of indicating
that we are holding a reference to the vnode, so both the flag and the
reference count need to be checked to see if the vnode is in use.  The
advantage of adding this flag versus just bumping the reference count is
that the flag allows us to avoid the bugs in the current implementation
without requiring changes to the code called from VOP_INACTIVE() that
expects the reference count to be zero.

Here are the three proposed fixes along with their advantages and
disadvantages:

  Inline the v_usecount decrement code in nfs_inactive()

+ Least intrusive fix for the recursion bug

- Adds a kludge to a kludge

  Call VOP_INACTIVE() before decrementing the reference count and tweak
  the foo_inactive() methods to expect the reference count to be one
  instead of zero

+ Eliminates the kludge from nfs_inactive()

+ We aren't lying about holding a reference, so hopefully less
  potential for bugs

- It is not obvious how to adapt ufs_inactive()


  Add VI_INACTIVE flag to prevent the vnode from being recycled

+ Eliminates the kludge from nfs_inactive()

+ Doesn't require changes to the other filesystems

- Code complexity

- Programmer needs to know when to look at VI_INACTIVE and when
  looking at just the reference count is ok


After looking at ufs_inactive(), I'd like to add a fourth proposal that
would involve splitting VOP_INACTIVE() into two separate methods.  The
first method would be called before the reference count is decremented
and would do any I/O.  The second method would be called after the
reference count is decremented and could only be used for freeing
resources, manipulating flags, putting it on the free list with
vrecycle(), etc.  (A rough sketch follows the list below.)

+ Eliminates the kludge from nfs_inactive()

+ We don't lie about holding a reference while doing I/O

- More extensive code changes

- Bikeshed arguments about what to call the two methods
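
Here is a rough sketch of the fourth proposal from vrele()'s point of
view.  The method names are placeholders (exactly the bikeshed argument
above), and the locking and free-list handling are omitted:

struct xvnode {
	int	v_usecount;
};

/* Placeholder names for the two halves of today's VOP_INACTIVE(). */
static void
xvop_inactive_io(struct xvnode *vp)
{
	/* May sleep and do I/O (e.g. flushing sillyrenamed buffers). */
	(void)vp;
}

static void
xvop_inactive_free(struct xvnode *vp)
{
	/* Bookkeeping only: free resources, set flags, vrecycle(). */
	(void)vp;
}

static void
xvrele(struct xvnode *vp)
{
	if (vp->v_usecount > 1) {
		vp->v_usecount--;
		return;
	}
	/* Phase 1: the last reference is still held, so I/O is safe. */
	xvop_inactive_io(vp);

	vp->v_usecount--;

	/* Phase 2: no reference is left; only cheap cleanup may run. */
	xvop_inactive_free(vp);
}

int
main(void)
{
	struct xvnode vn = { 1 };

	xvrele(&vn);
	return (0);
}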





Re: nfs_inactive() bug? -> panic: lockmgr: locking against myself

2002-09-11 Thread Ian Dowse

In message <[EMAIL PROTECTED]>, Don Lewis writes:
>
>A potentially better solution just occurred to me.  It looks like it
>would be better if vrele() waited to decrement v_usecount until *after*
>the call to VOP_INACTIVE() (and after the call to VI_LOCK()).  If that
>were done, nfs_inactive() wouldn't need to muck with v_usecount at all.

I've looked at this before; I think some filesystems (ufs anyway)
depend on v_usecount being 0 when VOP_INACTIVE() is called. The
patch I have had lying around for quite a while is below. It adds
a vnode flag to avoid recursion into the last-reference handling
code in vrele/vput, which is the real problem.

It also guarantees that a vnode will not be recycled during
VOP_INACTIVE(), so the nfs code no longer needs to hold an extra
reference in the first place. The flag manipulation code got a bit
messy after Jeff's vnode flag splitting work, so the patch could
probably be improved.

Whatever way this is done, we should try to avoid adding more hacks
to the nfs_inactive() code anyway.

Ian

Index: sys/vnode.h
===
RCS file: /home/iedowse/CVS/src/sys/sys/vnode.h,v
retrieving revision 1.206
diff -u -r1.206 vnode.h
--- sys/vnode.h 1 Sep 2002 20:37:21 -   1.206
+++ sys/vnode.h 11 Sep 2002 11:06:46 -
@@ -220,6 +220,7 @@
 #defineVI_DOOMED   0x0080  /* This vnode is being recycled */
 #defineVI_FREE 0x0100  /* This vnode is on the freelist */
 #defineVI_OBJDIRTY 0x0400  /* object might be dirty */
+#defineVI_INACTIVE 0x0800  /* VOP_INACTIVE is in progress */
 /*
  * XXX VI_ONWORKLST could be replaced with a check for NULL list elements
  * in v_synclist.
@@ -377,14 +378,14 @@
 
 /* Requires interlock */
 #defineVSHOULDFREE(vp) \
-   (!((vp)->v_iflag & (VI_FREE|VI_DOOMED)) && \
+   (!((vp)->v_iflag & (VI_FREE|VI_DOOMED|VI_INACTIVE)) && \
 !(vp)->v_holdcnt && !(vp)->v_usecount && \
 (!(vp)->v_object || \
  !((vp)->v_object->ref_count || (vp)->v_object->resident_page_count)))
 
 /* Requires interlock */
 #define VMIGHTFREE(vp) \
-   (!((vp)->v_iflag & (VI_FREE|VI_DOOMED|VI_XLOCK)) && \
+   (!((vp)->v_iflag & (VI_FREE|VI_DOOMED|VI_XLOCK|VI_INACTIVE)) && \
 LIST_EMPTY(&(vp)->v_cache_src) && !(vp)->v_usecount)
 
 /* Requires interlock */
Index: nfsclient/nfs_node.c
===
RCS file: /home/iedowse/CVS/src/sys/nfsclient/nfs_node.c,v
retrieving revision 1.55
diff -u -r1.55 nfs_node.c
--- nfsclient/nfs_node.c11 Jul 2002 17:54:58 -  1.55
+++ nfsclient/nfs_node.c11 Sep 2002 11:06:46 -
@@ -289,21 +289,7 @@
} else
sp = NULL;
if (sp) {
-   /*
-* We need a reference to keep the vnode from being
-* recycled by getnewvnode while we do the I/O
-* associated with discarding the buffers unless we
-* are being forcibly unmounted in which case we already
-* have our own reference.
-*/
-   if (ap->a_vp->v_usecount > 0)
-   (void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
-   else if (vget(ap->a_vp, 0, td))
-   panic("nfs_inactive: lost vnode");
-   else {
-   (void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
-   vrele(ap->a_vp);
-   }
+   (void)nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
/*
 * Remove the silly file that was rename'd earlier
 */
Index: kern/vfs_subr.c
===
RCS file: /home/iedowse/CVS/src/sys/kern/vfs_subr.c,v
retrieving revision 1.401
diff -u -r1.401 vfs_subr.c
--- kern/vfs_subr.c 5 Sep 2002 20:46:19 -   1.401
+++ kern/vfs_subr.c 11 Sep 2002 11:06:46 -
@@ -829,7 +829,8 @@
for (count = 0; count < freevnodes; count++) {
vp = TAILQ_FIRST(&vnode_free_list);
 
-   KASSERT(vp->v_usecount == 0, 
+   KASSERT(vp->v_usecount == 0 &&
+   (vp->v_iflag & VI_INACTIVE) == 0,
("getnewvnode: free vnode isn't"));
 
TAILQ_REMOVE(&vnode_free_list, vp, v_freelist);
@@ -1980,8 +1981,8 @@
KASSERT(vp->v_writecount < vp->v_usecount || vp->v_usecount < 1,
("vrele: missed vn_close"));
 
-   if (vp->v_usecount > 1) {
-
+   if (vp->v_usecount > 1 ||
+   ((vp->v_iflag & VI_INACTIVE) && vp->v_usecount == 1)) {
vp->v_usecount--;
VI_UNLOCK(vp);
 
@@ -1991,13 +1992,20 @@
if (vp->v_usecount == 1) {
vp->v_usecount--;
/*
-* We must call VOP_INACTIVE with the node l

Re: nfs_inactive() bug? -> panic: lockmgr: locking against myself

2002-09-10 Thread Deepankar Das

I remember seeing this panic many months back. I had fixed this in our
Freebsd source base.

Here is the scenario:

vrele() {
    if (vp->v_usecount == 1) {
        vp->v_usecount--;
        if (VSHOULDFREE(vp))
            vfree(vp);
        vn_lock(vp, LK_EXCLUSIVE | LK_INTERLOCK, p);            (A)
        VOP_INACTIVE() { /* translates to nfs_inactive() */
            sp = np->n_sillyrename;
            if (sp) {
                if (ap->a_vp->v_usecount > 0) {
                    ...
                } else {
                    vget(ap->a_vp, 0, p);
                    nfs_vinvalbuf(ap->a_vp, 0, ...);
                    vrele(ap->a_vp);                            (B)
                }
                ...
            }
        }
    }
}

vrele() calls nfs_inactive() since the vp->v_usecount == 0. Before 
calling nfs_inactive, it locks the vnode at (A) (since VOP_INACTIVE() 
expects the vnode to be locked and VOP_INACTIVE() releases the lock at
the end).

In nfs_inactive() a check is made to see if the file was 
renamed/deleted while it was still in use. In this case it was, and so
np->n_sillyrename is valid. To make sure that the vnode does not get
reused while doing the I/O associated with discarding the buffers, the
v_usecount is bumped up using vget(). Then nfs_vinvalbuf() is called to
flush and invalidate all dirty buffers. After that vrele() is done on
the vnode at (B). vrele() again comes to (A) and tries to lock the
vnode, which was already locked before by the same process in the 1st
pass. The kernel panics with "locking against myself".

The vrele() that is done at (B) causes the cycle to be repeated. vrele()
calls nfs_inactive() which calls vrele() and so forth. In
nfs_inactive(), we know that the function calling it will free the
vnode. So we need not call vfree(). We also know that we should not lock
the vnode again and call VOP_INACTIVE(). So the fix is to inline the
part of vrele() that just decrements the usecount.

The following would be the fix. This is 4.3 source base.

 //depot/dev/freebsd-brick/src/sys/nfs/nfs_node.c#3 (text) 

225c225,244
<               vrele(ap->a_vp);
---
>               /*
>                * Inline part of vrele() since VOP_INACTIVE
>                * has already been called.
>                */
>               simple_lock(&ap->a_vp->v_interlock);
>               if (--ap->a_vp->v_usecount <= 0) {
> #ifdef DIAGNOSTIC
>                       if (ap->a_vp->v_usecount < 0 ||
>                           ap->a_vp->v_writecount != 0) {
>                               vprint("nfs_inactive: bad ref count",
>                                   ap->a_vp);
>                               panic("nfs_inactive: ref cnt");
>                       }
> #endif
>                       /*
>                        * Since this is nfs_inactive(), vfree()
>                        * would be called from the caller.
>                        */
>               }
>               simple_unlock(&ap->a_vp->v_interlock);

Deepankar

Don Lewis wrote:

> I've gotten a couple "lockmgr: locking against myself" panics today and
> it seems to be somewhat reproduceable.  I had the serial console hooked
> up for the last one, so I was able to get a stack trace:
> 
> 
> panic: lockmgr: locking against myself
> Debugger("panic")
> Stopped at  Debugger+0x45:  xchgl   %ebx,in_Debugger.0
> db> tr
> Debugger(c0449abc) at Debugger+0x45
> panic(c0447760,215,0,c6cd7de0,10002) at panic+0x7c
> lockmgr(c6cd7ea8,10002,c6cd7de0,c6381f00,c6cd7de0) at lockmgr+0x412
> vop_sharedlock(e4b90b58,c047c440,c6cd7de0,1010002,c6381f00) at vop_sharedlock+0x
> 6e
> vn_lock(c6cd7de0,10002,c6381f00) at vn_lock+0xb4
> vrele(c6cd7de0,c6cd7de0,0,c681cb00,c6381f00,1) at vrele+0x8d
> nfs_inactive(e4b90bdc,c047c3c0,c6cd7de0,c6381f00,c6381f00) at nfs_inactive+0xbd
> VOP_INACTIVE(c6cd7de0,c6381f00) at VOP_INACTIVE+0xa0
> vrele(c6cd7de0,c6834474,c6834474,e4b90c38,c02eed4a) at vrele+0x9b
> vn_close(c6cd7de0,2,c681cb00,c6381f00,e4b90c70) at vn_close+0x37
> vn_closefile(c6834474,c6381f00) at vn_closefile+0x1e
> fdrop_locked(c6834474,c6381f00,c04fe8ac,0,c0446020) at fdrop_locked+0x102
> fdrop(c6834474,c6381f00,c028a1ec,c6381f00,c6377c8c) at fdrop+0x24
> closef(c6834474,c6381f00,c6834474,c6c0d3f8,c6381f00) at closef+0x77
> close(c6381f00,e4b90d14,1,b9,202) at close+0x107
> syscall(2f,2f,2f,82124cc,23) at syscall+0x243
> Xint0x80_syscall() at Xint0x80_syscall+0x1d
> --- syscall (6, FreeBSD ELF32, close), eip = 0x4862ff3b, esp = 0xbfbff2dc, ebp =
>  0xbfbff358 ---
> 
> db> show lockedvnods
> Locked vnodes
> 0xc6cd7de0: type VREG, usecount 0, writecount 0, refcount 0,

Re: nfs_inactive() bug? -> panic: lockmgr: locking against myself

2002-09-10 Thread Don Lewis

On 10 Sep, Don Lewis wrote:

> It looks like the call to vrele() from vn_close() is executing the
> following code:
> 
> if (vp->v_usecount == 1) {
> vp->v_usecount--;
> /*
>  * We must call VOP_INACTIVE with the node locked.
>  * If we are doing a vput, the node is already locked,
>  * but, in the case of vrele, we must explicitly lock
>  * the vnode before calling VOP_INACTIVE.
>  */ 
> if (vn_lock(vp, LK_EXCLUSIVE | LK_INTERLOCK, td) == 0)
> VOP_INACTIVE(vp, td);
> VI_LOCK(vp);
> if (VSHOULDFREE(vp))
> vfree(vp);
> else
> vlruvp(vp);
> VI_UNLOCK(vp);
> 
> It has decremented v_usecount to 0, grabbed an exclusive lock on the
> vnode, and has called VOP_INACTIVE().


> It looks like nfs_inactive() is executing the following code:
> 
> if (sp) {
> /*
>  * We need a reference to keep the vnode from being
>  * recycled by getnewvnode while we do the I/O
>  * associated with discarding the buffers unless we
>  * are being forcibly unmounted in which case we already
>  * have our own reference.
>  */
> if (ap->a_vp->v_usecount > 0)
> (void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
> else if (vget(ap->a_vp, 0, td))
> panic("nfs_inactive: lost vnode");
> else {
> (void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
> vrele(ap->a_vp);
> }
> 

A potentially better solution just occurred to me.  It looks like it
would be better if vrele() waited to decrement v_usecount until *after*
the call to VOP_INACTIVE() (and after the call to VI_LOCK()).  If that
were done, nfs_inactive() wouldn't need to muck with v_usecount at all.
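
A rough sketch of that reordering, modelled on the vrele() excerpt
quoted above; the interlock and lock-failure handling appear only as
comments, and this is an illustration rather than a tested patch:

struct xvnode {
	int	v_usecount;
};

static void
xvop_inactive(struct xvnode *vp)
{
	/*
	 * nfs_inactive() would run here and see v_usecount == 1, so it
	 * could call nfs_vinvalbuf() directly, with no vget()/vrele()
	 * dance and no recursion back into the last-reference code.
	 */
	(void)vp;
}

static void
xvrele_last(struct xvnode *vp)
{
	/* v_usecount == 1 on entry. */

	/* vn_lock(vp, LK_EXCLUSIVE | LK_INTERLOCK, td) */
	xvop_inactive(vp);	/* reference still held while doing I/O */

	/* VI_LOCK(vp) */
	vp->v_usecount--;	/* drop the last reference only now */
	/* VSHOULDFREE(vp) ? vfree(vp) : vlruvp(vp) */
	/* VI_UNLOCK(vp) */
}

int
main(void)
{
	struct xvnode vn = { 1 };

	xvrele_last(&vn);
	return (0);
}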





nfs_inactive() bug? -> panic: lockmgr: locking against myself

2002-09-10 Thread Don Lewis

I've gotten a couple "lockmgr: locking against myself" panics today and
it seems to be somewhat reproducible.  I had the serial console hooked
up for the last one, so I was able to get a stack trace:


panic: lockmgr: locking against myself
Debugger("panic")
Stopped at  Debugger+0x45:  xchgl   %ebx,in_Debugger.0
db> tr
Debugger(c0449abc) at Debugger+0x45
panic(c0447760,215,0,c6cd7de0,10002) at panic+0x7c
lockmgr(c6cd7ea8,10002,c6cd7de0,c6381f00,c6cd7de0) at lockmgr+0x412
vop_sharedlock(e4b90b58,c047c440,c6cd7de0,1010002,c6381f00) at vop_sharedlock+0x
6e
vn_lock(c6cd7de0,10002,c6381f00) at vn_lock+0xb4
vrele(c6cd7de0,c6cd7de0,0,c681cb00,c6381f00,1) at vrele+0x8d
nfs_inactive(e4b90bdc,c047c3c0,c6cd7de0,c6381f00,c6381f00) at nfs_inactive+0xbd
VOP_INACTIVE(c6cd7de0,c6381f00) at VOP_INACTIVE+0xa0
vrele(c6cd7de0,c6834474,c6834474,e4b90c38,c02eed4a) at vrele+0x9b
vn_close(c6cd7de0,2,c681cb00,c6381f00,e4b90c70) at vn_close+0x37
vn_closefile(c6834474,c6381f00) at vn_closefile+0x1e
fdrop_locked(c6834474,c6381f00,c04fe8ac,0,c0446020) at fdrop_locked+0x102
fdrop(c6834474,c6381f00,c028a1ec,c6381f00,c6377c8c) at fdrop+0x24
closef(c6834474,c6381f00,c6834474,c6c0d3f8,c6381f00) at closef+0x77
close(c6381f00,e4b90d14,1,b9,202) at close+0x107
syscall(2f,2f,2f,82124cc,23) at syscall+0x243
Xint0x80_syscall() at Xint0x80_syscall+0x1d
--- syscall (6, FreeBSD ELF32, close), eip = 0x4862ff3b, esp = 0xbfbff2dc, ebp =
 0xbfbff358 ---

db> show lockedvnods
Locked vnodes
0xc6cd7de0: type VREG, usecount 0, writecount 0, refcount 0, flags (VV_OBJBUF)
tag VT_NFS, fileid 112174 fsid 0x100ff02


It looks like the call to vrele() from vn_close() is executing the
following code:

if (vp->v_usecount == 1) {
vp->v_usecount--;
/*
 * We must call VOP_INACTIVE with the node locked.
 * If we are doing a vput, the node is already locked,
 * but, in the case of vrele, we must explicitly lock
 * the vnode before calling VOP_INACTIVE.
 */ 
if (vn_lock(vp, LK_EXCLUSIVE | LK_INTERLOCK, td) == 0)
VOP_INACTIVE(vp, td);
VI_LOCK(vp);
if (VSHOULDFREE(vp))
vfree(vp);
else
vlruvp(vp);
VI_UNLOCK(vp);

It has decremented v_usecount to 0, grabbed an exclusive lock on the
vnode, and has called VOP_INACTIVE().

It looks like nfs_inactive() is executing the following code:

if (sp) {
/*
 * We need a reference to keep the vnode from being
 * recycled by getnewvnode while we do the I/O
 * associated with discarding the buffers unless we
 * are being forcibly unmounted in which case we already
 * have our own reference.
 */
if (ap->a_vp->v_usecount > 0)
(void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
else if (vget(ap->a_vp, 0, td))
panic("nfs_inactive: lost vnode");
else {
(void) nfs_vinvalbuf(ap->a_vp, 0, sp->s_cred, td, 1);
vrele(ap->a_vp);
}

Because v_usecount was 0, nfs_inactive() calls vget(), which increments
v_usecount to 1, then calls nfs_vinvalbuf(), and finally calls vrele()
again.

Because v_usecount is 1, vrele() is re-executing the same code and blows
up when it tries to grab the lock again.  If that didn't cause a panic,
VOP_INACTIVE() and nfs_inactive() would be called recursively, which
would probably be a bad thing ...

I suspect that nfs_inactive() needs to avoid calling vrele() here and
should just decrement the v_usecount with the appropriate locking.
Something like:
VI_LOCK(ap->a_vp);
ap->a_vp->v_usecount--;
VI_UNLOCK(ap->a_vp);

