Re: Core Dump / panic sleeping thread
On Mar 21, 2013, at 9:39 PM, Konstantin Belousov wrote:
>
> You should use the r248567 + r248581.

OK thanks. I've upgraded to 9-STABLE and applied your patches. Will let you know if I experience further crashes.

thanks,
/mich
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Core Dump / panic sleeping thread
On Thu, Mar 21, 2013 at 07:59:25PM +0100, Michael Landin Hostbaek wrote:
>
> On Mar 21, 2013, at 8:58 AM, Konstantin Belousov wrote:
>
> > On Wed, Mar 20, 2013 at 09:14:37PM -0400, Rick Macklem wrote:
> >> Well, read/write sharing of files over NFS is pretty rare, so I suspect
> >> a truncation of a file by another client (or locally in the NFS server)
> >> is a rare event. As such, not invalidating the buffers here doesn't seem
> >> like a big issue? (The client uses np->n_size to determine EOF.)
> >>
> >> Also, I think close-to-open consistency will typically throw away the
> >> buffers on the next open when it sees the mtime changed. (Yes, there
> >> won't necessarily be another open, but...)
> > nfs buffers are VMIO. Each VMIO buffer wires the pages it references.
> > Wired pages cannot be freed by vnode_pager_setsize() if the file is
> > truncated.
>
> Should I wait for a new patch, or should I give the one you sent yesterday a
> try?
>
> Thanks,

You should use the r248567 + r248581.
Re: Core Dump / panic sleeping thread
On Mar 21, 2013, at 8:58 AM, Konstantin Belousov wrote:

> On Wed, Mar 20, 2013 at 09:14:37PM -0400, Rick Macklem wrote:
>> Well, read/write sharing of files over NFS is pretty rare, so I suspect
>> a truncation of a file by another client (or locally in the NFS server)
>> is a rare event. As such, not invalidating the buffers here doesn't seem
>> like a big issue? (The client uses np->n_size to determine EOF.)
>>
>> Also, I think close-to-open consistency will typically throw away the
>> buffers on the next open when it sees the mtime changed. (Yes, there
>> won't necessarily be another open, but...)
> nfs buffers are VMIO. Each VMIO buffer wires the pages it references.
> Wired pages cannot be freed by vnode_pager_setsize() if the file is
> truncated.

Should I wait for a new patch, or should I give the one you sent yesterday a try?

Thanks,
/mich
Re: Core Dump / panic sleeping thread
On Wed, Mar 20, 2013 at 09:14:37PM -0400, Rick Macklem wrote:
> Well, read/write sharing of files over NFS is pretty rare, so I suspect
> a truncation of a file by another client (or locally in the NFS server)
> is a rare event. As such, not invalidating the buffers here doesn't seem
> like a big issue? (The client uses np->n_size to determine EOF.)
>
> Also, I think close-to-open consistency will typically throw away the
> buffers on the next open when it sees the mtime changed. (Yes, there
> won't necessarily be another open, but...)

nfs buffers are VMIO. Each VMIO buffer wires the pages it references. Wired pages cannot be freed by vnode_pager_setsize() if the file is truncated.
Re: Core Dump / panic sleeping thread
Konstantin Belousov wrote:
> On Wed, Mar 20, 2013 at 11:37:56AM -0400, Rick Macklem wrote:
> > Konstantin Belousov wrote:
> > > On Wed, Mar 20, 2013 at 12:13:05PM +0100, Michael Landin Hostbaek wrote:
> > > >
> > > > On Mar 20, 2013, at 10:49 AM, Konstantin Belousov wrote:
> > > > >
> > > > > I do not like it. As I said in the previous response to Andrey,
> > > > > I think that moving the vnode_pager_setsize() after the unlock is
> > > > > better, since it reduces races with other thread seeing half-done
> > > > > attribute update or making attribute change simultaneously.
> > > >
> > > > OK - so should I wait for another patch - or?
> > >
> > > I think the following is what I mean. As an additional note, why nfs
> > > client does not trim the buffers when server reported node size change?
> > >
> > > diff --git a/sys/fs/nfsclient/nfs_clport.c b/sys/fs/nfsclient/nfs_clport.c
> > > index a07a67f..4fe2e35 100644
> > > --- a/sys/fs/nfsclient/nfs_clport.c
> > > +++ b/sys/fs/nfsclient/nfs_clport.c
> > > @@ -361,6 +361,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
> > >  	struct nfsnode *np;
> > >  	struct nfsmount *nmp;
> > >  	struct timespec mtime_save;
> > > +	u_quad_t nsize;
> > > +	int setnsize;
> > >
> > >  	/*
> > >  	 * If v_type == VNON it is a new node, so fill in the v_type,
> > > @@ -418,6 +420,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
> > >  	} else
> > >  		vap->va_fsid = vp->v_mount->mnt_stat.f_fsid.val[0];
> > >  	np->n_attrstamp = time_second;
> > > +	setnsize = 0;
> > >  	if (vap->va_size != np->n_size) {
> > >  		if (vap->va_type == VREG) {
> > >  			if (dontshrink && vap->va_size < np->n_size) {
> > > @@ -444,10 +447,13 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
> > >  				np->n_size = vap->va_size;
> > >  				np->n_flag |= NSIZECHANGED;
> > >  			}
> > > -			vnode_pager_setsize(vp, np->n_size);
> > >  		} else {
> > >  			np->n_size = vap->va_size;
> > >  		}
> > > +		if (vap->va_type == VREG || vap->va_type == VDIR) {
> > > +			setnsize = 1;
> > > +			nsize = vap->va_size;
> > I might have used np->n_size here, since that is what is given
> > as the argument for the pre-patched version, but since
> > np->n_size should equal vap->va_size (it is set the same for
> > all cases in the code at this point), it doesn't really matter.
> >
> > I have no idea what the implications of doing vnode_pager_setsize()
> > for VDIR are, but Kostik would be much more conversant than I on this,
> > so if he thinks it's ok, that's fine with me.
> >
> > > +		}
> > >  	}
> > >  	/*
> > >  	 * The following checks are added to prevent a race between (say)
> > > @@ -480,6 +486,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
> > >  		KDTRACE_NFS_ATTRCACHE_LOAD_DONE(vp, vap, 0);
> > >  #endif
> > >  	NFSUNLOCKNODE(np);
> > > +	if (setnsize)
> > > +		vnode_pager_setsize(vp, nsize);
> > >  	return (0);
> > >  }
> > Yes, I think Kostik's version of the patch is good. I had thought
> > of doing it that way, but went for the "minimal change" version.
> > I agree that avoiding unlocking/relocking the mutex is a good idea,
> > although I didn't see anything after the relock that I thought
> > might be an issue if something changed while unlocked.
> If the parallel calls to nfscl_loadattrcache() are possible, then
> IMHO at least the n_attrstamp could be cleared needlessly.
>
> >
> > Kostik, thanks for posting this version, rick
> > ps: Michael, I'd suggest you try this patch instead of mine.
> Still, my patch has the issue I noted for the head as well: the buffers
> are not destroyed if the size of the vnode is decreased. I would be
> inclined to suggest the following change on top of my patch, but I am
> sure that it does not work, since vnode is generally not locked in
> the nfs_loadattrcache(), I think:
>
Oh, and I think jhb@ was mentioning, if this client is only reading the file, it will invalidate the buffers when it sees the mtime change on a subsequent read.

rick

> diff --git a/sys/fs/nfsclient/nfs_clport.c b/sys/fs/nfsclient/nfs_clport.c
> index 4fe2e35..3a08424 100644
> --- a/sys/fs/nfsclient/nfs_clport.c
> +++ b/sys/fs/nfsclient/nfs_clport.c
> @@ -487,7 +487,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
>  #endif
>  	NFSUNLOCKNODE(np);
>  	if (setnsize)
> -		vnode_pager_setsize(vp, nsize);
> +		vtruncbuf(vp, NOCRED, nsize, vp->v_bufobj.bo_bsize);
>  	return (0);
>  }
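The mtime-based invalidation mentioned above can be modelled in a few lines of userland C. This is only a toy sketch under stated assumptions: struct file_cache and revalidate() are hypothetical names invented for illustration, not the FreeBSD NFS client's data structures or functions, and the real client tracks far more state.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical client-side cache state; NOT the real struct nfsnode. */
struct file_cache {
	time_t	 cached_mtime;	/* mtime seen when the data was cached */
	uint64_t cached_size;	/* size the client uses to determine EOF */
	int	 nbufs;		/* number of cached data buffers */
};

/*
 * Compare freshly fetched attributes with the cached ones.  If mtime
 * moved, drop every cached buffer and adopt the new attributes.
 * Returns 1 if the cache was invalidated, 0 if it was kept.
 */
static int
revalidate(struct file_cache *fc, time_t server_mtime, uint64_t server_size)
{
	assert(fc != NULL);
	if (server_mtime == fc->cached_mtime)
		return (0);		/* unchanged: keep the buffers */
	fc->nbufs = 0;			/* stale: throw the buffers away */
	fc->cached_mtime = server_mtime;
	fc->cached_size = server_size;
	return (1);
}
```

In this model a truncation on the server is only noticed if it also changes mtime, which is why the question of whether truncation updates mtime or only ctime matters for the discussion.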
Re: Core Dump / panic sleeping thread
Konstantin Belousov wrote:
> On Wed, Mar 20, 2013 at 11:37:56AM -0400, Rick Macklem wrote:
> > Konstantin Belousov wrote:
> > > On Wed, Mar 20, 2013 at 12:13:05PM +0100, Michael Landin Hostbaek wrote:
> > > >
> > > > On Mar 20, 2013, at 10:49 AM, Konstantin Belousov wrote:
> > > > >
> > > > > I do not like it. As I said in the previous response to Andrey,
> > > > > I think that moving the vnode_pager_setsize() after the unlock is
> > > > > better, since it reduces races with other thread seeing half-done
> > > > > attribute update or making attribute change simultaneously.
> > > >
> > > > OK - so should I wait for another patch - or?
> > >
> > > I think the following is what I mean. As an additional note, why nfs
> > > client does not trim the buffers when server reported node size change?
> > >
> > > diff --git a/sys/fs/nfsclient/nfs_clport.c b/sys/fs/nfsclient/nfs_clport.c
> > > index a07a67f..4fe2e35 100644
> > > --- a/sys/fs/nfsclient/nfs_clport.c
> > > +++ b/sys/fs/nfsclient/nfs_clport.c
> > > @@ -361,6 +361,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
> > >  	struct nfsnode *np;
> > >  	struct nfsmount *nmp;
> > >  	struct timespec mtime_save;
> > > +	u_quad_t nsize;
> > > +	int setnsize;
> > >
> > >  	/*
> > >  	 * If v_type == VNON it is a new node, so fill in the v_type,
> > > @@ -418,6 +420,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
> > >  	} else
> > >  		vap->va_fsid = vp->v_mount->mnt_stat.f_fsid.val[0];
> > >  	np->n_attrstamp = time_second;
> > > +	setnsize = 0;
> > >  	if (vap->va_size != np->n_size) {
> > >  		if (vap->va_type == VREG) {
> > >  			if (dontshrink && vap->va_size < np->n_size) {
> > > @@ -444,10 +447,13 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
> > >  				np->n_size = vap->va_size;
> > >  				np->n_flag |= NSIZECHANGED;
> > >  			}
> > > -			vnode_pager_setsize(vp, np->n_size);
> > >  		} else {
> > >  			np->n_size = vap->va_size;
> > >  		}
> > > +		if (vap->va_type == VREG || vap->va_type == VDIR) {
> > > +			setnsize = 1;
> > > +			nsize = vap->va_size;
> > I might have used np->n_size here, since that is what is given
> > as the argument for the pre-patched version, but since
> > np->n_size should equal vap->va_size (it is set the same for
> > all cases in the code at this point), it doesn't really matter.
> >
> > I have no idea what the implications of doing vnode_pager_setsize()
> > for VDIR are, but Kostik would be much more conversant than I on this,
> > so if he thinks it's ok, that's fine with me.
> >
> > > +		}
> > >  	}
> > >  	/*
> > >  	 * The following checks are added to prevent a race between (say)
> > > @@ -480,6 +486,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
> > >  		KDTRACE_NFS_ATTRCACHE_LOAD_DONE(vp, vap, 0);
> > >  #endif
> > >  	NFSUNLOCKNODE(np);
> > > +	if (setnsize)
> > > +		vnode_pager_setsize(vp, nsize);
> > >  	return (0);
> > >  }
> > Yes, I think Kostik's version of the patch is good. I had thought
> > of doing it that way, but went for the "minimal change" version.
> > I agree that avoiding unlocking/relocking the mutex is a good idea,
> > although I didn't see anything after the relock that I thought
> > might be an issue if something changed while unlocked.
> If the parallel calls to nfscl_loadattrcache() are possible, then
> IMHO at least the n_attrstamp could be cleared needlessly.
>
And that could result in an attribute cache miss, since setting n_attrstamp = 0 invalidates the cached attributes. I agree that your patch is preferred. I'll admit I was mostly lazy (and a little afraid of moving the vnode_pager_setsize()) when I did the patch.

> >
> > Kostik, thanks for posting this version, rick
> > ps: Michael, I'd suggest you try this patch instead of mine.
> Still, my patch has the issue I noted for the head as well: the buffers
> are not destroyed if the size of the vnode is decreased. I would be
> inclined to suggest the following change on top of my patch, but I am
> sure that it does not work, since vnode is generally not locked in
> the nfs_loadattrcache(), I think:
>
Well, read/write sharing of files over NFS is pretty rare, so I suspect a truncation of a file by another client (or locally in the NFS server) is a rare event. As such, not invalidating the buffers here doesn't seem like a big issue? (The client uses np->n_size to determine EOF.)

Also, I think close-to-open consistency will typically throw away the buffers on the next open when it sees the mtime changed. (Yes, there won't necessarily be another open, but...)

As you point out, I don't think the vnode will be locked when nfscl_loadattrcache() is called for read/writes being done by the nfsiod threads and will only be a shared vnode lock for other reads.

I think your patch without the below addition is just fine, rick

> diff --git a/sys/fs/nfsclient/nfs_clport.c b/sys/fs/nfsclient/nfs_
Re: Core Dump / panic sleeping thread
On Wed, Mar 20, 2013 at 08:58:08PM +0200, Konstantin Belousov wrote:
> On Wed, Mar 20, 2013 at 09:43:20AM -0400, John Baldwin wrote:
> > On Wednesday, March 20, 2013 9:22:22 am Konstantin Belousov wrote:
> > > On Wed, Mar 20, 2013 at 12:13:05PM +0100, Michael Landin Hostbaek wrote:
> > > >
> > > > On Mar 20, 2013, at 10:49 AM, Konstantin Belousov wrote:
> > > > >
> > > > > I do not like it. As I said in the previous response to Andrey,
> > > > > I think that moving the vnode_pager_setsize() after the unlock is
> > > > > better, since it reduces races with other thread seeing half-done
> > > > > attribute update or making attribute change simultaneously.
> > > >
> > > > OK - so should I wait for another patch - or?
> > >
> > > I think the following is what I mean. As an additional note, why nfs
> > > client does not trim the buffers when server reported node size change?
> >
> > Will changing the size always result in an mtime change forcing the client to
> > throw away the data on the next read or fault anyway (or does it only affect
> > ctime)?
>
> UFS only modifies ctime on truncation, it seems.

No, I was wrong. ffs_truncate() in fact sets both the IN_CHANGE and IN_UPDATE flags for the inode, and IN_UPDATE causes an mtime update in ufs_itimes(), called from UFS_UPDATE().
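The claim that truncation updates mtime can be checked from userland. The following is a minimal sketch, assuming a POSIX system with a writable /tmp; truncate_updates_mtime() is a hypothetical helper written for this illustration. It exercises whatever local filesystem backs /tmp through ftruncate(2), so it says nothing specific about UFS internals or about what an NFS server reports to its clients.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/*
 * Create a scratch file, backdate its mtime, shrink it with ftruncate(),
 * and report whether the truncation moved mtime forward.
 * Returns 1 if mtime advanced, 0 if it did not, -1 on error.
 */
static int
truncate_updates_mtime(void)
{
	char path[] = "/tmp/trunc_mtime.XXXXXX";
	/* times[0] is atime, times[1] is mtime; both set to 1000s after the epoch. */
	struct timespec times[2] = { { 1000, 0 }, { 1000, 0 } };
	struct stat sb;
	int fd, result = -1;

	fd = mkstemp(path);
	if (fd == -1)
		return (-1);
	if (write(fd, "0123456789", 10) == 10 &&
	    futimens(fd, times) == 0 &&		/* backdate mtime */
	    ftruncate(fd, 4) == 0 &&		/* shrink the file */
	    fstat(fd, &sb) == 0)
		result = (sb.st_size == 4 && sb.st_mtime > 1000);
	close(fd);
	unlink(path);
	return (result);
}
```

A return value of 1 matches the corrected statement above: a size change shows up as an mtime change, which is the attribute the NFS client compares when deciding whether its cached data is stale.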
Re: Core Dump / panic sleeping thread
On Wed, Mar 20, 2013 at 09:43:20AM -0400, John Baldwin wrote:
> On Wednesday, March 20, 2013 9:22:22 am Konstantin Belousov wrote:
> > On Wed, Mar 20, 2013 at 12:13:05PM +0100, Michael Landin Hostbaek wrote:
> > >
> > > On Mar 20, 2013, at 10:49 AM, Konstantin Belousov wrote:
> > > >
> > > > I do not like it. As I said in the previous response to Andrey,
> > > > I think that moving the vnode_pager_setsize() after the unlock is
> > > > better, since it reduces races with other thread seeing half-done
> > > > attribute update or making attribute change simultaneously.
> > >
> > > OK - so should I wait for another patch - or?
> >
> > I think the following is what I mean. As an additional note, why nfs
> > client does not trim the buffers when server reported node size change?
>
> Will changing the size always result in an mtime change forcing the client to
> throw away the data on the next read or fault anyway (or does it only affect
> ctime)?

UFS only modifies ctime on truncation, it seems.
Re: Core Dump / panic sleeping thread
On Wednesday, March 20, 2013 9:22:22 am Konstantin Belousov wrote:
> On Wed, Mar 20, 2013 at 12:13:05PM +0100, Michael Landin Hostbaek wrote:
> >
> > On Mar 20, 2013, at 10:49 AM, Konstantin Belousov wrote:
> > >
> > > I do not like it. As I said in the previous response to Andrey,
> > > I think that moving the vnode_pager_setsize() after the unlock is
> > > better, since it reduces races with other thread seeing half-done
> > > attribute update or making attribute change simultaneously.
> >
> > OK - so should I wait for another patch - or?
>
> I think the following is what I mean. As an additional note, why nfs
> client does not trim the buffers when server reported node size change?

Will changing the size always result in an mtime change forcing the client to throw away the data on the next read or fault anyway (or does it only affect ctime)?

--
John Baldwin
Re: Core Dump / panic sleeping thread
On Wed, Mar 20, 2013 at 11:37:56AM -0400, Rick Macklem wrote:
> Konstantin Belousov wrote:
> > On Wed, Mar 20, 2013 at 12:13:05PM +0100, Michael Landin Hostbaek wrote:
> > >
> > > On Mar 20, 2013, at 10:49 AM, Konstantin Belousov wrote:
> > > >
> > > > I do not like it. As I said in the previous response to Andrey,
> > > > I think that moving the vnode_pager_setsize() after the unlock is
> > > > better, since it reduces races with other thread seeing half-done
> > > > attribute update or making attribute change simultaneously.
> > >
> > > OK - so should I wait for another patch - or?
> >
> > I think the following is what I mean. As an additional note, why nfs
> > client does not trim the buffers when server reported node size change?
> >
> > diff --git a/sys/fs/nfsclient/nfs_clport.c b/sys/fs/nfsclient/nfs_clport.c
> > index a07a67f..4fe2e35 100644
> > --- a/sys/fs/nfsclient/nfs_clport.c
> > +++ b/sys/fs/nfsclient/nfs_clport.c
> > @@ -361,6 +361,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
> >  	struct nfsnode *np;
> >  	struct nfsmount *nmp;
> >  	struct timespec mtime_save;
> > +	u_quad_t nsize;
> > +	int setnsize;
> >
> >  	/*
> >  	 * If v_type == VNON it is a new node, so fill in the v_type,
> > @@ -418,6 +420,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
> >  	} else
> >  		vap->va_fsid = vp->v_mount->mnt_stat.f_fsid.val[0];
> >  	np->n_attrstamp = time_second;
> > +	setnsize = 0;
> >  	if (vap->va_size != np->n_size) {
> >  		if (vap->va_type == VREG) {
> >  			if (dontshrink && vap->va_size < np->n_size) {
> > @@ -444,10 +447,13 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
> >  				np->n_size = vap->va_size;
> >  				np->n_flag |= NSIZECHANGED;
> >  			}
> > -			vnode_pager_setsize(vp, np->n_size);
> >  		} else {
> >  			np->n_size = vap->va_size;
> >  		}
> > +		if (vap->va_type == VREG || vap->va_type == VDIR) {
> > +			setnsize = 1;
> > +			nsize = vap->va_size;
> I might have used np->n_size here, since that is what is given
> as the argument for the pre-patched version, but since
> np->n_size should equal vap->va_size (it is set the same for
> all cases in the code at this point), it doesn't really matter.
>
> I have no idea what the implications of doing vnode_pager_setsize()
> for VDIR are, but Kostik would be much more conversant than I on this,
> so if he thinks it's ok, that's fine with me.
>
> > +		}
> >  	}
> >  	/*
> >  	 * The following checks are added to prevent a race between (say)
> > @@ -480,6 +486,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
> >  		KDTRACE_NFS_ATTRCACHE_LOAD_DONE(vp, vap, 0);
> >  #endif
> >  	NFSUNLOCKNODE(np);
> > +	if (setnsize)
> > +		vnode_pager_setsize(vp, nsize);
> >  	return (0);
> >  }
> Yes, I think Kostik's version of the patch is good. I had thought
> of doing it that way, but went for the "minimal change" version.
> I agree that avoiding unlocking/relocking the mutex is a good idea,
> although I didn't see anything after the relock that I thought
> might be an issue if something changed while unlocked.

If the parallel calls to nfscl_loadattrcache() are possible, then IMHO at least the n_attrstamp could be cleared needlessly.

>
> Kostik, thanks for posting this version, rick
> ps: Michael, I'd suggest you try this patch instead of mine.

Still, my patch has the issue I noted for the head as well: the buffers are not destroyed if the size of the vnode is decreased. I would be inclined to suggest the following change on top of my patch, but I am sure that it does not work, since vnode is generally not locked in the nfs_loadattrcache(), I think:

diff --git a/sys/fs/nfsclient/nfs_clport.c b/sys/fs/nfsclient/nfs_clport.c
index 4fe2e35..3a08424 100644
--- a/sys/fs/nfsclient/nfs_clport.c
+++ b/sys/fs/nfsclient/nfs_clport.c
@@ -487,7 +487,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
 #endif
 	NFSUNLOCKNODE(np);
 	if (setnsize)
-		vnode_pager_setsize(vp, nsize);
+		vtruncbuf(vp, NOCRED, nsize, vp->v_bufobj.bo_bsize);
 	return (0);
 }
Re: Core Dump / panic sleeping thread
Konstantin Belousov wrote:
> On Wed, Mar 20, 2013 at 12:13:05PM +0100, Michael Landin Hostbaek wrote:
> >
> > On Mar 20, 2013, at 10:49 AM, Konstantin Belousov wrote:
> > >
> > > I do not like it. As I said in the previous response to Andrey,
> > > I think that moving the vnode_pager_setsize() after the unlock is
> > > better, since it reduces races with other thread seeing half-done
> > > attribute update or making attribute change simultaneously.
> >
> > OK - so should I wait for another patch - or?
>
> I think the following is what I mean. As an additional note, why nfs
> client does not trim the buffers when server reported node size change?
>
> diff --git a/sys/fs/nfsclient/nfs_clport.c b/sys/fs/nfsclient/nfs_clport.c
> index a07a67f..4fe2e35 100644
> --- a/sys/fs/nfsclient/nfs_clport.c
> +++ b/sys/fs/nfsclient/nfs_clport.c
> @@ -361,6 +361,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
>  	struct nfsnode *np;
>  	struct nfsmount *nmp;
>  	struct timespec mtime_save;
> +	u_quad_t nsize;
> +	int setnsize;
>
>  	/*
>  	 * If v_type == VNON it is a new node, so fill in the v_type,
> @@ -418,6 +420,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
>  	} else
>  		vap->va_fsid = vp->v_mount->mnt_stat.f_fsid.val[0];
>  	np->n_attrstamp = time_second;
> +	setnsize = 0;
>  	if (vap->va_size != np->n_size) {
>  		if (vap->va_type == VREG) {
>  			if (dontshrink && vap->va_size < np->n_size) {
> @@ -444,10 +447,13 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
>  				np->n_size = vap->va_size;
>  				np->n_flag |= NSIZECHANGED;
>  			}
> -			vnode_pager_setsize(vp, np->n_size);
>  		} else {
>  			np->n_size = vap->va_size;
>  		}
> +		if (vap->va_type == VREG || vap->va_type == VDIR) {
> +			setnsize = 1;
> +			nsize = vap->va_size;

I might have used np->n_size here, since that is what is given as the argument for the pre-patched version, but since np->n_size should equal vap->va_size (it is set the same for all cases in the code at this point), it doesn't really matter.

I have no idea what the implications of doing vnode_pager_setsize() for VDIR are, but Kostik would be much more conversant than I on this, so if he thinks it's ok, that's fine with me.

> +		}
>  	}
>  	/*
>  	 * The following checks are added to prevent a race between (say)
> @@ -480,6 +486,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
>  		KDTRACE_NFS_ATTRCACHE_LOAD_DONE(vp, vap, 0);
>  #endif
>  	NFSUNLOCKNODE(np);
> +	if (setnsize)
> +		vnode_pager_setsize(vp, nsize);
>  	return (0);
>  }

Yes, I think Kostik's version of the patch is good. I had thought of doing it that way, but went for the "minimal change" version. I agree that avoiding unlocking/relocking the mutex is a good idea, although I didn't see anything after the relock that I thought might be an issue if something changed while unlocked.

Kostik, thanks for posting this version, rick
ps: Michael, I'd suggest you try this patch instead of mine.
Re: Core Dump / panic sleeping thread
On Wed, Mar 20, 2013 at 12:13:05PM +0100, Michael Landin Hostbaek wrote:
>
> On Mar 20, 2013, at 10:49 AM, Konstantin Belousov wrote:
> >
> > I do not like it. As I said in the previous response to Andrey,
> > I think that moving the vnode_pager_setsize() after the unlock is
> > better, since it reduces races with other thread seeing half-done
> > attribute update or making attribute change simultaneously.
>
> OK - so should I wait for another patch - or?

I think the following is what I mean. As an additional note, why nfs client does not trim the buffers when server reported node size change?

diff --git a/sys/fs/nfsclient/nfs_clport.c b/sys/fs/nfsclient/nfs_clport.c
index a07a67f..4fe2e35 100644
--- a/sys/fs/nfsclient/nfs_clport.c
+++ b/sys/fs/nfsclient/nfs_clport.c
@@ -361,6 +361,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
 	struct nfsnode *np;
 	struct nfsmount *nmp;
 	struct timespec mtime_save;
+	u_quad_t nsize;
+	int setnsize;
 
 	/*
 	 * If v_type == VNON it is a new node, so fill in the v_type,
@@ -418,6 +420,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
 	} else
 		vap->va_fsid = vp->v_mount->mnt_stat.f_fsid.val[0];
 	np->n_attrstamp = time_second;
+	setnsize = 0;
 	if (vap->va_size != np->n_size) {
 		if (vap->va_type == VREG) {
 			if (dontshrink && vap->va_size < np->n_size) {
@@ -444,10 +447,13 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
 				np->n_size = vap->va_size;
 				np->n_flag |= NSIZECHANGED;
 			}
-			vnode_pager_setsize(vp, np->n_size);
 		} else {
 			np->n_size = vap->va_size;
 		}
+		if (vap->va_type == VREG || vap->va_type == VDIR) {
+			setnsize = 1;
+			nsize = vap->va_size;
+		}
 	}
 	/*
 	 * The following checks are added to prevent a race between (say)
@@ -480,6 +486,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
 		KDTRACE_NFS_ATTRCACHE_LOAD_DONE(vp, vap, 0);
 #endif
 	NFSUNLOCKNODE(np);
+	if (setnsize)
+		vnode_pager_setsize(vp, nsize);
 	return (0);
 }
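The shape of this patch (decide under the mutex, act after dropping it) can be sketched in userland C with pthreads. This is a minimal sketch under stated assumptions, not kernel code: struct node, pager_setsize() and load_attr() are hypothetical stand-ins for struct nfsnode, vnode_pager_setsize() and nfscl_loadattrcache(), and the real pager's own locking is omitted.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for struct nfsnode; NOT the kernel structure. */
struct node {
	pthread_mutex_t	lock;		/* plays the role of the nfsnode mutex */
	uint64_t	size;		/* cached size, like np->n_size */
	uint64_t	pager_size;	/* size the "pager" was last told */
	int		pager_calls;	/* how many times the pager was called */
};

/*
 * Stand-in for vnode_pager_setsize(): an operation that may sleep and
 * therefore must not run while n->lock is held.  (The real pager does
 * its own locking; that is left out of this sketch.)
 */
static void
pager_setsize(struct node *n, uint64_t nsize)
{
	assert(n != NULL);
	n->pager_size = nsize;
	n->pager_calls++;
}

/*
 * Update the cached size under the lock, but only record that the
 * pager needs to be told.  The call itself happens after the unlock,
 * mirroring the setnsize/nsize pair in the patch.
 * Returns 1 if the pager was notified, 0 otherwise.
 */
static int
load_attr(struct node *n, uint64_t server_size)
{
	uint64_t nsize = 0;
	int setnsize = 0;

	pthread_mutex_lock(&n->lock);
	if (server_size != n->size) {
		n->size = server_size;
		setnsize = 1;
		nsize = server_size;	/* snapshot taken while locked */
	}
	pthread_mutex_unlock(&n->lock);

	if (setnsize)
		pager_setsize(n, nsize);	/* lock no longer held here */
	return (setnsize);
}
```

Compared with unlocking and relocking around the call, this keeps the whole attribute update inside a single critical section, so no other thread can observe or modify a half-updated node in the gap.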
Re: Core Dump / panic sleeping thread
Comments in line.

---
On Mar 19, 2013, at 1:45 PM, Michael Landin Hostbaek wrote:

>
> On Mar 19, 2013, at 6:35 PM, Jeremy Chadwick wrote:
>
>> On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek wrote:
>> The kernel panic is happening in NFS-related code. Rick Macklem (and/or
>> John Baldwin) should be able to help with this; I've CC'd both here.
>
> OK, thanks.
>
>
>>
>> You're going to need to provide the following details:
>>
>> 1. Contents of /etc/rc.conf
>
> sshd_enable="YES"
> ntpdate_enable="YES"
> ntpdate_hosts="xx.xx.xx.xx"
> fsck_y_enable="YES"
> named_enable="YES"
> dumpdev="AUTO"
> nfs_client_enable="YES"
> rpc_lockd_enable="YES"
> rpc_statd_enable="YES"
> ifconfig_em0="inet xx.xx.xx.xx netmask 255.255.255.0 broadcast xx.xx.xx.xx"
> defaultrouter="xx.xx.xx.xx"
> hostname=""
> cloned_interfaces="vlan"
> ifconfig_vlan="inet xx.xx.xx.xx netmask 255.240.0.0 broadcast xx.xx.xx.xx vlan vlandev em0"
> apache22_enable="YES"
> pureftpd_enable="YES"
> revealcloud_enable=YES
>
>
>> 2. Contents of /etc/sysctl.conf (if modified)
>
> vm.pmap.shpgperproc=250
>

Small side note. This sysctl is no longer valid. It's had no effect after 7.2 iirc.

>> 3. Contents of /etc/fstab
>
> # Device                        Mountpoint          FStype     Options  Dump  Pass#
> /dev/mirror/gm0s1a              /                   ufs        rw       1     1
> /dev/mirror/gm0s1b              none                swap       sw       0     0
> /dev/mirror/gm0s1d              /var                ufs        rw       2     2
> /dev/mirror/gm0s1e              /logs               ufs        rw       2     2
> /dev/mirror/gm0s1f              /extra              ufs        rw       2     2
> /dev/mirror/gm0s1g              /usr                ufs        rw       2     2
> proc                            /proc               procfs     rw       0     0
> xx.xx.xx.xx:/zpool-000xxx/www   /mnt/www            nfs        rw       0     0
> xx.xx.xx.xx:/zpool-000xxx/data  /mnt/data           nfs        rw,tcp   0     0
> linproc                         /compat/linux/proc  linprocfs  rw       0     0
>
>
>> 4. ifconfig -a
>
> em0: flags=8843 metric 0 mtu 1500
>	options=4219b
>	ether 00:25:90:79:a5:ac
>	inet xx.xx.xx.xx netmask 0xff00 broadcast xx.xx.xx.xx
>	inet6 xx::a5ac%em0 prefixlen 64 scopeid 0x1
>	nd6 options=29
>	media: Ethernet autoselect (1000baseT )
>	status: active
> em1: flags=8c02 metric 0 mtu 1500
>	options=4219b
>	ether 00:25:90:79:a5:ad
>	nd6 options=29
>	media: Ethernet autoselect
>	status: no carrier
> lo0: flags=8049 metric 0 mtu 16384
>	options=63
>	inet6 ::1 prefixlen 128
>	inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb
>	inet 127.0.0.1 netmask 0xff00
>	nd6 options=21
> vlan: flags=8843 metric 0 mtu 1500
>	options=103
>	ether 00:25:90:79:a5:ac
>	inet xx.xx.xx.xx netmask 0xfff0 broadcast xx.xx.xx.xx
>	inet6 x:::5ac%vlan prefixlen 64 scopeid 0xc
>	nd6 options=29
>	media: Ethernet autoselect (1000baseT )
>	status: active
>	vlan: parent interface: em0
>
>
>> 5. OS used by the NFS server, and all configuration details pertaining
>> to that system
>
> This is a hosted service, so I do not have access to this - though I believe
> this is a ZFS fs.
> Here's more info about the product: http://help.ovh.co.uk/Nas
>
>
>>
>> You may also be asked to upgrade to 9.1-STABLE, as there may be fixes
>> for whatever this is in base/stable/9 that are not in -RELEASE, but this
>> is speculative on my part.
>
> That is not a problem. I would simply like to confirm the issue, before
> upgrading.
>
>
> Thanks,
>
> /mich

Mark saad | mark.s...@longcount.org
Re: Core Dump / panic sleeping thread
On Mar 20, 2013, at 10:49 AM, Konstantin Belousov wrote:
>
> I do not like it. As I said in the previous response to Andrey,
> I think that moving the vnode_pager_setsize() after the unlock is
> better, since it reduces races with other thread seeing half-done
> attribute update or making attribute change simultaneously.

OK - so should I wait for another patch - or?

Thanks,
/mich
Re: Core Dump / panic sleeping thread
On Tue, Mar 19, 2013 at 07:37:43PM -0400, Rick Macklem wrote:
> Andriy Gapon wrote:
> > on 19/03/2013 19:35 Jeremy Chadwick said the following:
> > > On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek wrote:
> > [snip]
> > >> Unread portion of the kernel message buffer:
> > >> Sleeping thread (tid 100256, pid 85641) owns a non-sleepable lock
> > >> KDB: stack backtrace of thread 100256:
> > >> #0 0x808f2d46 at mi_switch+0x186
> > >> #1 0x8092bb52 at sleepq_wait+0x42
> > >> #2 0x808f34d6 at _sleep+0x376
> > >> #3 0x80b4f3ae at vm_object_page_remove+0x2ce
> > >> #4 0x80b5ac7d at vnode_pager_setsize+0x17d
> > >> #5 0x8082102c at nfscl_loadattrcache+0x2cc
> > >> #6 0x80818d37 at nfs_getattr+0x287
> > >> #7 0x8098f1c0 at vn_stat+0xb0
> > >> #8 0x809869d9 at kern_statat_vnhook+0xf9
> > >> #9 0x80986b55 at kern_statat+0x15
> > >> #10 0x80986c1a at sys_lstat+0x2a
> > >> #11 0x80bd7ae6 at amd64_syscall+0x546
> > >> #12 0x80bc3447 at Xfast_syscall+0xf7
> > >> panic: sleeping thread
> > >> cpuid = 0
> > >> KDB: stack backtrace:
> > >> #0 0x809208a6 at kdb_backtrace+0x66
> > >> #1 0x808ea8be at panic+0x1ce
> > >> #2 0x8092ed22 at propagate_priority+0x1d2
> > >> #3 0x8092fa4e at turnstile_wait+0x1be
> > >> #4 0x808d8d48 at _mtx_lock_sleep+0xd8
> > >> #5 0x80820fa4 at nfscl_loadattrcache+0x244
> > >> #6 0x8081758c at ncl_readrpc+0xac
> > >> #7 0x80824c45 at ncl_getpages+0x485
> > >> #8 0x80b5aa0c at vnode_pager_getpages+0x9c
> > >> #9 0x80b3fc93 at vm_fault_hold+0x673
> > >> #10 0x80b41cc3 at vm_fault+0x73
> > >> #11 0x80bd84b4 at trap_pfault+0x124
> > >> #12 0x80bd8c6c at trap+0x49c
> > >> #13 0x80bc315f at calltrap+0x8
> > [snip]
> >
> > I think that the regular mutex which is acquired via NFSLOCKNODE() in
> > nfscl_loadattrcache() can not be held across vnode_pager_setsize.
> > I am not sure though when vap->va_size != np->n_size case is
> > triggered.
> >
> Yep, I'd agree to that. The same bug is in the old NFS client and
> the new NFS client cribbed the code from there.
>
> I have attached a simple patch that unlocks the mutex for the
> vnode_pager_setsize() call. Maybe you could test it?
>
> Thanks for reporting this, rick
> ps: Hopefully "patch" can apply this patch (there have been
>     recent changes to this file, so the line#s could be off).
>     It should be easy to do manually if not. The change is
>     in nfscl_loadattrcache() in sys/fs/nfsclient/nfs_clport.c.
>
> > > You're going to need to provide the following details:
> > >
> > > 1. Contents of /etc/rc.conf
> > > 2. Contents of /etc/sysctl.conf (if modified)
> > > 3. Contents of /etc/fstab
> > > 4. ifconfig -a
> > > 5. OS used by the NFS server, and all configuration details pertaining
> > >    to that system
> > >
> > > You may also be asked to upgrade to 9.1-STABLE, as there may be fixes
> > > for whatever this is in base/stable/9 that are not in -RELEASE, but this
> > > is speculative on my part.
> > >
> > I do not see a need for any of these.
> >
> > --
> > Andriy Gapon
>
> --- fs/nfsclient/nfs_clport.c.savit 2013-03-19 18:37:33.0 -0400
> +++ fs/nfsclient/nfs_clport.c 2013-03-19 18:44:21.0 -0400
> @@ -444,7 +444,9 @@ nfscl_loadattrcache(struct vnode **vpp,
>  				np->n_size = vap->va_size;
>  				np->n_flag |= NSIZECHANGED;
>  			}
> +			NFSUNLOCKNODE(np);
>  			vnode_pager_setsize(vp, np->n_size);
> +			NFSLOCKNODE(np);
>  		} else {
>  			np->n_size = vap->va_size;
>  		}

I do not like it. As I said in the previous response to Andrey,
I think that moving the vnode_pager_setsize() after the unlock is
better, since it reduces races with other thread seeing half-done
attribute update or making attribute change simultaneously.
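Editorial sketch: the difference between the two approaches discussed above (drop and retake the node mutex around vnode_pager_setsize(), versus calling it only after the final unlock) can be modelled in plain userland C. This is not kernel code and not the committed fix. Every toy_* name is hypothetical, and the "lock" is a simple flag in a single-threaded model, chosen only so the ordering can be checked with assertions. It shows one way to read the suggestion: record the new size while locked, finish the attribute update, unlock once, then resize.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-threaded model; not the real struct nfsnode. */
struct toy_node {
    bool     locked;      /* true while the stand-in for n_mtx is held */
    uint64_t size;        /* stand-in for np->n_size */
    uint64_t pager_size;  /* size last handed to the pager stand-in */
};

static void toy_lock(struct toy_node *np)
{
    assert(!np->locked);
    np->locked = true;
}

static void toy_unlock(struct toy_node *np)
{
    assert(np->locked);
    np->locked = false;
}

/* Stand-in for vnode_pager_setsize(): it may sleep, so the lock must be free. */
static void toy_pager_setsize(struct toy_node *np, uint64_t nsize)
{
    assert(!np->locked);
    np->pager_size = nsize;
}

/* Stand-in for the size handling in nfscl_loadattrcache(). */
static void toy_loadattr(struct toy_node *np, uint64_t server_size)
{
    bool     setsize = false;
    uint64_t nsize = 0;

    toy_lock(np);
    if (np->size != server_size) {
        np->size = server_size;
        nsize = server_size;   /* remember the value while still locked */
        setsize = true;
    }
    /* Any remaining attribute updates would complete here, under the lock. */
    toy_unlock(np);

    if (setsize)
        toy_pager_setsize(np, nsize);   /* safe: lock already dropped */
}
```

Calling toy_loadattr(&n, 40) on a node whose size is 100 updates both n.size and n.pager_size to 40 with the lock released exactly once. Because the lock is never dropped in the middle of the update, no other thread could observe a half-updated set of attributes, which is the race the objection above is about.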
Re: Core Dump / panic sleeping thread
On Mar 20, 2013, at 12:37 AM, Rick Macklem wrote:

> Yep, I'd agree to that. The same bug is in the old NFS client and
> the new NFS client cribbed the code from there.
>
> I have attached a simple patch that unlocks the mutex for the
> vnode_pager_setsize() call. Maybe you could test it?
>
> Thanks for reporting this, rick
> ps: Hopefully "patch" can apply this patch (there have been
>     recent changes to this file, so the line#s could be off).
>     It should be easy to do manually if not. The change is
>     in nfscl_loadattrcache() in sys/fs/nfsclient/nfs_clport.c.

Thanks Rick, it's compiling right now. I will let you know if the problem persists.

/mich

ps. the patch worked perfectly up against REL9.1
Re: Core Dump / panic sleeping thread
Andriy Gapon wrote:
> on 19/03/2013 19:35 Jeremy Chadwick said the following:
> > On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek wrote:
> [snip]
> >> Unread portion of the kernel message buffer:
> >> Sleeping thread (tid 100256, pid 85641) owns a non-sleepable lock
> >> KDB: stack backtrace of thread 100256:
> >> #0 0x808f2d46 at mi_switch+0x186
> >> #1 0x8092bb52 at sleepq_wait+0x42
> >> #2 0x808f34d6 at _sleep+0x376
> >> #3 0x80b4f3ae at vm_object_page_remove+0x2ce
> >> #4 0x80b5ac7d at vnode_pager_setsize+0x17d
> >> #5 0x8082102c at nfscl_loadattrcache+0x2cc
> >> #6 0x80818d37 at nfs_getattr+0x287
> >> #7 0x8098f1c0 at vn_stat+0xb0
> >> #8 0x809869d9 at kern_statat_vnhook+0xf9
> >> #9 0x80986b55 at kern_statat+0x15
> >> #10 0x80986c1a at sys_lstat+0x2a
> >> #11 0x80bd7ae6 at amd64_syscall+0x546
> >> #12 0x80bc3447 at Xfast_syscall+0xf7
> >> panic: sleeping thread
> >> cpuid = 0
> >> KDB: stack backtrace:
> >> #0 0x809208a6 at kdb_backtrace+0x66
> >> #1 0x808ea8be at panic+0x1ce
> >> #2 0x8092ed22 at propagate_priority+0x1d2
> >> #3 0x8092fa4e at turnstile_wait+0x1be
> >> #4 0x808d8d48 at _mtx_lock_sleep+0xd8
> >> #5 0x80820fa4 at nfscl_loadattrcache+0x244
> >> #6 0x8081758c at ncl_readrpc+0xac
> >> #7 0x80824c45 at ncl_getpages+0x485
> >> #8 0x80b5aa0c at vnode_pager_getpages+0x9c
> >> #9 0x80b3fc93 at vm_fault_hold+0x673
> >> #10 0x80b41cc3 at vm_fault+0x73
> >> #11 0x80bd84b4 at trap_pfault+0x124
> >> #12 0x80bd8c6c at trap+0x49c
> >> #13 0x80bc315f at calltrap+0x8
> [snip]
>
> I think that the regular mutex which is acquired via NFSLOCKNODE() in
> nfscl_loadattrcache() can not be held across vnode_pager_setsize.
> I am not sure though when vap->va_size != np->n_size case is
> triggered.
>
Yep, I'd agree to that. The same bug is in the old NFS client and
the new NFS client cribbed the code from there.

I have attached a simple patch that unlocks the mutex for the
vnode_pager_setsize() call. Maybe you could test it?

Thanks for reporting this, rick
ps: Hopefully "patch" can apply this patch (there have been
    recent changes to this file, so the line#s could be off).
    It should be easy to do manually if not. The change is
    in nfscl_loadattrcache() in sys/fs/nfsclient/nfs_clport.c.

> > You're going to need to provide the following details:
> >
> > 1. Contents of /etc/rc.conf
> > 2. Contents of /etc/sysctl.conf (if modified)
> > 3. Contents of /etc/fstab
> > 4. ifconfig -a
> > 5. OS used by the NFS server, and all configuration details pertaining
> >    to that system
> >
> > You may also be asked to upgrade to 9.1-STABLE, as there may be fixes
> > for whatever this is in base/stable/9 that are not in -RELEASE, but this
> > is speculative on my part.
> >
> I do not see a need for any of these.
>
> --
> Andriy Gapon

--- fs/nfsclient/nfs_clport.c.savit 2013-03-19 18:37:33.0 -0400
+++ fs/nfsclient/nfs_clport.c 2013-03-19 18:44:21.0 -0400
@@ -444,7 +444,9 @@ nfscl_loadattrcache(struct vnode **vpp,
 				np->n_size = vap->va_size;
 				np->n_flag |= NSIZECHANGED;
 			}
+			NFSUNLOCKNODE(np);
 			vnode_pager_setsize(vp, np->n_size);
+			NFSLOCKNODE(np);
 		} else {
 			np->n_size = vap->va_size;
 		}
Re: Core Dump / panic sleeping thread
On Tue, Mar 19, 2013 at 07:45:56PM +0200, Andriy Gapon wrote:
> on 19/03/2013 19:35 Jeremy Chadwick said the following:
> > On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek wrote:
> [snip]
> >> Unread portion of the kernel message buffer:
> >> Sleeping thread (tid 100256, pid 85641) owns a non-sleepable lock
> >> KDB: stack backtrace of thread 100256:
> >> #0 0x808f2d46 at mi_switch+0x186
> >> #1 0x8092bb52 at sleepq_wait+0x42
> >> #2 0x808f34d6 at _sleep+0x376
> >> #3 0x80b4f3ae at vm_object_page_remove+0x2ce
> >> #4 0x80b5ac7d at vnode_pager_setsize+0x17d
> >> #5 0x8082102c at nfscl_loadattrcache+0x2cc
> >> #6 0x80818d37 at nfs_getattr+0x287
> >> #7 0x8098f1c0 at vn_stat+0xb0
> >> #8 0x809869d9 at kern_statat_vnhook+0xf9
> >> #9 0x80986b55 at kern_statat+0x15
> >> #10 0x80986c1a at sys_lstat+0x2a
> >> #11 0x80bd7ae6 at amd64_syscall+0x546
> >> #12 0x80bc3447 at Xfast_syscall+0xf7
> >> panic: sleeping thread
> >> cpuid = 0
> >> KDB: stack backtrace:
> >> #0 0x809208a6 at kdb_backtrace+0x66
> >> #1 0x808ea8be at panic+0x1ce
> >> #2 0x8092ed22 at propagate_priority+0x1d2
> >> #3 0x8092fa4e at turnstile_wait+0x1be
> >> #4 0x808d8d48 at _mtx_lock_sleep+0xd8
> >> #5 0x80820fa4 at nfscl_loadattrcache+0x244
> >> #6 0x8081758c at ncl_readrpc+0xac
> >> #7 0x80824c45 at ncl_getpages+0x485
> >> #8 0x80b5aa0c at vnode_pager_getpages+0x9c
> >> #9 0x80b3fc93 at vm_fault_hold+0x673
> >> #10 0x80b41cc3 at vm_fault+0x73
> >> #11 0x80bd84b4 at trap_pfault+0x124
> >> #12 0x80bd8c6c at trap+0x49c
> >> #13 0x80bc315f at calltrap+0x8
> [snip]
>
> I think that the regular mutex which is acquired via NFSLOCKNODE() in
> nfscl_loadattrcache() can not be held across vnode_pager_setsize.
> I am not sure though when vap->va_size != np->n_size case is triggered.

When the file is modified on the server outside of the control of
the client ? E.g., by direct access on the server, or from the other
client.

The only possible solution is to move the vnode_pager_setsize() outside
the scope of the n_mtx. This is somewhat problematic because the nfsiod
threads never bother to lock the vnode, so the truncation of the vm
cache becomes racy. Still, this is probably the best cure.

Another issue I see there is that vnode_pager_setsize() call is only
performed for the VREG nodes. I believe that it is possible to cache
the pages for the directories as well.

Would you work out the patch ?
Re: Core Dump / panic sleeping thread
on 19/03/2013 19:35 Jeremy Chadwick said the following:
> On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek wrote:
[snip]
>> Unread portion of the kernel message buffer:
>> Sleeping thread (tid 100256, pid 85641) owns a non-sleepable lock
>> KDB: stack backtrace of thread 100256:
>> #0 0x808f2d46 at mi_switch+0x186
>> #1 0x8092bb52 at sleepq_wait+0x42
>> #2 0x808f34d6 at _sleep+0x376
>> #3 0x80b4f3ae at vm_object_page_remove+0x2ce
>> #4 0x80b5ac7d at vnode_pager_setsize+0x17d
>> #5 0x8082102c at nfscl_loadattrcache+0x2cc
>> #6 0x80818d37 at nfs_getattr+0x287
>> #7 0x8098f1c0 at vn_stat+0xb0
>> #8 0x809869d9 at kern_statat_vnhook+0xf9
>> #9 0x80986b55 at kern_statat+0x15
>> #10 0x80986c1a at sys_lstat+0x2a
>> #11 0x80bd7ae6 at amd64_syscall+0x546
>> #12 0x80bc3447 at Xfast_syscall+0xf7
>> panic: sleeping thread
>> cpuid = 0
>> KDB: stack backtrace:
>> #0 0x809208a6 at kdb_backtrace+0x66
>> #1 0x808ea8be at panic+0x1ce
>> #2 0x8092ed22 at propagate_priority+0x1d2
>> #3 0x8092fa4e at turnstile_wait+0x1be
>> #4 0x808d8d48 at _mtx_lock_sleep+0xd8
>> #5 0x80820fa4 at nfscl_loadattrcache+0x244
>> #6 0x8081758c at ncl_readrpc+0xac
>> #7 0x80824c45 at ncl_getpages+0x485
>> #8 0x80b5aa0c at vnode_pager_getpages+0x9c
>> #9 0x80b3fc93 at vm_fault_hold+0x673
>> #10 0x80b41cc3 at vm_fault+0x73
>> #11 0x80bd84b4 at trap_pfault+0x124
>> #12 0x80bd8c6c at trap+0x49c
>> #13 0x80bc315f at calltrap+0x8
[snip]

I think that the regular mutex which is acquired via NFSLOCKNODE() in
nfscl_loadattrcache() can not be held across vnode_pager_setsize.
I am not sure though when vap->va_size != np->n_size case is triggered.

> You're going to need to provide the following details:
>
> 1. Contents of /etc/rc.conf
> 2. Contents of /etc/sysctl.conf (if modified)
> 3. Contents of /etc/fstab
> 4. ifconfig -a
> 5. OS used by the NFS server, and all configuration details pertaining
>    to that system
>
> You may also be asked to upgrade to 9.1-STABLE, as there may be fixes
> for whatever this is in base/stable/9 that are not in -RELEASE, but this
> is speculative on my part.
>
I do not see a need for any of these.

--
Andriy Gapon
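Editorial sketch: the diagnosis above rests on one rule, namely that a thread holding a regular (non-sleepable) mutex must not call anything that can sleep. In the backtrace, one thread sleeps inside vm_object_page_remove() while still owning the node mutex, and a second thread that blocks on that mutex triggers "panic: sleeping thread". The rule can be modelled in a few lines of userland C. This is only an illustration under stated assumptions: the toy_* names are hypothetical and are not FreeBSD kernel interfaces, and the model merely counts held locks rather than blocking anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names are FreeBSD kernel APIs. */
struct toy_thread {
    int nonsleepable_held;   /* number of regular mutexes currently owned */
};

struct toy_mtx {
    struct toy_thread *owner;   /* NULL when the mutex is free */
};

static void toy_mtx_lock(struct toy_mtx *m, struct toy_thread *td)
{
    assert(m->owner == NULL);
    m->owner = td;
    td->nonsleepable_held++;
}

static void toy_mtx_unlock(struct toy_mtx *m, struct toy_thread *td)
{
    assert(m->owner == td);
    m->owner = NULL;
    td->nonsleepable_held--;
}

/* True when td may safely call a function that can sleep. */
static bool toy_can_sleep(const struct toy_thread *td)
{
    return td->nonsleepable_held == 0;
}
```

In this model, toy_can_sleep() is false between toy_mtx_lock() and toy_mtx_unlock(), which corresponds to the window in which the reported thread called vnode_pager_setsize().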
Re: Core Dump / panic sleeping thread
On Mar 19, 2013, at 6:35 PM, Jeremy Chadwick wrote:

> On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek wrote:
> The kernel panic is happening in NFS-related code. Rick Macklem (and/or
> John Baldwin) should be able to help with this; I've CC'd both here.

OK, thanks.

>
> You're going to need to provide the following details:
>
> 1. Contents of /etc/rc.conf

sshd_enable="YES"
ntpdate_enable="YES"
ntpdate_hosts="xx.xx.xx.xx"
fsck_y_enable="YES"
named_enable="YES"
dumpdev="AUTO"
nfs_client_enable="YES"
rpc_lockd_enable="YES"
rpc_statd_enable="YES"
ifconfig_em0="inet xx.xx.xx.xx netmask 255.255.255.0 broadcast xx.xx.xx.xx"
defaultrouter="xx.xx.xx.xx"
hostname=""
cloned_interfaces="vlan"
ifconfig_vlan="inet xx.xx.xx.xx netmask 255.240.0.0 broadcast xx.xx.xx.xx vlan vlandev em0"
apache22_enable="YES"
pureftpd_enable="YES"
revealcloud_enable=YES

> 2. Contents of /etc/sysctl.conf (if modified)

vm.pmap.shpgperproc=250

> 3. Contents of /etc/fstab

# Device                        Mountpoint          FStype     Options  Dump  Pass#
/dev/mirror/gm0s1a              /                   ufs        rw       1     1
/dev/mirror/gm0s1b              none                swap       sw       0     0
/dev/mirror/gm0s1d              /var                ufs        rw       2     2
/dev/mirror/gm0s1e              /logs               ufs        rw       2     2
/dev/mirror/gm0s1f              /extra              ufs        rw       2     2
/dev/mirror/gm0s1g              /usr                ufs        rw       2     2
proc                            /proc               procfs     rw       0     0
xx.xx.xx.xx:/zpool-000xxx/www   /mnt/www            nfs        rw       0     0
xx.xx.xx.xx:/zpool-000xxx/data  /mnt/data           nfs        rw,tcp   0     0
linproc                         /compat/linux/proc  linprocfs  rw       0     0

> 4. ifconfig -a

em0: flags=8843 metric 0 mtu 1500
	options=4219b
	ether 00:25:90:79:a5:ac
	inet xx.xx.xx.xx netmask 0xff00 broadcast xx.xx.xx.xx
	inet6 xx::a5ac%em0 prefixlen 64 scopeid 0x1
	nd6 options=29
	media: Ethernet autoselect (1000baseT )
	status: active
em1: flags=8c02 metric 0 mtu 1500
	options=4219b
	ether 00:25:90:79:a5:ad
	nd6 options=29
	media: Ethernet autoselect
	status: no carrier
lo0: flags=8049 metric 0 mtu 16384
	options=63
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb
	inet 127.0.0.1 netmask 0xff00
	nd6 options=21
vlan: flags=8843 metric 0 mtu 1500
	options=103
	ether 00:25:90:79:a5:ac
	inet xx.xx.xx.xx netmask 0xfff0 broadcast xx.xx.xx.xx
	inet6 x:::5ac%vlan prefixlen 64 scopeid 0xc
	nd6 options=29
	media: Ethernet autoselect (1000baseT )
	status: active
	vlan: parent interface: em0

> 5. OS used by the NFS server, and all configuration details pertaining
>    to that system

This is a hosted service, so I do not have access to this - though I believe this is a ZFS fs.
Here's more info about the product: http://help.ovh.co.uk/Nas

>
> You may also be asked to upgrade to 9.1-STABLE, as there may be fixes
> for whatever this is in base/stable/9 that are not in -RELEASE, but this
> is speculative on my part.

That is not a problem. I would simply like to confirm the issue, before upgrading.

Thanks,

/mich
Core Dump / panic sleeping thread
Hi,

I am running a FreeBSD 9.1-REL system with GENERIC kernel:

FreeBSD x 9.1-RELEASE FreeBSD 9.1-RELEASE #0: Fri Jan 4 12:28:48 CET 2013 root@x:/usr/obj/usr/src/sys/GENERIC amd64

It is crashing a couple of times per week, without any real pattern. There are no hints in the syslog, and I only have the core debug to work from...

It is a webserver, using a NFS mounted docroot (if it might help) - here's the backtrace:

This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
Sleeping thread (tid 100256, pid 85641) owns a non-sleepable lock
KDB: stack backtrace of thread 100256:
#0 0x808f2d46 at mi_switch+0x186
#1 0x8092bb52 at sleepq_wait+0x42
#2 0x808f34d6 at _sleep+0x376
#3 0x80b4f3ae at vm_object_page_remove+0x2ce
#4 0x80b5ac7d at vnode_pager_setsize+0x17d
#5 0x8082102c at nfscl_loadattrcache+0x2cc
#6 0x80818d37 at nfs_getattr+0x287
#7 0x8098f1c0 at vn_stat+0xb0
#8 0x809869d9 at kern_statat_vnhook+0xf9
#9 0x80986b55 at kern_statat+0x15
#10 0x80986c1a at sys_lstat+0x2a
#11 0x80bd7ae6 at amd64_syscall+0x546
#12 0x80bc3447 at Xfast_syscall+0xf7
panic: sleeping thread
cpuid = 0
KDB: stack backtrace:
#0 0x809208a6 at kdb_backtrace+0x66
#1 0x808ea8be at panic+0x1ce
#2 0x8092ed22 at propagate_priority+0x1d2
#3 0x8092fa4e at turnstile_wait+0x1be
#4 0x808d8d48 at _mtx_lock_sleep+0xd8
#5 0x80820fa4 at nfscl_loadattrcache+0x244
#6 0x8081758c at ncl_readrpc+0xac
#7 0x80824c45 at ncl_getpages+0x485
#8 0x80b5aa0c at vnode_pager_getpages+0x9c
#9 0x80b3fc93 at vm_fault_hold+0x673
#10 0x80b41cc3 at vm_fault+0x73
#11 0x80bd84b4 at trap_pfault+0x124
#12 0x80bd8c6c at trap+0x49c
#13 0x80bc315f at calltrap+0x8
Uptime: 8d0h54m10s
Dumping 2381 out of 24547 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/geom_mirror.ko...Reading symbols from /boot/kernel/geom_mirror.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/geom_mirror.ko
Reading symbols from /boot/kernel/geom_stripe.ko...Reading symbols from /boot/kernel/geom_stripe.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/geom_stripe.ko
Reading symbols from /boot/kernel/if_em.ko...Reading symbols from /boot/kernel/if_em.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/if_em.ko
Reading symbols from /boot/kernel/linprocfs.ko...Reading symbols from /boot/kernel/linprocfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/linprocfs.ko
Reading symbols from /boot/kernel/linux.ko...Reading symbols from /boot/kernel/linux.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/linux.ko
#0 doadump (textdump=Variable "textdump" is not available.
) at pcpu.h:224
224 pcpu.h: No such file or directory.
 in pcpu.h
(kgdb) bt
#0 doadump (textdump=Variable "textdump" is not available.
) at pcpu.h:224
#1 0x808ea3a1 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:448
#2 0x808ea897 in panic (fmt=0x1 ) at /usr/src/sys/kern/kern_shutdown.c:636
#3 0x8092ed22 in propagate_priority (td=Variable "td" is not available.
) at /usr/src/sys/kern/subr_turnstile.c:227
#4 0x8092fa4e in turnstile_wait (ts=Variable "ts" is not available.
) at /usr/src/sys/kern/subr_turnstile.c:743
#5 0x808d8d48 in _mtx_lock_sleep (m=0xfe044a3c8238, tid=18446741888664231936, opts=Variable "opts" is not available.
) at /usr/src/sys/kern/kern_mutex.c:471
#6 0x80820fa4 in nfscl_loadattrcache (vpp=Variable "vpp" is not available.
) at /usr/src/sys/fs/nfsclient/nfs_clport.c:379
#7 0x8081758c in ncl_readrpc (vp=0xfe044a6cd780, uiop=0xff86962fc650, cred=Variable "cred" is not available.
) at /usr/src/sys/fs/nfsclient/nfs_clvnops.c:1369
#8 0x80824c45 in ncl_getpages (ap=0xff86962fc6f0) at /usr/src/sys/fs/nfsclient/nfs_clbio.c:171
#9 0x80b5aa0c in vnode_pager_getpages (object=0xfe016aa16570, m=0xff86962fc770, count=Variable "count" is not available.
) at vnode_if.h:1154
#10 0x80b3fc93 in vm_fault_hold (map=0xfe007f7e3188, vaddr=34366988288, fault_type=1 '\001', fault_flags=Variable "fault_flags" is not available.
) at vm_pager.h:128
#11 0x80b41cc3 in vm_fault (map=0xfe007f7e3188, vaddr=34366988288, fault_type=Variable "fault_type" is not available.
) at /usr/src/sys/vm/vm_fault.c:229
#12 0x80bd84b4 in trap_pfault (frame=0xff86962fcc40, usermode=1) at /usr/src/sys/amd64/amd64/trap.c:740
#13 0x80bd8c6c in trap (frame=0xff86962fcc40) at /usr/src/sys/amd64/amd64/trap.c:358
#14 0x80bc315f in calltrap () at /usr/src/sys/amd64/amd64/exception.S:228
#15 0x000802091386 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

Dump header from device /dev/mirror/