Re: Core Dump / panic sleeping thread

2013-03-22 Thread Michael Landin Hostbaek

On Mar 21, 2013, at 9:39 PM, Konstantin Belousov kostik...@gmail.com wrote:

 
 You should use the r248567 + r248581.

OK  thanks. I've upgraded to 9-STABLE and applied your patches.

Will let you know if I experience further crashes.

thanks, 

/mich

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Core Dump / panic sleeping thread

2013-03-21 Thread Konstantin Belousov
On Wed, Mar 20, 2013 at 09:14:37PM -0400, Rick Macklem wrote:
 Well, read/write sharing of files over NFS is pretty rare, so I suspect
 a truncation of a file by another client (or locally in the NFS server)
 is a rare event. As such, not invalidating the buffers here doesn't seem
 like a big issue? (The client uses np->n_size to determine EOF.)
 
 Also, I think close-to-open consistency will typically throw away the
 buffers on the next open when it sees the mtime changed. (Yes, there
 won't necessarily be another open, but...)
nfs buffers are VMIO. Each VMIO buffer wires the pages it references.
Wired pages cannot be freed by vnode_pager_setsize() if the file is
truncated.




Re: Core Dump / panic sleeping thread

2013-03-21 Thread Michael Landin Hostbaek

On Mar 21, 2013, at 8:58 AM, Konstantin Belousov kostik...@gmail.com wrote:

 On Wed, Mar 20, 2013 at 09:14:37PM -0400, Rick Macklem wrote:
 Well, read/write sharing of files over NFS is pretty rare, so I suspect
 a truncation of a file by another client (or locally in the NFS server)
 is a rare event. As such, not invalidating the buffers here doesn't seem
 like a big issue? (The client uses np->n_size to determine EOF.)
 
 Also, I think close-to-open consistency will typically throw away the
 buffers on the next open when it sees the mtime changed. (Yes, there
 won't necessarily be another open, but...)
 nfs buffers are VMIO. Each VMIO buffer wires the pages it references.
 Wired pages cannot be freed by vnode_pager_setsize() if the file is
 truncated.

Should I wait for a new patch, or should I give the one you sent yesterday a 
try? 

Thanks, 

/mich



Re: Core Dump / panic sleeping thread

2013-03-21 Thread Konstantin Belousov
On Thu, Mar 21, 2013 at 07:59:25PM +0100, Michael Landin Hostbaek wrote:
 
 On Mar 21, 2013, at 8:58 AM, Konstantin Belousov kostik...@gmail.com wrote:
 
  On Wed, Mar 20, 2013 at 09:14:37PM -0400, Rick Macklem wrote:
  Well, read/write sharing of files over NFS is pretty rare, so I suspect
  a truncation of a file by another client (or locally in the NFS server)
  is a rare event. As such, not invalidating the buffers here doesn't seem
  like a big issue? (The client uses np->n_size to determine EOF.)
  
  Also, I think close-to-open consistency will typically throw away the
  buffers on the next open when it sees the mtime changed. (Yes, there
  won't necessarily be another open, but...)
  nfs buffers are VMIO. Each VMIO buffer wires the pages it references.
  Wired pages cannot be freed by vnode_pager_setsize() if the file is
  truncated.
 
 Should I wait for a new patch, or should I give the one you sent yesterday a 
 try? 
 
 Thanks, 

You should use the r248567 + r248581.




Re: Core Dump / panic sleeping thread

2013-03-20 Thread Michael Landin Hostbaek

On Mar 20, 2013, at 12:37 AM, Rick Macklem rmack...@uoguelph.ca wrote:

 
 Yep, I'd agree to that. The same bug is in the old NFS client and
 the new NFS client cribbed the code from there.
 
 I have attached a simple patch that unlocks the mutex for the
 vnode_pager_setsize() call. Maybe you could test it?
 
 Thanks for reporting this, rick
 ps: Hopefully patch can apply this patch (there have been
recent changes to this file, so the line#s could be off).
It should be easy to do manually if not. The change is
in nfscl_loadattrcache() in sys/fs/nfsclient/nfs_clport.c.

Thanks Rick, it's compiling right now.
I will let you know if the problem persists. 

/mich

ps. the patch worked perfectly up against REL9.1



Re: Core Dump / panic sleeping thread

2013-03-20 Thread Konstantin Belousov
On Tue, Mar 19, 2013 at 07:37:43PM -0400, Rick Macklem wrote:
 Andriy Gapon wrote:
  on 19/03/2013 19:35 Jeremy Chadwick said the following:
   On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek
   wrote:
  [snip]
   Unread portion of the kernel message buffer:
   Sleeping thread (tid 100256, pid 85641) owns a non-sleepable lock
   KDB: stack backtrace of thread 100256:
   #0 0x808f2d46 at mi_switch+0x186
   #1 0x8092bb52 at sleepq_wait+0x42
   #2 0x808f34d6 at _sleep+0x376
   #3 0x80b4f3ae at vm_object_page_remove+0x2ce
   #4 0x80b5ac7d at vnode_pager_setsize+0x17d
   #5 0x8082102c at nfscl_loadattrcache+0x2cc
   #6 0x80818d37 at nfs_getattr+0x287
   #7 0x8098f1c0 at vn_stat+0xb0
   #8 0x809869d9 at kern_statat_vnhook+0xf9
   #9 0x80986b55 at kern_statat+0x15
   #10 0x80986c1a at sys_lstat+0x2a
   #11 0x80bd7ae6 at amd64_syscall+0x546
   #12 0x80bc3447 at Xfast_syscall+0xf7
   panic: sleeping thread
   cpuid = 0
   KDB: stack backtrace:
   #0 0x809208a6 at kdb_backtrace+0x66
   #1 0x808ea8be at panic+0x1ce
   #2 0x8092ed22 at propagate_priority+0x1d2
   #3 0x8092fa4e at turnstile_wait+0x1be
   #4 0x808d8d48 at _mtx_lock_sleep+0xd8
   #5 0x80820fa4 at nfscl_loadattrcache+0x244
   #6 0x8081758c at ncl_readrpc+0xac
   #7 0x80824c45 at ncl_getpages+0x485
   #8 0x80b5aa0c at vnode_pager_getpages+0x9c
   #9 0x80b3fc93 at vm_fault_hold+0x673
   #10 0x80b41cc3 at vm_fault+0x73
   #11 0x80bd84b4 at trap_pfault+0x124
   #12 0x80bd8c6c at trap+0x49c
   #13 0x80bc315f at calltrap+0x8
  [snip]
  
  I think that the regular mutex which is acquired via NFSLOCKNODE() in
  nfscl_loadattrcache() can not be held across vnode_pager_setsize.
  I am not sure though when the vap->va_size != np->n_size case is
  triggered.
  
 Yep, I'd agree to that. The same bug is in the old NFS client and
 the new NFS client cribbed the code from there.
 
 I have attached a simple patch that unlocks the mutex for the
 vnode_pager_setsize() call. Maybe you could test it?
 
 Thanks for reporting this, rick
 ps: Hopefully patch can apply this patch (there have been
 recent changes to this file, so the line#s could be off).
 It should be easy to do manually if not. The change is
 in nfscl_loadattrcache() in sys/fs/nfsclient/nfs_clport.c.
 
 
   You're going to need to provide the following details:
  
   1. Contents of /etc/rc.conf
   2. Contents of /etc/sysctl.conf (if modified)
   3. Contents of /etc/fstab
   4. ifconfig -a
   5. OS used by the NFS server, and all configuration details
   pertaining
   to that system
  
   You may also be asked to upgrade to 9.1-STABLE, as there may be
   fixes
   for whatever this is in base/stable/9 that are not in -RELEASE, but
   this
   is speculative on my part.
  
  I do not see a need for any of these.
  
  --
  Andriy Gapon

 --- fs/nfsclient/nfs_clport.c.savit   2013-03-19 18:37:33.0 -0400
 +++ fs/nfsclient/nfs_clport.c 2013-03-19 18:44:21.0 -0400
 @@ -444,7 +444,9 @@ nfscl_loadattrcache(struct vnode **vpp, 
   np->n_size = vap->va_size;
   np->n_flag |= NSIZECHANGED;
   }
 + NFSUNLOCKNODE(np);
   vnode_pager_setsize(vp, np->n_size);
 + NFSLOCKNODE(np);
   } else {
   np->n_size = vap->va_size;
   }

I do not like it. As I said in the previous response to Andrey,
I think that moving the vnode_pager_setsize() call after the unlock is
better, since it reduces races with other threads seeing a half-done
attribute update or making an attribute change simultaneously.




Re: Core Dump / panic sleeping thread

2013-03-20 Thread Michael Landin Hostbaek

On Mar 20, 2013, at 10:49 AM, Konstantin Belousov kostik...@gmail.com wrote:
 
 I do not like it. As I said in the previous response to Andrey,
 I think that moving the vnode_pager_setsize() after the unlock is
 better, since it reduces races with other thread seeing half-done
 attribute update or making attribute change simultaneously.

OK - so should I wait for another patch - or? 

Thanks,


/mich



Re: Core Dump / panic sleeping thread

2013-03-20 Thread Mark Saad

Comments inline.
---


On Mar 19, 2013, at 1:45 PM, Michael Landin Hostbaek m...@freebsd.org wrote:

 
 On Mar 19, 2013, at 6:35 PM, Jeremy Chadwick j...@koitsu.org wrote:
 
 On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek wrote:
 The kernel panic is happening in NFS-related code.  Rick Macklem (and/or
 John Baldwin) should be able to help with this; I've CC'd both here.
 
 OK, thanks. 
 
 
 
 You're going to need to provide the following details:
 
 1. Contents of /etc/rc.conf
 
 sshd_enable=YES
 ntpdate_enable=YES
 ntpdate_hosts=xx.xx.xx.xx
 fsck_y_enable=YES
 named_enable=YES
 dumpdev=AUTO
 nfs_client_enable=YES
 rpc_lockd_enable=YES
 rpc_statd_enable=YES
 ifconfig_em0=inet xx.xx.xx.xx netmask 255.255.255.0 broadcast xx.xx.xx.xx
 defaultrouter=xx.xx.xx.xx
 hostname=
 cloned_interfaces=vlan
 ifconfig_vlan=inet xx.xx.xx.xx netmask 255.240.0.0 broadcast xx.xx.xx.xx 
 vlan  vlandev em0
 apache22_enable=YES
 pureftpd_enable=YES
 revealcloud_enable=YES
 
 
 2. Contents of /etc/sysctl.conf (if modified)
 
 vm.pmap.shpgperproc=250
 

Small side note: this sysctl is no longer valid. It has had no effect since
7.2, iirc.



 3. Contents of /etc/fstab
 
 # Device                        Mountpoint          FStype      Options  Dump  Pass#
 /dev/mirror/gm0s1a              /                   ufs         rw       1     1
 /dev/mirror/gm0s1b              none                swap        sw       0     0
 /dev/mirror/gm0s1d              /var                ufs         rw       2     2
 /dev/mirror/gm0s1e              /logs               ufs         rw       2     2
 /dev/mirror/gm0s1f              /extra              ufs         rw       2     2
 /dev/mirror/gm0s1g              /usr                ufs         rw       2     2
 proc                            /proc               procfs      rw       0     0
 xx.xx.xx.xx:/zpool-000xxx/www   /mnt/www            nfs         rw       0     0
 xx.xx.xx.xx:/zpool-000xxx/data  /mnt/data           nfs         rw,tcp   0     0
 linproc                         /compat/linux/proc  linprocfs   rw       0     0
 
 
 4. ifconfig -a
 
 em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
         options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
         ether 00:25:90:79:a5:ac
         inet xx.xx.xx.xx netmask 0xff00 broadcast xx.xx.xx.xx
         inet6 xx::a5ac%em0 prefixlen 64 scopeid 0x1
         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
         media: Ethernet autoselect (1000baseT full-duplex)
         status: active
 em1: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
         options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
         ether 00:25:90:79:a5:ad
         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
         media: Ethernet autoselect
         status: no carrier
 lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
         options=63<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
         inet6 ::1 prefixlen 128
         inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb
         inet 127.0.0.1 netmask 0xff00
         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
 vlan: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
         options=103<RXCSUM,TXCSUM,TSO4>
         ether 00:25:90:79:a5:ac
         inet xx.xx.xx.xx netmask 0xfff0 broadcast xx.xx.xx.xx
         inet6 x:::5ac%vlan prefixlen 64 scopeid 0xc
         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
         media: Ethernet autoselect (1000baseT full-duplex)
         status: active
         vlan:  parent interface: em0
 
 
 5. OS used by the NFS server, and all configuration details pertaining
 to that system
 
 This is a hosted service, so I do not have access to this - though I believe 
 this is a ZFS fs.
 Here's more info about the product: http://help.ovh.co.uk/Nas
 
 
 
 You may also be asked to upgrade to 9.1-STABLE, as there may be fixes
 for whatever this is in base/stable/9 that are not in -RELEASE, but this
 is speculative on my part.
 
 That is not a problem. I would simply like to confirm the issue, before 
 upgrading. 
 
 
 Thanks, 
 
 /mich
 
 


Mark saad | mark.s...@longcount.org



Re: Core Dump / panic sleeping thread

2013-03-20 Thread Konstantin Belousov
On Wed, Mar 20, 2013 at 12:13:05PM +0100, Michael Landin Hostbaek wrote:
 
 On Mar 20, 2013, at 10:49 AM, Konstantin Belousov kostik...@gmail.com wrote:
  
  I do not like it. As I said in the previous response to Andrey,
  I think that moving the vnode_pager_setsize() after the unlock is
  better, since it reduces races with other thread seeing half-done
  attribute update or making attribute change simultaneously.
 
 OK - so should I wait for another patch - or? 

I think the following is what I mean. As an additional note, why does the
NFS client not trim the buffers when the server reports a node size change?

diff --git a/sys/fs/nfsclient/nfs_clport.c b/sys/fs/nfsclient/nfs_clport.c
index a07a67f..4fe2e35 100644
--- a/sys/fs/nfsclient/nfs_clport.c
+++ b/sys/fs/nfsclient/nfs_clport.c
@@ -361,6 +361,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
 	struct nfsnode *np;
 	struct nfsmount *nmp;
 	struct timespec mtime_save;
+	u_quad_t nsize;
+	int setnsize;
 
 	/*
 	 * If v_type == VNON it is a new node, so fill in the v_type,
@@ -418,6 +420,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
 	} else
 		vap->va_fsid = vp->v_mount->mnt_stat.f_fsid.val[0];
 	np->n_attrstamp = time_second;
+	setnsize = 0;
 	if (vap->va_size != np->n_size) {
 		if (vap->va_type == VREG) {
 			if (dontshrink && vap->va_size < np->n_size) {
@@ -444,10 +447,13 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
 				np->n_size = vap->va_size;
 				np->n_flag |= NSIZECHANGED;
 			}
-			vnode_pager_setsize(vp, np->n_size);
 		} else {
 			np->n_size = vap->va_size;
 		}
+		if (vap->va_type == VREG || vap->va_type == VDIR) {
+			setnsize = 1;
+			nsize = vap->va_size;
+		}
 	}
 	/*
 	 * The following checks are added to prevent a race between (say)
@@ -480,6 +486,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
 	KDTRACE_NFS_ATTRCACHE_LOAD_DONE(vp, vap, 0);
 #endif
 	NFSUNLOCKNODE(np);
+	if (setnsize)
+		vnode_pager_setsize(vp, nsize);
 	return (0);
 }
 




Re: Core Dump / panic sleeping thread

2013-03-20 Thread Rick Macklem
Konstantin Belousov wrote:
 On Wed, Mar 20, 2013 at 12:13:05PM +0100, Michael Landin Hostbaek
 wrote:
 
  On Mar 20, 2013, at 10:49 AM, Konstantin Belousov
  kostik...@gmail.com wrote:
  
   I do not like it. As I said in the previous response to Andrey,
   I think that moving the vnode_pager_setsize() after the unlock is
   better, since it reduces races with other thread seeing half-done
   attribute update or making attribute change simultaneously.
 
  OK - so should I wait for another patch - or?
 
 I think the following is what I mean. As an additional note, why nfs
 client does not trim the buffers when server reported node size change
 ?
 
 diff --git a/sys/fs/nfsclient/nfs_clport.c
 b/sys/fs/nfsclient/nfs_clport.c
 index a07a67f..4fe2e35 100644
 --- a/sys/fs/nfsclient/nfs_clport.c
 +++ b/sys/fs/nfsclient/nfs_clport.c
 @@ -361,6 +361,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct
 nfsvattr *nap, void *nvaper,
 struct nfsnode *np;
 struct nfsmount *nmp;
 struct timespec mtime_save;
 + u_quad_t nsize;
 + int setnsize;
 
 /*
 * If v_type == VNON it is a new node, so fill in the v_type,
 @@ -418,6 +420,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct
 nfsvattr *nap, void *nvaper,
 } else
 vap->va_fsid = vp->v_mount->mnt_stat.f_fsid.val[0];
 np->n_attrstamp = time_second;
 + setnsize = 0;
 if (vap->va_size != np->n_size) {
 if (vap->va_type == VREG) {
 if (dontshrink && vap->va_size < np->n_size) {
 @@ -444,10 +447,13 @@ nfscl_loadattrcache(struct vnode **vpp, struct
 nfsvattr *nap, void *nvaper,
 np->n_size = vap->va_size;
 np->n_flag |= NSIZECHANGED;
 }
 - vnode_pager_setsize(vp, np->n_size);
 } else {
 np->n_size = vap->va_size;
 }
 + if (vap->va_type == VREG || vap->va_type == VDIR) {
 + setnsize = 1;
 + nsize = vap->va_size;
I might have used np->n_size here, since that is what is given
as the argument for the pre-patched version, but since
np->n_size should equal vap->va_size (it is set the same for
all cases in the code at this point), it doesn't really matter.

I have no idea what the implications of doing vnode_pager_setsize()
for VDIR are, but Kostik would be much more conversant than I on this,
so if he thinks it's ok, that's fine with me.

 + }
 }
 /*
 * The following checks are added to prevent a race between (say)
 @@ -480,6 +486,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct
 nfsvattr *nap, void *nvaper,
 KDTRACE_NFS_ATTRCACHE_LOAD_DONE(vp, vap, 0);
 #endif
 NFSUNLOCKNODE(np);
 + if (setnsize)
 + vnode_pager_setsize(vp, nsize);
 return (0);
 }
Yes, I think Kostik's version of the patch is good. I had thought
of doing it that way, but went for the minimal change version.
I agree that avoiding unlocking/relocking the mutex is a good idea,
although I didn't see anything after the relock that I thought
might be an issue if something changed while unlocked.

Kostik, thanks for posting this version, rick
ps: Michael, I'd suggest you try this patch instead of mine.


Re: Core Dump / panic sleeping thread

2013-03-20 Thread Konstantin Belousov
On Wed, Mar 20, 2013 at 11:37:56AM -0400, Rick Macklem wrote:
 Konstantin Belousov wrote:
  On Wed, Mar 20, 2013 at 12:13:05PM +0100, Michael Landin Hostbaek
  wrote:
  
   On Mar 20, 2013, at 10:49 AM, Konstantin Belousov
   kostik...@gmail.com wrote:
   
I do not like it. As I said in the previous response to Andrey,
I think that moving the vnode_pager_setsize() after the unlock is
better, since it reduces races with other thread seeing half-done
attribute update or making attribute change simultaneously.
  
   OK - so should I wait for another patch - or?
  
  I think the following is what I mean. As an additional note, why nfs
  client does not trim the buffers when server reported node size change
  ?
  
  diff --git a/sys/fs/nfsclient/nfs_clport.c
  b/sys/fs/nfsclient/nfs_clport.c
  index a07a67f..4fe2e35 100644
  --- a/sys/fs/nfsclient/nfs_clport.c
  +++ b/sys/fs/nfsclient/nfs_clport.c
  @@ -361,6 +361,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct
  nfsvattr *nap, void *nvaper,
  struct nfsnode *np;
  struct nfsmount *nmp;
  struct timespec mtime_save;
  + u_quad_t nsize;
  + int setnsize;
  
  /*
  * If v_type == VNON it is a new node, so fill in the v_type,
  @@ -418,6 +420,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct
  nfsvattr *nap, void *nvaper,
  } else
  vap->va_fsid = vp->v_mount->mnt_stat.f_fsid.val[0];
  np->n_attrstamp = time_second;
  + setnsize = 0;
  if (vap->va_size != np->n_size) {
  if (vap->va_type == VREG) {
  if (dontshrink && vap->va_size < np->n_size) {
  @@ -444,10 +447,13 @@ nfscl_loadattrcache(struct vnode **vpp, struct
  nfsvattr *nap, void *nvaper,
  np->n_size = vap->va_size;
  np->n_flag |= NSIZECHANGED;
  }
  - vnode_pager_setsize(vp, np->n_size);
  } else {
  np->n_size = vap->va_size;
  }
  + if (vap->va_type == VREG || vap->va_type == VDIR) {
  + setnsize = 1;
  + nsize = vap->va_size;
 I might have used np->n_size here, since that is what is given
 as the argument for the pre-patched version, but since
 np->n_size should equal vap->va_size (it is set the same for
 all cases in the code at this point), it doesn't really matter.
 
 I have no idea what the implications of doing vnode_pager_setsize()
 for VDIR are, but Kostik would be much more conversant than I on this,
 so if he thinks it's ok, that's fine with me.
 
  + }
  }
  /*
  * The following checks are added to prevent a race between (say)
  @@ -480,6 +486,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct
  nfsvattr *nap, void *nvaper,
  KDTRACE_NFS_ATTRCACHE_LOAD_DONE(vp, vap, 0);
  #endif
  NFSUNLOCKNODE(np);
  + if (setnsize)
  + vnode_pager_setsize(vp, nsize);
  return (0);
  }
 Yes, I think Kostik's version of the patch is good. I had thought
 of doing it that way, but went for the minimal change version.
 I agree that avoiding unlocking/relocking the mutex is a good idea,
 although I didn't see anything after the relock that I thought
 might be an issue if something changed while unlocked.
If parallel calls to nfscl_loadattrcache() are possible, then
IMHO at least n_attrstamp could be cleared needlessly.

 
 Kostik, thanks for posting this version, rick
 ps: Michael, I'd suggest you try this patch instead of mine.
Still, my patch has the issue I noted for head as well: the buffers
are not destroyed if the size of the vnode is decreased. I would be
inclined to suggest the following change on top of my patch, but I am
fairly sure that it does not work, since the vnode is generally not
locked in nfscl_loadattrcache():

diff --git a/sys/fs/nfsclient/nfs_clport.c b/sys/fs/nfsclient/nfs_clport.c
index 4fe2e35..3a08424 100644
--- a/sys/fs/nfsclient/nfs_clport.c
+++ b/sys/fs/nfsclient/nfs_clport.c
@@ -487,7 +487,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct nfsvattr *nap, void *nvaper,
 #endif
 	NFSUNLOCKNODE(np);
 	if (setnsize)
-		vnode_pager_setsize(vp, nsize);
+		vtruncbuf(vp, NOCRED, nsize, vp->v_bufobj.bo_bsize);
 	return (0);
 }
 




Re: Core Dump / panic sleeping thread

2013-03-20 Thread John Baldwin
On Wednesday, March 20, 2013 9:22:22 am Konstantin Belousov wrote:
 On Wed, Mar 20, 2013 at 12:13:05PM +0100, Michael Landin Hostbaek wrote:
  
  On Mar 20, 2013, at 10:49 AM, Konstantin Belousov kostik...@gmail.com 
wrote:
   
   I do not like it. As I said in the previous response to Andrey,
   I think that moving the vnode_pager_setsize() after the unlock is
   better, since it reduces races with other thread seeing half-done
   attribute update or making attribute change simultaneously.
  
  OK - so should I wait for another patch - or? 
 
 I think the following is what I mean. As an additional note, why nfs
 client does not trim the buffers when server reported node size change ?

Will changing the size always result in an mtime change forcing the client to
throw away the data on the next read or fault anyway (or does it only affect
ctime)?

-- 
John Baldwin


Re: Core Dump / panic sleeping thread

2013-03-20 Thread Konstantin Belousov
On Wed, Mar 20, 2013 at 09:43:20AM -0400, John Baldwin wrote:
 On Wednesday, March 20, 2013 9:22:22 am Konstantin Belousov wrote:
  On Wed, Mar 20, 2013 at 12:13:05PM +0100, Michael Landin Hostbaek wrote:
   
   On Mar 20, 2013, at 10:49 AM, Konstantin Belousov kostik...@gmail.com 
 wrote:

I do not like it. As I said in the previous response to Andrey,
I think that moving the vnode_pager_setsize() after the unlock is
better, since it reduces races with other thread seeing half-done
attribute update or making attribute change simultaneously.
   
   OK - so should I wait for another patch - or? 
  
  I think the following is what I mean. As an additional note, why nfs
  client does not trim the buffers when server reported node size change ?
 
 Will changing the size always result in an mtime change forcing the client to
 throw away the data on the next read or fault anyway (or does it only affect
 ctime)?

UFS only modifies ctime on truncation, it seems.




Re: Core Dump / panic sleeping thread

2013-03-20 Thread Konstantin Belousov
On Wed, Mar 20, 2013 at 08:58:08PM +0200, Konstantin Belousov wrote:
 On Wed, Mar 20, 2013 at 09:43:20AM -0400, John Baldwin wrote:
  On Wednesday, March 20, 2013 9:22:22 am Konstantin Belousov wrote:
   On Wed, Mar 20, 2013 at 12:13:05PM +0100, Michael Landin Hostbaek wrote:

On Mar 20, 2013, at 10:49 AM, Konstantin Belousov kostik...@gmail.com 
  wrote:
 
 I do not like it. As I said in the previous response to Andrey,
 I think that moving the vnode_pager_setsize() after the unlock is
 better, since it reduces races with other thread seeing half-done
 attribute update or making attribute change simultaneously.

OK - so should I wait for another patch - or? 
   
   I think the following is what I mean. As an additional note, why nfs
   client does not trim the buffers when server reported node size change ?
  
  Will changing the size always result in an mtime change forcing the client 
  to
  throw away the data on the next read or fault anyway (or does it only affect
  ctime)?
 
 UFS only modifies ctime on truncation, it seems.

No, I was wrong. ffs_truncate() indeed sets both the IN_CHANGE and
IN_UPDATE flags for the inode, and IN_UPDATE causes an mtime update in
ufs_itimes(), called from UFS_UPDATE().
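
A simplified model of that flag-to-timestamp flow, as a sketch only: the flag names come from the message above, but the numeric values here are illustrative, and `struct sketch_inode`, `sketch_truncate()` and `sketch_itimes()` are hypothetical stand-ins rather than the UFS code. It shows why a truncation that sets both flags ends up changing mtime as well as ctime once the timestamps are flushed.

```c
#include <assert.h>
#include <time.h>

/* Flag names as used in the thread; the values are illustrative only. */
#define IN_CHANGE	0x0002	/* inode changed: ctime needs updating */
#define IN_UPDATE	0x0004	/* file data modified: mtime needs updating */

/* Hypothetical, heavily reduced inode: pending flags and two timestamps. */
struct sketch_inode {
	int flags;
	time_t mtime;
	time_t ctime;
};

/* Model of truncation: mark both timestamps as needing an update. */
static void
sketch_truncate(struct sketch_inode *ip)
{
	ip->flags |= IN_CHANGE | IN_UPDATE;
}

/* Model of the timestamp flush: apply pending flags, then clear them. */
static void
sketch_itimes(struct sketch_inode *ip, time_t now)
{
	if (ip->flags & IN_UPDATE)
		ip->mtime = now;
	if (ip->flags & IN_CHANGE)
		ip->ctime = now;
	ip->flags &= ~(IN_CHANGE | IN_UPDATE);
}
```

In this model a client that compares mtime before and after the truncation would see it change, which is what the mtime-based cache invalidation discussed in this thread relies on.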





Re: Core Dump / panic sleeping thread

2013-03-20 Thread Rick Macklem
Konstantin Belousov wrote:
 On Wed, Mar 20, 2013 at 11:37:56AM -0400, Rick Macklem wrote:
  Konstantin Belousov wrote:
   On Wed, Mar 20, 2013 at 12:13:05PM +0100, Michael Landin Hostbaek
   wrote:
   
On Mar 20, 2013, at 10:49 AM, Konstantin Belousov
kostik...@gmail.com wrote:

 I do not like it. As I said in the previous response to
 Andrey,
 I think that moving the vnode_pager_setsize() after the unlock
 is
 better, since it reduces races with other thread seeing
 half-done
 attribute update or making attribute change simultaneously.
   
OK - so should I wait for another patch - or?
  
   I think the following is what I mean. As an additional note, why
   nfs
   client does not trim the buffers when server reported node size
   change
   ?
  
   diff --git a/sys/fs/nfsclient/nfs_clport.c
   b/sys/fs/nfsclient/nfs_clport.c
   index a07a67f..4fe2e35 100644
   --- a/sys/fs/nfsclient/nfs_clport.c
   +++ b/sys/fs/nfsclient/nfs_clport.c
   @@ -361,6 +361,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct
   nfsvattr *nap, void *nvaper,
   struct nfsnode *np;
   struct nfsmount *nmp;
   struct timespec mtime_save;
   + u_quad_t nsize;
   + int setnsize;
  
   /*
   * If v_type == VNON it is a new node, so fill in the v_type,
   @@ -418,6 +420,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct
   nfsvattr *nap, void *nvaper,
   } else
   vap->va_fsid = vp->v_mount->mnt_stat.f_fsid.val[0];
   np->n_attrstamp = time_second;
   + setnsize = 0;
   if (vap->va_size != np->n_size) {
   if (vap->va_type == VREG) {
   if (dontshrink && vap->va_size < np->n_size) {
   @@ -444,10 +447,13 @@ nfscl_loadattrcache(struct vnode **vpp,
   struct
   nfsvattr *nap, void *nvaper,
   np->n_size = vap->va_size;
   np->n_flag |= NSIZECHANGED;
   }
   - vnode_pager_setsize(vp, np->n_size);
   } else {
   np->n_size = vap->va_size;
   }
   + if (vap->va_type == VREG || vap->va_type == VDIR) {
   + setnsize = 1;
   + nsize = vap->va_size;
  I might have used np->n_size here, since that is what is given
  as the argument for the pre-patched version, but since
  np->n_size should equal vap->va_size (it is set the same for
  all cases in the code at this point), it doesn't really matter.
 
  I have no idea what the implications of doing vnode_pager_setsize()
  for VDIR are, but Kostik would be much more conversant than I on
  this,
  so if he thinks it's ok, that's fine with me.
 
   + }
   }
   /*
   * The following checks are added to prevent a race between (say)
   @@ -480,6 +486,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct
   nfsvattr *nap, void *nvaper,
   KDTRACE_NFS_ATTRCACHE_LOAD_DONE(vp, vap, 0);
   #endif
   NFSUNLOCKNODE(np);
   + if (setnsize)
   + vnode_pager_setsize(vp, nsize);
   return (0);
   }
  Yes, I think Kostik's version of the patch is good. I had thought
  of doing it that way, but went for the minimal change version.
  I agree that avoiding unlocking/relocking the mutex is a good idea,
  although I didn't see anything after the relock that I thought
  might be an issue if something changed while unlocked.
 If the parallel calls to nfscl_loadattrcache() are possible, then
 IMHO at least the n_attrstamp could be cleared needlessly.
 
And that could result in an attribute cache miss, since setting
n_attrstamp = 0 invalidates the cached attributes.
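
A minimal sketch of why a zeroed timestamp forces a miss, assuming a cache that treats 0 as "invalid" and otherwise compares the entry's age against a timeout. `struct sketch_attrcache` and `sketch_attrs_valid()` are hypothetical; the real client has more fields and more conditions.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical attribute cache entry; the real nfsnode has many more fields. */
struct sketch_attrcache {
	time_t attrstamp;	/* when the attributes were loaded; 0 = invalid */
	time_t timeout;		/* how long a loaded entry stays usable */
};

/* Return 1 if the cached attributes may be used at time 'now', else 0. */
static int
sketch_attrs_valid(const struct sketch_attrcache *ac, time_t now)
{
	if (ac->attrstamp == 0)
		return (0);	/* explicitly invalidated: always a miss */
	return ((now - ac->attrstamp) < ac->timeout);
}
```

With this model, clearing the timestamp while another thread has just loaded fresh attributes costs one extra round trip to the server but does not return wrong data.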

I agree that your patch is preferred. I'll admit I was mostly lazy
(and a little afraid of moving the vnode_pager_setsize()) when I
did the patch.

 
  Kostik, thanks for posting this version, rick
  ps: Michael, I'd suggest you try this patch instead of mine.
 Still, my patch has the issue I noted for the head as well: the
 buffers
 are not destroyed if the size of the vnode is decreased. I would be
 inclined to suggest the following change on top of my patch, but I am
 sure that it does not work, since vnode is generally not locked in
 the nfs_loadattrcache(), I think:
 
Well, read/write sharing of files over NFS is pretty rare, so I suspect
a truncation of a file by another client (or locally in the NFS server)
is a rare event. As such, not invalidating the buffers here doesn't seem
like a big issue? (The client uses np->n_size to determine EOF.)

Also, I think close-to-open consistency will typically throw away the
buffers on the next open when it sees the mtime changed. (Yes, there
won't necessarily be another open, but...)
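
Close-to-open consistency as described here can be sketched roughly as follows. This is an assumption-laden model, not the client's code: `struct sketch_file` and `sketch_open()` are hypothetical, and the real client checks more than mtime. On open, the modification time reported by the server is compared with the cached one, and cached buffers are dropped when they differ.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical per-file client state for the sketch. */
struct sketch_file {
	time_t cached_mtime;	/* mtime seen when the buffers were filled */
	int buffers_valid;	/* 1 while cached data may be served */
};

/*
 * Model of the open-time check: if the server's mtime differs from the
 * cached one, discard the cached buffers. Returns 1 if they were dropped.
 */
static int
sketch_open(struct sketch_file *f, time_t server_mtime)
{
	if (f->cached_mtime == server_mtime)
		return (0);
	f->buffers_valid = 0;
	f->cached_mtime = server_mtime;
	return (1);
}
```

As noted above, this only helps if another open actually happens; a process that keeps the file open never runs the check in this model.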

As you point out, I don't think the vnode will be locked when
nfscl_loadattrcache() is called for reads/writes being done by the
nfsiod threads, and there will only be a shared vnode lock for other reads.

I think your patch without the below addition is just fine, rick

 diff --git a/sys/fs/nfsclient/nfs_clport.c
 b/sys/fs/nfsclient/nfs_clport.c
 index 4fe2e35..3a08424 100644
 --- a/sys/fs/nfsclient/nfs_clport.c
 +++ b/sys/fs/nfsclient/nfs_clport.c
 @@ -487,7 +487,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct
 nfsvattr *nap, void *nvaper,
 #endif
 NFSUNLOCKNODE(np);
 if (setnsize)
 - vnode_pager_setsize(vp, nsize);
 + 

Re: Core Dump / panic sleeping thread

2013-03-20 Thread Rick Macklem
Konstantin Belousov wrote:
 On Wed, Mar 20, 2013 at 11:37:56AM -0400, Rick Macklem wrote:
  Konstantin Belousov wrote:
   On Wed, Mar 20, 2013 at 12:13:05PM +0100, Michael Landin Hostbaek
   wrote:
   
On Mar 20, 2013, at 10:49 AM, Konstantin Belousov
kostik...@gmail.com wrote:

 I do not like it. As I said in the previous response to
 Andrey,
 I think that moving the vnode_pager_setsize() after the unlock
 is
 better, since it reduces races with other thread seeing
 half-done
 attribute update or making attribute change simultaneously.
   
OK - so should I wait for another patch - or?
  
   I think the following is what I mean. As an additional note, why
   nfs
   client does not trim the buffers when server reported node size
   change
   ?
  
   diff --git a/sys/fs/nfsclient/nfs_clport.c
   b/sys/fs/nfsclient/nfs_clport.c
   index a07a67f..4fe2e35 100644
   --- a/sys/fs/nfsclient/nfs_clport.c
   +++ b/sys/fs/nfsclient/nfs_clport.c
   @@ -361,6 +361,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct
   nfsvattr *nap, void *nvaper,
   struct nfsnode *np;
   struct nfsmount *nmp;
   struct timespec mtime_save;
   + u_quad_t nsize;
   + int setnsize;
  
   /*
   * If v_type == VNON it is a new node, so fill in the v_type,
   @@ -418,6 +420,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct
   nfsvattr *nap, void *nvaper,
   } else
   vap->va_fsid = vp->v_mount->mnt_stat.f_fsid.val[0];
   np->n_attrstamp = time_second;
   + setnsize = 0;
   if (vap->va_size != np->n_size) {
   if (vap->va_type == VREG) {
   if (dontshrink && vap->va_size < np->n_size) {
   @@ -444,10 +447,13 @@ nfscl_loadattrcache(struct vnode **vpp,
   struct
   nfsvattr *nap, void *nvaper,
   np->n_size = vap->va_size;
   np->n_flag |= NSIZECHANGED;
   }
   - vnode_pager_setsize(vp, np->n_size);
   } else {
   np->n_size = vap->va_size;
   }
   + if (vap->va_type == VREG || vap->va_type == VDIR) {
   + setnsize = 1;
   + nsize = vap->va_size;
  I might have used np->n_size here, since that is what is given
  as the argument in the pre-patched version, but since
  np->n_size should equal vap->va_size (it is set the same for
  all cases in the code at this point), it doesn't really matter.
 
  I have no idea what the implications of doing vnode_pager_setsize()
  for VDIR are, but Kostik would be much more conversant than I on
  this, so if he thinks it's ok, that's fine with me.
 
   + }
   }
   /*
   * The following checks are added to prevent a race between (say)
   @@ -480,6 +486,8 @@ nfscl_loadattrcache(struct vnode **vpp, struct
   nfsvattr *nap, void *nvaper,
   KDTRACE_NFS_ATTRCACHE_LOAD_DONE(vp, vap, 0);
   #endif
   NFSUNLOCKNODE(np);
   + if (setnsize)
   + vnode_pager_setsize(vp, nsize);
   return (0);
   }
  Yes, I think Kostik's version of the patch is good. I had thought
  of doing it that way, but went for the minimal change version.
  I agree that avoiding unlocking/relocking the mutex is a good idea,
  although I didn't see anything after the relock that I thought
  might be an issue if something changed while unlocked.
 If the parallel calls to nfscl_loadattrcache() are possible, then
 IMHO at least the n_attrstamp could be cleared needlessly.
 
 
  Kostik, thanks for posting this version, rick
  ps: Michael, I'd suggest you try this patch instead of mine.
 Still, my patch has the issue I noted for the head as well: the
 buffers are not destroyed if the size of the vnode is decreased. I
 would be inclined to suggest the following change on top of my patch,
 but I am sure that it does not work, since vnode is generally not
 locked in the nfs_loadattrcache(), I think:
 
Oh, and I think jhb@ was mentioning, if this client is only reading
the file, it will invalidate the buffers when it sees the mtime
change on a subsequent read.

rick
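
For readers following along, the mtime-based invalidation mentioned
above can be sketched roughly as follows. This is a user-space
illustration only, not the NFS client code: struct cached_file,
mtime_changed() and open_revalidate() are hypothetical names made up
for the example, and the real client keeps this state in the nfsnode.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical client-side cache state for one file. */
struct cached_file {
	struct timespec	mtime;	/* mtime seen when the cache was filled */
	size_t		nbufs;	/* number of cached buffers */
};

/* Two timestamps differ if either the seconds or nanoseconds differ. */
static bool
mtime_changed(const struct timespec *a, const struct timespec *b)
{
	return (a->tv_sec != b->tv_sec || a->tv_nsec != b->tv_nsec);
}

/*
 * Called at open time with the mtime just fetched from the server.
 * Returns true if the cached buffers were discarded.
 */
static bool
open_revalidate(struct cached_file *cf, const struct timespec *server_mtime)
{
	if (!mtime_changed(&cf->mtime, server_mtime))
		return (false);
	cf->nbufs = 0;			/* throw away the stale buffers */
	cf->mtime = *server_mtime;	/* remember the new mtime */
	return (true);
}
```

The point of the sketch is only that stale buffers survive until the
next revalidation, which is why a truncation on the server is not seen
immediately by a client that keeps the file open.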

 diff --git a/sys/fs/nfsclient/nfs_clport.c
 b/sys/fs/nfsclient/nfs_clport.c
 index 4fe2e35..3a08424 100644
 --- a/sys/fs/nfsclient/nfs_clport.c
 +++ b/sys/fs/nfsclient/nfs_clport.c
 @@ -487,7 +487,7 @@ nfscl_loadattrcache(struct vnode **vpp, struct
 nfsvattr *nap, void *nvaper,
 #endif
 NFSUNLOCKNODE(np);
 if (setnsize)
 - vnode_pager_setsize(vp, nsize);
 + vtruncbuf(vp, NOCRED, nsize, vp->v_bufobj.bo_bsize);
 return (0);
 }


Re: Core Dump / panic sleeping thread

2013-03-19 Thread Jeremy Chadwick
On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek wrote:
 Hi, 
 
 I am running a FreeBSD 9.1-REL system with GENERIC kernel:
 FreeBSD x 9.1-RELEASE FreeBSD 9.1-RELEASE #0: Fri Jan  4 12:28:48 CET 
 2013 root@x:/usr/obj/usr/src/sys/GENERIC  amd64
 
 
 It is crashing a couple of times per week, without any real pattern. There 
 are no hints in the syslog, and I only have the core debug to work from...  
 
 It is a webserver, using a NFS mounted docroot (if it might help) - here's 
 the backtrace:
 
 snip
 This GDB was configured as amd64-marcel-freebsd...
 
 Unread portion of the kernel message buffer:
 Sleeping thread (tid 100256, pid 85641) owns a non-sleepable lock
 KDB: stack backtrace of thread 100256:
 #0 0x808f2d46 at mi_switch+0x186
 #1 0x8092bb52 at sleepq_wait+0x42
 #2 0x808f34d6 at _sleep+0x376
 #3 0x80b4f3ae at vm_object_page_remove+0x2ce
 #4 0x80b5ac7d at vnode_pager_setsize+0x17d
 #5 0x8082102c at nfscl_loadattrcache+0x2cc
 #6 0x80818d37 at nfs_getattr+0x287
 #7 0x8098f1c0 at vn_stat+0xb0
 #8 0x809869d9 at kern_statat_vnhook+0xf9
 #9 0x80986b55 at kern_statat+0x15
 #10 0x80986c1a at sys_lstat+0x2a
 #11 0x80bd7ae6 at amd64_syscall+0x546
 #12 0x80bc3447 at Xfast_syscall+0xf7
 panic: sleeping thread
 cpuid = 0
 KDB: stack backtrace:
 #0 0x809208a6 at kdb_backtrace+0x66
 #1 0x808ea8be at panic+0x1ce
 #2 0x8092ed22 at propagate_priority+0x1d2
 #3 0x8092fa4e at turnstile_wait+0x1be
 #4 0x808d8d48 at _mtx_lock_sleep+0xd8
 #5 0x80820fa4 at nfscl_loadattrcache+0x244
 #6 0x8081758c at ncl_readrpc+0xac
 #7 0x80824c45 at ncl_getpages+0x485
 #8 0x80b5aa0c at vnode_pager_getpages+0x9c
 #9 0x80b3fc93 at vm_fault_hold+0x673
 #10 0x80b41cc3 at vm_fault+0x73
 #11 0x80bd84b4 at trap_pfault+0x124
 #12 0x80bd8c6c at trap+0x49c
 #13 0x80bc315f at calltrap+0x8
 Uptime: 8d0h54m10s
 Dumping 2381 out of 24547 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
 
 Reading symbols from /boot/kernel/geom_mirror.ko...Reading symbols from 
 /boot/kernel/geom_mirror.ko.symbols...done.
 done.
 Loaded symbols for /boot/kernel/geom_mirror.ko
 Reading symbols from /boot/kernel/geom_stripe.ko...Reading symbols from 
 /boot/kernel/geom_stripe.ko.symbols...done.
 done.
 Loaded symbols for /boot/kernel/geom_stripe.ko
 Reading symbols from /boot/kernel/if_em.ko...Reading symbols from 
 /boot/kernel/if_em.ko.symbols...done.
 done.
 Loaded symbols for /boot/kernel/if_em.ko
 Reading symbols from /boot/kernel/linprocfs.ko...Reading symbols from 
 /boot/kernel/linprocfs.ko.symbols...done.
 done.
 Loaded symbols for /boot/kernel/linprocfs.ko
 Reading symbols from /boot/kernel/linux.ko...Reading symbols from 
 /boot/kernel/linux.ko.symbols...done.
 done.
 Loaded symbols for /boot/kernel/linux.ko
 #0  doadump (textdump=Variable textdump is not available.
 ) at pcpu.h:224
 224   pcpu.h: No such file or directory.
   in pcpu.h
 (kgdb) bt
 #0  doadump (textdump=Variable textdump is not available.
 ) at pcpu.h:224
 #1  0x808ea3a1 in kern_reboot (howto=260) at 
 /usr/src/sys/kern/kern_shutdown.c:448
 #2  0x808ea897 in panic (fmt=0x1 Address 0x1 out of bounds) at 
 /usr/src/sys/kern/kern_shutdown.c:636
 #3  0x8092ed22 in propagate_priority (td=Variable td is not 
 available.
 ) at /usr/src/sys/kern/subr_turnstile.c:227
 #4  0x8092fa4e in turnstile_wait (ts=Variable ts is not available.
 ) at /usr/src/sys/kern/subr_turnstile.c:743
 #5  0x808d8d48 in _mtx_lock_sleep (m=0xfe044a3c8238, 
 tid=18446741888664231936, opts=Variable opts is not available.
 )
 at /usr/src/sys/kern/kern_mutex.c:471
 #6  0x80820fa4 in nfscl_loadattrcache (vpp=Variable vpp is not 
 available.
 ) at /usr/src/sys/fs/nfsclient/nfs_clport.c:379
 #7  0x8081758c in ncl_readrpc (vp=0xfe044a6cd780, 
 uiop=0xff86962fc650, cred=Variable cred is not available.
 )
 at /usr/src/sys/fs/nfsclient/nfs_clvnops.c:1369
 #8  0x80824c45 in ncl_getpages (ap=0xff86962fc6f0) at 
 /usr/src/sys/fs/nfsclient/nfs_clbio.c:171
 #9  0x80b5aa0c in vnode_pager_getpages (object=0xfe016aa16570, 
 m=0xff86962fc770, count=Variable count is not available.
 )
 at vnode_if.h:1154
 #10 0x80b3fc93 in vm_fault_hold (map=0xfe007f7e3188, 
 vaddr=34366988288, fault_type=1 '\001', fault_flags=Variable fault_flags is 
 not available.
 )
 at vm_pager.h:128
 #11 0x80b41cc3 in vm_fault (map=0xfe007f7e3188, 
 vaddr=34366988288, fault_type=Variable fault_type is not available.
 )
 at /usr/src/sys/vm/vm_fault.c:229
 #12 0x80bd84b4 in trap_pfault (frame=0xff86962fcc40, usermode=1) 
 at /usr/src/sys/amd64/amd64/trap.c:740
 #13 0x80bd8c6c in trap (frame=0xff86962fcc40) at 
 /usr/src/sys/amd64/amd64/trap.c:358
 #14 

Re: Core Dump / panic sleeping thread

2013-03-19 Thread Michael Landin Hostbaek

On Mar 19, 2013, at 6:35 PM, Jeremy Chadwick j...@koitsu.org wrote:

 On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek wrote:
 The kernel panic is happening in NFS-related code.  Rick Macklem (and/or
 John Baldwin) should be able to help with this; I've CC'd both here.

OK, thanks. 


 
 You're going to need to provide the following details:
 
 1. Contents of /etc/rc.conf

sshd_enable=YES
ntpdate_enable=YES
ntpdate_hosts=xx.xx.xx.xx
fsck_y_enable=YES
named_enable=YES
dumpdev=AUTO
nfs_client_enable=YES
rpc_lockd_enable=YES
rpc_statd_enable=YES
ifconfig_em0=inet xx.xx.xx.xx netmask 255.255.255.0 broadcast xx.xx.xx.xx
defaultrouter=xx.xx.xx.xx
hostname=
cloned_interfaces=vlan
ifconfig_vlan=inet xx.xx.xx.xx netmask 255.240.0.0 broadcast xx.xx.xx.xx 
vlan  vlandev em0
apache22_enable=YES
pureftpd_enable=YES
revealcloud_enable=YES


 2. Contents of /etc/sysctl.conf (if modified)

vm.pmap.shpgperproc=250

 3. Contents of /etc/fstab

# Device                        Mountpoint          FStype     Options  Dump  Pass#
/dev/mirror/gm0s1a              /                   ufs        rw       1     1
/dev/mirror/gm0s1b              none                swap       sw       0     0
/dev/mirror/gm0s1d              /var                ufs        rw       2     2
/dev/mirror/gm0s1e              /logs               ufs        rw       2     2
/dev/mirror/gm0s1f              /extra              ufs        rw       2     2
/dev/mirror/gm0s1g              /usr                ufs        rw       2     2
proc                            /proc               procfs     rw       0     0
xx.xx.xx.xx:/zpool-000xxx/www   /mnt/www            nfs        rw       0     0
xx.xx.xx.xx:/zpool-000xxx/data  /mnt/data           nfs        rw,tcp   0     0
linproc                         /compat/linux/proc  linprocfs  rw       0     0


 4. ifconfig -a

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
	ether 00:25:90:79:a5:ac
	inet xx.xx.xx.xx netmask 0xff00 broadcast xx.xx.xx.xx
	inet6 xx::a5ac%em0 prefixlen 64 scopeid 0x1
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
em1: flags=8c02<BROADCAST,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
	ether 00:25:90:79:a5:ad
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect
	status: no carrier
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	options=63<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0xb
	inet 127.0.0.1 netmask 0xff00
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
vlan: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=103<RXCSUM,TXCSUM,TSO4>
	ether 00:25:90:79:a5:ac
	inet xx.xx.xx.xx netmask 0xfff0 broadcast xx.xx.xx.xx
	inet6 x:::5ac%vlan prefixlen 64 scopeid 0xc
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
	vlan:  parent interface: em0


 5. OS used by the NFS server, and all configuration details pertaining
 to that system

This is a hosted service, so I do not have access to this - though I believe 
this is a ZFS fs.
Here's more info about the product: http://help.ovh.co.uk/Nas


 
 You may also be asked to upgrade to 9.1-STABLE, as there may be fixes
 for whatever this is in base/stable/9 that are not in -RELEASE, but this
 is speculative on my part.

That is not a problem. I would simply like to confirm the issue, before 
upgrading. 


Thanks, 

/mich




Re: Core Dump / panic sleeping thread

2013-03-19 Thread Andriy Gapon
on 19/03/2013 19:35 Jeremy Chadwick said the following:
 On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek wrote:
[snip]
 Unread portion of the kernel message buffer:
 Sleeping thread (tid 100256, pid 85641) owns a non-sleepable lock
 KDB: stack backtrace of thread 100256:
 #0 0x808f2d46 at mi_switch+0x186
 #1 0x8092bb52 at sleepq_wait+0x42
 #2 0x808f34d6 at _sleep+0x376
 #3 0x80b4f3ae at vm_object_page_remove+0x2ce
 #4 0x80b5ac7d at vnode_pager_setsize+0x17d
 #5 0x8082102c at nfscl_loadattrcache+0x2cc
 #6 0x80818d37 at nfs_getattr+0x287
 #7 0x8098f1c0 at vn_stat+0xb0
 #8 0x809869d9 at kern_statat_vnhook+0xf9
 #9 0x80986b55 at kern_statat+0x15
 #10 0x80986c1a at sys_lstat+0x2a
 #11 0x80bd7ae6 at amd64_syscall+0x546
 #12 0x80bc3447 at Xfast_syscall+0xf7
 panic: sleeping thread
 cpuid = 0
 KDB: stack backtrace:
 #0 0x809208a6 at kdb_backtrace+0x66
 #1 0x808ea8be at panic+0x1ce
 #2 0x8092ed22 at propagate_priority+0x1d2
 #3 0x8092fa4e at turnstile_wait+0x1be
 #4 0x808d8d48 at _mtx_lock_sleep+0xd8
 #5 0x80820fa4 at nfscl_loadattrcache+0x244
 #6 0x8081758c at ncl_readrpc+0xac
 #7 0x80824c45 at ncl_getpages+0x485
 #8 0x80b5aa0c at vnode_pager_getpages+0x9c
 #9 0x80b3fc93 at vm_fault_hold+0x673
 #10 0x80b41cc3 at vm_fault+0x73
 #11 0x80bd84b4 at trap_pfault+0x124
 #12 0x80bd8c6c at trap+0x49c
 #13 0x80bc315f at calltrap+0x8
[snip]

I think that the regular mutex which is acquired via NFSLOCKNODE() in
nfscl_loadattrcache() cannot be held across vnode_pager_setsize().
I am not sure, though, when the vap->va_size != np->n_size case is triggered.

 You're going to need to provide the following details:
 
 1. Contents of /etc/rc.conf
 2. Contents of /etc/sysctl.conf (if modified)
 3. Contents of /etc/fstab
 4. ifconfig -a
 5. OS used by the NFS server, and all configuration details pertaining
 to that system
 
 You may also be asked to upgrade to 9.1-STABLE, as there may be fixes
 for whatever this is in base/stable/9 that are not in -RELEASE, but this
 is speculative on my part.
 
I do not see a need for any of these.

-- 
Andriy Gapon


Re: Core Dump / panic sleeping thread

2013-03-19 Thread Konstantin Belousov
On Tue, Mar 19, 2013 at 07:45:56PM +0200, Andriy Gapon wrote:
 on 19/03/2013 19:35 Jeremy Chadwick said the following:
  On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek wrote:
 [snip]
  Unread portion of the kernel message buffer:
  Sleeping thread (tid 100256, pid 85641) owns a non-sleepable lock
  KDB: stack backtrace of thread 100256:
  #0 0x808f2d46 at mi_switch+0x186
  #1 0x8092bb52 at sleepq_wait+0x42
  #2 0x808f34d6 at _sleep+0x376
  #3 0x80b4f3ae at vm_object_page_remove+0x2ce
  #4 0x80b5ac7d at vnode_pager_setsize+0x17d
  #5 0x8082102c at nfscl_loadattrcache+0x2cc
  #6 0x80818d37 at nfs_getattr+0x287
  #7 0x8098f1c0 at vn_stat+0xb0
  #8 0x809869d9 at kern_statat_vnhook+0xf9
  #9 0x80986b55 at kern_statat+0x15
  #10 0x80986c1a at sys_lstat+0x2a
  #11 0x80bd7ae6 at amd64_syscall+0x546
  #12 0x80bc3447 at Xfast_syscall+0xf7
  panic: sleeping thread
  cpuid = 0
  KDB: stack backtrace:
  #0 0x809208a6 at kdb_backtrace+0x66
  #1 0x808ea8be at panic+0x1ce
  #2 0x8092ed22 at propagate_priority+0x1d2
  #3 0x8092fa4e at turnstile_wait+0x1be
  #4 0x808d8d48 at _mtx_lock_sleep+0xd8
  #5 0x80820fa4 at nfscl_loadattrcache+0x244
  #6 0x8081758c at ncl_readrpc+0xac
  #7 0x80824c45 at ncl_getpages+0x485
  #8 0x80b5aa0c at vnode_pager_getpages+0x9c
  #9 0x80b3fc93 at vm_fault_hold+0x673
  #10 0x80b41cc3 at vm_fault+0x73
  #11 0x80bd84b4 at trap_pfault+0x124
  #12 0x80bd8c6c at trap+0x49c
  #13 0x80bc315f at calltrap+0x8
 [snip]
 
 I think that the regular mutex which is acquired via NFSLOCKNODE() in
 nfscl_loadattrcache() can not be held across vnode_pager_setsize.
 I am not sure though when vap->va_size != np->n_size case is triggered.

When the file is modified on the server outside of the control of
the client? E.g., by direct access on the server, or from another
client.

The only possible solution is to move the vnode_pager_setsize() outside
the scope of the n_mtx. This is somewhat problematic because the nfsiod
threads never bother to lock the vnode, so the truncation of the vm
cache becomes racy. Still, this is probably the best cure.
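
The general shape of such a change can be sketched in user space as
follows. This is only an illustration under stated assumptions: a
pthread mutex stands in for n_mtx, and struct node, pager_setsize()
and load_size() are hypothetical stand-ins, not the kernel interfaces.
The idea is to record the pending resize while the mutex is held and
to make the call that may sleep only after the mutex is dropped.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-node state guarded by n_mtx. */
struct node {
	pthread_mutex_t	mtx;		/* plays the role of n_mtx */
	uint64_t	size;		/* plays the role of n_size */
	uint64_t	pager_size;	/* size last handed to the "pager" */
};

/*
 * Hypothetical stand-in for vnode_pager_setsize(): assume it may
 * sleep, so it must not be called with np->mtx held.
 */
static void
pager_setsize(struct node *np, uint64_t nsize)
{
	np->pager_size = nsize;
}

/* Update the cached size; returns 1 if the pager was resized. */
static int
load_size(struct node *np, uint64_t server_size)
{
	uint64_t nsize = 0;
	int setnsize = 0;

	pthread_mutex_lock(&np->mtx);
	if (server_size != np->size) {
		np->size = server_size;
		/* Only remember the work; do not call the pager here. */
		setnsize = 1;
		nsize = server_size;
	}
	pthread_mutex_unlock(&np->mtx);

	/* The mutex is dropped, so a call that may sleep is allowed. */
	if (setnsize)
		pager_setsize(np, nsize);
	return (setnsize);
}
```

Note that the sketch inherits the race described above: once the mutex
is dropped, nothing orders two concurrent resize calls against each
other.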

Another issue I see there is that vnode_pager_setsize() call is only
performed for the VREG nodes. I believe that it is possible to cache
the pages for the directories as well.

Would you work out the patch ?




Re: Core Dump / panic sleeping thread

2013-03-19 Thread Rick Macklem
Andriy Gapon wrote:
 on 19/03/2013 19:35 Jeremy Chadwick said the following:
  On Tue, Mar 19, 2013 at 06:18:06PM +0100, Michael Landin Hostbaek
  wrote:
 [snip]
  Unread portion of the kernel message buffer:
  Sleeping thread (tid 100256, pid 85641) owns a non-sleepable lock
  KDB: stack backtrace of thread 100256:
  #0 0x808f2d46 at mi_switch+0x186
  #1 0x8092bb52 at sleepq_wait+0x42
  #2 0x808f34d6 at _sleep+0x376
  #3 0x80b4f3ae at vm_object_page_remove+0x2ce
  #4 0x80b5ac7d at vnode_pager_setsize+0x17d
  #5 0x8082102c at nfscl_loadattrcache+0x2cc
  #6 0x80818d37 at nfs_getattr+0x287
  #7 0x8098f1c0 at vn_stat+0xb0
  #8 0x809869d9 at kern_statat_vnhook+0xf9
  #9 0x80986b55 at kern_statat+0x15
  #10 0x80986c1a at sys_lstat+0x2a
  #11 0x80bd7ae6 at amd64_syscall+0x546
  #12 0x80bc3447 at Xfast_syscall+0xf7
  panic: sleeping thread
  cpuid = 0
  KDB: stack backtrace:
  #0 0x809208a6 at kdb_backtrace+0x66
  #1 0x808ea8be at panic+0x1ce
  #2 0x8092ed22 at propagate_priority+0x1d2
  #3 0x8092fa4e at turnstile_wait+0x1be
  #4 0x808d8d48 at _mtx_lock_sleep+0xd8
  #5 0x80820fa4 at nfscl_loadattrcache+0x244
  #6 0x8081758c at ncl_readrpc+0xac
  #7 0x80824c45 at ncl_getpages+0x485
  #8 0x80b5aa0c at vnode_pager_getpages+0x9c
  #9 0x80b3fc93 at vm_fault_hold+0x673
  #10 0x80b41cc3 at vm_fault+0x73
  #11 0x80bd84b4 at trap_pfault+0x124
  #12 0x80bd8c6c at trap+0x49c
  #13 0x80bc315f at calltrap+0x8
 [snip]
 
 I think that the regular mutex which is acquired via NFSLOCKNODE() in
 nfscl_loadattrcache() can not be held across vnode_pager_setsize.
 I am not sure though when vap->va_size != np->n_size case is
 triggered.
 
Yep, I'd agree to that. The same bug is in the old NFS client and
the new NFS client cribbed the code from there.

I have attached a simple patch that unlocks the mutex for the
vnode_pager_setsize() call. Maybe you could test it?

Thanks for reporting this, rick
ps: Hopefully patch(1) can apply this patch (there have been
recent changes to this file, so the line#s could be off).
It should be easy to do manually if not. The change is
in nfscl_loadattrcache() in sys/fs/nfsclient/nfs_clport.c.


  You're going to need to provide the following details:
 
  1. Contents of /etc/rc.conf
  2. Contents of /etc/sysctl.conf (if modified)
  3. Contents of /etc/fstab
  4. ifconfig -a
  5. OS used by the NFS server, and all configuration details
  pertaining
  to that system
 
  You may also be asked to upgrade to 9.1-STABLE, as there may be
  fixes
  for whatever this is in base/stable/9 that are not in -RELEASE, but
  this
  is speculative on my part.
 
 I do not see a need for any of these.
 
 --
 Andriy Gapon
--- fs/nfsclient/nfs_clport.c.savit	2013-03-19 18:37:33.0 -0400
+++ fs/nfsclient/nfs_clport.c	2013-03-19 18:44:21.0 -0400
@@ -444,7 +444,9 @@ nfscl_loadattrcache(struct vnode **vpp, 
 				np->n_size = vap->va_size;
 				np->n_flag |= NSIZECHANGED;
 			}
+			NFSUNLOCKNODE(np);
 			vnode_pager_setsize(vp, np->n_size);
+			NFSLOCKNODE(np);
 		} else {
 			np->n_size = vap->va_size;
 		}