Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-12 Thread Kostik Belousov
On Thu, Dec 11, 2008 at 02:53:49PM -0800, David Wolfskill wrote:
 On Wed, Dec 10, 2008 at 07:06:20PM +0200, Kostik Belousov wrote:
  ...
   What concerns me is that even if the attempted unmount gets EBUSY, the
   user-level process descending the directory hierarchy is getting ENOENT
   trying to issue fstatfs() against an open file descriptor.
   
   I'm having trouble figuring out any way that makes any sense.
  
  Basically, the problem is that NFS uses shared lookups, which allows
  a bug where several negative namecache entries are created for a
  non-existent node. When the node is then created, only the first
  negative namecache entry is removed. For some reason the vnode is then
  reclaimed; amd's tasting of the unmount is a good way for that to happen.
  
  Now, you have an existing path and a stale negative cache entry. Peter
  Holm reported this first; I listed the relevant revisions that should
  fix this in my previous mail.
 
 Well, I messed up the machine I had been using for testing, and needed
 to wait for IT to do something to it since I don't have physical or
 console access to it.
 
 So after I happened to demonstrate the effect using my desktop  -- which
 had been running RELENG_7_1, sources updated as of around 0400 hrs.
 US/Pacific -- I decided to go ahead and update the desktop to RELENG_7_1
 as of this morning (which had the commit to sys/kern/vfs_cache.c), then
 test.
 
 It still failed, apparently in the same way; details below.
 
 First, here's a list of the files that were changed:
 
 U lib/libarchive/archive_read_support_format_iso9660.c
 U lib/libarchive/archive_string.c
 U lib/libarchive/archive_string.h
 U lib/libc/gen/times.3
 U lib/libc/i386/sys/pipe.S
 U lib/libc/i386/sys/reboot.S
 U lib/libc/i386/sys/setlogin.S
 U lib/libutil/Makefile
 U lib/libutil/kinfo_getfile.c
 U lib/libutil/kinfo_getvmmap.c
 U lib/libutil/libutil.h
 U share/man/man4/bce.4
 U share/man/man5/Makefile
 U share/man/man5/fstab.5
 U share/man/man5/nullfs.5
 U sys/amd64/Makefile
 U sys/boot/forth/loader.conf.5
 U sys/dev/ale/if_ale.c
 U sys/dev/bce/if_bce.c
 U sys/dev/cxgb/cxgb_main.c
 U sys/dev/cxgb/common/cxgb_ael1002.c
 U sys/dev/cxgb/common/cxgb_t3_hw.c
 U sys/dev/cxgb/common/cxgb_xgmac.c
 U sys/dev/re/if_re.c
 U sys/fs/nullfs/null_vnops.c
 U sys/kern/Make.tags.inc
 U sys/kern/kern_descrip.c
 U sys/kern/kern_proc.c
 U sys/kern/vfs_cache.c
 U sys/netinet/in_pcb.h
 U sys/pci/if_rlreg.h
 U sys/sys/sysctl.h
 U sys/sys/user.h
 U sys/ufs/ufs/ufs_quota.c
 U usr.bin/procstat/Makefile
 U usr.bin/procstat/procstat_files.c
 U usr.bin/procstat/procstat_vm.c
 U usr.bin/tar/util.c
 U usr.bin/tar/test/Makefile
 U usr.bin/tar/test/test_strip_components.c
 U usr.bin/tar/test/test_symlink_dir.c
 U usr.bin/xargs/xargs.1
 U usr.sbin/mtree/mtree.c
 
 We see that sys/kern/vfs_cache.c is, indeed, among them.  And:
 
 dwolf-bsd(7.1-P)[5] grep '\$FreeBSD' /sys/kern/vfs_cache.c
__FBSDID("$FreeBSD: src/sys/kern/vfs_cache.c,v 1.114.2.3 2008/12/09 16:20:58 kib Exp $");
 dwolf-bsd(7.1-P)[6] 
 
 That should correspond to the desired version of the file.
 
 Here we see an excerpt from the ktrace output for the amd(8) process and
 its children; this is a point when amd(8) is trying an unmount() to see
 if it can get away with it:
 
    977 amd  1229033597.269612 CALL  gettimeofday(0x807ad48,0)
    977 amd  1229033597.269620 RET   gettimeofday 0
    977 amd  1229033597.269630 CALL  sigprocmask(SIG_BLOCK,0xbfbfeaec,0xbfbfeadc)
    977 amd  1229033597.269637 RET   sigprocmask 0
    977 amd  1229033597.269645 CALL  fork
    977 amd  1229033597.273810 RET   fork 1712/0x6b0
   1712 amd  1229033597.273811 RET   fork 0
    977 amd  1229033597.273836 CALL  sigprocmask(SIG_SETMASK,0xbfbfeadc,0)
   1712 amd  1229033597.273845 CALL  getpid
    977 amd  1229033597.273850 RET   sigprocmask 0
   1712 amd  1229033597.273855 RET   getpid 1712/0x6b0
    977 amd  1229033597.273864 CALL  gettimeofday(0x807ad48,0)
    977 amd  1229033597.273874 RET   gettimeofday 0
   1712 amd  1229033597.273878 CALL  unmount(0x2832c610,invalid0)
 ...
   1712 amd  1229033597.352643 RET   unmount -1 errno 16 Device busy
   1712 amd  1229033597.352695 CALL  sigprocmask(SIG_BLOCK,0x28097c00,0xbfbfea0c)
   1712 amd  1229033597.352728 RET   sigprocmask 0
   1712 amd  1229033597.352751 CALL  sigprocmask(SIG_SETMASK,0x28097c10,0)
   1712 amd  1229033597.352769 RET   sigprocmask 0
   1712 amd  1229033597.352781 CALL  sigprocmask(SIG_BLOCK,0x28097c00,0xbfbfe9dc)
   1712 amd  1229033597.352790 RET   sigprocmask 0
   1712 amd  1229033597.352801 CALL  sigprocmask(SIG_SETMASK,0x28097c10,0)
   1712 amd  1229033597.352805 RET   sigprocmask 0
   1712 amd  1229033597.352815 CALL  exit(0x10)
    977 amd  1229033597.353085 RET   select -1 errno 4 Interrupted system call
    977 amd  1229033597.353093 PSIG  SIGCHLD caught handler=0x805de50 mask=0x0 code=0x0
977 amd  

Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-12 Thread David Wolfskill
On Fri, Dec 12, 2008 at 03:41:29PM +0200, Kostik Belousov wrote:
 ...
  * At 1229033597.287187 it issues an fstatfs() against FD 4; the
unsuccessful return is at 1229033597.287195, claiming ENOENT.
  
  Say WHAT??!?
 ...
 
 But is this error transient or permanent? I.e., would a restart of rm
 succeed or fail?

In a test yesterday, it took 3 attempts (each attempt being an
invocation of rm -fr ~bspace/ports) to actually complete removal of
the hierarchy.

Please note that:

* Done on a locally-mounted file system (vs. NFS), a single invocation
  is sufficient and terminates normally.  Each of the above-cited
  attempts but the last terminated with a status code of 1 (as well as
  a whine that one or more subdirectories were not empty -- this, as a
  result of rm getting inconsistent information about the state of the
  file system).

* Done on either a locally- or NFS-mounted file system in FreeBSD 6.x, a
  single invocation is sufficient and terminates normally.

In other words, this is a regression.

 Anyway, this error looks different too.

?  From the earlier-posted results in 7.x?  Not that I can tell.  In
each case, the amd(8) child process is forked to attempt an unmount(),
tries it, gets EBUSY, and exits.  Meanwhile, rm(1) is descending a
directory tree.  It had performed a readdir(), and had been unlinking
files and performing rmdir() against empty subdirectories.  It
encounters an entry, issues stat(), finds that it's a subdirectory,
open()s it, gets an FD, issues fstat(), gets results that match those of
the earlier stat(), issues fcntl() against the FD (which returns 0),
tries to issue fstatfs() against the FD *that is still open*, and gets
told ENOENT.

It does differ from the behavior in 8-CURRENT, in that the amd(8) child
process in 8-CURRENT does not appear to get EBUSY.  The behavior from
rm(1)'s perspective is very similar, though.

If it would help, I could try getting a ktrace from a 6.x system, but I
expect it will be very boring: the amd(8) child process should get EBUSY
(as it does in 7.x), and nothing else should happen, since the unmount()
attempt failed.  And since it failed, rm(1) doesn't get told
inconsistent information, so things Just Work.

I admit that I'm no expert on VFS or much of the rest of the kernel,
for that matter.  But what I have observed happening in recent 7.x
is both wrong and a regression.

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-11 Thread David Wolfskill
On Wed, Dec 10, 2008 at 07:06:20PM +0200, Kostik Belousov wrote:
 ...
  What concerns me is that even if the attempted unmount gets EBUSY, the
  user-level process descending the directory hierarchy is getting ENOENT
  trying to issue fstatfs() against an open file descriptor.
  
  I'm having trouble figuring out any way that makes any sense.
 
 Basically, the problem is that NFS uses shared lookups, which allows
 a bug where several negative namecache entries are created for a
 non-existent node. When the node is then created, only the first
 negative namecache entry is removed. For some reason the vnode is then
 reclaimed; amd's tasting of the unmount is a good way for that to happen.
 
 Now, you have an existing path and a stale negative cache entry. Peter
 Holm reported this first; I listed the relevant revisions that should
 fix this in my previous mail.

Well, I messed up the machine I had been using for testing, and needed
to wait for IT to do something to it since I don't have physical or
console access to it.

So after I happened to demonstrate the effect using my desktop  -- which
had been running RELENG_7_1, sources updated as of around 0400 hrs.
US/Pacific -- I decided to go ahead and update the desktop to RELENG_7_1
as of this morning (which had the commit to sys/kern/vfs_cache.c), then
test.

It still failed, apparently in the same way; details below.

First, here's a list of the files that were changed:

U lib/libarchive/archive_read_support_format_iso9660.c
U lib/libarchive/archive_string.c
U lib/libarchive/archive_string.h
U lib/libc/gen/times.3
U lib/libc/i386/sys/pipe.S
U lib/libc/i386/sys/reboot.S
U lib/libc/i386/sys/setlogin.S
U lib/libutil/Makefile
U lib/libutil/kinfo_getfile.c
U lib/libutil/kinfo_getvmmap.c
U lib/libutil/libutil.h
U share/man/man4/bce.4
U share/man/man5/Makefile
U share/man/man5/fstab.5
U share/man/man5/nullfs.5
U sys/amd64/Makefile
U sys/boot/forth/loader.conf.5
U sys/dev/ale/if_ale.c
U sys/dev/bce/if_bce.c
U sys/dev/cxgb/cxgb_main.c
U sys/dev/cxgb/common/cxgb_ael1002.c
U sys/dev/cxgb/common/cxgb_t3_hw.c
U sys/dev/cxgb/common/cxgb_xgmac.c
U sys/dev/re/if_re.c
U sys/fs/nullfs/null_vnops.c
U sys/kern/Make.tags.inc
U sys/kern/kern_descrip.c
U sys/kern/kern_proc.c
U sys/kern/vfs_cache.c
U sys/netinet/in_pcb.h
U sys/pci/if_rlreg.h
U sys/sys/sysctl.h
U sys/sys/user.h
U sys/ufs/ufs/ufs_quota.c
U usr.bin/procstat/Makefile
U usr.bin/procstat/procstat_files.c
U usr.bin/procstat/procstat_vm.c
U usr.bin/tar/util.c
U usr.bin/tar/test/Makefile
U usr.bin/tar/test/test_strip_components.c
U usr.bin/tar/test/test_symlink_dir.c
U usr.bin/xargs/xargs.1
U usr.sbin/mtree/mtree.c

We see that sys/kern/vfs_cache.c is, indeed, among them.  And:

dwolf-bsd(7.1-P)[5] grep '\$FreeBSD' /sys/kern/vfs_cache.c
__FBSDID("$FreeBSD: src/sys/kern/vfs_cache.c,v 1.114.2.3 2008/12/09 16:20:58 kib Exp $");
dwolf-bsd(7.1-P)[6] 

That should correspond to the desired version of the file.

Here we see an excerpt from the ktrace output for the amd(8) process and
its children; this is a point when amd(8) is trying an unmount() to see
if it can get away with it:

   977 amd  1229033597.269612 CALL  gettimeofday(0x807ad48,0)
   977 amd  1229033597.269620 RET   gettimeofday 0
   977 amd  1229033597.269630 CALL  sigprocmask(SIG_BLOCK,0xbfbfeaec,0xbfbfeadc)
   977 amd  1229033597.269637 RET   sigprocmask 0
   977 amd  1229033597.269645 CALL  fork
   977 amd  1229033597.273810 RET   fork 1712/0x6b0
  1712 amd  1229033597.273811 RET   fork 0
   977 amd  1229033597.273836 CALL  sigprocmask(SIG_SETMASK,0xbfbfeadc,0)
  1712 amd  1229033597.273845 CALL  getpid
   977 amd  1229033597.273850 RET   sigprocmask 0
  1712 amd  1229033597.273855 RET   getpid 1712/0x6b0
   977 amd  1229033597.273864 CALL  gettimeofday(0x807ad48,0)
   977 amd  1229033597.273874 RET   gettimeofday 0
  1712 amd  1229033597.273878 CALL  unmount(0x2832c610,invalid0)
...
  1712 amd  1229033597.352643 RET   unmount -1 errno 16 Device busy
  1712 amd  1229033597.352695 CALL  sigprocmask(SIG_BLOCK,0x28097c00,0xbfbfea0c)
  1712 amd  1229033597.352728 RET   sigprocmask 0
  1712 amd  1229033597.352751 CALL  sigprocmask(SIG_SETMASK,0x28097c10,0)
  1712 amd  1229033597.352769 RET   sigprocmask 0
  1712 amd  1229033597.352781 CALL  sigprocmask(SIG_BLOCK,0x28097c00,0xbfbfe9dc)
  1712 amd  1229033597.352790 RET   sigprocmask 0
  1712 amd  1229033597.352801 CALL  sigprocmask(SIG_SETMASK,0x28097c10,0)
  1712 amd  1229033597.352805 RET   sigprocmask 0
  1712 amd  1229033597.352815 CALL  exit(0x10)
   977 amd  1229033597.353085 RET   select -1 errno 4 Interrupted system call
   977 amd  1229033597.353093 PSIG  SIGCHLD caught handler=0x805de50 mask=0x0 code=0x0
   977 amd  1229033597.353103 CALL  wait4(0x,0xbfbfe83c,WNOHANG,0)
   977 amd  1229033597.353116 RET   wait4 1712/0x6b0
   977 amd  1229033597.353122 CALL  

Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-10 Thread Kostik Belousov
On Tue, Dec 09, 2008 at 02:20:05PM -0800, Julian Elischer wrote:
 Kostik Belousov wrote:
 On Tue, Dec 09, 2008 at 11:01:10AM -0800, David Wolfskill wrote:
 On Tue, Dec 02, 2008 at 04:15:38PM -0800, David Wolfskill wrote:
 I seem to have a fairly (though not deterministically) reproducible
 mode of failure with an NFS-mounted directory hierarchy:  An attempt to
 traverse a sufficiently large hierarchy (e.g., via tar zcpf or rm
 -fr) will fail to visit some subdirectories, typically apparently
 acting as if the subdirectories in question do not actually exist
 (despite the names having been returned in the output of a previous
 readdir()).
 ... 
 
 Did you see my previous answer? The supposed fix for your problem was
 committed to head as r185557, and MFCed to 7 in r185796 and to
 7.1 in r185801.
 
 Please test with latest sources.
 
 
 did you notice that he tested with latest -current and releng 7?

Yes, and the failure mode on HEAD looks like a different issue.




Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-10 Thread David Wolfskill
On Wed, Dec 10, 2008 at 11:30:26AM -0500, Rick Macklem wrote:
... 
 The different behaviour for -CURRENT could be the newer RPC layer that
 was recently introduced, but that doesn't explain the basic problem.

OK.

 All I can think of is to ask the obvious question. Are you using
 interruptible or soft mounts? If so, switch to hard mounts and see
 if the problem goes away. (imho, neither interruptible nor soft mounts
 are a good idea. You can use a forced dismount if there is a crashed
 NFS server that isn't coming back anytime soon.)

From examination of /etc/amd* -- I don't see how to get mount(8) or
amq(8) to report it -- it appears that we are using interruptible
mounts, as we always have.

The point is that the behavior has changed in an unexpected way.  And
I'm not so sure that the use of a forced dismount is generally
available, as it would require logging in to the NFS client first, which
may be difficult if the NFS server hosting non-root home directories is
failing to respond and direct root login via ssh(1) is not permitted (as
is the default).

 If you are getting this with hard mounts, I'm afraid I have no idea
 what the problem is, rick.

What concerns me is that even if the attempted unmount gets EBUSY, the
user-level process descending the directory hierarchy is getting ENOENT
trying to issue fstatfs() against an open file descriptor.

I'm having trouble figuring out any way that makes any sense.

Peace,
david
-- 
David H. Wolfskill  [EMAIL PROTECTED]
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-10 Thread Kostik Belousov
On Wed, Dec 10, 2008 at 08:50:22AM -0800, David Wolfskill wrote:
 On Wed, Dec 10, 2008 at 11:30:26AM -0500, Rick Macklem wrote:
 ... 
  The different behaviour for -CURRENT could be the newer RPC layer that
  was recently introduced, but that doesn't explain the basic problem.
 
 OK.
 
  All I can think of is to ask the obvious question. Are you using
  interruptible or soft mounts? If so, switch to hard mounts and see
  if the problem goes away. (imho, neither interruptible nor soft mounts
  are a good idea. You can use a forced dismount if there is a crashed
  NFS server that isn't coming back anytime soon.)
 
 From examination of /etc/amd* -- I don't see how to get mount(8) or
 amq(8) to report it -- it appears that we are using interruptible
 mounts, as we always have.
 
 The point is that the behavior has changed in an unexpected way.  And
 I'm not so sure that the use of a forced dismount is generally
 available, as it would require logging in to the NFS client first, which
 may be difficult if the NFS server hosting non-root home directories is
 failing to respond and direct root login via ssh(1) is not permitted (as
 is the default).
 
  If you are getting this with hard mounts, I'm afraid I have no idea
  what the problem is, rick.
 
 What concerns me is that even if the attempted unmount gets EBUSY, the
 user-level process descending the directory hierarchy is getting ENOENT
 trying to issue fstatfs() against an open file descriptor.
 
 I'm having trouble figuring out any way that makes any sense.

Basically, the problem is that NFS uses shared lookups, which allows
a bug where several negative namecache entries are created for a
non-existent node. When the node is then created, only the first
negative namecache entry is removed. For some reason the vnode is then
reclaimed; amd's tasting of the unmount is a good way for that to happen.

Now, you have an existing path and a stale negative cache entry. Peter
Holm reported this first; I listed the relevant revisions that should
fix this in my previous mail.




Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-10 Thread Rick Macklem



On Tue, 9 Dec 2008, David Wolfskill wrote:


On Tue, Dec 02, 2008 at 04:15:38PM -0800, David Wolfskill wrote:

I seem to have a fairly (though not deterministically) reproducible
mode of failure with an NFS-mounted directory hierarchy:  An attempt to
traverse a sufficiently large hierarchy (e.g., via tar zcpf or rm
-fr) will fail to visit some subdirectories, typically apparently
acting as if the subdirectories in question do not actually exist
(despite the names having been returned in the output of a previous
readdir()).
...


I was able to reproduce the external symptoms of the failure running
CURRENT as of yesterday, using rm -fr of a copy of a recent
/usr/ports hierarchy on an NFS-mounted file system as a test case.
However, I believe the mechanism may be a bit different -- while
still being other than what I would expect.

One aspect in which the externally-observable symptoms were different
(under CURRENT, vs. RELENG_7) is that under CURRENT, once the error
condition occurred, the NFS client machine was in a state where it
merely kept repeating

nfs server [EMAIL PROTECTED]:/volume: not responding

until I logged in as root & rebooted it.


The different behaviour for -CURRENT could be the newer RPC layer that
was recently introduced, but that doesn't explain the basic problem.

All I can think of is to ask the obvious question. Are you using
interruptible or soft mounts? If so, switch to hard mounts and see
if the problem goes away. (imho, neither interruptible nor soft mounts
are a good idea. You can use a forced dismount if there is a crashed
NFS server that isn't coming back anytime soon.)
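For reference, switching to hard mounts might look like this in /etc/fstab (server name and paths here are hypothetical placeholders; with amd(8) the options would instead live in the opts field of the amd map entry, e.g. opts:=rw,hard):

```
# Hypothetical /etc/fstab entries; "filer:/vol/home" is a placeholder.
# Interruptible mount, as in the failing setup:
#filer:/vol/home   /home   nfs   rw,intr   0   0
# Hard mount, as suggested above:
filer:/vol/home    /home   nfs   rw,hard   0   0
```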

If you are getting this with hard mounts, I'm afraid I have no idea
what the problem is, rick.

___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-09 Thread David Wolfskill
On Tue, Dec 02, 2008 at 04:15:38PM -0800, David Wolfskill wrote:
 I seem to have a fairly (though not deterministically) reproducible
 mode of failure with an NFS-mounted directory hierarchy:  An attempt to
 traverse a sufficiently large hierarchy (e.g., via tar zcpf or rm
 -fr) will fail to visit some subdirectories, typically apparently
 acting as if the subdirectories in question do not actually exist
 (despite the names having been returned in the output of a previous
 readdir()).
 ... 

I was able to reproduce the external symptoms of the failure running
CURRENT as of yesterday, using rm -fr of a copy of a recent
/usr/ports hierarchy on an NFS-mounted file system as a test case.
However, I believe the mechanism may be a bit different -- while
still being other than what I would expect.

One aspect in which the externally-observable symptoms were different
(under CURRENT, vs. RELENG_7) is that under CURRENT, once the error
condition occurred, the NFS client machine was in a state where it
merely kept repeating

nfs server [EMAIL PROTECTED]:/volume: not responding

until I logged in as root & rebooted it.


Here's a cut/paste of the kdump from the ktrace of the amd(8) process
under CURRENT, showing where the master amd(8) process (pid 848)
forks a child (4126) to try the unmount:

   848 amd  1228846258.722953 CALL  gettimeofday(0x8078e48,0)
   848 amd  1228846258.722964 RET   gettimeofday 0
   848 amd  1228846258.722982 CALL  sigprocmask(SIG_BLOCK,0xbfbfeaec,0xbfbfeadc)
   848 amd  1228846258.722993 RET   sigprocmask 0
   848 amd  1228846258.723003 CALL  fork
   848 amd  1228846258.730250 RET   fork 4126/0x101e
   848 amd  1228846258.730405 CALL  sigprocmask(SIG_SETMASK,0xbfbfeadc,0)
  4126 amd  1228846258.730252 RET   fork 0
  4126 amd  1228846258.730456 CALL  getpid
  4126 amd  1228846258.730467 RET   getpid 4126/0x101e
  4126 amd  1228846258.730493 CALL  unmount(0x2825f340,invalid0)
   848 amd  1228846258.730422 RET   sigprocmask 0
   848 amd  1228846258.730595 CALL  gettimeofday(0x8078e48,0)
   848 amd  1228846258.730608 RET   gettimeofday 0
...
   848 amd  1228846258.914814 CALL  sigprocmask(SIG_SETMASK,0xbfbfeba0,0)
   848 amd  1228846258.914826 RET   sigprocmask 0
   848 amd  1228846258.914838 CALL  select(0x400,0xbfbfec40,0,0,0xbfbfecd8)
  4126 amd  1228846259.090428 RET   unmount 0
  4126 amd  1228846259.090492 CALL  sigprocmask(SIG_BLOCK,0x2809b080,0xbfbfea0c)
  4126 amd  1228846259.090505 RET   sigprocmask 0
  4126 amd  1228846259.090518 CALL  sigprocmask(SIG_SETMASK,0x2809b090,0)
  4126 amd  1228846259.090530 RET   sigprocmask 0
  4126 amd  1228846259.090545 CALL  sigprocmask(SIG_BLOCK,0x2809b080,0xbfbfe9dc)
  4126 amd  1228846259.090556 RET   sigprocmask 0
  4126 amd  1228846259.090576 CALL  sigprocmask(SIG_SETMASK,0x2809b090,0)
  4126 amd  1228846259.090587 RET   sigprocmask 0
  4126 amd  1228846259.090605 CALL  exit(0)
   848 amd  1228846259.091248 RET   select -1 errno 4 Interrupted system call
   848 amd  1228846259.091277 PSIG  SIGCHLD caught handler=0x805e090 mask=0x0 code=0x0
   848 amd  1228846259.091298 CALL  wait4(0x,0xbfbfe83c,WNOHANG,0)
   848 amd  1228846259.091329 RET   wait4 4126/0x101e
   848 amd  1228846259.091342 CALL  wait4(0x,0xbfbfe83c,WNOHANG,0)
   848 amd  1228846259.091352 RET   wait4 -1 errno 10 No child processes
   848 amd  1228846259.091365 CALL  sigprocmask(SIG_SETMASK,0x80795bc,0)
   848 amd  1228846259.091377 RET   sigprocmask 0
   848 amd  1228846259.091390 CALL  sigprocmask(SIG_BLOCK,0x80792c4,0)
   848 amd  1228846259.091401 RET   sigprocmask 0
   848 amd  1228846259.091411 CALL  gettimeofday(0x8078e48,0)
   848 amd  1228846259.091422 RET   gettimeofday 0

Note that while the child didn't get EBUSY (as it does under RELENG_7)
-- indeed, the unmount call appears to have returned 0 -- the master
amd(8) process looks to be seeing errno 4 Interrupted system call.


And here's a relevant part of the kdump from the rm -fr -- I had
kdump emit Epoch timestamps with each entry in order to make correlation
easier:

  4121 rm   1228846258.736266 CALL  unlink(0x2821c148)
  4121 rm   1228846258.736281 NAMI  distinfo
  4121 rm   1228846258.738329 RET   unlink 0
  4121 rm   1228846258.738379 CALL  unlink(0x2821c1b8)
  4121 rm   1228846258.738401 NAMI  pkg-descr
  4121 rm   1228846258.739963 RET   unlink 0
  4121 rm   1228846258.739982 CALL  open(0x28178b6b,O_RDONLY,unused0)
  4121 rm   1228846258.740002 NAMI  ..
  4121 rm   1228846258.740541 RET   open 4
  4121 rm   1228846258.740558 CALL  fstat(0x4,0xbfbfe96c)
  4121 rm   1228846258.740579 STRU  struct stat {dev=67174155, 
ino=22674937, mode=drwxr-xr-x , nlink=114, uid=9874, gid=929, rdev=0, 
atime=1228846258.184514000, stime
=1228846258.779501000, ctime=1228846258.779501000, birthtime=-1, 

Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-09 Thread Kostik Belousov
On Tue, Dec 09, 2008 at 11:01:10AM -0800, David Wolfskill wrote:
 On Tue, Dec 02, 2008 at 04:15:38PM -0800, David Wolfskill wrote:
  I seem to have a fairly (though not deterministically) reproducible
  mode of failure with an NFS-mounted directory hierarchy:  An attempt to
  traverse a sufficiently large hierarchy (e.g., via tar zcpf or rm
  -fr) will fail to visit some subdirectories, typically apparently
  acting as if the subdirectories in question do not actually exist
  (despite the names having been returned in the output of a previous
  readdir()).
  ... 
 

Did you see my previous answer? The supposed fix for your problem was
committed to head as r185557, and MFCed to 7 in r185796 and to
7.1 in r185801.

Please test with latest sources.




Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-09 Thread Julian Elischer

Kostik Belousov wrote:

On Tue, Dec 09, 2008 at 11:01:10AM -0800, David Wolfskill wrote:

On Tue, Dec 02, 2008 at 04:15:38PM -0800, David Wolfskill wrote:

I seem to have a fairly (though not deterministically) reproducible
mode of failure with an NFS-mounted directory hierarchy:  An attempt to
traverse a sufficiently large hierarchy (e.g., via tar zcpf or rm
-fr) will fail to visit some subdirectories, typically apparently
acting as if the subdirectories in question do not actually exist
(despite the names having been returned in the output of a previous
readdir()).
... 


Did you see my previous answer? The supposed fix for your problem was
committed to head as r185557, and MFCed to 7 in r185796 and to
7.1 in r185801.

Please test with latest sources.



did you notice that he tested with latest -current and releng 7?



Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-09 Thread David Wolfskill
On Tue, Dec 09, 2008 at 02:20:05PM -0800, Julian Elischer wrote:
 Kostik Belousov wrote:
 ...
 Did you see my previous answer? The supposed fix for your problem was
 committed to head as r185557, and MFCed to 7 in r185796 and to
 7.1 in r185801.
 
 Please test with latest sources.
 
 did you notice that he tested with latest -current and releng 7?

CURRENT was as of yesterday, as was RELENG_7; kib@'s commit hit HEAD
on 02 Dec, but didn't hit RELENG_7 until after I grabbed the RELENG_7
sources yesterday.

I have some local infrastructure hassles to deal with so I can update
the sources in question, but I will test RELENG_7 with the commit &
report back.

Peace,
david
-- 
David H. Wolfskill  [EMAIL PROTECTED]
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-03 Thread Danny Braniss
 
 I seem to have a fairly (though not deterministically) reproducible
 mode of failure with an NFS-mounted directory hierarchy:  An attempt to
 traverse a sufficiently large hierarchy (e.g., via tar zcpf or rm
 -fr) will fail to visit some subdirectories, typically apparently
 acting as if the subdirectories in question do not actually exist
 (despite the names having been returned in the output of a previous
 readdir()).
 
 The file system is mounted read-write, courtesy of amd(8); none of
 the files has any non-default flags; there are no ACLs involved;
 and I owned the lot (that is, as owning user of the files).
 
 An example of "sufficiently large" has been demonstrated to be a recent
 copy of a FreeBSD ports tree.  (The problem was discovered using a
 hierarchy that had some proprietary content; I tried a copy of the ports
 tree to see if I could replicate the issue with something a FreeBSD
 hacker would more likely have handy.  And avoid NDA issues.  :-})
 
 Now, before I go further: I'm not pointing the finger at FreeBSD,
 here (yet).  At minimum, there could be fault with FreeBSD (as the NFS
 client); with amd(8); with the NetApp Filer (as the NFS server);
 or the network -- or the configuration(s) of any of them.
 
 But I just tried this, using the same NFS server, but a machine running
 Solaris 8 as an NFS client, and was unable to re-create the problem.
 
 And I found a way to avoid having the problem occur using a FreeBSD NFS
 client:  whack amd(8)'s config so that the dismount_interval is 12 hours
 instead of the default 2 minutes, thus effectively preventing amd(8) from
 its normal attempts to unmount file systems.  Please note that I don't
 consider this a fix -- or even an acceptable circumvention, in the long
 term.  Rather, it's a diagnostic change, in an attempt to better
 understand the nature of the problem.
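For reference, the diagnostic change described above might look like this in amd.conf (a hypothetical excerpt; 12 hours = 43200 seconds, vs. the default dismount_interval of 120):

```
# Hypothetical amd.conf excerpt: stretch the dismount interval from the
# default 120 seconds to 12 hours as a diagnostic measure.
[ global ]
dismount_interval = 43200
```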
 
 Here are step-by-step instructions to recreate the problem;
 unfortunately, I believe I don't have the resources to test this
 anywhere but at work, though I will try it at home, to the extent
 that I can:
 
 * Set up the environment.
   * The failing environment uses NetApp filers as NFS servers.  I don't
 know what kind or how recent the software is on them, but can
 find out.  (I expect they're fairly well-maintained.)
   * Ensure that the NFS space available is at least 10 GB.
 I will refer to this as ~/NFS/, as I tend to create such symlinks
 to keep track of things.
   * I used a dual, quad-core machine running FreeBSD RELENG_7_1 as of
 yesterday morning as an NFS client.  It also had a recently-updated
 /usr/ports tree, which was a CVS working directory (so each real
 subdirectory also had a CVS subdirectory within it).
   * Set up amd(8) so that ~/NFS is mounted on demand when it's
 referenced, and only via amd(8).  Ensure that the dismount_interval
 has the default value of 120 seconds.
 * Create a reference tarball.
   * cd /usr && tar zcpf ~/NFS/ports.tgz ports/
 * Create the test directory hierarchy.
   * cd ~/NFS && tar zxpf ports.tgz
 * Clear any cache.
   * Unmount ~/NFS, then re-mount it.  Or just reboot the NFS client
 machine.  Or arrange to have done all of the above set-up stuff
 from a different NFS client.
 * Set up for information capture (optional).
   * Use ps(1) or your favorite alternative tool to determine the PID for
 amd(8).  Note that `cat /var/run/amd.pid` won't do the trick.  :-{
   * Run ktrace(1) to capture activity from amd(8) and its descendants,
 e.g.:
 
   sudo ktrace -dip ${amd_pid} -f ktrace_amd.out
 
   * Start a packet-capture for NFS traffic, e.g.:
 
   sudo tcpdump -s 0 -n -w nfs.bpf host ${nfs_server}
 
 * Start the test.
   * Do this under ktrace(1), if you did the above optional step:
 
   rm -fr ~/NFS/ports; echo $?
 
 As soon as rm(1) issues a whine, you might as well interrupt it
 (^C).
 
 * Stop the information capture, if you started it.
   * ^C for the tcpdump(1) process.
   * sudo ktrace -C
 
 
 If the packet capture file is too big for the analysis program you
 prefer to digest as a unit, see the net/tcpslice port for a bit of
 relief.  (Wireshark seems to want to read an entire packet capture file
 into main memory.)
 
 I have performed the above, with the information-gathering step; I can
 *probably* make that information available, but I'll need to check --
 some organizations get paranoid about things like host names.  I don't
 expect that my current employer is, but I don't know yet, so I won't
 promise.
 
 In the mean time, I should be able to extract somewhat-relevant
 information from what I've collected, if that would be useful.  While I
 wouldn't mind sharing the results, I strongly suspect that blow-by-blow
 analysis wouldn't be ideal for this (or any other) mailing list; I would
 be very happy to work 

Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-03 Thread David Wolfskill
On Wed, Dec 03, 2008 at 02:20:32PM +0200, Danny Braniss wrote:
 ...
 i'll try to check it here soon, but in the meantime, could you try the same
 but mounting directly, not via amd, to remove one item from the equation?
 (I don't know how much amd is involved here, but if you are running on a
 64-bit host, amd could be swapped out, in which case it tends to really screw
 things up, which is not your case, but ...)

Sorry; I should have mentioned that the NFS client was running
RELENG_7_1 as of Monday morning, i386 arch.  The amd.conf file specifies
plock for amd(8).

Note that merely telling amd(8) to kick the interval of attempted
unmounts from 2 minutes to 12 hours appears to avoid the observed
symptoms, so I'm fairly confident that bypassing amd(8) altogether would
do so as well.
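
For concreteness, the diagnostic change above amounts to something like the
following amd.conf fragment (the 12-hour figure is 43200 seconds; the
section and parameter names follow am-utils' amd.conf format, and any map
entries would go in their own sections):

```
[ global ]
# Default is 120 seconds; raising it effectively stops amd's
# periodic unmount attempts.  Diagnostic only -- not a fix.
dismount_interval = 43200
```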

In looking at the output from ktrace against amd(8), I recall having
seen that shortly before an observed failure, the (master) amd
process forks a child to attempt the unmount; the child issues an
unmount, the return for which is EBUSY (IIRC -- I'm not in a good
position to check just at the moment), so the child terminates with an
interrupted system call.

I'd have thought that since the attempted unmount failed, it wouldn't
make any difference, but it's right around that point that rm(1) is told
that a directory entry it found earlier doesn't exist, which rather
snowballs into the previously-described symptoms.
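
For reference, filtering the kdump(1) rendering of such a trace down to just
the unmount activity can be done along these lines.  The excerpt below is
fabricated for illustration (the PID, address, and signal line are made up;
the real data would come from `kdump -f ktrace_amd.out`); errno 16 is EBUSY:

```shell
# Hypothetical kdump(1)-style excerpt standing in for real trace output:
cat > /tmp/kdump_excerpt.txt <<'EOF'
  1234 amd      CALL  unmount(0x28201000,0)
  1234 amd      RET   unmount -1 errno 16 Device busy
  1234 amd      PSIG  SIGTERM caught handler=0x8052f40 mask=0x0
EOF
# Pull out just the unmount call/return pairs:
grep -E 'CALL  unmount|RET   unmount' /tmp/kdump_excerpt.txt
```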

Peace,
david
-- 
David H. Wolfskill  [EMAIL PROTECTED]
Depriving a girl or boy of an opportunity for education is evil.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-03 Thread Danny Braniss
 
 On Wed, Dec 03, 2008 at 02:20:32PM +0200, Danny Braniss wrote:
  ...
  i'll try to check it here soon, but in the meantime, could you try the same
  but mounting directly, not via amd, to remove one item from the equation?
  (I don't know how much amd is involved here, but if you are running on a
  64-bit host, amd could be swapped out, in which case it tends to really screw
  things up, which is not your case, but ...)
 
 Sorry; I should have mentioned that the NFS client was running
 RELENG_7_1 as of Monday morning, i386 arch.  The amd.conf file specifies
 plock for amd(8).
 
 Note that merely telling amd(8) to kick the interval of attempted
 unmounts from 2 minutes to 12 hours appears to avoid the observed
 symptoms, so I'm fairly confident that bypassing amd(8) altogether would
 do so as well.
 
 In looking at the output from ktrace against amd(8), I recall having
 seen that shortly before an observed failure, the (master) amd
 process forks a child to attempt the unmount; the child issues an
 unmount, the return for which is EBUSY (IIRC -- I'm not in a good
 position to check just at the moment), so the child terminates with an
 interrupted system call.
 
 I'd have thought that since the attempted unmount failed, it wouldn't
 make any difference, but it's right around that point that rm(1) is told
 that a directory entry it found earlier doesn't exist, which rather
 snowballs into the previously-described symptoms.

so it does point to amd - or something innocent it does - which triggers the
error.
btw, there are some patches (5, I think) that try to fix some of amd's
problems.  I've installed them, and things are quiet/ok -- most of the
time -- but I get a glitch once in a while; would love to iron them out,
though.

cheers,
danny


___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: NFS (& amd?) dysfunction descending a hierarchy

2008-12-03 Thread Kostik Belousov
On Tue, Dec 02, 2008 at 04:15:38PM -0800, David Wolfskill wrote:
 I seem to have a fairly (though not deterministically so) reproducible
 mode of failure with an NFS-mounted directory hierarchy:  An attempt to
 traverse a sufficiently large hierarchy (e.g., via tar zcpf or rm
 -fr) will fail to visit some subdirectories, typically apparently
 acting as if the subdirectories in question do not actually exist
 (despite the names having been returned in the output of a previous
 readdir()).
 
 The file system is mounted read-write, courtesy of amd(8); none of
 the files has any non-default flags; there are no ACLs involved;
 and I owned the lot (that is, as owning user of the files).
 
 An example of "sufficiently large" has been demonstrated to be a recent
 copy of a FreeBSD ports tree.  (The problem was discovered using a
 hierarchy that had some proprietary content; I tried a copy of the ports
 tree to see if I could replicate the issue with something a FreeBSD
 hacker would more likely have handy.  And avoid NDA issues.  :-})
 
 Now, before I go further: I'm not pointing the finger at FreeBSD,
 here (yet).  At minimum, there could be fault with FreeBSD (as the NFS
 client); with amd(8); with the NetApp Filer (as the NFS server);
 or the network -- or the configuration(s) of any of them.
 
 But I just tried this, using the same NFS server, but a machine running
 Solaris 8 as an NFS client, and was unable to re-create the problem.
 
 And I found a way to avoid having the problem occur using a FreeBSD NFS
 client:  whack amd(8)'s config so that the dismount_interval is 12 hours
 instead of the default 2 minutes, thus effectively preventing amd(8) from
 its normal attempts to unmount file systems.  Please note that I don't
 consider this a fix -- or even an acceptable circumvention, in the long
 term.  Rather, it's a diagnostic change, in an attempt to better
 understand the nature of the problem.
 
 Here are step-by-step instructions to recreate the problem;
 unfortunately, I believe I don't have the resources to test this
 anywhere but at work, though I will try it at home, to the extent
 that I can:
 
 * Set up the environment.
   * The failing environment uses NetApp filers as NFS servers.  I don't
 know what kind or how recent the software is on them, but can
 find out.  (I expect they're fairly well-maintained.)
   * Ensure that at least 10 GB of NFS space is available.
 I will refer to this as ~/NFS/, as I tend to create such symlinks
 to keep track of things.
   * I used a dual, quad-core machine running FreeBSD RELENG_7_1 as of
 yesterday morning as an NFS client.  It also had a recently-updated
 /usr/ports tree, which was a CVS working directory (so each real
 subdirectory also had a CVS subdirectory within it).
   * Set up amd(8) so that ~/NFS is mounted on demand when it's
 referenced, and only via amd(8).  Ensure that the dismount_interval
 has the default value of 120 seconds.
 * Create a reference tarball.
   * cd /usr && tar zcpf ~/NFS/ports.tgz ports/
 * Create the test directory hierarchy.
   * cd ~/NFS && tar zxpf ports.tgz
 * Clear any cache.
   * Unmount ~/NFS, then re-mount it.  Or just reboot the NFS client
 machine.  Or arrange to have done all of the above set-up stuff
 from a different NFS client.
 * Set up for information capture (optional).
   * Use ps(1) or your favorite alternative tool to determine the PID for
 amd(8).  Note that `cat /var/run/amd.pid` won't do the trick.  :-{
   * Run ktrace(1) to capture activity from amd(8) and its descendants,
 e.g.:
 
   sudo ktrace -dip ${amd_pid} -f ktrace_amd.out
 
   * Start a packet-capture for NFS traffic, e.g.:
 
   sudo tcpdump -s 0 -n -w nfs.bpf host ${nfs_server}
 
 * Start the test.
   * Do this under ktrace(1), if you did the above optional step:
 
   rm -fr ~/NFS/ports; echo $?
 
 As soon as rm(1) issues a whine, you might as well interrupt it
 (^C).
 
 * Stop the information capture, if you started it.
   * ^C for the tcpdump(1) process.
   * sudo ktrace -C
 
 
 If the packet capture file is too big for the analysis program you
 prefer to digest as a unit, see the net/tcpslice port for a bit of
 relief.  (Wireshark seems to want to read an entire packet capture file
 into main memory.)
 
 I have performed the above, with the information-gathering step; I can
 *probably* make that information available, but I'll need to check --
 some organizations get paranoid about things like host names.  I don't
 expect that my current employer is, but I don't know yet, so I won't
 promise.
 
 In the mean time, I should be able to extract somewhat-relevant
 information from what I've collected, if that would be useful.  While I
 wouldn't mind sharing the results, I strongly suspect that blow-by-blow
 analysis wouldn't be ideal for this (or any other) mailing list; I would
 be very happy to work with others to figure out what's gone wrong (or is
 misconfigured) and get things working properly.

NFS (& amd?) dysfunction descending a hierarchy

2008-12-02 Thread David Wolfskill
I seem to have a fairly (though not deterministically so) reproducible
mode of failure with an NFS-mounted directory hierarchy:  An attempt to
traverse a sufficiently large hierarchy (e.g., via tar zcpf or rm
-fr) will fail to visit some subdirectories, typically apparently
acting as if the subdirectories in question do not actually exist
(despite the names having been returned in the output of a previous
readdir()).

The file system is mounted read-write, courtesy of amd(8); none of
the files has any non-default flags; there are no ACLs involved;
and I owned the lot (that is, as owning user of the files).

An example of "sufficiently large" has been demonstrated to be a recent
copy of a FreeBSD ports tree.  (The problem was discovered using a
hierarchy that had some proprietary content; I tried a copy of the ports
tree to see if I could replicate the issue with something a FreeBSD
hacker would more likely have handy.  And avoid NDA issues.  :-})

Now, before I go further: I'm not pointing the finger at FreeBSD,
here (yet).  At minimum, there could be fault with FreeBSD (as the NFS
client); with amd(8); with the NetApp Filer (as the NFS server);
or the network -- or the configuration(s) of any of them.

But I just tried this, using the same NFS server, but a machine running
Solaris 8 as an NFS client, and was unable to re-create the problem.

And I found a way to avoid having the problem occur using a FreeBSD NFS
client:  whack amd(8)'s config so that the dismount_interval is 12 hours
instead of the default 2 minutes, thus effectively preventing amd(8) from
its normal attempts to unmount file systems.  Please note that I don't
consider this a fix -- or even an acceptable circumvention, in the long
term.  Rather, it's a diagnostic change, in an attempt to better
understand the nature of the problem.

Here are step-by-step instructions to recreate the problem;
unfortunately, I believe I don't have the resources to test this
anywhere but at work, though I will try it at home, to the extent
that I can:

* Set up the environment.
  * The failing environment uses NetApp filers as NFS servers.  I don't
know what kind or how recent the software is on them, but can
find out.  (I expect they're fairly well-maintained.)
  * Ensure that at least 10 GB of NFS space is available.
I will refer to this as ~/NFS/, as I tend to create such symlinks
to keep track of things.
  * I used a dual, quad-core machine running FreeBSD RELENG_7_1 as of
yesterday morning as an NFS client.  It also had a recently-updated
/usr/ports tree, which was a CVS working directory (so each real
subdirectory also had a CVS subdirectory within it).
  * Set up amd(8) so that ~/NFS is mounted on demand when it's
referenced, and only via amd(8).  Ensure that the dismount_interval
has the default value of 120 seconds.
* Create a reference tarball.
  * cd /usr && tar zcpf ~/NFS/ports.tgz ports/
* Create the test directory hierarchy.
  * cd ~/NFS && tar zxpf ports.tgz
* Clear any cache.
  * Unmount ~/NFS, then re-mount it.  Or just reboot the NFS client
machine.  Or arrange to have done all of the above set-up stuff
from a different NFS client.
* Set up for information capture (optional).
  * Use ps(1) or your favorite alternative tool to determine the PID for
amd(8).  Note that `cat /var/run/amd.pid` won't do the trick.  :-{
  * Run ktrace(1) to capture activity from amd(8) and its descendants,
e.g.:

sudo ktrace -dip ${amd_pid} -f ktrace_amd.out

  * Start a packet-capture for NFS traffic, e.g.:

sudo tcpdump -s 0 -n -w nfs.bpf host ${nfs_server}

* Start the test.
  * Do this under ktrace(1), if you did the above optional step:

rm -fr ~/NFS/ports; echo $?

As soon as rm(1) issues a whine, you might as well interrupt it
(^C).

* Stop the information capture, if you started it.
  * ^C for the tcpdump(1) process.
  * sudo ktrace -C
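
Mechanically, the capture-and-test steps above can be sketched as a small
script.  This is a dry-run sketch: "filer.example.com" and the output file
names are placeholders, pgrep(1) is just one of the "alternative tools" for
finding amd's PID, and DRYRUN=echo prints the privileged commands instead of
running them:

```shell
#!/bin/sh
# Dry-run sketch of the capture/test procedure; set DRYRUN= (empty)
# and run as root (or via sudo) to execute for real.
DRYRUN=echo
nfs_server=filer.example.com                     # placeholder server name
amd_pid=$(pgrep -x amd 2>/dev/null | head -n 1)  # /var/run/amd.pid won't do

{
    $DRYRUN ktrace -dip "${amd_pid:-NONE}" -f ktrace_amd.out
    $DRYRUN tcpdump -s 0 -n -w nfs.bpf host "${nfs_server}"
    $DRYRUN rm -fr ~/NFS/ports
    $DRYRUN ktrace -C
} | tee /tmp/nfs_repro_dryrun.txt
```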


If the packet capture file is too big for the analysis program you
prefer to digest as a unit, see the net/tcpslice port for a bit of
relief.  (Wireshark seems to want to read an entire packet capture file
into main memory.)

I have performed the above, with the information-gathering step; I can
*probably* make that information available, but I'll need to check --
some organizations get paranoid about things like host names.  I don't
expect that my current employer is, but I don't know yet, so I won't
promise.

In the mean time, I should be able to extract somewhat-relevant
information from what I've collected, if that would be useful.  While I
wouldn't mind sharing the results, I strongly suspect that blow-by-blow
analysis wouldn't be ideal for this (or any other) mailing list; I would
be very happy to work with others to figure out what's gone wrong (or is
misconfigured) and get things working properly.

If someone(s) would be willing to help, I'd appreciate it very much.  If
(enough) folks would actually prefer that the details stay in