Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-21 Thread J. Bruce Fields
On Fri, Sep 21, 2007 at 07:46:32PM +0100, Nix wrote:
> On 18 Sep 2007, J. Bruce Fields told this:
> > Also I suppose we should check which version of nfs-utils that fix is in
> > and make sure distributions are getting the fixed nfs-utils before they
> > get the new libc, or we're going to see this bug a lot
> 
> Further info. This behaviour, although it is allowed by POSIX, is not
> *specified*, as such, by it (ah! ambiguity! life would be so dull
> without you!), and is not replicated by any other C library on the face
> of the planet.
> 
> Further, it's not even the behaviour of glibc, as such.
> 
> It's the behaviour of glibc 2.6.x as patched with this Debian-specific
> patch:

Good grief.  Well, I'd still count it as a bug in nfs-utils, then, but
nfs-util's share of the blame is much less than we'd previously
thought

> This patch also breaks at least bash and zsh as well as nfs-utils (see
> ) and I sort of
> suspect it's not long for this world as a result.
> 
> So this is much less nasty than we feared (well, and more nasty if you
> happen to be a Debian unstable or testing user).

OK!  Sounds like good news.

--b.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-21 Thread Nix
On 18 Sep 2007, J. Bruce Fields told this:
> Also I suppose we should check which version of nfs-utils that fix is in
> and make sure distributions are getting the fixed nfs-utils before they
> get the new libc, or we're going to see this bug a lot

Further info. This behaviour, although it is allowed by POSIX, is not
*specified*, as such, by it (ah! ambiguity! life would be so dull
without you!), and is not replicated by any other C library on the face
of the planet.

Further, it's not even the behaviour of glibc, as such.

It's the behaviour of glibc 2.6.x as patched with this Debian-specific
patch:

,[ any/submitted-fileops-and-signals.diff ]
| Index: glibc-2.6/libio/fileops.c
| ===
| --- glibc-2.6.orig/libio/fileops.c
| +++ glibc-2.6/libio/fileops.c
| @@ -517,10 +517,19 @@
|  fp->_cur_column = INTUSE(_IO_adjust_column) (fp->_cur_column - 1, data,
|count) + 1;
|_IO_setg (fp, fp->_IO_buf_base, fp->_IO_buf_base, fp->_IO_buf_base);
| -  fp->_IO_write_base = fp->_IO_write_ptr = fp->_IO_buf_base;
| -  fp->_IO_write_end = (fp->_mode <= 0
| +  if (__builtin_expect (count == to_do, 1))
| +{
| +  fp->_IO_write_base = fp->_IO_write_ptr = fp->_IO_buf_base;
| +  fp->_IO_write_end = (fp->_mode <= 0
|  && (fp->_flags & (_IO_LINE_BUF+_IO_UNBUFFERED))
|  ? fp->_IO_buf_base : fp->_IO_buf_end);
| +}
| +  else if (count > 0 && fp->_IO_write_base == data)
| +{
| +  memmove (data, data+count, 
| +   fp->_IO_write_ptr - fp->_IO_write_base - count);
| +  fp->_IO_write_ptr -= count;
| +}
|return count;
|  }
|  
`

This patch also breaks at least bash and zsh as well as nfs-utils (see
) and I sort of
suspect it's not long for this world as a result.

So this is much less nasty than we feared (well, and more nasty if you
happen to be a Debian unstable or testing user).

-- 
`Some people don't think performance issues are "real bugs", and I think 
such people shouldn't be allowed to program.' --- Linus Torvalds
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-21 Thread Nix
On 18 Sep 2007, J. Bruce Fields told this:
 Also I suppose we should check which version of nfs-utils that fix is in
 and make sure distributions are getting the fixed nfs-utils before they
 get the new libc, or we're going to see this bug a lot

Further info. This behaviour, although it is allowed by POSIX, is not
*specified*, as such, by it (ah! ambiguity! life would be so dull
without you!), and is not replicated by any other C library on the face
of the planet.

Further, it's not even the behaviour of glibc, as such.

It's the behaviour of glibc 2.6.x as patched with this Debian-specific
patch:

,[ any/submitted-fileops-and-signals.diff ]
| Index: glibc-2.6/libio/fileops.c
| ===
| --- glibc-2.6.orig/libio/fileops.c
| +++ glibc-2.6/libio/fileops.c
| @@ -517,10 +517,19 @@
|  fp-_cur_column = INTUSE(_IO_adjust_column) (fp-_cur_column - 1, data,
|count) + 1;
|_IO_setg (fp, fp-_IO_buf_base, fp-_IO_buf_base, fp-_IO_buf_base);
| -  fp-_IO_write_base = fp-_IO_write_ptr = fp-_IO_buf_base;
| -  fp-_IO_write_end = (fp-_mode = 0
| +  if (__builtin_expect (count == to_do, 1))
| +{
| +  fp-_IO_write_base = fp-_IO_write_ptr = fp-_IO_buf_base;
| +  fp-_IO_write_end = (fp-_mode = 0
|   (fp-_flags  (_IO_LINE_BUF+_IO_UNBUFFERED))
|  ? fp-_IO_buf_base : fp-_IO_buf_end);
| +}
| +  else if (count  0  fp-_IO_write_base == data)
| +{
| +  memmove (data, data+count, 
| +   fp-_IO_write_ptr - fp-_IO_write_base - count);
| +  fp-_IO_write_ptr -= count;
| +}
|return count;
|  }
|  
`

This patch also breaks at least bash and zsh as well as nfs-utils (see
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=429021) and I sort of
suspect it's not long for this world as a result.

So this is much less nasty than we feared (well, and more nasty if you
happen to be a Debian unstable or testing user).

-- 
`Some people don't think performance issues are real bugs, and I think 
such people shouldn't be allowed to program.' --- Linus Torvalds
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-21 Thread J. Bruce Fields
On Fri, Sep 21, 2007 at 07:46:32PM +0100, Nix wrote:
 On 18 Sep 2007, J. Bruce Fields told this:
  Also I suppose we should check which version of nfs-utils that fix is in
  and make sure distributions are getting the fixed nfs-utils before they
  get the new libc, or we're going to see this bug a lot
 
 Further info. This behaviour, although it is allowed by POSIX, is not
 *specified*, as such, by it (ah! ambiguity! life would be so dull
 without you!), and is not replicated by any other C library on the face
 of the planet.
 
 Further, it's not even the behaviour of glibc, as such.
 
 It's the behaviour of glibc 2.6.x as patched with this Debian-specific
 patch:

Good grief.  Well, I'd still count it as a bug in nfs-utils, then, but
nfs-util's share of the blame is much less than we'd previously
thought

 This patch also breaks at least bash and zsh as well as nfs-utils (see
 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=429021) and I sort of
 suspect it's not long for this world as a result.
 
 So this is much less nasty than we feared (well, and more nasty if you
 happen to be a Debian unstable or testing user).

OK!  Sounds like good news.

--b.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-18 Thread Nix
On 18 Sep 2007, J. Bruce Fields stated:

> On Tue, Sep 18, 2007 at 12:54:07AM +0100, Nix wrote:
>> The code which calls new_do_write() looks like this:
>> 
>> ,[ libio/fileops.c:_IO_new_file_xsputn() ]
>> |  if (do_write)
>> |{
>> |  count = new_do_write (f, s, do_write);
>> |  to_do -= count;
>> |  if (count < do_write)
>> |return n - to_do;
>> |}
>> `
>> 
>> This code handles partial writes followed by errors by returning a
>> suitable nonzero value, and immediate errors by returning -1.
>> 
>> In either case the buffer will have been filled as much as possible by
>> that point, and will still be filled when (vf)printf() is next called.
>
> OK, I'm a little lost at this point (what's n?  What's to_do?), but I'll
> take your word for it.

n is the total amount to write: to_do is the amount still unwritten. The
rest is buffered but written.

> I'd be kinda curious when exactly the behavior changed and why.

Same here. I'm sort of surprised that it *did* change. The last change
to anything in that function was in 2005, and the fragment shown is
ten years old.

I suspect some other change made it easier to see this pre-existing
behaviour. I wonder if something in glibc used to call __fpurge() for
you?

> Also I suppose we should check which version of nfs-utils that fix is in
> and make sure distributions are getting the fixed nfs-utils before they
> get the new libc, or we're going to see this bug a lot

And since it looks like a kernel bug idiots like me are going to keep on
bugging the l-k list with a non-kernel bug.

> Let me know if the problem's fixed.

Well, the machine's still running after more than six hours, where before
it would freeze solid in half an hour or less.

So I'd say it's fixed :)

-- 
`Some people don't think performance issues are "real bugs", and I think 
such people shouldn't be allowed to program.' --- Linus Torvalds
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-18 Thread Nix
On 18 Sep 2007, J. Bruce Fields stated:

 On Tue, Sep 18, 2007 at 12:54:07AM +0100, Nix wrote:
 The code which calls new_do_write() looks like this:
 
 ,[ libio/fileops.c:_IO_new_file_xsputn() ]
 |  if (do_write)
 |{
 |  count = new_do_write (f, s, do_write);
 |  to_do -= count;
 |  if (count  do_write)
 |return n - to_do;
 |}
 `
 
 This code handles partial writes followed by errors by returning a
 suitable nonzero value, and immediate errors by returning -1.
 
 In either case the buffer will have been filled as much as possible by
 that point, and will still be filled when (vf)printf() is next called.

 OK, I'm a little lost at this point (what's n?  What's to_do?), but I'll
 take your word for it.

n is the total amount to write: to_do is the amount still unwritten. The
rest is buffered but written.

 I'd be kinda curious when exactly the behavior changed and why.

Same here. I'm sort of surprised that it *did* change. The last change
to anything in that function was in 2005, and the fragment shown is
ten years old.

I suspect some other change made it easier to see this pre-existing
behaviour. I wonder if something in glibc used to call __fpurge() for
you?

 Also I suppose we should check which version of nfs-utils that fix is in
 and make sure distributions are getting the fixed nfs-utils before they
 get the new libc, or we're going to see this bug a lot

And since it looks like a kernel bug idiots like me are going to keep on
bugging the l-k list with a non-kernel bug.

 Let me know if the problem's fixed.

Well, the machine's still running after more than six hours, where before
it would freeze solid in half an hour or less.

So I'd say it's fixed :)

-- 
`Some people don't think performance issues are real bugs, and I think 
such people shouldn't be allowed to program.' --- Linus Torvalds
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-17 Thread J. Bruce Fields
On Tue, Sep 18, 2007 at 12:54:07AM +0100, Nix wrote:
> The code which calls new_do_write() looks like this:
> 
> ,[ libio/fileops.c:_IO_new_file_xsputn() ]
> |  if (do_write)
> |{
> |  count = new_do_write (f, s, do_write);
> |  to_do -= count;
> |  if (count < do_write)
> |return n - to_do;
> |}
> `
> 
> This code handles partial writes followed by errors by returning a
> suitable nonzero value, and immediate errors by returning -1.
> 
> In either case the buffer will have been filled as much as possible by
> that point, and will still be filled when (vf)printf() is next called.

OK, I'm a little lost at this point (what's n?  What's to_do?), but I'll
take your word for it.

I'd be kinda curious when exactly the behavior changed and why.

Also I suppose we should check which version of nfs-utils that fix is in
and make sure distributions are getting the fixed nfs-utils before they
get the new libc, or we're going to see this bug a lot

> This behaviour is, IIRC, mandated by the C Standard: I can find no
> reference in the Standard to streams being flushed on error, only
> on fclose(), fflush(), or program termination.

OK!

Let me know if the problem's fixed.

--b.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-17 Thread Nix
On 17 Sep 2007, J. Bruce Fields stated:
> On Mon, Sep 17, 2007 at 11:23:46PM +0100, Nix wrote:
>> A while later we start seeing runs of malloc failures, which I think
>> correlated with the unexplained pauses in NFS response:
>
> Actually, they're nothing to do with malloc failures--the message
> printed here is misleading, and isn't even an error; it gets printed
> whenever an upcall to mountd is made.

Indeed, with more debugging, all the failures I see come from the call
to exp_find(), which is digging out exports...

>The problem is almost certainly a
> problem with kernel<->mountd communication--the kernel depends on mountd
> to answer questions about exported filesystems as part of the fh_verify
> code.

Ah! I keep forgetting that mountd isn't just used at mount time: damned
misleading names, grumble.

Restarting mountd clears the problem up temporarily, so you are
definitely right.

> commit dd087896285da9e160e13ee9f7d75381b67895e3
> Author: J. Bruce Fields <[EMAIL PROTECTED]>
> Date:   Thu Jul 26 16:30:46 2007 -0400

Aha! I'm on 3b55934b9baefecee17aefc3ea139e261a4b03b8, over a month older.

> On a recent Debian/Sid machine, I saw libc retrying stdio writes that
> returned write errors.

Debian Sid recently upgraded to glibc 2.6.x, as did I... earlier
versions of glibc will have had this behaviour too, but it may have been
less frequent.

> I don't know whether this libc behavior is correct or expected, but it
> seems safest to add the __fpurge() (suggested by Neil) to ensure data is
> thrown away.

It is expected, judging from my reading of the
code. stdio-common/vfprintf.c emits single chars using the outchar()
macro, and strings using the outstring() macro, using functions in libio
to do the job. The string output routine then calls _IO_file_xsputn(),
which, tracing through libio's jump tables and symbol aliases, ends up
calling _IO_new_file_xsputn() in libio/fileops.c. (I've only just
started to understand libio. It's basically undocumented as far as I can
tell, but it's deeply nifty. Think of stdio, only made entirely out of
hookable components. :) )

(Actual writing then thunks down through _IO_new_do_write() and
new_do_write() in the same file, which finally calls __write().  If
there's any kind of error this returns EOF after some opaque messing
about with a _cur_column value, which is as far as I can tell never
used!)

The code which calls new_do_write() looks like this:

,[ libio/fileops.c:_IO_new_file_xsputn() ]
|  if (do_write)
|{
|  count = new_do_write (f, s, do_write);
|  to_do -= count;
|  if (count < do_write)
|return n - to_do;
|}
`

This code handles partial writes followed by errors by returning a
suitable nonzero value, and immediate errors by returning -1.

In either case the buffer will have been filled as much as possible by
that point, and will still be filled when (vf)printf() is next called.


This behaviour is, IIRC, mandated by the C Standard: I can find no
reference in the Standard to streams being flushed on error, only
on fclose(), fflush(), or program termination.


I'm upgrading now: thank you!

-- 
`Some people don't think performance issues are "real bugs", and I think 
such people shouldn't be allowed to program.' --- Linus Torvalds
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-17 Thread J. Bruce Fields
On Mon, Sep 17, 2007 at 11:23:46PM +0100, Nix wrote:
> Sep 17 22:57:55 loki warning: kernel: nfsd_dispatch: vers 3 proc 4
> Sep 17 22:57:55 loki warning: kernel: nfsd: ACCESS(3)   36: 01070001 000fb001 
>  d32ff38f 404811a6 a88d96ab 0x1f
> Sep 17 22:57:55 loki warning: kernel: nfsd: fh_verify(36: 01070001 000fb001 
>  d32ff38f 404811a6 a88d96ab)
> Sep 17 22:57:55 loki warning: kernel: nfsd: Dropping request due to malloc 
> failure!
> Sep 17 22:58:50 hades notice: kernel: nfs: server loki not responding, still 
> trying
> Sep 17 22:58:50 hades notice: kernel: nfs: server loki not responding, still 
> trying
> Sep 17 22:58:55 hades notice: kernel: nfs: server loki not responding, still 
> trying
> Sep 17 22:59:40 hades notice: kernel: nfs: server loki not responding, still 
> trying
> 
> 
> >From then on, *every* fh_verify() request fails the same way, and
> obviously if you can't verify any fds you can't do much with NFS.
> 
> Looking back in the log I see intermittent malloc failures starting
> almost as soon as I've booted (allowing a couple of minutes for me to
> turn debugging on):
> 
> Sep 17 22:25:50 hades notice: kernel: nfs: server loki OK
> [...]
> Sep 17 22:28:09 loki warning: kernel: nfsd_dispatch: vers 3 proc 19
> Sep 17 22:28:09 loki warning: kernel: nfsd: FSINFO(3)   28: 00070001 000fb001 
>  d32ff38f 404811a6 a88d96ab
> Sep 17 22:28:09 loki warning: kernel: nfsd: fh_verify(28: 00070001 000fb001 
>  d32ff38f 404811a6 a88d96ab)
> Sep 17 22:28:09 loki warning: kernel: nfsd: Dropping request due to malloc 
> failure!
> 
> A while later we start seeing runs of malloc failures, which I think
> correlated with the unexplained pauses in NFS response:

Actually, they're nothing to do with malloc failures--the message
printed here is misleading, and isn't even an error; it gets printed
whenever an upcall to mountd is made.  The problem is almost certainly a
problem with kernel<->mountd communication--the kernel depends on mountd
to answer questions about exported filesystems as part of the fh_verify
code.

It's just a shot in the dark, but you might try the latest nfs-utils
(get the latest out of git://linux-nfs.org/nfs-utils if you're already
on the most recent your distro will give you).  Or just apply the
following--which did fix a problem whose symptoms varied depending on
libc behavior.

If that doesn't work, I'd try

strace -s0 `pidof rpc.mountd`

and also look at the contents of /proc/net/rpc/nfsd.fh/contents.

--b.

commit dd087896285da9e160e13ee9f7d75381b67895e3
Author: J. Bruce Fields <[EMAIL PROTECTED]>
Date:   Thu Jul 26 16:30:46 2007 -0400

Use __fpurge to ensure single-line writes to cache files

On a recent Debian/Sid machine, I saw libc retrying stdio writes that
returned write errors.  The result is that if an export downcall returns
an error (which it can in normal operation, since it currently
(incorrectly) returns -ENOENT on any negative downcall), then subsequent
downcalls will write multiple lines (including the original line that
received the error).

The result is that the server fails to respond to any rpc call that
refers to an unexported mount point (such as a readdir of a directory
containing such a mountpoint), so client commands hang.

I don't know whether this libc behavior is correct or expected, but it
seems safest to add the __fpurge() (suggested by Neil) to ensure data is
thrown away.

Signed-off-by: "J. Bruce Fields" <[EMAIL PROTECTED]>
Signed-off-by: Neil Brown <[EMAIL PROTECTED]>

diff --git a/support/nfs/cacheio.c b/support/nfs/cacheio.c
index a76915b..9d271cd 100644
--- a/support/nfs/cacheio.c
+++ b/support/nfs/cacheio.c
@@ -17,6 +17,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -111,7 +112,18 @@ void qword_printint(FILE *f, int num)
 
 int qword_eol(FILE *f)
 {
+   int err;
+
fprintf(f,"\n");
+   err = fflush(f);
+   /*
+* We must send one line (and one line only) in a single write
+* call.  In case of a write error, libc may accumulate the
+* unwritten data and try to write it again later, resulting in a
+* multi-line write.  So we must explicitly ask it to throw away
+* any such cached data:
+*/
+   __fpurge(f);
return fflush(f);
 }
 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-17 Thread Nix
Back in early 2006 I reported persistent hangs on the NFS server,
whereby all of a sudden about ten minutes after boot my primary NFS
server would cease responding to NFS requests until it was rebooted.


That time, the problem vanished when I switched to NFS-over-TCP:


Well, I just rebooted --- post-glibc-upgrade from 2.5 to 2.6.1, no
kernel upgrade or anything, so this bug has been latent during at
least my last three weeks of uptime. And it's back. (I've been
seeing strange long pauses doing things like ls, and I suspect
they are related: see below.)

/proc/sys/nfsd/exports on the freezing server:

# Version 1.1
# Path Client(Flags) # IPs
/usr/packages.bin/non-free  
hades.wkstn.nix(rw,no_root_squash,async,wdelay,no_subtree_check,uuid=90a98d8a:a8be4806:aea3a4e1:fe3437a0)
/home/.loki.wkstn.nix   
esperi.srvr.nix(rw,root_squash,async,wdelay,no_subtree_check,uuid=8ff32fd3:a6114840:ab968da8:25b41721)
/home/.loki.wkstn.nix   
hades.wkstn.nix(rw,no_root_squash,async,wdelay,no_subtree_check,uuid=8ff32fd3:a6114840:ab968da8:25b41721)
/usr/lib/X11/fonts  
hades.wkstn.nix(ro,root_squash,async,wdelay,uuid=87553c5b:d84740fc:b7d9f7e8:4f749689)
/usr/share/xplanet  
hades.wkstn.nix(ro,root_squash,async,wdelay,uuid=87553c5b:d84740fc:b7d9f7e8:4f749689)
/usr/share/xemacs   
hades.wkstn.nix(rw,no_root_squash,async,wdelay,uuid=87553c5b:d84740fc:b7d9f7e8:4f749689)
/usr/packages   
hades.wkstn.nix(rw,no_root_squash,async,wdelay,no_subtree_check,uuid=2a35f82a:cca144df:a1123587:23527f53)

I turned on ALL-class nfsd debugging and here's what I see as it freezes
up:

Sep 17 22:57:00 loki warning: kernel: nfsd_dispatch: vers 3 proc 1
Sep 17 22:57:00 loki warning: kernel: nfsd: GETATTR(3)  36: 01070001 000fb001 
 d32ff38f 404811a6 a88d96ab
Sep 17 22:57:00 loki warning: kernel: nfsd: fh_verify(36: 01070001 000fb001 
 d32ff38f 404811a6 a88d96ab)
Sep 17 22:57:44 loki warning: kernel: nfsd_dispatch: vers 3 proc 4
Sep 17 22:57:44 loki warning: kernel: nfsd: ACCESS(3)   36: 01070001 000fb001 
 d32ff38f 404811a6 a88d96ab 0x1f
Sep 17 22:57:44 loki warning: kernel: nfsd: fh_verify(36: 01070001 000fb001 
 d32ff38f 404811a6 a88d96ab)
Sep 17 22:57:45 loki warning: kernel: nfsd: Dropping request due to malloc 
failure!
Sep 17 22:57:52 loki warning: kernel: nfsd_dispatch: vers 3 proc 4
Sep 17 22:57:52 loki warning: kernel: nfsd: ACCESS(3)   36: 01070001 000fb001 
 d32ff38f 404811a6 a88d96ab 0x1f
Sep 17 22:57:52 loki warning: kernel: nfsd: fh_verify(36: 01070001 000fb001 
 d32ff38f 404811a6 a88d96ab)
Sep 17 22:57:52 loki warning: kernel: nfsd: Dropping request due to malloc 
failure!
Sep 17 22:57:55 loki warning: kernel: nfsd_dispatch: vers 3 proc 4
Sep 17 22:57:55 loki warning: kernel: nfsd: ACCESS(3)   36: 01070001 000fb001 
 d32ff38f 404811a6 a88d96ab 0x1f
Sep 17 22:57:55 loki warning: kernel: nfsd: fh_verify(36: 01070001 000fb001 
 d32ff38f 404811a6 a88d96ab)
Sep 17 22:57:55 loki warning: kernel: nfsd: Dropping request due to malloc 
failure!
Sep 17 22:58:50 hades notice: kernel: nfs: server loki not responding, still 
trying
Sep 17 22:58:50 hades notice: kernel: nfs: server loki not responding, still 
trying
Sep 17 22:58:55 hades notice: kernel: nfs: server loki not responding, still 
trying
Sep 17 22:59:40 hades notice: kernel: nfs: server loki not responding, still 
trying


>From then on, *every* fh_verify() request fails the same way, and
obviously if you can't verify any fds you can't do much with NFS.

Looking back in the log I see intermittent malloc failures starting
almost as soon as I've booted (allowing a couple of minutes for me to
turn debugging on):

Sep 17 22:25:50 hades notice: kernel: nfs: server loki OK
[...]
Sep 17 22:28:09 loki warning: kernel: nfsd_dispatch: vers 3 proc 19
Sep 17 22:28:09 loki warning: kernel: nfsd: FSINFO(3)   28: 00070001 000fb001 
 d32ff38f 404811a6 a88d96ab
Sep 17 22:28:09 loki warning: kernel: nfsd: fh_verify(28: 00070001 000fb001 
 d32ff38f 404811a6 a88d96ab)
Sep 17 22:28:09 loki warning: kernel: nfsd: Dropping request due to malloc 
failure!

A while later we start seeing runs of malloc failures, which I think
correlated with the unexplained pauses in NFS response:

Sep 17 22:33:59 loki warning: kernel: nfsd_dispatch: vers 3 proc 6
Sep 17 22:33:59 loki warning: kernel: nfsd: READ(3) 44: 02070001 0001ce75 
 5b3c5587 fc4047d8 e8f7d9b7 20480 bytes at 4096
Sep 17 22:33:59 loki warning: kernel: nfsd: fh_verify(44: 02070001 0001ce75 
 5b3c5587 fc4047d8 e8f7d9b7)
Sep 17 22:33:59 loki warning: kernel: nfsd: Dropping request due to malloc 
failure!
Sep 17 22:33:59 loki warning: kernel: nfsd_dispatch: vers 3 proc 6
Sep 17 22:33:59 loki warning: kernel: nfsd: READ(3) 44: 02070001 0001ce75 
 5b3c5587 fc4047d8 e8f7d9b7 28672 bytes at 53248
Sep 17 22:33:59 loki 

[2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-17 Thread Nix
Back in early 2006 I reported persistent hangs on the NFS server,
whereby all of a sudden about ten minutes after boot my primary NFS
server would cease responding to NFS requests until it was rebooted.
http://www.ussg.iu.edu/hypermail/linux/kernel/0601.3/1631.html

That time, the problem vanished when I switched to NFS-over-TCP:
http://www.ussg.iu.edu/hypermail/linux/kernel/0601.3/1773.html

Well, I just rebooted --- post-glibc-upgrade from 2.5 to 2.6.1, no
kernel upgrade or anything, so this bug has been latent during at
least my last three weeks of uptime. And it's back. (I've been
seeing strange long pauses doing things like ls, and I suspect
they are related: see below.)

/proc/sys/nfsd/exports on the freezing server:

# Version 1.1
# Path Client(Flags) # IPs
/usr/packages.bin/non-free  
hades.wkstn.nix(rw,no_root_squash,async,wdelay,no_subtree_check,uuid=90a98d8a:a8be4806:aea3a4e1:fe3437a0)
/home/.loki.wkstn.nix   
esperi.srvr.nix(rw,root_squash,async,wdelay,no_subtree_check,uuid=8ff32fd3:a6114840:ab968da8:25b41721)
/home/.loki.wkstn.nix   
hades.wkstn.nix(rw,no_root_squash,async,wdelay,no_subtree_check,uuid=8ff32fd3:a6114840:ab968da8:25b41721)
/usr/lib/X11/fonts  
hades.wkstn.nix(ro,root_squash,async,wdelay,uuid=87553c5b:d84740fc:b7d9f7e8:4f749689)
/usr/share/xplanet  
hades.wkstn.nix(ro,root_squash,async,wdelay,uuid=87553c5b:d84740fc:b7d9f7e8:4f749689)
/usr/share/xemacs   
hades.wkstn.nix(rw,no_root_squash,async,wdelay,uuid=87553c5b:d84740fc:b7d9f7e8:4f749689)
/usr/packages   
hades.wkstn.nix(rw,no_root_squash,async,wdelay,no_subtree_check,uuid=2a35f82a:cca144df:a1123587:23527f53)

I turned on ALL-class nfsd debugging and here's what I see as it freezes
up:

Sep 17 22:57:00 loki warning: kernel: nfsd_dispatch: vers 3 proc 1
Sep 17 22:57:00 loki warning: kernel: nfsd: GETATTR(3)  36: 01070001 000fb001 
 d32ff38f 404811a6 a88d96ab
Sep 17 22:57:00 loki warning: kernel: nfsd: fh_verify(36: 01070001 000fb001 
 d32ff38f 404811a6 a88d96ab)
Sep 17 22:57:44 loki warning: kernel: nfsd_dispatch: vers 3 proc 4
Sep 17 22:57:44 loki warning: kernel: nfsd: ACCESS(3)   36: 01070001 000fb001 
 d32ff38f 404811a6 a88d96ab 0x1f
Sep 17 22:57:44 loki warning: kernel: nfsd: fh_verify(36: 01070001 000fb001 
 d32ff38f 404811a6 a88d96ab)
Sep 17 22:57:45 loki warning: kernel: nfsd: Dropping request due to malloc 
failure!
Sep 17 22:57:52 loki warning: kernel: nfsd_dispatch: vers 3 proc 4
Sep 17 22:57:52 loki warning: kernel: nfsd: ACCESS(3)   36: 01070001 000fb001 
 d32ff38f 404811a6 a88d96ab 0x1f
Sep 17 22:57:52 loki warning: kernel: nfsd: fh_verify(36: 01070001 000fb001 
 d32ff38f 404811a6 a88d96ab)
Sep 17 22:57:52 loki warning: kernel: nfsd: Dropping request due to malloc 
failure!
Sep 17 22:57:55 loki warning: kernel: nfsd_dispatch: vers 3 proc 4
Sep 17 22:57:55 loki warning: kernel: nfsd: ACCESS(3)   36: 01070001 000fb001 
 d32ff38f 404811a6 a88d96ab 0x1f
Sep 17 22:57:55 loki warning: kernel: nfsd: fh_verify(36: 01070001 000fb001 
 d32ff38f 404811a6 a88d96ab)
Sep 17 22:57:55 loki warning: kernel: nfsd: Dropping request due to malloc 
failure!
Sep 17 22:58:50 hades notice: kernel: nfs: server loki not responding, still 
trying
Sep 17 22:58:50 hades notice: kernel: nfs: server loki not responding, still 
trying
Sep 17 22:58:55 hades notice: kernel: nfs: server loki not responding, still 
trying
Sep 17 22:59:40 hades notice: kernel: nfs: server loki not responding, still 
trying


From then on, *every* fh_verify() request fails the same way, and
obviously if you can't verify any fds you can't do much with NFS.

Looking back in the log I see intermittent malloc failures starting
almost as soon as I've booted (allowing a couple of minutes for me to
turn debugging on):

Sep 17 22:25:50 hades notice: kernel: nfs: server loki OK
[...]
Sep 17 22:28:09 loki warning: kernel: nfsd_dispatch: vers 3 proc 19
Sep 17 22:28:09 loki warning: kernel: nfsd: FSINFO(3)   28: 00070001 000fb001 
 d32ff38f 404811a6 a88d96ab
Sep 17 22:28:09 loki warning: kernel: nfsd: fh_verify(28: 00070001 000fb001 
 d32ff38f 404811a6 a88d96ab)
Sep 17 22:28:09 loki warning: kernel: nfsd: Dropping request due to malloc 
failure!

A while later we start seeing runs of malloc failures, which I think
correlated with the unexplained pauses in NFS response:

Sep 17 22:33:59 loki warning: kernel: nfsd_dispatch: vers 3 proc 6
Sep 17 22:33:59 loki warning: kernel: nfsd: READ(3) 44: 02070001 0001ce75 
 5b3c5587 fc4047d8 e8f7d9b7 20480 bytes at 4096
Sep 17 22:33:59 loki warning: kernel: nfsd: fh_verify(44: 02070001 0001ce75 
 5b3c5587 fc4047d8 e8f7d9b7)
Sep 17 22:33:59 loki warning: kernel: nfsd: Dropping request due to malloc 
failure!
Sep 17 22:33:59 loki warning: kernel: nfsd_dispatch: vers 3 proc 6
Sep 17 22:33:59 loki warning: kernel: nfsd: READ(3) 44: 02070001 0001ce75 
 5b3c5587 fc4047d8 e8f7d9b7 28672 bytes at 53248
Sep 17 22:33:59 loki warning: 

Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-17 Thread J. Bruce Fields
On Mon, Sep 17, 2007 at 11:23:46PM +0100, Nix wrote:
 Sep 17 22:57:55 loki warning: kernel: nfsd_dispatch: vers 3 proc 4
 Sep 17 22:57:55 loki warning: kernel: nfsd: ACCESS(3)   36: 01070001 000fb001 
  d32ff38f 404811a6 a88d96ab 0x1f
 Sep 17 22:57:55 loki warning: kernel: nfsd: fh_verify(36: 01070001 000fb001 
  d32ff38f 404811a6 a88d96ab)
 Sep 17 22:57:55 loki warning: kernel: nfsd: Dropping request due to malloc 
 failure!
 Sep 17 22:58:50 hades notice: kernel: nfs: server loki not responding, still 
 trying
 Sep 17 22:58:50 hades notice: kernel: nfs: server loki not responding, still 
 trying
 Sep 17 22:58:55 hades notice: kernel: nfs: server loki not responding, still 
 trying
 Sep 17 22:59:40 hades notice: kernel: nfs: server loki not responding, still 
 trying
 
 
 From then on, *every* fh_verify() request fails the same way, and
 obviously if you can't verify any fds you can't do much with NFS.
 
 Looking back in the log I see intermittent malloc failures starting
 almost as soon as I've booted (allowing a couple of minutes for me to
 turn debugging on):
 
 Sep 17 22:25:50 hades notice: kernel: nfs: server loki OK
 [...]
 Sep 17 22:28:09 loki warning: kernel: nfsd_dispatch: vers 3 proc 19
 Sep 17 22:28:09 loki warning: kernel: nfsd: FSINFO(3)   28: 00070001 000fb001 
  d32ff38f 404811a6 a88d96ab
 Sep 17 22:28:09 loki warning: kernel: nfsd: fh_verify(28: 00070001 000fb001 
  d32ff38f 404811a6 a88d96ab)
 Sep 17 22:28:09 loki warning: kernel: nfsd: Dropping request due to malloc 
 failure!
 
 A while later we start seeing runs of malloc failures, which I think
 correlated with the unexplained pauses in NFS response:

Actually, they're nothing to do with malloc failures--the message
printed here is misleading, and isn't even an error; it gets printed
whenever an upcall to mountd is made.  The problem is almost certainly a
problem with kernel-mountd communication--the kernel depends on mountd
to answer questions about exported filesystems as part of the fh_verify
code.

It's just a shot in the dark, but you might try the latest nfs-utils
(get the latest out of git://linux-nfs.org/nfs-utils if you're already
on the most recent your distro will give you).  Or just apply the
following--which did fix a problem whose symptoms varied depending on
libc behavior.

If that doesn't work, I'd try

strace -s0 `pidof rpc.mountd`

and also look at the contents of /proc/net/rpc/nfsd.fh/contents.

--b.

commit dd087896285da9e160e13ee9f7d75381b67895e3
Author: J. Bruce Fields [EMAIL PROTECTED]
Date:   Thu Jul 26 16:30:46 2007 -0400

Use __fpurge to ensure single-line writes to cache files

On a recent Debian/Sid machine, I saw libc retrying stdio writes that
returned write errors.  The result is that if an export downcall returns
an error (which it can in normal operation, since it currently
(incorrectly) returns -ENOENT on any negative downcall), then subsequent
downcalls will write multiple lines (including the original line that
received the error).

The result is that the server fails to respond to any rpc call that
refers to an unexported mount point (such as a readdir of a directory
containing such a mountpoint), so client commands hang.

I don't know whether this libc behavior is correct or expected, but it
seems safest to add the __fpurge() (suggested by Neil) to ensure data is
thrown away.

Signed-off-by: J. Bruce Fields [EMAIL PROTECTED]
Signed-off-by: Neil Brown [EMAIL PROTECTED]

diff --git a/support/nfs/cacheio.c b/support/nfs/cacheio.c
index a76915b..9d271cd 100644
--- a/support/nfs/cacheio.c
+++ b/support/nfs/cacheio.c
@@ -17,6 +17,7 @@
 
 #include nfslib.h
 #include stdio.h
+#include stdio_ext.h
 #include ctype.h
 #include unistd.h
 #include sys/types.h
@@ -111,7 +112,18 @@ void qword_printint(FILE *f, int num)
 
 int qword_eol(FILE *f)
 {
+   int err;
+
fprintf(f,\n);
+   err = fflush(f);
+   /*
+* We must send one line (and one line only) in a single write
+* call.  In case of a write error, libc may accumulate the
+* unwritten data and try to write it again later, resulting in a
+* multi-line write.  So we must explicitly ask it to throw away
+* any such cached data:
+*/
+   __fpurge(f);
return fflush(f);
 }
 
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-17 Thread Nix
On 17 Sep 2007, J. Bruce Fields stated:
 On Mon, Sep 17, 2007 at 11:23:46PM +0100, Nix wrote:
 A while later we start seeing runs of malloc failures, which I think
 correlated with the unexplained pauses in NFS response:

 Actually, they're nothing to do with malloc failures--the message
 printed here is misleading, and isn't even an error; it gets printed
 whenever an upcall to mountd is made.

Indeed, with more debugging, all the failures I see come from the call
to exp_find(), which is digging out exports...

The problem is almost certainly a
 problem with kernel-mountd communication--the kernel depends on mountd
 to answer questions about exported filesystems as part of the fh_verify
 code.

Ah! I keep forgetting that mountd isn't just used at mount time: damned
misleading names, grumble.

Restarting mountd clears the problem up temporarily, so you are
definitely right.

 commit dd087896285da9e160e13ee9f7d75381b67895e3
 Author: J. Bruce Fields [EMAIL PROTECTED]
 Date:   Thu Jul 26 16:30:46 2007 -0400

Aha! I'm on 3b55934b9baefecee17aefc3ea139e261a4b03b8, over a month older.

 On a recent Debian/Sid machine, I saw libc retrying stdio writes that
 returned write errors.

Debian Sid recently upgraded to glibc 2.6.x, as did I... earlier
versions of glibc will have had this behaviour too, but it may have been
less frequent.

 I don't know whether this libc behavior is correct or expected, but it
 seems safest to add the __fpurge() (suggested by Neil) to ensure data is
 thrown away.

It is expected, judging from my reading of the
code. stdio-common/vfprintf.c emits single chars using the outchar()
macro, and strings using the outstring() macro, using functions in libio
to do the job. The string output routine then calls _IO_file_xsputn(),
which, tracing through libio's jump tables and symbol aliases, ends up
calling _IO_new_file_xsputn() in libio/fileops.c. (I've only just
started to understand libio. It's basically undocumented as far as I can
tell, but it's deeply nifty. Think of stdio, only made entirely out of
hookable components. :) )

(Actual writing then thunks down through _IO_new_do_write() and
new_do_write() in the same file, which finally calls __write().  If
there's any kind of error this returns EOF after some opaque messing
about with a _cur_column value, which is as far as I can tell never
used!)

The code which calls new_do_write() looks like this:

,[ libio/fileops.c:_IO_new_file_xsputn() ]
|  if (do_write)
|{
|  count = new_do_write (f, s, do_write);
|  to_do -= count;
|  if (count  do_write)
|return n - to_do;
|}
`

This code handles partial writes followed by errors by returning a
suitable nonzero value, and immediate errors by returning -1.

In either case the buffer will have been filled as much as possible by
that point, and will still be filled when (vf)printf() is next called.


This behaviour is, IIRC, mandated by the C Standard: I can find no
reference in the Standard to streams being flushed on error, only
on fclose(), fflush(), or program termination.


I'm upgrading now: thank you!

-- 
`Some people don't think performance issues are real bugs, and I think 
such people shouldn't be allowed to program.' --- Linus Torvalds
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22.6] nfsd: fh_verify() `malloc failure' with lots of free memory leads to NFS hang

2007-09-17 Thread J. Bruce Fields
On Tue, Sep 18, 2007 at 12:54:07AM +0100, Nix wrote:
 The code which calls new_do_write() looks like this:
 
 ,[ libio/fileops.c:_IO_new_file_xsputn() ]
 |  if (do_write)
 |{
 |  count = new_do_write (f, s, do_write);
 |  to_do -= count;
 |  if (count  do_write)
 |return n - to_do;
 |}
 `
 
 This code handles partial writes followed by errors by returning a
 suitable nonzero value, and immediate errors by returning -1.
 
 In either case the buffer will have been filled as much as possible by
 that point, and will still be filled when (vf)printf() is next called.

OK, I'm a little lost at this point (what's n?  What's to_do?), but I'll
take your word for it.

I'd be kinda curious when exactly the behavior changed and why.

Also I suppose we should check which version of nfs-utils that fix is in
and make sure distributions are getting the fixed nfs-utils before they
get the new libc, or we're going to see this bug a lot

 This behaviour is, IIRC, mandated by the C Standard: I can find no
 reference in the Standard to streams being flushed on error, only
 on fclose(), fflush(), or program termination.

OK!

Let me know if the problem's fixed.

--b.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/