Bug#898137: nfsd: increase DRC cache limit

2018-05-27, from Salvatore Bonaccorso
Control: tags -1 + pending

Hi Sergio,

On Mon, May 07, 2018 at 09:40:08PM +0200, Sergio Gelato wrote:
> Source: linux
> Version: 4.9.88-1
> Severity: wishlist
> Tags: patch
> 
> I've run into this capacity limitation in stretch, which is addressed
> upstream in Linux 4.15 by the following commit:
> 
> commit 44d8660d3bb0a1c8363ebcb906af2343ea8e15f6
> Author: J. Bruce Fields 
> Date:   Tue Sep 19 20:51:31 2017 -0400
> 
> nfsd: increase DRC cache limit
> 
> which trivially applies to Linux 4.9 (I haven't checked 3.16) and provides
> significant relief in my use case. It would save me (and perhaps others)
> work if this change could be included in Debian's 4.9 kernel packages;
> otherwise I'll have to keep maintaining my own fork. (4.15 has other
> issues, so I don't want to use it in production yet.)
> 
> For the benefit of others who may be running into the same problem, here
> is a more detailed description.
> 
> Symptom: an NFS server accepts only a limited number of concurrent v4.1+
> mounts. Once that limit is reached, new clients get NFS4ERR_DELAY (10008)
> replies to CREATE_SESSION. (This can be seen in the server's dmesg after
> running rpcdebug -m nfsd -s proc.) Increasing the number of nfsd threads has no
> impact on the number of mounts allowed. A server with 512MB of RAM
> accepts only 7 or 8 concurrent NFSv4.1+ mounts. From the perspective of
> an affected client, mount.nfs appears to hang (triggering a kernel backtrace
> after 120 seconds); in reality, though, it just keeps reissuing CREATE_SESSION
> calls until one of them succeeds.
> 
> Pre-v4.1 clients are unaffected by this since sessions are new to NFS v4.1.
> 
> The proposed patch just increases the limit by an order of magnitude, at
> the cost of using more kernel memory. As noted in comments in the source
> code, it would be nice to make this tuneable by the server administrator.
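
For reference, the change itself is a one-line constant bump in
fs/nfsd/nfssvc.c. The excerpt below is reconstructed from memory rather
than copied from a tree, so treat the details as approximate and check
the source you actually build:

    /* fs/nfsd/nfssvc.c (approximate excerpt). The server caps total DRC
     * memory at a fixed fraction of free buffer pages; commit 44d8660d3bb0
     * raises that fraction eightfold by lowering the shift from 10 to 7. */
    static void set_max_drc(void)
    {
        #define NFSD_DRC_SIZE_SHIFT 7   /* 10 before the patch */
        nfsd_drc_max_mem = (nr_free_buffer_pages()
                                >> NFSD_DRC_SIZE_SHIFT) * PAGE_SIZE;
        nfsd_drc_mem_used = 0;
        spin_lock_init(&nfsd_drc_lock);
    }

An eightfold (2^3) increase is roughly the "order of magnitude" the
report describes.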

Alright, I have added 44d8660d3bb0a1c8363ebcb906af2343ea8e15f6 to the
stretch branch, so it will be included in the next point release
update of the src:linux package.

Regards,
Salvatore
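
The numbers in the report are consistent with the kernel's per-session
accounting: each NFSv4.1 session reserves up to
NFSD_CACHE_SIZE_SLOTS_PER_SESSION * NFSD_SLOT_CACHE_SIZE bytes
(32 * 2048 = 64 KiB in mainline, cited from memory) out of the DRC
budget. The standalone userspace C sketch below reproduces the
arithmetic; the NFSD_* constants and the free-pages figure (~448 MiB of
the 512 MiB machine assumed to count as free buffer pages) are
assumptions, not measurements:

    #include <stdio.h>

    #define PAGE_SIZE 4096UL
    #define NFSD_SLOT_CACHE_SIZE 2048UL  /* bytes reserved per cached slot */
    #define NFSD_CACHE_SIZE_SLOTS_PER_SESSION 32UL
    #define NFSD_MAX_MEM_PER_SESSION \
            (NFSD_CACHE_SIZE_SLOTS_PER_SESSION * NFSD_SLOT_CACHE_SIZE)

    /* kernel formula: (nr_free_buffer_pages() >> shift) * PAGE_SIZE */
    static unsigned long drc_cap(unsigned long free_pages, unsigned int shift)
    {
            return (free_pages >> shift) * PAGE_SIZE;
    }

    int main(void)
    {
            /* assumed free buffer pages on a 512 MiB box */
            unsigned long free_pages = 448UL * 1024 * 1024 / PAGE_SIZE;
            unsigned int shifts[] = { 10, 7 };  /* before / after the patch */

            for (int i = 0; i < 2; i++) {
                    unsigned long cap = drc_cap(free_pages, shifts[i]);
                    printf("shift %2u: DRC cap %5lu KiB -> ~%lu sessions\n",
                           shifts[i], cap / 1024,
                           cap / NFSD_MAX_MEM_PER_SESSION);
            }
            return 0;
    }

With these assumptions it prints about 7 sessions for the old shift of
10, matching the observed 7 or 8 mounts, and about 56 for the patched
shift of 7.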



Processed: Re: Bug#898137: nfsd: increase DRC cache limit

2018-05-27, from the Debian Bug Tracking System
Processing control commands:

> tags -1 + pending
Bug #898137 [src:linux] nfsd: increase DRC cache limit
Added tag(s) pending.

-- 
898137: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=898137
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems


