On 28/07/2013 6:19 p.m., Peter Retief wrote:
Peter:
Do you mean you've patched the source code, and if so, how do I get
that patch?  Do I have to move from the stable trunk?
Amos:
Sorry yes that is what I meant and it can now be found here:

http://www.squid-cache.org/Versions/v3/3.HEAD/changesets/squid-3-12957.patch
It should apply on the stable 3.3 easily, although I have not tested that.
NP: if you rebuild please go with the 3.3.8 security update release.
Peter:
I have patched the file as documented, and recompiled with the 3.3.8 branch.

Peter:
The first log occurrences are:
2013/07/23 08:26:13 kid2| Attempt to open socket for EUI retrieval failed: (24) Too many open files
2013/07/23 08:26:13 kid2| comm_open: socket failure: (24) Too many open files
2013/07/23 08:26:13 kid2| Reserved FD adjusted from 100 to 15394 due to failures
Amos:
So this worker #2 got errors after reaching about 990 open FD (16K -
15394). Ouch.

Note that all these socket opening operations are failing with the "Too
many open files" error the OS sends back when limiting Squid to 990 or so
FD. This confirms that Squid is not mis-calculating where its limit is,
but that something in the OS is actually limiting the worker. The first
operation to hit the limit was a socket, but disk file accesses get the
same error soon after, so it is likely the global OS limit rather than a
limit on a particular FD type. That 990 usable FD is also suspiciously
close to 1024 with a few % held spare for emergency use (as Squid does
when calculating its reservation value).
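One way to see what per-process open-file limit the OS is actually handing out is to query it directly. The snippet below is only a sketch using Python's standard `resource` module; it reports the limits of the Python process itself, not of a running Squid worker, so it is only indicative when run as the same user and from the same environment that starts Squid. On Linux, the limits of an already running process can be read from /proc/<pid>/limits instead. The helper name `nofile_limits` is made up for this example.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = nofile_limits()
# A soft limit of 1024 here would match the value suspected in this thread.
print(f"soft limit: {soft}, hard limit: {hard}")
```

If the soft limit printed is 1024 while Squid is configured for 16K, the limit is being applied outside Squid, which is what the log messages suggest.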

Peter:
Amos, I don't understand how you deduced the 990 open FD from the error
messages above ("adjusted from 100 to 15394")?

Amos:
Squid starts with 16K FD, of which 100 are reserved. When it changes that, the 16K limit is still the total, but the reserved count is raised to make N sockets reserved/unavailable.

So 16384 - 15394 = 990 FD safe to use after adjustments caused by the error.
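The same subtraction can be done straight from the log line. This is a minimal sketch, assuming the configured total is 16384 (the "16K" above) and that the message has exactly the wording quoted in this thread; the function name `usable_fd` and the regular expression are illustrative only and not part of Squid.

```python
import re

TOTAL_FD = 16 * 1024  # 16384, the configured maximum discussed above

LOG_LINE = ("2013/07/23 08:26:13 kid2| Reserved FD adjusted "
            "from 100 to 15394 due to failures")


def usable_fd(log_line, total_fd=TOTAL_FD):
    """Return total FD minus the new reserved count reported in the log line."""
    match = re.search(r"Reserved FD adjusted from (\d+) to (\d+)", log_line)
    if match is None:
        raise ValueError("not a 'Reserved FD adjusted' log line")
    new_reserved = int(match.group(2))
    return total_fd - new_reserved


print(usable_fd(LOG_LINE))  # 990
```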


Peter:
I would have deduced that there was some internal limit of 100 (not 1000)
FDs, and that Squid was re-adjusting to the maximum currently allowed (16K)?

Amos:
Yes, that is correct. However, it is the "reserved" limit being raised.

Reserved is the number of FD which are configured as available but determined to be unusable. For example, this can be thought of as the cordon on a danger zone for FD: if Squid strays into using that number of sockets again it can expect errors. Raising that count reduces Squid's operational FD resources by the amount raised. Squid may still try to use some of them under peak load conditions, but will do so only if there is no other way to free up the safe in-use FD.

Due to that case for emergency usage, when Squid sets the reserved limit it does not set it exactly on the FD number which got the error. It sets it 2-3% into the "safe" FD count. So rounding 990 up by that slight amount we get 1024, which is a highly suspicious value.
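As a rough sanity check on that rounding, one can ask what share of a candidate OS limit would have to be held spare to leave 990 usable FD. This is only back-of-envelope arithmetic, not Squid's actual reservation formula; the function name `spare_fraction` and the list of candidate limits are assumptions made for illustration.

```python
def spare_fraction(os_limit, usable):
    """Share of os_limit that is not usable, as a fraction between 0 and 1."""
    return (os_limit - usable) / os_limit


# Compare a few power-of-two limits against the 990 usable FD seen above.
for limit in (1024, 2048, 4096):
    print(limit, f"{spare_fraction(limit, 990):.1%}")
```

Of these candidates, only 1024 leaves a spare share of a few percent (roughly 3%), which is why it stands out as the likely OS-imposed limit.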

Amos
