On 10/11/17 12:11, Bike dernikov1 wrote:
On Thu, Nov 9, 2017 at 5:13 PM, Marcus Kool <marcus.k...@urlfilterdb.com> wrote:


On 09/11/17 11:04, Bike dernikov1 wrote:
[snip]

Memory consumption: Squid uses the largest part of memory (12GB now; the
second-largest process uses 300MB), with 14GB used by all processes. So Squid
uses over 80% of the total used memory.
So no, there are no other problematic processes. But we changed the swappiness
setting.


Did you monitor Squid for growth (it can start with 12 GB and grow
slowly) ?


Yes, we are monitoring continuously.
Now:
Output from free -m.

              total        used        free      shared  buff/cache   available
Mem:          24101       20507         256         146        3337        3034
Swap:         24561        5040       19521

vm.swappiness=40

Memory by process:
Process   VIRT    RES     SHR    MEM%
squid     22,9G   18.7G   8164   79,6


Hmm. Squid grew from 12 GB to 18.7 GB (23 GB virtual).
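
One way to track this kind of growth is to sample the resident set size of the Squid process at intervals. The following is only a minimal sketch: it assumes a Linux /proc filesystem, and the function names `rss_kib` and `growth_gib` are hypothetical, not part of Squid or any tool mentioned in this thread.

```python
import re
from typing import List, Optional


def rss_kib(status_text: str) -> Optional[int]:
    """Return the VmRSS value in KiB from the text of /proc/<pid>/status.

    Returns None when the text has no VmRSS line (for example, a kernel thread).
    """
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def growth_gib(samples_kib: List[int]) -> float:
    """Difference between the last and the first RSS sample, in GiB."""
    return (samples_kib[-1] - samples_kib[0]) / (1024 * 1024)


# Hypothetical usage on a live system, where squid_pid is the PID of Squid:
#   with open(f"/proc/{squid_pid}/status") as fh:
#       print(rss_kib(fh.read()))
```

Collecting one sample every few minutes and passing the list to `growth_gib` would show whether the process keeps growing between restarts.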

Today the problem appeared again after logrotate at 2:56 AM.
Used memory peaked at 23,7GB.

OK, it is clear that Squid grows too much.
On a 24GB system with many helpers and a URL filter I think the maximum size 
should be 14GB.

Before logrotate started, cache was at 2GB and buffer at 1,5GB.
After logrotate started, cache jumped to 3.7GB and buffer stayed unchanged at 1,5GB.

Fork errors stopped after 1 minute, at 2:57.
Cache memory dropped by 500MB to 3.2GB and stayed at that level
until morning; buffer stayed the same at 1.5GB.

After 4 minutes, at 3:00, a new WARNING appeared: external ACL queue
overload. Using stale results.

We have a night shift, and they told us that the Internet worked OK.

After the restart at around 7:00 AM, used memory dropped from 22 GB to 7GB;
cache and buffer remained at the same levels.

How come Squid uses 7 GB at startup when there is no disk cache ?

With vm.swappiness=40 Linux starts to page out parts of processes when they
occupy more than 60% of the memory.
This is a potential bottleneck and I would have also decreased vm.swappiness
to 10 as you did.

My guess is that Squid starts too many helpers in a short time frame and
that because of paging there are too many forks in progress simultaneously
which causes the memory exhaustion.

We are now testing with 100 helpers for negotiate_kerberos_auth.
vm.swappiness was set back to 60.

I suggest reducing the memory cache of Squid by 50% and setting vm.swappiness
to 20.

Squid cache memory is now set to 14GB, reduced in two steps from 20GB to 16GB
and then to 14GB.

Are you saying that you have
   cache_mem 14G
If yes, you should read the memory FAQ and reduce this.
'cache_mem 14G' explains that Squid starts 'small' and grows over time.

And then observe:
- total memory use
- total swap usage (should be lower than the 5 GB that you have now)
- number of helper processes that are started in short time frames
And then in small steps increase the memory cache and maybe further reduce
vm.swappiness to 10.
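
The first two items in the list above can be read from `free -m`. As a rough sketch only, the output could be parsed like this; the function names are hypothetical and the sample is the output quoted earlier in this thread.

```python
from typing import Dict

# Sample taken from the `free -m` output quoted earlier in this thread.
SAMPLE = """\
              total        used        free      shared  buff/cache   available
Mem:          24101       20507         256         146        3337        3034
Swap:         24561        5040       19521
"""


def parse_free_m(output: str) -> Dict[str, Dict[str, int]]:
    """Parse `free -m` output into {"Mem": {...}, "Swap": {...}} (values in MiB)."""
    lines = [line for line in output.splitlines() if line.strip()]
    headers = lines[0].split()
    stats: Dict[str, Dict[str, int]] = {}
    for line in lines[1:]:
        label, *values = line.split()
        # The Swap row has fewer columns; zip() stops at the shorter sequence.
        stats[label.rstrip(":")] = dict(zip(headers, (int(v) for v in values)))
    return stats


def used_percent(stats: Dict[str, Dict[str, int]], row: str) -> float:
    """Used memory as a percentage of the total for the given row."""
    return 100.0 * stats[row]["used"] / stats[row]["total"]


stats = parse_free_m(SAMPLE)
print(stats["Mem"]["used"], stats["Swap"]["used"])
print(round(used_percent(stats, "Swap"), 1))
```

On a live system the same function could be fed the output of `subprocess.run(["free", "-m"], capture_output=True, text=True).stdout`.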

If we survive with the current setup, we will continue reducing as you suggest.
The last resort will be disabling swap (swapoff), but just as a test with 6
eyes on monitoring :)

squidguard: two processes, 300MB each.

CPU 0.33 0.37 0.43

Squid cannot fork and higher swappiness increases the amount of memory that
the OS can use to copy processes.
It makes me think that you have the memory overcommit set to 2 (no
overcommit).
What is the output of the following command ?
     sysctl  -a | grep overcommit


Command output:

vm.nr_overcommit_hugepages = 0
vm.overcommit_kbytes = 0
vm.overcommit_memory = 0
vm.overcommit_ratio = 50

cat /proc/sys/vm/overcommit_memory
0


The overcommit settings look fine.
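
For context on why mode 2 was suspected: with vm.overcommit_memory = 2 the kernel enforces a hard commit limit, while mode 0 (the value reported above) uses a heuristic instead. The sketch below computes that limit using the formula from the kernel's overcommit accounting documentation. It assumes vm.overcommit_kbytes is 0, as in the output above, and ignores the kernel's admin and user reserves; the function name is hypothetical.

```python
def commit_limit_mb(ram_mb: float, swap_mb: float,
                    overcommit_ratio: int, hugetlb_mb: float = 0.0) -> float:
    """Approximate CommitLimit (in MB) that applies when vm.overcommit_memory = 2.

    CommitLimit = swap + (RAM - huge pages) * overcommit_ratio / 100
    This limit is NOT enforced in mode 0, which is what this system uses.
    """
    return swap_mb + (ram_mb - hugetlb_mb) * overcommit_ratio / 100


# Values from this thread: 24101 MB RAM, 24561 MB swap, vm.overcommit_ratio = 50.
print(commit_limit_mb(24101, 24561, 50))
```

On a running system the actual values are visible as CommitLimit and Committed_AS in /proc/meminfo.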

At least something right :)


Advice for some settings:
We have an absolute max peak of 2500 users who use Squid (of 2800);
what are the recommended settings for:
negotiate_kerberos_children start/idle
squidguard helpers.



I have little experience with kerberos, but most likely this is not the
issue.
When Squid cannot fork the helpers, helper settings do not matter much.


For 2500 users you probably need 32-64 squidguard helpers.


Can you confirm: For 2500 users:

url_rewrite children X (squidguard): will 32-64 be OK? We have set a
much larger number.

Squidguard url_rewrite children was set to 64.

Did I understand correctly that earlier in this reply you said that there
are two squidguard processes (300 MB each)?

Yes (the first two processes in htop, two rewrite children); the others were at 0.0%.

ufdbGuard is faster than squidGuard and has multithreaded helpers.
ufdbGuard needs fewer helpers than squidGuard.
If you have a much larger number than 64 url rewrite helpers, then I suggest
switching to ufdbGuard as soon as possible since the memory usage is then at
least 600% less.

ufdbGuard has a few strong features: development, Kerberos,
concurrency/multithreading.
As I wrote, if we read documentation slower we wouldn't
Does ufdbGuard support LDAP secure auth? We tried LDAP secure with
squidGuard without success.

ufdbGuard supports any user database with the "execuserlist" feature.
See the Reference Manual for details.

For the helper:
negotiate_kerberos_auth

auth_param negotiate children X startup=Y idle=Z. Which X, Y, Z are
best for our number of users?

We disabled the Kerberos replay cache because of disk performance (4 SAS
disks, 15K, RAID 10): iowait jumped high, and CPU load jumped to min
40, max 200.
We don't use disk caching.

Thanks for help,

Marcus


Thanks for help,

On Wed, Nov 8, 2017 at 10:53 AM, Marcus Kool
<marcus.k...@urlfilterdb.com> wrote:


There is definitely a problem with available memory because Squid cannot
fork.
So start with looking at how much memory Squid and its helpers use.
Do you have other processes on this system that consume a lot of memory?

Also note that ufdbGuard uses less memory than squidGuard.
If there are 30 helpers, squidGuard uses 300% more memory than ufdbGuard.

Look at the wiki for more information about memory usage:
https://wiki.squid-cache.org/SquidFaq/SquidMemory   (currently has an
expired certificate but it is safe to go ahead)

Marcus



On 08/11/17 07:26, Bike dernikov1 wrote:



Hi, I hope that someone can explain what happened, why squid stopped
working.
The problem is related to  memory/swap handling.

After we changed vm.swappiness parameter from 60 to 10 (tuning
attempt, to lower a disk usage, because we have only 4 disks in a
RAID10, so disk subsystem  is a weak link), we got a lot of errors in
cache.log.
The problems started after scheduled logrotate after  2AM.
Squid ran out of memory, auth helpers stopped working.
It's weird because we didn't disable swap, but behavior is like we
did.
After an error, we increased parameter from 10 to 40.

The server has 24GB DDR3 memory,  disk swap set to 24GB, 12 CPU (24HT
cores).
We have 2800 users, using  kerberos authentication, squidguard for
filtering, ldap authorization.
When the problem appeared, 3GB of memory was still free (free column) and RAM
(caching) was filled to 15GB, so 21GB of RAM was filled and 3GB was free.

Thanks for help,


errors from cache.log.

2017/11/08 02:55:27| Set Current Directory to /var/log/squid/
2017/11/08 02:55:27 kid1| storeDirWriteCleanLogs: Starting...
2017/11/08 02:55:27 kid1|   Finished.  Wrote 0 entries.
2017/11/08 02:55:27 kid1|   Took 0.00 seconds (  0.00 entries/sec).
2017/11/08 02:55:27 kid1| logfileRotate: daemon:/var/log/squid/access.log
2017/11/08 02:55:27 kid1| logfileRotate: daemon:/var/log/squid/access.log
2017/11/08 02:55:28 kid1| Pinger socket opened on FD 30
2017/11/08 02:55:28 kid1| helperOpenServers: Starting 1/1000 'squidGuard' processes
2017/11/08 02:55:28 kid1| ipcCreate: fork: (12) Cannot allocate memory
2017/11/08 02:55:28 kid1| WARNING: Cannot run '/usr/bin/squidGuard' process.
2017/11/08 02:55:28 kid1| helperOpenServers: Starting 300/3000 'negotiate_kerberos_auth' processes
2017/11/08 02:55:28 kid1| ipcCreate: fork: (12) Cannot allocate memory
2017/11/08 02:55:28 kid1| WARNING: Cannot run '/usr/lib/squid/negotiate_kerberos_auth' process.
2017/11/08 02:55:28 kid1| ipcCreate: fork: (12) Cannot allocate memory
2017/11/08 02:55:28 kid1| WARNING: Cannot run '/usr/lib/squid/negotiate_kerberos_auth' process.
2017/11/08 02:55:28 kid1| ipcCreate: fork: (12) Cannot allocate memory
2017/11/08 02:55:28 kid1| WARNING: Cannot run '/usr/lib/squid/negotiate_kerberos_auth' process.

external ACL 'memberof' queue overload. Using stale result.
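
To see how many helper starts and fork failures fall into a short time frame, entries like the ones above can be counted per second. This is a rough sketch, assuming each cache.log entry sits on a single line as it does in the real file; the function names and the shortened sample are illustrative only.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Shortened sample based on the cache.log entries quoted above.
SAMPLE_LOG = [
    "2017/11/08 02:55:28 kid1| helperOpenServers: Starting 1/1000 'squidGuard' processes",
    "2017/11/08 02:55:28 kid1| ipcCreate: fork: (12) Cannot allocate memory",
    "2017/11/08 02:55:28 kid1| helperOpenServers: Starting 300/3000 'negotiate_kerberos_auth' processes",
    "2017/11/08 02:55:28 kid1| ipcCreate: fork: (12) Cannot allocate memory",
    "2017/11/08 02:55:28 kid1| ipcCreate: fork: (12) Cannot allocate memory",
]

FORK_FAIL = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) .*"
    r"ipcCreate: fork: \(12\) Cannot allocate memory"
)
HELPER_START = re.compile(r"helperOpenServers: Starting (\d+)/(\d+) '([^']+)' processes")


def fork_failures_per_second(lines: Iterable[str]) -> Counter:
    """Count 'Cannot allocate memory' fork failures per log timestamp."""
    counts: Counter = Counter()
    for line in lines:
        match = FORK_FAIL.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def helper_starts(lines: Iterable[str]) -> List[Tuple[str, int, int]]:
    """Return (helper name, number being started, configured maximum) per start entry."""
    starts = []
    for line in lines:
        match = HELPER_START.search(line)
        if match:
            starts.append((match.group(3), int(match.group(1)), int(match.group(2))))
    return starts


print(fork_failures_per_second(SAMPLE_LOG))
print(helper_starts(SAMPLE_LOG))
```

The second number in each "Starting N/M" entry is the configured maximum, so the entries above reflect limits of 1000 squidGuard and 3000 negotiate_kerberos_auth helpers at the time of the failure.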
_______________________________________________
squid-users mailing list
squid-users@lists.squid-cache.org
http://lists.squid-cache.org/listinfo/squid-users
