Hi!

I did some analysis of the data gathered so far.

Current status: 8236072 KB of memory allocated by squid since the restart on 
the 8th, i.e. after about 5-6 days of uptime.

The following memory pools show the biggest increase over the last 2 days 
(>100 KB delta):
Type                   KB-20140712 KB-20140714  KB-Delta Cnt-20140712 Cnt-20140714 Cnt-Delta
Total                      5629096     7494319   1865223     26439770     33704210   7264440
mem_node                   2375038     3192370    817332       588017       790374    202357
4K Buffer                  1138460     1499996    361536       284615       374999     90384
Short Strings               456219      606107    149888     11679188     15516319   3837131
16K Buffer                  213120      323120    110000        13320        20195      6875
HttpHeaderEntry             312495      415162    102667      5714194      7591516   1877322
2K Buffer                   249876      351226    101350       124938       175613     50675
8K Buffer                   135944      182360     46416        16993        22795      5802
HttpReply                   133991      178174     44183       490023       651607    161584
MemObject                   114845      152713     37868       490004       651575    161571
Medium Strings               90893      120859     29966       727141       966866    239725
cbdata BodyPipe (39)         65367       88238     22871       440363       594443    154080
HttpHdrCc                    41486       55327     13841       442515       590153    147638
32K Buffer                   23584       35360     11776          737         1105       368
cbdata MemBuf (13)           30627       40726     10099       490026       651615    161589
LRU policy node              46068       49871      3803      1965553      2127797    162244
64K Buffer                    1664        2240       576           26           35         9
Long Strings                  1444        2007       563         2888         4014      1126
StoreEntry                  173530      173746       216      1480781      1482628      1847

All of these grow roughly linearly.
They also show similar "wavy" behavior: when one has a "bump", then some of 
the others have a bump at the same time.
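
In case you want to reproduce the numbers: this is roughly how the deltas 
above can be computed from two saved mgr:mem dumps. It is only a minimal 
sketch (not the exact script I use); the file layout and the column index 
are assumptions, so it needs adjusting to the mgr:mem format of the squid 
version in question.

#!/usr/bin/env python
# Rough sketch: diff the "allocated KB" column of two saved mgr:mem dumps.
# Assumptions (not from this thread): one pool per line, pool name first,
# whitespace-separated numeric columns after it, and KB_COL pointing at the
# allocated-KB column of this squid version's mgr:mem output.
import re
import sys

KB_COL = 0  # index into the numeric columns; adjust to your mgr:mem layout

def parse_pools(path):
    """Return {pool name: allocated KB} for one mgr:mem snapshot file."""
    pools = {}
    with open(path) as f:
        for line in f:
            m = re.match(r'^(\S.*?)\s+([\d.\s]+)$', line.rstrip())
            if not m:
                continue  # headers, blank lines, non-pool output, ...
            nums = m.group(2).split()
            try:
                pools[m.group(1).strip()] = float(nums[KB_COL])
            except (IndexError, ValueError):
                pass
    return pools

def main(old_snapshot, new_snapshot, min_delta_kb=100.0):
    old, new = parse_pools(old_snapshot), parse_pools(new_snapshot)
    deltas = sorted(((new[p] - old[p], p) for p in new if p in old),
                    reverse=True)
    for delta, pool in deltas:
        if delta > min_delta_kb:
            print("%-24s %12.0f KB" % (pool, delta))

if __name__ == "__main__":
    # e.g.: python pool_delta.py mem-20140712.txt mem-20140714.txt
    main(sys.argv[1], sys.argv[2])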

So now there are several "groups" (a small sketch of the grouping rule 
follows the list):
* pools that stay constant (wordlist,...)
* pools that show variability like our traffic-curves (Comm::Connections)
* pools that increase minimally (starting at 80% of current KB 2 days ago) 
(ip_cache_entry, LRU_policy_node)
* a pool that increases a bit (starting at 35% of current KB 2 days ago) 
(fqdncache_entry)
* pools that increase a lot (starting at below 20% of the current KB 2 days 
ago), sorted from biggest to smallest KB footprint:
** mem_node
** 4K Buffer
** Short Strings
** HttpHeaderEntry
** 2K Buffer
** 16K Buffer
** 8K Buffer
** HttpReply
** MemObject
** Medium Strings
** cbdata BodyPipe (39)
** HttpHdrCc
** cbdata MemBuf (13)
** 32K Buffer
** Long Strings
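
The grouping rule itself is nothing more than looking at how big a pool 
already was 2 days ago relative to its current size. As a sketch (the 
thresholds mirror the bullets above; the example values at the bottom are 
made up and only illustrate the rule):

# Rough sketch of the grouping rule: classify a pool by the ratio of its
# size 2 days ago to its current size.  Thresholds mirror the bullet list
# above; the example numbers below are made up.

def classify(kb_two_days_ago, kb_now):
    ratio = kb_two_days_ago / kb_now if kb_now else 1.0
    if ratio >= 0.98:
        return "stays constant"
    if ratio >= 0.80:
        return "increases minimally"
    if ratio >= 0.35:
        return "increases a bit"
    if ratio < 0.20:
        return "increases a lot"
    return "in between"

# made-up numbers, just to show the rule:
print(classify(99000.0, 100000.0))   # stays constant
print(classify(15000.0, 100000.0))   # increases a lot (was 15% of current size)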

So there must be something that links all of those in the last group together.

If you look at the hour-to-hour percentage deltas, you find that most of 
these pools again show a "traffic-curve" pattern in their increase (which is 
the wavy part I was talking about earlier).
All of the pools in this specific group show (again) similar behavior with 
similar ratios.
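
To make "similar behavior with similar ratios" a bit more concrete, this is 
the kind of check I mean. It is only a sketch; it assumes you already have 
per-pool lists of hourly allocated-KB samples extracted from the hourly 
mgr:mem dumps (the hourly_kb data structure is made up for illustration):

# Rough sketch: quantify "similar behavior" by correlating the hour-to-hour
# increases of two pools.  hourly_kb is assumed to map pool name -> list of
# allocated-KB samples, one per hour, taken from the hourly mgr:mem dumps.

def hourly_deltas(samples):
    return [b - a for a, b in zip(samples, samples[1:])]

def correlation(xs, ys):
    """Plain Pearson correlation, no external libraries needed."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def compare(hourly_kb, pool_a, pool_b):
    # pools that grow "together" (the same traffic-shaped bumps) give a
    # delta correlation close to 1.0
    r = correlation(hourly_deltas(hourly_kb[pool_a]),
                    hourly_deltas(hourly_kb[pool_b]))
    print("delta correlation %s vs %s: %.2f" % (pool_a, pool_b, r))

# usage (once hourly_kb is filled in): compare(hourly_kb, "mem_node", "4K Buffer")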

So it seems to me as if we are keeping too much information in our cache 
that never gets evicted. As I said earlier, my guess would be the extra info 
needed to manage "Vary", possibly related to some cleanup process not 
evicting all the "related" objects in the cache...

This is when I started to look at the variation reported in some other values.

So here are the values of "StoreEntries" for the last few days:
20140709-020001:        1472007 StoreEntries
20140710-020001:        1475545 StoreEntries
20140711-020001:        1478025 StoreEntries
20140712-020001:        1480771 StoreEntries
20140713-020001:        1481721 StoreEntries
20140714-020001:        1482608 StoreEntries
These stayed almost constant...

But looking at " StoreEntries with MemObjects" the picture is totally different.
20140709-020001:        128542 StoreEntries with MemObjects
20140710-020001:        275923 StoreEntries with MemObjects
20140711-020001:        387990 StoreEntries with MemObjects
20140712-020001:        489994 StoreEntries with MemObjects
20140713-020001:        571872 StoreEntries with MemObjects
20140714-020001:        651560 StoreEntries with MemObjects

And "on disk objects":
20140709-020001:        1470163 on-disk objects
20140710-020001:        1472215 on-disk objects
20140711-020001:        1473671 on-disk objects
20140712-020001:        1475614 on-disk objects
20140713-020001:        1475933 on-disk objects
20140714-020001:        1476291 on-disk objects
(constant again)

And " Hot Object Cache Items":
20140709-020001:        128532 Hot Object Cache Items
20140710-020001:        275907 Hot Object Cache Items
20140711-020001:        387985 Hot Object Cache Items
20140712-020001:        489989 Hot Object Cache Items
20140713-020001:        571862 Hot Object Cache Items
20140714-020001:        651556 Hot Object Cache Items

So if you look at the finer details and the traffic pattern, we again see 
that traffic pattern for:
* StoreEntries with MemObjects
* Hot Object Cache Items

These show behavior similar to the pools mentioned above.
The other two counters stay fairly constant and at times even decrease in 
count.
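
For completeness, this is roughly how those four counters can be pulled out 
of the daily mgr:info snapshots to get the day-over-day growth. Again only a 
sketch, not the exact script: the one-file-per-day naming is made up, but 
the counter labels are the ones shown in the listings above.

# Rough sketch: extract the four store counters from daily mgr:info
# snapshots and print the day-over-day growth.  One snapshot file per day
# is assumed (the file naming is made up); the labels are the ones that
# appear in the counter output above.
import re
import sys

LABELS = [
    "StoreEntries",
    "StoreEntries with MemObjects",
    "Hot Object Cache Items",
    "on-disk objects",
]

def read_counters(path):
    with open(path) as f:
        text = f.read()
    counters = {}
    for label in LABELS:
        # each counter sits on its own line: "<number> <label>"
        m = re.search(r"(\d+)\s+%s\s*$" % re.escape(label), text, re.M)
        if m:
            counters[label] = int(m.group(1))
    return counters

def main(paths):
    prev = None
    for path in paths:  # e.g. info-20140709.txt info-20140710.txt ...
        cur = read_counters(path)
        if prev is not None:
            for label in LABELS:
                delta = cur.get(label, 0) - prev.get(label, 0)
                print("%s  %-30s %+8d" % (path, label, delta))
        prev = cur

if __name__ == "__main__":
    main(sys.argv[1:])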

So maybe all this gives additional evidence as to which objects are using so 
much more memory.

Does this give any hints?

Do you want to see any other data gathered?

martin

-----Original Message-----
From: Martin Sperl 
Sent: Freitag, 11. Juli 2014 09:06
To: Amos Jeffries; squid-users@squid-cache.org
Subject: RE: [squid-users] squid: Memory utilization higher than expected since 
moving from 3.3 to 3.4 and Vary: working

The basic connection stats are in the mgr:info:
File descriptor usage for squid:
        Maximum number of file descriptors:   65536
        Largest file desc currently in use:   1351
        Number of file desc currently in use:  249
        Files queued for open:                   0
        Available number of file descriptors: 65287
        Reserved number of file descriptors:   100
        Store Disk files open:                   0

Also: our loadbalancer will disconnect idle connections after some time and I 
believe the config has similar settings...

Will send you the hourly details since the restart in a personal email due to 
size limits of the mailinglist.

Here the current size of the process:
squid    15022  9.5 29.6 4951452 4838272 ?     Sl   Jul08 317:01 (squid-1) -f 
/opt/cw/squid/squid.conf

Martin

-----Original Message-----
From: Amos Jeffries [mailto:squ...@treenet.co.nz] 
Sent: Freitag, 11. Juli 2014 05:24
To: Martin Sperl; squid-users@squid-cache.org
Subject: Re: [squid-users] squid: Memory utilization higher than expected since 
moving from 3.3 to 3.4 and Vary: working

On 8/07/2014 10:20 p.m., Martin Sperl wrote:
> The problem is that it is a "slow" leak - it takes some time (months) to 
> find it...
> Also it only happens on real live traffic with high volume plus high 
> utilization of "Vary:"
> Moving our prod environment to head would be quite a political issue inside 
> our organization.
> Arguing to go to the latest stable version 3.4.6 would be possible, but I 
> doubt it would change a thing
> 
> In the meantime we have not restarted the squids yet, so we still got a bit 
> of information available if needed.
> But we cannot keep it up in this state much longer.
> 
> I created a core-dump, but analyzing that is hard.
> 
> Here are the top strings from that 10GB core-file, taken via: strings 
> corefile | sort | uniq -c | sort -rn | head -20.
> This may give you some idea:
> 2071897 =0.7
> 1353960 Keep-Alive
> 1343528 image/gif
>  877129 HTTP/1.1 200 OK
>  855949  GMT
>  852122 Content-Type
>  851706 HTTP/
>  851371 Date
>  850485 Server
>  848027 IEND
>  821956 Content-Length
>  776359 Content-Type: image/gif
>  768935 Cache-Control
>  760741 ETag
>  743341 live
>  720255 Connection
>  677920 Connection: Keep-Alive
>  676108 Last-Modified
>  662765 Expires
>  585139 X-Powered-By: Servlet/2.4 JSP/2.0
> 
> Another thing I thought we could do is:
> * restart squids
> * run mgr:mem every day and compare the daily changes for all the values 
> (maybe others?)
> 
> Any other ideas how to "find" the issue?
> 

Possibly a list of the mgr:filedescriptors open will show if there are
any hung connections/transactions, or long-polling connections holding
onto state.


Do you have the mgr:mem reports over the last few days? I can start
analysing to see if anything else pops out at me.

Amos


