Hi,

I'm running Squid 2.6STABLE21 as a reverse proxy (migration to 2.7STABLE6 in progress).

After Squid has been up for a day or two handling about 500 (mostly cacheable) requests per second, we start to see CPU usage spike to 100% and response times climb. It usually recovers on its own, but we sometimes resort to restarting Squid, which always clears the problem quickly. Attaching gdb and interrupting the process at random while it is in this state usually shows a backtrace inside malloc(). Zenoss plots (from SNMP) of the number of cached objects always show a decline while this is happening, as if a burst of requests with larger responses is displacing a much larger number of smaller responses already in the cache.

The box has 8 GB of RAM; Squid runs with no disk cache and roughly a 3 GB memory cache. I've tried raising memory_pools_limit from its default to 1024 MB, but that doesn't seem to make any difference.
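
The memory-related squid.conf directives, for reference:

    cache_dir null /mw/data/cache/diskcache
    cache_mem 3072 MB
    memory_pools_limit 1024 MB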

I've modified the source in ways that use somewhat more dynamic memory than stock Squid - for example, I log all four sets of HTTP headers (client and server, request and response), and I've added a few fields to core data structures such as _request_t and _AccessLogEntry. But I'm pretty sure there are no memory leaks, and the modified code has run smoothly for extended periods in the past.
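
To make that concrete, the changes are all along the lines of the sketch below - the type and function names here are made up for illustration and are not my exact code:

    /* Hypothetical sketch of the kind of addition I've made; the type
     * and function names are invented for this example. */
    #include "util.h"

    typedef struct {
        char *raw_request_hdrs;   /* added: copy of the request headers */
        char *raw_reply_hdrs;     /* added: copy of the reply headers */
    } hdr_log_extras;

    static void
    hdrLogExtrasSet(hdr_log_extras *e, const char *req, const char *rep)
    {
        /* plain xstrdup/xmalloc, nothing pool-aware */
        e->raw_request_hdrs = xstrdup(req);
        e->raw_reply_hdrs = xstrdup(rep);
    }

    static void
    hdrLogExtrasFree(hdr_log_extras *e)
    {
        xfree(e->raw_request_hdrs);
        e->raw_request_hdrs = NULL;
        xfree(e->raw_reply_hdrs);
        e->raw_reply_hdrs = NULL;
    }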

It looks as though a recent shift in the workload is exposing a heap fragmentation problem, and I'm trying to get a better grip on how Squid's memory pools work. If my added code uses only xmalloc() for dynamic memory, do those objects automatically become candidates for a memory pool when they're freed? Or do I have to do something special to associate them with a pool?
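
Put differently: my added code currently does the first of the two styles sketched below, and I'm wondering whether the second is required before freed objects can land in a pool. This is only a sketch based on my reading of lib/MemPool.c in 2.6 (the type and pool names are made up), so please correct me if I have the API wrong:

    /* Sketch only: log_extra_t and the pool name are invented, and the
     * pool calls reflect my reading of lib/MemPool.c, which may be off. */
    #include "squid.h"

    typedef struct {
        int status;
        char *hdr_copy;
    } log_extra_t;

    /* Style 1: what my added code does today - xmalloc()/xfree(),
     * which I believe bypasses the pools entirely. */
    static log_extra_t *
    logExtraAllocPlain(void)
    {
        return xcalloc(1, sizeof(log_extra_t));
    }

    static void
    logExtraFreePlain(log_extra_t *e)
    {
        xfree(e);
    }

    /* Style 2: an explicit named pool, with all allocation and freeing
     * going through memPoolAlloc()/memPoolFree(). */
    static MemPool *log_extra_pool = NULL;

    static log_extra_t *
    logExtraAllocPooled(void)
    {
        if (log_extra_pool == NULL)
            log_extra_pool = memPoolCreate("log_extra_t", sizeof(log_extra_t));
        return memPoolAlloc(log_extra_pool);
    }

    static void
    logExtraFreePooled(log_extra_t *e)
    {
        memPoolFree(log_extra_pool, e);
    }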

Thanks,
Ben
