I have interesting memory leak data to share with these two lists (crossposting to both svn and apr dev lists).
Ever since we launched svn-on-bigtable over at Google (about 2 years ago), we've been struggling with mysterious memory leaks in apache -- very similar to what users are complaining about in Subversion issue 3084. After lots of analysis, here's what we've figured out so far.

Symptom: When you have a process that runs for a very long time while making use of APR pools, the global pool tends to fragment into tiny pieces, and APR just keeps on malloc()ing without ever calling free(). In other words, a guaranteed long-and-slow leak. Most people don't notice this problem with httpd, because they run httpd in prefork mode: a bunch of httpd processes that only serve 1000 requests, then die and get re-spawned. They never live long enough to exhibit the leak. But if you run apache in threaded mode, and let the same apache run for days and weeks, it leaks a *lot*.

Cause: If you look at APR's pool code, you can see the main reason for fragmentation. In a nutshell, it never recombines recycled memory. For example, suppose over an hour I create 20 subpools each 5k in size, then apr_pool_destroy() them in turn. APR then places these blocks into a 'free memory' list for future recycling. If I then create a new subpool that requires 3k, no problem -- APR gives me back one of the existing 5k blocks to use. But if I create a subpool that requires 20k, whoops, it just goes and malloc()s 20k from the OS, rather than combining four adjacent blocks from the 'free' list.

Our solution: Over at Google, we simply hacked APR to *never* hold on to blocks for recycling. Essentially, this makes apr_pool_destroy() always free() the block, and makes apr_pool_create() always call malloc(). Poof, the memory leak went away instantly.

What was more troubling is that the MaxMemFree directive -- which is supposed to limit the total size of the 'free memory' recycling list -- didn't seem to work for us.
What we need to do is go back and debug this more carefully, and see if it's a bug in APR, apache, or just in our testing methodology. But I think there's still got to be something wrong with MaxMemFree, since users are claiming it's not working for them in issue 3084. Something is fishy. We plan to look into it more, but since users are screaming, maybe someone else can beat us to it...

In the long term, I think we need to question the utility of having APR do memory recycling at all. Back in the early 90's, malloc() was insanely slow and worth avoiding. In 2008, now that we're running apache with nothing but malloc/free, we're unable to measure any performance hit. The whole pool interface is really nice, but I wonder if pool recycling may just be unnecessary on modern hardware and OSes.
