So, we should crib a memory pool API from someplace (IIRC, Timo
Sirainen has a good one in Dovecot that is MIT licensed) or write
our own in Glib style and send it upstream (probably a bigger
project than it needs to be).
Why? To avoid the constant free(3)'ing?
Yes, exactly. Keeping track of each malloc to properly match it with a
free is a huge pain in the butt and a constant source of trouble. And,
as good C programmers, we're supposed to have a distaste for garbage
collection ;-)
Garbage collection, actually, is turning out to be pretty quick these
days. I'm no fan of most things put out by Microsoft, but their C#
garbage collector and, more importantly, the Boehm garbage collector
are turning out to be as fast as manually allocated/deallocated chunks
of memory. They also tend to suffer from less fragmentation, from what
I've heard. While I tend to err on the side of explicit
allocation/deallocation, I want to point out that many of my peers are
correct in pointing out that garbage collection isn't bad and, in most
cases, should be used.
A reasonable rant on the topic:
http://www.jwz.org/doc/gc.html
From the Boehm homepage
(http://www.hpl.hp.com/personal/Hans_Boehm/gc/), the following quote
seems appropriate:
"Performance of the nonincremental collector is typically competitive
with malloc/free implementations. Both space and time overhead are
likely to be only slightly higher for programs written for malloc/free
(see Detlefs, Dosser and Zorn's Memory Allocation Costs in Large C and
C++ Programs.) For programs allocating primarily very small objects,
the collector may be faster; for programs allocating primarily large
objects it will be slower. If the collector is used in a multithreaded
environment and configured for thread-local allocation, it may in some
cases significantly outperform malloc/free allocation in time."
Other quotes from a GC FAQ
(http://www.iecc.com/gclist/GC-faq.html#Common%20questions):
Is garbage collection slow?
Not necessarily. Modern garbage collectors appear to run as quickly as
manual storage allocators (malloc/free or new/delete). Garbage
collection probably will not run as quickly as a customized memory
allocator designed for use in a specific program. On the other hand,
the extra code required to make manual memory management work properly
(for example, explicit reference counting) is often more expensive
than a garbage collector would be. This is more likely to be true in a
multithreaded program, if the specialized allocator is a shared
resource (which it usually is).
Since this was first written, memory has become so cheap that garbage
collectors have been applied to very-large heaps, for example more
than a gigabyte. For a sufficiently large live set, pause times are
still an issue. On the other hand, for very many applications modern
garbage collectors provide pause times that are completely compatible
with human interaction. Pause times below 1/10th of a second are often
the case, and applications with relatively small live sets (or slowly
changing live sets, for a generational collector) can obtain pause times
below 1/100th of a second.
But I digress...
Memory pools are big chunks of malloc() memory that we request from the
system and then use for ourselves as needed in smaller pieces. In
Apache's APR implementation (which is really, really bloated) you can
also nest pools. Rather than a malloc/free pair for every string, you
just make a pool, "malloc" from the pool recklessly ;-) and then when
you're all done, free the entire pool back to the system.
I'm in favor of stability, then optimizing later. :) In all
seriousness, this is what FreeBSD's malloc(3) call does (and is why
it's really quick).
The system's malloc only closes the "application pool" when the
application closes. We're writing a daemon, so this functionality is
basically useless. Hence the need to write something similar that works
within the application. For example, on a per-connection basis, where
the memory used by a client is freed when they close the connection.
(There'd still be a need to be vigilant so that we don't run up too
much memory on a connection that's open for a long time, though.) And,
most saliently, where the memory used to service a command is freed
once the command has been responded to.
That's one way of doing it, but that's an arbitrary point to deallocate
memory. In the case of FreeBSD, contiguous blocks of memory at the end
of the process space get sbrk(2)'ed back to the OS. So, while this
means that there is no large deallocation done at the end of a client
connection, it does mean that after the second client disconnects, the
backend should be able to release two connections worth of memory. For
example:
0xc0de0000 -> Memory allocated for the first connection
0xc0de1000 -> Memory allocated for the second connection
0xc0de0000 -> Memory allocated for the third connection
When the first free(3) happens for the third connection, it should be
able to sbrk(2) data from the 0xc0de1000 range back to the kernel.
Now, that assumption is based on the fact that there isn't some tidbit
of data hanging out above the 0xc0de1000 range, which is very plausible
and which defeats my point. :) Might I suggest the use of the "Memory
Pool System"?
http://www.ravenbrook.com/project/mps/
It's BSD licensed, which is a very good thing, IMHO.
--Sean Trying-hard-to-procrastinate-moving-IP-blocks Chittenden