So, we should crib a memory pool API from someplace (IIRC, Timo
Sirainen has a good one in Dovecot that is MIT licensed) or write
our own in Glib style and send it upstream (probably a bigger
project than it needs to be).
Why? To avoid the constant free(3)'ing?
Yes, exactly. Keeping track of each malloc to properly match it with a
free is a huge pain in the butt and a constant source of trouble. And,
as good C programmers, we're supposed to have a distaste for garbage
collection ;-)
Garbage collection, actually, is turning out to be pretty quick these
days. I'm no fan of most things put out by Microsoft, but their C#
garbage collector and, more importantly, the Boehm garbage collector
are turning out to be as fast as manually allocated/deallocated chunks
of memory. They also tend to suffer from less fragmentation, from what
I've heard. While I tend to err on the side of explicit
allocation/deallocation, I want to point out that many of my peers are
correct in pointing out that garbage collection isn't bad and, in most
cases, should be used.
A reasonable rant on the topic:
http://www.jwz.org/doc/gc.html
From the Boehm homepage
(http://www.hpl.hp.com/personal/Hans_Boehm/gc/), the following quote
seems appropriate:
"Performance of the nonincremental collector is typically competitive
with malloc/free implementations. Both space and time overhead are
likely to be only slightly higher for programs written for malloc/free
(see Detlefs, Dosser and Zorn's Memory Allocation Costs in Large C and
C++ Programs.) For programs allocating primarily very small objects,
the collector may be faster; for programs allocating primarily large
objects it will be slower. If the collector is used in a multithreaded
environment and configured for thread-local allocation, it may in some
cases significantly outperform malloc/free allocation in time."
Other quotes from a GC FAQ
(http://www.iecc.com/gclist/GC-faq.html#Common%20questions):
Is garbage collection slow?
Not necessarily. Modern garbage collectors appear to run as quickly as
manual storage allocators (malloc/free or new/delete). Garbage
collection probably will not run as quickly as a customized memory
allocator designed for use in a specific program. On the other hand,
the extra code required to make manual memory management work properly
(for example, explicit reference counting) is often more expensive
than a garbage collector would be. This is more likely to be true in a
multithreaded program, if the specialized allocator is a shared
resource (which it usually is).
Since this was first written, memory has become so cheap that garbage
collectors have been applied to very-large heaps, for example more
than a gigabyte. For a sufficiently large live set, pause times are
still an issue. On the other hand, for very many applications modern
garbage collectors provide pause times that are completely compatible
with human interaction. Pause times below 1/10th of a second are often
the case, and applications with relatively small live sets (or slowly
changing live sets, for a generational collector) can obtain pause times
below 1/100th of a second.
But I digress...
Memory pools are big chunks of malloc() memory that we request from the
system and then use for ourselves as needed in smaller pieces. In
Apache's APR implementation (which is really, really bloated) you can
also nest pools. Rather than a malloc/free pair for every string, you
just make a pool, "malloc" from the pool recklessly ;-) and then when
you're all done, free the entire pool back to the system.
I'm in favor of stability, then optimizing later. :) In all
seriousness, this is what FreeBSD's malloc(3) call does (and is why
it's really quick).
The system's malloc only closes the "application pool" when the
application closes. We're writing a daemon, so this functionality is
basically useless. Hence the need to write something similar that works
within the application. For example, on a per-connection basis, where
the memory used by a client is freed when they close the connection.
(There'd still be a need to be vigilant so that we don't run up too
much memory on a connection that's open for a long time, though.) And,
most saliently, where the memory used to service a command is freed
once the command has been responded to.
That's one way of doing it, but that's an arbitrary point to deallocate
memory. In the case of FreeBSD, contiguous blocks of memory at the end
of the process space get sbrk(2)'ed back to the OS. So, while this
means that there is no large deallocation done at the end of a client
connection, it does mean that after the second client disconnects, the
backend should be able to release two connections worth of memory. For
example:
0xc0de0000 -> Memory allocated for the first connection
0xc0de1000 -> Memory allocated for the second connection
0xc0de0000 -> Memory allocated for the third connection
When the first free(3) happens for the third connection, it should be
able to sbrk(2) data from the 0xc0de1000 range back to the kernel.
Now, that assumption is based on the fact that there isn't some tidbit
of data hanging out above the 0xc0de1000 range, which is very plausible
and which defeats my point. :) Might I suggest the use of the "Memory
Pool System"?
http://www.ravenbrook.com/project/mps/
It's BSD licensed, which is a very good thing, IMHO.
--Sean Trying-hard-to-procrastinate-moving-IP-blocks Chittenden