------- Comment #5 from jima at cadence dot com 2007-03-21 02:31 ------- (In reply to comment #4) > (In reply to comment #3) >... Why do you think 1 or 2 more > instructions will be a hit, most of the problems you are going to hit into now > is cache issues anyways unless you do a better malloc pool in the first case > as > you will have non consecative memory that would really like to be close > together. Also you alloc the memory and then you add it to your free list but > never actually free the memory so say if you alloc 100 10 sized objects and > free those, and then alloc 50 20 sized objects, now your memory usage is twice > than what it should be.
This was a contrived demo program. The real application mallocs large fixed-sized chunks and sub-divides them to fill a free-list with many objects at once. On average, new & delete execute only a few instructions (if the object has no virtuals!). This is similar to what the Linux malloc does internally, but without the indirect function calls. It is true that lots of memory can be stranded after different sized objects are allocated and freed. In our system, applications occasionally call a method which detects if any chunks contain no live blocks, and frees those chunks. This mostly cures the problem because each chunk usually contains objects of only one size, so chunks with any live objects tend to be fairly well occupied. However the apps. have to actually use this feature :-) We also reduce the number of unique sizes by rounding the requested size up to some multiple [a compile-time constant calculation]. I agree that pre-allocating many objects may reduce cache locality. On the other hand, LIFO recycling means that we re-use the hottest memory first, and applications should use only a few different sizes during a given phase. So the active blocks will tend to be adjacent (in multiple sets, one set for each unique size after rounding). I haven't tried to measure cache locality effect, however. We think this all pays off for specific application areas which spend a significant portion of their time allocating & freeing small objects, such as for inlined code to to do graph operations, hash-table inserts & deletes, etc. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31289