------- Comment #5 from jima at cadence dot com  2007-03-21 02:31 -------
(In reply to comment #4)
> (In reply to comment #3)
>... Why do you think 1 or 2 more
> instructions will be a hit, most of the problems you are going to hit into now
> is cache issues anyways unless you do a better malloc pool in the first case 
> as
> you will have non consecative memory that would really like to be close
> together.  Also you alloc the memory and then you add it to your free list but
> never actually free the memory so say if you alloc 100 10 sized objects and
> free those, and then alloc 50 20 sized objects, now your memory usage is twice
> than what it should be.

This was a contrived demo program.  The real application mallocs large 
fixed-sized chunks and sub-divides them to fill a free-list with many 
objects at once.  On average, new & delete execute only a few instructions 
(if the object has no virtuals!).  This is similar to what the Linux malloc
does internally, but without the indirect function calls.

It is true that lots of memory can be stranded after different sized objects
are allocated and freed.  In our system, applications occasionally call a
method which detects if any chunks contain no live blocks, and frees those
chunks.  This mostly cures the problem  because each chunk usually
contains objects of only one size, so chunks with any live
objects tend to be fairly well occupied.  However the apps. have to actually
use this feature :-)

We also reduce the number of unique sizes by rounding the requested size up to
some multiple [a compile-time constant calculation].

I agree that pre-allocating many objects may reduce cache locality.
On the other hand, LIFO recycling means that we re-use the hottest memory
first, and applications should use only a few different sizes during a given
phase.  So the active blocks will tend to be adjacent (in multiple sets,
one set for each unique size after rounding).  I haven't tried to measure
cache locality effect, however.

We think this all pays off for specific application areas which spend a
significant portion of their time allocating & freeing small objects, 
such as for inlined code to to do graph operations, hash-table 
inserts & deletes, etc.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31289

Reply via email to