Re: landing g_slice_*() in glib

2005-12-02 Thread Alexander Larsson
On Fri, 2005-12-02 at 02:55 +0100, Tim Janik wrote:
 g_malloc/g_free:
Starting 1 threads allocating random blocks = 1024 bytes with 
 seed=123456789 using slab allocator + magazine cache
real0m16.784s
 internal slab allocator:
Starting 1 threads allocating random blocks = 1024 bytes with 
 seed=123456789 using slab allocator
real0m26.040s
 g_slice_*() (slab allocator + magazine cache):
Starting 1 threads allocating random blocks = 1024 bytes with 
 seed=123456789 using system malloc
real1m40.123s
 

This part obviously has the headers in the wrong order. g_malloc is the
one that takes 1m40s.

Also, are you making GType use g_slice for allocation of objects instead
of memchunks (if n_preallocs is set)?

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander LarssonRed Hat, Inc 
   [EMAIL PROTECTED][EMAIL PROTECTED] 
He's a genetically engineered albino barbarian from a doomed world. She's a 
sharp-shooting antique-collecting mercenary who hides her beauty behind a pair 
of thick-framed spectacles. They fight crime! 

___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list


Re: landing g_slice_*() in glib

2005-12-02 Thread Tim Janik

On Fri, 2 Dec 2005, Alexander Larsson wrote:


On Fri, 2005-12-02 at 02:55 +0100, Tim Janik wrote:

g_malloc/g_free:
   Starting 1 threads allocating random blocks = 1024 bytes with 
seed=123456789 using slab allocator + magazine cache
   real0m16.784s
internal slab allocator:
   Starting 1 threads allocating random blocks = 1024 bytes with 
seed=123456789 using slab allocator
   real0m26.040s
g_slice_*() (slab allocator + magazine cache):
   Starting 1 threads allocating random blocks = 1024 bytes with 
seed=123456789 using system malloc
   real1m40.123s



This part obviously has the headers in the wrong order. g_malloc is the
one that takes 1m40s.


thanks, you're right.


Also, are you making GType use g_slice for allocation of objects instead
of memchunks (if n_preallocs is set)?


memchunks in cvs are already using g_slice_*:
gpointer
g_mem_chunk_alloc (GMemChunk *mem_chunk)
{
  g_return_val_if_fail (mem_chunk != NULL, NULL);

  return g_slice_alloc (mem_chunk-alloc_size);
}

and gtype.c also uses g_slice_*() alloc regardless of n_preallocs, because
as i wrote in the original email, if the chunks/objects are too large,
g_slice_*() wil delegate to malloc anyway.

---
ciaoTJ
___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list


Re: landing g_slice_*() in glib

2005-12-02 Thread Paul LeoNerd Evans
On Fri, 02 Dec 2005 09:44:04 +0100
Alexander Larsson [EMAIL PROTECTED] wrote:

 Also, are you making GType use g_slice for allocation of objects instead
 of memchunks (if n_preallocs is set)?

And what of the allocators used by GString / GList / GSlist / ...? Will
they be using g_slice now too..?

-- 
Paul LeoNerd Evans

[EMAIL PROTECTED]
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/


signature.asc
Description: PGP signature
___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list


landing g_slice_*() in glib

2005-12-01 Thread Tim Janik

hey all.

i've comitted the new memory allcoator to glib HEAD now.
it replaces the old slow and memory bloated memchunk implementation
and should provide significant speedups and memory savings over
using the systems malloc()/free() implementations.

gslice.c contains a large comment which outlines the basic
allocator structure, i'm appending that to this email for the
interested reader, and the really curious ones can continue with
reading the two slab allocator papers mentioned in the comment which
served as a basis for the implementation ;)

to sum it up, the new slab allocator uses posix_memalign(3) and free(1)
to optimize allocations of many euqally sized chunks, and we now have
per thread free lists (the magazine layer) to quickly satisfy allocation
requests of already known structure sizes.
this is accompanied by extra caching logic that can keep around freed
memory for approximately 15 seconds before returning it to the system.
memory left over from alignment constraints is used for cache colorization
(random distribution of chunk addresses) to improve processor cache
utilization. to improve SMP scalability, the caching layer can adapt
itself to high lock contention as it can occour with multiple threads on
SMP systems.

there's also a new test/profiling program glibs/tests/slice-test now,
which allocates lots of randomly sized structures and releases/reallocates
those in random order.
here're the timing comparisons of system malloc vs. just the new slab
allocator (used internally by gslice.c) and then the slab allocator plus
magazine cache as it's available through g_slice_new() and g_slice_free():

g_malloc/g_free:
  Starting 1 threads allocating random blocks = 512 bytes with seed=123456789 
using system malloc
  real0m39.502s
internal slab allocator:
  Starting 1 threads allocating random blocks = 512 bytes with seed=123456789 
using slab allocator
  real0m20.064s
g_slice_*() (slab allocator + magazine cache):
  Starting 1 threads allocating random blocks = 512 bytes with seed=123456789 
using slab allocator + magazine cache
  real0m13.888s

that's single threaded on an 1.8GHz Athlon 512KB cache, glibc-2.3.2. malloc
takes allmost 3 times as long as gslice.c here.
for greater block sizes, malloc performs a lot worse:

g_malloc/g_free:
  Starting 1 threads allocating random blocks = 1024 bytes with seed=123456789 
using slab allocator + magazine cache
  real0m16.784s
internal slab allocator:
  Starting 1 threads allocating random blocks = 1024 bytes with seed=123456789 
using slab allocator
  real0m26.040s
g_slice_*() (slab allocator + magazine cache):
  Starting 1 threads allocating random blocks = 1024 bytes with seed=123456789 
using system malloc
  real1m40.123s

and here's a test across multiple CPUs (window.gnome.org, QuadXeon, 2.8GHz,
unfortunately tested under load and without support for sched_setaffinity(2)):

  Starting 1 threads allocating random blocks = 512 bytes with seed=123456789 
using slab allocator + magazine cache
  real0m9.099s
  Starting 2 threads allocating random blocks = 512 bytes with seed=123456789 
using slab allocator + magazine cache
  real0m9.945s
  Starting 3 threads allocating random blocks = 512 bytes with seed=123456789 
using slab allocator + magazine cache
  real0m14.043s
  Starting 4 threads allocating random blocks = 512 bytes with seed=123456789 
using slab allocator + magazine cache
  real0m18.680s

the seed of the random block allocator was fixed to ensure the same allocation
pattern was used to produce comparable runs.


for large block sizes, g_slice_new() and g_slice_alloc() will automatically
delegate to the system malloc implementation. so for newly written code, as
long as objects are not resized during their lifetime and the obejct size
used at allocation time is still available when freeing, it is recommended
to use the new g_slice_*() API instead of g_malloc() and friends.

the allocator can be considered mostly ready at this point, what's left on
my TODO for it is:
- some additional test/profiling programs that should go into CVS
- look into facilitating __thread instead of g_private_get() to speed up
  the common code path
- do controlled profiling on multi-CPU machines to measure scalability with
  increasing numbers of processors.
- do more realistic profiling, preferably with memory allocation intensive
  apps that don't require user interaction.
especially in the latter area, help is apprechiated, i.e. if people have
specific memory benchmarking scenarios that can be made to work with glib
head, please speak up ;)


from gslice.c:
/* the GSlice allocator is split up into 4 layers, roughly modelled after the 
slab
 * allocator and magazine extensions as outlined in:
 * + [Bonwick94] Jeff Bonwick, The slab allocator: An object-caching kernel
 *   memory allocator. USENIX 1994, 
http://citeseer.ist.psu.edu/bonwick94slab.html
 * + [Bonwick01] Bonwick and Jonathan Adams, Magazines and vmem: Extending