On Sat, 3 Dec 2005, Morten Welinder wrote:
I might be dense, but what exactly does g_slice_ buy us over plain, old malloc? * Don't say speed. That's a bogus argument because you cannot possibly have tested all mallocs.
speed. simply because for some (glibc) tested mallocs, g_slice is faster, and on many systems malloc is known to be a lot slower than the glibc malloc.
* Don't say overhead for the same reason.
overhead. per page (4096 bytes on ia32), we're currently using 24 bytes for administrative purposes (8 of those are left for the systems malloc). the rest of the page is divided into chunks of the required size, and the reminder is used for cache coloring. so taking only the administrative strcutures into account, you're better off than malloc which needs an extra 4byte size tag for each block, taking the reminder into account, you may end up worse if comparing only to mallocs size tag. but then, glibc malloc provides cache coloring as well and i don#t know the overhead of that, though i'd expect it to be somewhat around the figures used by g_slice.
What do we pay?
one thing you should take into account is the wastage and speed issues the old mem-chunk code had. most importantly, g_slice is about replacing memchunks as described in http://bugzilla.gnome.org/show_bug.cgi?id=118439 (and the dups linked from there)
* We have probably killed tools like Purify that know about malloc specifically.
there is nothing worse in using g_slice over using memchunks here.
* We have more overhead now since both malloc and g_slice_ will keep their own free pools.
we have a lot less overhead now, because each memchunk was keeping its own free pool. with gslice, we only have one global depot left.
* More code (now malloc and g_slice_), both likely to be kept hot by the cpu cache.
trade this against the old memchunk code, and you end up +-0.
I just don't see it. Trying to outdo an unknown malloc is hubris. If you had wanted something marginally fancy, you could have taken goffice's pool allocator. It comes with: * Free leak detection support. You can iterate over leaked pieces. * You can free a whole pool without freeing the pieces (but then you cannot detect leaks, of course). * O(1) ability to give back memory to the system when chunks become free.
if you really want to understand the issues at hand, you'll have to read up the bugreports i pointed you at and take a look at what the old memchunk code does. and if you seriously want to argue one allocator over the other, you should read the slab allocator papers first: [Bonwick94] Jeff Bonwick, The slab allocator: An object-caching kernel memory allocator. USENIX 1994, http://citeseer.ist.psu.edu/bonwick94slab.html [Bonwick01] Bonwick and Jonathan Adams, Magazines and vmem: Extending the slab allocator to many cpu's and arbitrary resources. maybe that'll already answer some of the questions you have.
For better [speed] or worse, it isn't currently thread safe. That could be trivial to add, though.
making an allocator thread safe is definitely non-trivial, the second paper [Bonwick01] has the details on that. just to mention a few issues involved: - you don't want to pay lock+unlock per every single allocation - if you use n CPUs (n>1), you want to be able to allocate n times the amount of memory of a single CPU per time interval, and not just be safe against concurrent allocations - if you consider one allocator per thread, take into account that chunks may be allocated in one thread and freed in another - you want to avoid false sharing, where different CPUs keep accessing different chunks which lie in the same cache line, this paper has more information on that: [Berger00] Emery D. Berger, Kathryn S. McKinley, Robert D. Blumofe, and Paul R. Wilson, Hoard: A Scalable Memory Allocator for Multithreaded Applications. November 2000, http://www.cs.umass.edu/~emery/hoard/
Morten
Unrelatedly, why drop type-safety for the non-gcc case?
getting type safe arguments right in macros is non trivial, especially across multiple compilers. i already had to adapt the initially comitted logic simply due to differences in gcc versions.
Something along these lines ought to work: #define g_slice_free(type, mem) \ do { \ if (1) g_slice_free (type, mem); else (void)((type*)0 == mem); \ } while (0);
thanks, i'll look into using that for g_slice.
In the gcc case you need to supply a non-macro definition of g_slice_free. Otherewise the resulting program is not valid, regardless of that "while (0)".
sorry, i don't understand you here. what do you mean by "not valid" and what version of the gcc case are you referring to (it'd help if you pasted the relevant lines)? --- ciaoTJ _______________________________________________ gtk-devel-list mailing list gtk-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list