Phong, how does the new AST vmalloc compare to jemalloc?

Olga
---------- Forwarded message ----------
From: Nicholas Clark <n...@ccl4.org>
Date: Wed, Aug 15, 2012 at 9:27 PM
Subject: jemalloc
To: perl5-port...@perl.org

Artur and Tim Bunce suggested investigating jemalloc, which is a high
performance malloc implementation now used by (among others) FreeBSD and
Facebook.

Artur also suggested that our use of arenas of memory (for SV bodies) is no
longer the best idea, given that malloc() implementations have got better.
Fortunately arenas are easy to disable, by compiling with -DPURIFY.

So here is a comparison of blead (on dromedary, -Os, no threads), default,
compiled with -DPURIFY, default using an LD_PRELOAD to force the use of
jemalloc 3.0.0, and finally compiled with -DPURIFY and using jemalloc.

Not having anything fantastically better to hand, this is perlbench, with
each of the 4 run twice. IIRC smaller numbers are better, and anything less
than 5% is probably noise:

                         A    B    C    D    E    F    G    H
                       ---  ---  ---  ---  ---  ---  ---  ---
arith/mixed            100  101  101   98  102   98  101  101
arith/trig             100  101  101   99  100   98   99  100
array/copy             100  101   95  101  101  100  102  100
array/foreach          100   79  102   76  101   76  101   79
array/index            100  112  101  105  100  112  101  110
array/pop              100  103  100  100  102  102  102  102
array/shift            100  101   97   98  100  100  101  100
array/sort-num         100  103  100  103  100  103  100  102
array/sort             100   87   98   84  100   84   97   87
call/0arg              100  111  100  104  107  104  102  108
call/1arg              100   99  103   96  106   96  103   97
call/2arg              100  105   97   99   96  100   95  103
call/9arg              100  103   98  102  101   94   99  103
call/empty             100  102   99  102   99   97   96  103
call/fib               100  100  100   97   97  100  101  101
call/method            100  106  101  102   97  104  100  105
call/wantarray         100  109   98  101  100  102   98  110
hash/copy              100   85  102   81  101   78  104   88
hash/each              100   94  102   88   85   88  102   93
hash/foreach-sort      100   97   99   97  100   94  101   96
hash/foreach           100   96   98   95  103   93  102   94
hash/get               100  101   98  101  100  102  101  102
hash/set               100   96  102  102  100  101  102   91
loop/for-c             100  106  111  105  101  106  109  106
loop/for-range-const   100   99   99   97   96   97   94   98
loop/for-range         100  100  101   92   99   98   99   99
loop/getline           100  104   98  104  100  103  100  104
loop/while-my          100  103  101   99  100  101   99   99
loop/while             100   71  100   96   96   98  101   99
re/const               100   99   99   99  100   97   99   99
re/w                   100   99  100  101   98  100  101   97
startup/fewmod         100   98   99   97  100   96   98   98
startup/lotsofsub      100   98  100   98  100   98  100   98
startup/noprog         100  101   79   79  100   79   79  100
string/base64          100  100   99   99  100  100   98   99
string/htmlparser      100   98  108  105  100  105  107   98
string/index-const     100  100   98  100  100  101   99  101
string/index-var       100  100   98   99  100  100  100   99
string/ipol            100  108  107  107  108  106  108  106
string/tr              100  101  100  101   99  101  101  102

AVERAGE                100   99  100   98  100   98  100   99

ed2b02642a84b031 A A +PURIFY B B +jemalloc C C +PURIFY +jemalloc D D

It's not much, so I'm not sure if it's noise or "signal". If it's signal,
it's suggesting that glibc malloc is fractionally better than using arenas,
and jemalloc fractionally better still. But not much. (And that with arenas,
malloc doesn't seem to matter.)

Would anyone like to pursue this further? jemalloc is BSD licensed, actively
maintained and likely to improve, so potentially we could ship it as a
replacement for the current malloc.c.

However, I'm not sure how easy it would be to integrate. We're not in a
position to enforce the use of LD_PRELOAD to swap out the libc malloc, so
just like the current malloc.c we'd have to do a bit more to rename the
symbols, and to play nicely with the system malloc, particularly if both
use sbrk().

Nicholas Clark

--
      ,   _                                    _   ,
     { \/`o;====-    Olga Kryzhanovska   -====;o`\/ }
.----'-/`-/     olga.kryzhanov...@gmail.com   \-`\-'----.
 `'-..-| /       http://twitter.com/fleyta     \ |-..-'`
      /\/\     Solaris/BSD//C/C++ programmer   /\/\
      `--`                                      `--`
_______________________________________________
ast-developers mailing list
ast-developers@research.att.com
https://mailman.research.att.com/mailman/listinfo/ast-developers