On 12/12/06, Zoran Vasiljevic <[EMAIL PROTECTED]> wrote:
On 12.12.2006, at 19:40, Stephen Deasey wrote:

> Tcl not being the fastest kid on the block, and folklore saying you
> should reboot your AOLserver/NaviServer once/day due to chronic memory
> fragmentation, it would be nice if these thousand node DOMs were
> constructed once, at compile time, with one of them thar fancy-pants
> 1970's macro systems...

Hm... this is an important point....

1. memory fragmentation
2. pre-allocation of dom nodes.

Let's skip 2 for the moment. As 1. is affecting all users...

Yes, we do currently restart the server daily because of that nasty fragmentation and memory eating. Which means there should be some other options, i.e. other memory allocation techniques we could use. By "we" I mean ourselves, as we are not that focused on raw speed of the allocator. Speed-sensitive things we already do with custom C-code and take care about memory ourselves. But general stuff like Tcl and 2. (above) still hit us in the knee every day.

Does anybody happen to know a nice malloc replacement for Win/Unix (Solaris, Linux, Mac OSX) that is fairly fast for an MT process, yet not as memory-hungry as the current allocator?
When I start the default tclsh on my Fedora Core 5 box it gobbles up 10MB of memory probing the stack to figure out the recursion limit. The libtcl8.4.so is only 770k, whereas libpython is 1.1MB, but a running python shell is only ~ 2MB. Tcl isn't very friendly to small memory systems.

The default Tcl on my machine is also compiled with thread support, so it uses the zippy allocator. Nathan says:

The Tcl threaded allocator, which is derived from the original AOLserver "zippy" allocator, is optimized solely for lock avoidance, and this comes at the cost of memory footprint. In fact, there's no garbage collection for smaller allocations. Instead, memory moves between the shared and per-thread pools.

So Tcl isn't very friendly to long running processes, say, on the desktop. We know it's not friendly to long running servers, the original specialisation of the allocator: we have to keep rebooting them!

I'm wondering if times have changed and default OS allocators are reasonable, at least for the average Tcl app. By default, does threaded Tcl need a leaky, kinda-bloaty allocator?

The last time we talked about this someone mentioned the Sun papers about their kernel slab allocator, and the newer one with per-thread "magazines". They've also ported this to user space as libumem. It would be cool if you could link against that and see how it works.

The thing I really like about libumem over hoard or tcmalloc is its wider interface. You can create memory pools and specify functions which initialise, re-initialise, and destroy objects (as well as plain-old malloc). If you think about it, that's exactly what the Tcl_Obj allocator is doing -- it's a specialised allocator for one particular type of object. The underlying memory allocator handles thread-local pools for speed, the Obj allocator adds initialisation and cleanup caching (etc.).

But there are other places you want to do that, Ns_Conn structs for example.
A couple of versions back these were pre-allocated at start up in a contiguous chunk (Ns_Conn conns[maxconns]), and now they're allocated on demand and pushed on a free list. The code is obfuscated, and there are probably other places which would benefit from similar caching but no one's taken the time to write the code.

The Sun guys make the nice observation that it's better to engineer a really good general solution and use it widely. Maybe this would work well for Tcl now? Ditch the current zippy allocator, add an interface for setting your own malloc/free etc. functions, and add a new interface for object pools modeled after Sun's libumem. The default Tcl implementation of that would use the Tcl_Obj stuff, but generalised. But you could build against umem, bypassing that. Port libumem to Linux, BSD etc., seeing as it's now Open Source.

Suggesting work for other people is easy and fun!
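
To make the object-pool idea a little more concrete, here is a minimal sketch in plain C of a pool with constructor/destructor hooks, a free list, and pluggable malloc/free functions. Every name in it (ObjPool, PoolAllocator, PoolGet, PoolPut, DemoConn and so on) is hypothetical; it is not libumem's API and not an existing Tcl or NaviServer interface. It is also single-threaded, so it says nothing about per-thread caches or magazines, and it assumes a C11 compiler for max_align_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hooks for swapping in your own malloc/free. */
typedef struct {
    void *(*alloc)(size_t size);
    void  (*release)(void *ptr);
} PoolAllocator;

/* Hypothetical per-object callbacks: run once per object, not once per use. */
typedef int  (*PoolCtor)(void *obj, void *arg);   /* returns 0 on success */
typedef void (*PoolDtor)(void *obj, void *arg);

/* Header kept in front of each object so the free-list link never
 * overwrites the object's constructed state. */
typedef union PoolHeader {
    union PoolHeader *next;
    max_align_t       align;   /* keeps the object suitably aligned */
} PoolHeader;

typedef struct {
    size_t        objSize;
    PoolCtor      ctor;
    PoolDtor      dtor;
    void         *arg;
    PoolAllocator mem;
    PoolHeader   *freeList;      /* constructed objects waiting for reuse */
    size_t        nconstructed;  /* objects currently constructed */
} ObjPool;

void PoolInit(ObjPool *pool, size_t objSize, PoolCtor ctor, PoolDtor dtor,
              void *arg, const PoolAllocator *mem)
{
    static const PoolAllocator defaults = { malloc, free };

    pool->objSize = objSize;
    pool->ctor = ctor;
    pool->dtor = dtor;
    pool->arg = arg;
    pool->mem = (mem != NULL) ? *mem : defaults;
    pool->freeList = NULL;
    pool->nconstructed = 0;
}

/* Reuse a cached object if there is one, else allocate and construct. */
void *PoolGet(ObjPool *pool)
{
    PoolHeader *hdr = pool->freeList;

    if (hdr != NULL) {
        pool->freeList = hdr->next;
        return hdr + 1;
    }
    hdr = pool->mem.alloc(sizeof(PoolHeader) + pool->objSize);
    if (hdr == NULL) {
        return NULL;
    }
    if (pool->ctor != NULL && pool->ctor(hdr + 1, pool->arg) != 0) {
        pool->mem.release(hdr);
        return NULL;
    }
    pool->nconstructed++;
    return hdr + 1;
}

/* Return an object to the pool, still constructed. */
void PoolPut(ObjPool *pool, void *obj)
{
    PoolHeader *hdr;

    assert(obj != NULL);
    hdr = (PoolHeader *)obj - 1;
    hdr->next = pool->freeList;
    pool->freeList = hdr;
}

/* Destroy and release everything on the free list. */
void PoolDestroy(ObjPool *pool)
{
    while (pool->freeList != NULL) {
        PoolHeader *hdr = pool->freeList;

        pool->freeList = hdr->next;
        if (pool->dtor != NULL) {
            pool->dtor(hdr + 1, pool->arg);
        }
        pool->mem.release(hdr);
        pool->nconstructed--;
    }
}

/* Stand-in for something like a connection struct with costly setup. */
typedef struct {
    char  *buf;
    size_t bufSize;
} DemoConn;

int DemoConnCtor(void *obj, void *arg)
{
    DemoConn *conn = obj;

    (void)arg;
    conn->bufSize = 4096;
    conn->buf = malloc(conn->bufSize);
    return (conn->buf != NULL) ? 0 : -1;
}

void DemoConnDtor(void *obj, void *arg)
{
    (void)arg;
    free(((DemoConn *)obj)->buf);
}
```

The point of the sketch is that a DemoConn handed back with PoolPut keeps its buffer, so the next PoolGet skips both the allocation and the constructor. Passing a non-NULL PoolAllocator to PoolInit is where a different underlying allocator would plug in.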