On Thu, May 15, 2014 at 10:35 AM, Ted Ross <[email protected]> wrote:
>
> On 05/14/2014 04:12 PM, Alan Conway wrote:
> > I'm getting up to speed with the router code, and I had a couple of
> > questions:
> >
> > alloc.c: Do we have performance data that shows this is better than
> > malloc? Modern malloc() has optimizations for per-thread use,
> > small objects, etc. In my experience it's very hard to beat. The
> > problems with a custom allocator are:
> > - it's more code to maintain, and it's hard to get right.
> > - it makes tools like valgrind much less useful.
>
> We don't have any benchmark data for this. It would be good to find out
> how it actually performs.
>
> This allocation mechanism comes from our experience with the C++ broker
> and problems with the general-purpose heap manager in Linux. Because a
> broker/router will very frequently allocate memory on one thread and
> free it on another, there is either lock contention or significant
> over-allocation. We need to disable the "arena" optimizations in malloc
> to prevent memory exhaustion.
>
> The alloc.c module provides per-thread free-pools of limited size,
> thereby amortizing the locking overhead of many operations into a
> single rebalancing of memory between threads (i.e. no lock is taken
> during alloc/free unless a block of memory needs to be rebalanced
> into or out of the thread pool).
>
> Dispatch uses alloc.c in select cases where rapid and frequent
> allocation is expected. The general heap or static allocation is used
> everywhere else.
Can you describe the cases you mention in more detail? I've been looking into adding some form of buffer pooling to the proton internals, possibly with an eye to exposing it or allowing integration with external pooling strategies, so I'd be interested in the details, particularly if any of the cases you mention intersect with proton buffers.

--Rafael
