Brian Akins wrote:
I'm writing an optimized caching module. I've been using 2.0.50. Here are the "top 50" from oprofile ( http://oprofile.sourceforge.net/ ):
Linux cnnsquid2 2.6.7-cnn.1smp #1 SMP Wed Jun 16 13:41:14 EDT 2004 x86_64 x86_64 x86_64 GNU/Linux
Using the leader MPM. I patched apr_atomics so that atomics work on x86_64.
The server is serving ~27k requests per second:
Are there optimizations to apr_palloc in 2.1? This seems to be a good place to optimize.
There's not much left to optimize in apr_palloc itself. In the common case, the execution path consists of a single conditional check and a bit of pointer addition. But reducing the number of calls to apr_palloc could yield an improvement. Do you happen to have a way to enable call-graph profiling on your test system, to find out what the top call chains are for the top 50 functions? (My understanding is that recent releases of oprofile support this with an optional patch.)
The large amount of time spent in memcpy could also be a good candidate for optimization.
Brian
