Brian Akins wrote:
I'm writing an optimized caching module. I've been using 2.0.50.
Here are the "top 50" from oprofile ( http://oprofile.sourceforge.net/ ):
Linux cnnsquid2 2.6.7-cnn.1smp #1 SMP Wed Jun 16 13:41:14 EDT 2004
x86_64 x86_64 x86_64 GNU/Linux
Using leader mpm. I patched apr_atomics so that atomics work on x86_64.
The server is serving ~27k requests per second:
Are there optimizations to apr_palloc in 2.1? This seems to be a good
place to optimize.
There's not much left to optimize in apr_palloc itself. In the common
case, the execution path consists of a single conditional check and a
bit of pointer addition. But reducing the number of calls to apr_palloc
could yield an improvement. Do you happen to have a way to enable
call-graph profiling on your test system, to find out what the top call
chains are for the top 50 functions? (My understanding is that recent
releases of oprofile support this with an optional patch.)
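
To illustrate why the common case is so cheap, here is a minimal
sketch of a bump-pointer pool allocator. It is not the APR source:
the names sketch_pool, sketch_palloc, sketch_pool_init and
sketch_alloc_slow are hypothetical, the 8-byte alignment is an
assumption, and the slow path (fetching a new block) is stubbed out.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool: one block with a "next free byte" cursor. */
typedef struct sketch_pool {
    char *first_avail;  /* next free byte in the current block */
    char *endp;         /* one past the end of the current block */
} sketch_pool;

/* Round a size up to a multiple of 8 bytes (assumed alignment). */
#define SKETCH_ALIGN(n) (((n) + 7u) & ~(size_t)7u)

/* Slow path, stubbed out: a real pool would fetch a new block here. */
static void *sketch_alloc_slow(sketch_pool *p, size_t size)
{
    (void)p;
    (void)size;
    return NULL;
}

/* Give the pool a single block of block_size bytes. Returns 0 on success. */
static int sketch_pool_init(sketch_pool *p, size_t block_size)
{
    p->first_avail = malloc(block_size);
    if (p->first_avail == NULL)
        return -1;
    p->endp = p->first_avail + block_size;
    return 0;
}

/* Fast path: one conditional check and a bit of pointer addition. */
static void *sketch_palloc(sketch_pool *p, size_t size)
{
    size = SKETCH_ALIGN(size);
    if (size <= (size_t)(p->endp - p->first_avail)) {
        void *mem = p->first_avail;
        p->first_avail += size;   /* bump the cursor past the new chunk */
        return mem;
    }
    return sketch_alloc_slow(p, size);
}
```

With a fast path like this, each individual call costs very little, so
the remaining cost comes mostly from how often the allocator is called.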
The large amount of time spent in memcpy also makes it a good
candidate for optimization.
Brian