Hi Philip,

Philip Martin writes:
> The performance of svnrdump is likely to be dominated by IO from the
> repository, network or disk depending on the RA layer. strace is a
> useful tool to see opens/reads/writes. You can see what order the
> calls occur, how many there are, how big they are and how long they
> take.

Ah, thanks for the tip.

> Valgrind/Callgrind is good and doesn't require you to instrument the
> code, but it does help to build with debug information. It does
> impose a massive runtime overhead.

I don't mind -- I'm mostly using some remote machines to gather the
profiling data :)

> This is what I get when dumping 1000 revisions from a local mirror of
> the Subversion repository over ra_neon:
>
> CPU: Core 2, speed 1200 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
> mask of 0x00 (Unhalted core cycles) count 100000
> samples  %        app name                 symbol name
> 4738     41.1893  no-vmlinux               (no symbols)
> 1037      9.0150  libxml2.so.2.6.32        (no symbols)
> 700       6.0854  libneon.so.27.1.2        (no symbols)
> 238       2.0690  libc-2.7.so              _int_malloc
> 228       1.9821  libc-2.7.so              memcpy
> 221       1.9212  libc-2.7.so              memset
> 217       1.8865  libc-2.7.so              strlen
> 191       1.6604  libsvn_subr-1.so.0.0.0   decode_bytes
> 180       1.5648  libc-2.7.so              vfprintf
> 171       1.4866  libc-2.7.so              strcmp
> 153       1.3301  libapr-1.so.0.2.12       apr_hashfunc_default
> 134       1.1649  libapr-1.so.0.2.12       apr_vformatter
> 130       1.1301  libapr-1.so.0.2.12       apr_palloc
>
> That's on my Debian desktop. At the recent Apache Retreat I tried to
> demonstrate OProfile on my Ubuntu laptop and could not get it to work
> properly, probably because I forgot about -fno-omit-frame-pointer.

Ah, now I see why it didn't work for me. The data from Callgrind is very
interesting: it seems to suggest that APR hashtables are prohibitively
expensive. @Stefan: Thoughts on hacking APR hashtables directly?

> Finally there is traditional gprof. It's a long time since I used it
> so I don't remember the details. You instrument the code at compile
> time using CFLAGS=-pg. If an instrumented function foo calls into a
> library bar that is not instrumented then bar is invisible, all you
> see is how long foo took to execute.

Yes, I used gprof initially. Callgrind is WAY more useful.

-- Ram
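
P.S. A flat per-symbol listing like the OProfile report quoted above can be easier to read when the samples are summed per binary. Here is a minimal Python sketch of that, assuming the rows are pasted in as whitespace-separated text. The function name samples_by_binary and the REPORT constant are made up for illustration and are not part of OProfile or Subversion.

```python
from collections import defaultdict

# Rows copied from the OProfile report quoted above:
# samples, percent, app name, symbol name.
REPORT = """\
4738 41.1893 no-vmlinux (no symbols)
1037 9.0150 libxml2.so.2.6.32 (no symbols)
700 6.0854 libneon.so.27.1.2 (no symbols)
238 2.0690 libc-2.7.so _int_malloc
228 1.9821 libc-2.7.so memcpy
221 1.9212 libc-2.7.so memset
217 1.8865 libc-2.7.so strlen
191 1.6604 libsvn_subr-1.so.0.0.0 decode_bytes
180 1.5648 libc-2.7.so vfprintf
171 1.4866 libc-2.7.so strcmp
153 1.3301 libapr-1.so.0.2.12 apr_hashfunc_default
134 1.1649 libapr-1.so.0.2.12 apr_vformatter
130 1.1301 libapr-1.so.0.2.12 apr_palloc
"""


def samples_by_binary(report):
    """Sum the sample counts of a flat profile per binary (hypothetical helper)."""
    totals = defaultdict(int)
    for line in report.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split into at most four fields so a symbol such as
        # "(no symbols)" stays in one piece.
        samples, _percent, binary, _symbol = line.split(None, 3)
        totals[binary] += int(samples)
    return dict(totals)


if __name__ == "__main__":
    totals = samples_by_binary(REPORT)
    # Print the binaries with the most samples first.
    for binary, count in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{count:6d}  {binary}")
```

This only covers the rows shown in the excerpt, so the totals are a lower bound for each binary and not a full breakdown of the run.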