Stefan Fuhrmann <stefanfuhrm...@alice-dsl.de> writes:

> On 27.09.2010 21:44, Ramkumar Ramachandra wrote:
>> Could you tell me which tools you use to profile the various
>> applications in trunk? I'm looking to profile svnrdump to fix some
>> perf issues, but OProfile doesn't seem to work for me.
>
> Under Linux, I'm using Valgrind / Callgrind and visualize in KCachegrind.
> That gives me a good idea of what code gets executed too often, how
> often a jump (loop or condition) has been taken etc. It will not show the
> non-user and non-CPU runtime, e.g. wait for disk I/O. And it will slow the
> execution by a factor of 100 (YMMV).

The performance of svnrdump is likely to be dominated by IO from the
repository, network or disk depending on the RA layer. strace is a
useful tool to see opens/reads/writes. You can see what order the
calls occur in, how many there are, how big they are and how long they
take.

Valgrind/Callgrind is good and doesn't require you to instrument the
code, but it does help to build with debug information. It does
impose a massive runtime overhead, though.

OProfile will give you CPU usage with far lower overhead than
valgrind/callgrind. As with valgrind/callgrind, you don't need to
instrument the code, but it works better with debug information. With
modern gcc, if you use -O2 then you also need -fno-omit-frame-pointer
for the callgraph stuff to work. I use it like so:

  opcontrol --init
  opcontrol --no-vmlinux --separate=library --callgraph=32
  opcontrol --start
  opcontrol --reset
  subversion/svnrdump/svnrdump ...
  opcontrol --stop
  opcontrol --dump
  opreport --merge all -l image:/path/to/lt-svnrdump

This is what I get when dumping 1000 revisions from a local mirror of
the Subversion repository over ra_neon:

  CPU: Core 2, speed 1200 MHz (estimated)
  Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
  samples  %        app name                 symbol name
  4738     41.1893  no-vmlinux               (no symbols)
  1037      9.0150  libxml2.so.2.6.32        (no symbols)
  700       6.0854  libneon.so.27.1.2        (no symbols)
  238       2.0690  libc-2.7.so              _int_malloc
  228       1.9821  libc-2.7.so              memcpy
  221       1.9212  libc-2.7.so              memset
  217       1.8865  libc-2.7.so              strlen
  191       1.6604  libsvn_subr-1.so.0.0.0   decode_bytes
  180       1.5648  libc-2.7.so              vfprintf
  171       1.4866  libc-2.7.so              strcmp
  153       1.3301  libapr-1.so.0.2.12       apr_hashfunc_default
  134       1.1649  libapr-1.so.0.2.12       apr_vformatter
  130       1.1301  libapr-1.so.0.2.12       apr_palloc

That's on my Debian desktop. At the recent Apache Retreat I tried to
demonstrate OProfile on my Ubuntu laptop and could not get it to work
properly, probably because I forgot about -fno-omit-frame-pointer.

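A flat opreport listing like the one above is easier to read once the
per-symbol percentages are totalled per image. The following is only a
rough sketch of that arithmetic, not part of OProfile itself: it assumes
the column layout shown above (samples, %, app name, symbol name), and
both the function name sum_by_image and the file name opreport.txt are
made up for illustration.

```shell
#!/bin/sh
# Total the "%" column per image ("app name") in flat opreport output.
# Assumes the layout: samples, %, app name, symbol name.
# sum_by_image is a hypothetical helper, not an OProfile command.
sum_by_image() {
    awk '
        # Data rows start with an integer sample count; header lines do not.
        $1 ~ /^[0-9]+$/ { pct[$3] += $2 }
        # Print one "total image" line per image seen.
        END { for (img in pct) printf "%.4f %s\n", pct[img], img }
    ' "$@" | sort -rn
}

# Hypothetical usage, with opreport.txt holding saved opreport output:
#   sum_by_image opreport.txt
```

For the rows quoted above, this puts libc-2.7.so at roughly 10.9% and
libapr-1.so.0.2.12 at roughly 3.6% in total, both still well below the
41% spent in the kernel (no-vmlinux).
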
Finally there is traditional gprof. It's a long time since I used it,
so I don't remember the details. You instrument the code at compile
time using CFLAGS=-pg. If an instrumented function foo calls into a
library bar that is not instrumented then bar is invisible; all you
see is how long foo took to execute.

--
Philip