Stefan Fuhrmann <stefanfuhrm...@alice-dsl.de> writes:

>  On 27.09.2010 21:44, Ramkumar Ramachandra wrote:
>> Could you tell me which tools you use to profile the various
>> applications in trunk? I'm looking to profile svnrdump to fix some
>> perf issues, but OProfile doesn't seem to work for me.
>
> Under Linux, I'm using Valgrind / Callgrind and visualize in KCachegrind.
> That gives me a good idea of what code gets executed too often, how
> often a jump (loop or condition) has been taken etc. It will not show the
> non-user and non-CPU runtime, e.g. wait for disk I/O. And it will slow the
> execution by a factor of 100 (YMMV).

The performance of svnrdump is likely to be dominated by I/O from the
repository, network or disk, depending on the RA layer.  strace is a
useful tool for seeing the opens/reads/writes: it shows the order in
which the calls occur, how many there are, how big they are and how
long they take.
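
As a rough sketch of that workflow (the helper names trace_io and
summarize_strace and the trace file name are made up for illustration,
and the awk script assumes the line format strace produces with -T):

```shell
# Hypothetical helper: record open/read/write calls of a command.
#   -f   follow child processes (the in-tree svnrdump appears to be a
#        libtool wrapper script, so the real binary is a child)
#   -tt  timestamp each call with microsecond resolution
#   -T   append the time spent in each call as <seconds>
#   -o   write the trace to a file instead of stderr
# Usage: trace_io svnrdump.strace subversion/svnrdump/svnrdump ...
trace_io() {
    out=$1
    shift
    strace -f -tt -T -e trace=open,read,write -o "$out" "$@"
}

# Hypothetical helper: per-syscall call count and total time from a
# trace file written by trace_io.  Prints "name count seconds" lines.
summarize_strace() {
    awk '
        # Skip the second half of calls that strace split across lines.
        /resumed>/ { next }
        # Only lines ending in the <seconds> field added by -T.
        match($0, /<[0-9.]+>$/) {
            secs = substr($0, RSTART + 1, RLENGTH - 2)
            # The syscall name is the word just before the first "(".
            line = $0
            sub(/\(.*/, "", line)
            n = split(line, parts, " ")
            name = parts[n]
            count[name]++
            total[name] += secs
        }
        END {
            for (name in count)
                printf "%s %d %.6f\n", name, count[name], total[name]
        }
    ' "$1" | sort
}
```

If all you want is the totals, strace -c prints a similar per-syscall
summary on its own; the full trace is what shows you the ordering and
the sizes.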

Valgrind/Callgrind is good and doesn't require you to instrument the
code, but it does help to build with debug information.  It does
impose a massive runtime overhead.
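
A minimal sketch of a callgrind run, assuming a libtool is on the PATH
(in a build tree the generated ./libtool may be the one to use) and
with run_callgrind and the output file name being made-up names:

```shell
# Hypothetical helper: run a command under callgrind.
# libtool --mode=execute makes valgrind see the real binary rather
# than the libtool wrapper script that lives in the build tree.
run_callgrind() {
    libtool --mode=execute \
        valgrind --tool=callgrind --callgrind-out-file=callgrind.out "$@"
}

# Example (arguments elided):
#   run_callgrind subversion/svnrdump/svnrdump ...
#   callgrind_annotate callgrind.out    # text summary
#   kcachegrind callgrind.out           # graphical view
```

Building with debug information (e.g. -g in CFLAGS) lets the results
be mapped back to source lines.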

OProfile will give you CPU usage with far lower overhead than
valgrind/callgrind.  As with valgrind/callgrind, you don't need to
instrument the code, but it works better with debug information.  With
modern gcc, if you build with -O2 you also need
-fno-omit-frame-pointer for the callgraph stuff to work.  I use it
like so:

opcontrol --init
opcontrol --no-vmlinux --separate=library --callgraph=32
opcontrol --start
opcontrol --reset
subversion/svnrdump/svnrdump ...
opcontrol --stop
opcontrol --dump
opreport --merge all -l image:/path/to/lt-svnrdump

This is what I get when dumping 1000 revisions from a local mirror of
the Subversion repository over ra_neon:

CPU: Core 2, speed 1200 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
samples  %        app name                 symbol name
4738     41.1893  no-vmlinux               (no symbols)
1037      9.0150  libxml2.so.2.6.32        (no symbols)
700       6.0854  libneon.so.27.1.2        (no symbols)
238       2.0690  libc-2.7.so              _int_malloc
228       1.9821  libc-2.7.so              memcpy
221       1.9212  libc-2.7.so              memset
217       1.8865  libc-2.7.so              strlen
191       1.6604  libsvn_subr-1.so.0.0.0   decode_bytes
180       1.5648  libc-2.7.so              vfprintf
171       1.4866  libc-2.7.so              strcmp
153       1.3301  libapr-1.so.0.2.12       apr_hashfunc_default
134       1.1649  libapr-1.so.0.2.12       apr_vformatter
130       1.1301  libapr-1.so.0.2.12       apr_palloc

That's on my Debian desktop.  At the recent Apache Retreat I tried to
demonstrate OProfile on my Ubuntu laptop and could not get it to work
properly, probably because I forgot about -fno-omit-frame-pointer.

Finally there is traditional gprof.  It's been a long time since I
used it so I don't remember the details.  You instrument the code at
compile time using CFLAGS=-pg.  If an instrumented function foo calls
into a library bar that is not instrumented then bar is invisible; all
you see is how long foo took to execute.
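
From memory, so treat this as a sketch rather than a recipe: the
helper name gprof_flat is made up, and it assumes the binary was
compiled and linked with -pg and has already been run once, which
leaves a gmon.out in the working directory.

```shell
# Hypothetical helper: print the flat profile of an instrumented run.
#   $1  path to the binary that was built with -pg
#   $2  profile data file (defaults to gmon.out)
gprof_flat() {
    gprof -p "$1" "${2:-gmon.out}"
}

# Example, with the path left for you to fill in:
#   gprof_flat /path/to/instrumented-binary
# Use gprof -q instead of -p for the call graph.
```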

-- 
Philip
