Doh! Thanks, Nathan. I think you put your finger on it.
I was well aware of the overhead that gprof can introduce,
but did not recognize that this overhead was not being
counted by gprof.
Jon
Nathan Froyd wrote:
On Mon, Feb 22, 2010 at 03:23:52PM -0600, Jon Turner wrote:
In it, you will find a directory with all the source code
needed to observe the problem for yourself.
The top level directory contains a linux executable called
maxFlo, which you should be able to run on a linux box
as is. But if you want/need to compile things yourself,
type "make clean" and "make all" in the top level
directory and you should get a fresh copy of maxFlo.
So, compiling maxFlo with no -pg option:
@nightcrawler:~/src/gprof-trouble-case$ time ./maxFlo
real 0m3.465s
user 0m3.460s
sys 0m0.000s
Compiling maxFlo with -pg option:
@nightcrawler:~/src/gprof-trouble-case$ time ./maxFlo
real 0m9.780s
user 0m9.760s
sys 0m0.010s
Notice that ~60% of the running time with gprof enabled is simply
overhead from call counting and the like. That time isn't recorded by
gprof. That alone accounts for your report about gprof ignoring 2/3 of
the execution time.
Checking to see whether maxFlo is a dynamic executable (since you
claimed earlier that you were statically linking your program):
@nightcrawler:~/src/gprof-trouble-case$ ldd ./maxFlo
linux-vdso.so.1 => (0x00007fff2977f000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x00007fb422c21000)
libm.so.6 => /lib/libm.so.6 (0x00007fb42299d000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007fb422786000)
libc.so.6 => /lib/libc.so.6 (0x00007fb422417000)
/lib64/ld-linux-x86-64.so.2 (0x00007fb422f31000)
So calls to shared library functions (such as functions in libm) will
not be caught by gprof. Those calls count account for a significant
amount of running time of your program and gprof can't tell you about
them.
Inspecting the gmon.out file:
@nightcrawler:~/src/gprof-trouble-case$ gprof maxFlo gmon.out
Flat profile:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls s/call s/call name
16.09 0.37 0.37 27649 0.00 0.00 shortPath::findPath()
12.61 0.66 0.29 55889952 0.00 0.00 graph::next(int,int) const
11.96 0.94 0.28 61391904 0.00 0.00 graph::mate(int,int) const
10.87 1.19 0.25 58654752 0.00 0.00 flograph::res(int,int)
const
10.44 1.43 0.24 _fini
6.96 1.59 0.16 65055289 0.00 0.00 graph::term(int) const
6.96 1.75 0.16 61391904 0.00 0.00 digraph::tail(int) const
[...lots of stuff elided...]
0.00 2.30 0.00 1 0.00 0.00 graph
gprof is telling you about 2.3 seconds of your execution time. With the
factors above accounted for, that doesn't seem unreasonable.
-Nathan