On Tuesday, 25 June 2013 at 06:21:09 UTC, dennis luehring wrote:
On 25.06.2013 07:51, dennis luehring wrote:
On 24.06.2013 18:15, Richard Webb wrote:
DMD built with DMC takes ~49 seconds to complete, but DMD built
with VC2008 only takes ~12 seconds. (Need to get a proper VC
build done to test it properly).
Looks like the DMC build spends far more time allocating memory,
even though the peak memory usage is only slightly lower in the
VS version?
I've done VS2012 + Intel VTune Amplifier XE 2013 profiling; see the
attached zipped CSV file.
The AMD CodeXL results are also different; both VTune and
CodeXL were fully integrated into VS2010 and used the "same" settings.
By the way, a nice read:
http://www.yosefk.com/blog/how-profilers-lie-the-cases-of-gprof-and-kcachegrind.html
GProf tends to be pretty useless for actual profiling in my
experience.
I think the best way is to use a sampling profiler such as 'perf'
(part of the Linux kernel project; on a recent Debian/Ubuntu/Mint,
type 'perf' into a console to get info about which package to
install; docs at https://perf.wiki.kernel.org/index.php/Tutorial),
'oprofile' (pretty much the same feature set as perf, sometimes
hard to set up) or VTune mentioned here. Never expect gprof to
give you reliable data as to how much time each function takes.
Callgrind/kcachegrind is also pretty good if your code doesn't
spend a lot of time on I/O, system calls, etc. (the main code
runs in a slow VM, so anything that does not run in that VM will
seem to run much faster by comparison).
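
To illustrate what "sampling" means here, below is a toy sketch in Python. It is not how perf or VTune are implemented; it only shows the idea: a CPU-time timer fires periodically and the handler notes which function was running. The names sample_profile, cheap, hot and workload are made up for this example, and it assumes a Unix-like system where signal.setitimer and SIGPROF are available.

```python
import collections
import signal
import time


def sample_profile(func, interval_s=0.001):
    """Run func() and count which function is active each time the timer fires."""
    counts = collections.Counter()

    def on_sample(signum, frame):
        # 'frame' is the Python frame that was executing when the signal arrived.
        counts[frame.f_code.co_name] += 1

    # ITIMER_PROF counts CPU time used by the process and delivers SIGPROF.
    # The handler is left installed afterwards; it is harmless once the
    # timer has been disarmed.
    signal.signal(signal.SIGPROF, on_sample)
    signal.setitimer(signal.ITIMER_PROF, interval_s, interval_s)
    try:
        func()
    finally:
        signal.setitimer(signal.ITIMER_PROF, 0)  # disarm the timer
    return counts


def cheap():
    # Burn roughly 0.05 s of CPU time.
    end = time.process_time() + 0.05
    while time.process_time() < end:
        pass


def hot():
    # Burn roughly 0.3 s of CPU time.
    end = time.process_time() + 0.3
    while time.process_time() < end:
        pass


def workload():
    cheap()
    hot()


if __name__ == "__main__":
    for name, n in sample_profile(workload).most_common():
        print(name, n)
```

The profiled code is not modified or recompiled in any way; the only cost is the handler running once per sample, which is why the overhead stays small until the interval gets very short.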
Furthermore, _none_ of these requires compiling with special
flags. As for debug symbols, it's best to enable optimizations
together with enabling debug symbols. Optimizations are not a big
issue - even if some functions were inlined, these tools give you
per-line and per-instruction results. Not to mention cache
hits/misses, branches, branch mispredictions, and if you use CPU
specific event IDs whatever else your CPU can record. AND it
doesn't affect performance of profiled code measurably, unless
you set an insanely high sample rate.
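
As a rough sketch of what this looks like with perf, the Python helper below only assembles perf command lines; it does not run anything. The target `./dmd test.d`, the 1000 Hz sample rate and the chosen event names are assumptions for illustration; which events are available depends on your CPU and kernel (`perf list` shows them).

```python
import shlex


def perf_stat_cmd(program, events=("cache-misses", "branch-misses")):
    """Build a 'perf stat' command that counts the given events for program."""
    return ["perf", "stat", "-e", ",".join(events), "--", *program]


def perf_record_cmd(program, freq_hz=1000, output="perf.data"):
    """Build a 'perf record' command sampling at freq_hz with call graphs (-g)."""
    return ["perf", "record", "-F", str(freq_hz), "-g", "-o", output, "--", *program]


if __name__ == "__main__":
    # Hypothetical target: an optimized binary built with debug symbols.
    target = ["./dmd", "test.d"]
    for cmd in (perf_stat_cmd(target), perf_record_cmd(target)):
        print(shlex.join(cmd))
```

After recording, `perf report` shows the per-function breakdown and `perf annotate` the per-instruction view of the same data.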
And if this sounds difficult to configure, most of these tools
(perf at the very least) have very sane defaults that give way
more useful results than gprof.
TLDR: gprof is horrible. Never use it for profiling. There are
approximately 5 billion better tools that give more detailed
results _and_ are easier to use.
I seriously need to write a blog post/article about this.