On Tuesday, 25 June 2013 at 06:21:09 UTC, dennis luehring wrote:
On 25.06.2013 07:51, dennis luehring wrote:
On 24.06.2013 18:15, Richard Webb wrote:
DMD built with DMC takes ~49 seconds to complete, but DMD built
with VC2008 only takes ~12 seconds. (Need to get a proper VC
build done to test it properly.)
Looks like the DMC build spends far more time allocating memory, even though the peak memory usage is only slightly lower in the
VS version?

i've done VS2012 + Intel VTune Amp XE 2013 profiling - see the attached
zipped csv file



the AMD CodeXL results are also different - both VTune and CodeXL are fully integrated into VS2010 and use the "same" settings

btw nice to read: http://www.yosefk.com/blog/how-profilers-lie-the-cases-of-gprof-and-kcachegrind.html


GProf tends to be pretty useless for actual profiling in my experience.

I think the best way is to use a sampling profiler such as 'perf' (part of the Linux kernel project; on a recent Debian/Ubuntu/Mint, type 'perf' into a console to get info about which package to install; docs at
https://perf.wiki.kernel.org/index.php/Tutorial),
'oprofile' (pretty much the same feature set as perf, sometimes hard to set up) or VTune, mentioned here. Never expect gprof to give you reliable data on how much time each function takes. Callgrind/kcachegrind is also pretty good if your code doesn't spend a lot of time on I/O, system calls, etc. (the main code runs in a slow VM, so anything not running in that VM will seem to run much faster).

Furthermore, _none_ of these requires compiling with special flags. As for debug symbols, it's best to enable optimizations together with debug symbols. Optimizations are not a big issue - even if some functions were inlined, these tools give you per-line and per-instruction results. Not to mention cache hits/misses, branches, branch mispredictions, and, if you use CPU-specific event IDs, whatever else your CPU can record. AND it doesn't measurably affect the performance of the profiled code, unless you set an insanely high sample rate.
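
To illustrate why sampling is so cheap, here is a toy sketch of the idea in Python. This is not how perf works internally (perf relies on kernel and hardware support); it only shows the principle of peeking at a running program at a fixed interval instead of instrumenting every call. The names sample_stacks, busy, workload and profile are made up for this example.

```python
import collections
import sys
import threading
import time


def sample_stacks(target_thread_id, interval, stop_event, counts):
    """Periodically record which function the target thread is executing."""
    while not stop_event.is_set():
        # sys._current_frames() maps thread ids to their current top frame.
        frame = sys._current_frames().get(target_thread_id)
        if frame is not None:
            counts[frame.f_code.co_name] += 1
        # The sample rate is the only knob: a shorter interval gives more
        # detail but steals more time from the profiled code.
        time.sleep(interval)


def busy(n):
    """Hypothetical hot function: a plain arithmetic loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def workload():
    """Hypothetical workload that spends nearly all its time in busy()."""
    return busy(3_000_000)


def profile(func, interval=0.001):
    """Run func() while a background thread samples the calling thread."""
    counts = collections.Counter()
    stop_event = threading.Event()
    sampler = threading.Thread(
        target=sample_stacks,
        args=(threading.get_ident(), interval, stop_event, counts),
    )
    sampler.start()
    try:
        result = func()
    finally:
        stop_event.set()
        sampler.join()
    return result, counts


if __name__ == "__main__":
    _, counts = profile(workload)
    for name, hits in counts.most_common():
        print(name, hits)
```

The profiled code is not modified at all, and the share of samples landing in a function approximates the share of time spent there. In this sketch nearly all samples should land in busy(), with perhaps a stray sample or two taken before the workload starts.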

And if this sounds difficult to configure, most of these tools (perf at the very least) have very sane defaults that give way more useful results than gprof.

TLDR: gprof is horrible. Never use it for profiling. There are approximately 5 billion better tools that give more detailed results _and_ are easier to use.

I seriously need to write a blog post/article about this.
