02-Aug-2013 20:47, Walter Bright wrote:
On 8/2/2013 6:16 AM, Dmitry Olshansky wrote:
I failed to see much of any improvement on Win32 though; allocations are
dominating the picture.

And sharing the joy of having a nice sampling profiler, here is what AMD
CodeAnalyst has to say (top X functions by CPU clocks not halted).

[snip]


This underlines the point that the DMC RTL allocator is the biggest speed
detractor.
It is "followed" by ledata (could it be due to linear search inside?) and,
surprisingly, the tiny Obj::fltused is draining lots of cycles (is it
called that often?).

It's not fltused() that is taking up time, it is the static function
following it. The sampling profiler you're using is unaware of
non-global function names.
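
To illustrate the effect, here is a minimal sketch of how a profiler that only sees global symbols could misattribute samples. It assumes the profiler resolves each sampled address to the nearest symbol at or below it; the addresses and the name static_helper are made up for the example and are not taken from the actual object file or from CodeAnalyst itself.

```python
import bisect

def attribute(address, symbols):
    """Return the name of the nearest symbol starting at or below `address`.

    `symbols` is a list of (start_address, name) pairs sorted by address.
    """
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, address) - 1
    return symbols[i][1] if i >= 0 else "<unknown>"

# Hypothetical layout: addresses and static_helper are invented for illustration.
all_symbols = [
    (0x1000, "Obj::far16thunk"),
    (0x1040, "Obj::fltused"),
    (0x1050, "static_helper"),   # static (non-global) function
    (0x1400, "Obj::ledata"),
]

# What a profiler sees if it is unaware of non-global names.
global_only = [s for s in all_symbols if s[1] != "static_helper"]

sample = 0x1200  # a sample taken somewhere inside static_helper

print(attribute(sample, all_symbols))  # attributed to static_helper
print(attribute(sample, global_only))  # attributed to the preceding global symbol
```

With the full symbol list the sample lands in static_helper; with only the global names it is charged to whichever global function happens to precede it, which is why a tiny function can appear to dominate the profile.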


Thanks, that must be it! And moving that function above another one gets Obj::far16thunk blamed instead :) Need to watch out for this sort of problem next time. Could it be due to how the profiler works with the old CV debug info format?

--
Dmitry Olshansky
