On Friday, 6 November 2015 at 11:37:22 UTC, Marc Schütz wrote:
Ok, benchA and benchB have the same assembler code generated. However, I _can_ reproduce the slowdown albeit on average only 20%-40%, not a factor of 10.

Forgot to add that this is on Linux x86_64, so that probably explains the difference.


It turns out that it's always the first tested function that's slower. You can test this by switching benchA and benchB in the call to benchmark(). I suspect the reason is that the OS is paging in the code the first time, and we're actually seeing the cost of the page fault. If you run a second round of benchmarks after the first one, that round shows more or less the same performance for both functions.
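
To make the check above repeatable, here is a rough sketch of the measurement protocol: time every function over several rounds, let a different function go first in each round, and treat the first round as a warm-up. The code from this thread isn't reproduced here, so bench_a, bench_b, time_calls, run_rounds and steady_state are all hypothetical stand-ins, written in Python purely for illustration. It doesn't reproduce the page-fault effect itself, it only shows how to keep one-time start-up cost out of the comparison.

```python
import time


def bench_a():
    # Hypothetical stand-in for benchA: a small fixed amount of work.
    return sum(range(1000))


def bench_b():
    # Hypothetical stand-in for benchB: the same work in a separate function.
    return sum(range(1000))


def time_calls(func, calls):
    """Return the wall-clock seconds taken by `calls` invocations of func."""
    start = time.perf_counter()
    for _ in range(calls):
        func()
    return time.perf_counter() - start


def run_rounds(funcs, calls=1000, rounds=3):
    """Time every function once per round, rotating the order each round.

    Returns {name: [seconds per round]}. Round 0 includes any one-time
    start-up cost, so use at least two rounds and treat round 0 as warm-up.
    """
    results = {f.__name__: [] for f in funcs}
    order = list(funcs)
    for _ in range(rounds):
        for f in order:
            results[f.__name__].append(time_calls(f, calls))
        # Rotate so that a different function is measured first next round.
        order = order[1:] + order[:1]
    return results


def steady_state(results):
    """Drop the warm-up round and keep the fastest remaining time per function."""
    return {name: min(times[1:]) for name, times in results.items()}


if __name__ == "__main__":
    results = run_rounds([bench_a, bench_b])
    for name, best in steady_state(results).items():
        first = results[name][0]
        print(f"{name}: first round {first:.6f}s, best later round {best:.6f}s")
```

If the later rounds agree for both functions while only the first measurement of the first round stands out, that points to a one-time cost rather than a real difference between the two functions.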

