On 3/2/20 7:32 PM, aliak wrote:
On Monday, 2 March 2020 at 23:27:22 UTC, Steven Schveighoffer wrote:
What I think is happening is that it determines nobody is using the
result, and the function is pure, so it doesn't bother calling that
function (probably not even the lambda, and then probably removes the
loop completely).
I'm assuming for some reason, the binary search is not flagged pure,
so it's not being skipped.
Apparently you're right:
https://github.com/dlang/phobos/blob/5e13653a6eb55c1188396ae064717a1a03fd7483/std/range/package.d#L11107
That's not definitive. Note that a template member or member of a struct
template can be *inferred* to be pure.
It's also entirely possible for the function to be pure, but the
compiler decides for another reason not to elide the whole thing.
Optimization isn't ever guaranteed.
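
To illustrate the inference point, here is a minimal sketch. The names `Haystack`, `has`, and `check` are made up for this example and are not from Phobos. The member function carries no `pure` annotation, yet an explicitly pure function is allowed to call it, which only works if the compiler inferred purity for the template member.

```d
// Hypothetical struct template, for illustration only (not Phobos code).
struct Haystack(T)
{
    T[] data;

    // No `pure` keyword here. Because Haystack is a template, the compiler
    // infers attributes for this member from its body.
    bool has(T needle)
    {
        foreach (x; data)
            if (x == needle)
                return true;
        return false;
    }
}

// Explicitly pure: this compiles only if Haystack!int.has was inferred pure.
bool check(int[] a, int needle) pure
{
    return Haystack!int(a).has(needle);
}

void main()
{
    assert(check([1, 2, 3], 2));
    assert(!check([1, 2, 3], 7));
}
```

So the absence of a `pure` keyword in the source of a templated type does not tell you whether the instantiated function is treated as pure.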
If I change it to this to ensure side effects:
bool makeImpure; // TLS variable outside of main
...
auto results = benchmark!(
() => makeImpure = r1.canFind(max),
() => makeImpure = r2.contains(max),
() => makeImpure = r3.canFind(max),
)(5_000);
writefln("%(%s\n%)", results); // modified to help with the comma confusion
I now get:
4 secs, 428 ms, and 3 hnsecs
221 μs and 9 hnsecs
4 secs, 49 ms, 982 μs, and 5 hnsecs
More like what I expected!
Ahhhh damn! And here I was thinking that branch prediction made a HUGE
difference! Ok, I'm taking my tail and slowly moving away now :) Let us
never speak of this again.
LOL, I'm sure this will come up again ;) The forums are full of
confusing benchmarks where LDC has elided the whole thing being tested.
It's amazing at optimizing. Sometimes, too amazing.
On 3/2/20 6:46 PM, H. S. Teoh wrote:
> To prevent the optimizer from eliding "useless" code, you need to do
> something with the return value that isn't trivial (assigning to a
> variable that doesn't get used afterwards is "trivial", so that's not
> enough). The easiest way is to print the result: the optimizer cannot
> elide I/O.
Yeah, well, that means you are also benchmarking the I/O (which would
dwarf the other pieces being tested).
I think assigning the result to a global fits the bill pretty well, but
obviously only works when you're not inside a pure function.
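
For anyone who wants to try this, here is a self-contained sketch of the global-sink approach. The data, the needle, the iteration count, and the names `sink` and `runBenchmarks` are assumptions made up for the example; they are not the ranges from the original test. As noted above, optimization is never guaranteed either way, so treat this as a reasonable way to discourage elision rather than a hard guarantee.

```d
import std.algorithm.searching : canFind;
import std.array : array;
import std.datetime.stopwatch : benchmark;
import std.range : assumeSorted, iota;
import std.stdio : writefln;

// Module-level (thread-local) variable. Writing to it is a side effect,
// so the lambdas below are not pure and their results count as "used".
bool sink;

// Hypothetical helper: times a linear search against a binary search.
auto runBenchmarks()
{
    // Made-up data: 100_000 sorted ints, searching for the last one.
    int[] data = iota(0, 100_000).array;
    auto sorted = data.assumeSorted;
    immutable needle = 99_999;

    return benchmark!(
        () => sink = data.canFind(needle),    // linear search
        () => sink = sorted.contains(needle), // binary search
    )(1_000);
}

void main()
{
    auto results = runBenchmarks();
    writefln("%(%s\n%)", results); // one duration per line
}
```

The lambdas cannot be marked `pure` here, since they write to `sink`; that is the same restriction mentioned above.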
-Steve