On Tuesday, 16 October 2018 at 16:57:12 UTC, welkam wrote:
So I ran the profiler, and 97% of the time is spent in the void twinsSieve
function; the hotspots are the seg[k] = seg[k] | 1; lines. Since
seg[k] can only be 1 or 0, I removed that OR operation. And the
results are... Cue the drum roll... 5% slower.
I thought that all of my studying was getting somewhere, that I
was beginning to understand things, but no. Performing an OR operation
and then storing the data is faster than just storing the data.
[sarcasm] Of course it makes sense [/sarcasm]
I looked at the assembler output and nothing changed except that
orb $0x1,(%rbx,%rdi,1)
is changed to
movb $0x1,(%rbx,%rdi,1)
I'm completely lost.
This is the exact same behavior I found with the Nim compiler too.
seg[k] = 1 is slower than seg[k] = seg[k] or 1, which is why
it's coded like that.
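For anyone who wants to check this outside the sieve, here is a minimal C sketch of the two variants side by side. The names mark_or, mark_store, same_marks and time_marks are made up for illustration and are not from twinsSieve, and the stride loop is only a stand-in for the real inner loop. Whether a given compiler really emits orb for one and movb for the other is an assumption to confirm in the disassembly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-ins for the sieve's inner loop: mark every
   step-th byte of seg, beginning at index start. */
void mark_or(uint8_t *seg, size_t n, size_t start, size_t step) {
    assert(step > 0);
    for (size_t k = start; k < n; k += step)
        seg[k] = seg[k] | 1;   /* read-modify-write form */
}

void mark_store(uint8_t *seg, size_t n, size_t start, size_t step) {
    assert(step > 0);
    for (size_t k = start; k < n; k += step)
        seg[k] = 1;            /* plain store form */
}

/* Returns 1 if both variants leave identical contents in a zeroed
   buffer, which holds whenever seg only ever contains 0 or 1. */
int same_marks(size_t n, size_t start, size_t step) {
    uint8_t *a = calloc(n, 1);
    uint8_t *b = calloc(n, 1);
    int same = 0;
    if (a && b) {
        mark_or(a, n, start, step);
        mark_store(b, n, start, step);
        same = memcmp(a, b, n) == 0;
    }
    free(a);
    free(b);
    return same;
}

typedef void (*mark_fn)(uint8_t *, size_t, size_t, size_t);

/* Rough CPU-time measurement in seconds for reps passes over seg,
   using a handful of odd strides. Results vary by machine. */
double time_marks(mark_fn f, uint8_t *seg, size_t n, int reps) {
    clock_t t0 = clock();
    for (int r = 0; r < reps; r++)
        for (size_t step = 3; step < 64; step += 2)
            f(seg, n, step / 2, step);
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Calling time_marks(mark_or, ...) and time_marks(mark_store, ...) on the same buffer gives a rough comparison. It is worth building with optimization on and looking at the disassembly (for example with objdump -d) before trusting the numbers, since the compiler is free to rewrite either loop.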
Thanks for finding the exact instruction difference.
How many other unknown gotchas are in the compilers? :-(