On 24/04/2016 1:37 AM, Andrei Alexandrescu wrote:
https://issues.dlang.org/show_bug.cgi?id=15951. I showed a few obvious
cases, but finding the best code in general is tricky. Ideas? -- Andrei

The opEquals that gets generated looks awfully big as well.

Also, from what I've read, you could get a speed boost if you lump the movs together, i.e. do all the loads first and then all the stores.

E.g.
movq <memory>, rax
movq <memory>, rbx
movq <memory>, rcx
movq <memory>, rdx
movq <memory>, r8
movq <memory>, r9
...
movq rax, <memory>
movq rbx, <memory>
movq rcx, <memory>
movq rdx, <memory>
movq r8, <memory>
movq r9, <memory>

In theory, on newer CPUs that should be fairly cheap compared to switching back and forth between loads and stores.
Of course, that does nothing to decrease the instruction count.
Unfortunately, I can't find anything backing this up, so take it with a grain of salt. Also, rep movs might be just as good.
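
For anyone who wants to measure this rather than guess, here is a rough C sketch of the three orderings mentioned above: grouped loads then stores, interleaved load/store pairs, and a plain memcpy as the "let the compiler pick" baseline. The type Block6 and the function names are made up for illustration. Note that an optimizing compiler is free to reorder or merge these moves, so the source only expresses the intended ordering; check the generated assembly before drawing any conclusions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 48-byte payload: six 64-bit words, one per movq pair above. */
typedef struct { uint64_t w[6]; } Block6;

/* Grouped form: all six loads first, then all six stores. */
static void copy_grouped(Block6 *dst, const Block6 *src)
{
    assert(dst != NULL && src != NULL);
    uint64_t a = src->w[0], b = src->w[1], c = src->w[2];
    uint64_t d = src->w[3], e = src->w[4], f = src->w[5];
    dst->w[0] = a; dst->w[1] = b; dst->w[2] = c;
    dst->w[3] = d; dst->w[4] = e; dst->w[5] = f;
}

/* Interleaved form: each load is followed immediately by its store. */
static void copy_interleaved(Block6 *dst, const Block6 *src)
{
    assert(dst != NULL && src != NULL);
    for (int i = 0; i < 6; ++i)
        dst->w[i] = src->w[i];
}

/* Baseline: leave the choice to the library and compiler, which may
   emit rep movs, SIMD moves, or plain movs depending on the target. */
static void copy_memcpy(Block6 *dst, const Block6 *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *dst);
}
```

All three produce the same result; only the order of memory operations differs, and even that only if the compiler keeps it.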

Also, there is a new series of CPUs coming out in a month or two, so who knows whether this will still hold, assuming it's valid now.
