Manu wrote:
These are indeed common gotchas. But they don't necessarily apply to
D, and if they do, then they should be bugged and hopefully addressed.
There is no reason that D needs to follow these typical performance
patterns from C.
It's worth noting that not all C compilers suffer from this problem.
There are many (most actually) compilers that can recognise a struct
with a single member and treat it as if it were an instance of that
member directly when being passed by value.
It only tends to be a problem on older games-console compilers.
As I said earlier, when I get back to finishing std.simd off (I
presume this will be some time after Walter has finished Win64
support), I'll go through and scrutinise the code-gen for the API very
thoroughly. We'll see what that reveals. But I don't think there's any
reason we should suffer the same legacy C by-value code-gen problems
in D... (hopefully I won't eat those words ;)
Thanks for the insight (and the code examples, though I've been
researching SIMD best-practice in C recently). It's good to know
that D should (hopefully) be able to avoid these pitfalls.
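To make the single-member struct point concrete, here is a minimal C sketch of the kind of test case I'd use to check a compiler. The names (Scalar, scalar_add, float_add) are made up for illustration, and the size check is an assumption that holds on mainstream ABIs rather than something the C standard guarantees.

```c
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical wrapper type: a struct whose only member is a float. */
typedef struct { float x; } Scalar;

/* Assumed to hold on mainstream ABIs; the standard allows padding. */
static_assert(sizeof(Scalar) == sizeof(float), "wrapper adds no padding");

/* Takes and returns the wrapper by value. */
Scalar scalar_add(Scalar a, Scalar b)
{
    Scalar r = { a.x + b.x };
    return r;
}

/* The same operation on the bare member, for comparison. */
float float_add(float a, float b)
{
    return a + b;
}
```

Compiling this with something like gcc -O2 -S and comparing the assembly for the two functions should show whether the compiler treats the wrapper like the bare float or pushes it through memory.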
On a side note, I'm not sure how easy LLVM is to build on Windows
(I think I built it once a long time ago), but recent performance
comparisons between DMD, LDC, and GDC show that LDC (with LLVM
3.1 auto-vectorization and not using GCC -ffast-math) actually
produces binaries that are on par with or faster than GDC's, at
least in my code on Linux64. SIMD in LDC is currently broken, but
you might consider using LDC if you're having trouble keeping a D
release compiler up-to-date.