Sean Cavanaugh wrote:
On 4/22/2011 2:20 PM, bearophile wrote:
Kai Meyer:
The purpose of the original post was to indicate that some low level
research shows that underlying data structures (as applied to video game
development) can have an impact on the performance of the application,
which D (I think) cares very much about.
The idea of the original post was a bit more complex: how can we
invent new/better ways to express semantics in D code that will not
prevent future D compilers from making small changes to the layout of
data structures to increase code performance? Complex transforms of
the data layout seem too complex for even a good compiler, but
maybe simpler ones will be possible. And I think that to do this, D code
needs some more semantics. I was suggesting an annotation that forbids
inbound pointers, which allows the compiler to move data around a
little, but this is just a start.
Bye,
bearophile
In many ways the biggest thing I use regularly in game development that
I would lose by moving to D would be good built-in SIMD support. The PC
compilers from MS and Intel both have intrinsic data types and
instructions that cover all the operations from SSE1 up to AVX. The
intrinsics are nice in that the job of register allocation and
scheduling is given to the compiler and generally the code it outputs is
good enough (though it needs to be watched at times).
Unlike ASM, intrinsics can be inlined so your math library can provide a
platform abstraction at that layer before building up to larger
operations (like vectorized forms of sin, cos, etc) and algorithms (like
frustum cull checks, k-DOP polygon collision, etc.), which makes porting
the algorithms to other platforms and reusing them there much easier, as only
the low level layer needs to be ported, and only outliers at the
algorithm level need to be tweaked after you get it up and running.
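
A minimal sketch of that layering, written in D with plain array operations standing in for intrinsics. The names Vec4, add, mul, hsum, dot and sphereInFrustum are hypothetical and only illustrate the idea: the first group of functions is the part that would be rewritten per platform, while the algorithm below it touches nothing but those functions.

```d
// Low-level layer: the only code that would need porting per platform.
// A real version would map these onto SSE or VMX intrinsics; plain
// array operations are used here as a stand-in (assumption).
struct Vec4
{
    float[4] v;
}

Vec4 add(Vec4 a, Vec4 b)
{
    Vec4 r;
    r.v[] = a.v[] + b.v[];
    return r;
}

Vec4 mul(Vec4 a, Vec4 b)
{
    Vec4 r;
    r.v[] = a.v[] * b.v[];
    return r;
}

// Horizontal sum of the four lanes.
float hsum(Vec4 a)
{
    return a.v[0] + a.v[1] + a.v[2] + a.v[3];
}

// Algorithm layer: expressed only in terms of the functions above.
float dot(Vec4 a, Vec4 b)
{
    return hsum(mul(a, b));
}

// Sphere-vs-frustum cull. Each plane is (nx, ny, nz, d) with the normal
// pointing inward; the centre is (x, y, z, 1), so dot() yields the
// signed distance from the plane.
bool sphereInFrustum(Vec4[] planes, Vec4 centre, float radius)
{
    foreach (plane; planes)
    {
        if (dot(plane, centre) < -radius)
            return false; // entirely behind this plane
    }
    return true;
}

void main()
{
    // A single plane x >= 0 stands in for a full six-plane frustum.
    Vec4[] planes = [Vec4([1.0f, 0.0f, 0.0f, 0.0f])];
    assert(sphereInFrustum(planes, Vec4([5.0f, 0.0f, 0.0f, 1.0f]), 1.0f));
    assert(!sphereInFrustum(planes, Vec4([-5.0f, 0.0f, 0.0f, 1.0f]), 1.0f));
}
```

Porting this to another platform would mean replacing the bodies of add, mul and hsum only; sphereInFrustum stays as it is.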
On the consoles there is AltiVec (VMX) which is very similar to SSE in
many ways. The common ground is basically SSE1-tier operations: 128-bit
values holding 4x32-bit integers or 4x32-bit floats. 64-bit
AMD/Intel makes SSE2 the minimum standard, and a systems language on
those platforms should reflect that.
Yes. It is primarily for this reason that we made static arrays
return-by-value. It is intended that on x86, float[4] will be an SSE1
register.
So it should be possible to write SIMD code with standard array
operations. (Note that this is *much* easier for the compiler than
trying to vectorize scalar code).
This gives syntax like:
float[4] a, b, c;
a[] += b[] * c[];
(This currently works, but doesn't use SSE, so performance is dismal.)
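
For anyone who wants to try the syntax, here is a small complete program, offered as a sketch. The function name scaleAccumulate is made up for illustration. The explicit initial values matter because D floats default to NaN rather than 0. Whether the array operation compiles to SSE instructions depends on the compiler and version, so nothing here should be read as a performance claim.

```d
// Hypothetical helper: takes static arrays by value and returns one by
// value, using an array operation for the arithmetic.
float[4] scaleAccumulate(float[4] a, float[4] b, float[4] c)
{
    // Element-wise: a[i] += b[i] * c[i] for all four lanes.
    a[] += b[] * c[];
    return a;
}

void main()
{
    // Explicit initial values: a default-initialized float is NaN in D.
    float[4] a = [1.0f, 2.0f, 3.0f, 4.0f];
    float[4] b = [0.5f, 0.5f, 0.5f, 0.5f];
    float[4] c = [2.0f, 4.0f, 6.0f, 8.0f];

    float[4] r = scaleAccumulate(a, b, c);

    assert(r == [2.0f, 4.0f, 6.0f, 8.0f]);
    // The caller's copy is unchanged because static arrays are values.
    assert(a == [1.0f, 2.0f, 3.0f, 4.0f]);
}
```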
Loading and storing is comparable across platforms with similar
alignment restrictions or penalties for working with unaligned data.
Packing/swizzle/shuffle/permuting are different but this is not a huge
problem for most algorithms. The lack of fused multiply and add on the
Intel side can be worked around or abstracted (i.e. always write code as
if it existed, have the Intel version expand to multiple ops).
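
As a rough illustration of the "always write code as if it existed" approach, the sketch below funnels every multiply-add through one helper. The names madd, splat and quadratic are hypothetical. This version simply expands to a multiply followed by an add; on a platform with a fused instruction only the body of madd would change. One caveat: a true fused operation rounds once instead of twice, so results can differ in the last bit between the two versions.

```d
// Hypothetical helper: per-lane a * b + c. This fallback expands to two
// operations; a platform with fused multiply-add would replace the body.
float[4] madd(float[4] a, float[4] b, float[4] c)
{
    float[4] r;
    r[] = a[] * b[] + c[];
    return r;
}

// Broadcast one scalar into all four lanes.
float[4] splat(float s)
{
    float[4] r = s;
    return r;
}

// Algorithm code written purely in terms of madd:
// evaluates x*x + 2*x + 3 per lane using Horner's rule.
float[4] quadratic(float[4] x)
{
    return madd(madd(splat(1.0f), x, splat(2.0f)), x, splat(3.0f));
}

void main()
{
    float[4] x = [0.0f, 1.0f, 2.0f, 3.0f];
    // Small integers are exactly representable, so the comparison is exact.
    assert(quadratic(x) == [3.0f, 6.0f, 11.0f, 18.0f]);
}
```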
And now my wish list:
If you have worked with shader programming through HLSL or Cg, the
expressiveness of doing the work in SIMD is very high. If I could write
something that looked exactly like HLSL but it was integrated perfectly
in a language like D or C++, it would be pretty huge to me. The amount
of math you can have in a line or two in HLSL is mind-boggling at times,
yet extremely intuitive and rather easy to debug.