Sean Cavanaugh wrote:
On 4/22/2011 2:20 PM, bearophile wrote:
Kai Meyer:

The purpose of the original post was to point out that some low-level
research shows that the choice of underlying data structures (as applied to
video game development) can have a real impact on application performance,
something D (I think) cares very much about.

The idea of the original post was a bit more complex: how can we invent new/better ways to express semantics in D code that don't forbid future D compilers from making some changes to the layout of data structures to increase performance? Complex transforms of the data layout seem too hard even for a good compiler, but maybe simpler ones are possible. And I think the D code needs some extra semantics to allow this. I was suggesting an annotation that forbids inbound pointers, which lets the compiler move data around a little, but that is just a start.

Bye,
bearophile


The biggest thing I use regularly in game development that I would lose by moving to D is good built-in SIMD support. The PC compilers from MS and Intel both provide intrinsic data types and instructions covering all the operations from SSE1 up to AVX. The intrinsics are nice in that register allocation and scheduling are left to the compiler, and the code it outputs is generally good enough (though it needs to be watched at times).
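For anyone who hasn't used them, a minimal sketch of what that intrinsic style looks like, assuming plain SSE1 and the standard xmmintrin.h header (the function name and the alignment assumptions are just for illustration):

#include <xmmintrin.h>   // SSE1 intrinsics (MSVC, Intel, GCC and Clang all ship this)
#include <cstddef>

// Scale an array of floats four at a time. Assumes 'data' is 16-byte
// aligned and 'count' is a multiple of 4.
void scale_floats(float* data, std::size_t count, float s)
{
    __m128 scale = _mm_set1_ps(s);         // broadcast s into all four lanes
    for (std::size_t i = 0; i < count; i += 4)
    {
        __m128 v = _mm_load_ps(data + i);  // aligned 128-bit load
        v = _mm_mul_ps(v, scale);          // four multiplies in one instruction
        _mm_store_ps(data + i, v);         // aligned 128-bit store
    }
}

No registers are named anywhere; the compiler decides which XMM registers hold 'v' and 'scale' and how to schedule the loads and stores.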

Unlike ASM, intrinsics can be inlined, so your math library can provide a platform abstraction at that layer before building up to larger operations (like vectorized forms of sin, cos, etc.) and algorithms (like frustum cull checks, k-DOP polygon collision, etc.). This makes porting and reusing the algorithms on other platforms much, much easier: only the low-level layer needs to be ported, and only outliers at the algorithm level need to be tweaked once you have it up and running.
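A rough sketch of that layering, again over SSE1 (the v_* names and the requirement that n be a multiple of 4 are just assumptions for the example):

#include <xmmintrin.h>
#include <cstddef>

// Low-level layer: the only part that changes per platform.
typedef __m128 vec4;
inline vec4 v_splat(float s)           { return _mm_set1_ps(s); }
inline vec4 v_loadu(const float* p)    { return _mm_loadu_ps(p); }
inline void v_storeu(float* p, vec4 v) { _mm_storeu_ps(p, v); }
inline vec4 v_add(vec4 a, vec4 b)      { return _mm_add_ps(a, b); }
inline vec4 v_mul(vec4 a, vec4 b)      { return _mm_mul_ps(a, b); }

// Higher-level layer: written once in terms of the wrappers, so only
// the v_* functions need porting to VMX or anything else.
inline float sum_of_products(const float* a, const float* b, std::size_t n)
{
    vec4 acc = v_splat(0.0f);
    for (std::size_t i = 0; i < n; i += 4)
        acc = v_add(acc, v_mul(v_loadu(a + i), v_loadu(b + i)));

    float lanes[4];
    v_storeu(lanes, acc);              // horizontal sum finished in scalar
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}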

On the consoles there is AltiVec (VMX), which is very similar to SSE in many ways. The common ground is basically SSE1-tier operations: 128-bit values with 4x32-bit integer and 4x32-bit float support. 64-bit AMD/Intel makes SSE2 the minimum standard, and a systems language on those platforms should reflect that.
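One way that common ground tends to be captured in C++ is a single 128-bit, 4x32-bit vector type selected at compile time; the exact predefined macros vary by compiler, so treat these as illustrative:

#if defined(__SSE2__) || defined(_M_X64)
    #include <emmintrin.h>         // SSE2: __m128 (float) and __m128i (integer)
    typedef __m128  vfloat4;       // 4 x 32-bit float
    typedef __m128i vint4;         // 4 x 32-bit integer
#elif defined(__ALTIVEC__) || defined(__VEC__)
    #include <altivec.h>
    typedef __vector float vfloat4;
    typedef __vector int   vint4;
#endif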

Yes. It is primarily for this reason that we made static arrays return-by-value. The intent is that on x86, a float[4] will be an SSE1 register, so it should be possible to write SIMD code with standard array operations. (Note that this is *much* easier for the compiler than trying to vectorize scalar code.)

This gives syntax like:
float[4] a, b, c;
a[] += b[] * c[];
(currently works, but doesn't use SSE, so has dismal performance).


Loading and storing are comparable across platforms, with similar alignment restrictions or penalties for working with unaligned data. Pack/swizzle/shuffle/permute operations differ, but this is not a huge problem for most algorithms. The lack of fused multiply-add on the Intel side can be worked around or abstracted (i.e. always write code as if it existed and have the Intel version expand to multiple ops).
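The fused multiply-add point is worth a sketch: write one madd() everywhere and let each platform expand it. The names here are made up for the example; the VMX branch maps to the real fused instruction while the SSE branch falls back to a multiply plus an add:

#if defined(__SSE__)
    #include <xmmintrin.h>
    typedef __m128 vec4;
    // No fused multiply-add in SSE of this era, so expand to two ops.
    inline vec4 v_madd(vec4 a, vec4 b, vec4 c)
    {
        return _mm_add_ps(_mm_mul_ps(a, b), c);    // a * b + c
    }
#elif defined(__ALTIVEC__)
    #include <altivec.h>
    typedef __vector float vec4;
    inline vec4 v_madd(vec4 a, vec4 b, vec4 c)
    {
        return vec_madd(a, b, c);                  // genuine fused multiply-add
    }
#endif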

And now my wish list:

If you have worked with shader programming through HLSL or Cg, you know the expressiveness of doing the work in SIMD is very high. If I could write something that looked exactly like HLSL but was integrated cleanly into a language like D or C++, it would be pretty huge for me. The amount of math you can fit in a line or two of HLSL is mind-boggling at times, yet extremely intuitive and rather easy to debug.
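For a taste of what that looks like when faked in C++ today, a thin operator-overloaded wrapper over SSE gets part of the way there (purely illustrative, not taken from any particular library):

#include <xmmintrin.h>

struct float4
{
    __m128 v;
    float4(__m128 x) : v(x) {}
    float4(float x)  : v(_mm_set1_ps(x)) {}   // broadcast a scalar
};
inline float4 operator+(float4 a, float4 b) { return float4(_mm_add_ps(a.v, b.v)); }
inline float4 operator-(float4 a, float4 b) { return float4(_mm_sub_ps(a.v, b.v)); }
inline float4 operator*(float4 a, float4 b) { return float4(_mm_mul_ps(a.v, b.v)); }

// One dense, HLSL-flavoured line of math:
inline float4 lerp(float4 a, float4 b, float4 t) { return a + (b - a) * t; }

Swizzles (the .xyzw member style) and the large built-in function library are the parts that are hard to retrofit this way, which is presumably why it would need to be integrated into the language itself as described above.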
