On 5/01/12 12:42 AM, bearophile wrote:
Manu:

I'm not referring to vector OPERATIONS. I only refer to the creation of a
type to identify these registers...

Please, try to step back a bit and look at this problem from a bit more 
distance. D has vector operations, and so far they have received only a tiny 
amount of love. Are you able to find some ways to solve some of your problems 
using a hypothetical much better implementation of D vector operations? Please, 
think about the possibilities of this syntax.

Think about future CPU evolution with SIMD registers 128, then 256, then 512, 
then 1024 bits long. In theory a good compiler is able to use them with no 
changes in the D code that uses vector operations.

Intrinsics are an additive change, so adding them later is possible. But I think 
fixing the syntax of vector ops is more important. I have some bug reports in 
Bugzilla about vector ops that have been sitting there for two years or so, and 
they are not about implementation performance.

I think the good Hara will be able to implement those syntax fixes in a matter 
of just one day or very few days if a consensus is reached about what actually 
is to be fixed in D vector ops syntax.

Instead of discussing *adding* something (register intrinsics), I suggest 
discussing what to fix in the *already present* vector op syntax. This 
is not a request just to you, Manu, but to this whole newsgroup.

Bye,
bearophile

D has no alignment support, so there is no way to specify that a float[4] should be aligned on a 16-byte boundary. Without that guarantee the compiler cannot generate code that exploits SSE well; it has to be conservative and assume the data is unaligned.

Suppose alignment support is added:

alias align(16) float[4] vec4f;

vec4f a, b;
...
a[0] = a[3];
a[1] = a[2];
a[2] = b[0];
a[3] = b[1];

Is it reasonable to expect compilers to generate a single shuffle instruction from this? What about more complicated code, such as computing a dot product? What D code do I write to get the compiler to generate the expected machine code?
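
To make the dot product question concrete, here is a minimal sketch of what it might look like with today's array operations. The function name dot is hypothetical and used only for illustration. The element-wise multiply maps naturally onto a vector op, but the horizontal sum has no array-op spelling, so it is written out by hand and it is up to the optimizer what to make of it. Nothing here guarantees any particular instruction sequence.

```d
// Hypothetical helper, not from the original post: a 4-element dot product
// written with D array operations where they apply.
float dot(const float[4] a, const float[4] b)
{
    float[4] tmp;
    tmp[] = a[] * b[];   // element-wise multiply via an array (vector) operation

    // Horizontal add: array ops have no syntax for this step,
    // so it falls back to plain scalar code.
    return tmp[0] + tmp[1] + tmp[2] + tmp[3];
}

void main()
{
    float[4] a = [1, 2, 3, 4];
    float[4] b = [5, 6, 7, 8];

    // 1*5 + 2*6 + 3*7 + 4*8 = 70
    assert(dot(a, b) == 70.0f);
}
```

The multiply is the easy half. The reduction at the end is where I would expect a compiler to struggle to find the instruction I actually want, which is the point of the question.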

If we get alignment support and a lot of work goes into optimizing vector ops for this, then we can go a long way without intrinsics, but I don't think we'll ever be able to completely remove the need for them.
