On 5/01/12 12:42 AM, bearophile wrote:
Manu:
I'm not referring to vector OPERATIONS. I only refer to the creation of a
type to identify these registers...
Please, try to step back a bit and look at this problem from a bit more
distance. D has vector operations, and so far they have received only a tiny
amount of love. Are you able to find some ways to solve some of your problems
using a hypothetical much better implementation of D vector operations? Please,
think about the possibilities of this syntax.
Think about future CPU evolution with SIMD registers 128, then 256, then 512,
then 1024 bits long. In theory a good compiler is able to use them with no
changes in the D code that uses vector operations.
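For readers who have not used the syntax in question, here is a minimal sketch of a D array operation. The function name `scaleAdd` is made up for illustration. No register width appears anywhere in the source, which is the property being argued for. Whether a given compiler actually emits SIMD instructions for it is a separate matter and is not guaranteed.

```d
// Element-wise a*x + y over whole arrays, using D's array-operation syntax.
// `scaleAdd` is a hypothetical name, not a library function.
float[] scaleAdd(const(float)[] x, const(float)[] y, float a)
{
    assert(x.length == y.length);
    auto result = new float[x.length];
    // No vector width is mentioned; the compiler decides how to process
    // the elements (scalar loop, 128-bit SIMD, wider SIMD, ...).
    result[] = x[] * a + y[];
    return result;
}
```

Calling `scaleAdd([1.0f, 2.0f, 3.0f], [10.0f, 20.0f, 30.0f], 2.0f)` computes each element as `x[i] * 2 + y[i]`.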
Intrinsics are an additive change; adding them later is possible. But I think
fixing the syntax of vector ops is more important. I have some bug reports in
Bugzilla about vector ops that have been sleeping there for two years or so, and
they are not about implementation performance.
I think the good Hara will be able to implement those syntax fixes in a matter
of just one day or very few days if a consensus is reached about what actually
is to be fixed in D vector ops syntax.
Instead of discussing *adding* something (register intrinsics), I suggest
discussing what to fix in the *already present* vector op syntax. This
is not a request just to you, Manu, but to this whole newsgroup.
Bye,
bearophile
D has no alignment support, so there is no way to specify that you want
a float[4] to be aligned on 16 bytes, which means there is no way for
the compiler to generate code to exploit SSE well. It has to be
conservative and assume unaligned.
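To make the alignment point concrete, here is a small sketch of a run-time check. The helper name `isAligned16` is hypothetical. It only shows the property the compiler cannot assume at compile time for an arbitrary float[4]; it is not a proposal for how alignment should be specified.

```d
// Returns true if `p` sits on a 16-byte boundary.
// `isAligned16` is a hypothetical helper for illustration only.
bool isAligned16(const(void)* p)
{
    // A 16-byte aligned address has its low four bits clear.
    return (cast(size_t) p & 15) == 0;
}
```

For a local `float[4] v`, `isAligned16(v.ptr)` may or may not be true, and `v.ptr + 1` can never be aligned when `v.ptr` is, so aligned loads cannot be used blindly.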
Suppose alignment support is added:
alias align(16) float[4] vec4f;
vec4f a, b;
...
a[0] = a[3];
a[1] = a[2];
a[2] = b[0];
a[3] = b[1];
Is it reasonable to expect compilers to generate a single shuffle
instruction from this? What about more complicated code, like computing a
dot product? What D code do I write to get the compiler to generate the
expected machine code?
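As a point of reference, this is roughly what the plain D source for those two cases looks like today. It is a sketch on ordinary float[4] with no alignment assumed, and all function names are made up. Nothing here promises a single shuffle or a SIMD dot product in the generated code; that is exactly the open question.

```d
// Hypothetical helper names; plain float[4], no alignment assumed.

// The same element moves as the four assignments above, as one function.
float[4] shuffleLike(float[4] a, float[4] b)
{
    float[4] r = [a[3], a[2], b[0], b[1]];
    return r;
}

// Dot product as an explicit scalar loop.
float dot(const float[4] a, const float[4] b)
{
    float sum = 0;
    foreach (i; 0 .. 4)
        sum += a[i] * b[i];
    return sum;
}

// Dot product using an array operation for the multiply,
// followed by a manual horizontal sum.
float dotArrayOp(const float[4] a, const float[4] b)
{
    float[4] prod;
    prod[] = a[] * b[];   // element-wise multiply
    return prod[0] + prod[1] + prod[2] + prod[3];
}
```

The multiply in `dotArrayOp` maps naturally onto one SIMD instruction, but the horizontal sum has no array-op spelling, so the compiler would have to recognize the pattern on its own.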
If we get alignment support and lots of work goes into optimizing vector
ops for this, then we can go a long way without intrinsics, but I don't
think we'll ever be able to completely remove the need for intrinsics.