Manu:

They must be aligned, and multiples of N elements.

The D GC currently allocates them 16-byte aligned (but if you slice the array you can lose that alignment). On some newer CPUs the penalty for misaligned accesses is small.
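To make the slicing point concrete, here is a small sketch in C (not D, and not anything from the D runtime). The helper names `is_aligned16` and `floats_to_alignment` are hypothetical, and it assumes 4-byte floats. It only shows how an offset into an aligned block moves the pointer off the 16-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p sits on a 16-byte boundary, 0 otherwise. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: how many leading floats must be skipped before
   the next 16-byte boundary (0..3 when floats are 4 bytes wide). */
size_t floats_to_alignment(const float *p)
{
    /* The pointer must at least be aligned for float itself. */
    assert(((uintptr_t)p % sizeof(float)) == 0);
    size_t misalign = (size_t)((uintptr_t)p & 15u); /* bytes past the boundary */
    return misalign ? (16 - misalign) / sizeof(float) : 0;
}
```

So a "slice" that starts one float into an aligned block (the equivalent of `a[1 .. $]`) is no longer 16-byte aligned, and three scalar elements come before the next aligned one.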

You often have "n" values, where n is variable. If n is large enough and you are using D vector ops, handling the head and tail doesn't waste too much time. If you have very few values it's much better to use the explicit SIMD code.
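A rough sketch of what I mean by head and tail handling, again in plain C. This is not how the D vector ops are implemented, just an assumed shape for illustration. The function name `sum_head_body_tail` is made up, and the 4-wide body is ordinary scalar code standing in for one aligned SIMD load and add per iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sums n floats in three phases: a scalar head up to the first 16-byte
   boundary, a 4-wide body, and a scalar tail for the leftovers. */
float sum_head_body_tail(const float *p, size_t n)
{
    assert(((uintptr_t)p % sizeof(float)) == 0);
    size_t i = 0;
    float total = 0.0f;

    /* Head: scalar steps until p + i is 16-byte aligned (or data runs out). */
    while (i < n && ((uintptr_t)(p + i) & 15u) != 0)
        total += p[i++];

    /* Body: four independent lanes, as a SIMD register would hold them. */
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (; i + 4 <= n; i += 4) {
        lane[0] += p[i];
        lane[1] += p[i + 1];
        lane[2] += p[i + 2];
        lane[3] += p[i + 3];
    }
    total += (lane[0] + lane[1]) + (lane[2] + lane[3]);

    /* Tail: fewer than four elements remain. */
    for (; i < n; i++)
        total += p[i];

    return total;
}
```

With a large n the two scalar loops touch at most six elements in total, so their cost disappears. With only a handful of values the body loop may never run at all, and all you have paid for is the alignment test and the branching.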


Well, each is a valid comparison in a different situation. I'm not sure how syntax could clearly select the one you want.

Maybe later we'll look for some syntax sugar for this.


Do the D intrinsics offer instructions to perform prefetching?

Well, GCC does at least. If you're worried about performance at this level, you're probably already using GCC :)

I think D SIMD programmers will expect something functionally like __builtin_prefetch to be available in D too:
http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-g_t_005f_005fbuiltin_005fprefetch-3396
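For reference, this is roughly how the GCC builtin is used from C (Clang accepts it too). It's a minimal sketch: `sum_with_prefetch` and `PREFETCH_DISTANCE` are hypothetical names, the distance of 64 elements is an untuned guess, and the 64-byte cache line is an assumption about the target CPU. The second argument selects read (0) or write (1), the third is the temporal locality hint from 0 to 3, and both must be compile-time constants.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning constant: how many elements ahead to prefetch. */
enum { PREFETCH_DISTANCE = 64 };

/* Sums n floats, hinting the cache about data needed a few lines ahead. */
float sum_with_prefetch(const float *p, size_t n)
{
    assert(p != NULL || n == 0);
    float total = 0.0f;
    for (size_t i = 0; i < n; i++) {
        /* One hint per 16 floats (one assumed 64-byte cache line).
           The bounds check keeps the address expression inside the array. */
        if ((i % 16) == 0 && i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&p[i + PREFETCH_DISTANCE], 0, 3);
        total += p[i];
    }
    return total;
}
```

A linear scan like this is probably handled well enough by the hardware prefetcher already, so the example only shows the shape of the call. The builtin is more likely to pay off with access patterns the hardware can't predict.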

Thank you,
bye,
bearophile
