used like this (a* are inputs and r* are outputs):

transpose(aX, aY, aZ, aW, rX, rY, rZ, rW);


... the problem is, without multiple return values (come on, D should have
multiple return values!), how do you return the result? :)


Maybe those functions could be used to implement the functions that take
and return structs.


Yes... I've been pondering how to do this properly for ages actually. That's the main reason I haven't fleshed out any matrix functions yet; I'm
still not at all sold on how to represent the matrices.
Ideally, there should not be any memory access. But even if they pass by ref/pointer, as soon as the function is inlined, the memory access will
disappear, and it'll effectively generate the same code...

I meant having functions that would return through reference parameters. The transpose function above would have the signature transpose(float4, float4, float4, float4, ref float4, ref float4, ref float4, ref float4).
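
A minimal sketch of that ref-parameter form, for illustration only. Here float4 is a hypothetical stand-in (a plain float[4] alias) rather than the library's SIMD type, and the body is a scalar reference implementation, not an optimized shuffle sequence:

```d
// Hypothetical stand-in for a 4-lane SIMD vector (an assumption for this
// sketch); the real type would be a hardware vector, not a static array.
alias float4 = float[4];

// Transposes the 4x4 matrix whose rows are aX..aW and writes the
// transposed rows through the ref parameters rX..rW.
void transpose(float4 aX, float4 aY, float4 aZ, float4 aW,
               ref float4 rX, ref float4 rY, ref float4 rZ, ref float4 rW)
{
    // Static arrays are passed by value, so the inputs are copies and the
    // outputs may refer to the same variables the caller passed as inputs.
    rX = [aX[0], aY[0], aZ[0], aW[0]];
    rY = [aX[1], aY[1], aZ[1], aW[1]];
    rZ = [aX[2], aY[2], aZ[2], aW[2]];
    rW = [aX[3], aY[3], aZ[3], aW[3]];
}

unittest
{
    float4 x = [1.0f, 2.0f, 3.0f, 4.0f];
    float4 y = [5.0f, 6.0f, 7.0f, 8.0f];
    float4 z = [9.0f, 10.0f, 11.0f, 12.0f];
    float4 w = [13.0f, 14.0f, 15.0f, 16.0f];
    float4 rx, ry, rz, rw;
    transpose(x, y, z, w, rx, ry, rz, rw);
    assert(rx == [1.0f, 5.0f, 9.0f, 13.0f]);
    assert(rw == [4.0f, 8.0f, 12.0f, 16.0f]);
}
```

Declaring the result parameters as out instead of ref would probably express the intent equally well; the sketch keeps ref only to match the signature quoted above.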

Sure. I wasn't sure how useful they were in practice... I didn't want to load it with countless silly permutation routines, so I figured I'd add them by request, or as they prove useful in real-world apps. What would you typically do with the interleave functions at a high level? Are you sure you don't just use them as a component behind a few actually useful functions, which should be exposed instead?

I think they would be useful when you work with arrays of structs with two elements, such as complex numbers. For example, to square every element of a complex array you could do:

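for (size_t i = 0; i < a.length; i += 2)   // a.length is assumed to be even
{
   // Each float4 holds two complex numbers laid out as (re, im, re, im).
   float4 first = a[i];
   float4 second = a[i + 1];
   // Split the four complex numbers into real and imaginary vectors.
   float4 re = deinterleaveLow(first, second);
   float4 im = deinterleaveHigh(first, second);
   // (re + i*im)^2 = (re*re - im*im) + i*(2*re*im)
   float4 re2 = re * re - im * im;
   float4 im2 = re * im;
   im2 += im2;
   // Restore the interleaved layout and store the results.
   a[i] = interleaveLow(re2, im2);
   a[i + 1] = interleaveHigh(re2, im2);
}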

Interleave and deinterleave can also be useful when you want to shuffle data in some custom way. You can't cover all possible permutations of elements over multiple vectors in a library (unless you do something like A* search at compile time and generate code based on that, but that would probably be way too slow), but you can expose at least the capabilities that are common to most platforms, such as interleave and deinterleave.
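
For reference, the lane semantics that the complex-square loop relies on can be sketched in scalar code. These definitions are an assumption inferred from how the loop uses the four functions, not a confirmed library API, and float4 is a hypothetical float[4] stand-in for the real SIMD type:

```d
// Hypothetical stand-in for a 4-lane SIMD vector (assumption for this sketch).
alias float4 = float[4];

// Alternate the low two lanes of a and b: (a0, b0, a1, b1).
float4 interleaveLow(float4 a, float4 b)    { return [a[0], b[0], a[1], b[1]]; }

// Alternate the high two lanes of a and b: (a2, b2, a3, b3).
float4 interleaveHigh(float4 a, float4 b)   { return [a[2], b[2], a[3], b[3]]; }

// Gather the even lanes of a, then of b: (a0, a2, b0, b2).
float4 deinterleaveLow(float4 a, float4 b)  { return [a[0], a[2], b[0], b[2]]; }

// Gather the odd lanes of a, then of b: (a1, a3, b1, b3).
float4 deinterleaveHigh(float4 a, float4 b) { return [a[1], a[3], b[1], b[3]]; }

unittest
{
    // Four complex numbers stored as (re, im) pairs across two vectors.
    float4 first  = [1.0f, 2.0f, 3.0f, 4.0f];
    float4 second = [5.0f, 6.0f, 7.0f, 8.0f];

    float4 re = deinterleaveLow(first, second);
    float4 im = deinterleaveHigh(first, second);
    assert(re == [1.0f, 3.0f, 5.0f, 7.0f]);
    assert(im == [2.0f, 4.0f, 6.0f, 8.0f]);

    // Interleaving undoes the split, which is what lets the loop write
    // its results back in the original layout.
    assert(interleaveLow(re, im) == first);
    assert(interleaveHigh(re, im) == second);
}
```

Under these definitions the deinterleave pair and the interleave pair are inverses of each other, so any computation done on the separated re/im vectors can be stored back without further shuffling.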

