True, I have only been working in x86 GDC so far, but I just wanted to get
feedback about my approach and API design at this point.
It seems there are no serious objections, I'll continue as is.

I have one proposal about the API design of the matrix operations. Maybe there could be functions that take row vectors as parameters, in addition to those that take matrix structs. That way one could call matrix functions on data that isn't stored in matrix structs, without copying it first. For example, the transpose function would also have a version that is used like this (a* are inputs and r* are outputs):

transpose(aX, aY, aZ, aW, rX, rY, rZ, rW);

Maybe those functions could be used to implement the functions that take and return structs.
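
To make the idea concrete, here is a rough sketch in plain C++ rather than in terms of the actual library. The float4 and Matrix4 types are hypothetical stand-ins that I made up for illustration, and the bodies are scalar code, not SIMD. It only shows the shape of the API: a row-vector transpose, and a struct version built on top of it.

```cpp
#include <array>

// Hypothetical stand-ins for the library's types (assumptions, not the real
// API): a four-float vector and a 4x4 matrix stored as four rows.
using float4 = std::array<float, 4>;
struct Matrix4 { float4 x, y, z, w; };

// Row-vector form: a* are inputs, r* are outputs. Inputs are taken by value
// so the caller may pass the same variables as both inputs and outputs.
void transpose(float4 aX, float4 aY, float4 aZ, float4 aW,
               float4& rX, float4& rY, float4& rZ, float4& rW)
{
    rX = {aX[0], aY[0], aZ[0], aW[0]};
    rY = {aX[1], aY[1], aZ[1], aW[1]};
    rZ = {aX[2], aY[2], aZ[2], aW[2]};
    rW = {aX[3], aY[3], aZ[3], aW[3]};
}

// Struct form, implemented by forwarding to the row-vector form.
Matrix4 transpose(const Matrix4& m)
{
    Matrix4 r;
    transpose(m.x, m.y, m.z, m.w, r.x, r.y, r.z, r.w);
    return r;
}
```

With something like this, data that lives in four separate vectors can be transposed directly, and code that already has a matrix struct keeps using the struct overload.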

I also think that interleave and deinterleave operations would be useful. For four-element float vectors, each result vector can be produced with a single instruction, at least on SSE (unpcklps and unpckhps for interleave, shufps for deinterleave) and NEON (vzip and vuzp).
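
Just to pin down which element order I mean, here is a scalar reference sketch in plain C++. The function names, the float4 type and the parameter order are my own assumptions, not an existing API; a real implementation would use the instructions mentioned above instead of element-wise copies.

```cpp
#include <array>

// Hypothetical four-float vector type (an assumption for illustration).
using float4 = std::array<float, 4>;

// Interleave: (a0 a1 a2 a3), (b0 b1 b2 b3) -> (a0 b0 a1 b1), (a2 b2 a3 b3).
// On SSE the low result matches unpcklps and the high result unpckhps.
void interleave(float4 a, float4 b, float4& rLo, float4& rHi)
{
    rLo = {a[0], b[0], a[1], b[1]};
    rHi = {a[2], b[2], a[3], b[3]};
}

// Deinterleave: the inverse operation, gathering the even-indexed and the
// odd-indexed elements of the concatenation of a and b.
// On SSE each result matches one shufps with a suitable immediate.
void deinterleave(float4 a, float4 b, float4& rEven, float4& rOdd)
{
    rEven = {a[0], a[2], b[0], b[2]};
    rOdd  = {a[1], a[3], b[1], b[3]};
}
```

Calling deinterleave on the two outputs of interleave gives back the original a and b, which makes for a handy test of whichever SIMD versions end up in the library.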

I have an ARM compiler too now, so I'll be implementing/testing against that as
reference also.

Could you please tell me how you got the ARM compiler to work?
