I'm finding HEAPS of SIMD functions want to return pairs (unpacks in
particular):

int4 (low, high) = unpack(someShort8);

Currently I have to duplicate everything:

int4 low = unpackLow(someShort8);
int4 high = unpackHigh(someShort8);

I'm getting really sick of that, it feels so... last millennium.
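
For comparison, here is a rough sketch of how the pair-returning form can be written in C++ today. It is only an illustration under stated assumptions: plain arrays stand in for SIMD registers, the short8 and int4 aliases are made up for this example, and "unpack" is assumed to mean widening lanes 0-3 into the low result and lanes 4-7 into the high result.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>

// Illustrative stand-ins for SIMD register types (not a real SIMD API).
using short8 = std::array<std::int16_t, 8>;
using int4 = std::array<std::int32_t, 4>;

// Hypothetical unpack: widens all eight 16-bit lanes to 32 bits and
// returns both halves from a single call.
// .first holds lanes 0-3, .second holds lanes 4-7.
std::pair<int4, int4> unpack(const short8& v) {
    int4 low{};
    int4 high{};
    for (std::size_t i = 0; i < 4; ++i) {
        low[i] = v[i];       // implicit sign-extending conversion
        high[i] = v[i + 4];
    }
    return std::make_pair(low, high);
}
```

With C++17 structured bindings the call site reads close to the wished-for syntax: auto [low, high] = unpack(someShort8);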

It can also be really inefficient. For example, ARM NEON has a vzip instruction that is used like this:

vzip.32 q0, q1

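This interleaves the elements of the vectors in q0 and q1 in a single instruction and writes both results back to q0 and q1, so one instruction naturally produces a pair of outputs.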
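
To make the data movement concrete, here is a scalar model of what vzip.32 does to two four-lane registers. It is a sketch only: the uint32x4 alias and the zip32 name are invented for this example, and arrays stand in for q0 and q1.

```cpp
#include <array>
#include <cstdint>
#include <utility>

// Illustrative stand-in for a 128-bit register holding four 32-bit lanes.
using uint32x4 = std::array<std::uint32_t, 4>;

// Scalar model of vzip.32 q0, q1 (hypothetical helper name).
// Inputs:  a = {a0, a1, a2, a3}, b = {b0, b1, b2, b3}
// Outputs: first  = {a0, b0, a1, b1}  (what ends up in q0)
//          second = {a2, b2, a3, b3}  (what ends up in q1)
std::pair<uint32x4, uint32x4> zip32(const uint32x4& a, const uint32x4& b) {
    uint32x4 lo{{a[0], b[0], a[1], b[1]}};
    uint32x4 hi{{a[2], b[2], a[3], b[3]}};
    return std::make_pair(lo, hi);
}
```

Splitting this into separate "low" and "high" functions would force two calls for work the hardware does in one instruction, which is the inefficiency described above.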
