On 4/25/22 00:01, Paul Brook wrote:
+#define SHUFFLE4(F, a, b, offset) do { \
+r0 = a->F((order & 3) + offset);\
+r1 = a->F(((order >> 2) & 3) + offset); \
+r2 = b->F(((order >> 4) & 3) + offset); \
+r3 = b->F(((order >> 6) & 3) + offset); \
+d->F(offset) = r0;
These helpers need to take special care to avoid overwriting source values
before the wole result has been calculated. Currently they use a dummy
Reg typed variable to store the result then assign the whole register.
This will cause 128 bit operations to corrupt the upper half of the register,
so