ubyte16* masks = ...;
foreach (ref c; pixels) {
    c = __simd(XMM.PSHUFB, c, *masks);
}
I see that LDC has a shufflevector function, but it only accepts compile-time constant masks, and mine is a variable loaded at run time. Is a PSHUFB with a variable mask possible under LDC?
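To be clear about the behaviour I am after, here is a minimal scalar sketch of what I understand PSHUFB to compute when the mask is only known at run time. The name pshufbScalar and the sample mask are made up for illustration; this is not the DMD or LDC API, just a plain-D reference for the per-byte lookup.

```d
// Hypothetical scalar reference for PSHUFB (illustrative name, not a library call).
ubyte[16] pshufbScalar(const ubyte[16] src, const ubyte[16] mask)
{
    ubyte[16] result; // ubyte.init is 0, so every lane starts zeroed
    foreach (i; 0 .. 16)
    {
        // A mask byte with its high bit set leaves the lane at zero;
        // otherwise its low 4 bits pick which source byte to copy.
        if ((mask[i] & 0x80) == 0)
            result[i] = src[mask[i] & 0x0F];
    }
    return result;
}

unittest
{
    // Assumed example: swap channels 0 and 2 in each of four 4-byte pixels.
    ubyte[16] src  = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15];
    ubyte[16] mask = [2, 1, 0, 3, 6, 5, 4, 7, 10, 9, 8, 11, 14, 13, 12, 15];
    auto swapped = pshufbScalar(src, mask);
    assert(swapped[0] == 2 && swapped[2] == 0 && swapped[3] == 3);
}
```

The point is that the mask is an ordinary runtime value here, which is exactly what a constant-mask shuffle cannot express.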
BTW, shuffling channels within pixels using DMD's SIMD support is about 5 times faster than the plain scalar code on my machine :)