On 05/30/2015 02:17 PM, Chen Gang wrote:
> +    for (count = 0; count < 8; count++) {
> +        sel = (rsrcb >> (count * 8)) & 0xf;
> +        if (sel < 8) {
> +            vdst |= ((rdst >> (8 * sel)) & 0xff) << (count * 8);
> +        } else {
> +            vdst |= ((rsrc >> (8 * (8 - sel))) & 0xff) << (count * 8);

8 - sel is wrong; you wanted sel - 8.

That said, you can do better with masking operations.  And for brevity,
let count increment by 8.  E.g.

    uint64_t vdst = 0;
    int count;

    for (count = 0; count < 64; count += 8) {
        uint64_t sel = rsrcb >> count;
        uint64_t src = (sel & 8 ? rsrc : rdst);
        vdst |= ((src >> ((sel & 7) * 8)) & 0xff) << count;
    }


r~
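
For anyone who wants to check the two forms against each other, here is a
minimal standalone sketch.  The function names (shuffle_branch,
shuffle_mask, check_equivalence) and the test values are hypothetical and
not part of the patch; the first function is the quoted loop with the index
changed to sel - 8, the second is the masked loop from above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the quoted branchy loop, with sel - 8 as the index. */
static uint64_t shuffle_branch(uint64_t rdst, uint64_t rsrc, uint64_t rsrcb)
{
    uint64_t vdst = 0;
    int count;

    for (count = 0; count < 8; count++) {
        uint64_t sel = (rsrcb >> (count * 8)) & 0xf;
        if (sel < 8) {
            vdst |= ((rdst >> (8 * sel)) & 0xff) << (count * 8);
        } else {
            vdst |= ((rsrc >> (8 * (sel - 8))) & 0xff) << (count * 8);
        }
    }
    return vdst;
}

/* Hypothetical helper: the masked loop, with count stepping by 8 bits. */
static uint64_t shuffle_mask(uint64_t rdst, uint64_t rsrc, uint64_t rsrcb)
{
    uint64_t vdst = 0;
    int count;

    for (count = 0; count < 64; count += 8) {
        uint64_t sel = rsrcb >> count;            /* low 4 bits are the selector */
        uint64_t src = (sel & 8 ? rsrc : rdst);   /* bit 3 picks the source */
        vdst |= ((src >> ((sel & 7) * 8)) & 0xff) << count;
    }
    return vdst;
}

/* Compare both forms on a few arbitrary selector patterns. */
static void check_equivalence(void)
{
    const uint64_t rdst = 0x0706050403020100ULL;
    const uint64_t rsrc = 0x0f0e0d0c0b0a0908ULL;
    const uint64_t sels[] = {
        0x0001020304050607ULL,
        0x0f0e0d0c0b0a0908ULL,
        0x080f00070109020aULL,
        0xf8e7d6c5b4a39281ULL,   /* upper nibbles set; both forms ignore them */
    };
    unsigned i;

    for (i = 0; i < sizeof(sels) / sizeof(sels[0]); i++) {
        assert(shuffle_branch(rdst, rsrc, sels[i]) ==
               shuffle_mask(rdst, rsrc, sels[i]));
    }
}
```

With these inputs each source byte holds its own selector index (0 to 7 in
rdst, 8 to 15 in rsrc), so when every selector byte is in the range 0 to 15
the result should simply equal rsrcb, which makes a wrong index easy to spot.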