The meat of this series is in the second patch, which makes the AArch64
backend look for shuffle masks that can be turned into EXT instructions, and
updates the vext[q]_* Neon intrinsics to use __builtin_shuffle rather than the
current inline assembler. This then produces the same instructions (unless the
midend can do better).
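
To illustrate the idea, here is a minimal sketch in portable C of what an
EXT-shaped shuffle mask looks like for 8-lane vectors. It is not code from the
patch: the helpers shuffle2, ext_mask and mask_is_ext are hypothetical names,
and the check is a simplified model (consecutive indices into the
concatenation of the two inputs, with no handling of wrap-around or swapped
operands), not the backend's actual matching logic.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 8 /* assumed lane count, as for a 64-bit vector of bytes */

/* Hypothetical model of a two-input shuffle: mask values 0..LANES-1
   select from a, values LANES..2*LANES-1 select from b.  */
static void shuffle2(const uint8_t *a, const uint8_t *b,
                     const unsigned *mask, uint8_t *out)
{
    for (size_t i = 0; i < LANES; i++) {
        unsigned idx = mask[i] % (2 * LANES);
        out[i] = idx < LANES ? a[idx] : b[idx - LANES];
    }
}

/* Build the mask corresponding to extracting LANES elements starting
   at offset n of the concatenation a:b, i.e. { n, n+1, ..., n+LANES-1 }.  */
static void ext_mask(unsigned n, unsigned *mask)
{
    for (size_t i = 0; i < LANES; i++)
        mask[i] = n + (unsigned)i;
}

/* Simplified check: return 1 and store the offset if the mask is a run
   of consecutive indices starting inside the first input, else 0.  */
static int mask_is_ext(const unsigned *mask, unsigned *n_out)
{
    if (mask[0] >= LANES)
        return 0;
    for (size_t i = 1; i < LANES; i++)
        if (mask[i] != mask[0] + i)
            return 0;
    *n_out = mask[0];
    return 1;
}
```

With a = {0..7}, b = {8..15} and offset 3, the mask is {3,4,...,10} and the
shuffle selects a[3..7] followed by b[0..2], which is the element order that
vext_u8 (a, b, 3) is expected to give.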

Before that, the first patch adds execution + assembler tests of the existing
intrinsics, which then serve as testcases for the second patch.

The third patch reuses the test bodies from the first patch in equivalent
tests on the ARM architecture.

Ok for trunk?

--Alan
