https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125550
--- Comment #2 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Artemiy Volkov <[email protected]>:

https://gcc.gnu.org/g:8076bc965f8aebafcb18739492432a19ee62a17c

commit r17-1448-g8076bc965f8aebafcb18739492432a19ee62a17c
Author: Artemiy Volkov <[email protected]>
Date:   Tue Jun 2 08:53:40 2026 +0000

    aarch64: use ZIP1 instead of UZP1 for concatenation [PR125550]

    This patch addresses the issue in PR125550, where two float16 values are
    being concatenated using UZP1, i.e., this code:

        svfloat16_t foo (float x0, float x1)
        {
          return svdupq_n_f16 (x0, x1, x0, x1, x0, x1, x0, x1);
        }

    is being compiled into:

        fcvt    h0, s0
        fcvt    h1, s1
        uzp1    v0.4h, v0.4h, v1.4h
        mov     z0.s, s0
        ret

    causing the duplication of a 2-element vector ((float16) x0, 0) into z0.

    This is a copy-paste error from the original combine_internal patterns,
    where UZP1 always operates on vectors of 2 elements, in which circumstance
    it is equivalent to ZIP1.  For smaller element sizes (and thus higher
    element counts) only ZIP1 is correct.

    The fix is to emit ZIP1 when concatenating values on vector registers.
    For consistency, I've changed the original combine_internal patterns as
    well as the ones added in r17-898-g920eeb67a3537b.  Since this latter
    change has nothing to do with the PR, it could have been better to split
    the patch in two; I'd be happy to do that if necessary.

    Both aforementioned changes required adjusting existing AdvSIMD/SVE
    vec_init-related testcases; I've added pr125550.c from the PR on top of
    that as well.

    Bootstrapped and regtested on aarch64-linux-gnu.

            PR target/125550

    gcc/ChangeLog:

            * config/aarch64/aarch64-simd.md (*aarch64_combine_internal<mode>):
            Use zip1 instead of uzp1 to concatenate values residing in SIMD
            registers.
            (*aarch64_combine_internal_be<mode>): Likewise.

    gcc/testsuite/ChangeLog:

            * gcc.target/aarch64/ldp_stp_16.c: Adjust testcases.
            * gcc.target/aarch64/pr109072_1.c: Likewise.
            * gcc.target/aarch64/simd/mf8_data_1.c: Likewise.
            * gcc.target/aarch64/sve/vec_init_5.c: Likewise.
            * gcc.target/aarch64/vec-init-14.c: Likewise.
            * gcc.target/aarch64/vec-init-23.c: Likewise.
            * gcc.target/aarch64/vec-init-9.c: Likewise.
            * gcc.target/aarch64/sve/pr125550.c: New test.
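
The claim that UZP1 and ZIP1 coincide only for 2-element vectors can be checked with a small scalar model. The sketch below is illustrative only and is not part of the patch: the functions zip1 and uzp1 are hypothetical C stand-ins for the instructions, and plain ints stand in for float16 lanes.

```c
#include <assert.h>

/* Scalar model of ZIP1 on n-lane vectors (illustrative, not GCC code):
   interleave the low halves of a and b.  */
static void zip1(const int *a, const int *b, int *out, int n)
{
    assert(n >= 2 && n % 2 == 0);
    for (int i = 0; i < n / 2; i++) {
        out[2 * i] = a[i];
        out[2 * i + 1] = b[i];
    }
}

/* Scalar model of UZP1 on n-lane vectors (illustrative, not GCC code):
   take the even-numbered lanes of a, followed by those of b.  */
static void uzp1(const int *a, const int *b, int *out, int n)
{
    assert(n >= 2 && n % 2 == 0);
    for (int i = 0; i < n / 2; i++) {
        out[i] = a[2 * i];
        out[n / 2 + i] = b[2 * i];
    }
}
```

With 4 lanes, a = {x0, 0, 0, 0} and b = {x1, 0, 0, 0}, the zip1 model yields {x0, x1, 0, 0}, whereas the uzp1 model yields {x0, 0, x1, 0}, so the low two lanes are (x0, 0), matching the miscompiled result described in the commit message. With 2 lanes, both models yield {a[0], b[0]}.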
