https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104882

            Bug ID: 104882
           Summary: [12 Regression] MVE: Wrong code at -O2 since
                    r12-1434-g046a3beb1673bf4a61c131373b6a5e84158e92bf
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

Created attachment 52608
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52608&action=edit
broken assembly output

The following code:

int i;
char src[1072];
char dst[72];
int main()
{
  for (i = 0; i < 128; i++)
    src[i] = i;
  __builtin_memcpy(dst, src, 7);
  for (i = 0; i < 7; i++)
    if (dst[i] != i)
      __builtin_abort();
}

is miscompiled at -O2 since vectorization was enabled at -O2. With -O2
-ftree-vectorize, it is miscompiled earlier, starting with:

commit 046a3beb1673bf4a61c131373b6a5e84158e92bf
Author: Christophe Lyon <christophe.l...@linaro.org>
Date:   Thu Jun 3 15:35:50 2021

    arm: Auto-vectorization for MVE: add pack/unpack patterns

It looks like we do some dubious packing of vector elements before storing to
src. If I change the last loop to print the elements of dst instead, I see:

0 8 4 12 1 9 5

It should of course print:

0 1 2 3 4 5 6

The broken code is attached. The testcase above was reduced from
gcc/testsuite/gcc.c-torture/execute/memcpy-1.c.