https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93720
Bug ID: 93720
Summary: [10 Regression] vector creation from two parts of two vectors produces TBL rather than ins
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: pinskia at gcc dot gnu.org
Target Milestone: ---
Target: aarch64*-*-*

Take:
#define vector __attribute__((vector_size(2*sizeof(double) )))

vector double
test_vpasted2 (vector double low, vector double high)
{
    return (vector double){low[0], high[1]};
}
--- CUT ---
This produces on the trunk:
        adrp    x0, .LC0
        ldr     q2, [x0, #:lo12:.LC0]
        tbl     v0.16b, {v0.16b - v1.16b}, v2.16b
        ret
---- CUT ----
Where LC0 is {0..7,24..31}

When really this should produce just ins.
GCC 8.2 produces:
        dup     v0.2d, v0.d[0]
        ins     v0.d[1], v1.d[1]

But GCC 9 produces the correct thing of just an ins (though I don't have a compiler to prove that).

The problem is GCC 10 converts the above code to:
  _4 = VEC_PERM_EXPR <low_1(D), high_2(D), { 0, 3 }>;

Which the back-end does not optimize to do an ins.
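
For reference, here is a rough scalar sketch of how the selector { 0, 3 } relates to the TBL mask {0..7,24..31} quoted above. It is not GCC code: the helper names sel_to_tbl_mask and perm2_f64 are hypothetical, and it assumes little-endian lane numbering, where indices 0..1 select from the first input and 2..3 from the second.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of GCC): expand a two-input element
   selector, as used by VEC_PERM_EXPR, into the byte-index mask that a
   two-register TBL would consume.  Element index k covers bytes
   k * elt_size .. k * elt_size + elt_size - 1 of the concatenated inputs. */
static void sel_to_tbl_mask(const unsigned *sel, size_t nelts,
                            size_t elt_size, uint8_t *mask)
{
    for (size_t i = 0; i < nelts; i++)
        for (size_t b = 0; b < elt_size; b++)
            mask[i * elt_size + b] = (uint8_t)(sel[i] * elt_size + b);
}

/* Hypothetical scalar model of a two-input permute on 2 x double:
   selector values 0..1 pick from low, 2..3 pick from high. */
static void perm2_f64(const double low[2], const double high[2],
                      const unsigned sel[2], double out[2])
{
    for (size_t i = 0; i < 2; i++)
        out[i] = sel[i] < 2 ? low[sel[i]] : high[sel[i] - 2];
}
```

With sel = { 0, 3 } and 8-byte elements, the mask comes out as bytes 0..7 followed by 24..31, and the modelled result is { low[0], high[1] }. Lane 0 of the result is already lane 0 of the first input, so only lane 1 has to change, which is why a single ins v0.d[1], v1.d[1] would be enough.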