https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228
--- Comment #2 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Peter Bergner from comment #1)
> Confirmed.  The testsuite log shows for vsx-extract-6.c and vsx-extract-7.c:
>
> gcc.target/powerpc/vsx-extract-6.c: \\mxxpermdi\\M found 2 times
> FAIL: gcc.target/powerpc/vsx-extract-6.c scan-assembler-times \\mxxpermdi\\M
> 1
> FAIL: gcc.target/powerpc/vsx-extract-6.c scan-assembler-not \\mvspltisw\\M
>
> So we have an extra xxpermdi than we expected and we also have a vspltisw
> when we expected none.  I haven't looked at whether the code is better or
> worse though, to know whether we should just update the expected counts or
> whether this is really a code quality regression.

The commit makes the vsx-extract-6.c end up with:

test_vpasted:
.LFB0:
        .cfi_startproc
        xxspltib 0,0
        xxpermdi 34,34,0,1
        xxpermdi 34,34,35,1
        blr

instead of (the original expected):

test_vpasted:
.LFB0:
        .cfi_startproc
        xxpermdi 34,34,35,1
        blr

I think it's a code quality regression.

The optimized gimple IR is changed to:

__vector unsigned long long test_vpasted (__vector unsigned long long high,
__vector unsigned long long low)
{
  __vector unsigned long long res;

  <bb 2> [local count: 1073741824]:
  res_3 = VEC_PERM_EXPR <res_2(D), high_1(D), { 0, 3 }>;
  res_5 = VEC_PERM_EXPR <low_4(D), res_3, { 0, 3 }>;
  return res_5;

}

from:

__vector unsigned long long test_vpasted (__vector unsigned long long high,
__vector unsigned long long low)
{
  __vector unsigned long long res;
  long long unsigned int _1;
  long long unsigned int _2;

  <bb 2> [local count: 1073741824]:
  _1 = BIT_FIELD_REF <high_3(D), 64, 64>;
  res_5 = BIT_INSERT_EXPR <res_4(D), _1, 64 (64 bits)>;
  _2 = BIT_FIELD_REF <low_6(D), 64, 0>;
  res_7 = BIT_INSERT_EXPR <res_5, _2, 0 (64 bits)>;
  return res_7;

}

For gimple IRs:

  res_3 = VEC_PERM_EXPR <res_2(D), high_1(D), { 0, 3 }>;
  res_5 = VEC_PERM_EXPR <low_4(D), res_3, { 0, 3 }>;

I'd expect it can be further optimized into

  res_5 = VEC_PERM_EXPR <low_4(D), high_1(D), { 0, 3 }>;
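
To illustrate why the two permutes should be foldable into one, here is a minimal scalar model in plain C. It is only a sketch, not GCC code: the v2di struct and the helpers vec_perm, two_perms, one_perm and check_fold are hypothetical names made up for this illustration, and the model assumes the usual VEC_PERM_EXPR convention for two-element vectors, where selector values 0..1 pick from the first operand and 2..3 pick from the second. The undefined input res_2(D) is modelled as an arbitrary value that must not reach the result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-lane vector of 64-bit elements, standing in for
   __vector unsigned long long in this model.  */
typedef struct { uint64_t e[2]; } v2di;

/* Model of VEC_PERM_EXPR <a, b, { s0, s1 }> for two-element vectors
   (assumed convention): selector values 0..1 index into a, values
   2..3 index into b.  */
v2di vec_perm (v2di a, v2di b, unsigned s0, unsigned s1)
{
  v2di r;
  r.e[0] = s0 < 2 ? a.e[s0] : b.e[s0 - 2];
  r.e[1] = s1 < 2 ? a.e[s1] : b.e[s1 - 2];
  return r;
}

/* The two-permute sequence from the new optimized dump.  The undef
   argument models res_2(D), whose contents are arbitrary.  */
v2di two_perms (v2di high, v2di low, v2di undef)
{
  v2di res_3 = vec_perm (undef, high, 0, 3);  /* { undef[0], high[1] } */
  return vec_perm (low, res_3, 0, 3);         /* { low[0], res_3[1] } */
}

/* The single permute the sequence is expected to fold into.  */
v2di one_perm (v2di high, v2di low)
{
  return vec_perm (low, high, 0, 3);          /* { low[0], high[1] } */
}

/* Check that both forms agree lane by lane for the given inputs.  */
void check_fold (v2di high, v2di low, v2di undef)
{
  v2di x = two_perms (high, low, undef);
  v2di y = one_perm (high, low);
  assert (x.e[0] == y.e[0]);
  assert (x.e[1] == y.e[1]);
}
```

In this model, lane 0 of res_3 (the only lane that comes from the undefined res_2(D)) is never selected by the second permute, so the result depends only on low[0] and high[1]. That is the same value the original BIT_FIELD_REF/BIT_INSERT_EXPR sequence builds, and it matches the single xxpermdi in the expected assembly.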