https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67636
            Bug ID: 67636
           Summary: [6 Regression][SH] gcc.target/sh/pr54236-1.c failures
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: olegendo at gcc dot gnu.org
                CC: segher at gcc dot gnu.org
  Target Milestone: ---
            Target: sh*-*-*

The following new failure popped up a while ago.

non-SH2A:
FAIL: gcc.target/sh/pr54236-1.c scan-assembler-times negc 2

SH2A:
FAIL: gcc.target/sh/pr54236-1.c scan-assembler-times bld 1
FAIL: gcc.target/sh/pr54236-1.c scan-assembler-times movt 1

The failing sub test is:

int test_07 (int *vec)
{
  /* Must not see a 'sett' or 'addc' here.
     This is a case where combine tries to produce
     'a + (0 - b) + 1' out of 'a - b + 1'.
     On non-SH2A there is a 'tst + negc', on SH2A a 'bld + movt'.  */
  int z = vec[0];
  int vi = vec[1];
  int zi = vec[2];

  if (zi != 0 && z < -1)
    vi -= (((vi >> 7) & 0x01) << 1) - 1;

  return vi;
}

For non-SH2A GCC 5 produces:
        mov.l   @(8,r4),r1
        tst     r1,r1
        bt/s    .L9
        mov.l   @(4,r4),r0
        mov.l   @r4,r1
        mov     #-1,r2
        cmp/gt  r1,r2
        bf/s    .L9
        tst     #128,r0
        mov     #-1,r1
        negc    r1,r1
        add     r1,r1           r1 = T << 1
        sub     r1,r0           r0 = r0 - (T << 1)
        add     #1,r0           r0 = r0 - (T << 1) + 1
.L9:
        rts
        nop

For SH2A GCC 5 produces:
        mov.l   @(8,r4),r1
        mov.l   @(4,r4),r0
        tst     r1,r1
        bt.s    .L9
        mov     #-1,r1
        mov.l   @r4,r2
        cmp/ge  r1,r2
        bt.s    .L9
        bld     #7,r0
        movt    r1
        add     r1,r1
        sub     r1,r0
        add     #1,r0
.L9:
        rts/n

Now, trunk produces for both, non-SH2A and SH2A (only the relevant BB):
        ...
        bt.s    .L9
        mov     r0,r1
        shlr2   r1
        shlr2   r1
        shlr2   r1
        mov     #2,r2
        and     r2,r1
        sub     r1,r0
        add     #1,r0

In GCC 5 combine was using zero_extract for the bit test.  Now it tries
something like:

Failed to match this instruction:
(set (reg/v:SI 162 [ vi ])
    (plus:SI (minus:SI (reg/v:SI 162 [ vi ])
            (and:SI (lshiftrt:SI (reg/v:SI 162 [ vi ])
                    (const_int 6 [0x6]))
                (const_int 2 [0x2])))
        (const_int 1 [0x1])))

i.e.  x - ((y >> 6) & 2) + 1

Both GCC 5 and trunk are not optimal.
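
As a quick sanity check of the expressions above, here is a minimal C sketch.
It compares the update from test_07 with the form combine now tries,
x - ((y >> 6) & 2) + 1 with x = y = vi, and with a step-by-step model of the
GCC 5 non-SH2A tail (tst #128 / negc / add / sub / add #1).  The helper names
update_source, update_combine, update_gcc5_nonsh2a and forms_agree are made up
for this illustration, and the insn semantics in the comments are my reading
of the SH instructions, not something taken from the GCC sources.

```c
#include <stdint.h>

/* Registers are modelled as uint32_t so that wrap-around is well defined.  */

/* vi -= (((vi >> 7) & 0x01) << 1) - 1;  as written in test_07.  */
static uint32_t update_source (uint32_t vi)
{
  return vi - ((((vi >> 7) & 0x01u) << 1) - 1u);
}

/* x - ((y >> 6) & 2) + 1 with x = y = vi, the form combine tries on trunk.  */
static uint32_t update_combine (uint32_t vi)
{
  return vi - ((vi >> 6) & 2u) + 1u;
}

/* GCC 5 non-SH2A tail, one statement per insn (assumed semantics).  */
static uint32_t update_gcc5_nonsh2a (uint32_t r0)
{
  uint32_t t = (r0 & 128u) == 0;  /* tst  #128,r0 : T = 1 if bit 7 is clear */
  uint32_t r1 = (uint32_t) -1;    /* mov  #-1,r1 */
  r1 = 0u - r1 - t;               /* negc r1,r1   : r1 = 0 - r1 - T = 1 - T */
  r1 = r1 + r1;                   /* add  r1,r1 */
  r0 = r0 - r1;                   /* sub  r1,r0 */
  r0 = r0 + 1u;                   /* add  #1,r0 */
  return r0;
}

/* Returns 1 if all three forms agree for every sample value.  */
static int forms_agree (void)
{
  static const uint32_t samples[] = {
    0u, 1u, 0x7fu, 0x80u, 0xffu, 0x100u,
    0x7fffffffu, 0x80000000u, 0xffffffffu
  };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      uint32_t v = samples[i];
      if (update_source (v) != update_combine (v)
          || update_source (v) != update_gcc5_nonsh2a (v))
        return 0;
    }
  return 1;
}
```

In this model the result is vi - 1 when bit 7 of vi is set and vi + 1 when it
is clear, which is what all three forms have to reproduce.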

On non-SH2A the tst insn sets T when bit 7 is clear, so the T bit calculation
goes like this:

  r0 = r0 - 2*(1 - T) + 1
     = r0 - 2 + 2*T + 1
     = r0 - 1 + 2*T

which could be realized as:
        tst     #128,r0
        mov     #-1,r1
        movt    r2
        addc    r1,r0
        add     r2,r0

On SH2A the bld insn can be used.  It copies bit 7 into T directly, so the
calculation is r0 = r0 - 2*T + 1, which could be realized with subc:
        bld     #7,r0
        movt    r2
        mov     #-1,r1
        subc    r1,r0
        sub     r2,r0

If the constant for addc/subc is hoisted out of the loop, this results in
4 insns for non-SH2A and for SH2A.

However, without the zero_extract this is a bit clunky to catch.  It'd be
easier if this sub-rtx:

(and:SI (lshiftrt:SI (reg/v:SI 162 [ vi ])
        (const_int 6 [0x6]))
    (const_int 2 [0x2]))

... was matched as:

(ashift:SI (zero_extract:SI (reg/v:SI 162 [ vi ])
        (const_int 1)
        (const_int 7))
    (const_int 1))
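
To illustrate why the two sub-rtx forms above describe the same value, here
is a small C sketch.  zero_extract_model is a hypothetical helper that mimics
(zero_extract:SI x (const_int len) (const_int pos)) under the assumption that
pos counts from the least significant bit; it is not a GCC function, and the
other names are likewise made up for this example.

```c
#include <stdint.h>

/* Hypothetical model of (zero_extract:SI x (const_int len) (const_int pos)),
   assuming pos counts from the least significant bit and 0 < len < 32.  */
static uint32_t zero_extract_model (uint32_t x, unsigned len, unsigned pos)
{
  return (x >> pos) & ((1u << len) - 1u);
}

/* (and:SI (lshiftrt:SI x (const_int 6)) (const_int 2))  */
static uint32_t and_lshiftrt_form (uint32_t x)
{
  return (x >> 6) & 2u;
}

/* (ashift:SI (zero_extract:SI x (const_int 1) (const_int 7)) (const_int 1))  */
static uint32_t ashift_zero_extract_form (uint32_t x)
{
  return zero_extract_model (x, 1, 7) << 1;
}

/* Returns 1 if both forms agree on all 256 values of the low byte,
   tried with a few different settings of the upper bits.  */
static int subrtx_forms_agree (void)
{
  static const uint32_t upper[] = { 0u, 0x100u, 0xffffff00u };
  for (unsigned u = 0; u < sizeof upper / sizeof upper[0]; u++)
    for (uint32_t lo = 0; lo < 256u; lo++)
      {
        uint32_t x = upper[u] | lo;
        if (and_lshiftrt_form (x) != ashift_zero_extract_form (x))
          return 0;
      }
  return 1;
}
```

Both forms yield 2 when bit 7 is set and 0 otherwise.  The second form keeps
the single-bit test visible as a zero_extract, which is the shape the bit test
had in GCC 5.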