[Bug target/53976] [SH] Unnecessary clrt after bt
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976

--- Comment #7 from Oleg Endo olegendo at gcc dot gnu.org ---
Author: olegendo
Date: Mon Nov 25 16:47:16 2013
New Revision: 205358

URL: http://gcc.gnu.org/viewcvs?rev=205358&root=gcc&view=rev
Log:
    PR target/53976
    PR target/59243
    * config/sh/sh_optimize_sett_clrt.cc (struct ccreg_value): Update
    comments.
    (sh_optimize_sett_clrt::find_last_ccreg_values): Check stack of
    previously visited basic blocks before recursing instead of only one
    basic block.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh_optimize_sett_clrt.cc
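The log entry above describes checking a stack of previously visited basic blocks before recursing. The following is a minimal stand-alone sketch of that idea in C, not the code from sh_optimize_sett_clrt.cc. The names `struct bb`, `on_stack` and `walk_preds`, the fixed array sizes and the toy CFG representation are all assumptions made for illustration. It is meant to show why comparing only against the single previously visited block is not enough once the CFG contains a cycle of three or more blocks.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PREDS 4
#define MAX_DEPTH 64

/* Hypothetical toy basic block: predecessor edges plus a flag telling
   whether the block itself contains an insn that sets the T bit.  */
struct bb
{
  int sets_t;
  size_t n_preds;
  const struct bb *preds[MAX_PREDS];
};

/* Return nonzero if B is among the DEPTH blocks currently being visited.  */
static int
on_stack (const struct bb *b, const struct bb **stack, size_t depth)
{
  for (size_t i = 0; i < depth; i++)
    if (stack[i] == b)
      return 1;
  return 0;
}

/* Walk predecessor edges backwards from B and count the paths that end
   in a block setting the T bit.  Predecessors that are already on the
   visit stack are skipped, so cycles of any length terminate.  */
static int
walk_preds (const struct bb *b, const struct bb **stack, size_t depth)
{
  assert (depth <= MAX_DEPTH);

  if (b->sets_t)
    return 1;
  if (depth == MAX_DEPTH)
    return 0;   /* Give up on very deep CFGs in this sketch.  */

  stack[depth++] = b;

  int found = 0;
  for (size_t i = 0; i < b->n_preds; i++)
    if (!on_stack (b->preds[i], stack, depth))
      found += walk_preds (b->preds[i], stack, depth);
  return found;
}
```

With a three-block loop b1 -> b2 -> b3 -> b1, a walk starting at b3 reaches b1 via b2. A check against only the immediately preceding block (b2) would then recurse into b3 again and never terminate, whereas the stack check skips it.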
[Bug target/53976] [SH] Unnecessary clrt after bt
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976

--- Comment #5 from Oleg Endo olegendo at gcc dot gnu.org ---
Author: olegendo
Date: Thu Nov 21 08:19:38 2013
New Revision: 205191

URL: http://gcc.gnu.org/viewcvs?rev=205191&root=gcc&view=rev
Log:
    PR target/53976
    * config/sh/sh_optimize_sett_clrt.cc: New SH specific RTL pass.
    * config/sh/sh.c (register_sh_passes): Add sh_optimize_sett_clrt pass.
    * config/sh/sh/t-sh (sh_optimize_sett_clrt pass.o): New entry.
    * config.gcc (sh[123456789lbe]*-*-* | sh-*-*): Add
    sh_optimize_sett_clrt pass.o to extra_objs.

    PR target/53976
    * gcc.target/sh/pr53976-1.c: New.

Added:
    trunk/gcc/config/sh/sh_optimize_sett_clrt.cc
    trunk/gcc/testsuite/gcc.target/sh/pr53976-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config.gcc
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/t-sh
    trunk/gcc/testsuite/ChangeLog
[Bug target/53976] [SH] Unnecessary clrt after bt
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976

--- Comment #6 from Oleg Endo olegendo at gcc dot gnu.org ---
(In reply to Oleg Endo from comment #4)
> One option to get rid of the redundant clrt and sett in BBs that are reached
> with a conditional branch would be to add an SH specific RTL pass that
> analyses the BBs and eliminates the insns in question.
>
> Another option could be to try and inject artificial sett / clrt insns at
> the start of BBs that are reached by conditional branches, and then split
> them away to nops or output empty asm with insn length 0. The idea would be
> to let other already existing RTL passes figure out the redundant T bit sets.

I've decided to do it with an RTL pass, as it's easier and less obscure.
The initial version committed in r205191 only eliminates redundant sett / clrt
insns. However, there are also some opportunities to e.g. hoist sett / clrt
insns out of loops:

long long test0 (long long* a, unsigned int c)
{
  long long s = 0;
  do
    s += *a++;
  while (--c);
  return s;
}

Currently compiles to:

_test0:
        mov     #0,r0
        mov     #0,r1
        .align 2
.L3:
        mov.l   @r4+,r2
        mov.l   @r4+,r3
        clrt
        addc    r3,r1
        addc    r2,r0
        add     #-1,r5
        tst     r5,r5
        bf      .L3
        rts
        nop

The previous T bit value at the clrt insn in the loop basic block is currently
detected to have an unknown value from the first basic block and value = 0
after the end of the loop. In this case the clrt insn can be removed from the
loop and put into the first basic block:

_test0:
        mov     #0,r0
        mov     #0,r1
        clrt
        .align 2
.L3:
        mov.l   @r4+,r2
        mov.l   @r4+,r3
        addc    r3,r1
        addc    r2,r0
        add     #-1,r5
        tst     r5,r5
        bf      .L3
        rts
        nop
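The decision described above (delete the clrt, hoist it, or leave it alone) can be sketched as a small classification over the T bit values arriving at the loop header. This is only an illustrative model in C and not how the pass is implemented; `enum t_value`, `struct in_edge` and `classify_clrt` are hypothetical names. It also assumes that a hoisted clrt can be placed where nothing between it and the loop clobbers the T bit. For test0, the back edge carries T = 0 because `bf .L3` branches only when the T bit is clear.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lattice for the T bit value known on an incoming edge.  */
enum t_value { T_UNKNOWN, T_ZERO, T_ONE };

/* Possible outcomes for a clrt at the top of a loop header.  */
enum clrt_action { CLRT_KEEP, CLRT_DELETE, CLRT_HOIST };

/* One incoming edge of the loop header.  */
struct in_edge
{
  enum t_value t;   /* T bit value at the end of the predecessor.  */
  int from_loop;    /* Nonzero for back edges from inside the loop.  */
};

/* Classify a clrt that is the first T bit related insn of a loop header
   with N incoming EDGES.  */
static enum clrt_action
classify_clrt (const struct in_edge *edges, size_t n)
{
  assert (edges != NULL || n == 0);

  if (n == 0)
    return CLRT_KEEP;   /* Nothing known, e.g. function entry.  */

  int all_zero = 1, loop_zero = 1, have_loop_edge = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (edges[i].from_loop)
        have_loop_edge = 1;
      if (edges[i].t != T_ZERO)
        {
          all_zero = 0;
          if (edges[i].from_loop)
            loop_zero = 0;
        }
    }

  if (all_zero)
    return CLRT_DELETE;   /* T is already 0 on every incoming edge.  */
  if (have_loop_edge && loop_zero)
    return CLRT_HOIST;    /* Only the entry edges still need the clrt.  */
  return CLRT_KEEP;
}
```

For test0 the header has two incoming edges, { T_UNKNOWN, 0 } from the first basic block and { T_ZERO, 1 } from the loop end, which this sketch classifies as CLRT_HOIST.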
[Bug target/53976] [SH] Unnecessary clrt after bt
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976

--- Comment #4 from Oleg Endo olegendo at gcc dot gnu.org ---
One option to get rid of the redundant clrt and sett in BBs that are reached
with a conditional branch would be to add an SH specific RTL pass that
analyses the BBs and eliminates the insns in question.

Another option could be to try and inject artificial sett / clrt insns at the
start of BBs that are reached by conditional branches, and then split them
away to nops or output empty asm with insn length 0. The idea would be to let
other already existing RTL passes figure out the redundant T bit sets.

There is already a similar functionality in cse.c (cse_condition_code_reg),
but it probably wouldn't kick in, since clrt / sett insns are not emitted
until the split1 pass before register allocation, and cse2 runs before that.
But still, maybe another RTL pass would figure it out. It doesn't necessarily
need to be before register allocation, since the T bit is a hard-reg anyway.
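Both options rely on the same observation: a conditional branch on the T bit fixes the T bit value on each of its outgoing edges. A minimal sketch of that reasoning in C follows. The enums and the functions `t_on_edge` and `leading_set_is_redundant` are hypothetical and exist only for this illustration; they are not GCC internals. The sketch also assumes that no delay slot insn modifies the T bit.

```c
#include <assert.h>

/* bt branches if T == 1, bf branches if T == 0.  */
enum branch_kind { BRANCH_BT, BRANCH_BF };
enum edge_kind { EDGE_TAKEN, EDGE_FALLTHRU };

/* Return the T bit value (0 or 1) that is known to hold on edge E
   leaving a conditional branch of kind BR.  */
static int
t_on_edge (enum branch_kind br, enum edge_kind e)
{
  int taken_value = (br == BRANCH_BT) ? 1 : 0;
  return e == EDGE_TAKEN ? taken_value : !taken_value;
}

/* A sett (SET_VALUE == 1) or clrt (SET_VALUE == 0) at the start of a BB
   that is reached only over edge E of branch BR is redundant if the edge
   already implies the same T bit value.  */
static int
leading_set_is_redundant (int set_value, enum branch_kind br,
                          enum edge_kind e)
{
  assert (set_value == 0 || set_value == 1);
  return t_on_edge (br, e) == set_value;
}
```

The case in the bug title, a clrt in the fall-through block after a bt, corresponds to `leading_set_is_redundant (0, BRANCH_BT, EDGE_FALLTHRU)`.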
[Bug target/53976] [SH] Unnecessary clrt after bt
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976

--- Comment #3 from Oleg Endo olegendo at gcc dot gnu.org ---
(In reply to Oleg Endo from comment #2)
> Interestingly, the following function shows some improved behavior (notice
> the removed volatile mem store):
>
> int test_2_1 (int* a, int b, int c)
> {
>   a[1] = b != 0;
>
>   if (b == 0)
>     a[10] = c;
>
>   return b == 0;
> }
>
> -O2 -m2a:
>         tst     r5,r5
>         movrt   r1
>         mov.l   r1,@(4,r4)
>         bf      .L4
>         mov.l   r6,@(40,r4)
> .L4:
>         rts
>         movt    r0
>
> This is already minimal.
> However, for non-SH2A it's still the same:
>
>         tst     r5,r5
>         mov     #-1,r1
>         negc    r1,r1
>         tst     r5,r5
>         bf/s    .L4
>         mov.l   r1,@(4,r4)
>         mov.l   r6,@(40,r4)
>         tst     r5,r5
> .L4:
>         rts
>         movt    r0

One of the problems in this case is that negc clobbers the T bit. Another
alternative,

        movt    r0
        xor     #1,r0

should be selected here. This could be done by looking at the insns around
the negc-movrt and checking whether some insn after the negc-movrt sets the
T bit in the same way as it was set before the negc-movrt. In this case not
clobbering the T bit would eliminate the redundant test. However, if this
pattern occurs in a loop or pressure on R0 is high, using negc and the
redundant test is probably going to be better.
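The difference between the two movrt replacements can be modelled in a few lines of C. This is a sketch based on the documented semantics of the SH negc, movt and xor instructions; the function names are hypothetical, and the T bit is passed around as a plain unsigned variable rather than as part of a full CPU state. Both sequences compute the inverted T bit, but only the negc variant overwrites the T bit as a side effect.

```c
#include <assert.h>
#include <stdint.h>

/* negc Rm,Rn: Rn = 0 - Rm - T, and T receives the borrow.  */
static uint32_t
negc (uint32_t rm, unsigned *t)
{
  assert (*t <= 1);
  uint32_t temp = 0u - rm;
  uint32_t rn = temp - *t;
  *t = (0u < temp) || (temp < rn);
  return rn;
}

/* mov #-1,r1 ; negc r1,r1
   Yields 1 - T, i.e. the inverted T bit, but T itself is overwritten.  */
static uint32_t
movrt_via_negc (unsigned *t)
{
  return negc (0xFFFFFFFFu, t);
}

/* movt r0 ; xor #1,r0
   Yields the inverted T bit and leaves T untouched, at the cost of
   requiring R0 for the xor with an immediate operand.  */
static uint32_t
movrt_via_xor (const unsigned *t)
{
  assert (*t <= 1);
  return *t ^ 1u;
}
```

Because the T bit survives the movt / xor variant, a later `tst r5,r5` that only recomputes the same T bit value becomes a candidate for removal, which matches the trade-off described above.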
[Bug target/53976] [SH] Unnecessary clrt after bt
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976

--- Comment #1 from Oleg Endo olegendo at gcc dot gnu.org 2012-09-23 20:50:23 UTC ---
The clrt insn gets placed into another basic block, thus using a peephole will
not work in this case. In order to be able to eliminate the clrt (or any
sett) the value of the T bit must be tracked not only inside a basic block but
also across basic blocks.

Another case, which shows that the T bit value is lost and has to be
recalculated:

int test_2 (volatile int* a, int b, int c)
{
  a[1] = b != 0;

  if (b == 0)
    a[10] = c;

  return b == 0;
}

compiled with -O2 -m4:
        tst     r5,r5
        mov     #-1,r1
        negc    r1,r1
        mov.l   r1,@(4,r4)
        tst     r5,r5           !! OK, negc above clobbers T bit
        bf      .L2
        mov.l   r6,@(40,r4)
.L2:
        tst     r5,r5           !! T bit lost in new BB
        rts
        movt    r0

compiled with -O2 -m2a:
        tst     r5,r5
        movrt   r1
        mov.l   r1,@(4,r4)
        bf.s    .L4
        tst     r5,r5           !! T bit lost in new BB
        mov.l   r6,@(40,r4)
        tst     r5,r5           !! T bit lost in new BB
.L4:
        rts
        movt    r0
[Bug target/53976] [SH] Unnecessary clrt after bt
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976

--- Comment #2 from Oleg Endo olegendo at gcc dot gnu.org 2012-09-23 21:26:54 UTC ---
Interestingly, the following function shows some improved behavior (notice the
removed volatile mem store):

int test_2_1 (int* a, int b, int c)
{
  a[1] = b != 0;

  if (b == 0)
    a[10] = c;

  return b == 0;
}

-O2 -m2a:
        tst     r5,r5
        movrt   r1
        mov.l   r1,@(4,r4)
        bf      .L4
        mov.l   r6,@(40,r4)
.L4:
        rts
        movt    r0

This is already minimal.
However, for non-SH2A it's still the same:

        tst     r5,r5
        mov     #-1,r1
        negc    r1,r1
        tst     r5,r5
        bf/s    .L4
        mov.l   r1,@(4,r4)
        mov.l   r6,@(40,r4)
        tst     r5,r5
.L4:
        rts
        movt    r0