[Bug target/53976] [SH] Unnecessary clrt after bt

2013-11-25 Thread olegendo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976

--- Comment #7 from Oleg Endo olegendo at gcc dot gnu.org ---
Author: olegendo
Date: Mon Nov 25 16:47:16 2013
New Revision: 205358

URL: http://gcc.gnu.org/viewcvs?rev=205358&root=gcc&view=rev
Log:
PR target/53976
PR target/59243
* config/sh/sh_optimize_sett_clrt.cc (struct ccreg_value): Update
comments.
(sh_optimize_sett_clrt::find_last_ccreg_values): Check stack of
previously visited basic blocks before recursing instead of only one
basic block.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/sh/sh_optimize_sett_clrt.cc


[Bug target/53976] [SH] Unnecessary clrt after bt

2013-11-21 Thread olegendo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976

--- Comment #5 from Oleg Endo olegendo at gcc dot gnu.org ---
Author: olegendo
Date: Thu Nov 21 08:19:38 2013
New Revision: 205191

URL: http://gcc.gnu.org/viewcvs?rev=205191&root=gcc&view=rev
Log:
PR target/53976
* config/sh/sh_optimize_sett_clrt.cc: New SH specific RTL pass.
* config/sh/sh.c (register_sh_passes): Add sh_optimize_sett_clrt pass.
* config/sh/sh/t-sh (sh_optimize_sett_clrt pass.o): New entry.
* config.gcc (sh[123456789lbe]*-*-* | sh-*-*): Add
sh_optimize_sett_clrt pass.o to extra_objs.

PR target/53976
* gcc.target/sh/pr53976-1.c: New.


Added:
trunk/gcc/config/sh/sh_optimize_sett_clrt.cc
trunk/gcc/testsuite/gcc.target/sh/pr53976-1.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config.gcc
trunk/gcc/config/sh/sh.c
trunk/gcc/config/sh/t-sh
trunk/gcc/testsuite/ChangeLog


[Bug target/53976] [SH] Unnecessary clrt after bt

2013-11-21 Thread olegendo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976

--- Comment #6 from Oleg Endo olegendo at gcc dot gnu.org ---
(In reply to Oleg Endo from comment #4)
> One option to get rid of the redundant clrt and sett in BBs that are reached
> with a conditional branch would be to add an SH specific RTL pass that
> analyses the BBs and eliminates the insns in question.
>
> Another option could be to try and inject artificial sett / clrt insns at
> the start of BBs that are reached by conditional branches, and then split
> them away to nops or output empty asm with insn length 0.  The idea would be
> to let other already existing RTL passes figure out the redundant T bit sets.

I've decided to do it with an RTL pass, as it's easier and less obscure.
The initial version committed in r205191 only eliminates redundant sett / clrt
insns.  However, there are also some opportunities to e.g. hoist sett / clrt
insns out of loops:

long long test0 (long long* a, unsigned int c)
{
  long long s = 0;
  do s += *a++; while (--c);
  return s;
}

Currently compiles to:
_test0:
        mov     #0,r0
        mov     #0,r1
        .align 2
.L3:
        mov.l   @r4+,r2
        mov.l   @r4+,r3
        clrt
        addc    r3,r1
        addc    r2,r0
        add     #-1,r5
        tst     r5,r5
        bf      .L3
        rts
        nop

At the clrt insn in the loop basic block, the incoming T bit value is currently
detected as unknown when coming from the first basic block and as 0 when coming
from the end of the loop body.
In this case the clrt insn can be removed from the loop and put into the first
basic block:

_test0:
        mov     #0,r0
        mov     #0,r1
        clrt
        .align 2
.L3:
        mov.l   @r4+,r2
        mov.l   @r4+,r3
        addc    r3,r1
        addc    r2,r0
        add     #-1,r5
        tst     r5,r5
        bf      .L3
        rts
        nop


[Bug target/53976] [SH] Unnecessary clrt after bt

2013-08-18 Thread olegendo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976

--- Comment #4 from Oleg Endo olegendo at gcc dot gnu.org ---
One option to get rid of the redundant clrt and sett in BBs that are reached
with a conditional branch would be to add an SH specific RTL pass that analyses
the BBs and eliminates the insns in question.

Another option could be to try and inject artificial sett / clrt insns at the
start of BBs that are reached by conditional branches, and then split them away
to nops or output empty asm with insn length 0.  The idea would be to let other
already existing RTL passes figure out the redundant T bit sets.

There is already similar functionality in cse.c (cse_condition_code_reg), but
it probably wouldn't kick in, since clrt / sett insns are not emitted until the
split1 pass before register allocation and cse2 runs before that.  But still,
maybe another RTL pass would figure it out.  It doesn't necessarily need to be
before register allocation, since the T bit is a hard-reg anyway.


[Bug target/53976] [SH] Unnecessary clrt after bt

2013-08-03 Thread olegendo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976

--- Comment #3 from Oleg Endo olegendo at gcc dot gnu.org ---
(In reply to Oleg Endo from comment #2)
> Interestingly, the following function shows some improved behavior (notice
> the removed volatile mem store):
>
> int test_2_1 (int* a, int b, int c)
> {
>   a[1] = b != 0;
>
>   if (b == 0)
>     a[10] = c;
>
>   return b == 0;
> }
>
> -O2 -m2a:
>         tst     r5,r5
>         movrt   r1
>         mov.l   r1,@(4,r4)
>         bf      .L4
>         mov.l   r6,@(40,r4)
> .L4:
>         rts
>         movt    r0
>
>
> This is already minimal.
> However, for non-SH2A it's still the same:
>         tst     r5,r5
>         mov     #-1,r1
>         negc    r1,r1
>         tst     r5,r5
>         bf/s    .L4
>         mov.l   r1,@(4,r4)
>         mov.l   r6,@(40,r4)
>         tst     r5,r5
> .L4:
>         rts
>         movt    r0

One of the problems in this case is that negc clobbers the T bit.  Another
alternative,
        movt    r0
        xor     #1,r0

should be selected here.  This could be done by looking at the insns around the
negc-movrt and checking whether some insn after the negc-movrt sets the T bit
in the same way as it was set before the negc-movrt.  In this case not
clobbering the T bit would eliminate the redundant test.  However, if this
pattern occurs in a loop or pressure on R0 is high, using negc and the
redundant test is probably going to be better.


[Bug target/53976] [SH] Unnecessary clrt after bt

2012-09-23 Thread olegendo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976

--- Comment #1 from Oleg Endo olegendo at gcc dot gnu.org 2012-09-23 20:50:23 UTC ---
The clrt insn gets placed into another basic block, thus using a peephole will
not work in this case.  In order to be able to eliminate the clrt (or any sett)
the value of the T bit must be tracked not only inside a basic block but also
across basic blocks.

Another case, which shows that the T bit value is lost and has to be
recalculated:

int test_2 (volatile int* a, int b, int c)
{
  a[1] = b != 0;

  if (b == 0)
    a[10] = c;

  return b == 0;
}

compiled with -O2 -m4:
        tst     r5,r5
        mov     #-1,r1
        negc    r1,r1
        mov.l   r1,@(4,r4)
        tst     r5,r5      !! OK, negc above clobbers T bit
        bf      .L2
        mov.l   r6,@(40,r4)
.L2:
        tst     r5,r5      !! T bit lost in new BB
        rts
        movt    r0


compiled with -O2 -m2a:
        tst     r5,r5
        movrt   r1
        mov.l   r1,@(4,r4)
        bf.s    .L4
        tst     r5,r5      !! T bit lost in new BB
        mov.l   r6,@(40,r4)
        tst     r5,r5      !! T bit lost in new BB
.L4:
        rts
        movt    r0


[Bug target/53976] [SH] Unnecessary clrt after bt

2012-09-23 Thread olegendo at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53976

--- Comment #2 from Oleg Endo olegendo at gcc dot gnu.org 2012-09-23 21:26:54 UTC ---
Interestingly, the following function shows some improved behavior (notice the
removed volatile mem store):

int test_2_1 (int* a, int b, int c)
{
  a[1] = b != 0;

  if (b == 0)
    a[10] = c;

  return b == 0;
}

-O2 -m2a:
        tst     r5,r5
        movrt   r1
        mov.l   r1,@(4,r4)
        bf      .L4
        mov.l   r6,@(40,r4)
.L4:
        rts
        movt    r0


This is already minimal.
However, for non-SH2A it's still the same:
        tst     r5,r5
        mov     #-1,r1
        negc    r1,r1
        tst     r5,r5
        bf/s    .L4
        mov.l   r1,@(4,r4)
        mov.l   r6,@(40,r4)
        tst     r5,r5
.L4:
        rts
        movt    r0