http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60537

            Bug ID: 60537
           Summary: Loop optimization code bloat for simple loops
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: olegendo at gcc dot gnu.org
            Target: sh*-*-*

I noticed this on SH; it may also apply to other targets (checked on
4.9 r208241).

The following simple loop (simple strlen implementation):

unsigned int test (const char* s0)
{
  const char* s1 = s0;

  while (*s1)
    s1++;

  return s1 - s0;
}

With -O2 -m4 it gets compiled to:

        mov.b   @r4,r1
        tst     r1,r1
        bt/s    .L4
        mov     r4,r1
        add     #1,r1
        .align 2
.L3:
        mov     r1,r0
        mov.b   @r0,r2
        tst     r2,r2
        bf/s    .L3
        add     #1,r1
        rts
        sub     r4,r0
        .align 1
.L4:
        rts
        mov     #0,r0


With -Os -m4 it is basically just the inner loop:

        mov     r4,r1
.L2:
        mov     r1,r0
        mov.b   @r0,r2
        tst     r2,r2
        bf/s    .L2
        add     #1,r1
        rts
        sub     r4,r0


The additional loop test in the loop header in the -O2 version seems a bit
pointless.  If the loop exits at the first iteration, execution simply falls
through to the return.  The additional test and the jump around the loop
don't gain anything in this case; they only increase code size unnecessarily.
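
To make the difference easier to see, here is a rough C-level rendering of the
two code shapes.  It is a hand-written approximation read off the assembly
above, not compiler output; the function names test_src, test_o2_shape and
test_os_shape are hypothetical and exist only for this illustration.

```c
/* The loop as written in the report (renamed from test for this sketch). */
unsigned int test_src (const char* s0)
{
  const char* s1 = s0;

  while (*s1)
    s1++;

  return (unsigned int)(s1 - s0);
}

/* Approximate shape of the -O2 code: an extra test of the first
   character guards a bottom-tested loop.  */
unsigned int test_o2_shape (const char* s0)
{
  const char* p;
  const char* s1;

  if (*s0 == 0)         /* mov.b @r4,r1 ; tst r1,r1 ; bt/s .L4 */
    return 0;           /* .L4: rts ; mov #0,r0 */

  s1 = s0 + 1;          /* first character is known to be non-zero */
  do
    {
      p = s1;           /* mov r1,r0 */
      s1++;             /* add #1,r1 (delay slot) */
    }
  while (*p);           /* mov.b @r0,r2 ; tst r2,r2 ; bf/s .L3 */

  return (unsigned int)(p - s0);   /* sub r4,r0 */
}

/* Approximate shape of the -Os code: only the bottom-tested loop.
   An empty string leaves the loop after one pass and returns 0.  */
unsigned int test_os_shape (const char* s0)
{
  const char* p;
  const char* s1 = s0;

  do
    {
      p = s1;
      s1++;
    }
  while (*p);

  return (unsigned int)(p - s0);
}
```

All three functions return the same value for any NUL-terminated string.  The
only thing the guard in test_o2_shape saves is one pass through the loop body
when the string is empty or when its first character has already been tested.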
