http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50037

--- Comment #6 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-08-11 11:39:44 UTC ---
It probably doesn't help that tree IVOPTs replaces the nice induction variable
with a pointer one:

  # BLOCK 2 freq:900
  # PRED: ENTRY [100.0%]  (fallthru,exec)
  count_5 = n_4(D) & 3;
  D.2721_17 = (long unsigned int) count_5;
  D.2722_16 = D.2721_17 * 4;
  D.2723_15 = (long unsigned int) addr_6(D);
  D.2724_21 = D.2723_15 + 4;
  D.2725_22 = D.2724_21 + D.2722_16;
  D.2726_23 = (int *) D.2725_22;
  # SUCC: 3 [100.0%]  (fallthru,exec)

  # BLOCK 3 freq:9100
  # PRED: 3 [90.1%]  (dfs_back,true,exec) 2 [100.0%]  (fallthru,exec)
  # addr_18 = PHI <addr_11(3), addr_6(D)(2)>
  # sum_20 = PHI <sum_9(3), sum_7(D)(2)>
  D.2703_8 = MEM[base: addr_18, offset: 0B];
  sum_9 = D.2703_8 + sum_20;
  addr_11 = addr_18 + 4;
  if (addr_11 != D.2726_23)
    goto <bb 3>;
  else
    goto <bb 4>;
  # SUCC: 3 [90.1%]  (dfs_back,true,exec) 4 [9.9%]  (false,exec)

But even without IVOPTs the RTL loop optimizer has difficulties and unrolls the
loop 8 times:

Loop 1 is simple:
  simple exit 3 -> 4
  number of iterations: (reg/v:SI 62 [ count ])
  upper bound: -2
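
For reference, the IVOPTs-transformed GIMPLE shown earlier reads roughly like
the C below. This is only a sketch: the function name sum_ptr_iv is
hypothetical, and the loop shape (a do-while that runs count + 1 times) is
inferred from the dump rather than taken from the original source. The point is
that the exit test compares the pointer against a precomputed end address
instead of testing a counter.

```c
/* Hypothetical name; mirrors BLOCK 2 and BLOCK 3 of the GIMPLE dump. */
int sum_ptr_iv(int n, const int *addr, int sum)
{
    int count = n & 3;                  /* count_5 = n_4(D) & 3 */
    const int *end = addr + 1 + count;  /* D.2726_23: addr + 4 + count * 4, in bytes */

    do {
        sum += *addr;                   /* D.2703_8 = MEM[base: addr_18, offset: 0B] */
        addr += 1;                      /* addr_11 = addr_18 + 4, in bytes */
    } while (addr != end);              /* exit test on the pointer, not on count */

    return sum;
}
```

Because count is at most 3, this loop body executes between one and four times.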

Most canonical testcase:

int foo (int n, int *addr)
{
  int count, sum;
  n = n & 0x3;
  for (count = 0; count < n; count++)
    sum += addr[count];
  return sum;
}

with IVOPTs on (which preserves a count != n exit test and count):

Loop 1 is simple:
  simple exit 4 -> 5
  does not roll if: (expr_list:REG_DEP_TRUE (le:SI (and:SI (reg/v:SI 86 [ n ])
            (const_int 3 [0x3]))
        (const_int 0 [0]))
    (expr_list:REG_DEP_TRUE (eq:SI (and:SI (reg/v:SI 86 [ n ])
                (const_int 3 [0x3]))
            (const_int -2147483648 [0xffffffff80000000]))
        (nil)))
  number of iterations: (subreg:SI (plus:DI (not:DI (reg:DI 83 [ ivtmp.11 ]))
        (sign_extend:DI (reg/v:SI 76 [ n ]))) 0)
  upper bound: 4294967295
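
The upper bound of 4294967295 is far looser than what the mask allows: n & 3
always lies in 0..3, so the loop can roll at most three times, and the second
"does not roll" condition (equality with -2147483648) can never hold. A minimal
sketch that counts the iterations of the testcase loop, assuming two's
complement int (the helper name trip_count is hypothetical):

```c
/* Hypothetical helper: number of iterations of the testcase loop for a given n. */
int trip_count(int n)
{
    int iterations = 0;

    n = n & 0x3;                        /* same masking as in the testcase */
    for (int count = 0; count < n; count++)
        iterations++;                   /* stands in for sum += addr[count] */

    return iterations;
}
```

Whatever value is passed in, the result is never larger than 3, which is the
bound the RTL analysis would ideally derive.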

It looks like we do not look up the definition of n used in the exit test

(insn 31 29 32 4 (set (reg:CCGC 17 flags)
        (compare:CCGC (reg/v:SI 76 [ n ])
            (subreg:SI (reg:DI 83 [ ivtmp.11 ]) 0))) t.c:5 6 {*cmpsi_1}
     (expr_list:REG_DEAD (reg/v:SI 76 [ n ])
        (nil)))

which would show that n is masked with 3:

(insn 23 19 24 2 (parallel [
            (set (reg/v:SI 76 [ n ])
                (and:SI (reg/v:SI 86 [ n ])
                    (const_int 3 [0x3])))
            (clobber (reg:CC 17 flags))
        ]) t.c:4 378 {*andsi_1}
     (expr_list:REG_DEAD (reg/v:SI 86 [ n ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
