http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50037
--- Comment #6 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-08-11 11:39:44 UTC ---
It probably doesn't help that tree IVOPTs replaces the nice induction
variable with a pointer one:

  # BLOCK 2 freq:900
  # PRED: ENTRY [100.0%] (fallthru,exec)
  count_5 = n_4(D) & 3;
  D.2721_17 = (long unsigned int) count_5;
  D.2722_16 = D.2721_17 * 4;
  D.2723_15 = (long unsigned int) addr_6(D);
  D.2724_21 = D.2723_15 + 4;
  D.2725_22 = D.2724_21 + D.2722_16;
  D.2726_23 = (int *) D.2725_22;
  # SUCC: 3 [100.0%] (fallthru,exec)

  # BLOCK 3 freq:9100
  # PRED: 3 [90.1%] (dfs_back,true,exec) 2 [100.0%] (fallthru,exec)
  # addr_18 = PHI <addr_11(3), addr_6(D)(2)>
  # sum_20 = PHI <sum_9(3), sum_7(D)(2)>
  D.2703_8 = MEM[base: addr_18, offset: 0B];
  sum_9 = D.2703_8 + sum_20;
  addr_11 = addr_18 + 4;
  if (addr_11 != D.2726_23)
    goto <bb 3>;
  else
    goto <bb 4>;
  # SUCC: 3 [90.1%] (dfs_back,true,exec) 4 [9.9%] (false,exec)

but even without IVOPTs RTL has difficulties and unrolls 8 times:

Loop 1 is simple:
  simple exit 3 -> 4
  number of iterations: (reg/v:SI 62 [ count ])
  upper bound: -2

Most canonical testcase:

int foo (int n, int *addr)
{
  int count, sum;
  n = n & 0x3;
  for (count = 0; count < n; count++)
    sum += addr[count];
  return sum;
}

with IVOPTs on (which preserves a count != n exit test and count):

Loop 1 is simple:
  simple exit 4 -> 5
  does not roll if: (expr_list:REG_DEP_TRUE (le:SI (and:SI (reg/v:SI 86 [ n ])
            (const_int 3 [0x3]))
        (const_int 0 [0]))
    (expr_list:REG_DEP_TRUE (eq:SI (and:SI (reg/v:SI 86 [ n ])
                (const_int 3 [0x3]))
            (const_int -2147483648 [0xffffffff80000000]))
        (nil)))
  number of iterations: (subreg:SI (plus:DI (not:DI (reg:DI 83 [ ivtmp.11 ]))
        (sign_extend:DI (reg/v:SI 76 [ n ]))) 0)
  upper bound: 4294967295

it looks like we do not look for the definition of n in the exit test

(insn 31 29 32 4 (set (reg:CCGC 17 flags)
        (compare:CCGC (reg/v:SI 76 [ n ])
            (subreg:SI (reg:DI 83 [ ivtmp.11 ]) 0))) t.c:5 6 {*cmpsi_1}
     (expr_list:REG_DEAD (reg/v:SI 76 [ n ])
        (nil)))

which would show

(insn 23 19 24 2 (parallel [
            (set (reg/v:SI 76 [ n ])
                (and:SI (reg/v:SI 86 [ n ])
                    (const_int 3 [0x3])))
            (clobber (reg:CC 17 flags))
        ]) t.c:4 378 {*andsi_1}
     (expr_list:REG_DEAD (reg/v:SI 86 [ n ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))
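
For illustration only, here is a small C sketch of the bound that the iteration analysis misses: once n has been masked with 3, the loop body can run at most 3 times. The helpers foo_counted and foo_peeled are hypothetical names and are not part of GCC or of the testcase above. Unlike the testcase, they initialise sum to 0 (an assumption, so that results are well defined) and assume a two's complement int for the negative-input case.

```c
#include <assert.h>

/* Hypothetical variant of the testcase: same loop, but sum starts at 0
   (assumption) and the number of executed iterations is recorded. */
static int foo_counted (int n, const int *addr, int *trips)
{
  int count, sum = 0;
  n = n & 0x3;                 /* n is now in [0, 3] */
  *trips = 0;
  for (count = 0; count < n; count++)
    {
      sum += addr[count];
      (*trips)++;
    }
  assert (*trips <= 3);        /* the bound implied by the AND mask */
  return sum;
}

/* Hand-peeled equivalent: with at most 3 iterations known, the loop can
   be replaced by three guarded additions instead of being unrolled 8x. */
static int foo_peeled (int n, const int *addr)
{
  int sum = 0;
  n = n & 0x3;
  if (n > 0)
    sum += addr[0];
  if (n > 1)
    sum += addr[1];
  if (n > 2)
    sum += addr[2];
  return sum;
}
```

Both functions read at most addr[0] through addr[2], so a three-element array is enough to exercise them for any value of n.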