[Bug rtl-optimization/36712] Inefficient loop unrolling

2010-02-05 Thread steven at gcc dot gnu dot org
--- Comment #16 from steven at gcc dot gnu dot org 2010-02-05 13:33 --- I'm trying to coerce IVOPTS into producing the following, optimal code in the GIMPLE optimizers (without much luck, so far):
  bb 2:
    pretmp.11_26 = (int) s_11(D);
    ivtmp.20_28 = (long unsigned int) b_inout_5(D);

[Bug rtl-optimization/36712] Inefficient loop unrolling

2010-02-05 Thread rakdver at kam dot mff dot cuni dot cz
--- Comment #17 from rakdver at kam dot mff dot cuni dot cz 2010-02-05 13:58 --- But if I adjust the cost to make index cost only 1 or 2, I get this:
  bb 2:
    pretmp.11_26 = (int) s_11(D);
    ivtmp.25_28 = (long unsigned int)

[Bug rtl-optimization/36712] Inefficient loop unrolling

2010-02-05 Thread steven at gcc dot gnu dot org
--- Comment #18 from steven at gcc dot gnu dot org 2010-02-05 14:02 --- I used -O2 -std=c99 -mcpu=arm9 -funroll-loops and I manually hacked the cost in GDB to change from:
  Address costs:
    index costs 6
    cst + index costs 2
...to this...:
  Address costs:
    index costs 1
    cst + index

[Bug rtl-optimization/36712] Inefficient loop unrolling

2010-02-04 Thread rearnsha at gcc dot gnu dot org
--- Comment #9 from rearnsha at gcc dot gnu dot org 2010-02-04 11:11 --- (In reply to comment #8)
    ldr r2, [r1, #0]
    mul r3, r2, r0
    str r3, [r1], #4
    ldr r2, [r1, #0]
    mul r3, r2, r0
    str r3, [r1], #4
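For readers following along, the load / multiply / post-increment-store sequence quoted in comment #9 roughly corresponds to walking a single pointer through the array, with the loop body duplicated. The C sketch below is only an illustration under assumptions: the function names `scale_indexed` and `scale_unrolled2`, the parameter `n`, and the unroll factor of two are hypothetical and are not taken from the bug's test case. Only the identifiers `s` and `b_inout` appear in the quoted dumps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference loop: scale each element in place via an index. */
void scale_indexed(int *b_inout, int s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        b_inout[i] = b_inout[i] * s;
}

/* Hand-unrolled by two, using one pointer that advances after each store.
 * Each statement pair mirrors one "ldr / mul / str ..., #4" group.
 * Assumption for this sketch: n is even, so no remainder loop is needed. */
void scale_unrolled2(int *b_inout, int s, size_t n)
{
    assert(n % 2 == 0);

    int *p = b_inout;
    int *end = b_inout + n;

    while (p != end) {
        *p = *p * s;   /* load, multiply, store */
        p++;           /* post-increment the address */
        *p = *p * s;
        p++;
    }
}
```

Both functions compute the same result; the second form simply makes explicit the single induction variable (the pointer) that a post-indexed store can update without a separate add.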

[Bug rtl-optimization/36712] Inefficient loop unrolling

2010-02-04 Thread steven at gcc dot gnu dot org
--- Comment #10 from steven at gcc dot gnu dot org 2010-02-04 11:21 --- I'm going to crack this bug.

[Bug rtl-optimization/36712] Inefficient loop unrolling

2010-02-04 Thread rguenth at gcc dot gnu dot org
--- Comment #11 from rguenth at gcc dot gnu dot org 2010-02-04 11:47 --- Also try the patches from PR42617 to see if they improve the pre-regalloc scheduling. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36712

[Bug rtl-optimization/36712] Inefficient loop unrolling

2010-02-04 Thread steven at gcc dot gnu dot org
--- Comment #12 from steven at gcc dot gnu dot org 2010-02-04 14:54 --- With the patches from bug 42617 applied, I get the following:
    .file tst.c
    .text
    .align 2
    .global Unroll
    .type Unroll, %function
Unroll:
    @ args = 0, pretend = 0,

[Bug rtl-optimization/36712] Inefficient loop unrolling

2010-02-04 Thread steven at gcc dot gnu dot org
--- Comment #13 from steven at gcc dot gnu dot org 2010-02-04 14:56 --- With -fno-web, the patches from bug 42617 do not help and the output is the same as that of comment #8 (second asm dump).

[Bug rtl-optimization/36712] Inefficient loop unrolling

2010-02-04 Thread steven at gcc dot gnu dot org
--- Comment #14 from steven at gcc dot gnu dot org 2010-02-04 15:19 --- Part of the problem comes from the way IVOPTS optimizes the memory access:
;; Generating RTL for gimple basic block 3
;; D.1814_10 = MEM[base: D.1846_29];
(insn 52 51 0 tst.c:6 (set (reg:SI 172 [ D.1814 ])

[Bug rtl-optimization/36712] Inefficient loop unrolling

2010-02-04 Thread steven at gcc dot gnu dot org
--- Comment #15 from steven at gcc dot gnu dot org 2010-02-04 16:06 --- The patches for bug 31849 have been committed, it seems, but they don't help in this case.

[Bug rtl-optimization/36712] Inefficient loop unrolling

2010-01-25 Thread froydnj at gcc dot gnu dot org
--- Comment #8 from froydnj at gcc dot gnu dot org 2010-01-25 21:10 --- First, something has gotten better; an arm-eabi gcc (-O2 -std=c99 -mcpu=arm9 -funroll-loops) from 20091209 gives:
Unroll:
    @ Function supports interworking.
    @ args = 0, pretend = 0, frame = 0

[Bug rtl-optimization/36712] Inefficient loop unrolling

2009-10-15 Thread drow at gcc dot gnu dot org
--- Comment #7 from drow at gcc dot gnu dot org 2009-10-15 12:12 --- I really would like to see this submitted - at least as a starting point for discussion. You don't need to do anything different than for a small patch; if you've missed any steps, a reviewer will tell you. Another

[Bug rtl-optimization/36712] Inefficient loop unrolling

2009-05-21 Thread bmei at broadcom dot com
--- Comment #6 from bmei at broadcom dot com 2009-05-21 08:38 --- I have only submitted a small patch before. Adding a pass (which may need a new command-line option and disabling the old RTL-level unrolling) seems like a big step to me, and I don't know what the procedure is. My code also contains my own

[Bug rtl-optimization/36712] Inefficient loop unrolling

2009-05-20 Thread ramana at gcc dot gnu dot org
--- Comment #1 from ramana at gcc dot gnu dot org 2009-05-20 13:19 --- Can be reproduced with trunk today.

[Bug rtl-optimization/36712] Inefficient loop unrolling

2009-05-20 Thread rguenth at gcc dot gnu dot org
--- Comment #2 from rguenth at gcc dot gnu dot org 2009-05-20 14:09 --- I think there is no induction variable optimization on RTL anymore.

[Bug rtl-optimization/36712] Inefficient loop unrolling

2009-05-20 Thread ramana at gcc dot gnu dot org
--- Comment #3 from ramana at gcc dot gnu dot org 2009-05-20 14:14 --- There was a discussion thread here: http://gcc.gnu.org/ml/gcc/2008-07/msg00037.html and one of the solutions that Bingfeng was investigating was unrolling loops before ivopts, which is useful in certain cases.

[Bug rtl-optimization/36712] Inefficient loop unrolling

2009-05-20 Thread bmei at broadcom dot com
--- Comment #4 from bmei at broadcom dot com 2009-05-20 14:17 --- I implemented a tree-level loop-unrolling pass in our private port, which takes advantage of the later tree ivopts pass. It produces much better code than RTL-level loop unrolling in such scenarios. Not sure whether

[Bug rtl-optimization/36712] Inefficient loop unrolling

2009-05-20 Thread dje dot gcc at gmail dot com
--- Comment #5 from dje dot gcc at gmail dot com 2009-05-20 17:51 ---
> I implemented a tree-level loop-unrolling pass in our private porting, which takes advantage of later tree ivopt pass. It produces much better code than rtl-level loop