Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2012-03-30 Thread Teresa Johnson
Pulling this one back as I have a better solution, patch coming shortly. Thanks, Teresa On Fri, Mar 16, 2012 at 3:33 PM, Teresa Johnson tejohn...@google.com wrote: Ping - now that stage 1 is open, could someone review? Thanks, Teresa On Sun, Dec 4, 2011 at 10:26 PM, Teresa Johnson

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2012-03-16 Thread Teresa Johnson
Ping - now that stage 1 is open, could someone review? Thanks, Teresa On Sun, Dec 4, 2011 at 10:26 PM, Teresa Johnson tejohn...@google.com wrote: Latest patch which improves the efficiency as described below is included here. Bootstrapped and checked again with x86_64-unknown-linux-gnu. Could

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-09 Thread Xinliang David Li
The patch is good for google branches for now while waiting for upstream review. David On Sun, Dec 4, 2011 at 10:26 PM, Teresa Johnson tejohn...@google.com wrote: Latest patch which improves the efficiency as described below is included here. Bootstrapped and checked again with

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-04 Thread Teresa Johnson
On Fri, Dec 2, 2011 at 11:59 AM, Xinliang David Li davi...@google.com wrote: > +/* Determine whether LOOP contains floating-point computation. */ +bool +loop_has_FP_comp(struct loop *loop) +{ +  rtx set, dest; This probably should be extended to detect other long latency operations in

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-04 Thread Teresa Johnson
Latest patch which improves the efficiency as described below is included here. Bootstrapped and checked again with x86_64-unknown-linux-gnu. Could someone review? Thanks, Teresa 2011-12-04 Teresa Johnson tejohn...@google.com * loop-unroll.c (decide_unroll_constant_iterations): Call

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-02 Thread Teresa Johnson
Thanks, Andreas. You are right in that fully peeling a loop is done by a different code path (peel_loops_completely() and earlier in the tree unroller). Teresa On Fri, Dec 2, 2011 at 12:54 AM, Andreas Krebbel kreb...@linux.vnet.ibm.com wrote: On Thu, Dec 01, 2011 at 11:39:36PM -0800, Teresa

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-02 Thread Andi Kleen
Teresa Johnson tejohn...@google.com writes: Interesting optimization. I would be concerned a little bit about compile time, does it make a measurable difference? The attached patch detects loops containing instructions that tend to incur high LCP (length-changing prefix) stalls on Core i7, and
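[Editor's sketch, not part of the thread.] The snippet above is cut off before it says how the unroll factor is limited, so the following is only a guess at the general idea: cap the factor so the unrolled body stays under some size budget. The function name limit_unroll_factor, its parameters, and the budget are all hypothetical and do not come from the patch or from GCC.

```c
/* Hypothetical helper: clamp a requested unroll factor so that
   (factor * body_insns) does not exceed max_unrolled_insns.
   Names and the budget are assumptions for illustration only;
   they are not taken from the patch or from GCC's loop-unroll.c.  */
unsigned
limit_unroll_factor (unsigned requested, unsigned body_insns,
                     unsigned max_unrolled_insns)
{
  unsigned cap;

  /* An empty body gives no size constraint; keep the request.  */
  if (body_insns == 0)
    return requested;

  /* Largest factor whose unrolled body still fits the budget.  */
  cap = max_unrolled_insns / body_insns;

  /* Never go below 1, i.e. leave the loop as it is.  */
  if (cap < 1)
    cap = 1;

  return requested < cap ? requested : cap;
}
```

For example, with an assumed budget of 28 and a 10-instruction body, a requested factor of 8 would be reduced to 2, while a 3-instruction body would keep the full factor of 8.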

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-02 Thread Xinliang David Li
> +/* Determine whether LOOP contains floating-point computation. */ +bool +loop_has_FP_comp(struct loop *loop) +{ +  rtx set, dest; This probably should be extended to detect other long latency operations in the future. + +  if (ix86_tune != PROCESSOR_COREI7_64 && +      ix86_tune !=

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-02 Thread Xinliang David Li
On Fri, Dec 2, 2011 at 11:36 AM, Andi Kleen a...@firstfloor.org wrote: Teresa Johnson tejohn...@google.com writes: Interesting optimization. I would be concerned a little bit about compile time, does it make a measurable difference? The attached patch detects loops containing instructions

Re: [Patch, i386] Limit unroll factor for certain loops on Corei7

2011-12-02 Thread Teresa Johnson
On Fri, Dec 2, 2011 at 11:36 AM, Andi Kleen a...@firstfloor.org wrote: Teresa Johnson tejohn...@google.com writes: Interesting optimization. I would be concerned a little bit about compile time, does it make a measurable difference? I haven't measured compile time explicitly, but I don't think it