[Bug tree-optimization/18316] Missed IV optimization
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316

Steven Bosscher steven at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2010-02-12 21:46:26         |2012-11-08 21:46:26

--- Comment #15 from Steven Bosscher steven at gcc dot gnu.org 2012-11-08 22:59:49 UTC ---
Still a missed optimization as of trunk r193340:

strength_test2:
        movl    (%rdi), %ecx
        movl    %ecx, %eax
        .p2align 4,,10
        .p2align 3
.L3:
        movslq  8(%rdi), %rdx
        movl    $2, (%rdi,%rdx,4)
        movl    %eax, %edx
        addl    %ecx, %eax
        cmpl    4(%rdi), %edx
        jl      .L3
        rep ret

strength_result2:
        movl    (%rdi), %ecx
        xorl    %eax, %eax
        .p2align 4,,10
        .p2align 3
.L7:
        movslq  8(%rdi), %rdx
        addl    %ecx, %eax
        movl    $2, (%rdi,%rdx,4)
        cmpl    4(%rdi), %eax
        jl      .L7
        rep ret
[Bug tree-optimization/18316] Missed IV optimization
--- Comment #14 from steven at gcc dot gnu dot org 2010-02-12 21:46 ---
On x86_64 the two functions still give different code:

;; Function strength_test2 (strength_test2)

strength_test2 (int * data)
{
  unsigned int ivtmp.12;
  int * pretmp.9;
  int * pretmp.7;
  int k;
  int D.2743;
  int D.2741;
  int * D.2740;
  long unsigned int D.2739;
  long unsigned int D.2738;
  int D.2737;

<bb 2>:
  k_3 = *data_2(D);
  pretmp.7_24 = data_2(D) + 8;
  pretmp.9_26 = data_2(D) + 4;
  ivtmp.12_25 = (unsigned int) k_3;

<bb 3>:
  # ivtmp.12_5 = PHI <ivtmp.12_25(2), ivtmp.12_12(3)>
  D.2737_6 = *pretmp.7_24;
  D.2738_7 = (long unsigned int) D.2737_6;
  D.2739_8 = D.2738_7 * 4;
  D.2740_9 = data_2(D) + D.2739_8;
  *D.2740_9 = 2;
  D.2741_28 = (int) ivtmp.12_5;
  D.2743_13 = *pretmp.9_26;
  ivtmp.12_12 = ivtmp.12_5 + ivtmp.12_25;
  if (D.2743_13 > D.2741_28)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 4>:
  return;

}

;; Function strength_result2 (strength_result2)

strength_result2 (int * data)
{
  unsigned int D.2772;
  unsigned int D.2773;
  unsigned int D.2774;
  int * pretmp.21;
  int i;
  int k;
  int D.2735;
  int * D.2733;
  long unsigned int D.2732;
  long unsigned int D.2731;
  int D.2730;

<bb 2>:
  k_3 = *data_2(D);
  pretmp.21_22 = data_2(D) + 8;
  pretmp.21_23 = data_2(D) + 4;

<bb 3>:
  # i_1 = PHI <0(2), i_25(3)>
  D.2730_6 = *pretmp.21_22;
  D.2731_7 = (long unsigned int) D.2730_6;
  D.2732_8 = D.2731_7 * 4;
  D.2733_9 = data_2(D) + D.2732_8;
  *D.2733_9 = 2;
  D.2772_5 = (unsigned int) i_1;
  D.2773_11 = (unsigned int) k_3;
  D.2774_24 = D.2772_5 + D.2773_11;
  i_25 = (int) D.2774_24;
  D.2735_12 = *pretmp.21_23;
  if (D.2735_12 > i_25)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 4>:
  return;

}

--
steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2005-12-21 03:39:56         |2010-02-12 21:46:26
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316
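To make the difference between the two GIMPLE loops easier to see, here is a hand translation of each loop body into C. It is only a sketch: the function names `test2_lowered` and `result2_lowered` and the iteration counter `n` are illustrative additions, not part of the dump. The point it tries to show is that the first loop compares the value of the induction variable taken before the add, so both the old and the new value are live at the same time, which would be consistent with the extra register copy seen in the generated code.

```c
/* Hand translation of the GIMPLE for strength_test2 (names are illustrative).
   The induction variable starts at k and the exit test uses its value
   from BEFORE the increment, so two values are live across the add. */
int test2_lowered(int *data)
{
    unsigned int step = (unsigned int) data[0];  /* ivtmp.12_25 */
    unsigned int iv = step;                      /* ivtmp.12_5, initial value k */
    int old;
    int n = 0;                                   /* iteration count, added for checking */
    do {
        data[data[2]] = 2;
        old = (int) iv;                          /* D.2741_28: pre-increment copy */
        iv += step;
        n++;
    } while (data[1] > old);
    return n;
}

/* Hand translation of the GIMPLE for strength_result2.
   The variable starts at 0 and the exit test uses the value AFTER the
   add, so a single register is enough. */
int result2_lowered(int *data)
{
    int k = data[0];
    int i = 0;
    int n = 0;                                   /* iteration count, added for checking */
    do {
        data[data[2]] = 2;
        i = (int) ((unsigned int) i + (unsigned int) k);
        n++;
    } while (data[1] > i);
    return n;
}
```

Both loops compare the sequence k, 2k, 3k, ... against `data[1]`, so starting the first loop's variable at 0 and testing after the add would presumably give the same iteration count with one fewer live value. The sketch assumes `data[0]` is positive so that the loops terminate.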
[Bug tree-optimization/18316] Missed IV optimization
--- Comment #13 from pinskia at gcc dot gnu dot org 2008-09-14 04:02 ---
The extra mr is still there. In fact, for PPC64 it is even worse, as there are two extra instructions:

.L2:
        lwa 11,0(7)
        extsw 9,0
        sldi 11,11,2
        add 0,0,10
        stwx 6,3,11
        lwz 11,0(8)
        rldicl 0,0,0,32
        cmpw 7,11,9
        bgt 7,.L2

vs:

.L7:
        lwa 9,0(7)
        add 0,0,11
        sldi 9,9,2
        extsw 0,0
        stwx 10,3,9
        lwz 9,0(8)
        cmpw 7,9,0
        bgt 7,.L7
        blr

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316
[Bug tree-optimization/18316] Missed IV optimization
--- Comment #12 from steven at gcc dot gnu dot org 2007-12-17 08:14 ---
Andrew, could you compare the two functions for ppc with a recent SVN revision, please?

--
steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316
[Bug tree-optimization/18316] Missed IV optimization
--- Additional Comments From pinskia at gcc dot gnu dot org 2005-07-08 18:19 ---
We still have either a register allocation (ra) issue or an ivopts issue that our current ra cannot resolve.
On the tree level we get the following difference.

strength_test2:
<L0>:;
  *(data + (int *) ((unsigned int) *pretmp.9 * 4)) = 2;
  D.1276 = (int) ivtmp.14;
  ivtmp.14 = ivtmp.14 + ivtmp.17;
  if (*pretmp.11 > D.1276) goto <L0>; else goto <L1>;

strength_result2:
<L0>:;
  *(data + (int *) ((unsigned int) *pretmp.27 * 4)) = 2;
  i = (int) ((unsigned int) i + (unsigned int) k);
  if (*pretmp.28 > i) goto <L0>; else goto <L1>;

The PPC asm is:

test:
L2:
        lwz r0,0(r7)
        mr r9,r11
        add r11,r11,r8
        slwi r0,r0,2
        stwx r6,r3,r0
        lwz r2,0(r10)
        cmpw cr7,r2,r9
        bgt+ cr7,L2

result:
L9:
        lwz r0,0(r10)
        add r9,r9,r8
        slwi r0,r0,2
        stwx r7,r3,r0
        lwz r2,0(r11)
        cmpw cr7,r2,r9
        bgt+ cr7,L9

Notice the extra mr in the first loop.

--
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|patch                       |ra

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316
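For readers without the testcase at hand, the two loops can be read back from the tree dumps roughly as follows. This is a reconstruction inferred from the dumps, not the testcase attached to the bug; in particular, the `int` return value (an iteration count) is an addition so that the two forms can be checked against each other, and the exact source shape of the original loops is an assumption.

```c
/* Reconstruction inferred from the dumps; not the attached testcase. */

/* Multiplication form: the exit test uses i * k, which the compiler is
   expected to strength-reduce into a running sum stepped by k. */
int strength_test2(int *data)
{
    int k = data[0];
    int i = 0;
    do {
        data[data[2]] = 2;
        i++;
    } while (i * k < data[1]);
    return i;                 /* iteration count, added for checking */
}

/* Hand-reduced form: the running sum is written out explicitly. */
int strength_result2(int *data)
{
    int k = data[0];
    int i = 0;
    int n = 0;                /* iteration count, added for checking */
    do {
        data[data[2]] = 2;
        i += k;
        n++;
    } while (i < data[1]);
    return n;
}
```

With `data[0]` positive, both functions should run the same number of iterations, so ideally the compiler would emit the same loop for both. The complaint in this report is that it does not.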
[Bug tree-optimization/18316] Missed IV optimization
--- Additional Comments From cvs-commit at gcc dot gnu dot org 2005-05-01 08:08 ---
Subject: Bug 18316

CVSROOT:        /cvs/gcc
Module name:    gcc
Changes by:     [EMAIL PROTECTED]       2005-05-01 08:08:14

Modified files:
        gcc            : ChangeLog tree-scalar-evolution.c
                         tree-scalar-evolution.h tree-ssa-loop-ivopts.c
                         tree-ssa-loop-manip.c tree-ssa-loop-niter.c tree.c
        gcc/testsuite  : ChangeLog
Added files:
        gcc/testsuite/gcc.dg/tree-ssa: loop-8.c

Log message:
        PR tree-optimization/18316
        PR tree-optimization/19126
        * tree.c (build_int_cst_type): Avoid shift by size of type.
        * tree-scalar-evolution.c (simple_iv): Add allow_nonconstant_step
        argument.
        * tree-scalar-evolution.h (simple_iv): Declaration changed.
        * tree-ssa-loop-ivopts.c (struct iv_cand): Add depends_on
        field.
        (dump_cand): Dump depends_on information.
        (determine_biv_step): Add argument to simple_iv call.
        (contains_abnormal_ssa_name_p): Handle case expr == NULL.
        (find_bivs, find_givs_in_stmt_scev): Do not require step to be a
        constant.
        (add_candidate_1): Record depends_on for candidates.
        (tree_int_cst_sign_bit, constant_multiple_of): New functions.
        (get_computation_at, get_computation_cost_at, may_eliminate_iv):
        Handle ivs with nonconstant step.
        (iv_ca_set_remove_invariants, iv_ca_set_add_invariants): New
        functions.
        (iv_ca_set_no_cp, iv_ca_set_cp): Handle cand->depends_on.
        (create_new_iv): Unshare the step before passing it to create_iv.
        (free_loop_data): Free cand->depends_on.
        (build_addr_strip_iref): New function.
        (find_interesting_uses_address): Use build_addr_strip_iref.
        (strip_offset_1): Split the recursive part from strip_offset.
        Strip constant offset component_refs and array_refs.
        (strip_offset): Split the recursive part to strip_offset_1.
        (add_address_candidates): Removed.
        (add_derived_ivs_candidates): Do not use add_address_candidates.
        (add_iv_value_candidates): Add candidates with stripped constant
        offset.  Consider all candidates with initial value 0 important.
        (struct affine_tree_combination): New.
        (aff_combination_const, aff_combination_elt, aff_combination_scale,
        aff_combination_add_elt, aff_combination_add,
        tree_to_aff_combination, add_elt_to_tree, aff_combination_to_tree,
        fold_affine_sum): New functions.
        (get_computation_at): Use fold_affine_sum.
        * tree-ssa-loop-manip.c (create_iv): Handle ivs with nonconstant step.
        * tree-ssa-loop-niter.c (number_of_iterations_exit): Add argument
        to simple_iv call.

        * gcc.dg/tree-ssa/loop-8.c: New test.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.8543&r2=2.8544
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-scalar-evolution.c.diff?cvsroot=gcc&r1=2.21&r2=2.22
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-scalar-evolution.h.diff?cvsroot=gcc&r1=2.3&r2=2.4
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-ssa-loop-ivopts.c.diff?cvsroot=gcc&r1=2.65&r2=2.66
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-ssa-loop-manip.c.diff?cvsroot=gcc&r1=2.31&r2=2.32
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-ssa-loop-niter.c.diff?cvsroot=gcc&r1=2.24&r2=2.25
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree.c.diff?cvsroot=gcc&r1=1.476&r2=1.477
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/ChangeLog.diff?cvsroot=gcc&r1=1.5421&r2=1.5422
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/gcc.dg/tree-ssa/loop-8.c.diff?cvsroot=gcc&r1=NONE&r2=1.1

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316
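The first ChangeLog entry above, "Avoid shift by size of type", refers to a general C pitfall: shifting a value by an amount equal to or greater than the width of its (promoted) type is undefined behavior. The sketch below illustrates the pitfall with a hypothetical helper, `low_bits_mask`; it is not GCC's code, and the commit message does not say whether `build_int_cst_type` had this exact shape.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from GCC: return a mask with the low
   `bits` bits set.  The naive form (UINT64_C(1) << bits) - 1 is undefined
   when bits == 64, because the shift count equals the width of the type,
   so that case is handled separately. */
uint64_t low_bits_mask(unsigned bits)
{
    if (bits >= 64)
        return UINT64_MAX;               /* full-width mask, no shift needed */
    return (UINT64_C(1) << bits) - 1;    /* safe: shift count is 0..63 */
}
```

Code that builds constants for an arbitrary type precision tends to hit this edge exactly when the precision equals the width of the host integer type, which is why it usually needs an explicit special case.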
[Bug tree-optimization/18316] Missed IV optimization
--- Additional Comments From pinskia at gcc dot gnu dot org 2005-05-01 13:48 --- On PPC, there are two extra mr's in the first case. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316
[Bug tree-optimization/18316] Missed IV optimization
--- Additional Comments From pinskia at gcc dot gnu dot org 2005-05-01 13:49 ---
Oh, I forgot to say that the IV optimization is caught, though.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316
[Bug tree-optimization/18316] Missed IV optimization
--- Additional Comments From rakdver at gcc dot gnu dot org 2005-04-18 09:33 --- Updated patch: http://gcc.gnu.org/ml/gcc-patches/2005-04/msg01959.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316
[Bug tree-optimization/18316] Missed IV optimization
--
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |ASSIGNED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316
[Bug tree-optimization/18316] Missed IV optimization
--- Additional Comments From rakdver at gcc dot gnu dot org 2005-02-02 08:56 ---
Patch: http://gcc.gnu.org/ml/gcc-patches/2005-02/msg00142.html

--
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |patch

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316
[Bug tree-optimization/18316] Missed IV optimization
--
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.0.0                       |---

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316
[Bug tree-optimization/18316] Missed IV optimization
--- Additional Comments From rakdver at gcc dot gnu dot org 2005-01-25 11:35 ---
Reopening as an enhancement request for ivopts.

--
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316
[Bug tree-optimization/18316] Missed IV optimization
--- Additional Comments From stevenb at suse dot de 2005-01-24 09:12 --- Subject: Re: Missed IV optimization *sigh* The old loop optimizer... -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316
[Bug tree-optimization/18316] Missed IV optimization
--- Additional Comments From steven at gcc dot gnu dot org 2005-01-23 14:37 ---
Bravo Zdenek!!!

        .text
        .p2align 4,,15
.globl strength_test2
        .type   strength_test2, @function
strength_test2:
.LFB2:
        movl    (%rdi), %r8d
        leaq    8(%rdi), %rsi
        leaq    4(%rdi), %rcx
        xorl    %edx, %edx
        .p2align 4,,7
.L2:
        movslq  (%rsi),%rax
        addl    %r8d, %edx
        movl    $2, (%rdi,%rax,4)
        cmpl    (%rcx), %edx
        jl      .L2
        rep ; ret
.LFE2:
        .size   strength_test2, .-strength_test2
        .p2align 4,,15
.globl strength_result2
        .type   strength_result2, @function
strength_result2:
.LFB3:
        movl    (%rdi), %r8d
        leaq    8(%rdi), %rsi
        leaq    4(%rdi), %rcx
        xorl    %edx, %edx
        .p2align 4,,7
.L9:
        movslq  (%rsi),%rax
        addl    %r8d, %edx
        movl    $2, (%rdi,%rax,4)
        cmpl    (%rcx), %edx
        jl      .L9
        rep ; ret
.LFE3:
        .size   strength_result2, .-strength_result2

--
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|---                         |4.0.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316
[Bug tree-optimization/18316] Missed IV optimization
--- Additional Comments From rakdver at gcc dot gnu dot org 2005-01-23 14:58 ---
Hmm... ivopts is definitely not responsible for this, and I am fairly surprised that this is fixed; could you please check what's happening?

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316
[Bug tree-optimization/18316] Missed IV optimization
--- Additional Comments From pinskia at gcc dot gnu dot org 2004-11-06 17:00 ---
Confirmed, so does this.

--
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
   Last reconfirmed|-00-00 00:00:00             |2004-11-06 17:00:55
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18316