[Bug tree-optimization/64705] Bad code generation of sieve on x86-64 because of too aggressive IV optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64705

--- Comment #7 from Vladimir Makarov vmakarov at gcc dot gnu.org ---
(In reply to amker from comment #6)
> Since it works on gcc 3.4, so I consider this as a regression and applied the
> patch. Should be fixed now.
>
> Hi Vlad, could you please help me verify that the original benchmark is fixed
> too? Thanks very much!

Yes, it was fixed. Thanks for working on this.
[Bug tree-optimization/64705] Bad code generation of sieve on x86-64 because of too aggressive IV optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64705

amker at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from amker at gcc dot gnu.org ---
Fixed according to Vlad's input.
[Bug tree-optimization/64705] Bad code generation of sieve on x86-64 because of too aggressive IV optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64705

--- Comment #5 from amker at gcc dot gnu.org ---
Author: amker
Date: Fri Feb 13 05:44:46 2015
New Revision: 220676

URL: https://gcc.gnu.org/viewcvs?rev=220676&root=gcc&view=rev
Log:
    PR tree-optimization/64705
    * tree-ssa-loop-niter.h (expand_simple_operations): New parameter.
    * tree-ssa-loop-niter.c (expand_simple_operations): New parameter.
    * tree-ssa-loop-ivopts.c (extract_single_var_from_expr): New.
    (find_bivs, find_givs_in_stmt_scev): Pass new argument to
    expand_simple_operations.

    testsuite
    PR tree-optimization/64705
    * gcc.dg/tree-ssa/pr64705.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/pr64705.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-loop-ivopts.c
    trunk/gcc/tree-ssa-loop-niter.c
    trunk/gcc/tree-ssa-loop-niter.h
[Bug tree-optimization/64705] Bad code generation of sieve on x86-64 because of too aggressive IV optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64705

--- Comment #6 from amker at gcc dot gnu.org ---
Since it works on gcc 3.4, so I consider this as a regression and applied the
patch. Should be fixed now.

Hi Vlad, could you please help me verify that the original benchmark is fixed
too? Thanks very much!
[Bug tree-optimization/64705] Bad code generation of sieve on x86-64 because of too aggressive IV optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64705

--- Comment #4 from amker at gcc dot gnu.org ---
I had a patch.
[Bug tree-optimization/64705] Bad code generation of sieve on x86-64 because of too aggressive IV optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64705

amker at gcc dot gnu.org changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                   |NEW
   Last reconfirmed|                              |2015-01-23
                 CC|                              |amker at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org |amker at gcc dot gnu.org
     Ever confirmed|0                             |1

--- Comment #1 from amker at gcc dot gnu.org ---
Confirmed. I shall have a look.
[Bug tree-optimization/64705] Bad code generation of sieve on x86-64 because of too aggressive IV optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64705

--- Comment #2 from amker at gcc dot gnu.org ---
Loop dump before IVOPT is like below:

Loop 4, basic blocks 28/30;

    <bb 26>:
    count_54 = count_172 + 1;
    _55 = i_161 + i_161;
    prime_56 = _55 + 3;
    k_57 = prime_56 + i_161;
    if (size_26 >= k_57)
      goto <bb 27>;
    else
      goto <bb 31>;

    <bb 27>:

    <bb 28>:
    # k_167 = PHI <k_57(27), k_62(30)>
    # ci_168 = PHI <ci_169(27), ci_58(30)>
    ci_58 = ci_168 + 1;
    k.19_59 = (sizetype) k_167;
    _60 = flags_30 + k.19_59;
    *_60 = 0;
    k_62 = prime_56 + k_167;
    if (size_26 >= k_62)
      goto <bb 30>;
    else
      goto <bb 29>;

    <bb 29>:
    # ci_154 = PHI <ci_58(28)>
    goto <bb 31>;

    <bb 30>:
    goto <bb 28>;

The IV uses found by IVOPT are like below:

    use 0
      address
      defined in statement
      used in statement *_60 = 0;
      at position *_60
      type char *
      base flags_30 + (sizetype) k_57
      step (sizetype) prime_56
      base object (void *) flags_30
      related candidates

    use 1
      compare
      defined in statement
      used in statement if (size_26 >= k_62)
      at position
      type long int
      base (_55 + 3) + k_57
      step prime_56
      is a biv
      related candidates

    use 2
      generic (computed on exit edge)
      defined in statement ci_58 = ci_168 + 1;
      used in statement ci_154 = PHI <ci_58(28)>
      at position
      type long int
      base ci_169 + 1
      step 1
      is a biv
      related candidates

The root cause is that IVOPT expands use 1 from {prime_56 + k_57, prime_56}_loop
to {(_55 + 3) + k_57, prime_56}_loop. Thus the information that
iv.step == prime_56 == (_55 + 3) is lost during cost computation and use
rewriting, resulting in the wrong candidate being selected and a bloated loop
after IVOPT.

The related code is in function find_givs_in_stmt_scev, specifically:

    if (!simple_iv (loop, loop_containing_stmt (stmt), lhs, iv, true))
      return false;
    iv->base = expand_simple_operations (iv->base);  // <--- expansion

I will see how to fix the issue by skipping expansion in cases like this.

Thanks,
bin
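
For readers without the benchmark at hand, the GIMPLE above appears to correspond to the inner loop of a classic sieve kernel. The following is only a sketch reconstructed from the dump, not the original benchmark source and not the pr64705.c test; the function name sieve_count and its signature are made up for illustration. The SSA names from the dump are noted in comments where the mapping seems clear.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reconstruction of the sieve kernel.  flags must point to at
   least size + 1 bytes.  Returns the number of primes found; the number of
   inner-loop iterations is stored in *ci_out when ci_out is not NULL.  */
long sieve_count(char *flags, long size, long *ci_out)
{
    long i, k, prime, count = 0, ci = 0;

    assert(flags != NULL && size >= 0);

    for (i = 0; i <= size; i++)
        flags[i] = 1;

    for (i = 0; i <= size; i++) {
        if (flags[i]) {
            prime = i + i + 3;               /* _55 = i + i; prime_56 = _55 + 3 */
            count++;                         /* count_54 = count_172 + 1 */
            /* k starts at prime + i (k_57) and advances by prime (k_62).  */
            for (k = i + prime; k <= size; k += prime) {
                flags[k] = 0;                /* *_60 = 0 */
                ci++;                        /* ci_58 = ci_168 + 1 */
            }
        }
    }

    if (ci_out != NULL)
        *ci_out = ci;
    return count;
}
```

In this form both the start value of k and its step are written in terms of prime. That relationship is what the expansion of the base to (_55 + 3) + k_57 hides from the cost model.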
[Bug tree-optimization/64705] Bad code generation of sieve on x86-64 because of too aggressive IV optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64705

amker at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|x86_64-*-*                  |x86_64-*-*, aarch64

--- Comment #3 from amker at gcc dot gnu.org ---
Also it's a target independent issue. Though IVOPT chooses base+index
addressing mode, it needs one more instruction to calculate the condition.

LLVM's assembly:
.LBB34_7:
        add     x26, x26, #1
        strb    wzr, [x22, x9]
        add     x9, x9, x24
        cmp     x9, x28
        b.le    .LBB34_7

GCC's assembly:
.L71:
        strb    wzr, [x27, x0]
        add     x0, x0, x2
        add     x19, x19, 1
        add     x1, x4, x0
        cmp     x21, x1
        bge     .L71
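
One way to read the two listings at the C level is sketched below. This is an interpretation, not compiler output: the register roles are assumed from the instruction sequences alone (x9 as k in the LLVM loop; x0 as an index biased by a loop-invariant x4 in the GCC loop), and the names clear_direct, clear_offset and off are hypothetical.

```c
#include <string.h>

/* Shape of the LLVM loop: a single IV (k) serves as both the store index
   and the compared value.  Caller must ensure k <= size and prime > 0.  */
long clear_direct(char *flags, long k, long prime, long size)
{
    long ci = 0;
    do {
        ci++;
        flags[k] = 0;
        k += prime;
    } while (k <= size);
    return ci;
}

/* Shape of the GCC loop: the store goes through base[idx], where
   base = flags + off and idx = k - off (off is an assumed loop-invariant
   bias), so the value to compare has to be rebuilt as off + idx on every
   iteration.  Same preconditions as above; off must stay inside flags.  */
long clear_offset(char *flags, long k, long prime, long size, long off)
{
    char *base = flags + off;
    long idx = k - off;
    long ci = 0;
    long t;
    do {
        base[idx] = 0;
        idx += prime;
        ci++;
        t = off + idx;          /* the extra add before the compare */
    } while (size >= t);
    return ci;
}
```

Both functions clear the same bytes and run the same number of iterations; the second simply carries one more addition per iteration, which matches the instruction-count difference between the two listings.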