[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From pinskia at gcc dot gnu dot org 2005-02-13 23:14 --- Moving to 4.1 as I don't care much if this gets fixed. -- What|Removed |Added Target Milestone|4.0.0 |4.1.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr 2005-01-27 13:18 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) With the following patch I got some speedup for depth 100. from: tree iv optimization : 2.62 (62%) usr 0.27 (82%) sys 2.92 (62%) wall tree iv optimization : 2.33 (59%) usr 0.25 (83%) sys 2.63 (54%) wall to: tree iv optimization : 1.19 (46%) usr 0.04 (40%) sys 1.26 (45%) wall tree iv optimization : 1.21 (47%) usr 0.05 (45%) sys 1.30 (46%) wall Basically we're reseting too much information each time scev_reset is called. It would even be possible to reset only the part of the scev database that contains symbols instead of erasing everything. Do somebody know how to iterate over all the elements of the hash setting to NULL_TREE the elements that answer true to chrec_contains_symbols? Index: tree-scalar-evolution.c === RCS file: /cvs/gcc/gcc/gcc/tree-scalar-evolution.c,v retrieving revision 2.16 diff -d -u -p -r2.16 tree-scalar-evolution.c --- tree-scalar-evolution.c 18 Jan 2005 11:36:26 - 2.16 +++ tree-scalar-evolution.c 27 Jan 2005 13:12:36 - @@ -2490,7 +2490,7 @@ scev_reset (void) for (i = 1; i current_loops-num; i++) { loop = current_loops-parray[i]; - if (loop) + if (loop chrec_contains_symbols (loop-nb_iterations)) loop-nb_iterations = NULL_TREE; } } -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-27 15:00 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) --- Additional Comments From sebastian dot pop at cri dot ensmp dot fr 2005-01-27 13:18 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) With the following patch I got some speedup for depth 100. from: tree iv optimization : 2.62 (62%) usr 0.27 (82%) sys 2.92 (62%) wall tree iv optimization : 2.33 (59%) usr 0.25 (83%) sys 2.63 (54%) wall to: tree iv optimization : 1.19 (46%) usr 0.04 (40%) sys 1.26 (45%) wall tree iv optimization : 1.21 (47%) usr 0.05 (45%) sys 1.30 (46%) wall Basically we're reseting too much information each time scev_reset is called. It would even be possible to reset only the part of the scev database that contains symbols instead of erasing everything. Do somebody know how to iterate over all the elements of the hash setting to NULL_TREE the elements that answer true to chrec_contains_symbols? the patch is below (in stronger form -- only removing entries that contain invalidated symbols), and indeed helps with this testcase. It however causes measurable slow down on normal code (see analysis in one of the previous mails). Note that your patch is not entirely correct -- in case loop is unrolled, its number of iterations is changed, so in this case scev_reset should clear its number of iterations unconditionally (I think something similar occurs with vectorizer, so we need to be a bit careful). Index: tree-loop-linear.c === RCS file: /cvs/gcc/gcc/gcc/tree-loop-linear.c,v retrieving revision 2.6 diff -c -3 -p -r2.6 tree-loop-linear.c *** tree-loop-linear.c 18 Jan 2005 11:36:24 - 2.6 --- tree-loop-linear.c 24 Jan 2005 01:34:04 - *** linear_transform_loops (struct loops *lo *** 373,379 free_data_refs (datarefs); } free_df (); ! scev_reset (); rewrite_into_loop_closed_ssa (); #ifdef ENABLE_CHECKING verify_loop_closed_ssa (); --- 373,379 free_data_refs (datarefs); } free_df (); ! scev_reset (NULL); rewrite_into_loop_closed_ssa (); #ifdef ENABLE_CHECKING verify_loop_closed_ssa (); Index: tree-scalar-evolution.c === RCS file: /cvs/gcc/gcc/gcc/tree-scalar-evolution.c,v retrieving revision 2.16 diff -c -3 -p -r2.16 tree-scalar-evolution.c *** tree-scalar-evolution.c 18 Jan 2005 11:36:26 - 2.16 --- tree-scalar-evolution.c 24 Jan 2005 01:34:05 - *** scev_initialize (struct loops *loops) *** 2475,2484 loops-parray[i]-nb_iterations = NULL_TREE; } ! /* Cleans up the information cached by the scalar evolutions analysis. */ void ! scev_reset (void) { unsigned i; struct loop *loop; --- 2475,2539 loops-parray[i]-nb_iterations = NULL_TREE; } ! /* Returns true if EXPR references one of the ssa names in set !INVALIDATED_NAMES. */ ! ! static bool ! tree_references_names (tree expr, bitmap invalidated_names) ! { ! if (!expr) ! return false; ! ! if (TREE_CODE (expr) == SSA_NAME) ! return bitmap_bit_p (invalidated_names, SSA_NAME_VERSION (expr)); ! ! switch (TREE_CODE_LENGTH (TREE_CODE (expr))) ! { ! case 4: ! if (tree_references_names (TREE_OPERAND (expr, 3), invalidated_names)) ! return true; ! case 3: ! if (tree_references_names (TREE_OPERAND (expr, 2), invalidated_names)) ! return true; ! case 2: ! if (tree_references_names (TREE_OPERAND (expr, 1), invalidated_names)) ! return true; ! case 1: ! if (tree_references_names (TREE_OPERAND (expr, 0), invalidated_names)) ! return true; ! case 0: ! return false; ! ! default: ! /* Don't worry about handling strange cases. This function is only used !in a way in that it does not matter if we return true here. */ ! return true; ! } ! } ! ! /* If the SLOT in the scev cache contains ssa name belonging to the set !passed in DATA, the function removes it from the cache. Callback !for htab_traverse. */ ! ! static int ! invalidate_names_reference (void **slot, void *data) ! { ! bitmap invalidated_names = data; ! struct scev_info_str *elt = *slot; ! ! if (tree_references_names (elt-var, invalidated_names) ! || tree_references_names (elt-chrec, invalidated_names)) ! htab_clear_slot (scalar_evolution_info, slot); ! ! return 1; ! } ! ! /* Cleans up the information cached by the scalar evolutions analysis. !If INVALIDATED_NAMES is provided, only the references to invalidated !ssa names are removed. */ void ! scev_reset (bitmap invalidated_names) { unsigned i; struct loop *loop; *** scev_reset (void) *** 2486,2497 if
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr 2005-01-27 15:12 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) rakdver at atrey dot karlin dot mff dot cuni dot cz wrote: the patch is below (in stronger form -- only removing entries that contain invalidated symbols), and indeed helps with this testcase. Many thanks. It however causes measurable slow down on normal code (see analysis in one of the previous mails). I see. Another idea: would it be possible to insert the invalidated names during the optimization pass instead of invalidating all the symbols? This would avoid the first pass over the scev database that search for symbols. Note that your patch is not entirely correct -- in case loop is unrolled, its number of iterations is changed, so in this case scev_reset should clear its number of iterations unconditionally or the unroller should do that work on the unrolled loops? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-27 19:10 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) Another idea: would it be possible to insert the invalidated names during the optimization pass instead of invalidating all the symbols? This would avoid the first pass over the scev database that search for symbols. I don't understand this proposal. Note that your patch is not entirely correct -- in case loop is unrolled, its number of iterations is changed, so in this case scev_reset should clear its number of iterations unconditionally or the unroller should do that work on the unrolled loops? This seems to be the right approach. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr 2005-01-27 20:38 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) rakdver at atrey dot karlin dot mff dot cuni dot cz wrote: Another idea: would it be possible to insert the invalidated names during the optimization pass instead of invalidating all the symbols? This would avoid the first pass over the scev database that search for symbols. I don't understand this proposal. Anyway, I see that I was wrong: even if you mark the SSA_NAMEs that are removed in an optimization, and that you invalidate only those symbols, you still have to walk the scev database to ensure that there is no other evolution that references the removed SSA_NAMEs. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr 2005-01-25 10:32 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) rakdver at atrey dot karlin dot mff dot cuni dot cz wrote: Adding the instantiation cache was compile time neutral on normal code, so I don't think the effect on scev cost is significant. How that? You end up querying and caching an evolution for every instantiation of an SSA_NAME! I will quantify the number of queries wrt the number of times this information is useful. I think that with my patch, the instantiation cache information is used zero times on a bootstrap. The problem is that we end up calling the instantiate_parameters_1 function 83214+2453273 (outside + recursive) times (for N=100). We also call analyze_scalar_evolution_1 1146317 times. Many of these calls create POLYNOMIAL_CHREC nodes (2894131 calls to build3_stat). Large part of 5142869 of build_int_cst_wide calls is very likely also due to scev analysis. This is not going to be cheap. Removing the instantiation cache definitely would not decrease the number of instantiate_parameters_1 calls (might increase it, in fact, although I don't know how significantly). This also could be a bad use of the scev analysis. For Steven: you can have a O(N**3) algorithm very simply: loop O(N) times loop O(N) times algorithm in O(N) One part of the problem is that loop optimizers need to throw away the scev caches after each change to loops, which leads to recounting the information too much. Allowing to invalidate only changed parts helps, but seems to be relatively costly on normal code -- you need to scan the hash table for things that refer to removed or from some other reason invalidated ssa names, which is slow. Just set the SSA_NAME to not_analyzed_yet or NULL_TREE, and the next time you'll ask for the evolution of this SSA_NAME you'll analyze the evolution from scratch. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr 2005-01-25 10:39 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) rakdver at atrey dot karlin dot mff dot cuni dot cz wrote: How? If the reference is left in symbolic form, it means that you know nothing about its value, so the only thing you can do with it is to check its equality/inequality, in code like while (...) { i = something_weird (); a[i] = ...; (a) a[i+1] = ...; (b) a[i] = ...; (c) } The following is probably a more frequent case: i = 0 x = something_weird () or a function parameter loop i = i + 1 a[i] = ... ... = a[i + x] endloop where you end with a symbolic distance vector. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr 2005-01-25 11:02 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) rakdver at atrey dot karlin dot mff dot cuni dot cz wrote: (*) I hope; scev is a mess of mutualy recursive functions -- analyze_scalar_evolution calling number_of_iterations calling instantiate_parameters calling analyze_scalar_evolution again, with instantiate_parameters hacked so that the infinite cycle cannot occur is my favourite one. Nobody can say anything sure about behavior of scev -- it is not even well defined what analyze_scalar_evolutions will return to you, It returns to you an evolution that might contain SSA_NAMEs. unless you call instantiate_parameters or resolve_mixers on the result. And once you call instantiate_parameters on the result you're guaranteed that what you get does contain only determined constants, or otherwise the result is chrec_dont_know. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-25 11:14 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) rakdver at atrey dot karlin dot mff dot cuni dot cz wrote: Adding the instantiation cache was compile time neutral on normal code, so I don't think the effect on scev cost is significant. How that? You end up querying and caching an evolution for every instantiation of an SSA_NAME! which is pretty cheap (constant time operation); note that we do an equivalent lookup in the analyze_scalar_evolution call in any case, also without causing any compile time problems. I haven't measured any slowdown on normal testcases. I will quantify the number of queries wrt the number of times this information is useful. I think that with my patch, the instantiation cache information is used zero times on a bootstrap. I don't think so. Please come up with some numbers, otherwise this discussion is pointless. The problem is that we end up calling the instantiate_parameters_1 function 83214+2453273 (outside + recursive) times (for N=100). We also call analyze_scalar_evolution_1 1146317 times. Many of these calls create POLYNOMIAL_CHREC nodes (2894131 calls to build3_stat). Large part of 5142869 of build_int_cst_wide calls is very likely also due to scev analysis. This is not going to be cheap. Removing the instantiation cache definitely would not decrease the number of instantiate_parameters_1 calls (might increase it, in fact, although I don't know how significantly). This also could be a bad use of the scev analysis. partly -- see the analysis at the PR. However one O(n) per query comes from scev itself. For Steven: you can have a O(N**3) algorithm very simply: loop O(N) times loop O(N) times algorithm in O(N) One part of the problem is that loop optimizers need to throw away the scev caches after each change to loops, which leads to recounting the information too much. Allowing to invalidate only changed parts helps, but seems to be relatively costly on normal code -- you need to scan the hash table for things that refer to removed or from some other reason invalidated ssa names, which is slow. Just set the SSA_NAME to not_analyzed_yet or NULL_TREE, and the next time you'll ask for the evolution of this SSA_NAME you'll analyze the evolution from scratch. This does not work, of course. When the ssa name is removed, you must remove all references to it from the cache. Otherwise you will ICE when you try to process the relevant entry in (say) instantiate_parameters. What's worse, the ssa name may get reused by ssa name management, thus causing you not even be able to note the change, and thus possibly to misscompilations. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-25 11:16 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) How? If the reference is left in symbolic form, it means that you know nothing about its value, so the only thing you can do with it is to check its equality/inequality, in code like while (...) { i = something_weird (); a[i] = ...; (a) a[i+1] = ...; (b) a[i] = ...; (c) } The following is probably a more frequent case: i = 0 x = something_weird () or a function parameter loop i = i + 1 a[i] = ... ... = a[i + x] endloop where you end with a symbolic distance vector. But here x is a loop invariant. Nothing would change here if we were keeping the evolutions fully instantiated. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-25 11:29 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) (*) I hope; scev is a mess of mutualy recursive functions -- analyze_scalar_evolution calling number_of_iterations calling instantiate_parameters calling analyze_scalar_evolution again, with instantiate_parameters hacked so that the infinite cycle cannot occur is my favourite one. Nobody can say anything sure about behavior of scev -- it is not even well defined what analyze_scalar_evolutions will return to you, It returns to you an evolution that might contain SSA_NAMEs. Hmm... if this is all you can tell, I start fear even more; this does not define any semantics at all :-) More seriously -- which of the possibilities? If I have loops like while (...) { while (...) { x_1 = something (); } x_2 = phi (x_1); x_3 = x_2 + 1; } What will analyze_scalar_evolutions return for x_3? There are (at least) three possible valid values: x_3 x_2 + 1 x_1 + 1 In this example probably the last one will be chosen, but in more complicated cases you cannot tell (without simulating what scev analysis does). It might be better to not export analyze_scalar_evolution at all and to force it to be used through instantiate_parameters/resolve_mixers, that have at least a well defined semantics of return value -- I hope. Till recently I believed that I made the functions to behave reasonably in cases when values defined inside loop are used outside of it (through chain of lcssa phi nodes), but my attempt to remove the hacks that ensure sane behavior of the results for ivopts causes misscompilations; I am investigating the cause now. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr 2005-01-25 12:03 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) rakdver at atrey dot karlin dot mff dot cuni dot cz wrote: --- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-25 11:14 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) rakdver at atrey dot karlin dot mff dot cuni dot cz wrote: Adding the instantiation cache was compile time neutral on normal code, so I don't think the effect on scev cost is significant. How that? You end up querying and caching an evolution for every instantiation of an SSA_NAME! which is pretty cheap (constant time operation); note that we do an equivalent lookup in the analyze_scalar_evolution call in any case, also without causing any compile time problems. I haven't measured any slowdown on normal testcases. I will quantify the number of queries wrt the number of times this information is useful. I think that with my patch, the instantiation cache information is used zero times on a bootstrap. I don't think so. Please come up with some numbers, otherwise this discussion is pointless. during a bootstrap: instantiation cache queries: 1146452 instantiation cache uses: 247452 of which 143977 were scev_not_known, the other were SSA_NAMEs. this was counted with a grep|wc with the following patch: Index: tree-scalar-evolution.c === RCS file: /cvs/gcc/gcc/gcc/tree-scalar-evolution.c,v retrieving revision 2.16 diff -d -u -p -r2.16 tree-scalar-evolution.c --- tree-scalar-evolution.c 18 Jan 2005 11:36:26 - 2.16 +++ tree-scalar-evolution.c 25 Jan 2005 12:00:57 - @@ -1898,8 +1898,15 @@ get_instantiated_value (htab_t cache, tr pattern.var = version; info = htab_find (cache, pattern); + fprintf (stdout, IC_query \n); + if (info) -return info-chrec; +{ + fprintf (stdout, IC_used_once \n); + print_generic_expr (stdout, info-chrec, 0); + fprintf (stdout, \n); + return info-chrec; +} else return NULL_TREE; } @@ -1915,6 +1922,8 @@ set_instantiated_value (htab_t cache, tr pattern.var = version; slot = htab_find_slot (cache, pattern, INSERT); + fprintf (stdout, IC_query \n); + if (*slot) info = *slot; else @@ -1990,7 +1999,8 @@ instantiate_parameters_1 (struct loop *l result again. */ bitmap_set_bit (already_instantiated, SSA_NAME_VERSION (chrec)); res = analyze_scalar_evolution (def_loop, chrec); - res = instantiate_parameters_1 (loop, res, allow_superloop_chrecs, cache); + if (res != chrec_dont_know) + res = instantiate_parameters_1 (loop, res, allow_superloop_chrecs, cache); bitmap_clear_bit (already_instantiated, SSA_NAME_VERSION (chrec)); /* Store the correct value to the cache. */ @@ -2000,8 +2010,14 @@ instantiate_parameters_1 (struct loop *l case POLYNOMIAL_CHREC: op0 = instantiate_parameters_1 (loop, CHREC_LEFT (chrec), allow_superloop_chrecs, cache); + if (op0 == chrec_dont_know) + return chrec_dont_know; + op1 = instantiate_parameters_1 (loop, CHREC_RIGHT (chrec), allow_superloop_chrecs, cache); + if (op1 == chrec_dont_know) + return chrec_dont_know; + if (CHREC_LEFT (chrec) != op0 || CHREC_RIGHT (chrec) != op1) chrec = build_polynomial_chrec (CHREC_VARIABLE (chrec), op0, op1); @@ -2010,8 +2026,14 @@ instantiate_parameters_1 (struct loop *l case PLUS_EXPR: op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0), allow_superloop_chrecs, cache); + if (op0 == chrec_dont_know) + return chrec_dont_know; + op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1), allow_superloop_chrecs, cache); + if (op1 == chrec_dont_know) + return chrec_dont_know; + if (TREE_OPERAND (chrec, 0) != op0 || TREE_OPERAND (chrec, 1) != op1) chrec = chrec_fold_plus (TREE_TYPE (chrec), op0, op1); @@ -2020,8 +2042,14 @@ instantiate_parameters_1 (struct loop *l case MINUS_EXPR: op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0), allow_superloop_chrecs, cache); + if (op0 == chrec_dont_know) + return chrec_dont_know; + op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1), allow_superloop_chrecs, cache); + if (op1 == chrec_dont_know) +return chrec_dont_know; + if (TREE_OPERAND (chrec, 0) != op0 || TREE_OPERAND (chrec, 1) != op1) chrec = chrec_fold_minus (TREE_TYPE (chrec), op0, op1); @@ -2030,8 +2058,14 @@ instantiate_parameters_1 (struct loop *l
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr 2005-01-25 12:44 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) rakdver at atrey dot karlin dot mff dot cuni dot cz wrote: More seriously -- which of the possibilities? If I have loops like while (...) { while (...) { x_1 = something (); } x_2 = phi (x_1); x_3 = x_2 + 1; } What will analyze_scalar_evolutions return for x_3? There are (at least) three possible valid values: x_3 This would be the answer if analyze_scalar_evolutions would be the identity function. If you want, you could change analyze_scalar_evolutions such that it behaves like that, and decide that the instantiation do the rest of the work (I mean moving the code that is currently in analyze_scalar_evolutions to the instantiation phase). x_2 + 1 If you decide to reconstruct the tree expression, there is no reason to stop on a phi node that has a single argument. Why would you like to get this answer as the reconstructed tree? x_1 + 1 IMO this would be the answer, although I didn't checked. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-25 15:54 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) rakdver at atrey dot karlin dot mff dot cuni dot cz wrote: More seriously -- which of the possibilities? If I have loops like while (...) { while (...) { x_1 = something (); } x_2 = phi (x_1); x_3 = x_2 + 1; } What will analyze_scalar_evolutions return for x_3? There are (at least) three possible valid values: x_3 This would be the answer if analyze_scalar_evolutions would be the identity function. If you want, you could change analyze_scalar_evolutions such that it behaves like that, and decide that the instantiation do the rest of the work (I mean moving the code that is currently in analyze_scalar_evolutions to the instantiation phase). x_2 + 1 If you decide to reconstruct the tree expression, there is no reason to stop on a phi node that has a single argument. Why would you like to get this answer as the reconstructed tree? because this answer preserves loop closed ssa form -- the answer x_1 + 1 copy propagates the value outsied of the loop. In some applications this choice could make sense. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr 2005-01-25 16:26 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) If you decide to reconstruct the tree expression, there is no reason to stop on a phi node that has a single argument. Why would you like to get this answer as the reconstructed tree? because this answer preserves loop closed ssa form -- the answer x_1 + 1 copy propagates the value outsied of the loop. In some applications this choice could make sense. Okay, if it makes sense, you could modify the analyzer such that it stops reconstructing the tree once it is on the phi following a loop. This won't change the information provided after the instantiation. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
-- What|Removed |Added Severity|normal |minor Priority|P2 |P3 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From sebastian dot pop at cri dot ensmp dot fr 2005-01-24 21:36 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) Still there remain some inefficiences within the scev analysis itself. Zdenek, have you tried to revert the patch that caches the instantiation information? This could speed up things a little. On a side note, I wonder how many times we're using this instantiation cache, in other words whether we pay a high price for the scev analysis for some (probably) rare cases... -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-24 22:16 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) Still there remain some inefficiences within the scev analysis itself. Zdenek, have you tried to revert the patch that caches the instantiation information? This could speed up things a little. On a side note, I wonder how many times we're using this instantiation cache, in other words whether we pay a high price for the scev analysis for some (probably) rare cases... Adding the instantiation cache was compile time neutral on normal code, so I don't think the effect on scev cost is significant. The problem is that we end up calling the instantiate_parameters_1 function 83214+2453273 (outside + recursive) times (for N=100). We also call analyze_scalar_evolution_1 1146317 times. Many of these calls create POLYNOMIAL_CHREC nodes (2894131 calls to build3_stat). Large part of 5142869 of build_int_cst_wide calls is very likely also due to scev analysis. This is not going to be cheap. Removing the instantiation cache definitely would not decrease the number of instantiate_parameters_1 calls (might increase it, in fact, although I don't know how significantly). One part of the problem is that loop optimizers need to throw away the scev caches after each change to loops, which leads to recounting the information too much. Allowing to invalidate only changed parts helps, but seems to be relatively costly on normal code -- you need to scan the hash table for things that refer to removed or from some other reason invalidated ssa names, which is slow. And to make things worse, this change of course means that most of the information is left in the hash table, i.e. the size of the table grows and these scans get slower and slower. The alternative -- checking the validity of entries only when querying the cache -- is not possible, since we reuse the released ssa names, so we would be unable to recognize the validity of the information. Other part is that scev tries to be too clever. Without need to represent nonaffine induction variables (that we do not use anywhere) we could use more memory efficient representation of ivs. Without need to handle symbolic references (that we also do not use anywhere, we could store evolutions in a fully instantiated form, and we would not need instantiate_parameters/resolve_mixers functions at all. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From dberlin at gcc dot gnu dot org 2005-01-24 22:25 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) Other part is that scev tries to be too clever. Without need to represent nonaffine induction variables (that we do not use anywhere) we could use more memory efficient representation of ivs. Without need to handle symbolic references (that we also do not use anywhere, we could store evolutions in a fully instantiated form, and we would not need instantiate_parameters/resolve_mixers functions atall. Uh, symbolic references are or will be used to do data dependence when MEM_REF and ARRAY_REF couldn't be generated from the pointers. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-24 23:09 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) Other part is that scev tries to be too clever. Without need to represent nonaffine induction variables (that we do not use anywhere) we could use more memory efficient representation of ivs. Without need to handle symbolic references (that we also do not use anywhere, we could store evolutions in a fully instantiated form, and we would not need instantiate_parameters/resolve_mixers functions atall. Uh, symbolic references are or will be used to do data dependence when MEM_REF and ARRAY_REF couldn't be generated from the pointers. How? If the reference is left in symbolic form, it means that you know nothing about its value, so the only thing you can do with it is to check its equality/inequality, in code like while (...) { i = something_weird (); a[i] = ...; (a) a[i+1] = ...; (b) a[i] = ...; (c) } to find that (b) is independent on (a) and (c) and to find the dependence between (a) and (c), but you do not need scev for this -- value numbering info is enough. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From dberlin at gcc dot gnu dot org 2005-01-24 23:12 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) On Mon, 24 Jan 2005, rakdver at atrey dot karlin dot mff dot cuni dot cz wrote: --- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-24 23:09 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) Other part is that scev tries to be too clever. Without need to represent nonaffine induction variables (that we do not use anywhere) we could use more memory efficient representation of ivs. Without need to handle symbolic references (that we also do not use anywhere, we could store evolutions in a fully instantiated form, and we would not need instantiate_parameters/resolve_mixers functions atall. Uh, symbolic references are or will be used to do data dependence when MEM_REF and ARRAY_REF couldn't be generated from the pointers. How? If the reference is left in symbolic form, it means that you know nothing about its value, Wrong. We could know things about it's value, such as ranges. We just don't know an exact number. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-25 00:27 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) Other part is that scev tries to be too clever. Without need to represent nonaffine induction variables (that we do not use anywhere) we could use more memory efficient representation of ivs. Without need to handle symbolic references (that we also do not use anywhere, we could store evolutions in a fully instantiated form, and we would not need instantiate_parameters/resolve_mixers functions atall. Uh, symbolic references are or will be used to do data dependence when MEM_REF and ARRAY_REF couldn't be generated from the pointers. How? If the reference is left in symbolic form, it means that you know nothing about its value, Wrong. We could know things about it's value, such as ranges. We just don't know an exact number. OK. This is work for VRP, not scev. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From dberlin at gcc dot gnu dot org 2005-01-25 00:39 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) Uh, symbolic references are or will be used to do data dependence when MEM_REF and ARRAY_REF couldn't be generated from the pointers. How? If the reference is left in symbolic form, it means that you know nothing about its value, Wrong. We could know things about it's value, such as ranges. We just don't know an exact number. OK. This is work for VRP, not scev. And once we have it, we can talk about removing the symbolic stuff from SCEV :) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-25 00:49 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) Uh, symbolic references are or will be used to do data dependence when MEM_REF and ARRAY_REF couldn't be generated from the pointers. How? If the reference is left in symbolic form, it means that you know nothing about its value, Wrong. We could know things about it's value, such as ranges. We just don't know an exact number. OK. This is work for VRP, not scev. And once we have it, we can talk about removing the symbolic stuff from SCEV :) Once we use any value range info from SCEV, we can speak about leaving symbolic references in SCEV until we have a proper VRP :-) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From dberlin at gcc dot gnu dot org 2005-01-25 00:50 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) On Mon, 25 Jan 2005, rakdver at atrey dot karlin dot mff dot cuni dot cz wrote: --- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-25 00:49 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) Uh, symbolic references are or will be used to do data dependence when MEM_REF and ARRAY_REF couldn't be generated from the pointers. How? If the reference is left in symbolic form, it means that you know nothing about its value, Wrong. We could know things about it's value, such as ranges. We just don't know an exact number. OK. This is work for VRP, not scev. And once we have it, we can talk about removing the symbolic stuff from SCEV :) Once we use any value range info from SCEV, we can speak about leaving symbolic references in SCEV until we have a proper VRP :-) See autovec branch. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From dberlin at gcc dot gnu dot org 2005-01-25 00:52 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) See autovec branch. You could also look at recent patches posted by sebastian and i for the autovect branch that have been adding this support. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at gcc dot gnu dot org 2005-01-25 01:11 --- Which one? I cannot find anything relevant in changelog. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From dberlin at gcc dot gnu dot org 2005-01-25 01:15 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) Which one? I cannot find anything relevant in changelog. * tree-data-ref.c (analyze_subscript_affine_affine): Implement a solution for the FIXME concerning the symbolic affine dependence testing; remove the FIXME. (can_use_analyze_subscript_affine_affine): New function. (analyze_siv_subscript): Use it. and 2004-12-15 Daniel Berlin [EMAIL PROTECTED] * Makefile.in (tree-chrec.o): Add cfgloop.h * tree-chrec.c: Add cfgloop.h, tree-flow.h. (evolution_function_is_invariant_p): New function. (evolution_function_is_affine_multivariate_p): Use evolution_function_is_invariant_p instead of evolution_function_is_constant_p. * tree-chrec.h: Add prototype for evolution_function_is_invariant_p. (evolution_function_is_affine_p): Use evolution_function_is_invariant_p. * tree-data-ref.c (analyze_overlapping_iterations): chrecs that are equal overlap on every iteration. This stuff is just simple symbolic differencing, and checking of invariantness of symbols, But it is indeed starting to se the symbolic scev info -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-25 01:24 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) Which one? I cannot find anything relevant in changelog. * tree-data-ref.c (analyze_subscript_affine_affine): Implement a solution for the FIXME concerning the symbolic affine dependence testing; remove the FIXME. (can_use_analyze_subscript_affine_affine): New function. (analyze_siv_subscript): Use it. and 2004-12-15 Daniel Berlin [EMAIL PROTECTED] * Makefile.in (tree-chrec.o): Add cfgloop.h * tree-chrec.c: Add cfgloop.h, tree-flow.h. (evolution_function_is_invariant_p): New function. (evolution_function_is_affine_multivariate_p): Use evolution_function_is_invariant_p instead of evolution_function_is_constant_p. * tree-chrec.h: Add prototype for evolution_function_is_invariant_p. (evolution_function_is_affine_p): Use evolution_function_is_invariant_p. * tree-data-ref.c (analyze_overlapping_iterations): chrecs that are equal overlap on every iteration. This stuff is just simple symbolic differencing, and checking of invariantness of symbols, But it is indeed starting to se the symbolic scev info Ugh... OK, if you think it is a right way... Anyway, I am seriously considering resurrecting the simple iv analysis for purposes of passes that are not interested in this fancy stuff. Overhead of scev is probably acceptable if it is only used for this type of analysis, but for other purposes it is clearly overkill. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From steven at gcc dot gnu dot org 2005-01-25 01:28 --- Do remember that this bug is about bad behavior with deeply nested loops and we already have other algorithms that are quadratic in the loop nest depth. So I really wouldn't consider this to be a very serious problem, but rather something that could be improved. It is a shame that Seb has so far not commented on the behavior of his scev algorithm with respect to the loop nest depth. Is it expected to be cubic in the loop nest depth? Perhaps he can improve the algorithm? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From dberlin at gcc dot gnu dot org 2005-01-25 01:42 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) * tree-data-ref.c (analyze_overlapping_iterations): chrecs that are equal overlap on every iteration. This stuff is just simple symbolic differencing, and checking of invariantness of symbols, But it is indeed starting to se the symbolic scev info Ugh... OK, if you think it is a right way... Anyway, I am seriously considering resurrecting the simple iv analysis for purposes of passes that are not interested in this fancy stuff. Overhead of scev is probably acceptable if it is only used for this type of analysis, but for other purposes it is clearly overkill. Uh, almost all high level loop optimizations are going to do very badly in the depth of the loop nest. The fact that scev doesn't do so well on a 100 deep loop nest doesn't really concern me at all. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-25 01:52 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) Do remember that this bug is about bad behavior with deeply nested loops and we already have other algorithms that are quadratic in the loop nest depth. So I really wouldn't consider this to be a very serious problem, but rather something that could be improved. It is a shame that Seb has so far not commented on the behavior of his scev algorithm with respect to the loop nest depth. Is it expected to be cubic in the loop nest depth? Perhaps he can improve the algorithm? the algorithm itself is not cubic with loop depth, but worst case linear (per query) (*). This time complexity is acceptable, IMHO. (*) I hope; scev is a mess of mutualy recursive functions -- analyze_scalar_evolution calling number_of_iterations calling instantiate_parameters calling analyze_scalar_evolution again, with instantiate_parameters hacked so that the infinite cycle cannot occur is my favourite one. Nobody can say anything sure about behavior of scev -- it is not even well defined what analyze_scalar_evolutions will return to you, unless you call instantiate_parameters or resolve_mixers on the result. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-25 01:57 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) * tree-data-ref.c (analyze_overlapping_iterations): chrecs that are equal overlap on every iteration. This stuff is just simple symbolic differencing, and checking of invariantness of symbols, But it is indeed starting to se the symbolic scev info Ugh... OK, if you think it is a right way... Anyway, I am seriously considering resurrecting the simple iv analysis for purposes of passes that are not interested in this fancy stuff. Overhead of scev is probably acceptable if it is only used for this type of analysis, but for other purposes it is clearly overkill. Uh, almost all high level loop optimizations are going to do very badly in the depth of the loop nest. Remove almost and high level from this sentence and I will agree with you. This really does not worry me that much. The fact that scev doesn't do so well on a 100 deep loop nest doesn't really concern me at all. Me neither. Its time and memory consumption on normal testcases however is not entirely satisfactory (not critical, but there definitely is a place for improvement), and implementation could (should?) also be significantly cleaned up. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From steven at gcc dot gnu dot org 2005-01-23 14:18 --- It's definitely not the bitmaps: time seconds secondscalls s/call s/call name 10.28 25.1425.14 24356740 0.00 0.00 ggc_alloc_stat 5.39 38.3213.18 3133403 0.00 0.00 instantiate_parameters_1 5.25 51.1512.83 34260186 0.00 0.00 htab_find_slot_with_hash 4.41 61.9410.79 40175413 0.00 0.00 build_int_cst_wide 4.35 72.5810.64 1139473 0.00 0.00 chrec_convert 3.79 81.86 9.28 1810 0.01 0.01 htab_empty 3.63 90.73 8.87 22863854 0.00 0.00 build3_stat 3.41 99.07 8.34 800 0.01 0.01 for_each_insn_in_loop 2.74105.76 6.69 415264 0.00 0.00 follow_ssa_edge 2.64112.21 6.45 3085512 0.00 0.00 find_interesting_uses_stmt 2.14117.45 5.24 3197505 0.00 0.00 record_invariant 2.13122.66 5.21 9695984 0.00 0.00 analyze_scalar_evolution_1 1.65126.70 4.04 11769469 0.00 0.00 comptypes 1.64130.70 4.00 11710741 0.00 0.00 fold_convert_const 1.63134.68 3.98 23551936 0.00 0.00 make_node_stat 1.44138.19 3.51 11464308 0.00 0.00 force_fit_type 1.28141.31 3.12 12009984 0.00 0.00 fold_convert 1.25144.37 3.06 21935414 0.00 0.00 eq_scev_info 1.18147.26 2.89 9695984 0.00 0.00 analyze_scalar_evolution 1.09149.93 2.67 326865 0.00 0.00 chrec_fold_plus_1 1.01152.41 2.48 200 0.01 0.94 tree_ssa_iv_optimize_loop 1.01154.88 2.47 29473273 0.00 0.00 is_gimple_min_invariant 0.99157.30 2.43 26291603 0.00 0.00 flow_bb_inside_loop_p 0.98159.70 2.40 22605936 0.00 0.00 flow_loop_nested_p 0.95162.03 2.33 3135046 0.00 0.00 htab_delete 0.95164.35 2.32 164459 0.00 0.00 htab_expand -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From steven at gcc dot gnu dot org 2005-01-23 14:23 --- Looks to me like Seb may want to look at this bug too... -- What|Removed |Added CC||spop at gcc dot gnu dot org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From steven at gcc dot gnu dot org 2005-01-23 14:24 --- One more, because I also run out of memory pretty quickly: --- 0.000.00 1/22863854 c_build_bind_expr [1608] 0.000.00 1/22863854 thread_across_edge [392] 0.000.00 199/22863854 estimate_numbers_of_iterations [184] 0.000.00 200/22863854 c_finish_loop [754] 0.000.019801/22863854 follow_ssa_edge_in_rhs cycle 2 [38] 0.020.06 50602/22863854 add_to_evolution [324] 0.100.32 247913/22863854 simple_iv cycle 2 [35] 0.170.55 428920/22863854 analyze_scalar_evolution_1 cycle 2 [12] 2.197.24 5648016/22863854 chrec_fold_plus_1 [24] 2.257.43 5793593/22863854 instantiate_parameters_1 cycle 2 [11] 4.15 13.69 10684608/22863854 chrec_convert [10] [13]15.68.87 29.30 22863854 build3_stat [13] 3.86 25.44 22863854/23551936 make_node_stat [15] --- -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From steven at gcc dot gnu dot org 2005-01-23 14:22 --- More profile data: --- 0.00 187.04 1/1 execute_pass_list [6] [7] 76.50.00 187.04 1 tree_ssa_iv_optimize [7] 2.48 184.56 200/200 tree_ssa_iv_optimize_loop [8] 0.000.00 1/201 free_loop_data [349] 0.000.00 2/30137 bitmap_obstack_free [178] 0.000.00 201/7042572 xcalloc [127] 0.000.00 3/2369varray_init [1063] 0.000.00 2/36950 bitmap_obstack_alloc [794] --- 2.48 184.56 200/200 tree_ssa_iv_optimize [7] [8] 76.52.48 184.56 200 tree_ssa_iv_optimize_loop [8] 30.61 134.21 309011/311017 simple_iv cycle 2 [35] 6.459.73 3085512/3085512 find_interesting_uses_stmt [25] 0.041.03 200/202 scev_reset [93] 0.000.76 200/200 determine_use_iv_costs [107] 0.000.50 200/200 rewrite_uses [142] 0.000.35 200/202 loop_commit_inserts [181] 0.110.11 20100/20100 find_interesting_uses_cond [241] 0.020.09 200/311017 number_of_iterations_exit cycle 2 [41] 0.060.04 200/201 free_loop_data [349] 0.000.09 200/200 create_new_ivs [360] 0.000.08 200/401 get_loop_body_in_dom_order [274] 0.000.06 200/2601get_loop_body [111] 0.050.00 200/200 remove_unused_ivs [440] 0.020.03 19703/99904 find_interesting_uses_outer_or_nonlin [223] 0.000.042004/2804add_candidate [413] 0.000.02 200/200 find_optimal_iv_set [609] 0.000.02 20507/40407 set_iv [503] 0.000.02 800/800 add_iv_value_candidates [689] 0.010.01 80200/26291603 flow_bb_inside_loop_p [44] 0.010.00 800/30137 bitmap_obstack_free [178] 0.000.001006/5012force_var_cost [590] 0.000.00 200/200 iv_ca_free [1067] 0.000.00 201/3005add_candidate_1 [530] 0.000.00 20708/370258 zero_p [517] 0.000.001399/3525413 get_iv [69] 0.000.00 402/12009984 fold_convert [17] 0.000.001198/3329551 is_gimple_reg [71] 0.000.001202/40175413 build_int_cst_wide [26] 0.000.00 200/200 single_dom_exit [1344] 0.000.00 402/132285 find_edge [436] 0.000.001003/8376514 bitmap_set_bit [92] 0.000.00 20507/20507 contains_abnormal_ssa_name_p [1427] 0.000.00 201/57945 int_cst_value [804] 0.000.001202/22889750 build_int_cst [141] 0.000.001006/5034add_cost [1463] 0.000.001007/22165 cst_and_fits_in_hwi [1748] 0.000.00 402/1401loop_latch_edge [2001] 0.000.00 201/1597loop_preheader_edge [1984] --- [9] 67.8 30.81 135.08 311017+24191806 cycle 2 as a whole [9] 13.18 43.06 3133403+18788021 instantiate_parameters_1 cycle 2 [11] 5.21 44.15 9695984 analyze_scalar_evolution_1 cycle 2 [12] 0.22 20.27 703698 interpret_rhs_modify_expr cycle 2 [22] 2.897.22 9695984 analyze_scalar_evolution cycle 2 [32] 0.576.74 313469+81390 follow_ssa_edge_in_rhs cycle 2 [38] 6.690.56 415264+1352605 follow_ssa_edge cycle 2 [39] 0.224.74 21906 number_of_iterations_exit cycle 2 [41] 0.051.16 30495 number_of_iterations_in_loop cycle 2 [86] 0.040.07 43784 instantiate_parameters cycle 2 [347] 0.050.01 65718 simplify_using_outer_evolutions cycle 2 [441] 0.050.00 30295 compute_overall_effect_of_inner_loop cycle 2 [459] --- 22088230 chrec_convert [10]
Re: [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
I believe seb/zdenek already submitted patches for speeding up scev quite recently, with the goal of alleviating this problem. I'm pretty sure they have not been applied yet.
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From dberlin at gcc dot gnu dot org 2005-01-23 15:01 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) I believe seb/zdenek already submitted patches for speeding up scev quite recently, with the goal of alleviating this problem. I'm pretty sure they have not been applied yet. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2005-01-23 15:06 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) I believe seb/zdenek already submitted patches for speeding up scev quite recently, with the goal of alleviating this problem. I'm pretty sure they have not been applied yet. the Sebastian's patch is not in yet; it might help a bit in this case, a believe, although I suspect a real source of the problem is elsewhere. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at gcc dot gnu dot org 2005-01-23 16:38 --- Sebastian's patch does not help significantly :-( -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at gcc dot gnu dot org 2005-01-23 17:14 --- Also did not help much. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at gcc dot gnu dot org 2005-01-23 17:04 --- One extra N factor seems to be coming from simple_iv (analyze_scalar_evolution_in_loop). The function was created before loop closed ssa form, under assumptions of lcssa it should be possible to get rid of this. I am working on patch -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at gcc dot gnu dot org 2005-01-24 01:40 --- The patch that causes ivopts to reset just the relevant parts of the scev cache (just regtesting) instead of clearing it completely helps a bit -- the compile time for N=100 gets about 2x better and memory consumption about 10x. Still there remain some inefficiences within the scev analysis itself. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at gcc dot gnu dot org 2005-01-24 01:46 --- On a side note, PRE also seems to have problems with the testcase. With the patch mentioned above, the largest consumers of compile time are ivopts (45%) and pre (20%). -- What|Removed |Added CC||dberlin at gcc dot gnu dot ||org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
Re: [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
On Sun, 24 Jan 2005, rakdver at gcc dot gnu dot org wrote: --- Additional Comments From rakdver at gcc dot gnu dot org 2005-01-24 01:46 --- On a side note, PRE also seems to have problems with the testcase. With the patch mentioned above, the largest consumers of compile time are ivopts (45%) and pre (20%). Uh, there was a bug filed about this, and i fixed it, last i looked.
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From dberlin at gcc dot gnu dot org 2005-01-24 01:49 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) On Sun, 24 Jan 2005, rakdver at gcc dot gnu dot org wrote: --- Additional Comments From rakdver at gcc dot gnu dot org 2005-01-24 01:46 --- On a side note, PRE also seems to have problems with the testcase. With the patch mentioned above, the largest consumers of compile time are ivopts (45%) and pre (20%). Uh, there was a bug filed about this, and i fixed it, last i looked. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From dberlin at gcc dot gnu dot org 2005-01-24 01:57 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) On a side note, PRE also seems to have problems with the testcase. With the patch mentioned above, the largest consumers of compile time are ivopts (45%) and pre (20%). Uh, there was a bug filed about this, and i fixed it, last i looked. Here it is. Bug 18673. N=100 takes .25 seconds now. I can't make PRE much faster than that, that's almost all hash function time. --Dan -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From belyshev at depni dot sinp dot msu dot ru 2004-12-22 23:40 --- It seems that extra O(N) factor comes from high memory usage. For smaller N (0..30) compiler behaves like O(2) and for larger ( 50) like O(2.5 .. 3). gnuplot fit [0:30] A*x**k+B data via A,k,B [snip] Final set of parametersAsymptotic Standard Error ===== A = 0.000465461 +/- 2.625e-05(5.64%) k = 1.90612 +/- 0.01652 (0.8667%) B = 0.0328722+/- 0.0007677(2.335%) gnuplot fit [50:175] A*x**k+B data via A,k,B [snip] Final set of parametersAsymptotic Standard Error ===== A = 4.76329e-06 +/- 3.493e-07(7.332%) k = 3.0033 +/- 0.01405 (0.4678%) B = 0.392746 +/- 0.02972 (7.567%) For N=175 memory usage on my machine exceedes 600 MB -- What|Removed |Added Last reconfirmed|2004-11-25 11:16:30 |2004-12-22 23:40:41 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From steven at gcc dot gnu dot org 2004-12-23 01:26 --- hmmm maybe the extra O(N) comes from O(N) bitmap operations? (Just guessing) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz 2004-12-23 01:29 --- Subject: Re: [4.0 Regression] IV-OPTS is O(N^3) hmmm maybe the extra O(N) comes from O(N) bitmap operations? (Just guessing) that might be the case, but I don't think it is likely (also just guessing :-) As far as I can tell all bitmaps in ivopts should be small for this testcase. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From rakdver at gcc dot gnu dot org 2004-11-28 18:42 --- IVOPTs should behave quadratically on this type of tests; I do not know where does the extra O(N) factor come from, and I would not expect the times to grow this fast. -- What|Removed |Added AssignedTo|unassigned at gcc dot gnu |rakdver at gcc dot gnu dot |dot org |org Status|NEW |ASSIGNED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
-- What|Removed |Added OtherBugsDependingO||18693 nThis|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From steven at gcc dot gnu dot org 2004-11-27 01:10 --- Of course it's not very useful to claim that some algorithm is O(N^x) when you don't say what N is... -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From belyshev at lubercy dot com 2004-11-27 01:45 --- (In reply to comment #3) Of course it's not very useful to claim that some algorithm is O(N^x) when you don't say what N is... N is number of nested loops. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595
[Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
--- Additional Comments From belyshev at lubercy dot com 2004-11-25 11:16 --- use this awk script to generate testcase (first arg is number of loops): BEGIN { ORS= print int f ()\n{\tint for (j = 0; j ARGV [1]; j++) print j j , print a;\n\ta = 0;\n print \tfor (j0 = 0; j0 2; j0 ++)\n for (j = 1; j ARGV [1]; j++) print \tfor (j j = j j-1 ; j j 2; j j ++)\n print \ta += for (j = 0; j ARGV [1]-1; j++) print j j + print j j ;\n\treturn a;\n}\n } N loops Time, s 50.05 100.17 150.38 201.14 252.81 304.68 357.52 4013.6 4521.8 5025.6 5533.9 -- What|Removed |Added Severity|minor |normal Status|UNCONFIRMED |NEW Ever Confirmed||1 Keywords||memory-hog Known to fail||4.0.0 Last reconfirmed|-00-00 00:00:00 |2004-11-25 11:16:30 date|| Summary|IV-OPTS is slow (and does |[4.0 Regression] IV-OPTS is |not scale) |O(N^3) Target Milestone|--- |4.0.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595