[Bug tree-optimization/102814] [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814 Aldy Hernandez changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #6 from Aldy Hernandez --- fixed
[Bug tree-optimization/102814] [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814 --- Comment #5 from CVS Commits --- The master branch has been updated by Aldy Hernandez : https://gcc.gnu.org/g:82cd78f2c31db1664ca154d7fcd24e9eaee1427f commit r12-4532-g82cd78f2c31db1664ca154d7fcd24e9eaee1427f Author: Aldy Hernandez Date: Wed Oct 20 09:05:23 2021 +0200 Restore --param=max-fsm-thread-length The removal of --param=max-fsm-thread-length is causing code explosion. I thought that --param=max-fsm-thread-path-insns was a better gague for path profitability than raw BB length, but it turns out that we don't take into account PHIs when estimating the number of statements. In this PR, we have a sequence of very large PHIs that have us traversing extremely large paths that blow up the compilation. We could fix this a couple of different ways. We could avoid traversing more than a certain number of PHI arguments, or ignore large PHIs altogether. The old implementation certainly had this knob, and we could cut things off before we even got to the ranger. We could also adjust the instruction estimation to take into account PHIs, but I'm sure we'll mess something else in the process ;-). The easiest thing to do is just restore the knob. At a later time we could tweak this further, for instance, disregarding empty blocks in the count. BTW, this is the reason I didn't chop things off in the lowlevel registry for all threaders: the forward threader can't really explore too deep paths, but it could theoretically get there while threading over empty blocks. This fixes 102814, 102852, and I bet it solves the Linux kernel cross compile issue. Tested on x86-64 Linux. gcc/ChangeLog: PR tree-optimization/102814 * doc/invoke.texi: Document --param=max-fsm-thread-length. * params.opt: Add --param=max-fsm-thread-length. * tree-ssa-threadbackward.c (back_threader_profitability::profitable_path_p): Fail on paths longer than max-fsm-thread-length.
[Bug tree-optimization/102814] [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814 Bug 102814 depends on bug 102852, which changed state. Bug 102852 Summary: [12 Regression] Compile time hog since r12-4421-g0bd68793921ecf3b https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102852 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |DUPLICATE
[Bug tree-optimization/102814] [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814 Aldy Hernandez changed: What|Removed |Added CC||marxin at gcc dot gnu.org --- Comment #4 from Aldy Hernandez --- *** Bug 102852 has been marked as a duplicate of this bug. ***
[Bug tree-optimization/102814] [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814 Aldy Hernandez changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |aldyh at gcc dot gnu.org CC||amacleod at redhat dot com Ever confirmed|0 |1 Last reconfirmed||2021-10-19 Status|UNCONFIRMED |NEW --- Comment #3 from Aldy Hernandez --- (In reply to Richard Biener from comment #2) > (In reply to Aldy Hernandez from comment #1) > > Question to the larger audience... do we support bug reports against > > internal --param constructs? > > Yes. Generally 'max-jump-thread-duplication-stmts' would suggest this is > a parameter limiting code size growth and one that might affect compile-time > in a linear fashion - exponential growth here is unexpected. The reporter > states a 180 -> 181 parameter change trips this over unexpectedly which is > a case worth investigating (it suggests a limit elsewhere is missing). > > For example the alias walking code counts the amount of "work" it does and > has a limit on that, allowing linear growth parametrization. Not sure if > there's sth in the threader and/or ranger that would support accumulating > a work budget and stop after it is exhausted, but something like that would > be very useful (not sure if that's the problem at hand in this case). ISTR Andrew had some patches to stop solving for some combination of extremely large PHIs. Not sure whether this is the issue at hand, but perhaps it's worth looking at. I'll put this in the back burner for now, since the loop threading restrictions make this a non-issue, but I'll come back to it later. Thanks.
[Bug tree-optimization/102814] [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814 --- Comment #2 from Richard Biener --- (In reply to Aldy Hernandez from comment #1) > Question to the larger audience... do we support bug reports against > internal --param constructs? Yes. Generally 'max-jump-thread-duplication-stmts' would suggest this is a parameter limiting code size growth and one that might affect compile-time in a linear fashion - exponential growth here is unexpected. The reporter states a 180 -> 181 parameter change trips this over unexpectedly which is a case worth investigating (it suggests a limit elsewhere is missing). For example the alias walking code counts the amount of "work" it does and has a limit on that, allowing linear growth parametrization. Not sure if there's sth in the threader and/or ranger that would support accumulating a work budget and stop after it is exhausted, but something like that would be very useful (not sure if that's the problem at hand in this case).
[Bug tree-optimization/102814] [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814 --- Comment #1 from Aldy Hernandez --- (In reply to Dmitry G. Dyachenko from comment #0) > r12-4256 FAST > r12- SLOW > > g++ -fpreprocessed -std=c++98 -O2 --param > max-jump-thread-duplication-stmts=NNN -c x.ii Well, you are basically eliding the fail safe we put in specifically for this code explosion: /* Threading through an empty latch would cause code to be added to the latch. This could alter the loop form sufficiently to cause loop optimizations to fail. Disable these threads until after loop optimizations have run. */ if ((threaded_through_latch || (taken_edge && taken_edge->dest == loop->latch)) && !(cfun->curr_properties & PROP_loop_opts_done) && empty_block_p (loop->latch)) { if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, " FAIL: Thread through latch before loop opts would create non-empty latch\n"); return false; The default is 15, and you're pushing it to > 180. Nothing good can come of that. That being said, in this very specific case, thread3 is creating some monster PHIs which then thread4 further explodes. Luckily this combination is disallowed by the pending threading restrictions in the presence of loops here: https://gcc.gnu.org/pipermail/gcc-patches/2021-October/581894.html But really, bad things can happen when you disable the fail-safe mechanisms we've put in place. Question to the larger audience... do we support bug reports against internal --param constructs?
[Bug tree-optimization/102814] [12 regression] quadratique/exponential time complexity for max-jump-thread-duplication-stmts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102814 Andrew Pinski changed: What|Removed |Added Target Milestone|--- |12.0 Keywords||compile-time-hog CC||aldyh at gcc dot gnu.org, ||pinskia at gcc dot gnu.org