[Bug tree-optimization/68198] [6 Regression]Excessive code size, compile time and memory usage bloat due to FSM threading in 453.povray
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68198 --- Comment #11 from ktkachov at gcc dot gnu.org --- (In reply to Jeffrey A. Law from comment #10) > Should be fixed on the trunk now. I went from some absurd time down to less > than a second for the critical code in 453.povray. Yep, I can confirm that as well. Thanks!
[Bug tree-optimization/68198] [6 Regression]Excessive code size, compile time and memory usage bloat due to FSM threading in 453.povray
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68198 Jeffrey A. Law changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #10 from Jeffrey A. Law --- Should be fixed on the trunk now. I went from some absurd time down to less than a second for the critical code in 453.povray.
[Bug tree-optimization/68198] [6 Regression]Excessive code size, compile time and memory usage bloat due to FSM threading in 453.povray
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68198 --- Comment #9 from Jeffrey A. Law --- Author: law Date: Thu Nov 19 00:33:27 2015 New Revision: 230586 URL: https://gcc.gnu.org/viewcvs?rev=230586=gcc=rev Log: [PATCH][PR tree-optimization/68198] Avoid CFG explosion due to threading PR tree-optimization/68198 * tree-ssa-threadupdate.c (valid_jump_thread_path): Distinguish between threading a multi-way branch and a thread path that contains a multi-way branch. Disallow the case where a path contains a multi-way branch and does not thread a multi-way branch. (thread_through_all_blocks): Update comment. PR tree-optimization/68198 * gcc.dg/tree-ssa/pr66752-3.c: Update expected output for VRP1. * gcc.dg/tree-ssa/pr68198.c: New test. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/pr68198.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c trunk/gcc/tree-ssa-threadupdate.c
[Bug tree-optimization/68198] [6 Regression]Excessive code size, compile time and memory usage bloat due to FSM threading in 453.povray
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68198 --- Comment #8 from Jeffrey A. Law --- Creating a forwarder outside the path doesn't help since you still have to have an edge to the forwarder from each copy of the block with the SWITCH_EXPR. The solution is to realize that a path containing a SWITCH_EXPR that does not have a compile-time determinable destination probably isn't worth optimizing to start with! Essentially we're duplicating a block with 1k outgoing edges to eliminate a single conditional later in the path. From a cost/benefit analysis that's just silly. As I mentioned, this kind of situation was possible with the old threader too, it was just too dumb to discover the path.
[Bug tree-optimization/68198] [6 Regression]Excessive code size, compile time and memory usage bloat due to FSM threading in 453.povray
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68198 --- Comment #7 from Richard Biener --- (In reply to Jeffrey A. Law from comment #6) > Fixing ought to be fairly easy... Create a forwarder block outside of the path?
[Bug tree-optimization/68198] [6 Regression]Excessive code size, compile time and memory usage bloat due to FSM threading in 453.povray
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68198 --- Comment #6 from Jeffrey A. Law --- I see what's happening here (and boy it's much better to be debugging with real screens and a good night's sleep). So imagine what happens when you build a thread path and on that path you've got a block with a large switch statement. Say on the order of 1k unique destinations. Remember that we have to duplicate blocks on the thread path. For each block on the path we'll redirect *one* edge to the next block on the path, the remaining edges are copies from the original block and thus reach targets outside the thread path. Yup, given a thread path containing block with ~1k successors, we'll end up creating ~1k new edges for the duplicated block. Now imagine we do that a few hundred times (because the join block at the start of the thread path has hundreds of predecessors). Suddenly we're talking about tens of thousands of edges. That's what's happening here. In theory the old threader has the same problem, but it's just not strong enough to actually find enough paths to show the edge explosion. Fixing ought to be fairly easy...
[Bug tree-optimization/68198] [6 Regression]Excessive code size, compile time and memory usage bloat due to FSM threading in 453.povray
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68198 --- Comment #5 from Jeffrey A. Law --- I've reproduced this issue. We've got a reasonably sized switch statement that the FSM threader is able to optimize. The problem is we go from something 2k edges and 1k blocks to 70k edges and 35k blocks. I'll have to put in some instrumentation, but early indications are most of the paths are redundant. ie, many of the incoming blocks thread to the same outgoing block. The FSM threader creates a distinct copy for each of those cases where the old threader knew how to factor them into a single path. I probably won't get much further than that while I'm on the road.
[Bug tree-optimization/68198] [6 Regression]Excessive code size, compile time and memory usage bloat due to FSM threading in 453.povray
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68198 Ramana Radhakrishnan changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2015-11-03 CC||ramana at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #2 from Ramana Radhakrishnan --- Confirmed - I'm also seeing this today with a patch that I was testing.
[Bug tree-optimization/68198] [6 Regression]Excessive code size, compile time and memory usage bloat due to FSM threading in 453.povray
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68198 Markus Trippelsdorf changed: What|Removed |Added CC||trippels at gcc dot gnu.org --- Comment #3 from Markus Trippelsdorf --- aarch64 isn't the only architecture affected by this big code size increase. ppc64 and x86 are also affected. See "Individual Sizes" graphs on: https://vmakarov.fedorapeople.org/spec/
[Bug tree-optimization/68198] [6 Regression]Excessive code size, compile time and memory usage bloat due to FSM threading in 453.povray
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68198 --- Comment #1 from Jeffrey A. Law --- I'm on the road the rest of this will, but will definitely take a look at this ASAP.
[Bug tree-optimization/68198] [6 Regression]Excessive code size, compile time and memory usage bloat due to FSM threading in 453.povray
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68198 Ramana Radhakrishnan changed: What|Removed |Added Target|arm, aarch64|arm, aarch64,x86_64, ppc64 --- Comment #4 from Ramana Radhakrishnan --- (In reply to Markus Trippelsdorf from comment #3) > aarch64 isn't the only architecture affected by this big code size increase. > ppc64 and x86 are also affected. See "Individual Sizes" graphs on: > https://vmakarov.fedorapeople.org/spec/ Update the target field then
[Bug tree-optimization/68198] [6 Regression]Excessive code size, compile time and memory usage bloat due to FSM threading in 453.povray
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68198 ktkachov at gcc dot gnu.org changed: What|Removed |Added Target Milestone|--- |6.0 Summary|Excessive code size,|[6 Regression]Excessive |compile time and memory |code size, compile time and |usage bloat due to FSM |memory usage bloat due to |threading in 453.povray |FSM threading in 453.povray Known to fail||6.0