[Bug tree-optimization/69580] [6 Regression] From 26 seconds to 10 minutes moving from gcc 5.3.1 to gcc 6.0.0

2016-02-01 Thread law at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69580

Jeffrey A. Law  changed:

   What|Removed |Added

   Priority|P1  |P4

--- Comment #7 from Jeffrey A. Law  ---
Major regression resolved.  There's still some work to do, as outlined in the
BZ and in the post for the patch, so keeping this open as  P4.

[Bug tree-optimization/69580] [6 Regression] From 26 seconds to 10 minutes moving from gcc 5.3.1 to gcc 6.0.0

2016-02-01 Thread law at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69580

--- Comment #6 from Jeffrey A. Law  ---
If you could pass it along to me privately, I can verify if it's the same issue
or not easily (that's the nice things about a PARAM, I can just crank up the
limiter and see what happens).  I also happen to have gcc-4.9 and gcc-5
compilers handy to compare against.

l...@redhat.com

Thanks for taking the time to dig this stuff out and report the issues.  We're
using the FSM bits a lot more now and plan to rely on them more in the future. 
So shaking out these cases is important.

[Bug tree-optimization/69580] [6 Regression] From 26 seconds to 10 minutes moving from gcc 5.3.1 to gcc 6.0.0

2016-02-01 Thread dcb314 at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69580

--- Comment #5 from David Binderman  ---
(In reply to Jeffrey A. Law from comment #4)
> With the limiter, the time should come back down into the reasonable range
> and I'm going drop this to a P4 once that change goes in.  However, I'm
> going to keep this open because I think fsm_find_thread_path is horrible in
> the amount of work it duplicates and needs to be rewritten.  

I have another test case that currently seems to take about 17 minutes
with -O2.

Hopefully, this second test case is just a duplicate, but I'll wait until
your change has gone in to test it.

Good work and I look forward to seeing your change.

[Bug tree-optimization/69580] [6 Regression] From 26 seconds to 10 minutes moving from gcc 5.3.1 to gcc 6.0.0

2016-02-01 Thread law at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69580

--- Comment #4 from Jeffrey A. Law  ---
So for the testcase we've got merge points with huge numbers of predecessors,
which as I mentioned before we dutifully try finding paths through each one.

I instrumented the compiler a bit to see what kind of distribution we see in
this code.  I was interested primarily interested in the number of PHI
arguments, but also the number of threaded paths was recorded.

Not surprisingly, the distribution is heavily skewed towards small numbers of
PHI arguments (ie, 2-5).  The largest PHI in GCC an its shared libraries was on
the order of 256 entries, but it was pretty clear handling cases up to 100
would catch the overwhelming majority of cases.  That also sets the bar at one
order of magnitude lower than what the attached testcase needs to not explode.

In terms of # paths threaded in those big nodes, it tends to be either a bunch
(roughly half) or none.  I didn't investigate the bi-modal nature, though I do
know in the provided testcase the thread paths require memory simplifications
to be able to see the jump thread path.  That means they'll need to be picked
up by the old threader when called from DOM,   DOM threads ~2600 paths, so it
seems to be doing its job.

In general, the number of threaded paths wasn't all that interesting as a
datapoint.

With the limiter, the time should come back down into the reasonable range and
I'm going drop this to a P4 once that change goes in.  However, I'm going to
keep this open because I think fsm_find_thread_path is horrible in the amount
of work it duplicates and needs to be rewritten.  It likely accounts for the
remainder of the compile-time performance issues with this test when compared
to gcc-4.9.

[Bug tree-optimization/69580] [6 Regression] From 26 seconds to 10 minutes moving from gcc 5.3.1 to gcc 6.0.0

2016-02-01 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69580

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P1

[Bug tree-optimization/69580] [6 Regression] From 26 seconds to 10 minutes moving from gcc 5.3.1 to gcc 6.0.0

2016-01-31 Thread law at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69580

Jeffrey A. Law  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2016-01-31
 Ever confirmed|0   |1

--- Comment #3 from Jeffrey A. Law  ---
We've got join points with > 1k incoming edges and the FSM bits try to walk up
the CFG  for each edge.  A limiter seems to help considerably dropping the
compilation time from ~4 minutes to ~19 seconds.

I want to look at it a bit deeper before deciding that's the best way to go. 
At the least I'll want to do some instrumentation to help select a reasonable
limiter.

[Bug tree-optimization/69580] [6 Regression] From 26 seconds to 10 minutes moving from gcc 5.3.1 to gcc 6.0.0

2016-01-31 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69580

--- Comment #2 from Andrew Pinski  ---
With checking turned on aarch64-linux-gnu:
 tree VRP: 804.96 (61%) usr   0.28 ( 9%) sys 805.19 (61%) wall 
 35850 kB (14%) ggc
 dominator optimization  : 286.56 (22%) usr   0.19 ( 6%) sys 286.66 (22%) wall 
 24718 kB (10%) ggc

I don't have a compiler right now without checking but this seems huge even for
checking case.

[Bug tree-optimization/69580] [6 Regression] From 26 seconds to 10 minutes moving from gcc 5.3.1 to gcc 6.0.0

2016-01-31 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69580

Andrew Pinski  changed:

   What|Removed |Added

 CC||law at gcc dot gnu.org,
   ||pinskia at gcc dot gnu.org
  Component|c   |tree-optimization

--- Comment #1 from Andrew Pinski  ---
Looks like fsm_find_thread_path and fsm_find_control_statement_thread_paths is
causing the compile time hog.