This removes another source of quadratic behavior from SCOP detection,
the gather_bbs domwalk. This is done by enhancing domwalk to handle
SEME regions via a special return value from before_dom_children.
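
To illustrate the idea of stopping a dominator walk at the region boundary, here is a minimal, self-contained sketch. It is not the domwalk code from the patch: DomTree, WalkAction, walk_dom_tree and gather_region_blocks are hypothetical names, and the dominator tree is just a vector of child lists.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a dominator tree: children[b] lists the
// blocks immediately dominated by block b.
struct DomTree {
  std::vector<std::vector<int>> children;
};

// Hypothetical sentinel, mirroring the idea of a STOP return value.
enum class WalkAction { CONTINUE, STOP };

// Pre-order walk of the dominator tree starting at ROOT.  VISIT is
// called for each block reached; if it returns STOP, the blocks
// dominated by that block are not walked.  Returns the number of
// blocks VISIT was called on.
template <typename Visit>
int walk_dom_tree(const DomTree &tree, int root, Visit visit) {
  assert(root >= 0 && root < static_cast<int>(tree.children.size()));
  int visited = 0;
  std::vector<int> worklist{root};
  while (!worklist.empty()) {
    int bb = worklist.back();
    worklist.pop_back();
    ++visited;
    if (visit(bb) == WalkAction::STOP)
      continue;  // prune the whole dominated subtree
    for (int child : tree.children[bb])
      worklist.push_back(child);
  }
  return visited;
}

// Gather the blocks of a region (given as membership flags) by walking
// from the region entry and stopping at blocks outside the region.
std::vector<int> gather_region_blocks(const DomTree &tree, int entry,
                                      const std::vector<bool> &in_region) {
  std::vector<int> gathered;
  walk_dom_tree(tree, entry, [&](int bb) {
    if (!in_region[bb])
      return WalkAction::STOP;
    gathered.push_back(bb);
    return WalkAction::CONTINUE;
  });
  return gathered;
}
```

With such a return value the walk touches only the region's blocks plus the first block outside it on each path, instead of everything dominated by the region entry.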
With this I'm now confident enough to remove the
PARAM_GRAPHITE_MAX_BBS_PER_FUNCTION parameter and its associated limit.
While there I've adjusted PARAM_GRAPHITE_MAX_NB_SCOP_PARAMS to its
documented default value, which enables 90 more loops to be processed
in SPEC CPU 2006. I've also made a value of zero disable
the limit (a trick commonly used in GCC).
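
The "zero disables the limit" convention can be sketched as follows. This is an illustration only: scop_exceeds_param_limit is a hypothetical helper, and the exact comparison used in the patch (strictly greater than the limit) is an assumption here.

```cpp
#include <cassert>

// Hypothetical helper: returns true if a SCoP with NPARAMS parameters
// should be rejected under the limit MAX_PARAMS.  A limit of zero is
// treated as "no limit", so nothing is ever rejected in that case.
bool scop_exceeds_param_limit(unsigned nparams, unsigned max_params) {
  return max_params != 0 && nparams > max_params;
}

// Small self-check using parameter counts quoted in this mail.
void check_param_limit_examples() {
  assert(!scop_exceeds_param_limit(8, 10));   // within the default of 10
  assert(scop_exceeds_param_limit(34, 10));   // the maximum seen, rejected
  assert(!scop_exceeds_param_limit(34, 0));   // zero lifts the bound
}
```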
Statistics I gathered a few patches ago for SPEC CPU 2006:
1255 multi-loop SESEs in SCOP processing
max. params 34, 3 scops >= 20, 15 scops >= 10, 33 scops >= 8
max. drs per scop 869, 10 scops >= 100
max. pbbs per scop 36, 12 scops >= 10
919 SCOPs fail in build_alias_sets
which shows the default for PARAM_GRAPHITE_MAX_ARRAYS_PER_SCOP
is reasonable (if tuned to SPEC CPU 2006).
I've also included the hunk that allows -fgraphite-identity
to work on top of -floop-nest-optimize and that, for -floop-nest-optimize
-floop-parallelize-all, also makes sure to code-gen loops that
end up not transformed.
Bootstrapped and tested on x86_64-unknown-linux-gnu, SPEC CPU 2006
tested, applied to trunk.
Richard.
2017-09-27 Richard Biener
* doc/invoke.texi (graphite-max-bbs-per-function): Remove.
(graphite-max-nb-scop-params): Document special value zero.
* domwalk.h (dom_walker::STOP): New symbolical constant.
(dom_walker::dom_walker): Add optional parameter for bb to
RPO mapping.
(dom_walker::~dom_walker): Declare.
(dom_walker::before_dom_children): Document STOP return value.
(dom_walker::m_user_bb_to_rpo): New member.
(dom_walker::m_bb_to_rpo): Likewise.
* domwalk.c (dom_walker::dom_walker): Compute bb to RPO
mapping here if not provided by the user.
(dom_walker::~dom_walker): Free bb to RPO mapping if not
provided by the user.
(dom_walker::STOP): Define.
(dom_walker::walk): Do not compute bb to RPO mapping here.
Support STOP return value from before_dom_children to stop
walking.
* graphite-optimize-isl.c (optimize_isl): If the schedule
is the same still generate code if -fgraphite-identity
or -floop-parallelize-all are given.
* graphite-scop-detection.c: Include cfganal.h.
(gather_bbs::gather_bbs): Get and pass through bb to RPO
mapping.
(gather_bbs::before_dom_children): Return STOP for BBs
not in the region.
(build_scops): Compute bb to RPO mapping and pass it to
the domwalk. Treat --param graphite-max-nb-scop-params=0
as not limiting the number of params.
* graphite.c (graphite_initialize): Remove limit on the
number of basic-blocks in a function.
* params.def (PARAM_GRAPHITE_MAX_BBS_PER_FUNCTION): Remove.
(PARAM_GRAPHITE_MAX_NB_SCOP_PARAMS): Adjust to documented
default value of 10.
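
For readers unfamiliar with the bb to RPO mapping mentioned in the ChangeLog above, the following is a rough sketch of what such a mapping is and how it can be computed once and then reused by several walks. It does not use GCC's CFG types: the CFG is a plain successor list, and compute_rpo, build_bb_to_rpo and sort_by_rpo are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Compute a reverse post-order of the blocks reachable from ENTRY
// using an iterative depth-first search over the successor lists.
std::vector<int> compute_rpo(const std::vector<std::vector<int>> &succs,
                             int entry) {
  assert(entry >= 0 && static_cast<std::size_t>(entry) < succs.size());
  std::vector<int> postorder;
  std::vector<bool> seen(succs.size(), false);
  // Each stack entry is (block, index of the next successor to try).
  std::vector<std::pair<int, std::size_t>> stack;
  seen[entry] = true;
  stack.push_back({entry, 0});
  while (!stack.empty()) {
    int bb = stack.back().first;
    std::size_t idx = stack.back().second;
    if (idx < succs[bb].size()) {
      stack.back().second = idx + 1;
      int next = succs[bb][idx];
      if (!seen[next]) {
        seen[next] = true;
        stack.push_back({next, 0});
      }
    } else {
      postorder.push_back(bb);  // all successors done
      stack.pop_back();
    }
  }
  return std::vector<int>(postorder.rbegin(), postorder.rend());
}

// Invert the order: result[bb] is the position of bb in RPO, or -1
// for blocks that are not reachable from the entry.
std::vector<int> build_bb_to_rpo(const std::vector<int> &rpo,
                                 std::size_t nblocks) {
  std::vector<int> bb_to_rpo(nblocks, -1);
  for (std::size_t i = 0; i < rpo.size(); ++i)
    bb_to_rpo[rpo[i]] = static_cast<int>(i);
  return bb_to_rpo;
}

// Order a set of blocks (e.g. dominator children) by their RPO number.
void sort_by_rpo(std::vector<int> &bbs, const std::vector<int> &bb_to_rpo) {
  std::sort(bbs.begin(), bbs.end(),
            [&](int a, int b) { return bb_to_rpo[a] < bb_to_rpo[b]; });
}
```

Building the mapping costs time proportional to the size of the whole function, so a caller that runs one walk per region would presumably want to build it once and hand the same array to every walk, which is what the optional constructor parameter makes possible.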
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 253224)
+++ gcc/doc/invoke.texi (working copy)
@@ -10512,13 +10512,9 @@ sequence pairs. This option only applie
@item graphite-max-nb-scop-params
To avoid exponential effects in the Graphite loop transforms, the
number of parameters in a Static Control Part (SCoP) is bounded. The
-default value is 10 parameters. A variable whose value is unknown at
-compilation time and defined outside a SCoP is a parameter of the SCoP.
-
-@item graphite-max-bbs-per-function
-To avoid exponential effects in the detection of SCoPs, the size of
-the functions analyzed by Graphite is bounded. The default value is
-100 basic blocks.
+default value is 10 parameters, a value of zero can be used to lift
+the bound. A variable whose value is unknown at compilation time and
+defined outside a SCoP is a parameter of the SCoP.
@item loop-block-tile-size
Loop blocking or strip mining transforms, enabled with
Index: gcc/domwalk.c
===
--- gcc/domwalk.c (revision 253224)
+++ gcc/domwalk.c (working copy)
@@ -174,13 +174,29 @@ sort_bbs_postorder (basic_block *bbs, in
If SKIP_UNREACHBLE_BLOCKS is true, then we need to set
EDGE_EXECUTABLE on every edge in the CFG. */
dom_walker::dom_walker (cdi_direction direction,
- bool skip_unreachable_blocks)
+ bool skip_unreachable_blocks,
+ int *bb_index_to_rpo)
: m_dom_direction (direction),
m_skip_unreachable_blocks (skip_unreachable_blocks),
-m_unreachable_dom (NULL)
+m_user_bb_to_rpo (bb_index_to_rpo != NULL),
+m_unreachable_dom (NULL),
+m_bb_to_rpo (bb_index_to_rpo)
{
+ /* Compute the basic-block index to RPO mapping if not provided by
+ the user. */
+ if (! m_bb_to_rpo && direction