On Wed, Jan 18, 2017 at 11:10:32AM +0100, Martin Liška wrote: > Hello. > > After basic understanding of loop predictive commoning, the problematic > combined chain is: > > Loads-only chain 0x38b6730 (combined) > max distance 0 > references: > MEM[(real(kind=8) *)vectp_a.29_81] (id 1) > offset 20 > distance 0 > MEM[(real(kind=8) *)vectp_a.38_141] (id 3) > offset 20 > distance 0 > > Loads-only chain 0x38b68b0 (combined) > max distance 0 > references: > MEM[(real(kind=8) *)vectp_a.23_102] (id 0) > offset 0 > distance 0 > MEM[(real(kind=8) *)vectp_a.33_33] (id 2) > offset 0 > distance 0 > > Combination chain 0x38b65b0 > max distance 0, may reuse first > equal to 0x38b6730 + 0x38b68b0 in type vector(2) real(kind=8) > references: > combination ref > in statement predreastmp.48_10 = vect__32.31_78 + vect__28.25_100; > > distance 0 > combination ref > in statement predreastmp.50_17 = vect__42.41_138 + vect__38.36_29; > > distance 0 > > It's important to note that distance is equal to zero (happening within a > same loop iteration). > Aforementioned chains correspond to: > > ... > r2: vect__28.25_100 = MEM[(real(kind=8) *)vectp_a.23_102]; > vectp_a.23_99 = vectp_a.23_102 + 16; > vect__28.26_98 = MEM[(real(kind=8) *)vectp_a.23_99]; > vect__82.27_97 = vect__22.22_108; > vect__82.27_96 = vect__22.22_107; > vect__79.28_95 = vect__82.27_97 + vect__84.17_120; > vect__79.28_94 = vect__82.27_96 + vect__84.17_119; > r1: vect__32.31_78 = MEM[(real(kind=8) *)vectp_a.29_81]; > vectp_a.29_77 = vectp_a.29_81 + 16; > vect__32.32_76 = MEM[(real(kind=8) *)vectp_a.29_77]; > vect__38.35_39 = MEM[(real(kind=8) *)vectp_a.33_57]; > r2': vectp_a.33_33 = vectp_a.33_57 + 16; > vect__38.36_29 = MEM[(real(kind=8) *)vectp_a.33_33]; > vect__56.37_23 = vect__38.35_39; > vect__56.37_15 = vect__32.32_76; > vect__42.40_161 = MEM[(real(kind=8) *)vectp_a.38_163]; > vectp_a.38_141 = vectp_a.38_163 + 16; > r1': vect__42.41_138 = MEM[(real(kind=8) *)vectp_a.38_141]; > vect__54.42_135 = vect__42.40_161 + vect__56.37_23; > r1'+r2': predreastmp.50_17 = vect__42.41_138 + vect__38.36_29; > predreastmp.51_18 = vect__56.37_15; > vect__54.42_134 = predreastmp.50_17; > r1+r2: predreastmp.48_10 = vect__32.31_78 + vect__28.25_100; > ... > > Problematic construct is that while having load-only chains r1->r1' and > r2->r2', the combination > is actually r1'+r2'->r1+r2, which cause the troubles. I believe the proper > fix is to reject such > combinations where combined root stmt does not dominate usages. It's probably > corner case as it does > not reuse any values among loop iterations (which is main motivation of the > pass), it's doing PRE > if I'm right. > > Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
I could bootstrap on aarch64-none-linux-gnu without any issues, regression tests are fine and the testcase compiles without ICE. Thanks for fixing this. VP. > > Ready to be installed? > Martin > > From 41b153cf975374fff48419ec8ac5991ac134735f Mon Sep 17 00:00:00 2001 > From: marxin <mli...@suse.cz> > Date: Tue, 17 Jan 2017 14:22:40 +0100 > Subject: [PATCH] Be careful about combined chain with length == 0 (PR > tree-optimization/70754). > > gcc/testsuite/ChangeLog: > > 2017-01-17 Martin Liska <mli...@suse.cz> > > PR tree-optimization/70754 > * gfortran.dg/pr70754.f90: New test. > > gcc/ChangeLog: > > 2017-01-17 Martin Liska <mli...@suse.cz> > > PR tree-optimization/70754 > * tree-predcom.c (combine_chains): Do not create a combined chain > with length equal to zero when root_stmt does not dominate > stmts of references. > --- > gcc/testsuite/gfortran.dg/pr70754.f90 | 35 > +++++++++++++++++++++++++++++++++++ > gcc/tree-predcom.c | 10 ++++++++++ > 2 files changed, 45 insertions(+) > create mode 100644 gcc/testsuite/gfortran.dg/pr70754.f90 > > diff --git a/gcc/testsuite/gfortran.dg/pr70754.f90 > b/gcc/testsuite/gfortran.dg/pr70754.f90 > new file mode 100644 > index 00000000000..758901ce2b2 > --- /dev/null > +++ b/gcc/testsuite/gfortran.dg/pr70754.f90 > @@ -0,0 +1,35 @@ > +! { dg-options "-Ofast" } > + > +module m > + implicit none > + private > + save > + > + integer, parameter, public :: & > + ii4 = selected_int_kind(6), & > + rr8 = selected_real_kind(13) > + > + integer (ii4), dimension(40,40,199), public :: xyz > + public :: foo > +contains > + subroutine foo(a) > + real (rr8), dimension(40,40), intent(out) :: a > + real (rr8), dimension(40,40) :: b > + integer (ii4), dimension(40,40) :: c > + integer i, j > + > + do i=1,8 > + b(i,j) = 123 * a(i,j) + a(i,j+1) & > + + a(i,j) + a(i+1,j+1) & > + + a(i+1,j) + a(i-1,j+1) & > + + a(i-1,j) > + c(i,j) = 123 > + end do > + > + where ((xyz(:,:,2) /= 0) .and. (c /= 0)) > + a = b/real(c) > + elsewhere > + a = 456 > + endwhere > + end subroutine foo > +end module m > diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c > index f0b70a61fb8..768f976cb87 100644 > --- a/gcc/tree-predcom.c > +++ b/gcc/tree-predcom.c > @@ -2323,6 +2323,16 @@ combine_chains (chain_p ch1, chain_p ch2) > root_stmt = get_chain_root (new_chain)->stmt; > for (i = 1; new_chain->refs.iterate (i, &nw); i++) > { > + /* PR tree-optimization/70754 > + For a combined chain with length equal to zero, we have to guarantee > + that ROOT_STMT dominates all references. */ > + if (new_chain->length == 0 > + && !stmt_dominates_stmt_p (root_stmt, nw->stmt)) > + { > + release_chain (new_chain); > + return NULL; > + } > + > if (nw->distance == new_chain->length > && !stmt_dominates_stmt_p (nw->stmt, root_stmt)) > { > -- > 2.11.0 >