On Wed, Jan 18, 2017 at 11:10 AM, Martin Liška <mli...@suse.cz> wrote:
> Hello.
>
> After basic understanding of loop predictive commoning, the problematic
> combined chain is:
>
> Loads-only chain 0x38b6730 (combined)
>   max distance 0
>   references:
>     MEM[(real(kind=8) *)vectp_a.29_81] (id 1)
>       offset 20
>       distance 0
>     MEM[(real(kind=8) *)vectp_a.38_141] (id 3)
>       offset 20
>       distance 0
>
> Loads-only chain 0x38b68b0 (combined)
>   max distance 0
>   references:
>     MEM[(real(kind=8) *)vectp_a.23_102] (id 0)
>       offset 0
>       distance 0
>     MEM[(real(kind=8) *)vectp_a.33_33] (id 2)
>       offset 0
>       distance 0
>
> Combination chain 0x38b65b0
>   max distance 0, may reuse first
>   equal to 0x38b6730 + 0x38b68b0 in type vector(2) real(kind=8)
>   references:
>     combination ref
>       in statement predreastmp.48_10 = vect__32.31_78 + vect__28.25_100;
>
>       distance 0
>     combination ref
>       in statement predreastmp.50_17 = vect__42.41_138 + vect__38.36_29;
>
>       distance 0
>
> It's important to note that distance is equal to zero (happening within a
> same loop iteration).
> Aforementioned chains correspond to:
>
> ...
> r2:       vect__28.25_100 = MEM[(real(kind=8) *)vectp_a.23_102];
>           vectp_a.23_99 = vectp_a.23_102 + 16;
>           vect__28.26_98 = MEM[(real(kind=8) *)vectp_a.23_99];
>           vect__82.27_97 = vect__22.22_108;
>           vect__82.27_96 = vect__22.22_107;
>           vect__79.28_95 = vect__82.27_97 + vect__84.17_120;
>           vect__79.28_94 = vect__82.27_96 + vect__84.17_119;
> r1:       vect__32.31_78 = MEM[(real(kind=8) *)vectp_a.29_81];
>           vectp_a.29_77 = vectp_a.29_81 + 16;
>           vect__32.32_76 = MEM[(real(kind=8) *)vectp_a.29_77];
>           vect__38.35_39 = MEM[(real(kind=8) *)vectp_a.33_57];
> r2':      vectp_a.33_33 = vectp_a.33_57 + 16;
>           vect__38.36_29 = MEM[(real(kind=8) *)vectp_a.33_33];
>           vect__56.37_23 = vect__38.35_39;
>           vect__56.37_15 = vect__32.32_76;
>           vect__42.40_161 = MEM[(real(kind=8) *)vectp_a.38_163];
>           vectp_a.38_141 = vectp_a.38_163 + 16;
> r1':      vect__42.41_138 = MEM[(real(kind=8) *)vectp_a.38_141];
>           vect__54.42_135 = vect__42.40_161 + vect__56.37_23;
> r1'+r2':  predreastmp.50_17 = vect__42.41_138 + vect__38.36_29;
>           predreastmp.51_18 = vect__56.37_15;
>           vect__54.42_134 = predreastmp.50_17;
> r1+r2:    predreastmp.48_10 = vect__32.31_78 + vect__28.25_100;
> ...
>
> Problematic construct is that while having load-only chains r1->r1' and
> r2->r2', the combination is actually r1'+r2'->r1+r2, which cause the
> troubles. I believe the proper fix is to reject such combinations where
> combined root stmt does not dominate usages. It's probably corner case as
> it does not reuse any values among loop iterations (which is main
> motivation of the pass), it's doing PRE if I'm right.
>
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>
> Ready to be installed?

I'm not sure. If we have such zero-distance refs in the IL at the time
pcom runs, then not handling them will pessimize code-gen for cases where
they are part of a larger chain. In particular, I don't like adding
another stmt_dominates_stmt_p call and would rather not handle
length == 0 at all...

We already seem to go to great lengths in associating stuff when
combining, so isn't this maybe an artifact of this association? Maybe we
simply need to sort the new chain after combining it so the root stmt
comes last?

Note that there seems to be only a single length per chain, but not all
refs in a chain need to have the same distance. Doesn't this mean your fix
is likely incomplete? What prevents the situation from arising for
distance != 0?

Richard.

> Martin
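
For readers following the thread, the ordering problem in the dump can be reproduced with a toy model. This is only a sketch under stated assumptions and is not GCC code: the function name root_precedes_zero_distance_refs is hypothetical, the block is reduced to the two combination statements, and the first ref listed in the combination chain is assumed to act as its root. Within a single basic block, "A dominates B" reduces to "A comes before B", which is what the check below uses. It deliberately looks only at distance-0 refs and does not try to answer the question about distance != 0.

```python
# Hypothetical model, not GCC's API: a basic block is a list of
# statement names in program order.
def root_precedes_zero_distance_refs(block, root, refs):
    """Return True if `root` comes before every distance-0 ref in `block`.

    `refs` maps a statement name to its distance from the root.
    Inside one basic block, dominance reduces to program order.
    """
    position = {stmt: i for i, stmt in enumerate(block)}
    return all(
        position[root] <= position[stmt]
        for stmt, distance in refs.items()
        if distance == 0
    )


# Order of the two combination statements as they appear in the dump.
block = ["predreastmp.50_17", "predreastmp.48_10"]

# Both refs of the combination chain have distance 0; the chain lists
# predreastmp.48_10 first (assumed here to be the root).
refs = {"predreastmp.48_10": 0, "predreastmp.50_17": 0}

ok = root_precedes_zero_distance_refs(block, "predreastmp.48_10", refs)
print(ok)  # False: the assumed root is defined after the other ref
```

With predreastmp.50_17 passed as the root instead, the same check succeeds, which is the ordering that a reordering of the combined chain would have to establish.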