Richard, Jakub has already fixed it. Sorry for troubles. Yuri.
2016-07-19 18:29 GMT+03:00 Renlin Li <renlin...@foss.arm.com>: > Hi Yuri, > > I saw this test case runs on arm platforms, and maybe other platforms as > well. > > testsuite/g++.dg/vect/pr70729.cc:7:10: fatal error: xmmintrin.h: No such > file or directory > > Before the change here, it's gated by vect_simd_clones target selector, > which limit it to i?86/x86_64 platform only. > > Regards, > Renlin Li > > > > > On 08/07/16 15:07, Yuri Rumyantsev wrote: >> >> Hi Richard, >> >> Thanks for your help - your patch looks much better. >> Here is new patch in which additional argument was added to determine >> source loop of reference. >> >> Bootstrap and regression testing did not show any new failures. >> >> Is it OK for trunk? >> ChangeLog: >> 2016-07-08 Yuri Rumyantsev <ysrum...@gmail.com> >> >> PR tree-optimization/71734 >> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which >> contains REF, use it to check safelen, assume that safelen value >> must be greater 1, fix style. >> (ref_indep_loop_p_2): Add REF_LOOP argument. >> (ref_indep_loop_p): Pass LOOP as additional argument to >> ref_indep_loop_p_2. >> gcc/testsuite/ChangeLog: >> * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style. >> >> 2016-07-08 11:18 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>> >>> On Thu, Jul 7, 2016 at 5:38 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>> wrote: >>>> >>>> I checked simd3.f90 and found out that my additional check reject >>>> independence of references >>>> >>>> REF is independent in loop#3 >>>> .istart0.19, .iend0.20 >>>> which are defined in loop#1 which is outer for loop#3. >>>> Note that these references are defined by >>>> _103 = __builtin_GOMP_loop_dynamic_next (&.istart0.19, &.iend0.20); >>>> which is in loop#1. >>>> It is clear that both these references can not be independent for >>>> loop#3. >>> >>> >>> Ok, so we end up calling ref_indep_loop for ref in LOOP also for inner >>> loops >>> of LOOP to catch memory references in those as well. So the issue is >>> really >>> that we look at the wrong loop for safelen and we _do_ want to apply >>> safelen >>> to inner loops as well. >>> >>> So better track the loop we are ultimately asking the question for, like >>> in the >>> attached patch (fixes the testcase for me). >>> >>> Richard. >>> >>> >>> >>>> 2016-07-07 17:11 GMT+03:00 Richard Biener <richard.guent...@gmail.com>: >>>>> >>>>> On Thu, Jul 7, 2016 at 4:04 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>> wrote: >>>>>> >>>>>> I Added this check because of new failures in libgomp.fortran suite. >>>>>> Here is copy of Jakub message: >>>>>> --- Comment #29 from Jakub Jelinek <jakub at gcc dot gnu.org> --- >>>>>> The #c27 r237844 change looks bogus to me. >>>>>> First of all, IMNSHO you can argue this way only if ref is a reference >>>>>> seen in >>>>>> loop LOOP, >>>>> >>>>> >>>>> or inner loops of LOOP I guess. I _think_ we never call >>>>> ref_indep_loop_p_1 with >>>>> a REF whose loop is not a sub-loop of LOOP or LOOP itself (as it would >>>>> not make >>>>> sense to do that, it would be a waste of time). >>>>> >>>>> So only if "or inner loops of LOOP" is not correct the check would be >>>>> needed >>>>> but then my issue with unrolling an inner loop and turning a ref that >>>>> safelen >>>>> does not apply to into a ref that it now applies to arises. >>>>> >>>>> I don't fully get what Jakub is hinting at. >>>>> >>>>> Can you install the safelen > 0 -> safelen > 1 fix please? Jakub, can >>>>> you >>>>> explain that bitmap check with a simple testcase? >>>>> >>>>> Thanks, >>>>> Richard. >>>>> >>>>>> which is the case of e.g. *.omp_data_i_23(D).a ref in simd3.f90 -O2 >>>>>> -fopenmp -msse2, but not the D.3815[0] case tested during can_sm_ref_p >>>>>> - the >>>>>> D.3815[0] = 0; as well as something = D.3815[0]; stmt found in the >>>>>> outer loop >>>>>> obviously can be dependent on many of the loads and/or stores in the >>>>>> loop, be >>>>>> it "omp simd array" or not. >>>>>> Say for >>>>>> void >>>>>> foo (int *p, int *q) >>>>>> { >>>>>> #pragma omp simd >>>>>> for (int i = 0; i < 1024; i++) >>>>>> p[i] += q[0]; >>>>>> } >>>>>> sure, q[0] can't alias p[0] ... p[1022], the earlier iterations could >>>>>> write >>>>>> something that changes its value, and then it would behave differently >>>>>> from >>>>>> using VF = 1024, where everything is performed in parallel. >>>>>> Though, actually, it can alias, just it would have to write the same >>>>>> value as >>>>>> was there. So, if this is used to determine if it is safe to hoist >>>>>> the load >>>>>> before the loop, it is fine, if it is used to determine if &q[0] >= >>>>>> &p[0] && >>>>>> &q[0] <= &p[1023], then it is not fine. >>>>>> >>>>>> For aliasing of q[0] and p[1023], I don't see why they couldn't alias >>>>>> in a >>>>>> valid program. #pragma omp simd I think guarantees that the last >>>>>> iteration is >>>>>> executed last, it isn't necessarily executed last alone, it could be, >>>>>> or >>>>>> together with one before last iteration, or (for simdlen INT_MAX) even >>>>>> all >>>>>> iterations can be done concurrently, in hw or sw, so it is fine if it >>>>>> is >>>>>> transformed into: >>>>>> int temp[1024], temp2[1024], temp3[1024]; >>>>>> for (int i = 0; i < 1024; i++) >>>>>> temp[i] = p[i]; >>>>>> for (int i = 0; i < 1024; i++) >>>>>> temp2[i] = q[0]; >>>>>> /* The above two loops can be also swapped, or intermixed. */ >>>>>> for (int i = 0; i < 1024; i++) >>>>>> temp3[i] = temp[i] + temp2[i]; >>>>>> for (int i = 0; i < 1024; i++) >>>>>> p[i] = temp3[i]; >>>>>> /* Or the above loop reversed etc. */ >>>>>> >>>>>> If you have: >>>>>> int >>>>>> bar (int *p, int *q) >>>>>> { >>>>>> q[0] = 0; >>>>>> #pragma omp simd >>>>>> for (int i = 0; i < 1024; i++) >>>>>> p[i]++; >>>>>> return q[0]; >>>>>> } >>>>>> i.e. something similar to what misbehaves in simd3.f90 with the >>>>>> change, then >>>>>> the answer is that q[0] isn't guaranteed to be independent of any >>>>>> references in >>>>>> the simd loop. >>>>>> >>>>>> 2016-07-07 16:57 GMT+03:00 Richard Biener >>>>>> <richard.guent...@gmail.com>: >>>>>>> >>>>>>> On Thu, Jul 7, 2016 at 3:24 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>> wrote: >>>>>>>> >>>>>>>> Richard, >>>>>>>> >>>>>>>> Did you meas the following check enhancement: >>>>>>>> >>>>>>>> outer = loop_outer (loop); >>>>>>>> /* We consider REF defined in LOOP as independent if at least 2 >>>>>>>> loop >>>>>>>> iterations are not dependent. */ >>>>>>>> if ((loop->safelen > 1 >>>>>>>> || (outer && outer->safelen > 1)) >>>>>>>> && bitmap_bit_p (&memory_accesses.refs_in_loop[loop->num], >>>>>>>> ref->id)) >>>>>>>> return true; >>>>>>>> which allows us to consider reference x[0] as invariant for the test >>>>>>>> #pragma omp simd >>>>>>>> for (i = 0; i< 100; i++) >>>>>>>> { >>>>>>>> y[i] = i * 2; >>>>>>>> for (j = 0; j < 100; j++) >>>>>>>> z[j] += x[0]; >>>>>>>> } >>>>>>>> >>>>>>>> Moving statement >>>>>>>> _9 = *x_19(D); >>>>>>>> (cost 20) out of loop 1. >>>>>>> >>>>>>> >>>>>>> outer loops will be automatically considered by outermost_indep_loop >>>>>>> which eventually >>>>>>> calls the function you are patching. >>>>>>> >>>>>>> I was questioning the added test >>>>>>> >>>>>>> && bitmap_bit_p (&memory_accesses.refs_in_loop[loop->num], >>>>>>> ref->id)) >>>>>>> >>>>>>> and still do not understand why you need that. It makes us only >>>>>>> consider the >>>>>>> loop of ref itself when looking at safelen (but not refs that belong >>>>>>> to inner loops >>>>>>> of loop). >>>>>>> >>>>>>> Richard. >>>>>>> >>>>>>> >>>>>>>> 2016-07-07 11:46 GMT+03:00 Richard Biener >>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>> >>>>>>>>> On Wed, Jul 6, 2016 at 4:42 PM, Yuri Rumyantsev >>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>> >>>>>>>>>> Richard, >>>>>>>>>> >>>>>>>>>> I don't understand your question and how it is related to proposed >>>>>>>>>> patch. >>>>>>>>>> >>>>>>>>>> 2016-07-06 16:55 GMT+03:00 Richard Biener >>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>> >>>>>>>>>>> On Wed, Jul 6, 2016 at 3:50 PM, Yuri Rumyantsev >>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>> See for example Jakub comments for 70729. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> But how can the check be valid given a >>>>>>>>>>> >>>>>>>>>>> #pragma omp simd >>>>>>>>>>> for (;;) >>>>>>>>>>> { >>>>>>>>>>> for (;;) >>>>>>>>>>> ...ref in question... >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> can be transformed to >>>>>>>>>>> >>>>>>>>>>> #pragma omp simd >>>>>>>>>>> for (;;) >>>>>>>>>>> { >>>>>>>>>>> ...ref in question... >>>>>>>>>>> ...ref in question.. >>>>>>>>>>> ... >>>>>>>>>>> } >>>>>>>>>>> >>>>>>>>>>> by means of unrolling? >>>>>>>>> >>>>>>>>> >>>>>>>>> The bitmap check would return false for the not unrolled inner loop >>>>>>>>> and true for the inner unrolled loop. >>>>>>>>> So it cannot be correct. >>>>>>>>> >>>>>>>>> (not sure why this is all off-list btw) >>>>>>>>> >>>>>>>>> Richard. >>>>>>>>> >>>>>>>>>>> Richard. >>>>>>>>>>> >>>>>>>>>>>> 2016-07-06 16:25 GMT+03:00 Richard Biener >>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>> >>>>>>>>>>>>> On Wed, Jul 6, 2016 at 1:43 PM, Yuri Rumyantsev >>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> I did not learn LIM detailed but I see on the following >>>>>>>>>>>>>> example >>>>>>>>>>>>>> #pragma omp simd >>>>>>>>>>>>>> for (i = 0; i< 100; i++) >>>>>>>>>>>>>> { >>>>>>>>>>>>>> y[i] = i * 2; >>>>>>>>>>>>>> for (j = 0; j < 100; j++) >>>>>>>>>>>>>> z[j] += x[0]; >>>>>>>>>>>>>> } >>>>>>>>>>>>>> that only inner,ost loop is handled: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Memory reference in loop#2 1: *_7 >>>>>>>>>>>>>> Memory reference in loop#2 2: *x_19(D) >>>>>>>>>>>>>> Memory reference in loop#1 3: *_3 >>>>>>>>>>>>>> Basic block 3 (loop 1 -- depth 1): >>>>>>>>>>>>>> >>>>>>>>>>>>>> Basic block 4 (loop 2 -- depth 2): >>>>>>>>>>>>>> >>>>>>>>>>>>>> Loop#2 is innermost >>>>>>>>>>>>>> Querying dependency of refs 2 and 1: dependent. >>>>>>>>>>>>>> Querying dependencies of ref 2 in loop 2: dependent >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> well, LIM doesn't bother to check against refs in an outer loop >>>>>>>>>>>>> if >>>>>>>>>>>>> refs in the inner loop are already dependent. >>>>>>>>>>>>> >>>>>>>>>>>>>> Loop unroll of inner loop does not invalidate 'safelen' of >>>>>>>>>>>>>> outer one >>>>>>>>>>>>>> accordingly with safelen definition. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> So why do you need the bitmap_bit_p check then? >>>>>>>>>>>>> >>>>>>>>>>>>> Richard. >>>>>>>>>>>>> >>>>>>>>>>>>>> 2016-07-06 14:23 GMT+03:00 Richard Biener >>>>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Wed, Jul 6, 2016 at 1:17 PM, Yuri Rumyantsev >>>>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Richard, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> LIM phase considers only innermost loops and does not handle >>>>>>>>>>>>>>>> loop >>>>>>>>>>>>>>>> nests, so in your case the innermost loop (j) does not have >>>>>>>>>>>>>>>> non-zero >>>>>>>>>>>>>>>> safelen. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Huh? LIM considers loop nests just fine. And yes, the >>>>>>>>>>>>>>> innermost loop >>>>>>>>>>>>>>> does not have safelen but shouldn't it? Consider the inner >>>>>>>>>>>>>>> loop being unrolled >>>>>>>>>>>>>>> by GCC so it is removed - should that then invalidate the >>>>>>>>>>>>>>> outer loop safelen >>>>>>>>>>>>>>> because innermost loop refs now appear in the outer loop? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Richard. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2016-07-06 13:34 GMT+03:00 Richard Biener >>>>>>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Wed, Jul 6, 2016 at 11:50 AM, Yuri Rumyantsev >>>>>>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Richard, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I pointed out in the commentary that REF is defined inside >>>>>>>>>>>>>>>>>> loop and >>>>>>>>>>>>>>>>>> this check is related to this statement. Should I clarify >>>>>>>>>>>>>>>>>> it? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> + /* We consider REF defined in LOOP as independent if at >>>>>>>>>>>>>>>>>> least 2 loop >>>>>>>>>>>>>>>>>> + iterations are not dependent. */ >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Yes, I fail to see why x[0] should not be disambiguated >>>>>>>>>>>>>>>>> against y[i] in >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> #pragma GCC loop ivdep >>>>>>>>>>>>>>>>> for (i...) >>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>> y[i] = ...; >>>>>>>>>>>>>>>>> for (j...) >>>>>>>>>>>>>>>>> ... = x[0]; >>>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> REF is always inside the loop nest with LOOP being the >>>>>>>>>>>>>>>>> outermost loop. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Richard. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 2016-07-06 12:38 GMT+03:00 Richard Biener >>>>>>>>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Tue, Jul 5, 2016 at 4:56 PM, Yuri Rumyantsev >>>>>>>>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Hi All, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Here is a simple fix to cure regressions introduced by >>>>>>>>>>>>>>>>>>>> my fix for >>>>>>>>>>>>>>>>>>>> 70729. Patch also contains minor changes in test found >>>>>>>>>>>>>>>>>>>> by Jakub. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Bootstrapping and regression testing did not show any >>>>>>>>>>>>>>>>>>>> new failures. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Is it OK for trunk? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> + && bitmap_bit_p >>>>>>>>>>>>>>>>>>> (&memory_accesses.refs_in_loop[loop->num], ref->id)) >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> So safelen does not apply to refs in nested loops? The >>>>>>>>>>>>>>>>>>> added comment only >>>>>>>>>>>>>>>>>>> explains the safelen check change but this also requires >>>>>>>>>>>>>>>>>>> explanation. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Richard. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> ChangeLog: >>>>>>>>>>>>>>>>>>>> 2016-07-05 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> PR tree-optimization/71734 >>>>>>>>>>>>>>>>>>>> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Consider REF >>>>>>>>>>>>>>>>>>>> defined in >>>>>>>>>>>>>>>>>>>> LOOP as independent if at least two loop iterations are >>>>>>>>>>>>>>>>>>>> not dependent. >>>>>>>>>>>>>>>>>>>> gcc/testsuite/ChangeLog: >>>>>>>>>>>>>>>>>>>> * g++.dg/vect/pr70729.cc: Delete redundant dg >>>>>>>>>>>>>>>>>>>> options, fix style.