Richard,

Jakub has already fixed it.
Sorry for troubles.
Yuri.

2016-07-19 18:29 GMT+03:00 Renlin Li <renlin...@foss.arm.com>:
> Hi Yuri,
>
> I saw this test case runs on arm platforms, and maybe other platforms as
> well.
>
> testsuite/g++.dg/vect/pr70729.cc:7:10: fatal error: xmmintrin.h: No such
> file or directory
>
> Before the change here, it's gated by vect_simd_clones target selector,
> which limit it to i?86/x86_64 platform only.
>
> Regards,
> Renlin Li
>
>
>
>
> On 08/07/16 15:07, Yuri Rumyantsev wrote:
>>
>> Hi Richard,
>>
>> Thanks for your help - your patch looks much better.
>> Here is new patch in which additional argument was added to determine
>> source loop of reference.
>>
>> Bootstrap and regression testing did not show any new failures.
>>
>> Is it OK for trunk?
>> ChangeLog:
>> 2016-07-08  Yuri Rumyantsev  <ysrum...@gmail.com>
>>
>> PR tree-optimization/71734
>> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which
>> contains REF, use it to check safelen, assume that safelen value
>> must be greater 1, fix style.
>> (ref_indep_loop_p_2): Add REF_LOOP argument.
>> (ref_indep_loop_p): Pass LOOP as additional argument to
>> ref_indep_loop_p_2.
>> gcc/testsuite/ChangeLog:
>>          * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style.
>>
>> 2016-07-08 11:18 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>>
>>> On Thu, Jul 7, 2016 at 5:38 PM, Yuri Rumyantsev <ysrum...@gmail.com>
>>> wrote:
>>>>
>>>> I checked simd3.f90 and found out that my additional check reject
>>>> independence of references
>>>>
>>>> REF is independent in loop#3
>>>> .istart0.19, .iend0.20
>>>> which are defined in loop#1 which is outer for loop#3.
>>>> Note that these references are defined by
>>>> _103 = __builtin_GOMP_loop_dynamic_next (&.istart0.19, &.iend0.20);
>>>> which is in loop#1.
>>>> It is clear that both these references can not be independent for
>>>> loop#3.
>>>
>>>
>>> Ok, so we end up calling ref_indep_loop for ref in LOOP also for inner
>>> loops
>>> of LOOP to catch memory references in those as well.  So the issue is
>>> really
>>> that we look at the wrong loop for safelen and we _do_ want to apply
>>> safelen
>>> to inner loops as well.
>>>
>>> So better track the loop we are ultimately asking the question for, like
>>> in the
>>> attached patch (fixes the testcase for me).
>>>
>>> Richard.
>>>
>>>
>>>
>>>> 2016-07-07 17:11 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>>>>
>>>>> On Thu, Jul 7, 2016 at 4:04 PM, Yuri Rumyantsev <ysrum...@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> I Added this check because of new failures in libgomp.fortran suite.
>>>>>> Here is copy of Jakub message:
>>>>>> --- Comment #29 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
>>>>>> The #c27 r237844 change looks bogus to me.
>>>>>> First of all, IMNSHO you can argue this way only if ref is a reference
>>>>>> seen in
>>>>>> loop LOOP,
>>>>>
>>>>>
>>>>> or inner loops of LOOP I guess.  I _think_ we never call
>>>>> ref_indep_loop_p_1 with
>>>>> a REF whose loop is not a sub-loop of LOOP or LOOP itself (as it would
>>>>> not make
>>>>> sense to do that, it would be a waste of time).
>>>>>
>>>>> So only if "or inner loops of LOOP" is not correct the check would be
>>>>> needed
>>>>> but then my issue with unrolling an inner loop and turning a ref that
>>>>> safelen
>>>>> does not apply to into a ref that it now applies to arises.
>>>>>
>>>>> I don't fully get what Jakub is hinting at.
>>>>>
>>>>> Can you install the safelen > 0 -> safelen > 1 fix please?  Jakub, can
>>>>> you
>>>>> explain that bitmap check with a simple testcase?
>>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>>> which is the case of e.g. *.omp_data_i_23(D).a ref in simd3.f90 -O2
>>>>>> -fopenmp -msse2, but not the D.3815[0] case tested during can_sm_ref_p
>>>>>> - the
>>>>>> D.3815[0] = 0; as well as something = D.3815[0]; stmt found in the
>>>>>> outer loop
>>>>>> obviously can be dependent on many of the loads and/or stores in the
>>>>>> loop, be
>>>>>> it "omp simd array" or not.
>>>>>> Say for
>>>>>> void
>>>>>> foo (int *p, int *q)
>>>>>> {
>>>>>>    #pragma omp simd
>>>>>>    for (int i = 0; i < 1024; i++)
>>>>>>      p[i] += q[0];
>>>>>> }
>>>>>> sure, q[0] can't alias p[0] ... p[1022], the earlier iterations could
>>>>>> write
>>>>>> something that changes its value, and then it would behave differently
>>>>>> from
>>>>>> using VF = 1024, where everything is performed in parallel.
>>>>>> Though, actually, it can alias, just it would have to write the same
>>>>>> value as
>>>>>> was there.  So, if this is used to determine if it is safe to hoist
>>>>>> the load
>>>>>> before the loop, it is fine, if it is used to determine if &q[0] >=
>>>>>> &p[0] &&
>>>>>> &q[0] <= &p[1023], then it is not fine.
>>>>>>
>>>>>> For aliasing of q[0] and p[1023], I don't see why they couldn't alias
>>>>>> in a
>>>>>> valid program.  #pragma omp simd I think guarantees that the last
>>>>>> iteration is
>>>>>> executed last, it isn't necessarily executed last alone, it could be,
>>>>>> or
>>>>>> together with one before last iteration, or (for simdlen INT_MAX) even
>>>>>> all
>>>>>> iterations can be done concurrently, in hw or sw, so it is fine if it
>>>>>> is
>>>>>> transformed into:
>>>>>>    int temp[1024], temp2[1024], temp3[1024];
>>>>>>    for (int i = 0; i < 1024; i++)
>>>>>>      temp[i] = p[i];
>>>>>>    for (int i = 0; i < 1024; i++)
>>>>>>      temp2[i] = q[0];
>>>>>>    /* The above two loops can be also swapped, or intermixed.  */
>>>>>>    for (int i = 0; i < 1024; i++)
>>>>>>      temp3[i] = temp[i] + temp2[i];
>>>>>>    for (int i = 0; i < 1024; i++)
>>>>>>      p[i] = temp3[i];
>>>>>>    /* Or the above loop reversed etc. */
>>>>>>
>>>>>> If you have:
>>>>>> int
>>>>>> bar (int *p, int *q)
>>>>>> {
>>>>>>    q[0] = 0;
>>>>>>    #pragma omp simd
>>>>>>    for (int i = 0; i < 1024; i++)
>>>>>>      p[i]++;
>>>>>>    return q[0];
>>>>>> }
>>>>>> i.e. something similar to what misbehaves in simd3.f90 with the
>>>>>> change, then
>>>>>> the answer is that q[0] isn't guaranteed to be independent of any
>>>>>> references in
>>>>>> the simd loop.
>>>>>>
>>>>>> 2016-07-07 16:57 GMT+03:00 Richard Biener
>>>>>> <richard.guent...@gmail.com>:
>>>>>>>
>>>>>>> On Thu, Jul 7, 2016 at 3:24 PM, Yuri Rumyantsev <ysrum...@gmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Richard,
>>>>>>>>
>>>>>>>> Did you meas the following check enhancement:
>>>>>>>>
>>>>>>>>    outer = loop_outer (loop);
>>>>>>>>    /* We consider REF defined in LOOP as independent if at least 2
>>>>>>>> loop
>>>>>>>>       iterations are not dependent.  */
>>>>>>>>    if ((loop->safelen > 1
>>>>>>>>         || (outer && outer->safelen > 1))
>>>>>>>>        && bitmap_bit_p (&memory_accesses.refs_in_loop[loop->num],
>>>>>>>> ref->id))
>>>>>>>>      return true;
>>>>>>>> which allows us to consider reference x[0] as invariant for the test
>>>>>>>> #pragma omp simd
>>>>>>>>    for (i = 0; i< 100; i++)
>>>>>>>>      {
>>>>>>>>        y[i] = i * 2;
>>>>>>>>        for (j = 0; j < 100; j++)
>>>>>>>> z[j] += x[0];
>>>>>>>>      }
>>>>>>>>
>>>>>>>> Moving statement
>>>>>>>> _9 = *x_19(D);
>>>>>>>> (cost 20) out of loop 1.
>>>>>>>
>>>>>>>
>>>>>>> outer loops will be automatically considered by outermost_indep_loop
>>>>>>> which eventually
>>>>>>> calls the function you are patching.
>>>>>>>
>>>>>>> I was questioning the added test
>>>>>>>
>>>>>>>         && bitmap_bit_p (&memory_accesses.refs_in_loop[loop->num],
>>>>>>> ref->id))
>>>>>>>
>>>>>>> and still do not understand why you need that.  It makes us only
>>>>>>> consider the
>>>>>>> loop of ref itself when looking at safelen (but not refs that belong
>>>>>>> to inner loops
>>>>>>> of loop).
>>>>>>>
>>>>>>> Richard.
>>>>>>>
>>>>>>>
>>>>>>>> 2016-07-07 11:46 GMT+03:00 Richard Biener
>>>>>>>> <richard.guent...@gmail.com>:
>>>>>>>>>
>>>>>>>>> On Wed, Jul 6, 2016 at 4:42 PM, Yuri Rumyantsev
>>>>>>>>> <ysrum...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Richard,
>>>>>>>>>>
>>>>>>>>>> I don't understand your question and how it is related to proposed
>>>>>>>>>> patch.
>>>>>>>>>>
>>>>>>>>>> 2016-07-06 16:55 GMT+03:00 Richard Biener
>>>>>>>>>> <richard.guent...@gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jul 6, 2016 at 3:50 PM, Yuri Rumyantsev
>>>>>>>>>>> <ysrum...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> See for example Jakub comments for 70729.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> But how can the check be valid given a
>>>>>>>>>>>
>>>>>>>>>>> #pragma omp simd
>>>>>>>>>>>    for (;;)
>>>>>>>>>>>      {
>>>>>>>>>>>         for (;;)
>>>>>>>>>>>           ...ref in question...
>>>>>>>>>>>      }
>>>>>>>>>>>
>>>>>>>>>>> can be transformed to
>>>>>>>>>>>
>>>>>>>>>>> #pragma omp simd
>>>>>>>>>>>     for (;;)
>>>>>>>>>>>       {
>>>>>>>>>>>          ...ref in question...
>>>>>>>>>>>          ...ref in question..
>>>>>>>>>>>          ...
>>>>>>>>>>>       }
>>>>>>>>>>>
>>>>>>>>>>> by means of unrolling?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The bitmap check would return false for the not unrolled inner loop
>>>>>>>>> and true for the inner unrolled loop.
>>>>>>>>> So it cannot be correct.
>>>>>>>>>
>>>>>>>>> (not sure why this is all off-list btw)
>>>>>>>>>
>>>>>>>>> Richard.
>>>>>>>>>
>>>>>>>>>>> Richard.
>>>>>>>>>>>
>>>>>>>>>>>> 2016-07-06 16:25 GMT+03:00 Richard Biener
>>>>>>>>>>>> <richard.guent...@gmail.com>:
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jul 6, 2016 at 1:43 PM, Yuri Rumyantsev
>>>>>>>>>>>>> <ysrum...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I did not learn LIM detailed but I see on the following
>>>>>>>>>>>>>> example
>>>>>>>>>>>>>> #pragma omp simd
>>>>>>>>>>>>>>    for (i = 0; i< 100; i++)
>>>>>>>>>>>>>>      {
>>>>>>>>>>>>>>        y[i] = i * 2;
>>>>>>>>>>>>>>        for (j = 0; j < 100; j++)
>>>>>>>>>>>>>> z[j] += x[0];
>>>>>>>>>>>>>>      }
>>>>>>>>>>>>>> that only inner,ost loop is handled:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Memory reference in loop#2 1: *_7
>>>>>>>>>>>>>> Memory reference in loop#2 2: *x_19(D)
>>>>>>>>>>>>>> Memory reference in loop#1 3: *_3
>>>>>>>>>>>>>> Basic block 3 (loop 1 -- depth 1):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Basic block 4 (loop 2 -- depth 2):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Loop#2 is innermost
>>>>>>>>>>>>>> Querying dependency of refs 2 and 1: dependent.
>>>>>>>>>>>>>> Querying dependencies of ref 2 in loop 2: dependent
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> well, LIM doesn't bother to check against refs in an outer loop
>>>>>>>>>>>>> if
>>>>>>>>>>>>> refs in the inner loop are already dependent.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Loop unroll of inner loop does not invalidate 'safelen' of
>>>>>>>>>>>>>> outer one
>>>>>>>>>>>>>> accordingly with safelen definition.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> So why do you need the bitmap_bit_p check then?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Richard.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2016-07-06 14:23 GMT+03:00 Richard Biener
>>>>>>>>>>>>>> <richard.guent...@gmail.com>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Jul 6, 2016 at 1:17 PM, Yuri Rumyantsev
>>>>>>>>>>>>>>> <ysrum...@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Richard,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> LIM phase considers only innermost loops and does not handle
>>>>>>>>>>>>>>>> loop
>>>>>>>>>>>>>>>> nests, so in your case the innermost loop (j) does not have
>>>>>>>>>>>>>>>> non-zero
>>>>>>>>>>>>>>>> safelen.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Huh?  LIM considers loop nests just fine.  And yes, the
>>>>>>>>>>>>>>> innermost loop
>>>>>>>>>>>>>>> does not have safelen but shouldn't it?  Consider the inner
>>>>>>>>>>>>>>> loop being unrolled
>>>>>>>>>>>>>>> by GCC so it is removed - should that then invalidate the
>>>>>>>>>>>>>>> outer loop safelen
>>>>>>>>>>>>>>> because innermost loop refs now appear in the outer loop?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Richard.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2016-07-06 13:34 GMT+03:00 Richard Biener
>>>>>>>>>>>>>>>> <richard.guent...@gmail.com>:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Jul 6, 2016 at 11:50 AM, Yuri Rumyantsev
>>>>>>>>>>>>>>>>> <ysrum...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Richard,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I pointed out in the commentary that REF is defined inside
>>>>>>>>>>>>>>>>>> loop and
>>>>>>>>>>>>>>>>>> this check is related to this statement. Should I clarify
>>>>>>>>>>>>>>>>>> it?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> +  /* We consider REF defined in LOOP as independent if at
>>>>>>>>>>>>>>>>>> least 2 loop
>>>>>>>>>>>>>>>>>> +     iterations are not dependent.  */
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Yes, I fail to see why x[0] should not be disambiguated
>>>>>>>>>>>>>>>>> against y[i] in
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> #pragma GCC loop ivdep
>>>>>>>>>>>>>>>>>    for (i...)
>>>>>>>>>>>>>>>>>      {
>>>>>>>>>>>>>>>>>        y[i] = ...;
>>>>>>>>>>>>>>>>>        for (j...)
>>>>>>>>>>>>>>>>>          ... = x[0];
>>>>>>>>>>>>>>>>>      }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> REF is always inside the loop nest with LOOP being the
>>>>>>>>>>>>>>>>> outermost loop.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Richard.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2016-07-06 12:38 GMT+03:00 Richard Biener
>>>>>>>>>>>>>>>>>> <richard.guent...@gmail.com>:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Tue, Jul 5, 2016 at 4:56 PM, Yuri Rumyantsev
>>>>>>>>>>>>>>>>>>> <ysrum...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Here is a simple fix to cure regressions introduced by
>>>>>>>>>>>>>>>>>>>> my fix for
>>>>>>>>>>>>>>>>>>>> 70729. Patch also contains minor changes in test found
>>>>>>>>>>>>>>>>>>>> by Jakub.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Bootstrapping and regression testing did not show any
>>>>>>>>>>>>>>>>>>>> new failures.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Is it OK for trunk?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> +      && bitmap_bit_p
>>>>>>>>>>>>>>>>>>> (&memory_accesses.refs_in_loop[loop->num], ref->id))
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> So safelen does not apply to refs in nested loops?  The
>>>>>>>>>>>>>>>>>>> added comment only
>>>>>>>>>>>>>>>>>>> explains the safelen check change but this also requires
>>>>>>>>>>>>>>>>>>> explanation.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Richard.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> ChangeLog:
>>>>>>>>>>>>>>>>>>>> 2016-07-05  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> PR tree-optimization/71734
>>>>>>>>>>>>>>>>>>>> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Consider REF
>>>>>>>>>>>>>>>>>>>> defined in
>>>>>>>>>>>>>>>>>>>> LOOP as independent if at least two loop iterations are
>>>>>>>>>>>>>>>>>>>> not dependent.
>>>>>>>>>>>>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>>>>>>>>>>>>          * g++.dg/vect/pr70729.cc: Delete redundant dg
>>>>>>>>>>>>>>>>>>>> options, fix style.

Reply via email to