On Tue, Aug 9, 2016 at 2:05 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
> Richard,
>
> I checked that this move helps.
> Does it mean that I've got approval to integrate it to trunk?

Yes, if it survives bootstrap & regtest.

Richard.

> 2016-08-09 14:33 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>> On Tue, Aug 9, 2016 at 1:26 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>>> Richard,
>>>
>>> The patch proposed by you does not work properly for
>>> g++.dg/vect/pr70729-nest.cc test since the reference for this->S_n has
>>> been cached as dependent for outer loop and loop is not vectorized:
>>>
>>>  g++ -Ofast -fopenmp -mavx2 pr70729-nest.cc -c
>>> -fdump-tree-vect-details
>>> grep 'LOOP VECTORIZED' pr70729-nest.cc.149t.vect
>>> <not found>
>>>
>>> You missed additional check I added before check on cached dependence.
>>
>> Ok, but it should get the correctness right?
>>
>> I suppose that if you move the cache checks inside the else clause it
>> would work?
>> I'd be ok with that change.
>>
>> Richard.
>>
>>> 2016-08-09 13:00 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>>> On Tue, Aug 9, 2016 at 11:20 AM, Yuri Rumyantsev <ysrum...@gmail.com> 
>>>> wrote:
>>>>> Yes it is impossible since all basic blocks are handled from outer
>>>>> loops to innermost so we can not have the sequence with wrong
>>>>> dependence, i.e. we cached that reference is independent (due to
>>>>> safelen) but the same reference in outer loop must be evaluated as
>>>>> dependent. So we must re-evaluate only dependent references in loops
>>>>> having non-zero safelen attribute.
>>>>
>>>> Hmm.  I don't like depending on this implementation detail.  Does the
>>>> attached patch work
>>>> which simply avoids any positive/negative caching on safelen affected
>>>> refs?  It also makes
>>>> the query cheaper by avoiding the dive into child loops.
>>>>
>>>> Richard.
>>>>
>>>>> 2016-08-09 11:44 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>>>>> On Fri, Aug 5, 2016 at 4:57 PM, Yuri Rumyantsev <ysrum...@gmail.com> 
>>>>>> wrote:
>>>>>>> Richard,
>>>>>>>
>>>>>>> I added additional check before caching dependencies since (1) all
>>>>>>> statements in loop are handled in loop postorder order, i.e. form
>>>>>>> outer to inner; (2) we can change dependency for reference in subloops
>>>>>>> which have non-zero safelen attribute. So I propose to re-evaluate it
>>>>>>> in such cases. I don't see why we need to avoid dependence caching for
>>>>>>> all loop nests since pragma omp simd is used very rarely.
>>>>>>
>>>>>> You think it is impossible to construct a testcase which hits the
>>>>>> correctness issue?
>>>>>> "very rarely" is not a good argument to generate wrong code.
>>>>>>
>>>>>> Richard.
>>>>>>
>>>>>>> 2016-08-05 16:50 GMT+03:00 Richard Biener <richard.guent...@gmail.com>:
>>>>>>>> On Fri, Aug 5, 2016 at 3:28 PM, Yuri Rumyantsev <ysrum...@gmail.com> 
>>>>>>>> wrote:
>>>>>>>>> Richard,
>>>>>>>>>
>>>>>>>>> Here is updated patch which implements your proposal - I pass loop
>>>>>>>>> instead of stmt to determine either REF is defined inside LOOP nest or
>>>>>>>>> not. I checked that for pr70729-nest.cc the reference this->S_n  for
>>>>>>>>> statements which are out of innermost loop are  not considered as
>>>>>>>>> independent as you pointed out.
>>>>>>>>>
>>>>>>>>> Regression testing did not show any new failures and both failed tests
>>>>>>>>> from libgomp.fortran suite now passed.
>>>>>>>>>
>>>>>>>>> Is it OK for trunk?
>>>>>>>>
>>>>>>>> I don't quite understand
>>>>>>>>
>>>>>>>> +  /* Ignore dependence for loops having greater safelen.  */
>>>>>>>> +  if (new_safelen == safelen
>>>>>>>> +      && bitmap_bit_p (&ref->dep_loop, LOOP_DEP_BIT (loop->num, 
>>>>>>>> stored_p)))
>>>>>>>>      return false;
>>>>>>>>
>>>>>>>> this seems to suggest (correctly I think) that we cannot rely on the 
>>>>>>>> caching
>>>>>>>> for safelen, neither for optimal results (you seem to address that) 
>>>>>>>> but also
>>>>>>>> not for correctness (we cache the no-dep result from a safelen run and
>>>>>>>> then happily re-use that info for a ref that is not safe for safelen).
>>>>>>>>
>>>>>>>> It seems to me we need to avoid any caching if we made things 
>>>>>>>> independent
>>>>>>>> because of safelen and simply not do the dep test afterwards.  this 
>>>>>>>> means
>>>>>>>> inlining ref_indep_loop_p_1 partly into _2 (not sure if there's a 
>>>>>>>> great way
>>>>>>>> to do this w/o confusing the control flow).
>>>>>>>>
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>>> ChangeLog:
>>>>>>>>> 2016-08-05  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>>>>>>
>>>>>>>>> PR tree-optimization/71734
>>>>>>>>> * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP.
>>>>>>>>> (outermost_indep_loop): Pass LOOP argumnet where REF was defined to
>>>>>>>>> ref_indep_loop_p.
>>>>>>>>> (ref_indep_loop_p_1): Fix commentary.
>>>>>>>>> (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new
>>>>>>>>> variable NEW_SAFELEN which may have new value for SAFELEN, ignore
>>>>>>>>> dependencde for loop having greater safelen value, pass REF_LOOP in
>>>>>>>>> recursive call.
>>>>>>>>> (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p.
>>>>>>>>>
>>>>>>>>> 2016-08-03 16:44 GMT+03:00 Richard Biener 
>>>>>>>>> <richard.guent...@gmail.com>:
>>>>>>>>>> On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev 
>>>>>>>>>> <ysrum...@gmail.com> wrote:
>>>>>>>>>>> Hi Richard.
>>>>>>>>>>>
>>>>>>>>>>> It turned out that the fix proposed by you does not work for liggomp
>>>>>>>>>>> tests simd3 and simd4.
>>>>>>>>>>> The reason is that we can't change safelen value for references not
>>>>>>>>>>> defined inside loop. So I add missed check on it to patch.
>>>>>>>>>>> Is it OK for trunk?
>>>>>>>>>>
>>>>>>>>>> Hmm, I don't like the walk of all subloops in ref_defined_in_loop_p 
>>>>>>>>>> as
>>>>>>>>>> that operation can end up being quadratic in the loop depth/width.
>>>>>>>>>>
>>>>>>>>>> But I also wonder about correctness given that LIM "commons"
>>>>>>>>>> references.  So we can have
>>>>>>>>>>
>>>>>>>>>>   for (;;)
>>>>>>>>>>     .. = ref;  (1)
>>>>>>>>>>     for (;;) // safelen == 2  (2)
>>>>>>>>>>       ... = ref;
>>>>>>>>>>
>>>>>>>>>> and when looking at the ref at (1) which according to you should not
>>>>>>>>>> have safelen applied your function will happily return that ref is 
>>>>>>>>>> defined
>>>>>>>>>> in the inner loop.
>>>>>>>>>>
>>>>>>>>>> So it looks like to be able to apply safelen the caller of 
>>>>>>>>>> ref_indep_loop_p
>>>>>>>>>> needs to pass down a ref plus a location (a stmt).  In which case 
>>>>>>>>>> your
>>>>>>>>>> function can simply use flow_loop_nested_p (loop, gimple_bb
>>>>>>>>>> (stmt)->loop_father);
>>>>>>>>>>
>>>>>>>>>> Richard.
>>>>>>>>>>
>>>>>>>>>>> ChangeLog:
>>>>>>>>>>> 2016-07-29  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>>>>>>>>
>>>>>>>>>>> PR tree-optimization/71734
>>>>>>>>>>> * tree-ssa-loop-im.c (ref_defined_in_loop_p): New function.
>>>>>>>>>>> (ref_indep_loop_p_2): Change SAFELEN value for REF defined inside 
>>>>>>>>>>> LOOP.
>>>>>>>>>>>
>>>>>>>>>>> 2016-07-29 13:08 GMT+03:00 Yuri Rumyantsev <ysrum...@gmail.com>:
>>>>>>>>>>>> Sorry H.J.
>>>>>>>>>>>>
>>>>>>>>>>>> I checked both these tests manually but forgot to pass "-fopenmp" 
>>>>>>>>>>>> option.
>>>>>>>>>>>> I will fix the issue asap.
>>>>>>>>>>>>
>>>>>>>>>>>> 2016-07-29 0:33 GMT+03:00 H.J. Lu <hjl.to...@gmail.com>:
>>>>>>>>>>>>> On Thu, Jul 28, 2016 at 6:49 AM, Yuri Rumyantsev 
>>>>>>>>>>>>> <ysrum...@gmail.com> wrote:
>>>>>>>>>>>>>> Richard,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I prepare a patch which is based on yours. New test is also 
>>>>>>>>>>>>>> included.
>>>>>>>>>>>>>> Bootstrapping and regression testing did not show any new 
>>>>>>>>>>>>>> failures.
>>>>>>>>>>>>>> Is it OK for trunk?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>>> ChangeLog:
>>>>>>>>>>>>>> 2016-07-28  Yuri Rumyantsev  <ysrum...@gmail.com>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> PR tree-optimization/71734
>>>>>>>>>>>>>> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Pass value of safelen
>>>>>>>>>>>>>> attribute instead of REF_LOOP and use it.
>>>>>>>>>>>>>> (ref_indep_loop_p_2): Use SAFELEN argument instead of REF_LOOP 
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>> set it for Loops having non-zero safelen attribute.
>>>>>>>>>>>>>> (ref_indep_loop_p): Pass zero as initial value for safelen.
>>>>>>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>>>>>>         * g++.dg/vect/pr70729-nest.cc: New test.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Does this cause
>>>>>>>>>>>>>
>>>>>>>>>>>>> FAIL: libgomp.fortran/pr71734-1.f90   -O3 -fomit-frame-pointer
>>>>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution
>>>>>>>>>>>>> test
>>>>>>>>>>>>> FAIL: libgomp.fortran/pr71734-1.f90   -O3 -g  execution test
>>>>>>>>>>>>> FAIL: libgomp.fortran/pr71734-2.f90   -O3 -fomit-frame-pointer
>>>>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution
>>>>>>>>>>>>> test
>>>>>>>>>>>>> FAIL: libgomp.fortran/pr71734-2.f90   -O3 -g  execution test
>>>>>>>>>>>>>
>>>>>>>>>>>>> on AVX machines and
>>>>>>>>>>>>>
>>>>>>>>>>>>> FAIL: libgomp.fortran/simd3.f90   -O3 -fomit-frame-pointer
>>>>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution
>>>>>>>>>>>>> test
>>>>>>>>>>>>> FAIL: libgomp.fortran/simd3.f90   -O3 -g  execution test
>>>>>>>>>>>>> FAIL: libgomp.fortran/simd4.f90   -O3 -fomit-frame-pointer
>>>>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution
>>>>>>>>>>>>> test
>>>>>>>>>>>>> FAIL: libgomp.fortran/simd4.f90   -O3 -g  execution test
>>>>>>>>>>>>>
>>>>>>>>>>>>> on non-AVX machines?
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> H.J.

Reply via email to