Re: [PATCH PR71734] Add missed check that reference defined inside loop.
On Tue, Aug 9, 2016 at 2:05 PM, Yuri Rumyantsev wrote: > Richard, > > I checked that this move helps. > Does it mean that I've got approval to integrate it to trunk? Yes, if it survives bootstrap & regtest. Richard. > 2016-08-09 14:33 GMT+03:00 Richard Biener : >> On Tue, Aug 9, 2016 at 1:26 PM, Yuri Rumyantsev wrote: >>> Richard, >>> >>> The patch proposed by you does not work properly for >>> g++.dg/vect/pr70729-nest.cc test since the reference for this->S_n has >>> been cached as dependent for outer loop and loop is not vectorized: >>> >>> g++ -Ofast -fopenmp -mavx2 pr70729-nest.cc -c >>> -fdump-tree-vect-details >>> grep 'LOOP VECTORIZED' pr70729-nest.cc.149t.vect >>> >>> >>> You missed additional check I added before check on cached dependence. >> >> Ok, but it should get the correctness right? >> >> I suppose that if you move the cache checks inside the else clause it >> would work? >> I'd be ok with that change. >> >> Richard. >> >>> 2016-08-09 13:00 GMT+03:00 Richard Biener : On Tue, Aug 9, 2016 at 11:20 AM, Yuri Rumyantsev wrote: > Yes it is impossible since all basic blocks are handled from outer > loops to innermost so we can not have the sequence with wrong > dependence, i.e. we cached that reference is independent (due to > safelen) but the same reference in outer loop must be evaluated as > dependent. So we must re-evaluate only dependent references in loops > having non-zero safelen attribute. Hmm. I don't like depending on this implementation detail. Does the attached patch work which simply avoids any positive/negative caching on safelen affected refs? It also makes the query cheaper by avoiding the dive into child loops. Richard. > 2016-08-09 11:44 GMT+03:00 Richard Biener : >> On Fri, Aug 5, 2016 at 4:57 PM, Yuri Rumyantsev >> wrote: >>> Richard, >>> >>> I added additional check before caching dependencies since (1) all >>> statements in loop are handled in loop postorder order, i.e. form >>> outer to inner; (2) we can change dependency for reference in subloops >>> which have non-zero safelen attribute. So I propose to re-evaluate it >>> in such cases. I don't see why we need to avoid dependence caching for >>> all loop nests since pragma omp simd is used very rarely. >> >> You think it is impossible to construct a testcase which hits the >> correctness issue? >> "very rarely" is not a good argument to generate wrong code. >> >> Richard. >> >>> 2016-08-05 16:50 GMT+03:00 Richard Biener : On Fri, Aug 5, 2016 at 3:28 PM, Yuri Rumyantsev wrote: > Richard, > > Here is updated patch which implements your proposal - I pass loop > instead of stmt to determine either REF is defined inside LOOP nest or > not. I checked that for pr70729-nest.cc the reference this->S_n for > statements which are out of innermost loop are not considered as > independent as you pointed out. > > Regression testing did not show any new failures and both failed tests > from libgomp.fortran suite now passed. > > Is it OK for trunk? I don't quite understand + /* Ignore dependence for loops having greater safelen. */ + if (new_safelen == safelen + && bitmap_bit_p (&ref->dep_loop, LOOP_DEP_BIT (loop->num, stored_p))) return false; this seems to suggest (correctly I think) that we cannot rely on the caching for safelen, neither for optimal results (you seem to address that) but also not for correctness (we cache the no-dep result from a safelen run and then happily re-use that info for a ref that is not safe for safelen). It seems to me we need to avoid any caching if we made things independent because of safelen and simply not do the dep test afterwards. this means inlining ref_indep_loop_p_1 partly into _2 (not sure if there's a great way to do this w/o confusing the control flow). Richard. > ChangeLog: > 2016-08-05 Yuri Rumyantsev > > PR tree-optimization/71734 > * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP. > (outermost_indep_loop): Pass LOOP argumnet where REF was defined to > ref_indep_loop_p. > (ref_indep_loop_p_1): Fix commentary. > (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new > variable NEW_SAFELEN which may have new value for SAFELEN, ignore > dependencde for loop having greater safelen value, pass REF_LOOP in > recursive call. > (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p. > > 2016-08-03 16:
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
Richard, I checked that this move helps. Does it mean that I've got approval to integrate it to trunk? 2016-08-09 14:33 GMT+03:00 Richard Biener : > On Tue, Aug 9, 2016 at 1:26 PM, Yuri Rumyantsev wrote: >> Richard, >> >> The patch proposed by you does not work properly for >> g++.dg/vect/pr70729-nest.cc test since the reference for this->S_n has >> been cached as dependent for outer loop and loop is not vectorized: >> >> g++ -Ofast -fopenmp -mavx2 pr70729-nest.cc -c >> -fdump-tree-vect-details >> grep 'LOOP VECTORIZED' pr70729-nest.cc.149t.vect >> >> >> You missed additional check I added before check on cached dependence. > > Ok, but it should get the correctness right? > > I suppose that if you move the cache checks inside the else clause it > would work? > I'd be ok with that change. > > Richard. > >> 2016-08-09 13:00 GMT+03:00 Richard Biener : >>> On Tue, Aug 9, 2016 at 11:20 AM, Yuri Rumyantsev wrote: Yes it is impossible since all basic blocks are handled from outer loops to innermost so we can not have the sequence with wrong dependence, i.e. we cached that reference is independent (due to safelen) but the same reference in outer loop must be evaluated as dependent. So we must re-evaluate only dependent references in loops having non-zero safelen attribute. >>> >>> Hmm. I don't like depending on this implementation detail. Does the >>> attached patch work >>> which simply avoids any positive/negative caching on safelen affected >>> refs? It also makes >>> the query cheaper by avoiding the dive into child loops. >>> >>> Richard. >>> 2016-08-09 11:44 GMT+03:00 Richard Biener : > On Fri, Aug 5, 2016 at 4:57 PM, Yuri Rumyantsev > wrote: >> Richard, >> >> I added additional check before caching dependencies since (1) all >> statements in loop are handled in loop postorder order, i.e. form >> outer to inner; (2) we can change dependency for reference in subloops >> which have non-zero safelen attribute. So I propose to re-evaluate it >> in such cases. I don't see why we need to avoid dependence caching for >> all loop nests since pragma omp simd is used very rarely. > > You think it is impossible to construct a testcase which hits the > correctness issue? > "very rarely" is not a good argument to generate wrong code. > > Richard. > >> 2016-08-05 16:50 GMT+03:00 Richard Biener : >>> On Fri, Aug 5, 2016 at 3:28 PM, Yuri Rumyantsev >>> wrote: Richard, Here is updated patch which implements your proposal - I pass loop instead of stmt to determine either REF is defined inside LOOP nest or not. I checked that for pr70729-nest.cc the reference this->S_n for statements which are out of innermost loop are not considered as independent as you pointed out. Regression testing did not show any new failures and both failed tests from libgomp.fortran suite now passed. Is it OK for trunk? >>> >>> I don't quite understand >>> >>> + /* Ignore dependence for loops having greater safelen. */ >>> + if (new_safelen == safelen >>> + && bitmap_bit_p (&ref->dep_loop, LOOP_DEP_BIT (loop->num, >>> stored_p))) >>> return false; >>> >>> this seems to suggest (correctly I think) that we cannot rely on the >>> caching >>> for safelen, neither for optimal results (you seem to address that) but >>> also >>> not for correctness (we cache the no-dep result from a safelen run and >>> then happily re-use that info for a ref that is not safe for safelen). >>> >>> It seems to me we need to avoid any caching if we made things >>> independent >>> because of safelen and simply not do the dep test afterwards. this >>> means >>> inlining ref_indep_loop_p_1 partly into _2 (not sure if there's a great >>> way >>> to do this w/o confusing the control flow). >>> >>> Richard. >>> ChangeLog: 2016-08-05 Yuri Rumyantsev PR tree-optimization/71734 * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP. (outermost_indep_loop): Pass LOOP argumnet where REF was defined to ref_indep_loop_p. (ref_indep_loop_p_1): Fix commentary. (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new variable NEW_SAFELEN which may have new value for SAFELEN, ignore dependencde for loop having greater safelen value, pass REF_LOOP in recursive call. (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p. 2016-08-03 16:44 GMT+03:00 Richard Biener : > On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev > wrote: >> Hi Richard. >> >> It turned out that the fix proposed by you does not work for liggomp >> tes
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
On Tue, Aug 9, 2016 at 1:26 PM, Yuri Rumyantsev wrote: > Richard, > > The patch proposed by you does not work properly for > g++.dg/vect/pr70729-nest.cc test since the reference for this->S_n has > been cached as dependent for outer loop and loop is not vectorized: > > g++ -Ofast -fopenmp -mavx2 pr70729-nest.cc -c > -fdump-tree-vect-details > grep 'LOOP VECTORIZED' pr70729-nest.cc.149t.vect > > > You missed additional check I added before check on cached dependence. Ok, but it should get the correctness right? I suppose that if you move the cache checks inside the else clause it would work? I'd be ok with that change. Richard. > 2016-08-09 13:00 GMT+03:00 Richard Biener : >> On Tue, Aug 9, 2016 at 11:20 AM, Yuri Rumyantsev wrote: >>> Yes it is impossible since all basic blocks are handled from outer >>> loops to innermost so we can not have the sequence with wrong >>> dependence, i.e. we cached that reference is independent (due to >>> safelen) but the same reference in outer loop must be evaluated as >>> dependent. So we must re-evaluate only dependent references in loops >>> having non-zero safelen attribute. >> >> Hmm. I don't like depending on this implementation detail. Does the >> attached patch work >> which simply avoids any positive/negative caching on safelen affected >> refs? It also makes >> the query cheaper by avoiding the dive into child loops. >> >> Richard. >> >>> 2016-08-09 11:44 GMT+03:00 Richard Biener : On Fri, Aug 5, 2016 at 4:57 PM, Yuri Rumyantsev wrote: > Richard, > > I added additional check before caching dependencies since (1) all > statements in loop are handled in loop postorder order, i.e. form > outer to inner; (2) we can change dependency for reference in subloops > which have non-zero safelen attribute. So I propose to re-evaluate it > in such cases. I don't see why we need to avoid dependence caching for > all loop nests since pragma omp simd is used very rarely. You think it is impossible to construct a testcase which hits the correctness issue? "very rarely" is not a good argument to generate wrong code. Richard. > 2016-08-05 16:50 GMT+03:00 Richard Biener : >> On Fri, Aug 5, 2016 at 3:28 PM, Yuri Rumyantsev >> wrote: >>> Richard, >>> >>> Here is updated patch which implements your proposal - I pass loop >>> instead of stmt to determine either REF is defined inside LOOP nest or >>> not. I checked that for pr70729-nest.cc the reference this->S_n for >>> statements which are out of innermost loop are not considered as >>> independent as you pointed out. >>> >>> Regression testing did not show any new failures and both failed tests >>> from libgomp.fortran suite now passed. >>> >>> Is it OK for trunk? >> >> I don't quite understand >> >> + /* Ignore dependence for loops having greater safelen. */ >> + if (new_safelen == safelen >> + && bitmap_bit_p (&ref->dep_loop, LOOP_DEP_BIT (loop->num, >> stored_p))) >> return false; >> >> this seems to suggest (correctly I think) that we cannot rely on the >> caching >> for safelen, neither for optimal results (you seem to address that) but >> also >> not for correctness (we cache the no-dep result from a safelen run and >> then happily re-use that info for a ref that is not safe for safelen). >> >> It seems to me we need to avoid any caching if we made things independent >> because of safelen and simply not do the dep test afterwards. this means >> inlining ref_indep_loop_p_1 partly into _2 (not sure if there's a great >> way >> to do this w/o confusing the control flow). >> >> Richard. >> >>> ChangeLog: >>> 2016-08-05 Yuri Rumyantsev >>> >>> PR tree-optimization/71734 >>> * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP. >>> (outermost_indep_loop): Pass LOOP argumnet where REF was defined to >>> ref_indep_loop_p. >>> (ref_indep_loop_p_1): Fix commentary. >>> (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new >>> variable NEW_SAFELEN which may have new value for SAFELEN, ignore >>> dependencde for loop having greater safelen value, pass REF_LOOP in >>> recursive call. >>> (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p. >>> >>> 2016-08-03 16:44 GMT+03:00 Richard Biener : On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev wrote: > Hi Richard. > > It turned out that the fix proposed by you does not work for liggomp > tests simd3 and simd4. > The reason is that we can't change safelen value for references not > defined inside loop. So I add missed check on it to patch. > Is it OK for trunk? Hmm, I don't like the walk of all subloops in ref_defined_in_loop_p as t
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
Richard, The patch proposed by you does not work properly for g++.dg/vect/pr70729-nest.cc test since the reference for this->S_n has been cached as dependent for outer loop and loop is not vectorized: g++ -Ofast -fopenmp -mavx2 pr70729-nest.cc -c -fdump-tree-vect-details grep 'LOOP VECTORIZED' pr70729-nest.cc.149t.vect You missed additional check I added before check on cached dependence. 2016-08-09 13:00 GMT+03:00 Richard Biener : > On Tue, Aug 9, 2016 at 11:20 AM, Yuri Rumyantsev wrote: >> Yes it is impossible since all basic blocks are handled from outer >> loops to innermost so we can not have the sequence with wrong >> dependence, i.e. we cached that reference is independent (due to >> safelen) but the same reference in outer loop must be evaluated as >> dependent. So we must re-evaluate only dependent references in loops >> having non-zero safelen attribute. > > Hmm. I don't like depending on this implementation detail. Does the > attached patch work > which simply avoids any positive/negative caching on safelen affected > refs? It also makes > the query cheaper by avoiding the dive into child loops. > > Richard. > >> 2016-08-09 11:44 GMT+03:00 Richard Biener : >>> On Fri, Aug 5, 2016 at 4:57 PM, Yuri Rumyantsev wrote: Richard, I added additional check before caching dependencies since (1) all statements in loop are handled in loop postorder order, i.e. form outer to inner; (2) we can change dependency for reference in subloops which have non-zero safelen attribute. So I propose to re-evaluate it in such cases. I don't see why we need to avoid dependence caching for all loop nests since pragma omp simd is used very rarely. >>> >>> You think it is impossible to construct a testcase which hits the >>> correctness issue? >>> "very rarely" is not a good argument to generate wrong code. >>> >>> Richard. >>> 2016-08-05 16:50 GMT+03:00 Richard Biener : > On Fri, Aug 5, 2016 at 3:28 PM, Yuri Rumyantsev > wrote: >> Richard, >> >> Here is updated patch which implements your proposal - I pass loop >> instead of stmt to determine either REF is defined inside LOOP nest or >> not. I checked that for pr70729-nest.cc the reference this->S_n for >> statements which are out of innermost loop are not considered as >> independent as you pointed out. >> >> Regression testing did not show any new failures and both failed tests >> from libgomp.fortran suite now passed. >> >> Is it OK for trunk? > > I don't quite understand > > + /* Ignore dependence for loops having greater safelen. */ > + if (new_safelen == safelen > + && bitmap_bit_p (&ref->dep_loop, LOOP_DEP_BIT (loop->num, > stored_p))) > return false; > > this seems to suggest (correctly I think) that we cannot rely on the > caching > for safelen, neither for optimal results (you seem to address that) but > also > not for correctness (we cache the no-dep result from a safelen run and > then happily re-use that info for a ref that is not safe for safelen). > > It seems to me we need to avoid any caching if we made things independent > because of safelen and simply not do the dep test afterwards. this means > inlining ref_indep_loop_p_1 partly into _2 (not sure if there's a great > way > to do this w/o confusing the control flow). > > Richard. > >> ChangeLog: >> 2016-08-05 Yuri Rumyantsev >> >> PR tree-optimization/71734 >> * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP. >> (outermost_indep_loop): Pass LOOP argumnet where REF was defined to >> ref_indep_loop_p. >> (ref_indep_loop_p_1): Fix commentary. >> (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new >> variable NEW_SAFELEN which may have new value for SAFELEN, ignore >> dependencde for loop having greater safelen value, pass REF_LOOP in >> recursive call. >> (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p. >> >> 2016-08-03 16:44 GMT+03:00 Richard Biener : >>> On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev >>> wrote: Hi Richard. It turned out that the fix proposed by you does not work for liggomp tests simd3 and simd4. The reason is that we can't change safelen value for references not defined inside loop. So I add missed check on it to patch. Is it OK for trunk? >>> >>> Hmm, I don't like the walk of all subloops in ref_defined_in_loop_p as >>> that operation can end up being quadratic in the loop depth/width. >>> >>> But I also wonder about correctness given that LIM "commons" >>> references. So we can have >>> >>> for (;;) >>> .. = ref; (1) >>> for (;;) // safelen == 2 (2) >>> ... = ref; >>> >>> and when looking at the ref a
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
On Tue, Aug 9, 2016 at 11:20 AM, Yuri Rumyantsev wrote: > Yes it is impossible since all basic blocks are handled from outer > loops to innermost so we can not have the sequence with wrong > dependence, i.e. we cached that reference is independent (due to > safelen) but the same reference in outer loop must be evaluated as > dependent. So we must re-evaluate only dependent references in loops > having non-zero safelen attribute. Hmm. I don't like depending on this implementation detail. Does the attached patch work which simply avoids any positive/negative caching on safelen affected refs? It also makes the query cheaper by avoiding the dive into child loops. Richard. > 2016-08-09 11:44 GMT+03:00 Richard Biener : >> On Fri, Aug 5, 2016 at 4:57 PM, Yuri Rumyantsev wrote: >>> Richard, >>> >>> I added additional check before caching dependencies since (1) all >>> statements in loop are handled in loop postorder order, i.e. form >>> outer to inner; (2) we can change dependency for reference in subloops >>> which have non-zero safelen attribute. So I propose to re-evaluate it >>> in such cases. I don't see why we need to avoid dependence caching for >>> all loop nests since pragma omp simd is used very rarely. >> >> You think it is impossible to construct a testcase which hits the >> correctness issue? >> "very rarely" is not a good argument to generate wrong code. >> >> Richard. >> >>> 2016-08-05 16:50 GMT+03:00 Richard Biener : On Fri, Aug 5, 2016 at 3:28 PM, Yuri Rumyantsev wrote: > Richard, > > Here is updated patch which implements your proposal - I pass loop > instead of stmt to determine either REF is defined inside LOOP nest or > not. I checked that for pr70729-nest.cc the reference this->S_n for > statements which are out of innermost loop are not considered as > independent as you pointed out. > > Regression testing did not show any new failures and both failed tests > from libgomp.fortran suite now passed. > > Is it OK for trunk? I don't quite understand + /* Ignore dependence for loops having greater safelen. */ + if (new_safelen == safelen + && bitmap_bit_p (&ref->dep_loop, LOOP_DEP_BIT (loop->num, stored_p))) return false; this seems to suggest (correctly I think) that we cannot rely on the caching for safelen, neither for optimal results (you seem to address that) but also not for correctness (we cache the no-dep result from a safelen run and then happily re-use that info for a ref that is not safe for safelen). It seems to me we need to avoid any caching if we made things independent because of safelen and simply not do the dep test afterwards. this means inlining ref_indep_loop_p_1 partly into _2 (not sure if there's a great way to do this w/o confusing the control flow). Richard. > ChangeLog: > 2016-08-05 Yuri Rumyantsev > > PR tree-optimization/71734 > * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP. > (outermost_indep_loop): Pass LOOP argumnet where REF was defined to > ref_indep_loop_p. > (ref_indep_loop_p_1): Fix commentary. > (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new > variable NEW_SAFELEN which may have new value for SAFELEN, ignore > dependencde for loop having greater safelen value, pass REF_LOOP in > recursive call. > (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p. > > 2016-08-03 16:44 GMT+03:00 Richard Biener : >> On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev >> wrote: >>> Hi Richard. >>> >>> It turned out that the fix proposed by you does not work for liggomp >>> tests simd3 and simd4. >>> The reason is that we can't change safelen value for references not >>> defined inside loop. So I add missed check on it to patch. >>> Is it OK for trunk? >> >> Hmm, I don't like the walk of all subloops in ref_defined_in_loop_p as >> that operation can end up being quadratic in the loop depth/width. >> >> But I also wonder about correctness given that LIM "commons" >> references. So we can have >> >> for (;;) >> .. = ref; (1) >> for (;;) // safelen == 2 (2) >> ... = ref; >> >> and when looking at the ref at (1) which according to you should not >> have safelen applied your function will happily return that ref is >> defined >> in the inner loop. >> >> So it looks like to be able to apply safelen the caller of >> ref_indep_loop_p >> needs to pass down a ref plus a location (a stmt). In which case your >> function can simply use flow_loop_nested_p (loop, gimple_bb >> (stmt)->loop_father); >> >> Richard. >> >>> ChangeLog: >>> 2016-07-29 Yuri Rumyantsev >>> >>> PR tree-optimization/71734 >>> * tr
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
On Fri, Aug 5, 2016 at 3:28 PM, Yuri Rumyantsev wrote: > Richard, > > Here is updated patch which implements your proposal - I pass loop > instead of stmt to determine either REF is defined inside LOOP nest or > not. I checked that for pr70729-nest.cc the reference this->S_n for > statements which are out of innermost loop are not considered as > independent as you pointed out. > > Regression testing did not show any new failures and both failed tests > from libgomp.fortran suite now passed. > > Is it OK for trunk? I don't quite understand + /* Ignore dependence for loops having greater safelen. */ + if (new_safelen == safelen + && bitmap_bit_p (&ref->dep_loop, LOOP_DEP_BIT (loop->num, stored_p))) return false; this seems to suggest (correctly I think) that we cannot rely on the caching for safelen, neither for optimal results (you seem to address that) but also not for correctness (we cache the no-dep result from a safelen run and then happily re-use that info for a ref that is not safe for safelen). It seems to me we need to avoid any caching if we made things independent because of safelen and simply not do the dep test afterwards. this means inlining ref_indep_loop_p_1 partly into _2 (not sure if there's a great way to do this w/o confusing the control flow). Richard. > ChangeLog: > 2016-08-05 Yuri Rumyantsev > > PR tree-optimization/71734 > * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP. > (outermost_indep_loop): Pass LOOP argumnet where REF was defined to > ref_indep_loop_p. > (ref_indep_loop_p_1): Fix commentary. > (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new > variable NEW_SAFELEN which may have new value for SAFELEN, ignore > dependencde for loop having greater safelen value, pass REF_LOOP in > recursive call. > (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p. > > 2016-08-03 16:44 GMT+03:00 Richard Biener : >> On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev wrote: >>> Hi Richard. >>> >>> It turned out that the fix proposed by you does not work for liggomp >>> tests simd3 and simd4. >>> The reason is that we can't change safelen value for references not >>> defined inside loop. So I add missed check on it to patch. >>> Is it OK for trunk? >> >> Hmm, I don't like the walk of all subloops in ref_defined_in_loop_p as >> that operation can end up being quadratic in the loop depth/width. >> >> But I also wonder about correctness given that LIM "commons" >> references. So we can have >> >> for (;;) >> .. = ref; (1) >> for (;;) // safelen == 2 (2) >> ... = ref; >> >> and when looking at the ref at (1) which according to you should not >> have safelen applied your function will happily return that ref is defined >> in the inner loop. >> >> So it looks like to be able to apply safelen the caller of ref_indep_loop_p >> needs to pass down a ref plus a location (a stmt). In which case your >> function can simply use flow_loop_nested_p (loop, gimple_bb >> (stmt)->loop_father); >> >> Richard. >> >>> ChangeLog: >>> 2016-07-29 Yuri Rumyantsev >>> >>> PR tree-optimization/71734 >>> * tree-ssa-loop-im.c (ref_defined_in_loop_p): New function. >>> (ref_indep_loop_p_2): Change SAFELEN value for REF defined inside LOOP. >>> >>> 2016-07-29 13:08 GMT+03:00 Yuri Rumyantsev : Sorry H.J. I checked both these tests manually but forgot to pass "-fopenmp" option. I will fix the issue asap. 2016-07-29 0:33 GMT+03:00 H.J. Lu : > On Thu, Jul 28, 2016 at 6:49 AM, Yuri Rumyantsev > wrote: >> Richard, >> >> I prepare a patch which is based on yours. New test is also included. >> Bootstrapping and regression testing did not show any new failures. >> Is it OK for trunk? >> >> Thanks. >> ChangeLog: >> 2016-07-28 Yuri Rumyantsev >> >> PR tree-optimization/71734 >> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Pass value of safelen >> attribute instead of REF_LOOP and use it. >> (ref_indep_loop_p_2): Use SAFELEN argument instead of REF_LOOP and >> set it for Loops having non-zero safelen attribute. >> (ref_indep_loop_p): Pass zero as initial value for safelen. >> gcc/testsuite/ChangeLog: >> * g++.dg/vect/pr70729-nest.cc: New test. >> > > Does this cause > > FAIL: libgomp.fortran/pr71734-1.f90 -O3 -fomit-frame-pointer > -funroll-loops -fpeel-loops -ftracer -finline-functions execution > test > FAIL: libgomp.fortran/pr71734-1.f90 -O3 -g execution test > FAIL: libgomp.fortran/pr71734-2.f90 -O3 -fomit-frame-pointer > -funroll-loops -fpeel-loops -ftracer -finline-functions execution > test > FAIL: libgomp.fortran/pr71734-2.f90 -O3 -g execution test > > on AVX machines and > > FAIL: libgomp.fortran/simd3.f90 -O3 -fomit-frame-pointer > -funroll-loops -fpeel-loops -ftracer -finline-functions execution > test > FAIL: libgomp.
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
Richard, Here is updated patch which implements your proposal - I pass loop instead of stmt to determine either REF is defined inside LOOP nest or not. I checked that for pr70729-nest.cc the reference this->S_n for statements which are out of innermost loop are not considered as independent as you pointed out. Regression testing did not show any new failures and both failed tests from libgomp.fortran suite now passed. Is it OK for trunk? ChangeLog: 2016-08-05 Yuri Rumyantsev PR tree-optimization/71734 * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP. (outermost_indep_loop): Pass LOOP argumnet where REF was defined to ref_indep_loop_p. (ref_indep_loop_p_1): Fix commentary. (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new variable NEW_SAFELEN which may have new value for SAFELEN, ignore dependencde for loop having greater safelen value, pass REF_LOOP in recursive call. (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p. 2016-08-03 16:44 GMT+03:00 Richard Biener : > On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev wrote: >> Hi Richard. >> >> It turned out that the fix proposed by you does not work for liggomp >> tests simd3 and simd4. >> The reason is that we can't change safelen value for references not >> defined inside loop. So I add missed check on it to patch. >> Is it OK for trunk? > > Hmm, I don't like the walk of all subloops in ref_defined_in_loop_p as > that operation can end up being quadratic in the loop depth/width. > > But I also wonder about correctness given that LIM "commons" > references. So we can have > > for (;;) > .. = ref; (1) > for (;;) // safelen == 2 (2) > ... = ref; > > and when looking at the ref at (1) which according to you should not > have safelen applied your function will happily return that ref is defined > in the inner loop. > > So it looks like to be able to apply safelen the caller of ref_indep_loop_p > needs to pass down a ref plus a location (a stmt). In which case your > function can simply use flow_loop_nested_p (loop, gimple_bb > (stmt)->loop_father); > > Richard. > >> ChangeLog: >> 2016-07-29 Yuri Rumyantsev >> >> PR tree-optimization/71734 >> * tree-ssa-loop-im.c (ref_defined_in_loop_p): New function. >> (ref_indep_loop_p_2): Change SAFELEN value for REF defined inside LOOP. >> >> 2016-07-29 13:08 GMT+03:00 Yuri Rumyantsev : >>> Sorry H.J. >>> >>> I checked both these tests manually but forgot to pass "-fopenmp" option. >>> I will fix the issue asap. >>> >>> 2016-07-29 0:33 GMT+03:00 H.J. Lu : On Thu, Jul 28, 2016 at 6:49 AM, Yuri Rumyantsev wrote: > Richard, > > I prepare a patch which is based on yours. New test is also included. > Bootstrapping and regression testing did not show any new failures. > Is it OK for trunk? > > Thanks. > ChangeLog: > 2016-07-28 Yuri Rumyantsev > > PR tree-optimization/71734 > * tree-ssa-loop-im.c (ref_indep_loop_p_1): Pass value of safelen > attribute instead of REF_LOOP and use it. > (ref_indep_loop_p_2): Use SAFELEN argument instead of REF_LOOP and > set it for Loops having non-zero safelen attribute. > (ref_indep_loop_p): Pass zero as initial value for safelen. > gcc/testsuite/ChangeLog: > * g++.dg/vect/pr70729-nest.cc: New test. > Does this cause FAIL: libgomp.fortran/pr71734-1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: libgomp.fortran/pr71734-1.f90 -O3 -g execution test FAIL: libgomp.fortran/pr71734-2.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: libgomp.fortran/pr71734-2.f90 -O3 -g execution test on AVX machines and FAIL: libgomp.fortran/simd3.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: libgomp.fortran/simd3.f90 -O3 -g execution test FAIL: libgomp.fortran/simd4.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: libgomp.fortran/simd4.f90 -O3 -g execution test on non-AVX machines? -- H.J. 71734.patch.4 Description: Binary data
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev wrote: > Hi Richard. > > It turned out that the fix proposed by you does not work for liggomp > tests simd3 and simd4. > The reason is that we can't change safelen value for references not > defined inside loop. So I add missed check on it to patch. > Is it OK for trunk? Hmm, I don't like the walk of all subloops in ref_defined_in_loop_p as that operation can end up being quadratic in the loop depth/width. But I also wonder about correctness given that LIM "commons" references. So we can have for (;;) .. = ref; (1) for (;;) // safelen == 2 (2) ... = ref; and when looking at the ref at (1) which according to you should not have safelen applied your function will happily return that ref is defined in the inner loop. So it looks like to be able to apply safelen the caller of ref_indep_loop_p needs to pass down a ref plus a location (a stmt). In which case your function can simply use flow_loop_nested_p (loop, gimple_bb (stmt)->loop_father); Richard. > ChangeLog: > 2016-07-29 Yuri Rumyantsev > > PR tree-optimization/71734 > * tree-ssa-loop-im.c (ref_defined_in_loop_p): New function. > (ref_indep_loop_p_2): Change SAFELEN value for REF defined inside LOOP. > > 2016-07-29 13:08 GMT+03:00 Yuri Rumyantsev : >> Sorry H.J. >> >> I checked both these tests manually but forgot to pass "-fopenmp" option. >> I will fix the issue asap. >> >> 2016-07-29 0:33 GMT+03:00 H.J. Lu : >>> On Thu, Jul 28, 2016 at 6:49 AM, Yuri Rumyantsev wrote: Richard, I prepare a patch which is based on yours. New test is also included. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? Thanks. ChangeLog: 2016-07-28 Yuri Rumyantsev PR tree-optimization/71734 * tree-ssa-loop-im.c (ref_indep_loop_p_1): Pass value of safelen attribute instead of REF_LOOP and use it. (ref_indep_loop_p_2): Use SAFELEN argument instead of REF_LOOP and set it for Loops having non-zero safelen attribute. (ref_indep_loop_p): Pass zero as initial value for safelen. gcc/testsuite/ChangeLog: * g++.dg/vect/pr70729-nest.cc: New test. >>> >>> Does this cause >>> >>> FAIL: libgomp.fortran/pr71734-1.f90 -O3 -fomit-frame-pointer >>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >>> test >>> FAIL: libgomp.fortran/pr71734-1.f90 -O3 -g execution test >>> FAIL: libgomp.fortran/pr71734-2.f90 -O3 -fomit-frame-pointer >>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >>> test >>> FAIL: libgomp.fortran/pr71734-2.f90 -O3 -g execution test >>> >>> on AVX machines and >>> >>> FAIL: libgomp.fortran/simd3.f90 -O3 -fomit-frame-pointer >>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >>> test >>> FAIL: libgomp.fortran/simd3.f90 -O3 -g execution test >>> FAIL: libgomp.fortran/simd4.f90 -O3 -fomit-frame-pointer >>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >>> test >>> FAIL: libgomp.fortran/simd4.f90 -O3 -g execution test >>> >>> on non-AVX machines? >>> >>> -- >>> H.J.
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
Hi Richard, Did you have a chance to look at this patch? Thanks. 2016-07-29 17:00 GMT+03:00 Yuri Rumyantsev : > Hi Richard. > > It turned out that the fix proposed by you does not work for liggomp > tests simd3 and simd4. > The reason is that we can't change safelen value for references not > defined inside loop. So I add missed check on it to patch. > Is it OK for trunk? > ChangeLog: > 2016-07-29 Yuri Rumyantsev > > PR tree-optimization/71734 > * tree-ssa-loop-im.c (ref_defined_in_loop_p): New function. > (ref_indep_loop_p_2): Change SAFELEN value for REF defined inside LOOP. > > 2016-07-29 13:08 GMT+03:00 Yuri Rumyantsev : >> Sorry H.J. >> >> I checked both these tests manually but forgot to pass "-fopenmp" option. >> I will fix the issue asap. >> >> 2016-07-29 0:33 GMT+03:00 H.J. Lu : >>> On Thu, Jul 28, 2016 at 6:49 AM, Yuri Rumyantsev wrote: Richard, I prepare a patch which is based on yours. New test is also included. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? Thanks. ChangeLog: 2016-07-28 Yuri Rumyantsev PR tree-optimization/71734 * tree-ssa-loop-im.c (ref_indep_loop_p_1): Pass value of safelen attribute instead of REF_LOOP and use it. (ref_indep_loop_p_2): Use SAFELEN argument instead of REF_LOOP and set it for Loops having non-zero safelen attribute. (ref_indep_loop_p): Pass zero as initial value for safelen. gcc/testsuite/ChangeLog: * g++.dg/vect/pr70729-nest.cc: New test. >>> >>> Does this cause >>> >>> FAIL: libgomp.fortran/pr71734-1.f90 -O3 -fomit-frame-pointer >>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >>> test >>> FAIL: libgomp.fortran/pr71734-1.f90 -O3 -g execution test >>> FAIL: libgomp.fortran/pr71734-2.f90 -O3 -fomit-frame-pointer >>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >>> test >>> FAIL: libgomp.fortran/pr71734-2.f90 -O3 -g execution test >>> >>> on AVX machines and >>> >>> FAIL: libgomp.fortran/simd3.f90 -O3 -fomit-frame-pointer >>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >>> test >>> FAIL: libgomp.fortran/simd3.f90 -O3 -g execution test >>> FAIL: libgomp.fortran/simd4.f90 -O3 -fomit-frame-pointer >>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >>> test >>> FAIL: libgomp.fortran/simd4.f90 -O3 -g execution test >>> >>> on non-AVX machines? >>> >>> -- >>> H.J.
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
Hi Richard. It turned out that the fix proposed by you does not work for liggomp tests simd3 and simd4. The reason is that we can't change safelen value for references not defined inside loop. So I add missed check on it to patch. Is it OK for trunk? ChangeLog: 2016-07-29 Yuri Rumyantsev PR tree-optimization/71734 * tree-ssa-loop-im.c (ref_defined_in_loop_p): New function. (ref_indep_loop_p_2): Change SAFELEN value for REF defined inside LOOP. 2016-07-29 13:08 GMT+03:00 Yuri Rumyantsev : > Sorry H.J. > > I checked both these tests manually but forgot to pass "-fopenmp" option. > I will fix the issue asap. > > 2016-07-29 0:33 GMT+03:00 H.J. Lu : >> On Thu, Jul 28, 2016 at 6:49 AM, Yuri Rumyantsev wrote: >>> Richard, >>> >>> I prepare a patch which is based on yours. New test is also included. >>> Bootstrapping and regression testing did not show any new failures. >>> Is it OK for trunk? >>> >>> Thanks. >>> ChangeLog: >>> 2016-07-28 Yuri Rumyantsev >>> >>> PR tree-optimization/71734 >>> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Pass value of safelen >>> attribute instead of REF_LOOP and use it. >>> (ref_indep_loop_p_2): Use SAFELEN argument instead of REF_LOOP and >>> set it for Loops having non-zero safelen attribute. >>> (ref_indep_loop_p): Pass zero as initial value for safelen. >>> gcc/testsuite/ChangeLog: >>> * g++.dg/vect/pr70729-nest.cc: New test. >>> >> >> Does this cause >> >> FAIL: libgomp.fortran/pr71734-1.f90 -O3 -fomit-frame-pointer >> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >> test >> FAIL: libgomp.fortran/pr71734-1.f90 -O3 -g execution test >> FAIL: libgomp.fortran/pr71734-2.f90 -O3 -fomit-frame-pointer >> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >> test >> FAIL: libgomp.fortran/pr71734-2.f90 -O3 -g execution test >> >> on AVX machines and >> >> FAIL: libgomp.fortran/simd3.f90 -O3 -fomit-frame-pointer >> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >> test >> FAIL: libgomp.fortran/simd3.f90 -O3 -g execution test >> FAIL: libgomp.fortran/simd4.f90 -O3 -fomit-frame-pointer >> -funroll-loops -fpeel-loops -ftracer -finline-functions execution >> test >> FAIL: libgomp.fortran/simd4.f90 -O3 -g execution test >> >> on non-AVX machines? >> >> -- >> H.J. 71734.patch.3 Description: Binary data
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
On Thu, Jul 28, 2016 at 6:49 AM, Yuri Rumyantsev wrote: > Richard, > > I prepare a patch which is based on yours. New test is also included. > Bootstrapping and regression testing did not show any new failures. > Is it OK for trunk? > > Thanks. > ChangeLog: > 2016-07-28 Yuri Rumyantsev > > PR tree-optimization/71734 > * tree-ssa-loop-im.c (ref_indep_loop_p_1): Pass value of safelen > attribute instead of REF_LOOP and use it. > (ref_indep_loop_p_2): Use SAFELEN argument instead of REF_LOOP and > set it for Loops having non-zero safelen attribute. > (ref_indep_loop_p): Pass zero as initial value for safelen. > gcc/testsuite/ChangeLog: > * g++.dg/vect/pr70729-nest.cc: New test. > Does this cause FAIL: libgomp.fortran/pr71734-1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: libgomp.fortran/pr71734-1.f90 -O3 -g execution test FAIL: libgomp.fortran/pr71734-2.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: libgomp.fortran/pr71734-2.f90 -O3 -g execution test on AVX machines and FAIL: libgomp.fortran/simd3.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: libgomp.fortran/simd3.f90 -O3 -g execution test FAIL: libgomp.fortran/simd4.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: libgomp.fortran/simd4.f90 -O3 -g execution test on non-AVX machines? -- H.J.
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
On Thu, Jul 28, 2016 at 3:49 PM, Yuri Rumyantsev wrote: > Richard, > > I prepare a patch which is based on yours. New test is also included. > Bootstrapping and regression testing did not show any new failures. > Is it OK for trunk? Ok. Thanks, Richard. > Thanks. > ChangeLog: > 2016-07-28 Yuri Rumyantsev > > PR tree-optimization/71734 > * tree-ssa-loop-im.c (ref_indep_loop_p_1): Pass value of safelen > attribute instead of REF_LOOP and use it. > (ref_indep_loop_p_2): Use SAFELEN argument instead of REF_LOOP and > set it for Loops having non-zero safelen attribute. > (ref_indep_loop_p): Pass zero as initial value for safelen. > gcc/testsuite/ChangeLog: > * g++.dg/vect/pr70729-nest.cc: New test. > > 2016-07-27 16:25 GMT+03:00 Richard Biener : >> On Tue, Jul 26, 2016 at 5:49 PM, Yuri Rumyantsev wrote: >>> Hi Richard, >>> >>> It turned out that the patch proposed by you does not work properly >>> for nested loop. If we slightly change pr70729.cc to >>> (non-essential part is omitted >>> void inline Ss::foo (float w) >>> { >>> #pragma omp simd >>> for (int i=0; i>> { >>> float w1 = C2[S_n + i] * w; >>> v1.v_i[i] += (int)w1; >>> C1[S_n + i] += w1; >>> } >>> } >>> void Ss::boo (int n) >>> { >>> for (int i = 0; i>> foo(ww[i]); >>> } >>> loop in foo won't be vectorized since REF_LOOP is outer loop which is >>> not marked with omp simd pragma: >>> t1.cc:73:25: note: not vectorized: not suitable for scatter store *_31 = >>> _33; >>> t1.cc:73:25: note: bad data references. >>> >>> Note that the check I proposed before works fine. >> >> The attached works for me (current trunk doesn't work because of caching >> we do - I suppose we should try again to avoid the quadraticness in other >> ways when ref_indep_loop_p is called from outermost_indep_loop). >> >> Richard. >> >>> 2016-07-20 12:35 GMT+03:00 Yuri Rumyantsev : Richard, Jakub has already fixed it. Sorry for troubles. Yuri. 2016-07-19 18:29 GMT+03:00 Renlin Li : > Hi Yuri, > > I saw this test case runs on arm platforms, and maybe other platforms as > well. > > testsuite/g++.dg/vect/pr70729.cc:7:10: fatal error: xmmintrin.h: No such > file or directory > > Before the change here, it's gated by vect_simd_clones target selector, > which limit it to i?86/x86_64 platform only. > > Regards, > Renlin Li > > > > > On 08/07/16 15:07, Yuri Rumyantsev wrote: >> >> Hi Richard, >> >> Thanks for your help - your patch looks much better. >> Here is new patch in which additional argument was added to determine >> source loop of reference. >> >> Bootstrap and regression testing did not show any new failures. >> >> Is it OK for trunk? >> ChangeLog: >> 2016-07-08 Yuri Rumyantsev >> >> PR tree-optimization/71734 >> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which >> contains REF, use it to check safelen, assume that safelen value >> must be greater 1, fix style. >> (ref_indep_loop_p_2): Add REF_LOOP argument. >> (ref_indep_loop_p): Pass LOOP as additional argument to >> ref_indep_loop_p_2. >> gcc/testsuite/ChangeLog: >> * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix >> style. >> >> 2016-07-08 11:18 GMT+03:00 Richard Biener : >>> >>> On Thu, Jul 7, 2016 at 5:38 PM, Yuri Rumyantsev >>> wrote: I checked simd3.f90 and found out that my additional check reject independence of references REF is independent in loop#3 .istart0.19, .iend0.20 which are defined in loop#1 which is outer for loop#3. Note that these references are defined by _103 = __builtin_GOMP_loop_dynamic_next (&.istart0.19, &.iend0.20); which is in loop#1. It is clear that both these references can not be independent for loop#3. >>> >>> >>> Ok, so we end up calling ref_indep_loop for ref in LOOP also for inner >>> loops >>> of LOOP to catch memory references in those as well. So the issue is >>> really >>> that we look at the wrong loop for safelen and we _do_ want to apply >>> safelen >>> to inner loops as well. >>> >>> So better track the loop we are ultimately asking the question for, like >>> in the >>> attached patch (fixes the testcase for me). >>> >>> Richard. >>> >>> >>> 2016-07-07 17:11 GMT+03:00 Richard Biener : > > On Thu, Jul 7, 2016 at 4:04 PM, Yuri Rumyantsev > wrote: >> >> I Added this check because of new failures in libgomp.fortran suite. >> Here is copy of Jakub message: >> --- Comment #29 from Jakub Jelinek --- >> The #c27 r237844 change looks bogus to me. >> First of all, IMNSHO you can argue this way only if ref is a >> refer
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
Richard, I prepare a patch which is based on yours. New test is also included. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? Thanks. ChangeLog: 2016-07-28 Yuri Rumyantsev PR tree-optimization/71734 * tree-ssa-loop-im.c (ref_indep_loop_p_1): Pass value of safelen attribute instead of REF_LOOP and use it. (ref_indep_loop_p_2): Use SAFELEN argument instead of REF_LOOP and set it for Loops having non-zero safelen attribute. (ref_indep_loop_p): Pass zero as initial value for safelen. gcc/testsuite/ChangeLog: * g++.dg/vect/pr70729-nest.cc: New test. 2016-07-27 16:25 GMT+03:00 Richard Biener : > On Tue, Jul 26, 2016 at 5:49 PM, Yuri Rumyantsev wrote: >> Hi Richard, >> >> It turned out that the patch proposed by you does not work properly >> for nested loop. If we slightly change pr70729.cc to >> (non-essential part is omitted >> void inline Ss::foo (float w) >> { >> #pragma omp simd >> for (int i=0; i> { >> float w1 = C2[S_n + i] * w; >> v1.v_i[i] += (int)w1; >> C1[S_n + i] += w1; >> } >> } >> void Ss::boo (int n) >> { >> for (int i = 0; i> foo(ww[i]); >> } >> loop in foo won't be vectorized since REF_LOOP is outer loop which is >> not marked with omp simd pragma: >> t1.cc:73:25: note: not vectorized: not suitable for scatter store *_31 = >> _33; >> t1.cc:73:25: note: bad data references. >> >> Note that the check I proposed before works fine. > > The attached works for me (current trunk doesn't work because of caching > we do - I suppose we should try again to avoid the quadraticness in other > ways when ref_indep_loop_p is called from outermost_indep_loop). > > Richard. > >> 2016-07-20 12:35 GMT+03:00 Yuri Rumyantsev : >>> Richard, >>> >>> Jakub has already fixed it. >>> Sorry for troubles. >>> Yuri. >>> >>> 2016-07-19 18:29 GMT+03:00 Renlin Li : Hi Yuri, I saw this test case runs on arm platforms, and maybe other platforms as well. testsuite/g++.dg/vect/pr70729.cc:7:10: fatal error: xmmintrin.h: No such file or directory Before the change here, it's gated by vect_simd_clones target selector, which limit it to i?86/x86_64 platform only. Regards, Renlin Li On 08/07/16 15:07, Yuri Rumyantsev wrote: > > Hi Richard, > > Thanks for your help - your patch looks much better. > Here is new patch in which additional argument was added to determine > source loop of reference. > > Bootstrap and regression testing did not show any new failures. > > Is it OK for trunk? > ChangeLog: > 2016-07-08 Yuri Rumyantsev > > PR tree-optimization/71734 > * tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which > contains REF, use it to check safelen, assume that safelen value > must be greater 1, fix style. > (ref_indep_loop_p_2): Add REF_LOOP argument. > (ref_indep_loop_p): Pass LOOP as additional argument to > ref_indep_loop_p_2. > gcc/testsuite/ChangeLog: > * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style. > > 2016-07-08 11:18 GMT+03:00 Richard Biener : >> >> On Thu, Jul 7, 2016 at 5:38 PM, Yuri Rumyantsev >> wrote: >>> >>> I checked simd3.f90 and found out that my additional check reject >>> independence of references >>> >>> REF is independent in loop#3 >>> .istart0.19, .iend0.20 >>> which are defined in loop#1 which is outer for loop#3. >>> Note that these references are defined by >>> _103 = __builtin_GOMP_loop_dynamic_next (&.istart0.19, &.iend0.20); >>> which is in loop#1. >>> It is clear that both these references can not be independent for >>> loop#3. >> >> >> Ok, so we end up calling ref_indep_loop for ref in LOOP also for inner >> loops >> of LOOP to catch memory references in those as well. So the issue is >> really >> that we look at the wrong loop for safelen and we _do_ want to apply >> safelen >> to inner loops as well. >> >> So better track the loop we are ultimately asking the question for, like >> in the >> attached patch (fixes the testcase for me). >> >> Richard. >> >> >> >>> 2016-07-07 17:11 GMT+03:00 Richard Biener : On Thu, Jul 7, 2016 at 4:04 PM, Yuri Rumyantsev wrote: > > I Added this check because of new failures in libgomp.fortran suite. > Here is copy of Jakub message: > --- Comment #29 from Jakub Jelinek --- > The #c27 r237844 change looks bogus to me. > First of all, IMNSHO you can argue this way only if ref is a reference > seen in > loop LOOP, or inner loops of LOOP I guess. I _think_ we never call ref_indep_loop_p_1 with a REF whose loop is not a sub-loop of LOOP or LOOP itself (as it would not make
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
On Tue, Jul 26, 2016 at 5:49 PM, Yuri Rumyantsev wrote: > Hi Richard, > > It turned out that the patch proposed by you does not work properly > for nested loop. If we slightly change pr70729.cc to > (non-essential part is omitted > void inline Ss::foo (float w) > { > #pragma omp simd > for (int i=0; i { > float w1 = C2[S_n + i] * w; > v1.v_i[i] += (int)w1; > C1[S_n + i] += w1; > } > } > void Ss::boo (int n) > { > for (int i = 0; i foo(ww[i]); > } > loop in foo won't be vectorized since REF_LOOP is outer loop which is > not marked with omp simd pragma: > t1.cc:73:25: note: not vectorized: not suitable for scatter store *_31 = _33; > t1.cc:73:25: note: bad data references. > > Note that the check I proposed before works fine. The attached works for me (current trunk doesn't work because of caching we do - I suppose we should try again to avoid the quadraticness in other ways when ref_indep_loop_p is called from outermost_indep_loop). Richard. > 2016-07-20 12:35 GMT+03:00 Yuri Rumyantsev : >> Richard, >> >> Jakub has already fixed it. >> Sorry for troubles. >> Yuri. >> >> 2016-07-19 18:29 GMT+03:00 Renlin Li : >>> Hi Yuri, >>> >>> I saw this test case runs on arm platforms, and maybe other platforms as >>> well. >>> >>> testsuite/g++.dg/vect/pr70729.cc:7:10: fatal error: xmmintrin.h: No such >>> file or directory >>> >>> Before the change here, it's gated by vect_simd_clones target selector, >>> which limit it to i?86/x86_64 platform only. >>> >>> Regards, >>> Renlin Li >>> >>> >>> >>> >>> On 08/07/16 15:07, Yuri Rumyantsev wrote: Hi Richard, Thanks for your help - your patch looks much better. Here is new patch in which additional argument was added to determine source loop of reference. Bootstrap and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2016-07-08 Yuri Rumyantsev PR tree-optimization/71734 * tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which contains REF, use it to check safelen, assume that safelen value must be greater 1, fix style. (ref_indep_loop_p_2): Add REF_LOOP argument. (ref_indep_loop_p): Pass LOOP as additional argument to ref_indep_loop_p_2. gcc/testsuite/ChangeLog: * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style. 2016-07-08 11:18 GMT+03:00 Richard Biener : > > On Thu, Jul 7, 2016 at 5:38 PM, Yuri Rumyantsev > wrote: >> >> I checked simd3.f90 and found out that my additional check reject >> independence of references >> >> REF is independent in loop#3 >> .istart0.19, .iend0.20 >> which are defined in loop#1 which is outer for loop#3. >> Note that these references are defined by >> _103 = __builtin_GOMP_loop_dynamic_next (&.istart0.19, &.iend0.20); >> which is in loop#1. >> It is clear that both these references can not be independent for >> loop#3. > > > Ok, so we end up calling ref_indep_loop for ref in LOOP also for inner > loops > of LOOP to catch memory references in those as well. So the issue is > really > that we look at the wrong loop for safelen and we _do_ want to apply > safelen > to inner loops as well. > > So better track the loop we are ultimately asking the question for, like > in the > attached patch (fixes the testcase for me). > > Richard. > > > >> 2016-07-07 17:11 GMT+03:00 Richard Biener : >>> >>> On Thu, Jul 7, 2016 at 4:04 PM, Yuri Rumyantsev >>> wrote: I Added this check because of new failures in libgomp.fortran suite. Here is copy of Jakub message: --- Comment #29 from Jakub Jelinek --- The #c27 r237844 change looks bogus to me. First of all, IMNSHO you can argue this way only if ref is a reference seen in loop LOOP, >>> >>> >>> or inner loops of LOOP I guess. I _think_ we never call >>> ref_indep_loop_p_1 with >>> a REF whose loop is not a sub-loop of LOOP or LOOP itself (as it would >>> not make >>> sense to do that, it would be a waste of time). >>> >>> So only if "or inner loops of LOOP" is not correct the check would be >>> needed >>> but then my issue with unrolling an inner loop and turning a ref that >>> safelen >>> does not apply to into a ref that it now applies to arises. >>> >>> I don't fully get what Jakub is hinting at. >>> >>> Can you install the safelen > 0 -> safelen > 1 fix please? Jakub, can >>> you >>> explain that bitmap check with a simple testcase? >>> >>> Thanks, >>> Richard. >>> which is the case of e.g. *.omp_data_i_23(D).a ref in simd3.f90 -O2 -fopenmp -msse2, but not the D.3815[0] case tested during can_sm_ref_p - the D.3815[0] = 0; as well as somethi
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
Hi Richard, It turned out that the patch proposed by you does not work properly for nested loop. If we slightly change pr70729.cc to (non-essential part is omitted void inline Ss::foo (float w) { #pragma omp simd for (int i=0; i: > Richard, > > Jakub has already fixed it. > Sorry for troubles. > Yuri. > > 2016-07-19 18:29 GMT+03:00 Renlin Li : >> Hi Yuri, >> >> I saw this test case runs on arm platforms, and maybe other platforms as >> well. >> >> testsuite/g++.dg/vect/pr70729.cc:7:10: fatal error: xmmintrin.h: No such >> file or directory >> >> Before the change here, it's gated by vect_simd_clones target selector, >> which limit it to i?86/x86_64 platform only. >> >> Regards, >> Renlin Li >> >> >> >> >> On 08/07/16 15:07, Yuri Rumyantsev wrote: >>> >>> Hi Richard, >>> >>> Thanks for your help - your patch looks much better. >>> Here is new patch in which additional argument was added to determine >>> source loop of reference. >>> >>> Bootstrap and regression testing did not show any new failures. >>> >>> Is it OK for trunk? >>> ChangeLog: >>> 2016-07-08 Yuri Rumyantsev >>> >>> PR tree-optimization/71734 >>> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which >>> contains REF, use it to check safelen, assume that safelen value >>> must be greater 1, fix style. >>> (ref_indep_loop_p_2): Add REF_LOOP argument. >>> (ref_indep_loop_p): Pass LOOP as additional argument to >>> ref_indep_loop_p_2. >>> gcc/testsuite/ChangeLog: >>> * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style. >>> >>> 2016-07-08 11:18 GMT+03:00 Richard Biener : On Thu, Jul 7, 2016 at 5:38 PM, Yuri Rumyantsev wrote: > > I checked simd3.f90 and found out that my additional check reject > independence of references > > REF is independent in loop#3 > .istart0.19, .iend0.20 > which are defined in loop#1 which is outer for loop#3. > Note that these references are defined by > _103 = __builtin_GOMP_loop_dynamic_next (&.istart0.19, &.iend0.20); > which is in loop#1. > It is clear that both these references can not be independent for > loop#3. Ok, so we end up calling ref_indep_loop for ref in LOOP also for inner loops of LOOP to catch memory references in those as well. So the issue is really that we look at the wrong loop for safelen and we _do_ want to apply safelen to inner loops as well. So better track the loop we are ultimately asking the question for, like in the attached patch (fixes the testcase for me). Richard. > 2016-07-07 17:11 GMT+03:00 Richard Biener : >> >> On Thu, Jul 7, 2016 at 4:04 PM, Yuri Rumyantsev >> wrote: >>> >>> I Added this check because of new failures in libgomp.fortran suite. >>> Here is copy of Jakub message: >>> --- Comment #29 from Jakub Jelinek --- >>> The #c27 r237844 change looks bogus to me. >>> First of all, IMNSHO you can argue this way only if ref is a reference >>> seen in >>> loop LOOP, >> >> >> or inner loops of LOOP I guess. I _think_ we never call >> ref_indep_loop_p_1 with >> a REF whose loop is not a sub-loop of LOOP or LOOP itself (as it would >> not make >> sense to do that, it would be a waste of time). >> >> So only if "or inner loops of LOOP" is not correct the check would be >> needed >> but then my issue with unrolling an inner loop and turning a ref that >> safelen >> does not apply to into a ref that it now applies to arises. >> >> I don't fully get what Jakub is hinting at. >> >> Can you install the safelen > 0 -> safelen > 1 fix please? Jakub, can >> you >> explain that bitmap check with a simple testcase? >> >> Thanks, >> Richard. >> >>> which is the case of e.g. *.omp_data_i_23(D).a ref in simd3.f90 -O2 >>> -fopenmp -msse2, but not the D.3815[0] case tested during can_sm_ref_p >>> - the >>> D.3815[0] = 0; as well as something = D.3815[0]; stmt found in the >>> outer loop >>> obviously can be dependent on many of the loads and/or stores in the >>> loop, be >>> it "omp simd array" or not. >>> Say for >>> void >>> foo (int *p, int *q) >>> { >>>#pragma omp simd >>>for (int i = 0; i < 1024; i++) >>> p[i] += q[0]; >>> } >>> sure, q[0] can't alias p[0] ... p[1022], the earlier iterations could >>> write >>> something that changes its value, and then it would behave differently >>> from >>> using VF = 1024, where everything is performed in parallel. >>> Though, actually, it can alias, just it would have to write the same >>> value as >>> was there. So, if this is used to determine if it is safe to hoist >>> the load >>> before the loop, it is fine, if it is used to determine if &q[0] >= >>> &p[0] && >>> &q[0] <= &p[1023], then it is n
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
Richard, Jakub has already fixed it. Sorry for troubles. Yuri. 2016-07-19 18:29 GMT+03:00 Renlin Li : > Hi Yuri, > > I saw this test case runs on arm platforms, and maybe other platforms as > well. > > testsuite/g++.dg/vect/pr70729.cc:7:10: fatal error: xmmintrin.h: No such > file or directory > > Before the change here, it's gated by vect_simd_clones target selector, > which limit it to i?86/x86_64 platform only. > > Regards, > Renlin Li > > > > > On 08/07/16 15:07, Yuri Rumyantsev wrote: >> >> Hi Richard, >> >> Thanks for your help - your patch looks much better. >> Here is new patch in which additional argument was added to determine >> source loop of reference. >> >> Bootstrap and regression testing did not show any new failures. >> >> Is it OK for trunk? >> ChangeLog: >> 2016-07-08 Yuri Rumyantsev >> >> PR tree-optimization/71734 >> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which >> contains REF, use it to check safelen, assume that safelen value >> must be greater 1, fix style. >> (ref_indep_loop_p_2): Add REF_LOOP argument. >> (ref_indep_loop_p): Pass LOOP as additional argument to >> ref_indep_loop_p_2. >> gcc/testsuite/ChangeLog: >> * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style. >> >> 2016-07-08 11:18 GMT+03:00 Richard Biener : >>> >>> On Thu, Jul 7, 2016 at 5:38 PM, Yuri Rumyantsev >>> wrote: I checked simd3.f90 and found out that my additional check reject independence of references REF is independent in loop#3 .istart0.19, .iend0.20 which are defined in loop#1 which is outer for loop#3. Note that these references are defined by _103 = __builtin_GOMP_loop_dynamic_next (&.istart0.19, &.iend0.20); which is in loop#1. It is clear that both these references can not be independent for loop#3. >>> >>> >>> Ok, so we end up calling ref_indep_loop for ref in LOOP also for inner >>> loops >>> of LOOP to catch memory references in those as well. So the issue is >>> really >>> that we look at the wrong loop for safelen and we _do_ want to apply >>> safelen >>> to inner loops as well. >>> >>> So better track the loop we are ultimately asking the question for, like >>> in the >>> attached patch (fixes the testcase for me). >>> >>> Richard. >>> >>> >>> 2016-07-07 17:11 GMT+03:00 Richard Biener : > > On Thu, Jul 7, 2016 at 4:04 PM, Yuri Rumyantsev > wrote: >> >> I Added this check because of new failures in libgomp.fortran suite. >> Here is copy of Jakub message: >> --- Comment #29 from Jakub Jelinek --- >> The #c27 r237844 change looks bogus to me. >> First of all, IMNSHO you can argue this way only if ref is a reference >> seen in >> loop LOOP, > > > or inner loops of LOOP I guess. I _think_ we never call > ref_indep_loop_p_1 with > a REF whose loop is not a sub-loop of LOOP or LOOP itself (as it would > not make > sense to do that, it would be a waste of time). > > So only if "or inner loops of LOOP" is not correct the check would be > needed > but then my issue with unrolling an inner loop and turning a ref that > safelen > does not apply to into a ref that it now applies to arises. > > I don't fully get what Jakub is hinting at. > > Can you install the safelen > 0 -> safelen > 1 fix please? Jakub, can > you > explain that bitmap check with a simple testcase? > > Thanks, > Richard. > >> which is the case of e.g. *.omp_data_i_23(D).a ref in simd3.f90 -O2 >> -fopenmp -msse2, but not the D.3815[0] case tested during can_sm_ref_p >> - the >> D.3815[0] = 0; as well as something = D.3815[0]; stmt found in the >> outer loop >> obviously can be dependent on many of the loads and/or stores in the >> loop, be >> it "omp simd array" or not. >> Say for >> void >> foo (int *p, int *q) >> { >>#pragma omp simd >>for (int i = 0; i < 1024; i++) >> p[i] += q[0]; >> } >> sure, q[0] can't alias p[0] ... p[1022], the earlier iterations could >> write >> something that changes its value, and then it would behave differently >> from >> using VF = 1024, where everything is performed in parallel. >> Though, actually, it can alias, just it would have to write the same >> value as >> was there. So, if this is used to determine if it is safe to hoist >> the load >> before the loop, it is fine, if it is used to determine if &q[0] >= >> &p[0] && >> &q[0] <= &p[1023], then it is not fine. >> >> For aliasing of q[0] and p[1023], I don't see why they couldn't alias >> in a >> valid program. #pragma omp simd I think guarantees that the last >> iteration is >> executed last, it isn't necessarily executed last alone, it could be, >> or >> together with one before last iteration, or (for simdlen INT_MAX) even >> all >> it
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
Hi Yuri, I saw this test case runs on arm platforms, and maybe other platforms as well. testsuite/g++.dg/vect/pr70729.cc:7:10: fatal error: xmmintrin.h: No such file or directory Before the change here, it's gated by vect_simd_clones target selector, which limit it to i?86/x86_64 platform only. Regards, Renlin Li On 08/07/16 15:07, Yuri Rumyantsev wrote: Hi Richard, Thanks for your help - your patch looks much better. Here is new patch in which additional argument was added to determine source loop of reference. Bootstrap and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2016-07-08 Yuri Rumyantsev PR tree-optimization/71734 * tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which contains REF, use it to check safelen, assume that safelen value must be greater 1, fix style. (ref_indep_loop_p_2): Add REF_LOOP argument. (ref_indep_loop_p): Pass LOOP as additional argument to ref_indep_loop_p_2. gcc/testsuite/ChangeLog: * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style. 2016-07-08 11:18 GMT+03:00 Richard Biener : On Thu, Jul 7, 2016 at 5:38 PM, Yuri Rumyantsev wrote: I checked simd3.f90 and found out that my additional check reject independence of references REF is independent in loop#3 .istart0.19, .iend0.20 which are defined in loop#1 which is outer for loop#3. Note that these references are defined by _103 = __builtin_GOMP_loop_dynamic_next (&.istart0.19, &.iend0.20); which is in loop#1. It is clear that both these references can not be independent for loop#3. Ok, so we end up calling ref_indep_loop for ref in LOOP also for inner loops of LOOP to catch memory references in those as well. So the issue is really that we look at the wrong loop for safelen and we _do_ want to apply safelen to inner loops as well. So better track the loop we are ultimately asking the question for, like in the attached patch (fixes the testcase for me). Richard. 2016-07-07 17:11 GMT+03:00 Richard Biener : On Thu, Jul 7, 2016 at 4:04 PM, Yuri Rumyantsev wrote: I Added this check because of new failures in libgomp.fortran suite. Here is copy of Jakub message: --- Comment #29 from Jakub Jelinek --- The #c27 r237844 change looks bogus to me. First of all, IMNSHO you can argue this way only if ref is a reference seen in loop LOOP, or inner loops of LOOP I guess. I _think_ we never call ref_indep_loop_p_1 with a REF whose loop is not a sub-loop of LOOP or LOOP itself (as it would not make sense to do that, it would be a waste of time). So only if "or inner loops of LOOP" is not correct the check would be needed but then my issue with unrolling an inner loop and turning a ref that safelen does not apply to into a ref that it now applies to arises. I don't fully get what Jakub is hinting at. Can you install the safelen > 0 -> safelen > 1 fix please? Jakub, can you explain that bitmap check with a simple testcase? Thanks, Richard. which is the case of e.g. *.omp_data_i_23(D).a ref in simd3.f90 -O2 -fopenmp -msse2, but not the D.3815[0] case tested during can_sm_ref_p - the D.3815[0] = 0; as well as something = D.3815[0]; stmt found in the outer loop obviously can be dependent on many of the loads and/or stores in the loop, be it "omp simd array" or not. Say for void foo (int *p, int *q) { #pragma omp simd for (int i = 0; i < 1024; i++) p[i] += q[0]; } sure, q[0] can't alias p[0] ... p[1022], the earlier iterations could write something that changes its value, and then it would behave differently from using VF = 1024, where everything is performed in parallel. Though, actually, it can alias, just it would have to write the same value as was there. So, if this is used to determine if it is safe to hoist the load before the loop, it is fine, if it is used to determine if &q[0] >= &p[0] && &q[0] <= &p[1023], then it is not fine. For aliasing of q[0] and p[1023], I don't see why they couldn't alias in a valid program. #pragma omp simd I think guarantees that the last iteration is executed last, it isn't necessarily executed last alone, it could be, or together with one before last iteration, or (for simdlen INT_MAX) even all iterations can be done concurrently, in hw or sw, so it is fine if it is transformed into: int temp[1024], temp2[1024], temp3[1024]; for (int i = 0; i < 1024; i++) temp[i] = p[i]; for (int i = 0; i < 1024; i++) temp2[i] = q[0]; /* The above two loops can be also swapped, or intermixed. */ for (int i = 0; i < 1024; i++) temp3[i] = temp[i] + temp2[i]; for (int i = 0; i < 1024; i++) p[i] = temp3[i]; /* Or the above loop reversed etc. */ If you have: int bar (int *p, int *q) { q[0] = 0; #pragma omp simd for (int i = 0; i < 1024; i++) p[i]++; return q[0]; } i.e. something similar to what misbehaves in simd3.f90 with the change, then the answer is that q[0] isn't guaranteed to be independent of any references in the simd loop. 2016-07
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
On Fri, Jul 8, 2016 at 4:07 PM, Yuri Rumyantsev wrote: > Hi Richard, > > Thanks for your help - your patch looks much better. > Here is new patch in which additional argument was added to determine > source loop of reference. > > Bootstrap and regression testing did not show any new failures. > > Is it OK for trunk? Yes. Thanks, Richard. > ChangeLog: > 2016-07-08 Yuri Rumyantsev > > PR tree-optimization/71734 > * tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which > contains REF, use it to check safelen, assume that safelen value > must be greater 1, fix style. > (ref_indep_loop_p_2): Add REF_LOOP argument. > (ref_indep_loop_p): Pass LOOP as additional argument to > ref_indep_loop_p_2. > gcc/testsuite/ChangeLog: > * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style. > > 2016-07-08 11:18 GMT+03:00 Richard Biener : >> On Thu, Jul 7, 2016 at 5:38 PM, Yuri Rumyantsev wrote: >>> I checked simd3.f90 and found out that my additional check reject >>> independence of references >>> >>> REF is independent in loop#3 >>> .istart0.19, .iend0.20 >>> which are defined in loop#1 which is outer for loop#3. >>> Note that these references are defined by >>> _103 = __builtin_GOMP_loop_dynamic_next (&.istart0.19, &.iend0.20); >>> which is in loop#1. >>> It is clear that both these references can not be independent for loop#3. >> >> Ok, so we end up calling ref_indep_loop for ref in LOOP also for inner loops >> of LOOP to catch memory references in those as well. So the issue is really >> that we look at the wrong loop for safelen and we _do_ want to apply safelen >> to inner loops as well. >> >> So better track the loop we are ultimately asking the question for, like in >> the >> attached patch (fixes the testcase for me). >> >> Richard. >> >> >> >>> 2016-07-07 17:11 GMT+03:00 Richard Biener : On Thu, Jul 7, 2016 at 4:04 PM, Yuri Rumyantsev wrote: > I Added this check because of new failures in libgomp.fortran suite. > Here is copy of Jakub message: > --- Comment #29 from Jakub Jelinek --- > The #c27 r237844 change looks bogus to me. > First of all, IMNSHO you can argue this way only if ref is a reference > seen in > loop LOOP, or inner loops of LOOP I guess. I _think_ we never call ref_indep_loop_p_1 with a REF whose loop is not a sub-loop of LOOP or LOOP itself (as it would not make sense to do that, it would be a waste of time). So only if "or inner loops of LOOP" is not correct the check would be needed but then my issue with unrolling an inner loop and turning a ref that safelen does not apply to into a ref that it now applies to arises. I don't fully get what Jakub is hinting at. Can you install the safelen > 0 -> safelen > 1 fix please? Jakub, can you explain that bitmap check with a simple testcase? Thanks, Richard. > which is the case of e.g. *.omp_data_i_23(D).a ref in simd3.f90 -O2 > -fopenmp -msse2, but not the D.3815[0] case tested during can_sm_ref_p - > the > D.3815[0] = 0; as well as something = D.3815[0]; stmt found in the outer > loop > obviously can be dependent on many of the loads and/or stores in the > loop, be > it "omp simd array" or not. > Say for > void > foo (int *p, int *q) > { > #pragma omp simd > for (int i = 0; i < 1024; i++) > p[i] += q[0]; > } > sure, q[0] can't alias p[0] ... p[1022], the earlier iterations could > write > something that changes its value, and then it would behave differently > from > using VF = 1024, where everything is performed in parallel. > Though, actually, it can alias, just it would have to write the same > value as > was there. So, if this is used to determine if it is safe to hoist the > load > before the loop, it is fine, if it is used to determine if &q[0] >= &p[0] > && > &q[0] <= &p[1023], then it is not fine. > > For aliasing of q[0] and p[1023], I don't see why they couldn't alias in a > valid program. #pragma omp simd I think guarantees that the last > iteration is > executed last, it isn't necessarily executed last alone, it could be, or > together with one before last iteration, or (for simdlen INT_MAX) even all > iterations can be done concurrently, in hw or sw, so it is fine if it is > transformed into: > int temp[1024], temp2[1024], temp3[1024]; > for (int i = 0; i < 1024; i++) > temp[i] = p[i]; > for (int i = 0; i < 1024; i++) > temp2[i] = q[0]; > /* The above two loops can be also swapped, or intermixed. */ > for (int i = 0; i < 1024; i++) > temp3[i] = temp[i] + temp2[i]; > for (int i = 0; i < 1024; i++) > p[i] = temp3[i]; > /* Or the above loop reversed etc. */ > > If you have: > int > bar (int *p, int *q) >
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
Richard! Did you have a chance to look at this patch? Thanks. Yuri. 2016-07-08 17:07 GMT+03:00 Yuri Rumyantsev : > Hi Richard, > > Thanks for your help - your patch looks much better. > Here is new patch in which additional argument was added to determine > source loop of reference. > > Bootstrap and regression testing did not show any new failures. > > Is it OK for trunk? > ChangeLog: > 2016-07-08 Yuri Rumyantsev > > PR tree-optimization/71734 > * tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which > contains REF, use it to check safelen, assume that safelen value > must be greater 1, fix style. > (ref_indep_loop_p_2): Add REF_LOOP argument. > (ref_indep_loop_p): Pass LOOP as additional argument to > ref_indep_loop_p_2. > gcc/testsuite/ChangeLog: > * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style. > > 2016-07-08 11:18 GMT+03:00 Richard Biener : >> On Thu, Jul 7, 2016 at 5:38 PM, Yuri Rumyantsev wrote: >>> I checked simd3.f90 and found out that my additional check reject >>> independence of references >>> >>> REF is independent in loop#3 >>> .istart0.19, .iend0.20 >>> which are defined in loop#1 which is outer for loop#3. >>> Note that these references are defined by >>> _103 = __builtin_GOMP_loop_dynamic_next (&.istart0.19, &.iend0.20); >>> which is in loop#1. >>> It is clear that both these references can not be independent for loop#3. >> >> Ok, so we end up calling ref_indep_loop for ref in LOOP also for inner loops >> of LOOP to catch memory references in those as well. So the issue is really >> that we look at the wrong loop for safelen and we _do_ want to apply safelen >> to inner loops as well. >> >> So better track the loop we are ultimately asking the question for, like in >> the >> attached patch (fixes the testcase for me). >> >> Richard. >> >> >> >>> 2016-07-07 17:11 GMT+03:00 Richard Biener : On Thu, Jul 7, 2016 at 4:04 PM, Yuri Rumyantsev wrote: > I Added this check because of new failures in libgomp.fortran suite. > Here is copy of Jakub message: > --- Comment #29 from Jakub Jelinek --- > The #c27 r237844 change looks bogus to me. > First of all, IMNSHO you can argue this way only if ref is a reference > seen in > loop LOOP, or inner loops of LOOP I guess. I _think_ we never call ref_indep_loop_p_1 with a REF whose loop is not a sub-loop of LOOP or LOOP itself (as it would not make sense to do that, it would be a waste of time). So only if "or inner loops of LOOP" is not correct the check would be needed but then my issue with unrolling an inner loop and turning a ref that safelen does not apply to into a ref that it now applies to arises. I don't fully get what Jakub is hinting at. Can you install the safelen > 0 -> safelen > 1 fix please? Jakub, can you explain that bitmap check with a simple testcase? Thanks, Richard. > which is the case of e.g. *.omp_data_i_23(D).a ref in simd3.f90 -O2 > -fopenmp -msse2, but not the D.3815[0] case tested during can_sm_ref_p - > the > D.3815[0] = 0; as well as something = D.3815[0]; stmt found in the outer > loop > obviously can be dependent on many of the loads and/or stores in the > loop, be > it "omp simd array" or not. > Say for > void > foo (int *p, int *q) > { > #pragma omp simd > for (int i = 0; i < 1024; i++) > p[i] += q[0]; > } > sure, q[0] can't alias p[0] ... p[1022], the earlier iterations could > write > something that changes its value, and then it would behave differently > from > using VF = 1024, where everything is performed in parallel. > Though, actually, it can alias, just it would have to write the same > value as > was there. So, if this is used to determine if it is safe to hoist the > load > before the loop, it is fine, if it is used to determine if &q[0] >= &p[0] > && > &q[0] <= &p[1023], then it is not fine. > > For aliasing of q[0] and p[1023], I don't see why they couldn't alias in a > valid program. #pragma omp simd I think guarantees that the last > iteration is > executed last, it isn't necessarily executed last alone, it could be, or > together with one before last iteration, or (for simdlen INT_MAX) even all > iterations can be done concurrently, in hw or sw, so it is fine if it is > transformed into: > int temp[1024], temp2[1024], temp3[1024]; > for (int i = 0; i < 1024; i++) > temp[i] = p[i]; > for (int i = 0; i < 1024; i++) > temp2[i] = q[0]; > /* The above two loops can be also swapped, or intermixed. */ > for (int i = 0; i < 1024; i++) > temp3[i] = temp[i] + temp2[i]; > for (int i = 0; i < 1024; i++) > p[i] = temp3[i]; > /* Or the above loop reversed etc. */ > > If you have: > in
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
Hi Richard, Thanks for your help - your patch looks much better. Here is new patch in which additional argument was added to determine source loop of reference. Bootstrap and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2016-07-08 Yuri Rumyantsev PR tree-optimization/71734 * tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which contains REF, use it to check safelen, assume that safelen value must be greater 1, fix style. (ref_indep_loop_p_2): Add REF_LOOP argument. (ref_indep_loop_p): Pass LOOP as additional argument to ref_indep_loop_p_2. gcc/testsuite/ChangeLog: * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style. 2016-07-08 11:18 GMT+03:00 Richard Biener : > On Thu, Jul 7, 2016 at 5:38 PM, Yuri Rumyantsev wrote: >> I checked simd3.f90 and found out that my additional check reject >> independence of references >> >> REF is independent in loop#3 >> .istart0.19, .iend0.20 >> which are defined in loop#1 which is outer for loop#3. >> Note that these references are defined by >> _103 = __builtin_GOMP_loop_dynamic_next (&.istart0.19, &.iend0.20); >> which is in loop#1. >> It is clear that both these references can not be independent for loop#3. > > Ok, so we end up calling ref_indep_loop for ref in LOOP also for inner loops > of LOOP to catch memory references in those as well. So the issue is really > that we look at the wrong loop for safelen and we _do_ want to apply safelen > to inner loops as well. > > So better track the loop we are ultimately asking the question for, like in > the > attached patch (fixes the testcase for me). > > Richard. > > > >> 2016-07-07 17:11 GMT+03:00 Richard Biener : >>> On Thu, Jul 7, 2016 at 4:04 PM, Yuri Rumyantsev wrote: I Added this check because of new failures in libgomp.fortran suite. Here is copy of Jakub message: --- Comment #29 from Jakub Jelinek --- The #c27 r237844 change looks bogus to me. First of all, IMNSHO you can argue this way only if ref is a reference seen in loop LOOP, >>> >>> or inner loops of LOOP I guess. I _think_ we never call ref_indep_loop_p_1 >>> with >>> a REF whose loop is not a sub-loop of LOOP or LOOP itself (as it would not >>> make >>> sense to do that, it would be a waste of time). >>> >>> So only if "or inner loops of LOOP" is not correct the check would be needed >>> but then my issue with unrolling an inner loop and turning a ref that >>> safelen >>> does not apply to into a ref that it now applies to arises. >>> >>> I don't fully get what Jakub is hinting at. >>> >>> Can you install the safelen > 0 -> safelen > 1 fix please? Jakub, can you >>> explain that bitmap check with a simple testcase? >>> >>> Thanks, >>> Richard. >>> which is the case of e.g. *.omp_data_i_23(D).a ref in simd3.f90 -O2 -fopenmp -msse2, but not the D.3815[0] case tested during can_sm_ref_p - the D.3815[0] = 0; as well as something = D.3815[0]; stmt found in the outer loop obviously can be dependent on many of the loads and/or stores in the loop, be it "omp simd array" or not. Say for void foo (int *p, int *q) { #pragma omp simd for (int i = 0; i < 1024; i++) p[i] += q[0]; } sure, q[0] can't alias p[0] ... p[1022], the earlier iterations could write something that changes its value, and then it would behave differently from using VF = 1024, where everything is performed in parallel. Though, actually, it can alias, just it would have to write the same value as was there. So, if this is used to determine if it is safe to hoist the load before the loop, it is fine, if it is used to determine if &q[0] >= &p[0] && &q[0] <= &p[1023], then it is not fine. For aliasing of q[0] and p[1023], I don't see why they couldn't alias in a valid program. #pragma omp simd I think guarantees that the last iteration is executed last, it isn't necessarily executed last alone, it could be, or together with one before last iteration, or (for simdlen INT_MAX) even all iterations can be done concurrently, in hw or sw, so it is fine if it is transformed into: int temp[1024], temp2[1024], temp3[1024]; for (int i = 0; i < 1024; i++) temp[i] = p[i]; for (int i = 0; i < 1024; i++) temp2[i] = q[0]; /* The above two loops can be also swapped, or intermixed. */ for (int i = 0; i < 1024; i++) temp3[i] = temp[i] + temp2[i]; for (int i = 0; i < 1024; i++) p[i] = temp3[i]; /* Or the above loop reversed etc. */ If you have: int bar (int *p, int *q) { q[0] = 0; #pragma omp simd for (int i = 0; i < 1024; i++) p[i]++; return q[0]; } i.e. something similar to what misbehaves in simd3.f90 with the change, then the answer is that q[0] isn'
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
On Wed, Jul 6, 2016 at 11:50 AM, Yuri Rumyantsev wrote: > Richard, > > I pointed out in the commentary that REF is defined inside loop and > this check is related to this statement. Should I clarify it? > > + /* We consider REF defined in LOOP as independent if at least 2 loop > + iterations are not dependent. */ Yes, I fail to see why x[0] should not be disambiguated against y[i] in #pragma GCC loop ivdep for (i...) { y[i] = ...; for (j...) ... = x[0]; } REF is always inside the loop nest with LOOP being the outermost loop. Richard. > > 2016-07-06 12:38 GMT+03:00 Richard Biener : >> On Tue, Jul 5, 2016 at 4:56 PM, Yuri Rumyantsev wrote: >>> Hi All, >>> >>> Here is a simple fix to cure regressions introduced by my fix for >>> 70729. Patch also contains minor changes in test found by Jakub. >>> >>> Bootstrapping and regression testing did not show any new failures. >>> >>> Is it OK for trunk? >> >> + && bitmap_bit_p (&memory_accesses.refs_in_loop[loop->num], ref->id)) >> >> So safelen does not apply to refs in nested loops? The added comment only >> explains the safelen check change but this also requires explanation. >> >> Richard. >> >>> ChangeLog: >>> 2016-07-05 Yuri Rumyantsev >>> >>> PR tree-optimization/71734 >>> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Consider REF defined in >>> LOOP as independent if at least two loop iterations are not dependent. >>> gcc/testsuite/ChangeLog: >>> * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style.
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
Richard, I pointed out in the commentary that REF is defined inside loop and this check is related to this statement. Should I clarify it? + /* We consider REF defined in LOOP as independent if at least 2 loop + iterations are not dependent. */ 2016-07-06 12:38 GMT+03:00 Richard Biener : > On Tue, Jul 5, 2016 at 4:56 PM, Yuri Rumyantsev wrote: >> Hi All, >> >> Here is a simple fix to cure regressions introduced by my fix for >> 70729. Patch also contains minor changes in test found by Jakub. >> >> Bootstrapping and regression testing did not show any new failures. >> >> Is it OK for trunk? > > + && bitmap_bit_p (&memory_accesses.refs_in_loop[loop->num], ref->id)) > > So safelen does not apply to refs in nested loops? The added comment only > explains the safelen check change but this also requires explanation. > > Richard. > >> ChangeLog: >> 2016-07-05 Yuri Rumyantsev >> >> PR tree-optimization/71734 >> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Consider REF defined in >> LOOP as independent if at least two loop iterations are not dependent. >> gcc/testsuite/ChangeLog: >> * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style.
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
On Tue, Jul 5, 2016 at 4:56 PM, Yuri Rumyantsev wrote: > Hi All, > > Here is a simple fix to cure regressions introduced by my fix for > 70729. Patch also contains minor changes in test found by Jakub. > > Bootstrapping and regression testing did not show any new failures. > > Is it OK for trunk? + && bitmap_bit_p (&memory_accesses.refs_in_loop[loop->num], ref->id)) So safelen does not apply to refs in nested loops? The added comment only explains the safelen check change but this also requires explanation. Richard. > ChangeLog: > 2016-07-05 Yuri Rumyantsev > > PR tree-optimization/71734 > * tree-ssa-loop-im.c (ref_indep_loop_p_1): Consider REF defined in > LOOP as independent if at least two loop iterations are not dependent. > gcc/testsuite/ChangeLog: > * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style.