------- Additional Comments From sebastian dot pop at cri dot ensmp dot fr 2005-07-26 15:15 -------
Subject: Re: [4.1 Regression] wrong code for casts and scev

Dorit Naishlos wrote:
>
> The modifications you suggest will make the tests uninteresting - they were
> introduced with unknown loop-bound/offset on purpose. In fact, for most of
> these tests there are already versions with explicit constant values for
> the loop-bound/offset -

Okay, I won't modify any of these testcases.

> vect-46.c with explicit loop-bound becomes vect-40.c
> vect-50.c with explicit loop-bound becomes vect-44.c
> vect-52.c with explicit loop-bound becomes vect-48.c
> vect-58.c with explicit loop-bound becomes vect-54.c
> vect-60.c with explicit loop-bound becomes vect-56.c
> vect-77.c and vect-78.c with explicit offset become vect-75.c
> and vect-92.c also uses unknown loop-bound on purpose.
>
> Can we change something else in the tests to make the evolution-analyzer
> return something saner? by changing types of variables? by using some flag?
>

I don't think it is possible to properly convert these ivs without
knowing an approximation of the number of iterations.  We can get this
either from -fipcp, but this would make the testcases redundant as you
said, or having the IP alias analysis that tells us that the only
pointers passed to main1 () in vect-46 point to data whose size is 256,
and from this extract an estimation of the number of iterations, or
last solution explained below.

> (by the way, where does it fail the vectorizer? (what are the last things
> that dump details reports?))

compiling vect-46.c produces the following:

(...
  (set_scalar_evolution
    (scalar = D.2054_15)
    (scalar_evolution = (afloat * restrict) {0, +, 4}_1 + pb_14))
)
/home/seb/mainline/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c:30: note: Access function of ptr: (afloat * restrict) {0, +, 4}_1 + pb_14
/home/seb/mainline/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c:30: note: not vectorized: ptr is loop invariant.
/home/seb/mainline/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c:30: note: not vectorized: unhandled data ref: D.2055_16 = *D.2054_15
/home/seb/mainline/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c:30: note: bad data references.
/home/seb/mainline/gcc/gcc/testsuite/gcc.dg/vect/vect-46.c:30: note: vectorized 0 loops in function.

Now that we keep the cast,
(afloat * restrict) {(uint) 0, +, (uint) 4}_1 + pb_14
cannot be folded into
{(afloat * restrict)pb_14, +, (afloat * restrict)4}_1,
that's why the vectorizer cannot recognize the pattern.

The main problem is that the code in the loop contains a cast of the
signed iv i_18 to unsigned:

  bb_2 (preds = {bb_3 bb_1 }, succs = {bb_3 bb_5 })
  {
    # TMT.5_17 = PHI <TMT.5_27(3), TMT.5_26(1)>;
    # i_18 = PHI <i_24(3), 0(1)>;
  <L0>:;
    D.2050_6 = (long unsigned int) i_18;
    D.2051_7 = D.2050_6 * 4;
    D.2052_8 = (afloat * restrict) D.2051_7;
    D.2053_10 = D.2052_8 + pa_9;
    D.2054_15 = D.2052_8 + pb_14;
    # VUSE <TMT.5_17>;
    D.2055_16 = *D.2054_15;
    D.2056_21 = D.2052_8 + pc_20;
    # VUSE <TMT.5_17>;
    D.2057_22 = *D.2056_21;
    D.2058_23 = D.2055_16 * D.2057_22;
    # TMT.5_27 = V_MAY_DEF <TMT.5_17>;
    *D.2053_10 = D.2058_23;
    i_24 = i_18 + 1;
    if (n_3 > i_24) goto <L9>; else goto <L11>;
  }

The solution is to have an estimation of the loop bound based on the
fact that i_24 and i_18 do not wrap.  From this estimation, I think
that the other vars D.2050_6, D.2051_7 and D.2052_8 can be proved to
not wrap.  I'm working on some code for estimating niter from signed
non wrapping ivs.

Sebastian

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22236
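
To illustrate why a bound on the number of iterations matters for folding the cast, here is a minimal sketch in C. It is not GCC code: the helper name offsets_are_affine is hypothetical, and an 8-bit unsigned type stands in for the `long unsigned int` of the dump so that wrapping shows up at small trip counts. The function checks whether the offset `(unsigned) i * 4`, computed in the narrow type, agrees with the unwrapped value 4 * i for every iteration below the bound n. Only when that holds is it safe to treat the casted expression as a plain evolution with step 4.

```c
#include <stdint.h>

/* Hypothetical helper, not part of GCC.  An 8-bit type is assumed here
   purely to make the wrap-around visible; the dump above uses
   long unsigned int.  Returns 1 if (uint8_t)(4 * i) equals the
   unwrapped value 4 * i for every i in [0, n), and 0 otherwise.  */
int offsets_are_affine(int n)
{
    for (int i = 0; i < n; i++) {
        uint8_t narrow = (uint8_t)(4 * i); /* offset in the narrow unsigned type */
        int wide = 4 * i;                  /* same offset without wrapping */
        if ((int)narrow != wide)
            return 0;                      /* wrapped: not an affine {0, +, 4} */
    }
    return 1;
}
```

With this stand-in type the offsets stay affine while n is at most 64 (the largest offset is then 252), and the property fails as soon as i reaches 64, where 4 * i is 256 and the narrow value wraps to 0. The same reasoning, scaled to the real types, is why an estimate of niter is needed before the cast can be moved inside the evolution.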