Hi Alexandre,

on 2023/4/6 13:20, Alexandre Oliva wrote:
> Hello, Kewen,
>
> On Mar 27, 2023, "Kewen.Lin" <li...@linux.ibm.com> wrote:
>
>> on 2023/3/25 16:35, Alexandre Oliva wrote:
>
>>> The first loop in main gets stores "vectorized" on powerpc into
>>> full-word stores, even without any vector instruction support, so the
>>> test's expectation of no loop vectorization is not met.
>
>> I think this test issue has been gone since r13-5771-gdc87e1391c55c6.
>
> That patch has been backported to gcc-12 as r12-9258-g21e7145aaf582c.
>
>> Could you have a double check?
>
> I confirm I observe the problem with gcc-12 targeting ppc64-vx7r2,
> containing the backported patch, and that the loop is vectorized,
> failing the test.

Thanks for confirming!  Sorry that I don't have a vxworks env to
reproduce this locally, but I guessed that the vxworks env doesn't have
its own specific configurations for vectorization, so I tried to
reproduce this in an env with powerpc64-linux-gnu and the latest gcc-12
branch (r12-9388).  I still saw it pass, with this vect dump:

gen-vect-11c.c:26:17: note:   ==> examining statement: _3 = _1 + _2;
gen-vect-11c.c:26:17: note:   vect_is_simple_use: operand ib[i_24], type of def: internal
gen-vect-11c.c:26:17: note:   vect_is_simple_use: vectype vector(2) int
gen-vect-11c.c:26:17: note:   vect_is_simple_use: operand ic[i_24], type of def: internal
gen-vect-11c.c:26:17: note:   vect_is_simple_use: vectype vector(2) int
not using word mode for +- and less than four vector elements
gen-vect-11c.c:28:21: missed:   not vectorized: relevant stmt not supported: _3 = _1 + _2;
gen-vect-11c.c:26:17: missed:  bad operation or unsupported loop bound.
gen-vect-11c.c:26:17: note:  ***** Analysis failed with vector mode DI
gen-vect-11c.c:26:17: missed: couldn't vectorize loop
gen-vect-11c.c:18:5: note: vectorized 0 loops in function.

After reverting r12-9258-g21e7145aaf582c, I saw it fail, with this dump:

gen-vect-11c.c:26:17: note:   ==> examining statement: _3 = _1 + _2;
gen-vect-11c.c:26:17: note:   vect_is_simple_use: operand ib[i_24], type of def: internal
gen-vect-11c.c:26:17: note:   vect_is_simple_use: vectype vector(2) int
gen-vect-11c.c:26:17: note:   vect_is_simple_use: operand ic[i_24], type of def: internal
gen-vect-11c.c:26:17: note:   vect_is_simple_use: vectype vector(2) int
...
gen-vect-11c.c:26:17: note:  ***** Analysis succeeded with vector mode DI
gen-vect-11c.c:26:17: note:  ***** Choosing vector mode DI
gen-vect-11c.c:26:17: optimized: loop vectorized using 8 byte vectors

>
>
> It's unfortunately not viable for me to test GCC trunk with vxworks, so
> my testing with it is limited to earlier GCC versions, that we (AdaCore)
> have already ported or are in the process of porting.
> I make up for
> that by testing trunk with other target variants, to the best of my
> abilities, to avoid regressions, but sometimes I just can't tell whether
> my baseline for regression testing doesn't contain a failure because
> there's another fix, or because it just doesn't fail on that target
> variant.
>
>
> In this case, the comments in the patch you mentioned don't seem to
> match the situation at hand: the SImode stores vectorized into V2SImode
> (DImode) seem profitable and are *not* split by vector lowering.
>

Yeah, but the case also has a "+" (PLUS), which results in a
not-vectorized decision, as in the dump above.  I'm not quite sure what
the difference between our environments is, or what caused you not to
see the above analysis failure on your side.  Do you mind checking
further?

BR,
Kewen
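
P.S. For reference, here is a minimal sketch of why a "+" on a two-element
word-mode (DImode) vector is not as cheap as the stores.  It is only an
illustration, not code from GCC or from gen-vect-11c.c, and the names
pack2, add_naive, add_lanes and lane_add_selfcheck are made up for this
example.  It assumes 32-bit lanes packed into one 64-bit word: a plain
64-bit add lets the carry from the low lane leak into the high lane, so
an emulated add needs extra masking operations, which is presumably why
the dump reports "not using word mode for +- and less than four vector
elements".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack two 32-bit lanes into one 64-bit word,
   with lane 0 in the low half and lane 1 in the high half.  */
static uint64_t pack2(uint32_t lane0, uint32_t lane1)
{
  return ((uint64_t) lane1 << 32) | lane0;
}

/* Plain word-mode add: a carry out of lane 0 corrupts lane 1.  */
static uint64_t add_naive(uint64_t a, uint64_t b)
{
  return a + b;
}

/* Lane-wise add emulated in a single word: add the low 31 bits of each
   lane (so no carry can cross a lane boundary), then restore each
   lane's top bit with an xor.  Five logical/arithmetic ops for two
   element adds, so it does not beat two scalar adds.  */
static uint64_t add_lanes(uint64_t a, uint64_t b)
{
  const uint64_t top = 0x8000000080000000ULL;  /* top bit of each lane */
  uint64_t sum = (a & ~top) + (b & ~top);
  return sum ^ ((a ^ b) & top);
}

/* Small self-check showing where the two versions differ.  */
static void lane_add_selfcheck(void)
{
  uint64_t a = pack2(0xFFFFFFFFu, 1u);
  uint64_t b = pack2(1u, 2u);

  /* Correct lane-wise result: lane 0 wraps to 0, lane 1 is 1 + 2.  */
  assert(add_lanes(a, b) == pack2(0u, 3u));
  /* The naive add carries from lane 0 into lane 1.  */
  assert(add_naive(a, b) == pack2(0u, 4u));
}
```

Calling lane_add_selfcheck() from a test driver exercises both versions;
the stores in the loop need none of this, since storing a packed word is
a single DImode store.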