On Tue, 31 May 2016, Yuri Rumyantsev wrote:

> Richard,
>
> I built the compiler with your patch and did not find any issues with
> vectorization of loops marked with pragma simd. I also noticed that
> the size of the vectorized loop looks smaller (I can't tell you exact
> numbers since the fresh compiler performs full unroll even if
> "-funroll-loops" option was not passed).

Thanks for checking - and yes, I expect us to generate better code as we
can't always recover from the un-CSE ifcvt did. I have installed the
patch now, watching for fallout on other archs (I have installed it w/o
re-instantiating bool patterns on x86).

Richard.

> 2016-05-30 15:55 GMT+03:00 Ilya Enkovich <enkovich....@gmail.com>:
> > 2016-05-30 14:04 GMT+03:00 Richard Biener <rguent...@suse.de>:
> >>
> >> The following patch removes the restriction on seeing a tree of stmts
> >> in vectorizer bool pattern detection (aka single-use). With this
> >> it is no longer necessary to unshare DEFs in ifcvt_repair_bool_pattern
> >> and that compile-time hog can go (it's now enabled unconditionally for
> >> GCC 7).
> >>
> >> Instead the pattern detection code will now "unshare" the condition tree
> >> for each bool pattern root by means of adding all pattern stmts of the
> >> condition tree to its pattern def sequence (so we still get some
> >> unnecessary copying, worst-case quadratic rather than exponential).
> >>
> >> Ilya - I had to disable the
> >>
> >>       tree mask_type = get_mask_type_for_scalar_type (TREE_TYPE (rhs1));
> >>       if (mask_type
> >>           && expand_vec_cmp_expr_p (comp_vectype, mask_type))
> >>         return false;
> >>
> >> check you added to check_bool_pattern to get any coverage for bool
> >> patterns on x86_64. Doing that regresses
> >>
> >> FAIL: gcc.target/i386/mask-pack.c scan-tree-dump-times vect "vectorized 1 loops" 10
> >> FAIL: gcc.target/i386/mask-unpack.c scan-tree-dump-times vect "vectorized 1 loops" 10
> >> FAIL: gcc.target/i386/pr70021.c scan-tree-dump-times vect "vectorized 1 loops" 2
> >>
> >> so somehow bool patterns mess up things here (I didn't investigate).
> >> The final patch will enable the above path again, avoiding the regression.
> >
> > Mask conversion patterns handle some cases bool patterns don't. So it's
> > expected we can't vectorize some loops if bool patterns are enforced.
> >
> > Thanks,
> > Ilya
> >
> >>
> >> Yuri - I suppose you have a larger set of testcases using OMP simd
> >> or other forced vectorization you added ifcvt_repair_bool_pattern for.
> >> I'd appreciate testing (and testcases if anything fails unexpectedly).
> >>
> >> Testing on other targets is of course appreciated as well.
> >>
> >> Bootstrapped and tested on x86_64-unknown-linux-gnu (with the #if 0).
> >>
> >> Comments?
> >>
> >> I agree with Ilya elsewhere to remove bool patterns completely
> >> (as a first step making them apply to individual stmts). We do
> >> currently not support mixed cond/non-cond uses anyway, like
> >>
> >> _Bool a[64];
> >> unsigned char b[64];
> >>
> >> void foo (void)
> >> {
> >>   for (int i = 0; i < 64; ++i)
> >>     {
> >>       _Bool x = a[i] && b[i] < 10;
> >>       a[i] = x;
> >>     }
> >> }
> >>
> >> and stmt-local "patterns" can be added when vectorizing the stmt
> >> (like in the above case a tem = a[i] != 0 ? -1 : 0). Doing
> >> bool "promotion" optimally requires a better vectorizer IL
> >> (similar to placing of shuffles).
> >>
> >> Thanks,
> >> Richard.
> >>
> >

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nuernberg)
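
For readers following the thread, the stmt-local "pattern" mentioned in
the quoted mail (tem = a[i] != 0 ? -1 : 0) can be pictured with a scalar,
hand-written C sketch. This is only an illustration under assumptions: it
is not what the vectorizer emits and not GCC-internal code. The names
foo_scalar, foo_masked and variants_agree are hypothetical, and the arrays
are passed as parameters instead of being globals as in the original
example. The idea shown is that both the bool load and the comparison are
turned into all-ones/all-zeros masks of the same width, so they can be
combined with a plain bitwise AND.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define N 64

/* Scalar reference: the loop body from the quoted mail, operating on
   caller-provided arrays (an assumption made for testability).  */
static void foo_scalar (bool *a, const unsigned char *b)
{
  for (int i = 0; i < N; ++i)
    {
      bool x = a[i] && b[i] < 10;
      a[i] = x;
    }
}

/* Hand-lowered variant (hypothetical): every boolean value is first
   turned into a mask that is either 0 or -1, mirroring the
   "tem = a[i] != 0 ? -1 : 0" statement from the mail.  */
static void foo_masked (bool *a, const unsigned char *b)
{
  for (int i = 0; i < N; ++i)
    {
      signed char tem = a[i] != 0 ? -1 : 0;  /* bool load -> mask */
      signed char cmp = b[i] < 10 ? -1 : 0;  /* comparison -> mask */
      signed char both = tem & cmp;          /* AND on masks */
      assert (both == 0 || both == -1);      /* still a valid mask */
      a[i] = both != 0;                      /* mask -> bool store */
    }
}

/* Returns 1 if both variants produce the same result on a fixed
   input pattern, 0 otherwise.  */
static int variants_agree (void)
{
  bool a1[N], a2[N];
  unsigned char b[N];

  for (int i = 0; i < N; ++i)
    {
      a1[i] = a2[i] = (i % 3) != 0;
      b[i] = (unsigned char) (i % 20);
    }
  foo_scalar (a1, b);
  foo_masked (a2, b);
  return memcmp (a1, a2, sizeof a1) == 0;
}
```

Calling variants_agree () from a test driver checks that the masked form
computes the same values as the original loop. The short-circuit && can be
replaced by a bitwise AND here only because neither operand has side
effects.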