Re: [PATCH][RFC] Remove ifcvt_repair_bool_pattern, re-do bool patterns
On Tue, 31 May 2016, Yuri Rumyantsev wrote: > Richard, > > I built compiler with your patch and did not find out any issues with > vectorization of loops marked with pragma simd. I also noticed that > the size of the vectorized loop looks smaller (I can't tell you exact > numbers since the fresh compiler performs fool unroll even if > "-funroll-loops" option was not passed). Thanks for checking - and yes, I expect us to generate better code as we can't always recover from the un-CSE ifcvt did. I have installed the patch now, watching for fallout on other archs (I have installed it w/o re-instantiating bool patterns on x86). Richard. > 2016-05-30 15:55 GMT+03:00 Ilya Enkovich: > > 2016-05-30 14:04 GMT+03:00 Richard Biener : > >> > >> The following patch removes the restriction on seeing a tree of stmts > >> in vectorizer bool pattern detection (aka single-use). With this > >> it is no longer necessary to unshare DEFs in ifcvt_repair_bool_pattern > >> and that compile-time hog can go (it's now enabled unconditionally for GCC > >> 7). > >> > >> Instead the pattern detection code will now "unshare" the condition tree > >> for each bool pattern root my means of adding all pattern stmts of the > >> condition tree to its pattern def sequence (so we still get some > >> unnecessary copying, worst-case quadratic rather than exponential). > >> > >> Ilja - I had to disable the > >> > >> tree mask_type = get_mask_type_for_scalar_type (TREE_TYPE > >> (rhs1)); > >> if (mask_type > >> && expand_vec_cmp_expr_p (comp_vectype, mask_type)) > >> return false; > >> > >> check you added to check_bool_pattern to get any coverage for bool > >> patterns on x86_64. Doing that regresses > >> > >> FAIL: gcc.target/i386/mask-pack.c scan-tree-dump-times vect "vectorized 1 > >> loops" 10 > >> FAIL: gcc.target/i386/mask-unpack.c scan-tree-dump-times vect "vectorized > >> 1 loops" 10 > >> FAIL: gcc.target/i386/pr70021.c scan-tree-dump-times vect "vectorized 1 > >> loops" 2 > >> > >> so somehow bool patterns mess up things here (I didn't investigate). > >> The final patch will enable the above path again, avoiding the regression. > > > > Mask conversion patterns handle some cases bool patterns don't. So it's > > expected we can't vectorize some loops if bool patterns are enforced. > > > > Thanks, > > Ilya > > > >> > >> Yuri - I suppose you have a larger set of testcases using OMP simd > >> or other forced vectorization you added ifcvt_repair_bool_pattern for. > >> I'd appreciate testing (and testcases if anything fails unexpectedly). > >> > >> Testing on other targets is of course appreciated as well. > >> > >> Bootstrapped and tested on x86_64-unknown-linux-gnu (with the #if 0). > >> > >> Comments? > >> > >> I agree with Ilya elsewhere to remove bool patterns completely > >> (as a first step making them apply to individual stmts). We do > >> currently not support mixed cond/non-cond uses anyway, like > >> > >> _Bool a[64]; > >> unsigned char b[64]; > >> > >> void foo (void) > >> { > >> for (int i = 0; i < 64; ++i) > >> { > >> _Bool x = a[i] && b[i] < 10; > >> a[i] = x; > >> } > >> } > >> > >> and stmt-local "patterns" can be added when vectorizing the stmt > >> (like in the above case a tem = a[i] != 0 ? -1 : 0). Doing > >> bool "promotion" optimally requires a better vectorizer IL > >> (similar to placing of shuffles). > >> > >> Thanks, > >> Richard. > >> > > -- Richard Biener SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
Re: [PATCH][RFC] Remove ifcvt_repair_bool_pattern, re-do bool patterns
Richard, I built compiler with your patch and did not find out any issues with vectorization of loops marked with pragma simd. I also noticed that the size of the vectorized loop looks smaller (I can't tell you exact numbers since the fresh compiler performs fool unroll even if "-funroll-loops" option was not passed). 2016-05-30 15:55 GMT+03:00 Ilya Enkovich: > 2016-05-30 14:04 GMT+03:00 Richard Biener : >> >> The following patch removes the restriction on seeing a tree of stmts >> in vectorizer bool pattern detection (aka single-use). With this >> it is no longer necessary to unshare DEFs in ifcvt_repair_bool_pattern >> and that compile-time hog can go (it's now enabled unconditionally for GCC >> 7). >> >> Instead the pattern detection code will now "unshare" the condition tree >> for each bool pattern root my means of adding all pattern stmts of the >> condition tree to its pattern def sequence (so we still get some >> unnecessary copying, worst-case quadratic rather than exponential). >> >> Ilja - I had to disable the >> >> tree mask_type = get_mask_type_for_scalar_type (TREE_TYPE >> (rhs1)); >> if (mask_type >> && expand_vec_cmp_expr_p (comp_vectype, mask_type)) >> return false; >> >> check you added to check_bool_pattern to get any coverage for bool >> patterns on x86_64. Doing that regresses >> >> FAIL: gcc.target/i386/mask-pack.c scan-tree-dump-times vect "vectorized 1 >> loops" 10 >> FAIL: gcc.target/i386/mask-unpack.c scan-tree-dump-times vect "vectorized >> 1 loops" 10 >> FAIL: gcc.target/i386/pr70021.c scan-tree-dump-times vect "vectorized 1 >> loops" 2 >> >> so somehow bool patterns mess up things here (I didn't investigate). >> The final patch will enable the above path again, avoiding the regression. > > Mask conversion patterns handle some cases bool patterns don't. So it's > expected we can't vectorize some loops if bool patterns are enforced. > > Thanks, > Ilya > >> >> Yuri - I suppose you have a larger set of testcases using OMP simd >> or other forced vectorization you added ifcvt_repair_bool_pattern for. >> I'd appreciate testing (and testcases if anything fails unexpectedly). >> >> Testing on other targets is of course appreciated as well. >> >> Bootstrapped and tested on x86_64-unknown-linux-gnu (with the #if 0). >> >> Comments? >> >> I agree with Ilya elsewhere to remove bool patterns completely >> (as a first step making them apply to individual stmts). We do >> currently not support mixed cond/non-cond uses anyway, like >> >> _Bool a[64]; >> unsigned char b[64]; >> >> void foo (void) >> { >> for (int i = 0; i < 64; ++i) >> { >> _Bool x = a[i] && b[i] < 10; >> a[i] = x; >> } >> } >> >> and stmt-local "patterns" can be added when vectorizing the stmt >> (like in the above case a tem = a[i] != 0 ? -1 : 0). Doing >> bool "promotion" optimally requires a better vectorizer IL >> (similar to placing of shuffles). >> >> Thanks, >> Richard. >>
Re: [PATCH][RFC] Remove ifcvt_repair_bool_pattern, re-do bool patterns
2016-05-30 14:04 GMT+03:00 Richard Biener: > > The following patch removes the restriction on seeing a tree of stmts > in vectorizer bool pattern detection (aka single-use). With this > it is no longer necessary to unshare DEFs in ifcvt_repair_bool_pattern > and that compile-time hog can go (it's now enabled unconditionally for GCC 7). > > Instead the pattern detection code will now "unshare" the condition tree > for each bool pattern root my means of adding all pattern stmts of the > condition tree to its pattern def sequence (so we still get some > unnecessary copying, worst-case quadratic rather than exponential). > > Ilja - I had to disable the > > tree mask_type = get_mask_type_for_scalar_type (TREE_TYPE > (rhs1)); > if (mask_type > && expand_vec_cmp_expr_p (comp_vectype, mask_type)) > return false; > > check you added to check_bool_pattern to get any coverage for bool > patterns on x86_64. Doing that regresses > > FAIL: gcc.target/i386/mask-pack.c scan-tree-dump-times vect "vectorized 1 > loops" 10 > FAIL: gcc.target/i386/mask-unpack.c scan-tree-dump-times vect "vectorized > 1 loops" 10 > FAIL: gcc.target/i386/pr70021.c scan-tree-dump-times vect "vectorized 1 > loops" 2 > > so somehow bool patterns mess up things here (I didn't investigate). > The final patch will enable the above path again, avoiding the regression. Mask conversion patterns handle some cases bool patterns don't. So it's expected we can't vectorize some loops if bool patterns are enforced. Thanks, Ilya > > Yuri - I suppose you have a larger set of testcases using OMP simd > or other forced vectorization you added ifcvt_repair_bool_pattern for. > I'd appreciate testing (and testcases if anything fails unexpectedly). > > Testing on other targets is of course appreciated as well. > > Bootstrapped and tested on x86_64-unknown-linux-gnu (with the #if 0). > > Comments? > > I agree with Ilya elsewhere to remove bool patterns completely > (as a first step making them apply to individual stmts). We do > currently not support mixed cond/non-cond uses anyway, like > > _Bool a[64]; > unsigned char b[64]; > > void foo (void) > { > for (int i = 0; i < 64; ++i) > { > _Bool x = a[i] && b[i] < 10; > a[i] = x; > } > } > > and stmt-local "patterns" can be added when vectorizing the stmt > (like in the above case a tem = a[i] != 0 ? -1 : 0). Doing > bool "promotion" optimally requires a better vectorizer IL > (similar to placing of shuffles). > > Thanks, > Richard. >
[PATCH][RFC] Remove ifcvt_repair_bool_pattern, re-do bool patterns
The following patch removes the restriction on seeing a tree of stmts in vectorizer bool pattern detection (aka single-use). With this it is no longer necessary to unshare DEFs in ifcvt_repair_bool_pattern and that compile-time hog can go (it's now enabled unconditionally for GCC 7). Instead the pattern detection code will now "unshare" the condition tree for each bool pattern root my means of adding all pattern stmts of the condition tree to its pattern def sequence (so we still get some unnecessary copying, worst-case quadratic rather than exponential). Ilja - I had to disable the tree mask_type = get_mask_type_for_scalar_type (TREE_TYPE (rhs1)); if (mask_type && expand_vec_cmp_expr_p (comp_vectype, mask_type)) return false; check you added to check_bool_pattern to get any coverage for bool patterns on x86_64. Doing that regresses FAIL: gcc.target/i386/mask-pack.c scan-tree-dump-times vect "vectorized 1 loops" 10 FAIL: gcc.target/i386/mask-unpack.c scan-tree-dump-times vect "vectorized 1 loops" 10 FAIL: gcc.target/i386/pr70021.c scan-tree-dump-times vect "vectorized 1 loops" 2 so somehow bool patterns mess up things here (I didn't investigate). The final patch will enable the above path again, avoiding the regression. Yuri - I suppose you have a larger set of testcases using OMP simd or other forced vectorization you added ifcvt_repair_bool_pattern for. I'd appreciate testing (and testcases if anything fails unexpectedly). Testing on other targets is of course appreciated as well. Bootstrapped and tested on x86_64-unknown-linux-gnu (with the #if 0). Comments? I agree with Ilya elsewhere to remove bool patterns completely (as a first step making them apply to individual stmts). We do currently not support mixed cond/non-cond uses anyway, like _Bool a[64]; unsigned char b[64]; void foo (void) { for (int i = 0; i < 64; ++i) { _Bool x = a[i] && b[i] < 10; a[i] = x; } } and stmt-local "patterns" can be added when vectorizing the stmt (like in the above case a tem = a[i] != 0 ? -1 : 0). Doing bool "promotion" optimally requires a better vectorizer IL (similar to placing of shuffles). Thanks, Richard. 2016-05-30 Richard BienerPR tree-optimization/71261 * tree-vect-patterns.c (check_bool_pattern): Gather a hash-set of stmts successfully put in the bool pattern. Remove single-use restriction. (adjust_bool_pattern_cast): Add cast at the use site via the pattern def sequence. (adjust_bool_pattern): Remove recursion, maintain a hash-map of patterned defs. Use the pattern def seqence instead of multiple independent patterns. (sort_after_uid): New qsort compare function. (adjust_bool_stmts): New function to process stmts in the bool pattern in IL order. (vect_recog_bool_pattern): Adjust. * tree-if-conv.c (ifcvt_split_def_stmt): Remove. (ifcvt_walk_pattern_tree): Likewise. (stmt_is_root_of_bool_pattern): Likewise. (ifcvt_repair_bool_pattern): Likewise. (tree_if_conversion): Do not call ifcvt_repair_bool_pattern. * gcc.dg/torture/vect-bool-1.c: New testcase. Index: gcc/tree-vect-patterns.c === *** gcc/tree-vect-patterns.c.orig 2016-05-27 14:32:49.778686018 +0200 --- gcc/tree-vect-patterns.c2016-05-30 11:06:23.563532173 +0200 *** vect_recog_mixed_size_cond_pattern (vec< *** 2888,2897 /* Helper function of vect_recog_bool_pattern. Called recursively, return true if bool VAR can and should be optimized that way. Assume it shouldn't in case it's a result of a comparison which can be directly vectorized into !a vector comparison. */ static bool ! check_bool_pattern (tree var, vec_info *vinfo) { gimple *def_stmt; enum vect_def_type dt; --- 2888,2898 /* Helper function of vect_recog_bool_pattern. Called recursively, return true if bool VAR can and should be optimized that way. Assume it shouldn't in case it's a result of a comparison which can be directly vectorized into !a vector comparison. Fills in STMTS with all stmts visited during the !walk. */ static bool ! check_bool_pattern (tree var, vec_info *vinfo, hash_set ) { gimple *def_stmt; enum vect_def_type dt; *** check_bool_pattern (tree var, vec_info * *** 2907,2943 if (!is_gimple_assign (def_stmt)) return false; ! if (!has_single_use (var)) ! return false; rhs1 = gimple_assign_rhs1 (def_stmt); rhs_code = gimple_assign_rhs_code (def_stmt); switch (rhs_code) { case SSA_NAME: ! return check_bool_pattern (rhs1, vinfo); CASE_CONVERT: if ((TYPE_PRECISION (TREE_TYPE (rhs1)) != 1 || !TYPE_UNSIGNED (TREE_TYPE (rhs1))) && TREE_CODE