[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

--- Comment #11 from Patrick Palka ---
Author: ppalka
Date: Wed Aug 31 19:06:22 2016
New Revision: 239907

URL: https://gcc.gnu.org/viewcvs?rev=239907=gcc=rev
Log:
Fix folding of VECTOR_CST comparisons

gcc/ChangeLog:

    Backport from mainline
    2016-08-27  Patrick Palka

    PR tree-optimization/71077
    PR tree-optimization/68542
    * fold-const.c (fold_relational_const): Fix folding of
    VECTOR_CST comparisons that have a scalar boolean result type.

gcc/testsuite/ChangeLog:

    Backport from mainline
    2016-08-27  Patrick Palka

    PR tree-optimization/71077
    * gcc.target/i386/pr71077.c: New test.

Added:
    branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/pr71077.c
Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/fold-const.c
    branches/gcc-6-branch/gcc/testsuite/ChangeLog
[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

--- Comment #10 from Patrick Palka ---
Author: ppalka
Date: Sat Aug 27 22:00:17 2016
New Revision: 239798

URL: https://gcc.gnu.org/viewcvs?rev=239798=gcc=rev
Log:
Fix folding of VECTOR_CST comparisons

gcc/ChangeLog:

    PR tree-optimization/71077
    PR tree-optimization/68542
    * fold-const.c (fold_relational_const): Fix folding of
    VECTOR_CST comparisons that have a scalar boolean result type.
    (selftest::test_vector_folding): New static function.
    (selftest::fold_const_c_tests): Call it.

gcc/testsuite/ChangeLog:

    PR tree-optimization/71077
    * gcc.target/i386/pr71077.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr71077.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/fold-const.c
    trunk/gcc/testsuite/ChangeLog
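
As an illustration of what "a VECTOR_CST comparison with a scalar boolean result" means, here is a minimal sketch in plain C. It is not the code in fold-const.c: the enum `cmp_code`, the function `fold_vector_cmp_to_bool`, and the use of `int` arrays in place of constant vectors are all hypothetical stand-ins. The only assumption taken from the thread is that such comparisons are restricted to EQ/NE, with EQ meaning "all elements equal" and NE meaning "some element differs".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for EQ_EXPR / NE_EXPR; not GCC's tree codes. */
enum cmp_code { CMP_EQ, CMP_NE };

/* Folds a whole-vector comparison of two constant vectors (modelled here as
   int arrays of length n) into a single boolean:
   - CMP_EQ is true only if every element pair is equal;
   - CMP_NE is true as soon as one element pair differs. */
bool fold_vector_cmp_to_bool(enum cmp_code code,
                             const int *a, const int *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i])
            return code == CMP_NE;  /* one mismatch decides both codes */
    }
    return code == CMP_EQ;          /* no mismatch found */
}
```

The point of the sketch is that the result is one boolean for the whole vector, not a vector of per-element booleans.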
[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

Richard Biener changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #9 from Richard Biener ---
Fixed.
[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

--- Comment #8 from Ilya Enkovich ---
Author: ienkovich
Date: Tue Feb 2 09:46:26 2016
New Revision: 233068

URL: https://gcc.gnu.org/viewcvs?rev=233068=gcc=rev
Log:
gcc/

2016-02-02  Yuri Rumyantsev

    PR middle-end/68542
    * config/i386/i386.c (ix86_expand_branch): Add support for
    conditional branch with vector comparison.
    * config/i386/sse.md (VI48_AVX): New mode iterator.
    (define_expand "cbranch<mode>4"): Add support for conditional
    branch with vector comparison.
    * tree-vect-loop.c (optimize_mask_stores): New function.
    * tree-vect-stmts.c (vectorizable_mask_load_store): Initialize
    has_mask_store field of vect_info.
    * tree-vectorizer.c (vectorize_loops): Invoke optimize_mask_stores
    for vectorized loops having masked stores after vec_info destroy.
    * tree-vectorizer.h (loop_vec_info): Add new has_mask_store field
    and correspondent macros.
    (optimize_mask_stores): Add prototype.

gcc/testsuite

2016-02-02  Yuri Rumyantsev

    PR middle-end/68542
    * gcc.dg/vect/vect-mask-store-move-1.c: New test.
    * gcc.target/i386/avx2-vect-mask-store-move1.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/vect/vect-mask-store-move-1.c
    trunk/gcc/testsuite/gcc.target/i386/avx2-vect-mask-store-move1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/sse.md
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vect-loop.c
    trunk/gcc/tree-vect-stmts.c
    trunk/gcc/tree-vectorizer.c
    trunk/gcc/tree-vectorizer.h
[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

--- Comment #7 from rguenther at suse dot de ---
On January 27, 2016 5:03:18 PM GMT+01:00, "mpolacek at gcc dot gnu.org" wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542
>
>Marek Polacek changed:
>
>           What    |Removed                     |Added
>
>                 CC|                            |mpolacek at gcc dot gnu.org
>
>--- Comment #6 from Marek Polacek ---
>Fixed?

Not yet
[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

Marek Polacek changed:

           What    |Removed                     |Added
                 CC|                            |mpolacek at gcc dot gnu.org

--- Comment #6 from Marek Polacek ---
Fixed?
[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

--- Comment #5 from Ilya Enkovich ---
Author: ienkovich
Date: Mon Jan 18 14:14:35 2016
New Revision: 232518

URL: https://gcc.gnu.org/viewcvs?rev=232518=gcc=rev
Log:
gcc/

2016-01-18  Yuri Rumyantsev

    PR middle-end/68542
    * fold-const.c (fold_binary_op_with_conditional_arg): Bail out for
    case of mixing vector and scalar types.
    (fold_relational_const): Add handling of vector comparison with
    boolean result.
    * tree-cfg.c (verify_gimple_comparison): Add argument CODE, allow
    comparison of vector operands with boolean result for EQ/NE only.
    (verify_gimple_assign_binary): Adjust call for
    verify_gimple_comparison.
    (verify_gimple_cond): Likewise.
    * tree-vrp.c (extract_code_and_val_from_cond_with_ops): Modify
    check on valid type of VAL.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/fold-const.c
    trunk/gcc/tree-cfg.c
    trunk/gcc/tree-vrp.c
[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

--- Comment #4 from Andrew Pinski ---
(In reply to Yuri Rumyantsev from comment #3)
> I enhanced the patch that moves masked stores under a zero-mask guard so
> that it also moves all possible producers of the stored value, and the
> performance degradation disappeared.
> The patch will be redesigned and sent for review next week.

What happened to this patch?
[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

--- Comment #3 from Yuri Rumyantsev ---
I enhanced the patch that moves masked stores under a zero-mask guard so
that it also moves all possible producers of the stored value, and the
performance degradation disappeared.
The patch will be redesigned and sent for review next week.
[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

Ilya Enkovich changed:

           What    |Removed                     |Added
                 CC|                            |ienkovich at gcc dot gnu.org

--- Comment #1 from Ilya Enkovich ---
I was looking into the 481.wrf degradation caused by r230309 on Haswell
(-Ofast -flto). I found that this patch caused many loops in the 'sint'
function to be vectorized. All these loops have the form:

      DO 1 II=N1STAR,N1END
        IF ( icmask(II,JJ) ) THEN
          ...
        ENDIF
    1 CONTINUE

where icmask is a two-dimensional array of logicals. The problem is that
only 35% of the icmask values are .TRUE. and these values are not sparse.
This means that in the vectorized loop the vector mask is usually either
all 0s or all 1s, and therefore we perform many iterations with a zero
mask. The zero check for vector masks in Yuri's patch should (at least
partly) resolve this issue.
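
To make the "all 0s or all 1s" observation concrete, the following is a rough sketch in C that classifies the per-chunk masks a vectorized loop would see. It is not taken from GCC or from 481.wrf: `struct mask_stats`, `classify_mask_chunks`, and the choice of vector width are hypothetical and for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not GCC code: tallies what the vector masks of a
   loop vectorized with `width` lanes would look like for a logical array. */
struct mask_stats {
    size_t all_zero;  /* chunks where no lane is set */
    size_t all_one;   /* chunks where every lane is set */
    size_t mixed;     /* chunks with both set and clear lanes */
};

struct mask_stats classify_mask_chunks(const bool *mask, size_t n, size_t width)
{
    struct mask_stats s = {0, 0, 0};
    assert(width > 0);
    /* Only full chunks are counted; a scalar tail is ignored here. */
    for (size_t i = 0; i + width <= n; i += width) {
        size_t set = 0;
        for (size_t lane = 0; lane < width; lane++)
            set += mask[i + lane] ? 1 : 0;
        if (set == 0)
            s.all_zero++;
        else if (set == width)
            s.all_one++;
        else
            s.mixed++;
    }
    return s;
}
```

For a mask whose true values form one contiguous run covering about a third of the array, most chunks come out as all-zero, which is the case where the masked vector body does no useful work.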
[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

Richard Biener changed:

           What    |Removed                     |Added
   Target Milestone|---                         |6.0
[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

--- Comment #2 from Richard Biener ---
Ok, so previously we if-converted, but with versioning, and thus the
if-converted loop was not vectorized and was thrown away?

So yes, for such cases we'd ideally have vector control flow

  if (!all-zero)
    {
      ...
    }

but best by not if-converting this in the first place. Note that the
above also applies to "regular" vectorization of if-converted code, not
only to 'masks' as with Yuri's patch. I wonder if we can extend that to
re-introduce control flow. A vector == 0 check should be fairly cheap,
and the transform could be keyed on how much code we can execute
conditionally.
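
The `if (!all-zero)` guard can be sketched in scalar C as follows. This only emulates the shape of the transform; it is not what the vectorizer emits. `LANES`, `any_lane_set`, and `masked_scale_guarded` are hypothetical names, and the loop body (`dst[i] = src[i] * 2.0f`) is a made-up stand-in for the conditional code.

```c
#include <stdbool.h>
#include <stddef.h>

#define LANES 8  /* assumed vector width, for illustration only */

/* Stands in for the "vector == 0" check: true if any lane is set. */
static bool any_lane_set(const bool *mask)
{
    for (size_t lane = 0; lane < LANES; lane++)
        if (mask[lane])
            return true;
    return false;
}

/* Emulates a vectorized `if (mask[i]) dst[i] = src[i] * 2.0f;` loop in
   which each chunk is guarded by an all-zero mask check.
   Returns the number of full chunks skipped by the guard. */
size_t masked_scale_guarded(float *dst, const float *src,
                            const bool *mask, size_t n)
{
    size_t skipped = 0;
    size_t i = 0;
    for (; i + LANES <= n; i += LANES) {
        if (!any_lane_set(mask + i)) {
            skipped++;          /* whole chunk masked off: do nothing */
            continue;
        }
        /* Masked body: compute and store only in active lanes. */
        for (size_t lane = 0; lane < LANES; lane++)
            if (mask[i + lane])
                dst[i + lane] = src[i + lane] * 2.0f;
    }
    for (; i < n; i++)          /* scalar tail */
        if (mask[i])
            dst[i] = src[i] * 2.0f;
    return skipped;
}
```

The guard pays off only when the skipped body is expensive relative to the check, which matches the remark that the transform should depend on how much code can be executed conditionally.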