[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression

2016-08-31 Thread ppalka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

--- Comment #11 from Patrick Palka  ---
Author: ppalka
Date: Wed Aug 31 19:06:22 2016
New Revision: 239907

URL: https://gcc.gnu.org/viewcvs?rev=239907&root=gcc&view=rev
Log:
Fix folding of VECTOR_CST comparisons

gcc/ChangeLog:

Backport from mainline
2016-08-27  Patrick Palka  

PR tree-optimization/71077
PR tree-optimization/68542
* fold-const.c (fold_relational_const): Fix folding of
VECTOR_CST comparisons that have a scalar boolean result type.

gcc/testsuite/ChangeLog:

Backport from mainline
2016-08-27  Patrick Palka  

PR tree-optimization/71077
* gcc.target/i386/pr71077.c: New test.


Added:
branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/pr71077.c
Modified:
branches/gcc-6-branch/gcc/ChangeLog
branches/gcc-6-branch/gcc/fold-const.c
branches/gcc-6-branch/gcc/testsuite/ChangeLog

[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression

2016-08-27 Thread ppalka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

--- Comment #10 from Patrick Palka  ---
Author: ppalka
Date: Sat Aug 27 22:00:17 2016
New Revision: 239798

URL: https://gcc.gnu.org/viewcvs?rev=239798&root=gcc&view=rev
Log:
Fix folding of VECTOR_CST comparisons

gcc/ChangeLog:

PR tree-optimization/71077
PR tree-optimization/68542
* fold-const.c (fold_relational_const): Fix folding of
VECTOR_CST comparisons that have a scalar boolean result type.
(selftest::test_vector_folding): New static function.
(selftest::fold_const_c_tests): Call it.

gcc/testsuite/ChangeLog:

PR tree-optimization/71077
* gcc.target/i386/pr71077.c: New test.


Added:
trunk/gcc/testsuite/gcc.target/i386/pr71077.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/fold-const.c
trunk/gcc/testsuite/ChangeLog
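
As a rough illustration of what folding a constant vector comparison to a
scalar boolean involves, the C sketch below models constant vectors as plain
arrays. The names fold_vector_compare and cmp_code are hypothetical and are
not GCC internals. The assumed semantics are whole-vector ones: EQ is true
only if every lane matches, and NE is true as soon as any lane differs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a constant vector is a plain array of int lanes. */
enum cmp_code { CMP_EQ, CMP_NE };

/* Fold a whole-vector EQ/NE comparison of two constant vectors to one bool.
   Assumed semantics: EQ holds only if all lanes are equal; NE holds if at
   least one lane differs. */
bool fold_vector_compare(enum cmp_code code, const int *a, const int *b,
                         size_t nelts)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < nelts; i++)
        if (a[i] != b[i])
            return code == CMP_NE;   /* the first mismatch decides the result */
    return code == CMP_EQ;           /* every lane matched */
}
```

Under these assumptions, comparing {1,2,3,4} with {1,2,3,5} folds to false
for EQ and to true for NE, even though three of the four lanes are equal.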

[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression

2016-03-01 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

Richard Biener  changed:

           What    |Removed     |Added
----------------------------------------
             Status|UNCONFIRMED |RESOLVED
         Resolution|---         |FIXED

--- Comment #9 from Richard Biener  ---
Fixed.

[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression

2016-02-02 Thread ienkovich at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

--- Comment #8 from Ilya Enkovich  ---
Author: ienkovich
Date: Tue Feb  2 09:46:26 2016
New Revision: 233068

URL: https://gcc.gnu.org/viewcvs?rev=233068&root=gcc&view=rev
Log:
gcc/

2016-02-02  Yuri Rumyantsev  

PR middle-end/68542
* config/i386/i386.c (ix86_expand_branch): Add support for conditional
branch with vector comparison.
* config/i386/sse.md (VI48_AVX): New mode iterator.
(define_expand "cbranch<mode>4"): Add support for conditional branch
with vector comparison.
* tree-vect-loop.c (optimize_mask_stores): New function.
* tree-vect-stmts.c (vectorizable_mask_load_store): Initialize
has_mask_store field of vect_info.
* tree-vectorizer.c (vectorize_loops): Invoke optimize_mask_stores for
vectorized loops having masked stores after vec_info destroy.
* tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and
corresponding macros.
(optimize_mask_stores): Add prototype.

gcc/testsuite

2016-02-02  Yuri Rumyantsev  

PR middle-end/68542
* gcc.dg/vect/vect-mask-store-move-1.c: New test.
* gcc.target/i386/avx2-vect-mask-store-move1.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/vect/vect-mask-store-move-1.c
trunk/gcc/testsuite/gcc.target/i386/avx2-vect-mask-store-move1.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/sse.md
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-vect-loop.c
trunk/gcc/tree-vect-stmts.c
trunk/gcc/tree-vectorizer.c
trunk/gcc/tree-vectorizer.h

[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression

2016-01-27 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

--- Comment #7 from rguenther at suse dot de  ---
On January 27, 2016 5:03:18 PM GMT+01:00, "mpolacek at gcc dot gnu.org"
 wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542
>
>Marek Polacek  changed:
>
>   What|Removed |Added
>
>CC||mpolacek at gcc dot gnu.org
>
>--- Comment #6 from Marek Polacek  ---
>Fixed?

Not yet

[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression

2016-01-27 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

Marek Polacek  changed:

           What    |Removed |Added
------------------------------------------------------
                 CC|        |mpolacek at gcc dot gnu.org

--- Comment #6 from Marek Polacek  ---
Fixed?

[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression

2016-01-18 Thread ienkovich at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

--- Comment #5 from Ilya Enkovich  ---
Author: ienkovich
Date: Mon Jan 18 14:14:35 2016
New Revision: 232518

URL: https://gcc.gnu.org/viewcvs?rev=232518&root=gcc&view=rev
Log:
gcc/

2016-01-18  Yuri Rumyantsev  

PR middle-end/68542
* fold-const.c (fold_binary_op_with_conditional_arg): Bail out for case
of mixed vector and scalar types.
(fold_relational_const): Add handling of vector
comparison with boolean result.
* tree-cfg.c (verify_gimple_comparison): Add argument CODE, allow
comparison of vector operands with boolean result for EQ/NE only.
(verify_gimple_assign_binary): Adjust call for
verify_gimple_comparison.
(verify_gimple_cond): Likewise.
* tree-vrp.c (extract_code_and_val_from_cond_with_ops): Modify check on
valid type of VAL.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/fold-const.c
trunk/gcc/tree-cfg.c
trunk/gcc/tree-vrp.c

[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression

2016-01-04 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

--- Comment #4 from Andrew Pinski  ---
(In reply to Yuri Rumyantsev from comment #3)
> I enhanced the patch for masked store movement (guarded by a zero-mask
> check) so that it also moves all possible producers of the stored value,
> and the performance degradation disappeared.
> The patch will be redesigned and sent for review next week.

What happened to this patch?

[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression

2015-11-26 Thread ysrumyan at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

--- Comment #3 from Yuri Rumyantsev  ---
I enhanced the patch for masked store movement (guarded by a zero-mask check)
so that it also moves all possible producers of the stored value, and the
performance degradation disappeared.
The patch will be redesigned and sent for review next week.

[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression

2015-11-25 Thread ienkovich at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

Ilya Enkovich  changed:

           What    |Removed |Added
-------------------------------------------------------
                 CC|        |ienkovich at gcc dot gnu.org

--- Comment #1 from Ilya Enkovich  ---
I was looking into the 481.wrf degradation caused by r230309 on Haswell (-Ofast
-flto).  I found that this patch caused many loops in the 'sint' function to be
vectorized.  All these loops have the form:

DO 1 II=N1STAR,N1END
  IF ( icmask(II,JJ) ) THEN
   ...
  ENDIF
1 CONTINUE

Where icmask is a two-dimensional array of logicals.  The problem is that only
35% of icmask values are .TRUE. and these values are clustered rather than
scattered.  This means that in the vectorized loop the vector mask is usually
either all 0s or all 1s, so many iterations execute with a zero mask.  Yuri's
patch adding a zero check for the vector mask should (at least partly) resolve
this issue.
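
To make the effect of clustering concrete, here is a small C sketch that
counts how many vector iterations see an entirely false mask. The function
name count_all_zero_blocks, the lane count vf, and the mask patterns in the
usage note are all hypothetical; none of them are taken from 481.wrf or from
GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Count the vector iterations whose mask is entirely false.
   vf is the (hypothetical) number of lanes per vector.  Trailing elements
   that do not fill a whole vector are ignored to keep the sketch short. */
size_t count_all_zero_blocks(const bool *mask, size_t n, size_t vf)
{
    assert(vf > 0);
    size_t zero_blocks = 0;
    for (size_t base = 0; base + vf <= n; base += vf) {
        bool any = false;
        for (size_t lane = 0; lane < vf; lane++)
            any = any || mask[base + lane];
        if (!any)
            zero_blocks++;   /* this iteration would do no useful work */
    }
    return zero_blocks;
}
```

For example, with 24 elements, vf = 8 and 8 true values, a mask whose first 8
entries are true gives 2 all-zero blocks out of 3, while a mask with every
third entry true gives none, although the fraction of true values is the same.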

[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression

2015-11-25 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

Richard Biener  changed:

           What    |Removed |Added
------------------------------------
   Target Milestone|---     |6.0

[Bug middle-end/68542] [6 Regression] 10% 481.wrf performance regression

2015-11-25 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68542

--- Comment #2 from Richard Biener  ---
Ok, so previously we if-converted but with versioning and thus the if-converted
loop was not vectorized and thrown away?

So yes, for such cases we'd ideally have vector control-flow

 if (!all-zero)
  {
...
  }

but best by not if-converting this in the first place.  Note that the
above also applies to "regular" vectorization of if-converted code,
not only to 'masks' as with Yuri's patch.  I wonder if we can extend
that to re-introduce control flow.  A vector == 0 check should be
fairly cheap and the transform keyed on how much code we can execute
conditionally.
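
A minimal C sketch of that idea follows, with vector lanes modelled as short
scalar loops. VF, block_body and guarded_update are hypothetical names, and
the arithmetic in the body is only a stand-in for the conditional work. It
shows if-converted code (a select rather than a branch) wrapped in a
re-introduced "mask is not all zero" check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VF 4  /* hypothetical vector width, in lanes */

/* If-converted body for one block: every lane computes its value and the
   mask only selects which results are kept. */
static void block_body(float *a, const float *x, const bool *mask)
{
    for (size_t lane = 0; lane < VF; lane++) {
        float v = x[lane] * x[lane] + 1.0f;   /* stand-in conditional work */
        a[lane] = mask[lane] ? v : a[lane];   /* select instead of branch */
    }
}

/* Re-introduced control flow: skip blocks whose mask is all zero.
   Returns the number of blocks actually executed.
   n is assumed to be a multiple of VF. */
size_t guarded_update(float *a, const float *x, const bool *mask, size_t n)
{
    assert(n % VF == 0);
    size_t executed = 0;
    for (size_t base = 0; base < n; base += VF) {
        bool any = false;
        for (size_t lane = 0; lane < VF; lane++)
            any |= mask[base + lane];
        if (!any)
            continue;                 /* the "vector == 0" check */
        block_body(a + base, x + base, mask + base);
        executed++;
    }
    return executed;
}
```

The check costs one reduction per block, so in this sketch it pays off only
when the skipped body is noticeably more expensive than the check itself.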