[Bug tree-optimization/87105] Autovectorization [X86, SSE2, AVX2, DoublePrecision]

2018-10-26 Thread kobalicek.petr at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87105 --- Comment #16 from Petr --- Thanks a lot! I hope much more code would benefit from this change.

[Bug tree-optimization/87105] Autovectorization [X86, SSE2, AVX2, DoublePrecision]

2018-10-26 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87105 Richard Biener changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug tree-optimization/87105] Autovectorization [X86, SSE2, AVX2, DoublePrecision]

2018-10-26 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87105 --- Comment #14 from Richard Biener --- Author: rguenth Date: Fri Oct 26 07:38:59 2018 New Revision: 265522 URL: https://gcc.gnu.org/viewcvs?rev=265522&root=gcc&view=rev Log: 2018-10-26 Richard Biener PR tree-optimization/87105 *

[Bug tree-optimization/87105] Autovectorization [X86, SSE2, AVX2, DoublePrecision]

2018-10-24 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87105 --- Comment #13 from Richard Biener --- Author: rguenth Date: Wed Oct 24 11:46:58 2018 New Revision: 265457 URL: https://gcc.gnu.org/viewcvs?rev=265457&root=gcc&view=rev Log: 2018-10-24 Richard Biener PR tree-optimization/87105 *

[Bug tree-optimization/87105] Autovectorization [X86, SSE2, AVX2, DoublePrecision]

2018-10-24 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87105 --- Comment #12 from Richard Biener --- With the duplicate store issue fixed in the vectorizer, we run into the SLP vectorization issue that limits the growth of the SLP tree (yes, it's a tree and thus tends to grow exponentially quite easily...).

[Bug tree-optimization/87105] Autovectorization [X86, SSE2, AVX2, DoublePrecision]

2018-10-23 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87105 --- Comment #11 from Richard Biener --- The code is now better but still not vectorized due to the issues mentioned.

[Bug tree-optimization/87105] Autovectorization [X86, SSE2, AVX2, DoublePrecision]

2018-10-23 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87105 --- Comment #10 from Richard Biener --- Author: rguenth Date: Tue Oct 23 11:34:56 2018 New Revision: 265421 URL: https://gcc.gnu.org/viewcvs?rev=265421&root=gcc&view=rev Log: 2018-10-23 Richard Biener PR tree-optimization/87105 PR

[Bug tree-optimization/87105] Autovectorization [X86, SSE2, AVX2, DoublePrecision]

2018-10-23 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87105 Richard Biener changed: What|Removed |Added CC||jamborm at gcc dot gnu.org,

[Bug tree-optimization/87105] Autovectorization [X86, SSE2, AVX2, DoublePrecision]

2018-08-27 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87105 --- Comment #8 from Richard Biener --- I've also had patches adding an early phiopt pass which would have solved the CFG mess VRP creates.

[Bug tree-optimization/87105] Autovectorization [X86, SSE2, AVX2, DoublePrecision]

2018-08-27 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87105 Richard Biener changed: What|Removed |Added Keywords||alias Target|

[Bug tree-optimization/87105] Autovectorization [X86, SSE2, AVX2, DoublePrecision]

2018-08-26 Thread kobalicek.petr at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87105 --- Comment #6 from Petr --- I think the test-case can even be simplified to something like this: #include #include struct Point { double x, y; void reset(double x, double y) { this->x = x; this->y = y; } }; void f1(Point* p,

[Bug tree-optimization/87105] Autovectorization [X86, SSE2, AVX2, DoublePrecision]

2018-08-26 Thread amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87105 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug tree-optimization/87105] Autovectorization [X86, SSE2, AVX2, DoublePrecision]

2018-08-26 Thread kobalicek.petr at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87105 --- Comment #4 from Petr --- I think this code is vectorizable without -ffast-math. However, it seems that once a min/max (or something else) is kept scalar, it poisons the rest of the code. The following code works perfectly (scalar): ```

[Bug tree-optimization/87105] Autovectorization [X86, SSE2, AVX2, DoublePrecision]

2018-08-26 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87105 --- Comment #3 from Marc Glisse --- With -ffast-math we (awkwardly) vectorize a couple of min/max operations at the beginning, but clearly not the whole thing like LLVM does.

[Bug tree-optimization/87105] Autovectorization [X86, SSE2, AVX2, DoublePrecision]

2018-08-25 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87105 --- Comment #2 from Andrew Pinski --- One more point is that in C++ (a < b ? b : a) is an lvalue, which might also interfere with converting it into min/max.

[Bug tree-optimization/87105] Autovectorization [X86, SSE2, AVX2, DoublePrecision]

2018-08-25 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87105 --- Comment #1 from Andrew Pinski --- So I think it is an interesting interaction in that GCC cannot change a < b ? a : b into MIN_EXPR. There might be a reason for this, dealing with NaNs, INF, etc.