[Bug tree-optimization/107715] TSVC s161 for double runs at zen4 30 times slower when vectorization is enabled

2022-11-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107715 --- Comment #3 from Alexander Monakov --- There's a forward dependency over 'c' (read of c[i] vs. write of c[i+1] with 'i' iterating forward), and the vectorized variant takes the hit on each iteration. How is a slowdown even surprising? For th
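
For readers without the benchmark at hand, the access pattern Comment #3 describes can be sketched in scalar C roughly as follows. This is a hypothetical reconstruction, not the benchmark source: the function name `s161_like`, the parameter list, and every array name other than `c` are assumptions made for illustration. The only point is that one branch reads `c[i]` while the other writes `c[i+1]`, so a value written in one iteration may be read in the next.

```c
#include <stddef.h>

/* Hypothetical sketch of the loop shape discussed in the bug.
   Names other than 'c' are assumed, not taken from the thread. */
void s161_like(size_t n, double *a, const double *b, double *c,
               const double *d, const double *e)
{
    for (size_t i = 0; i + 1 < n; ++i) {
        if (b[i] < 0.0) {
            /* Write of c[i+1]: the next iteration may read this element. */
            c[i + 1] = a[i] + d[i] * d[i];
        } else {
            /* Read of c[i]: may be the value written one iteration earlier. */
            a[i] = c[i] + d[i] * e[i];
        }
    }
}
```

With `b` alternating in sign, every read of `c[i]` consumes the element written by the previous iteration, which is the dependency over 'c' the comment refers to.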

[Bug tree-optimization/107715] TSVC s161 for double runs at zen4 30 times slower when vectorization is enabled

2022-11-16 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107715 --- Comment #2 from Jan Hubicka ---
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107715
>
> --- Comment #1 from Richard Biener ---
> Because store data races are allowed with -Ofast masked stores are not used so
> we instead get
>
> vect_

[Bug tree-optimization/107715] TSVC s161 for double runs at zen4 30 times slower when vectorization is enabled

2022-11-16 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107715 --- Comment #1 from Richard Biener --- Because store data races are allowed with -Ofast masked stores are not used so we instead get vect__ifc__80.24_114 = VEC_COND_EXPR ; _ifc__80 = _58 ? _45 : _ifc__78; MEM [(double *)vectp_c.25_116] =
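
The difference Comment #1 points at, a conditional (masked) store versus a select followed by an unconditional store, can be illustrated in scalar form. This is a sketch only: `store_if` and `store_select` are hypothetical helper names, and the code is a scalar analogy of the `_58 ? _45 : _ifc__78` pattern in the quoted dump, not what GCC emits.

```c
/* Conditional store: memory is written only when cond is nonzero. */
void store_if(double *dst, int cond, double val)
{
    if (cond)
        *dst = val;
}

/* If-converted form: always load the old value, select, and store back.
   The destination is written even when cond is zero. */
void store_select(double *dst, int cond, double val)
{
    double old = *dst;
    *dst = cond ? val : old;
}
```

In a single thread both helpers leave the same value in `*dst`. The second form writes memory on every call, which is why it is only valid when store data races are permitted, as the comment notes is the case with -Ofast.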