https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96888

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |53947
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2020-09-02
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  We currently do not support promoting/demoting the shift amount
vector operand when vectorizing shifts.

Note the fact that we turn word & (1ul<<j) into word >> j likely triggers
this issue - this does not happen when you write word & (1u<<j).  Still
AVX2 is needed for the 1<<j induction.

Note the generated code doesn't exactly look good ...

It seems we should be able to use outer loop vectorization here, but after
fixing some things we still run into dependence analysis issues there:

t.ii:4:46: note: === vect_analyze_data_ref_dependences ===
t.ii:4:46: note: dependence distance = 0.
t.ii:4:46: note: dependence distance == 0 between *_8 and *_8
t.ii:4:46: note: dependence distance = 1.
t.ii:7:23: missed: not vectorized, possible dependence between data-refs *_8 and *_8
t.ii:4:46: missed: bad data dependence.

that's the v[i*16+j] read/write.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
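
For readers without the testcase at hand, the two spellings of the bit test
contrasted in comment #2 can be sketched roughly as below. This is NOT the
reporter's testcase: the function names, the element type of v, the words
array and the "+= 1" update are assumptions made up for illustration. Only
the word & (1ul<<j) versus word & (1u<<j) forms and the v[i*16+j]
read/write come from the comment.

```cpp
// Hypothetical kernels: names, types and the "+=" update are assumptions,
// not the testcase attached to the bug.

// Mask spelled as 1ul << j. Per comment #2, GCC may turn the test into
// word >> j, so the shift amount j (int) and the shifted value
// (unsigned long) have different widths.
void count_bits_ul(unsigned *v, const unsigned long *words, int n)
{
    for (int i = 0; i < n; ++i) {
        unsigned long word = words[i];
        for (int j = 0; j < 16; ++j)
            v[i * 16 + j] += (word & (1ul << j)) ? 1u : 0u;  // read/write of v[i*16+j]
    }
}

// Mask spelled as 1u << j. Per comment #2, the rewrite into word >> j
// does not happen with this form.
void count_bits_u(unsigned *v, const unsigned long *words, int n)
{
    for (int i = 0; i < n; ++i) {
        unsigned long word = words[i];
        for (int j = 0; j < 16; ++j)
            v[i * 16 + j] += (word & (1u << j)) ? 1u : 0u;   // same result for j < 16
    }
}
```

Both functions compute the same values for j < 16, so any difference in the
vectorizer's behaviour between them comes from how the shift is represented,
not from the semantics. Something like "g++ -O3 -mavx2 -fopt-info-vec-missed"
should report the missed-vectorization notes for each loop.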