https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011
Bug ID: 109011
Summary: missed optimization in presence of __builtin_ctz
Product: gcc
Version: 12.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: vincenzo.innocente at cern dot ch
Target Milestone: ---

In the following code, foo does not vectorize but bar does.
clang vectorizes foo using a pattern that invokes vplzcntd
(the code is made a bit complex to make vectorization "relevant").
See https://godbolt.org/z/5fa1zbPeG

#include <cstdint>

uint32_t x[256];
uint32_t y[256];
uint32_t w[256];
uint32_t z[256];

void foo() {
  for (int i=0; i<256; i++) {
    auto p = x[i] ? __builtin_ctz(x[i]) : y[i];
    z[i] = w[i]*p;
  }
}

void bar() {
  for (int j=0; j<256; j+=8)
    for (int i=j; i<j+8; i++) {
      // auto p = x[i] ? x[i] : y[i];
      auto p = x[i] ? __builtin_ctz(x[i]) : y[i];
      z[i] = w[i]*p;
    }
}
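
For reference, a minimal sketch of how a leading-zero count (the operation vplzcntd provides per lane) can stand in for a trailing-zero count. It relies on the standard identity ctz(x) == 31 - clz(x & -x) for nonzero 32-bit x. This is only an illustration of the idea, not a claim about the exact instruction sequence clang emits; the names ctz_via_clz, foo_clz, check_identity and the *_s arrays are hypothetical and not part of the report.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: trailing-zero count of a NONZERO 32-bit value,
// expressed with a leading-zero count only.
// x & -x isolates the lowest set bit; its bit index is 31 - clz of it.
inline uint32_t ctz_via_clz(uint32_t x) {
    uint32_t lowest = x & (0u - x);  // unsigned negate, well defined
    return 31u - static_cast<uint32_t>(__builtin_clz(lowest));
}

// Separate arrays (hypothetical names) so the sketch does not clash
// with the x, y, w, z arrays of the test case.
uint32_t x_s[256], y_s[256], w_s[256], z_s[256];

// Same loop shape as foo() in the report, with __builtin_ctz replaced.
void foo_clz() {
    for (int i = 0; i < 256; i++) {
        uint32_t p = x_s[i] ? ctz_via_clz(x_s[i]) : y_s[i];
        z_s[i] = w_s[i] * p;
    }
}

// Compare the helper against __builtin_ctz for every single-bit
// position, with the top bit also set to exercise multi-bit inputs.
void check_identity() {
    for (uint32_t v = 1; v != 0; v <<= 1) {
        uint32_t in = v | 0x80000000u;
        assert(ctz_via_clz(in) == static_cast<uint32_t>(__builtin_ctz(in)));
    }
}
```

As in the original loop, the zero case is handled by the x[i] ? ... : y[i] select, so the helper is never evaluated for a zero input, where both __builtin_ctz and __builtin_clz are undefined.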