https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96888
Bug ID: 96888
Summary: Missing vectorization opportunity depending on integer
type
Product: gcc
Version: 10.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: pmenon at cs dot cmu.edu
Target Milestone: ---
The loop in the following test case isn't vectorized:
#include <cstdint>
// Add x to each v[i] if bit 'i' is set in LSB-encoded bits.
void Test(int8_t *__restrict v, int8_t x, const uint64_t *bits, unsigned n) {
for (int i = 0, num_words = (n + 64 - 1) / 64; i < num_words; i++) {
const uint64_t word = bits[i];
for (int j = 0; j < 64; j++) {
v[i*64+j] += x * (bool)(word & (uint64_t(1) << j));
}
}
}
<source>:7:9: missed: couldn't vectorize loop
<source>:7:9: missed: not vectorized: control flow in loop.
<source>:8:27: missed: couldn't vectorize loop
<source>:9:30: missed: not vectorized: relevant stmt not supported: _10 = word_24 >> j_34;
However, changing one line (the one constructing the mask) from an explicit
uint64_t(1) to a plain 1U (which is not correct, since 1U << j is undefined for
j >= 32), we get auto-vectorization:
#include <cstdint>
// Add x to each v[i] if bit 'i' is set in LSB-encoded bits.
void Test(int8_t *__restrict v, int8_t x, const uint64_t *bits, unsigned n) {
for (int i = 0, num_words = (n + 64 - 1) / 64; i < num_words; i++) {
const uint64_t word = bits[i];
for (int j = 0; j < 64; j++) {
v[i*64+j] += x * (bool)(word & (1U << j));
}
}
}