[Bug tree-optimization/96888] Missing vectorization opportunity depending on integer type

2020-09-01 Thread pmenon at cs dot cmu.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96888

--- Comment #1 from pmenon at cs dot cmu.edu ---
Correction: outer loop condition should read 'i < n'.

[Bug tree-optimization/96888] New: Missing vectorization opportunity depending on integer type

2020-09-01 Thread pmenon at cs dot cmu.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96888

    Bug ID: 96888
   Summary: Missing vectorization opportunity depending on integer type
   Product: gcc
   Version: 10.2.1
    Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pmenon at cs dot cmu.edu
  Target Milestone: ---

The loop in the following test case isn't vectorized:

#include 
#include 

// Add x to each v[i] if bit 'i' is set in LSB-encoded bits.
void Test(int8_t *__restrict v, int8_t x, const uint64_t *bits, unsigned n) {
  for (int i = 0, num_words=(n+64-1)/64; i , n; i++) {
    const uint64_t word = bits[i];
    for (int j = 0; j < 64; j++) {
      v[i*64+j] += x * (bool)(word & (uint64_t(1)<<j));
    }
  }
}

:7:9: missed: couldn't vectorize loop
:7:9: missed: not vectorized: control flow in loop.
:8:27: missed: couldn't vectorize loop
:9:30: missed: not vectorized: relevant stmt not supported: _10 = word_24 >> j_34;
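
For anyone wanting to reproduce this, here is a minimal self-contained sketch of the test case. It assumes the outer loop condition 'i < n' from Comment #1, treats n as the number of 64-bit words, uses an unsigned loop index, and drops the unused num_words. CheckTest is a hypothetical helper that is not part of the report; it only confirms the scalar semantics and says nothing about whether the loop is vectorized.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Test case from the report, with the loop condition from Comment #1.
// Assumption: n is the number of 64-bit words in 'bits'.
void Test(int8_t *__restrict v, int8_t x, const uint64_t *bits, unsigned n) {
  for (unsigned i = 0; i < n; i++) {
    const uint64_t word = bits[i];
    for (int j = 0; j < 64; j++) {
      // Add x only when bit j of the current word is set.
      v[i * 64 + j] += x * (bool)(word & (uint64_t(1) << j));
    }
  }
}

// Hypothetical helper (not in the report): runs Test on a small input and
// compares every element against a straightforward per-bit reference.
bool CheckTest() {
  const std::vector<uint64_t> bits = {0x8000000000000001ull,
                                      0x00000000FFFF0000ull};
  std::vector<int8_t> v(bits.size() * 64, 1);
  Test(v.data(), 3, bits.data(), static_cast<unsigned>(bits.size()));
  for (std::size_t k = 0; k < v.size(); k++) {
    const bool set = ((bits[k / 64] >> (k % 64)) & 1) != 0;
    if (v[k] != (set ? 4 : 1)) return false;
  }
  return true;
}
```

Calling CheckTest() from a small main and asserting on the result is enough to confirm that any rewritten mask expression still gives the same answers, including for bits 32 to 63.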

However, after changing one line (the one constructing the mask) from an explicit
uint64_t(1) to a plain 1U (which is not correct), we get auto-vectorization:

#include 
#include 

// Add x to each v[i] if bit 'i' is set in LSB-encoded bits.
void Test(int8_t *__restrict v, int8_t x, const uint64_t *bits, unsigned n) {
  for (int i = 0, num_words=(n+64-1)/64; i , n; i++) {
    const uint64_t word = bits[i];
    for (int j = 0; j < 64; j++) {
      v[i*64+j] += x * (bool)(word & (1<<j));
    }
  }
}
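
To spell out why the narrower mask "is not correct": the inner loop shifts by j up to 63, so the constant being shifted has to be at least 64 bits wide. The sketch below is only an illustration; BitWidth and BitSet64 are hypothetical names that do not appear in the report, and the 32-bit width of unsigned int is an assumption that holds on common targets but is not guaranteed by the language.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Number of bits in the type of a mask constant.
template <typename T>
constexpr int BitWidth() {
  return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// Hypothetical helper: tests bit j (0..63) of word with a full-width mask.
constexpr bool BitSet64(uint64_t word, int j) {
  return (word & (uint64_t(1) << j)) != 0;
}

// The explicit 64-bit constant covers every shift count used by the loop.
static_assert(BitWidth<decltype(uint64_t(1))>() == 64, "full-width mask");
```

On a target where BitWidth<decltype(1U)>() is 32, shifting 1U (or a plain 1) by 32 or more is undefined behavior, so the variant that vectorizes does not compute the same thing for the upper half of each word.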