https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66285
            Bug ID: 66285
           Summary: failure to vectorize parallelized loop
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: minor
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

Another pr46032-inspired example.

Consider par-2.c:
...
#define nEvents 1000

int __attribute__((noinline,noclone))
f (int argc, double *__restrict results, double *__restrict data)
{
  double coeff = 12.2;

  for (INDEX_TYPE idx = 0; idx < nEvents; idx++)
    results[idx] = coeff * data[idx];

  return !(results[argc] == 0.0);
}

#if defined (MAIN)
int
main (int argc)
{
  double results[nEvents] = {0};
  double data[nEvents] = {0};

  return f (argc, results, data);
}
#endif
...

And investigate.sh:
...
#!/bin/bash

src=par-2.c

for parloops_factor in 0 2; do
    for index_type in "int" "unsigned int" "long" "unsigned long"; do
        rm -f *.c.*;

        ./lean-c/install/bin/gcc -O2 $src -S \
            -ftree-parallelize-loops=$parloops_factor \
            -ftree-vectorize \
            -fdump-tree-all-all \
            "-DINDEX_TYPE=$index_type"

        vectdump=$src.132t.vect
        pardump=$src.129t.parloops

        vectorized=$(grep -c "LOOP VECTORIZED" $vectdump)
        if [ ! -f $pardump ]; then
            parallelized=0
        else
            parallelized=$(grep -c "parallelizing inner loop" $pardump)
        fi

        echo "parloops_factor: $parloops_factor, index_type: $index_type:"
        echo " vectorized: $vectorized, parallelized: $parallelized"
    done
done
...

If we're not parallelizing, vectorization succeeds:
...
parloops_factor: 0, index_type: int:
 vectorized: 1, parallelized: 0
parloops_factor: 0, index_type: unsigned int:
 vectorized: 1, parallelized: 0
parloops_factor: 0, index_type: long:
 vectorized: 1, parallelized: 0
parloops_factor: 0, index_type: unsigned long:
 vectorized: 1, parallelized: 0
...

If we're parallelizing, vectorization succeeds for (unsigned) long:
...
parloops_factor: 2, index_type: long:
 vectorized: 1, parallelized: 1
parloops_factor: 2, index_type: unsigned long:
 vectorized: 1, parallelized: 1
...

but not for (unsigned) int:
...
parloops_factor: 2, index_type: int:
 vectorized: 0, parallelized: 1
parloops_factor: 2, index_type: unsigned int:
 vectorized: 0, parallelized: 1
...
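
For readers unfamiliar with what the vectorizer is looking at in the parallelized case, here is a rough hand-written sketch of the loop shape after the iteration space is split across threads. It is an approximation for illustration only, not the code parloops actually generates, and it makes no claim about why the int case fails. The names f_chunk and f_chunked and the ceiling-division chunking are assumptions; the threads are simulated by a sequential loop so the sketch stays self-contained.

```c
#include <assert.h>

#define nEvents 1000

/* Hypothetical helper (not GCC output): process the slice of the
   iteration space owned by thread `tid` out of `nthreads`.  The loop
   bounds are only known at run time, unlike in the original loop.  */
void
f_chunk (unsigned tid, unsigned nthreads,
         double *restrict results, const double *restrict data)
{
  assert (nthreads > 0 && tid < nthreads);

  const double coeff = 12.2;

  /* Assumed static schedule: equal chunks, rounded up.  */
  unsigned chunk = (nEvents + nthreads - 1) / nthreads;
  unsigned start = tid * chunk;
  unsigned end = start + chunk;
  if (end > nEvents)
    end = nEvents;

  /* Same loop body as in f from par-2.c, over a sub-range.  */
  for (unsigned idx = start; idx < end; idx++)
    results[idx] = coeff * data[idx];
}

/* Sequential stand-in for launching `nthreads` threads.  */
void
f_chunked (unsigned nthreads, double *results, const double *data)
{
  for (unsigned tid = 0; tid < nthreads; tid++)
    f_chunk (tid, nthreads, results, data);
}
```

Calling f_chunked (2, results, data) fills results the same way as the loop in f, but each inner loop now runs between run-time bounds, which is roughly the situation the vect pass sees after parloops has run.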