https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63148
Bug ID: 63148
Summary: r187042 causes auto-vectorization failure for X86 for -m32.
Product: gcc
Version: 4.8.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: doug.gilmore at imgtec dot com

Created attachment 33440
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33440&action=edit
test example

I noticed that MultiSource/Benchmarks/TSVC/LoopRestructuring-{flt,dbl}
from the LLVM test-suite fail on X86 with -m32, and I was able to bisect
the failure to commit r187042.

I attached a stripped-down example.

Before the revision, if we compile with -fdump-tree-vect-details, we see
that a loop-carried dependency is recorded:

(compute_affine_dependence
  stmt_a: D.1748_9 = global_data.b[D.1747_8];
  stmt_b: global_data.b[i.0_2] = D.1750_11;
  (subscript_dependence_tester
    (analyze_overlapping_iterations
      (chrec_a = {0, +, 1}_5)
      (chrec_b = {1, +, 1}_5)
      (analyze_siv_subscript
        (analyze_subscript_affine_affine
          (overlaps_a = [1 + 1 * x_1])
          (overlaps_b = [0 + 1 * x_1])
        )
      )
      (overlap_iterations_a = [1 + 1 * x_1])
      (overlap_iterations_b = [0 + 1 * x_1])
    )
    (analyze_overlapping_iterations
      (chrec_a = 2816)
      (chrec_b = 2816)
      (overlap_iterations_a = [0])
      (overlap_iterations_b = [0])
    )
    (build_classic_dist_vector
      dist_vector = ( 1 )
    )
  )
)

which results in the loop not being vectorized because of the memory
recurrence.

After the change, the dependency is not recorded:

(compute_affine_dependence
  stmt_a: D.1748_9 = global_data.b[D.1747_8];
  stmt_b: global_data.b[i.0_2] = D.1750_11;
  (subscript_dependence_tester
    (analyze_overlapping_iterations
      (chrec_a = {536870912, +, 1}_5)
      (chrec_b = {1, +, 1}_5)
      (analyze_siv_subscript
        (analyze_subscript_affine_affine
          (overlaps_a = no dependence)
          (overlaps_b = no dependence)
        )
      )
      (overlap_iterations_a = no dependence)
      (overlap_iterations_b = no dependence)
    )
    (dependence classified: scev_known)
  )

As a result, the loop is incorrectly vectorized.
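
For readers less familiar with these dumps, here is a minimal sketch of the equal-step distance check that the two traces above appear to reflect. It is a simplified model and not GCC's implementation; siv_distance, NO_DEPENDENCE, byte_offset32 and the trip count of 32000 mentioned below are hypothetical names and values chosen for illustration. The second helper records one observation under an assumption the report does not state (8-byte elements and 32-bit offsets): 536870912 * 8 wraps to 0, so {536870912, +, 1} and {0, +, 1} would then name the same byte offsets.

```c
#include <assert.h>
#include <stdint.h>

/* Sentinel meaning "the two accesses never touch the same element". */
#define NO_DEPENDENCE INT64_MIN

/* Accesses A = {base_a, +, step} and B = {base_b, +, step} share one step.
   A at iteration x and B at iteration y hit the same element when
   base_a + step * x == base_b + step * y, that is, when
   x - y == (base_b - base_a) / step.
   Returns that distance, or NO_DEPENDENCE if it is not an integer or
   cannot occur inside a loop of `niters` iterations. */
int64_t siv_distance(int64_t base_a, int64_t base_b, int64_t step,
                     int64_t niters)
{
    assert(step != 0 && niters > 0);
    int64_t diff = base_b - base_a;
    if (diff % step != 0)
        return NO_DEPENDENCE;
    int64_t dist = diff / step;
    if (dist <= -niters || dist >= niters)
        return NO_DEPENDENCE;
    return dist;
}

/* Byte offset of element `index`, reduced modulo 2^32
   (assumption: offsets are computed in 32 bits, as with -m32). */
uint32_t byte_offset32(uint32_t index, uint32_t elem_size)
{
    return index * elem_size;
}
```

With a hypothetical trip count of 32000, siv_distance(0, 1, 1, 32000) returns 1, in line with dist_vector = ( 1 ) in the first trace, while siv_distance(536870912, 1, 1, 32000) returns NO_DEPENDENCE, in line with the second trace.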

Note that when compiled with -m64 the loop is actually vectorized, but it
is determined that versioning is needed:

45: dependence distance == 0 between global_data.a[D.1767_2] and global_data.a[D.1767_2]
45: versioning for alias required: can't determine dependence between global_data.a[D.1767_2] and *D.1776_10
...
58: LOOP VECTORIZED.
s221_extract.c:40: note: vectorized 5 loops in function.
Merging blocks 2 and 41
Removing basic block 5
...

and the incorrectly vectorized code is removed.
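
As a rough illustration of what "versioning for alias" means in the dump above, the sketch below writes the two loop versions by hand: a runtime overlap check selects either a copy that is safe to vectorize or a copy that keeps strict scalar order. The names ranges_disjoint and add_one_versioned and the loop body are hypothetical; this is not the code GCC generates for the attached test case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nonzero when the n-element ranges starting at p and q do not overlap. */
int ranges_disjoint(const double *p, const double *q, size_t n)
{
    uintptr_t a = (uintptr_t)p, b = (uintptr_t)q;
    uintptr_t len = (uintptr_t)(n * sizeof(double));
    return a + len <= b || b + len <= a;
}

/* dst[i] = src[i] + 1.0, written as two hand-made versions of one loop. */
void add_one_versioned(double *dst, const double *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    if (ranges_disjoint(dst, src, n)) {
        /* No overlap: iterations are independent, so this copy may be
           vectorized. */
        for (size_t i = 0; i < n; i++)
            dst[i] = src[i] + 1.0;
    } else {
        /* Possible overlap: this copy must keep element-by-element order
           so that a recurrence such as dst == src + 1 stays correct. */
        for (size_t i = 0; i < n; i++)
            dst[i] = src[i] + 1.0;
    }
}
```

Calling add_one_versioned(z + 1, z, 3) on an overlapping array takes the second branch, which preserves the recurrence that the -m32 build loses.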