https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68786
            Bug ID: 68786
           Summary: Aligned masked store is generated for unaligned pointer
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ienkovich at gcc dot gnu.org
  Target Milestone: ---

Here is a testcase:

double *a;
int b;

void
test (void)
{
  for (; b; b++)
    if (b < 7)
      a[b] = 1.0;
}

Produced assembler for that loop when compiled for AVX-512:

>gcc -O2 -ftree-vectorize -march=skylake-avx512 small.i -S

.L4:
        vpcmpd  $2, %zmm2, %zmm0, %k1
        addl    $1, %r8d
        vpaddd  %zmm3, %zmm0, %zmm0
        vmovupd %zmm1, (%rsi){%k1}
        kshiftrw        $8, %k1, %k1
        vmovapd %zmm1, 64(%rsi){%k1}
        subq    $-128, %rsi
        cmpl    %edx, %r8d
        jb      .L4

We have two stores using the same base.  One of them is unaligned (vmovupd)
and the other is aligned (vmovapd).  The difference comes from GIMPLE.  Here
is the vectorized loop:

  <bb 6>:
  # vect_vec_iv_.11_71 = PHI <vect_cst__69(5), vect_vec_iv_.11_72(6)>
  # ivtmp.26_87 = PHI <0(5), ivtmp.26_6(6)>
  # ivtmp.28_7 = PHI <ivtmp.28_24(5), ivtmp.28_8(6)>
  vectp.15_82 = (vector(8) double *) ivtmp.28_7;
  vect_vec_iv_.11_72 = vect_vec_iv_.11_71 + { 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16 };
  mask__24.12_74 = vect_vec_iv_.11_71 <= { 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6 };
  mask_patt_28.14_76 = [vec_unpack_lo_expr] mask__24.12_74;
  mask_patt_28.14_77 = [vec_unpack_hi_expr] mask__24.12_74;
  MASK_STORE (vectp.15_82, 0B, mask_patt_28.14_76, { 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0 });
  _25 = ivtmp.28_7 + 64;
  _88 = (vector(8) double *) _25;
  MASK_STORE (_88, 0B, mask_patt_28.14_77, { 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0 });
  ivtmp.26_6 = ivtmp.26_87 + 1;
  ivtmp.28_8 = ivtmp.28_7 + 128;
  if (ivtmp.26_6 < bnd.8_31)
    goto <bb 6>;

The pointers used for the two masked stores have different
SSA_NAME_PTR_INFO.

For vectp.15_82 we have

{pt = {anything = 0, nonlocal = 1, escaped = 1, ipa_escaped = 0, null = 0,
vars_contains_nonlocal = 0, vars_contains_escaped = 0,
vars_contains_escaped_heap = 0, vars = 0x7ffff7c13160}, align = 8, misalign = 0}

For _88 we have

{pt = {anything = 0, nonlocal = 1, escaped = 1, ipa_escaped = 0, null = 0,
vars_contains_nonlocal = 0, vars_contains_escaped = 0,
vars_contains_escaped_heap = 0, vars = 0x7ffff7c13160}, align = 0, misalign = 0}

The zero alignment recorded for _88 causes TYPE_ALIGN to be used for the
second MASK_STORE.  TYPE_ALIGN for a vector type is its size, so the pointer
is treated as aligned to the full vector and we incorrectly generate an
aligned memory access.
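
To make the effect of the zero alignment concrete, here is a small model of
the decision in C.  It is only a sketch and not GCC code: the struct
ptr_align_info and the functions assumed_align and uses_aligned_move are
hypothetical names invented for illustration, and all alignments are taken
to be in bytes.  It assumes, as described above, that an "unknown" (zero)
pointer alignment falls back to the alignment of the accessed type.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the alignment fields of SSA_NAME_PTR_INFO.
   Values are in bytes; align == 0 means nothing is known.  */
struct ptr_align_info
{
  unsigned align;
  unsigned misalign;
};

/* Alignment (in bytes) assumed for an access through a pointer with
   alignment info PI, when the accessed type has alignment TYPE_ALIGN.  */
static unsigned
assumed_align (struct ptr_align_info pi, unsigned type_align)
{
  if (pi.align == 0)
    /* The problematic fallback: no pointer info, so trust the type.  */
    return type_align;
  if (pi.misalign != 0)
    /* Pointer is MISALIGN bytes past an ALIGN boundary; only the lowest
       set bit of MISALIGN is guaranteed.  */
    return pi.misalign & -pi.misalign;
  return pi.align;
}

/* True if an aligned vector move (vmovapd-like) would be chosen for a
   vector of VECTOR_SIZE bytes.  */
static bool
uses_aligned_move (struct ptr_align_info pi, unsigned vector_size)
{
  return assumed_align (pi, vector_size) >= vector_size;
}

/* Replays the two pointers from the dumps above with a 64-byte vector.  */
static void
demo (void)
{
  struct ptr_align_info vectp_15_82 = { 8, 0 };  /* align = 8 */
  struct ptr_align_info ssa_88 = { 0, 0 };       /* align = 0 */

  assert (!uses_aligned_move (vectp_15_82, 64)); /* unaligned store */
  assert (uses_aligned_move (ssa_88, 64));       /* aligned store (wrong) */
}
```

Under this model the first pointer is known to be only 8-byte aligned and
gets an unaligned store, while the second one, with no alignment recorded,
is treated as 64-byte aligned even though it is just the first pointer
plus 64.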