On Feb 12, 2021, Richard Biener <[email protected]> wrote:
>> + if (TREE_CODE (mem) == SSA_NAME)
>> + if (ptr_info_def *pi = get_ptr_info (mem))
>> + {
>> + unsigned al = get_pointer_alignment (builtin->dst_base);
>> + if (al > pi->align || pi->misalign)
> We still might prefer pi->align == 64 and pi->misalign == 32 over al == 16
> so maybe factor that in, too.
Ugh, apologies, I somehow posted an incorrect and outdated version of
the patch. The improved (propagates both alignment and misalignment) and
fixed (divides by BITS_PER_UNIT, fixing a regression in
gfortran.dg/sms-2.f90) version had this alternate hunk as the only
difference:
@@ -1155,6 +1156,16 @@ generate_memset_builtin (class loop *loop, partition *partition)
mem = force_gimple_operand_gsi (&gsi, mem, true, NULL_TREE,
false, GSI_CONTINUE_LINKING);
+ if (TREE_CODE (mem) == SSA_NAME)
+ if (ptr_info_def *pi = get_ptr_info (mem))
+ {
+ unsigned al;
+ unsigned HOST_WIDE_INT misal;
+ if (get_pointer_alignment_1 (builtin->dst_base, &al, &misal))
+ set_ptr_info_alignment (pi, al / BITS_PER_UNIT,
+ misal / BITS_PER_UNIT);
+ }
+
/* This exactly matches the pattern recognition in classify_partition. */
val = gimple_assign_rhs1 (DR_STMT (builtin->dst_dr));
/* Handle constants like 0x15151515 and similarly
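
The unit conversion in the hunk, and the preference for a (64, 32) pair over a plain 16 mentioned above, can be sketched outside of GCC roughly as follows. This is a standalone illustration, not GCC code: the sketch_* names and the struct are hypothetical stand-ins, and SKETCH_BITS_PER_UNIT assumes 8-bit units.

```c
#include <assert.h>

/* Assumption for this sketch: 8-bit units, as on most targets.  */
#define SKETCH_BITS_PER_UNIT 8

/* Hypothetical stand-in for an (align, misalign) pair, both in bytes:
   the address is known to equal MISALIGN modulo ALIGN.  */
struct sketch_align_info
{
  unsigned align;
  unsigned misalign;
};

/* Convert an alignment/misalignment pair expressed in bits into bytes,
   mirroring the division by BITS_PER_UNIT in the hunk.  */
static struct sketch_align_info
sketch_from_bits (unsigned align_bits, unsigned long long misalign_bits)
{
  struct sketch_align_info info;

  /* Only whole-byte values make sense for a byte-based record.  */
  assert (align_bits % SKETCH_BITS_PER_UNIT == 0);
  assert (misalign_bits % SKETCH_BITS_PER_UNIT == 0);

  info.align = align_bits / SKETCH_BITS_PER_UNIT;
  info.misalign = (unsigned) (misalign_bits / SKETCH_BITS_PER_UNIT);
  return info;
}

/* Largest power of two known to divide the address: ALIGN itself when
   there is no misalignment, otherwise the lowest set bit of MISALIGN.  */
static unsigned
sketch_known_alignment (struct sketch_align_info info)
{
  if (info.misalign == 0)
    return info.align;
  return info.misalign & -info.misalign;
}
```

With these definitions, a pair with align 64 and misalign 32 still guarantees an alignment of 32, which is more than a plain alignment of 16, so comparing only the align fields would miss the more useful information.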
> So I wonder whether we should instead re-run CCP after loop opts which
> computes nonzero bits as well instead of the above "hack". Would
> nonzero bits be ready to compute in the above way from loop distribution?
> That is, you can do set_nonzero_bits just like you did
> set_ptr_info_alignment ...
> Since CCP also performs copy propagation an obvious candidate would be
> to replace the last pass_copy_prop with pass_ccp (with a comment noting
> to compute up-to-date nonzero bits info).
I'll look into these possibilities.
--
Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar