https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116032
--- Comment #8 from Christophe Lyon <clyon at gcc dot gnu.org> ---
We currently have:
struct cpu_vec_costs arm_default_vec_cost = {
  1, /* scalar_stmt_cost. */
  1, /* scalar_load_cost. */
  1, /* scalar_store_cost. */
  1, /* vec_stmt_cost. */
  1, /* vec_to_scalar_cost. */
  1, /* scalar_to_vec_cost. */
  1, /* vec_align_load_cost. */
  1, /* vec_unalign_load_cost. */
  1, /* vec_unalign_store_cost. */
  1, /* vec_store_cost. */
  3, /* cond_taken_branch_cost. */
  1, /* cond_not_taken_branch_cost. */
};
Obviously, replacing vec_align_load_cost's value with 2 "fixed" the problem; we
again generate:
movs r2, #1
movs r3, #0
strd r2, r3, [r0]
but that seems like too strong a change (and it probably introduces regressions
elsewhere).
Maybe we could instead pessimize such a vec_load if it implies the creation of
a literal pool entry?