https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67366
--- Comment #12 from Ramana Radhakrishnan <ramana at gcc dot gnu.org> ---
(In reply to rguent...@suse.de from comment #3)
> On Thu, 27 Aug 2015, rearnsha at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67366
> > 
> > --- Comment #2 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
> > (In reply to Richard Biener from comment #1)
> > > I think this boils down to the fact that memcpy expansion is done too
> > > late and that (with more recent GCC) the "inlining" done on the GIMPLE
> > > level is restricted to !SLOW_UNALIGNED_ACCESS but arm defines
> > > STRICT_ALIGNMENT to 1 unconditionally.
> > 
> > Yep, we have to define STRICT_ALIGNMENT to 1 because not all load
> > instructions work with misaligned addresses (ldm, for example).  The only
> > way to handle misaligned copies is through the movmisalign API.
> 
> Are the movmisalign handled ones reasonably efficient?  That is, more
> efficient than memcpy/memmove?  Then we should experiment with
> 
> Index: gcc/gimple-fold.c
> ===================================================================
> --- gcc/gimple-fold.c   (revision 227252)
> +++ gcc/gimple-fold.c   (working copy)
> @@ -708,7 +708,9 @@ gimple_fold_builtin_memory_op (gimple_st
>               /* If the destination pointer is not aligned we must be able
>                  to emit an unaligned store.  */
>               && (dest_align >= GET_MODE_ALIGNMENT (TYPE_MODE (type))
> -                 || !SLOW_UNALIGNED_ACCESS (TYPE_MODE (type), dest_align)))
> +                 || !SLOW_UNALIGNED_ACCESS (TYPE_MODE (type), dest_align)
> +                 || (optab_handler (movmisalign_optab, TYPE_MODE (type))
> +                     != CODE_FOR_nothing)))
>             {
>               tree srctype = type;
>               tree desttype = type;
> @@ -720,7 +722,10 @@ gimple_fold_builtin_memory_op (gimple_st
>                 srcmem = tem;
>               else if (src_align < GET_MODE_ALIGNMENT (TYPE_MODE (type))
>                        && SLOW_UNALIGNED_ACCESS (TYPE_MODE (type),
> -                                                src_align))
> +                                                src_align)
> +                      && (optab_handler (movmisalign_optab,
> +                                         TYPE_MODE (type))
> +                          == CODE_FOR_nothing))
>                 srcmem = NULL_TREE;
>               if (srcmem)
>                 {

This plus the backend changes to deal with unaligned HImode and SImode
values tested ok on armhf, with only 2 extra failures in strlen-opt-8.c.
Prima facie they appear to be testisms, but it will be fun to handle this
across all architecture levels for the arm target, as unaligned access
support depends on the architecture level.
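For illustration (not part of the patch), the kind of source code affected is a small fixed-size memcpy through pointers of unknown alignment. With the change above, gimple_fold_builtin_memory_op can fold such a call into a single unaligned HImode/SImode load/store on a strict-alignment target that provides a movmisalign pattern, instead of leaving a library call. The helper name `read_u32` is hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A 4-byte copy from a possibly misaligned buffer into a uint32_t.
   On ARM with unaligned access available (e.g. ARMv7), the patched
   fold can turn the memcpy into one unaligned SImode load rather
   than a call to memcpy.  The semantics are identical either way:
   the four bytes at p are copied into v in memory order.  */
uint32_t
read_u32 (const unsigned char *p)
{
  uint32_t v;
  memcpy (&v, p, sizeof v);
  return v;
}
```

Note that the C-level behavior is unchanged by the fold; only the generated code differs, which is why the result below is compared against a plain memcpy rather than an endianness-specific constant.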