https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102125
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |target
   Last reconfirmed|                            |2021-08-30
             Target|                            |arm
           Keywords|                            |missed-optimization
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
One common source of missed optimizations is gimple_fold_builtin_memory_op,
which has

      /* If we can perform the copy efficiently with first doing all loads
         and then all stores inline it that way.  Currently efficiently
         means that we can load all the memory into a single integer
         register which is what MOVE_MAX gives us.  */
      src_align = get_pointer_alignment (src);
      dest_align = get_pointer_alignment (dest);
      if (tree_fits_uhwi_p (len)
          && compare_tree_int (len, MOVE_MAX) <= 0
...
          /* If the destination pointer is not aligned we must be able
             to emit an unaligned store.  */
          && (dest_align >= GET_MODE_ALIGNMENT (mode)
              || !targetm.slow_unaligned_access (mode, dest_align)
              || (optab_handler (movmisalign_optab, mode)
                  != CODE_FOR_nothing)))

where here the MOVE_MAX limit (which is 4 on this target) likely applies.
Since we actually do need to perform two loads, the code seems to do what is
intended (but that is of course "bad" for 64-bit copies on 32-bit archs and
likewise for 128-bit copies on 64-bit archs).

By the time memcpy is expanded at the RTL level it is usually too late to
fully elide the stack storage.
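For illustration only (this may not be the PR's exact testcase, and the
function name load64 is made up): a minimal sketch of the kind of copy the
fold above declines on a 32-bit arm target, because len == 8 exceeds
MOVE_MAX == 4, so the memcpy is not folded into a single load/store and can
end up going through a stack temporary.

    #include <string.h>
    #include <stdint.h>

    /* Hypothetical example: an 8-byte copy on 32-bit arm.  The size is
       larger than MOVE_MAX (4), so gimple_fold_builtin_memory_op leaves
       the memcpy call alone instead of inlining it as loads + stores.  */
    uint64_t
    load64 (const void *p)
    {
      uint64_t v;
      memcpy (&v, p, sizeof v);
      return v;
    }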