I realized that ARM uses the generic memmove() implementation, which is
rather slow. This series adds an assembler-optimized version for ARM.
The corresponding recent Linux code no longer fits into barebox as-is,
so the surrounding code has to be updated first before it can be merged.
Hence the series is bigger than I would like it to be.

Sascha

Signed-off-by: Sascha Hauer <s.ha...@pengutronix.de>
---
Sascha Hauer (10):
      ARM: Use optimized reads[bwl] and writes[bwl] functions
      ARM: rename logical shift macros push pull into lspush lspull
      ARM: convert all "mov.* pc, reg" to "bx reg" for ARMv6+
      ARM: update lib1funcs.S from Linux
      ARM: update findbit.S from Linux
      ARM: update io-* from Linux
      ARM: always assume the unified syntax for assembly code
      ARM: update memcpy.S and memset.S from Linux
      lib/string.c: export non optimized memmove as __default_memmove
      ARM: add optimized memmove

 arch/arm/Kconfig                  |   4 -
 arch/arm/Makefile                 |   3 +
 arch/arm/cpu/cache-armv4.S        |  11 +-
 arch/arm/cpu/cache-armv5.S        |  13 +-
 arch/arm/cpu/cache-armv6.S        |  13 +-
 arch/arm/cpu/cache-armv7.S        |   9 +-
 arch/arm/cpu/hyp.S                |   3 +-
 arch/arm/cpu/setupc_32.S          |   7 +-
 arch/arm/cpu/sm_as.S              |   3 +-
 arch/arm/include/asm/assembler.h  |  36 ++++-
 arch/arm/include/asm/cache.h      |   8 ++
 arch/arm/include/asm/io.h         |  24 ++++
 arch/arm/include/asm/string.h     |   4 +-
 arch/arm/include/asm/unified.h    |  75 +----------
 arch/arm/lib32/Makefile           |   1 +
 arch/arm/lib32/ashldi3.S          |   3 +-
 arch/arm/lib32/ashrdi3.S          |   3 +-
 arch/arm/lib32/copy_template.S    |  94 +++++++------
 arch/arm/lib32/findbit.S          | 243 +++++++++++++--------------------
 arch/arm/lib32/io-readsb.S        |  32 ++---
 arch/arm/lib32/io-readsl.S        |  32 ++---
 arch/arm/lib32/io-readsw-armv4.S  |  26 ++--
 arch/arm/lib32/io-writesb.S       |  34 ++---
 arch/arm/lib32/io-writesl.S       |  36 ++---
 arch/arm/lib32/io-writesw-armv4.S |  16 +--
 arch/arm/lib32/lib1funcs.S        |  80 ++++++-----
 arch/arm/lib32/lshrdi3.S          |   3 +-
 arch/arm/lib32/memcpy.S           |  30 +++--
 arch/arm/lib32/memmove.S          | 206 ++++++++++++++++++++++++++++
 arch/arm/lib32/memset.S           |  96 ++++++++-----
 arch/arm/lib32/runtime-offset.S   |   2 +-
 arch/arm/lib64/copy_template.S    |  11 +-
 arch/arm/lib64/memcpy.S           | 274 ++++++++++++++++++++++++++++++++------
 arch/arm/lib64/memset.S           |  18 ++-
 arch/arm/lib64/string.c           |  17 +++
 include/string.h                  |   2 +
 lib/string.c                      |  11 +-
 37 files changed, 954 insertions(+), 529 deletions(-)
---
base-commit: 419ea9350aa083d4a2806a70132129a49a5ecf95
change-id: 20240925-arm-assembly-memmove-8eccb9affa1b

Best regards,
-- 
Sascha Hauer <s.ha...@pengutronix.de>

