on 2024/5/10 17:29, HAO CHEN GUI wrote:
> Hi,
>   This patch enables overlapped by-pieces operations. On rs6000, the
> default move/set/clear ratio is 2, so the overlap only takes effect for
> compare by-pieces.
> 
>   Compared to the previous version, the change is to remove the power8
> requirement from the test case.
> https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651045.html
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no
> regressions. Is it OK for the trunk?

OK, thanks!

BR,
Kewen

> 
> Thanks
> Gui Haochen
> 
> ChangeLog
> rs6000: Enable overlapped by-pieces operations
> 
> This patch enables overlapped by-piece operations by defining
> TARGET_OVERLAP_OP_BY_PIECES_P to true.  On rs6000, default move/set/clear
> ratio is 2.  So the overlap is only enabled with compare by-pieces.
> 
> gcc/
>       * config/rs6000/rs6000.cc (TARGET_OVERLAP_OP_BY_PIECES_P): Define.
> 
> gcc/testsuite/
>       * gcc.target/powerpc/block-cmp-9.c: New.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 117999613d8..e713a1e1d57 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -1776,6 +1776,9 @@ static const scoped_attribute_specs *const rs6000_attribute_table[] =
>  #undef TARGET_CONST_ANCHOR
>  #define TARGET_CONST_ANCHOR 0x8000
> 
> +#undef TARGET_OVERLAP_OP_BY_PIECES_P
> +#define TARGET_OVERLAP_OP_BY_PIECES_P hook_bool_void_true
> +
>  
> 
>  /* Processor table.  */
> diff --git a/gcc/testsuite/gcc.target/powerpc/block-cmp-9.c b/gcc/testsuite/gcc.target/powerpc/block-cmp-9.c
> new file mode 100644
> index 00000000000..f16429c2ffb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/block-cmp-9.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler-not {\ml[hb]z\M} } } */
> +
> +/* Test if by-piece overlap compare is enabled and following case is
> +   implemented by two overlap word loads and compares.  */
> +
> +int foo (const char* s1, const char* s2)
> +{
> +  return __builtin_memcmp (s1, s2, 7) == 0;
> +}

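For readers unfamiliar with the technique, here is a rough C sketch of what
an overlapped 7-byte equality compare amounts to. It is only an illustration
under stated assumptions, not the code GCC emits: load32 and eq7_overlapped
are hypothetical names, and a "word" is taken to be 4 bytes. Instead of
splitting 7 bytes into 4 + 2 + 1 (which would need halfword and byte loads,
the lhz/lbz the test case scans for), the second word load starts at offset
3 and re-reads byte 3.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 4 bytes from P with no alignment assumption.  */
static uint32_t
load32 (const char *p)
{
  uint32_t v;
  memcpy (&v, p, sizeof v);
  return v;
}

/* Illustrative equivalent of __builtin_memcmp (s1, s2, 7) == 0 using two
   overlapping word compares: bytes 0..3 and bytes 3..6.  Byte 3 is
   compared twice, which is harmless for an equality test.  */
int
eq7_overlapped (const char *s1, const char *s2)
{
  return load32 (s1) == load32 (s2)
	 && load32 (s1 + 3) == load32 (s2 + 3);
}
```

The two loads together touch exactly bytes 0 through 6, so nothing beyond the
7 requested bytes is read.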