On Wed, 2020-12-02 at 17:44 +0800, Kewen.Lin via Gcc-patches wrote:
> Hi,
>
> This patch is to use paradoxical subreg instead of
> zero_extend for promoting QI/HI to SI/DI when we
> want to construct one vector with these modes.
> Since we do the gpr->vsx movement and vector merge
> or pack later, the high part is useless and safe to
> use paradoxical subreg. It can avoid useless rlwinms
> generated for signed cases.
>
> Bootstrapped/regtested on powerpc64le-linux-gnu P9.
>
> Is it ok for trunk?
Mostly cosmetic review. comments sprinkled below.
thanks
-Will
>
> BR,
> Kewen
> ------
> gcc/ChangeLog:
>
> * config/rs6000/rs6000.c (rs6000_expand_vector_init): Use
> paradoxical subreg instead of zero_extend for QI/HI promotion
> when doing QI/HI vector init.
A bit long, but OK with me. :-)
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/powerpc/pr96933-1.c: Adjusted to check no rlwinm.
> * gcc.target/powerpc/pr96933-2.c: Likewise.
Ok. (I'd hope a few more extend instructions would be eliminated, but
this only covers the tests that explicitly looked/didn't look for them, so OK).
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index f33fca3982a..9c084b055b8 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -6793,17 +6793,8 @@ rs6000_expand_vector_init (rtx target, rtx vals)
I note that the code changes that follow here are within the code block
if (TARGET_DIRECT_MOVE && (mode == V16QImode || mode == V8HImode))
{
This is implied per the patch description, but not obvious from the context of
the changes here. (OK).
> /* Force the values into word_mode registers. */
> for (i = 0; i < n_elts; i++)
> {
> - rtx tmp = force_reg (GET_MODE_INNER (mode), XVECEXP (vals, 0, i));
> - if (TARGET_POWERPC64)
> - {
> - op[i] = gen_reg_rtx (DImode);
> - emit_insn (gen_zero_extendqidi2 (op[i], tmp));
> - }
> - else
> - {
> - op[i] = gen_reg_rtx (SImode);
> - emit_insn (gen_zero_extendqisi2 (op[i], tmp));
> - }
> + rtx tmp = force_reg (inner_mode, XVECEXP (vals, 0, i));
> + op[i] = simplify_gen_subreg (Pmode, tmp, inner_mode, 0);
> }
>
> /* Take unsigned char big endianness on 64bit as example for below
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr96933-1.c
> b/gcc/testsuite/gcc.target/powerpc/pr96933-1.c
> index 3b63865b3b8..71d72084413 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr96933-1.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr96933-1.c
> @@ -13,3 +13,4 @@
> /* { dg-final { scan-assembler-times {\mvpkudum\M} 12 } } */
> /* { dg-final { scan-assembler-not {\mstb\M} } } */
> /* { dg-final { scan-assembler-not {\msth\M} } } */
> +/* { dg-final { scan-assembler-not {\mrlwinm\M} } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr96933-2.c
> b/gcc/testsuite/gcc.target/powerpc/pr96933-2.c
> index cef8fbd4f35..9fa15125d8d 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr96933-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr96933-2.c
> @@ -13,3 +13,4 @@
> /* { dg-final { scan-assembler-times {\mxxpermdi\M} 4 } } */
> /* { dg-final { scan-assembler-not {\mstb\M} } } */
> /* { dg-final { scan-assembler-not {\msth\M} } } */
> +/* { dg-final { scan-assembler-not {\mrlwinm\M} } } */
Ok.
Thanks
-Will