Re: [PATCH v3][PR119702] rs6000: Use vector addition when left shifting by 1

Michael Meissner Tue, 16 Sep 2025 21:02:08 -0700

On Mon, Sep 08, 2025 at 02:54:53PM +0530, Avinash Jayakar wrote:
> Hi,
> 
> This is the third version of the patch proposed for master aiming to fix 
> PR119702. Requesting review of this patch.


Some minor comments inline below.

> The following sequence of assembly in powerpc64le
>       vspltisw 0,1
>       vsld 2,2,0
> is replaced by this
>       vaddudm 2,2,2
> whenever there is a vector left shift by a constant value 1.
> Added the pattern in altivec.md to recognize a vector left shift by a constant
> value, and generate add instructions if constant is 1.
> 
>       PR target/119702

The PR line needs to go after the gcc line and duplicated after the gcc
testsuite line.  The intention is the gcc section will go in the
gcc/ChangeLog file, while the lines after gcc/testsuite go into
gcc/testsuite/ChangeLog.  And typically these lines go after the lines
that explain the differences.

> Changes made in v3: Incorporated review comments made for patch v2
>       1. Indentation fixes in the commit message
>       2. define_insn has the name *altivec_vsl<VI_char>_const_1
>       3. Iterate starting from 0 for checking vector constant = 1 and fixed 
>       source code formatting for the for loop.
>       4. Removed unused macro in pr119702-1.c test file

So something like:

gcc

        PR target/119702
        * config/rs6000/altivec.md: Added a define_insn to recognize left shift
        by constant 1 for vector instructions.
        * config/rs6000/predicates.md(shift_constant_1): Added a predicate for
        detecting if all values in a const_vector are 1.

gcc/testsuite

        PR target/119702
        * gcc.target/powerpc/pr119702-1.c: New test for (check generation of 
        vadd for different integer types)


> 
> ---
>  gcc/config/rs6000/altivec.md                  |  8 +++
>  gcc/config/rs6000/predicates.md               | 11 ++++
>  gcc/testsuite/gcc.target/powerpc/pr119702-1.c | 53 +++++++++++++++++++
>  3 files changed, 72 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr119702-1.c
> 
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index 7edc288a656..e9e9c9dd554 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -2107,6 +2107,14 @@
>    "vsrv %0,%1,%2"
>    [(set_attr "type" "vecsimple")])
>  
> +(define_insn "*altivec_vsl<VI_char>_const_1"
> +  [(set (match_operand:VI2 0 "register_operand" "=v")
> +     (ashift:VI2 (match_operand:VI2 1 "register_operand" "v")
> +                   (match_operand:VI2 2 "shift_constant_1" "")))]
> +  "<VI_unit>"
> +  "vaddu<VI_char>m %0,%1,%1"
> +)
> +
>  (define_insn "*altivec_vsl<VI_char>"
>    [(set (match_operand:VI2 0 "register_operand" "=v")
>          (ashift:VI2 (match_operand:VI2 1 "register_operand" "v")

I would suggest changing shift_constant_1 to vector_constant_1.  That
way it could be used in other contexts.  But shift_constant_1 is fine.

> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
> index 647e89afb6a..50e423d2514 100644
> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -924,6 +924,17 @@
>      }
>  })
>  
> +(define_predicate "shift_constant_1"
> +  (match_code "const_vector")
> +{
> +  unsigned nunits = GET_MODE_NUNITS (mode), i;
> +  for (i = 0; i < nunits; i++) 
> +    if (INTVAL (CONST_VECTOR_ELT(op, i)) != 1)
> +      return 0;
> +  
> +  return 1;
> +})
> +
>  ;; Return 1 if operand is 0.0.
>  (define_predicate "zero_fp_constant"
>    (and (match_code "const_double")

Even though it is 'obvious', you probably should add a comment on what
it does.  Given we are now programming in C++, it may be more C++ like
if you declare 'i' inside the for loop.  So something like:

;; Return 1 if the operand is a vector constant with 1 in all of the elements.
(define_predicate "vector_constant_1"
  (match_code "const_vector")
{
  unsigned nunits = GET_MODE_NUNITS (mode);
  for (unsigned i = 0; i < nunits; i++) 
    if (INTVAL (CONST_VECTOR_ELT(op, i)) != 1)
      return 0;
  
  return 1;
})

> diff --git a/gcc/testsuite/gcc.target/powerpc/pr119702-1.c 
> b/gcc/testsuite/gcc.target/powerpc/pr119702-1.c
> new file mode 100644
> index 00000000000..2f23ae20084
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr119702-1.c
> @@ -0,0 +1,53 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */

Here you need to add an option that says you need the Altivec
instruction set.  On a little endian server system, this will always be
true, but on a big endian system, the default compiler is power4 or
power5 which does not include the Altivec instruction set.

Something like:

/& { dg-do compile } */
/* { dg-options "-O2 -maltivec" } */
/* { dg-require-effective-target powerpc_altivec } */


> +
> +#include <inttypes.h>
> +
> +void lshift1_64(uint64_t *a) {
> +  a[0] <<= 1;
> +  a[1] <<= 1;
> +}
> +
> +void lshift1_32(unsigned int *a) {
> +  a[0] <<= 1;
> +  a[1] <<= 1;
> +  a[2] <<= 1;
> +  a[3] <<= 1;
> +}
> +
> +void lshift1_16(uint16_t *a) {
> +  a[0] <<= 1;
> +  a[1] <<= 1;
> +  a[2] <<= 1;
> +  a[3] <<= 1;
> +  a[4] <<= 1;
> +  a[5] <<= 1;
> +  a[6] <<= 1;
> +  a[7] <<= 1;
> +}
> +
> +void lshift1_8(uint8_t *a) {
> +  a[0] <<= 1;
> +  a[1] <<= 1;
> +  a[2] <<= 1;
> +  a[3] <<= 1;
> +  a[4] <<= 1;
> +  a[5] <<= 1;
> +  a[6] <<= 1;
> +  a[7] <<= 1;
> +  a[8] <<= 1;
> +  a[9] <<= 1;
> +  a[10] <<= 1;
> +  a[11] <<= 1;
> +  a[12] <<= 1;
> +  a[13] <<= 1;
> +  a[14] <<= 1;
> +  a[15] <<= 1;
> +}
> +
> +
> +/* { dg-final { scan-assembler-times {\mvaddudm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvadduwm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvadduhm\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvaddubm\M} 1 } } */

It is probably useful to include a comment saying the test validates
that vaddu<x>m is generated instead of a shift left.

You may also want to add another test that tests using the Altivec
vector types.  Something like (note, you need lp64 to test vector
unsigned long):

/* { dg-do compile { target lp64 } } */
/* { dg-options "-O2 -maltivec" } */
/* { dg-require-effective-target powerpc_altivec } */

/* PR target/119702 -- verify vector shift left by 1 is converted into
   an addu<x>m instruction.  */

vector unsigned long
lshift1_64 (vector unsigned long a)
{
  return a << (vector unsigned long) { 1, 1 };
}

vector unsigned int
lshift1_32 (vector unsigned int a)
{
  return a << (vector unsigned int) { 1, 1, 1, 1 };
}

vector unsigned short
lshift1_16 (vector unsigned short a)
{
  return a << (vector unsigned short) { 1, 1, 1, 1, 1, 1, 1, 1 };
}

vector unsigned char
lshift1_8 (vector unsigned char a)
{
  return a << (vector unsigned char) { 1, 1, 1, 1, 1, 1, 1, 1,
                                       1, 1, 1, 1, 1, 1, 1, 1 };
}

/* { dg-final { scan-assembler-times {\mvaddudm\M} 1 } } */
/* { dg-final { scan-assembler-times {\mvadduwm\M} 1 } } */
/* { dg-final { scan-assembler-times {\mvadduhm\M} 1 } } */
/* { dg-final { scan-assembler-times {\mvaddubm\M} 1 } } */

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: [email protected]

Re: [PATCH v3][PR119702] rs6000: Use vector addition when left shifting by 1

Reply via email to