[PATCH applied], Backport PowerPC ISA 3.0 xxperm, builtin, and vneg support to GCC 6.2

Michael Meissner Wed, 01 Jun 2016 16:28:20 -0700

I backported the following trunk patches to the GCC 6.2 branch:


[gcc]
2016-06-01  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        Back port from trunk
        2016-05-23  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        PR target/71201
        * config/rs6000/altivec.md (altivec_vperm_<mode>_internal): Drop
        ISA 3.0 xxperm fusion alternative.
        (altivec_vperm_v8hiv16qi): Likewise.
        (altivec_vperm_<mode>_uns_internal): Likewise.
        (vperm_v8hiv4si): Likewise.
        (vperm_v16qiv8hi): Likewise.

        Back port from trunk
        2016-05-23  Michael Meissner  <meiss...@linux.vnet.ibm.com>
                    Kelvin Nilsen  <kel...@gcc.gnu.org>

        * config/rs6000/rs6000.c (rs6000_expand_vector_set): Generate
        vpermr/xxpermr on ISA 3.0.
        (altivec_expand_vec_perm_le): Likewise.
        * config/rs6000/altivec.md (UNSPEC_VPERMR): New unspec.
        (altivec_vpermr_<mode>_internal): Add VPERMR/XXPERMR support for
        ISA 3.0.

[gcc/testsuite]
2016-06-01  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        Back port from trunk
        2016-05-23  Michael Meissner  <meiss...@linux.vnet.ibm.com>
                    Kelvin Nilsen  <kel...@gcc.gnu.org>

        * gcc.target/powerpc/p9-permute.c: Run test on big endian as well
        as little endian.

        Back port from trunk
        2016-05-23  Michael Meissner  <meiss...@linux.vnet.ibm.com>
                    Kelvin Nilsen  <kel...@gcc.gnu.org>

        * gcc.target/powerpc/p9-vpermr.c: New test for ISA 3.0 vpermr
        support.
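
For readers following the permute change: on targets without ISA 3.0, the
code keeps the older sequence that inverts the selector before issuing vperm.
The block below is a minimal scalar sketch of why a plain bitwise NOT is enough
for that inversion, assuming (as for vperm) that only the low five bits of each
selector byte are used as an index. The function names are hypothetical and
this is an illustration only, not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bitwise inversion of one selector byte, modelling
   what the VNOR/VNAND step does to each byte of the selector vector.  */
uint8_t
invert_selector (uint8_t sel)
{
  return (uint8_t) ~sel;
}

/* Check, for every possible selector byte, that the low five bits of the
   inverted byte equal 31 minus the low five bits of the original byte.
   Assumption: the permute only looks at those five bits.  */
void
check_selector_inversion (void)
{
  for (unsigned int s = 0; s < 256; s++)
    assert ((invert_selector ((uint8_t) s) & 31u) == 31u - (s & 31u));
}
```

Calling check_selector_inversion () runs the check over all 256 byte values.
In other words, inverting the selector mirrors each index across the 32-byte
concatenation of the two inputs, which is the adjustment the pre-ISA 3.0 path
makes by hand.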

[gcc]
2016-06-01  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        Back port from trunk
        2016-05-24  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        * config/rs6000/altivec.md (VParity): New mode iterator for vector
        parity built-in functions.
        (p9v_ctz<mode>2): Add support for ISA 3.0 vector count trailing
        zeros.
        (p9v_parity<mode>2): Likewise.
        * config/rs6000/vector.md (VEC_IP): New mode iterator for vector
        parity.
        (ctz<mode>2): ISA 3.0 expander for vector count trailing zeros.
        (parity<mode>2): ISA 3.0 expander for vector parity.
        * config/rs6000/rs6000-builtin.def (BU_P9_MISC_1): New macros for
        power9 built-ins.
        (BU_P9_64BIT_MISC_0): Likewise.
        (BU_P9_MISC_0): Likewise.
        (BU_P9V_AV_1): Likewise.
        (BU_P9V_AV_2): Likewise.
        (BU_P9V_AV_3): Likewise.
        (BU_P9V_AV_P): Likewise.
        (BU_P9V_VSX_1): Likewise.
        (BU_P9V_OVERLOAD_1): Likewise.
        (BU_P9V_OVERLOAD_2): Likewise.
        (BU_P9V_OVERLOAD_3): Likewise.
        (VCTZB): Add vector count trailing zeros support.
        (VCTZH): Likewise.
        (VCTZW): Likewise.
        (VCTZD): Likewise.
        (VPRTYBD): Add vector parity support.
        (VPRTYBQ): Likewise.
        (VPRTYBW): Likewise.
        (VCTZ): Add overloaded vector count trailing zeros support.
        (VPRTYB): Add overloaded vector parity support.
        * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add
        overloaded vector count trailing zeros and parity instructions.
        * config/rs6000/rs6000.md (wd mode attribute): Add V1TI and TI for
        vector parity support.
        * config/rs6000/altivec.h (vec_vctz): Add ISA 3.0 vector count
        trailing zeros support.
        (vec_cntlz): Likewise.
        (vec_vctzb): Likewise.
        (vec_vctzd): Likewise.
        (vec_vctzh): Likewise.
        (vec_vctzw): Likewise.
        (vec_vprtyb): Add ISA 3.0 vector parity support.
        (vec_vprtybd): Likewise.
        (vec_vprtybw): Likewise.
        (vec_vprtybq): Likewise.
        * doc/extend.texi (PowerPC AltiVec Built-in Functions): Document
        the ISA 3.0 vector count trailing zeros and vector parity built-in
        functions.

[gcc/testsuite]
2016-06-01  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        Back port from trunk
        2016-05-24  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        * gcc.target/powerpc/p9-vparity.c: New file to check ISA 3.0
        vector parity built-in functions.
        * gcc.target/powerpc/ctz-3.c: New file to check ISA 3.0 vector
        count trailing zeros automatic vectorization.
        * gcc.target/powerpc/ctz-4.c: New file to check ISA 3.0 vector
        count trailing zeros built-in functions.
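
As a rough illustration of what the word-sized count trailing zeros operation
computes, here is a portable scalar reference model in C. It is a sketch under
stated assumptions, not the implementation in the patch: the names ctz32_ref
and vctzw_ref are hypothetical, and the model assumes that an all-zero element
yields the element width (32).

```c
#include <stdint.h>

/* Hypothetical scalar reference: number of trailing zero bits in X.
   Assumption: a zero input yields 32, the element width.  */
unsigned int
ctz32_ref (uint32_t x)
{
  unsigned int n = 0;

  if (x == 0)
    return 32;

  while ((x & 1u) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Hypothetical model of a four-word vector operation: apply the scalar
   count independently to each 32-bit element.  */
void
vctzw_ref (const uint32_t in[4], uint32_t out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = ctz32_ref (in[i]);
}
```

A model like this can be useful for comparing against the results of the
built-in functions when running on hardware that supports them.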

[gcc]
2016-06-01  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        Back port from trunk
        2016-05-24  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        * config/rs6000/altivec.md (VNEG iterator): New iterator for
        VNEGW/VNEGD instructions.
        (p9_neg<mode>2): New insns for ISA 3.0 VNEGW/VNEGD.
        (neg<mode>2): Add expander for V2DImode added in ISA 2.07, and
        support for ISA 3.0 VNEGW/VNEGD instructions.

[gcc/testsuite]
2016-06-01  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        Back port from trunk
        2016-05-24  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        * gcc.target/powerpc/p9-vneg.c: New test for ISA 3.0 VNEGW/VNEGD
        instructions.


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c  (revision 236958)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -6853,21 +6853,29 @@ rs6000_expand_vector_set (rtx target, rt
                        gen_rtvec (3, target, reg,
                                   force_reg (V16QImode, x)),
                        UNSPEC_VPERM);
-  else 
+  else
     {
-      /* Invert selector.  We prefer to generate VNAND on P8 so
-         that future fusion opportunities can kick in, but must
-         generate VNOR elsewhere.  */
-      rtx notx = gen_rtx_NOT (V16QImode, force_reg (V16QImode, x));
-      rtx iorx = (TARGET_P8_VECTOR
-                 ? gen_rtx_IOR (V16QImode, notx, notx)
-                 : gen_rtx_AND (V16QImode, notx, notx));
-      rtx tmp = gen_reg_rtx (V16QImode);
-      emit_insn (gen_rtx_SET (tmp, iorx));
-
-      /* Permute with operands reversed and adjusted selector.  */
-      x = gen_rtx_UNSPEC (mode, gen_rtvec (3, reg, target, tmp),
-                         UNSPEC_VPERM);
+      if (TARGET_P9_VECTOR)
+       x = gen_rtx_UNSPEC (mode,
+                           gen_rtvec (3, target, reg,
+                                      force_reg (V16QImode, x)),
+                           UNSPEC_VPERMR);
+      else
+       {
+         /* Invert selector.  We prefer to generate VNAND on P8 so
+            that future fusion opportunities can kick in, but must
+            generate VNOR elsewhere.  */
+         rtx notx = gen_rtx_NOT (V16QImode, force_reg (V16QImode, x));
+         rtx iorx = (TARGET_P8_VECTOR
+                     ? gen_rtx_IOR (V16QImode, notx, notx)
+                     : gen_rtx_AND (V16QImode, notx, notx));
+         rtx tmp = gen_reg_rtx (V16QImode);
+         emit_insn (gen_rtx_SET (tmp, iorx));
+
+         /* Permute with operands reversed and adjusted selector.  */
+         x = gen_rtx_UNSPEC (mode, gen_rtvec (3, reg, target, tmp),
+                             UNSPEC_VPERM);
+       }
     }
 
   emit_insn (gen_rtx_SET (target, x));
@@ -34128,17 +34136,25 @@ altivec_expand_vec_perm_le (rtx operands
   if (!REG_P (target))
     tmp = gen_reg_rtx (mode);
 
-  /* Invert the selector with a VNAND if available, else a VNOR.
-     The VNAND is preferred for future fusion opportunities.  */
-  notx = gen_rtx_NOT (V16QImode, sel);
-  iorx = (TARGET_P8_VECTOR
-         ? gen_rtx_IOR (V16QImode, notx, notx)
-         : gen_rtx_AND (V16QImode, notx, notx));
-  emit_insn (gen_rtx_SET (norreg, iorx));
+  if (TARGET_P9_VECTOR)
+    {
+      unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op0, op1, sel),
+                              UNSPEC_VPERMR);
+    }
+  else
+    {
+      /* Invert the selector with a VNAND if available, else a VNOR.
+        The VNAND is preferred for future fusion opportunities.  */
+      notx = gen_rtx_NOT (V16QImode, sel);
+      iorx = (TARGET_P8_VECTOR
+             ? gen_rtx_IOR (V16QImode, notx, notx)
+             : gen_rtx_AND (V16QImode, notx, notx));
+      emit_insn (gen_rtx_SET (norreg, iorx));
 
-  /* Permute with operands reversed and adjusted selector.  */
-  unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, norreg),
-                          UNSPEC_VPERM);
+      /* Permute with operands reversed and adjusted selector.  */
+      unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, norreg),
+                              UNSPEC_VPERM);
+    }
 
   /* Copy into target, possibly by way of a register.  */
   if (!REG_P (target))
Index: gcc/config/rs6000/altivec.md
===================================================================
--- gcc/config/rs6000/altivec.md        (revision 236941)
+++ gcc/config/rs6000/altivec.md        (working copy)
@@ -58,6 +58,7 @@ (define_c_enum "unspec"
    UNSPEC_VSUM2SWS
    UNSPEC_VSUMSWS
    UNSPEC_VPERM
+   UNSPEC_VPERMR
    UNSPEC_VPERM_UNS
    UNSPEC_VRFIN
    UNSPEC_VCFUX
@@ -1949,32 +1950,30 @@ (define_expand "altivec_vperm_<mode>"
 
 ;; Slightly prefer vperm, since the target does not overlap the source
 (define_insn "*altivec_vperm_<mode>_internal"
-  [(set (match_operand:VM 0 "register_operand" "=v,?wo,?&wo")
-       (unspec:VM [(match_operand:VM 1 "register_operand" "v,0,wo")
-                   (match_operand:VM 2 "register_operand" "v,wo,wo")
-                   (match_operand:V16QI 3 "register_operand" "v,wo,wo")]
+  [(set (match_operand:VM 0 "register_operand" "=v,?wo")
+       (unspec:VM [(match_operand:VM 1 "register_operand" "v,0")
+                   (match_operand:VM 2 "register_operand" "v,wo")
+                   (match_operand:V16QI 3 "register_operand" "v,wo")]
                   UNSPEC_VPERM))]
   "TARGET_ALTIVEC"
   "@
    vperm %0,%1,%2,%3
-   xxperm %x0,%x2,%x3
-   xxlor %x0,%x1,%x1\t\t# xxperm fusion\;xxperm %x0,%x2,%x3"
+   xxperm %x0,%x2,%x3"
   [(set_attr "type" "vecperm")
-   (set_attr "length" "4,4,8")])
+   (set_attr "length" "4")])
 
 (define_insn "altivec_vperm_v8hiv16qi"
-  [(set (match_operand:V16QI 0 "register_operand" "=v,?wo,?&wo")
-       (unspec:V16QI [(match_operand:V8HI 1 "register_operand" "v,0,wo")
-                      (match_operand:V8HI 2 "register_operand" "v,wo,wo")
-                      (match_operand:V16QI 3 "register_operand" "v,wo,wo")]
+  [(set (match_operand:V16QI 0 "register_operand" "=v,?wo")
+       (unspec:V16QI [(match_operand:V8HI 1 "register_operand" "v,0")
+                      (match_operand:V8HI 2 "register_operand" "v,wo")
+                      (match_operand:V16QI 3 "register_operand" "v,wo")]
                   UNSPEC_VPERM))]
   "TARGET_ALTIVEC"
   "@
    vperm %0,%1,%2,%3
-   xxperm %x0,%x2,%x3
-   xxlor %x0,%x1,%x1\t\t# xxperm fusion\;xxperm %x0,%x2,%x3"
+   xxperm %x0,%x2,%x3"
   [(set_attr "type" "vecperm")
-   (set_attr "length" "4,4,8")])
+   (set_attr "length" "4")])
 
 (define_expand "altivec_vperm_<mode>_uns"
   [(set (match_operand:VM 0 "register_operand" "")
@@ -1992,18 +1991,17 @@ (define_expand "altivec_vperm_<mode>_uns
 })
 
 (define_insn "*altivec_vperm_<mode>_uns_internal"
-  [(set (match_operand:VM 0 "register_operand" "=v,?wo,?&wo")
-       (unspec:VM [(match_operand:VM 1 "register_operand" "v,0,wo")
-                   (match_operand:VM 2 "register_operand" "v,wo,wo")
-                   (match_operand:V16QI 3 "register_operand" "v,wo,wo")]
+  [(set (match_operand:VM 0 "register_operand" "=v,?wo")
+       (unspec:VM [(match_operand:VM 1 "register_operand" "v,0")
+                   (match_operand:VM 2 "register_operand" "v,wo")
+                   (match_operand:V16QI 3 "register_operand" "v,wo")]
                   UNSPEC_VPERM_UNS))]
   "TARGET_ALTIVEC"
   "@
    vperm %0,%1,%2,%3
-   xxperm %x0,%x2,%x3
-   xxlor %x0,%x1,%x1\t\t# xxperm fusion\;xxperm %x0,%x2,%x3"
+   xxperm %x0,%x2,%x3"
   [(set_attr "type" "vecperm")
-   (set_attr "length" "4,4,8")])
+   (set_attr "length" "4")])
 
 (define_expand "vec_permv16qi"
   [(set (match_operand:V16QI 0 "register_operand" "")
@@ -2032,6 +2030,19 @@ (define_expand "vec_perm_constv16qi"
     FAIL;
 })
 
+(define_insn "*altivec_vpermr_<mode>_internal"
+  [(set (match_operand:VM 0 "register_operand" "=v,?wo")
+       (unspec:VM [(match_operand:VM 1 "register_operand" "v,0")
+                   (match_operand:VM 2 "register_operand" "v,wo")
+                   (match_operand:V16QI 3 "register_operand" "v,wo")]
+                  UNSPEC_VPERMR))]
+  "TARGET_P9_VECTOR"
+  "@
+   vpermr %0,%1,%2,%3
+   xxpermr %x0,%x2,%x3"
+  [(set_attr "type" "vecperm")
+   (set_attr "length" "4")])
+
 (define_insn "altivec_vrfip"           ; ceil
   [(set (match_operand:V4SF 0 "register_operand" "=v")
         (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "v")]
@@ -2791,32 +2802,30 @@ (define_expand "vec_unpacks_lo_<VP_small
   "")
 
 (define_insn "vperm_v8hiv4si"
-  [(set (match_operand:V4SI 0 "register_operand" "=v,?wo,?&wo")
-        (unspec:V4SI [(match_operand:V8HI 1 "register_operand" "v,0,wo")
-                     (match_operand:V4SI 2 "register_operand" "v,wo,wo")
-                     (match_operand:V16QI 3 "register_operand" "v,wo,wo")]
+  [(set (match_operand:V4SI 0 "register_operand" "=v,?wo")
+        (unspec:V4SI [(match_operand:V8HI 1 "register_operand" "v,0")
+                     (match_operand:V4SI 2 "register_operand" "v,wo")
+                     (match_operand:V16QI 3 "register_operand" "v,wo")]
                   UNSPEC_VPERMSI))]
   "TARGET_ALTIVEC"
   "@
    vperm %0,%1,%2,%3
-   xxperm %x0,%x2,%x3
-   xxlor %x0,%x1,%x1\t\t# xxperm fusion\;xxperm %x0,%x2,%x3"
+   xxperm %x0,%x2,%x3"
   [(set_attr "type" "vecperm")
-   (set_attr "length" "4,4,8")])
+   (set_attr "length" "4")])
 
 (define_insn "vperm_v16qiv8hi"
-  [(set (match_operand:V8HI 0 "register_operand" "=v,?wo,?&wo")
-        (unspec:V8HI [(match_operand:V16QI 1 "register_operand" "v,0,wo")
-                     (match_operand:V8HI 2 "register_operand" "v,wo,wo")
-                     (match_operand:V16QI 3 "register_operand" "v,wo,wo")]
+  [(set (match_operand:V8HI 0 "register_operand" "=v,?wo")
+        (unspec:V8HI [(match_operand:V16QI 1 "register_operand" "v,0")
+                     (match_operand:V8HI 2 "register_operand" "v,wo")
+                     (match_operand:V16QI 3 "register_operand" "v,wo")]
                   UNSPEC_VPERMHI))]
   "TARGET_ALTIVEC"
   "@
    vperm %0,%1,%2,%3
-   xxperm %x0,%x2,%x3
-   xxlor %x0,%x1,%x1\t\t# xxperm fusion\;xxperm %x0,%x2,%x3"
+   xxperm %x0,%x2,%x3"
   [(set_attr "type" "vecperm")
-   (set_attr "length" "4,4,8")])
+   (set_attr "length" "4")])
 
 
 (define_expand "vec_unpacku_hi_v16qi"
Index: gcc/testsuite/gcc.target/powerpc/p9-permute.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-permute.c       (revision 236941)
+++ gcc/testsuite/gcc.target/powerpc/p9-permute.c       (working copy)
@@ -1,4 +1,4 @@
-/* { dg-do compile { target { powerpc64le-*-* } } } */
+/* { dg-do compile { target { powerpc64*-*-* } } } */
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
 /* { dg-options "-mcpu=power9 -O2" } */
 /* { dg-require-effective-target powerpc_p9vector_ok } */
@@ -17,5 +17,6 @@ permute (vector long long *p, vector lon
   return vec_perm (a, b, mask);
 }
 
+/* expect xxpermr on little-endian, xxperm on big-endian */
 /* { dg-final { scan-assembler    "xxperm" } } */
 /* { dg-final { scan-assembler-not "vperm"  } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-vpermr.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-vpermr.c        (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-vpermr.c        (revision 0)
@@ -0,0 +1,21 @@
+/* { dg-do compile { target { powerpc64le-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-mcpu=power9 -O2" } */
+
+/* Test generation of VPERMR/XXPERMR on ISA 3.0 in little endian.  */
+
+#include <altivec.h>
+
+vector long long
+permute (vector long long *p, vector long long *q, vector unsigned char mask)
+{
+  vector long long a = *p;
+  vector long long b = *q;
+
+  /* Force a, b to be in altivec registers to select vpermr insn.  */
+  __asm__ (" # a: %x0, b: %x1" : "+v" (a), "+v" (b));
+
+  return vec_perm (a, b, mask);
+}
+
+/* { dg-final { scan-assembler "vpermr\|xxpermr" } } */

Index: gcc/config/rs6000/vector.md
===================================================================
--- gcc/config/rs6000/vector.md (revision 236941)
+++ gcc/config/rs6000/vector.md (working copy)
@@ -26,6 +26,13 @@
 ;; Vector int modes
 (define_mode_iterator VEC_I [V16QI V8HI V4SI V2DI])
 
+;; Vector int modes for parity
+(define_mode_iterator VEC_IP [V8HI
+                             V4SI
+                             V2DI
+                             V1TI
+                             (TI "TARGET_VSX_TIMODE")])
+
 ;; Vector float modes
 (define_mode_iterator VEC_F [V4SF V2DF])
 
@@ -738,12 +745,24 @@ (define_expand "clz<mode>2"
        (clz:VEC_I (match_operand:VEC_I 1 "register_operand" "")))]
   "TARGET_P8_VECTOR")
 
+;; Vector count trailing zeros
+(define_expand "ctz<mode>2"
+  [(set (match_operand:VEC_I 0 "register_operand" "")
+       (ctz:VEC_I (match_operand:VEC_I 1 "register_operand" "")))]
+  "TARGET_P9_VECTOR")
+
 ;; Vector population count
 (define_expand "popcount<mode>2"
   [(set (match_operand:VEC_I 0 "register_operand" "")
         (popcount:VEC_I (match_operand:VEC_I 1 "register_operand" "")))]
   "TARGET_P8_VECTOR")
 
+;; Vector parity
+(define_expand "parity<mode>2"
+  [(set (match_operand:VEC_IP 0 "register_operand" "")
+       (parity:VEC_IP (match_operand:VEC_IP 1 "register_operand" "")))]
+  "TARGET_P9_VECTOR")
+
 
 ;; Same size conversions
 (define_expand "float<VEC_int><mode>2"
Index: gcc/config/rs6000/rs6000-c.c
===================================================================
--- gcc/config/rs6000/rs6000-c.c        (revision 236943)
+++ gcc/config/rs6000/rs6000-c.c        (working copy)
@@ -4215,6 +4215,43 @@ const struct altivec_builtin_types altiv
   { P8V_BUILTIN_VEC_VCLZD, P8V_BUILTIN_VCLZD,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0, 0 },
 
+  { P9V_BUILTIN_VEC_VCTZ, P9V_BUILTIN_VCTZB,
+    RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZ, P9V_BUILTIN_VCTZB,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZ, P9V_BUILTIN_VCTZH,
+    RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZ, P9V_BUILTIN_VCTZH,
+    RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZ, P9V_BUILTIN_VCTZW,
+    RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZ, P9V_BUILTIN_VCTZW,
+    RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZ, P9V_BUILTIN_VCTZD,
+    RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZ, P9V_BUILTIN_VCTZD,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VCTZB, P9V_BUILTIN_VCTZB,
+    RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZB, P9V_BUILTIN_VCTZB,
+    RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VCTZH, P9V_BUILTIN_VCTZH,
+    RS6000_BTI_V8HI, RS6000_BTI_V8HI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZH, P9V_BUILTIN_VCTZH,
+    RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VCTZW, P9V_BUILTIN_VCTZW,
+    RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZW, P9V_BUILTIN_VCTZW,
+    RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VCTZD, P9V_BUILTIN_VCTZD,
+    RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0, 0 },
+  { P9V_BUILTIN_VEC_VCTZD, P9V_BUILTIN_VCTZD,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0, 0 },
+
   { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD,
     RS6000_BTI_V16QI, RS6000_BTI_V16QI, 0, 0 },
   { P8V_BUILTIN_VEC_VGBBD, P8V_BUILTIN_VGBBD,
@@ -4344,6 +4381,42 @@ const struct altivec_builtin_types altiv
   { P8V_BUILTIN_VEC_VPOPCNTD, P8V_BUILTIN_VPOPCNTD,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0, 0 },
 
+  { P9V_BUILTIN_VEC_VPRTYB, P9V_BUILTIN_VPRTYBW,
+    RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYB, P9V_BUILTIN_VPRTYBW,
+    RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYB, P9V_BUILTIN_VPRTYBD,
+    RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYB, P9V_BUILTIN_VPRTYBD,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYB, P9V_BUILTIN_VPRTYBQ,
+    RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYB, P9V_BUILTIN_VPRTYBQ,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYB, P9V_BUILTIN_VPRTYBQ,
+    RS6000_BTI_INTTI, RS6000_BTI_INTTI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYB, P9V_BUILTIN_VPRTYBQ,
+    RS6000_BTI_UINTTI, RS6000_BTI_UINTTI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VPRTYBW, P9V_BUILTIN_VPRTYBW,
+    RS6000_BTI_V4SI, RS6000_BTI_V4SI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYBW, P9V_BUILTIN_VPRTYBW,
+    RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VPRTYBD, P9V_BUILTIN_VPRTYBD,
+    RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYBD, P9V_BUILTIN_VPRTYBD,
+    RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0, 0 },
+
+  { P9V_BUILTIN_VEC_VPRTYBQ, P9V_BUILTIN_VPRTYBQ,
+    RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYBQ, P9V_BUILTIN_VPRTYBQ,
+    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYBQ, P9V_BUILTIN_VPRTYBQ,
+    RS6000_BTI_INTTI, RS6000_BTI_INTTI, 0, 0 },
+  { P9V_BUILTIN_VEC_VPRTYBQ, P9V_BUILTIN_VPRTYBQ,
+    RS6000_BTI_UINTTI, RS6000_BTI_UINTTI, 0, 0 },
+
   { P8V_BUILTIN_VEC_VPKUDUM, P8V_BUILTIN_VPKUDUM,
     RS6000_BTI_V4SI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
   { P8V_BUILTIN_VEC_VPKUDUM, P8V_BUILTIN_VPKUDUM,
Index: gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc/config/rs6000/rs6000-builtin.def        (revision 236943)
+++ gcc/config/rs6000/rs6000-builtin.def        (working copy)
@@ -647,8 +647,113 @@
                     | RS6000_BTC_BINARY),                              \
                    CODE_FOR_ ## ICODE)                 /* ICODE */
 
+
+/* Miscellaneous builtins for instructions added in ISA 3.0.  These
+   instructions don't require either the DFP or VSX options, just the basic 
+   ISA 3.0 enablement since they operate on general purpose registers.  */
+#define BU_P9_MISC_1(ENUM, NAME, ATTR, ICODE)                          \
+  RS6000_BUILTIN_1 (MISC_BUILTIN_ ## ENUM,             /* ENUM */      \
+                   "__builtin_" NAME,                  /* NAME */      \
+                   RS6000_BTM_MODULO,                  /* MASK */      \
+                   (RS6000_BTC_ ## ATTR                /* ATTR */      \
+                    | RS6000_BTC_UNARY),                               \
+                   CODE_FOR_ ## ICODE)                 /* ICODE */
+
+/* Miscellaneous builtins for instructions added in ISA 3.0.  These
+   instructions don't require either the DFP or VSX options, just the basic 
+   ISA 3.0 enablement since they operate on general purpose registers,
+   and they require 64-bit addressing.  */
+#define BU_P9_64BIT_MISC_0(ENUM, NAME, ATTR, ICODE)                    \
+  RS6000_BUILTIN_0 (MISC_BUILTIN_ ## ENUM,             /* ENUM */      \
+                   "__builtin_" NAME,                  /* NAME */      \
+                   RS6000_BTM_MODULO                                   \
+                     | RS6000_BTM_64BIT,               /* MASK */      \
+                   (RS6000_BTC_ ## ATTR                /* ATTR */      \
+                    | RS6000_BTC_SPECIAL),                             \
+                   CODE_FOR_ ## ICODE)                 /* ICODE */
+
+/* Miscellaneous builtins for instructions added in ISA 3.0.  These
+   instructions don't require either the DFP or VSX options, just the basic 
+   ISA 3.0 enablement since they operate on general purpose registers.  */
+#define BU_P9_MISC_0(ENUM, NAME, ATTR, ICODE)                      \
+  RS6000_BUILTIN_0 (MISC_BUILTIN_ ## ENUM,             /* ENUM */      \
+                   "__builtin_" NAME,                  /* NAME */      \
+                   RS6000_BTM_MODULO,                  /* MASK */      \
+                   (RS6000_BTC_ ## ATTR                /* ATTR */      \
+                    | RS6000_BTC_SPECIAL),                             \
+                   CODE_FOR_ ## ICODE)                 /* ICODE */
+
+/* ISA 3.0 (power9) vector convenience macros.  */
+/* For the instructions that are encoded as altivec instructions use
+   __builtin_altivec_ as the builtin name.  */
+#define BU_P9V_AV_1(ENUM, NAME, ATTR, ICODE)                           \
+  RS6000_BUILTIN_1 (P9V_BUILTIN_ ## ENUM,              /* ENUM */      \
+                   "__builtin_altivec_" NAME,          /* NAME */      \
+                   RS6000_BTM_P9_VECTOR,               /* MASK */      \
+                   (RS6000_BTC_ ## ATTR                /* ATTR */      \
+                    | RS6000_BTC_UNARY),                               \
+                   CODE_FOR_ ## ICODE)                 /* ICODE */
+
+#define BU_P9V_AV_2(ENUM, NAME, ATTR, ICODE)                           \
+  RS6000_BUILTIN_2 (P9V_BUILTIN_ ## ENUM,              /* ENUM */      \
+                   "__builtin_altivec_" NAME,          /* NAME */      \
+                   RS6000_BTM_P9_VECTOR,               /* MASK */      \
+                   (RS6000_BTC_ ## ATTR                /* ATTR */      \
+                    | RS6000_BTC_BINARY),                              \
+                   CODE_FOR_ ## ICODE)                 /* ICODE */
+
+#define BU_P9V_AV_3(ENUM, NAME, ATTR, ICODE)                           \
+  RS6000_BUILTIN_3 (P9V_BUILTIN_ ## ENUM,              /* ENUM */      \
+                   "__builtin_altivec_" NAME,          /* NAME */      \
+                   RS6000_BTM_P9_VECTOR,               /* MASK */      \
+                   (RS6000_BTC_ ## ATTR                /* ATTR */      \
+                    | RS6000_BTC_TERNARY),                             \
+                   CODE_FOR_ ## ICODE)                 /* ICODE */
+
+#define BU_P9V_AV_P(ENUM, NAME, ATTR, ICODE)                           \
+  RS6000_BUILTIN_P (P9V_BUILTIN_ ## ENUM,              /* ENUM */      \
+                   "__builtin_altivec_" NAME,          /* NAME */      \
+                   RS6000_BTM_P9_VECTOR,               /* MASK */      \
+                   (RS6000_BTC_ ## ATTR                /* ATTR */      \
+                    | RS6000_BTC_PREDICATE),                           \
+                   CODE_FOR_ ## ICODE)                 /* ICODE */
+
+/* For the instructions encoded as VSX instructions use __builtin_vsx as the
+   builtin name.  */
+#define BU_P9V_VSX_1(ENUM, NAME, ATTR, ICODE)                          \
+  RS6000_BUILTIN_1 (P9V_BUILTIN_ ## ENUM,              /* ENUM */      \
+                   "__builtin_vsx_" NAME,              /* NAME */      \
+                   RS6000_BTM_P9_VECTOR,               /* MASK */      \
+                   (RS6000_BTC_ ## ATTR                /* ATTR */      \
+                    | RS6000_BTC_UNARY),                               \
+                   CODE_FOR_ ## ICODE)                 /* ICODE */
+
+#define BU_P9V_OVERLOAD_1(ENUM, NAME)                                  \
+  RS6000_BUILTIN_1 (P9V_BUILTIN_VEC_ ## ENUM,          /* ENUM */      \
+                   "__builtin_vec_" NAME,              /* NAME */      \
+                   RS6000_BTM_P9_VECTOR,               /* MASK */      \
+                   (RS6000_BTC_OVERLOADED              /* ATTR */      \
+                    | RS6000_BTC_UNARY),                               \
+                   CODE_FOR_nothing)                   /* ICODE */
+
+#define BU_P9V_OVERLOAD_2(ENUM, NAME)                                  \
+  RS6000_BUILTIN_2 (P9V_BUILTIN_VEC_ ## ENUM,          /* ENUM */      \
+                   "__builtin_vec_" NAME,              /* NAME */      \
+                   RS6000_BTM_P9_VECTOR,               /* MASK */      \
+                   (RS6000_BTC_OVERLOADED              /* ATTR */      \
+                    | RS6000_BTC_BINARY),                              \
+                   CODE_FOR_nothing)                   /* ICODE */
+
+#define BU_P9V_OVERLOAD_3(ENUM, NAME)                                  \
+  RS6000_BUILTIN_3 (P9V_BUILTIN_VEC_ ## ENUM,          /* ENUM */      \
+                   "__builtin_vec_" NAME,              /* NAME */      \
+                   RS6000_BTM_P9_VECTOR,               /* MASK */      \
+                   (RS6000_BTC_OVERLOADED              /* ATTR */      \
+                    | RS6000_BTC_TERNARY),                             \
+                   CODE_FOR_nothing)                   /* ICODE */
 #endif
 
+
 /* Insure 0 is not a legitimate index.  */
 BU_SPECIAL_X (RS6000_BUILTIN_NONE, NULL, 0, RS6000_BTC_MISC)
 
@@ -1659,6 +1764,26 @@ BU_LDBL128_2 (UNPACK_TF, "unpack_longdou
 BU_P7_MISC_2 (PACK_V1TI,       "pack_vector_int128",   CONST,  packv1ti)
 BU_P7_MISC_2 (UNPACK_V1TI,     "unpack_vector_int128", CONST,  unpackv1ti)
 
+/* 1 argument vector functions added in ISA 3.0 (power9).  */
+BU_P9V_AV_1 (VCTZB,            "vctzb",                CONST,  ctzv16qi2)
+BU_P9V_AV_1 (VCTZH,            "vctzh",                CONST,  ctzv8hi2)
+BU_P9V_AV_1 (VCTZW,            "vctzw",                CONST,  ctzv4si2)
+BU_P9V_AV_1 (VCTZD,            "vctzd",                CONST,  ctzv2di2)
+BU_P9V_AV_1 (VPRTYBD,          "vprtybd",              CONST,  parityv2di2)
+BU_P9V_AV_1 (VPRTYBQ,          "vprtybq",              CONST,  parityv1ti2)
+BU_P9V_AV_1 (VPRTYBW,          "vprtybw",              CONST,  parityv4si2)
+
+/* ISA 3.0 vector overloaded 1 argument functions.  */
+BU_P9V_OVERLOAD_1 (VCTZ,       "vctz")
+BU_P9V_OVERLOAD_1 (VCTZB,      "vctzb")
+BU_P9V_OVERLOAD_1 (VCTZH,      "vctzh")
+BU_P9V_OVERLOAD_1 (VCTZW,      "vctzw")
+BU_P9V_OVERLOAD_1 (VCTZD,      "vctzd")
+BU_P9V_OVERLOAD_1 (VPRTYB,     "vprtyb")
+BU_P9V_OVERLOAD_1 (VPRTYBD,    "vprtybd")
+BU_P9V_OVERLOAD_1 (VPRTYBQ,    "vprtybq")
+BU_P9V_OVERLOAD_1 (VPRTYBW,    "vprtybw")
+
 
 /* 1 argument crypto functions.  */
 BU_CRYPTO_1 (VSBOX,            "vsbox",          CONST, crypto_vsbox)
Index: gcc/config/rs6000/altivec.md
===================================================================
--- gcc/config/rs6000/altivec.md        (revision 236959)
+++ gcc/config/rs6000/altivec.md        (working copy)
@@ -190,6 +190,13 @@ (define_mode_iterator VM2 [V4SI
                           (KF "FLOAT128_VECTOR_P (KFmode)")
                           (TF "FLOAT128_VECTOR_P (TFmode)")])
 
+;; Specific iterator for parity which does not have a byte/half-word form, but
+;; does have a quad word form
+(define_mode_iterator VParity [V4SI
+                              V2DI
+                              V1TI
+                              (TI "TARGET_VSX_TIMODE")])
+
 (define_mode_attr VI_char [(V2DI "d") (V4SI "w") (V8HI "h") (V16QI "b")])
 (define_mode_attr VI_scalar [(V2DI "DI") (V4SI "SI") (V8HI "HI") (V16QI "QI")])
 (define_mode_attr VI_unit [(V16QI "VECTOR_UNIT_ALTIVEC_P (V16QImode)")
@@ -3362,7 +3369,7 @@ (define_expand "vec_unpacku_float_lo_v8h
 }")
 
 
-;; Power8 vector instructions encoded as Altivec instructions
+;; Power8/power9 vector instructions encoded as Altivec instructions
 
 ;; Vector count leading zeros
 (define_insn "*p8v_clz<mode>2"
@@ -3373,6 +3380,15 @@ (define_insn "*p8v_clz<mode>2"
   [(set_attr "length" "4")
    (set_attr "type" "vecsimple")])
 
+;; Vector count trailing zeros
+(define_insn "*p9v_ctz<mode>2"
+  [(set (match_operand:VI2 0 "register_operand" "=v")
+       (ctz:VI2 (match_operand:VI2 1 "register_operand" "v")))]
+  "TARGET_P9_VECTOR"
+  "vctz<wd> %0,%1"
+  [(set_attr "length" "4")
+   (set_attr "type" "vecsimple")])
+
 ;; Vector population count
 (define_insn "*p8v_popcount<mode>2"
   [(set (match_operand:VI2 0 "register_operand" "=v")
@@ -3382,6 +3398,15 @@ (define_insn "*p8v_popcount<mode>2"
   [(set_attr "length" "4")
    (set_attr "type" "vecsimple")])
 
+;; Vector parity
+(define_insn "*p9v_parity<mode>2"
+  [(set (match_operand:VParity 0 "register_operand" "=v")
+        (parity:VParity (match_operand:VParity 1 "register_operand" "v")))]
+  "TARGET_P9_VECTOR"
+  "vprtyb<wd> %0,%1"
+  [(set_attr "length" "4")
+   (set_attr "type" "vecsimple")])
+
 ;; Vector Gather Bits by Bytes by Doubleword
 (define_insn "p8v_vgbbd"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md (revision 236943)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -577,7 +577,9 @@ (define_mode_attr wd [(QI    "b")
                      (V16QI "b")
                      (V8HI  "h")
                      (V4SI  "w")
-                     (V2DI  "d")])
+                     (V2DI  "d")
+                     (V1TI  "q")
+                     (TI    "q")])
 
 ;; How many bits in this mode?
 (define_mode_attr bits [(QI "8") (HI "16") (SI "32") (DI "64")])
Index: gcc/config/rs6000/altivec.h
===================================================================
--- gcc/config/rs6000/altivec.h (revision 236943)
+++ gcc/config/rs6000/altivec.h (working copy)
@@ -384,6 +384,23 @@
 #define vec_vupklsw __builtin_vec_vupklsw
 #endif
 
+#ifdef _ARCH_PWR9
+/* Vector additions added in ISA 3.0.  */
+#define vec_vctz __builtin_vec_vctz
+#define vec_cntlz __builtin_vec_vctz
+#define vec_vctzb __builtin_vec_vctzb
+#define vec_vctzd __builtin_vec_vctzd
+#define vec_vctzh __builtin_vec_vctzh
+#define vec_vctzw __builtin_vec_vctzw
+#define vec_vprtyb __builtin_vec_vprtyb
+#define vec_vprtybd __builtin_vec_vprtybd
+#define vec_vprtybw __builtin_vec_vprtybw
+
+#ifdef _ARCH_PPC64
+#define vec_vprtybq __builtin_vec_vprtybq
+#endif
+#endif
+
 /* Predicates.
    For C++, we use templates in order to allow non-parenthesized arguments.
    For C, instead, we use macros since non-parenthesized arguments were
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi (revision 236943)
+++ gcc/doc/extend.texi (working copy)
@@ -16452,6 +16452,60 @@ int __builtin_bcdsub_gt (vector __int128
 int __builtin_bcdsub_ov (vector __int128_t, vector__int128_t);
 @end smallexample
 
+If the ISA 3.00 additions to the vector/scalar (power9-vector)
+instruction set are available:
+
+@smallexample
+vector long long vec_vctz (vector long long);
+vector unsigned long long vec_vctz (vector unsigned long long);
+vector int vec_vctz (vector int);
+vector unsigned int vec_vctz (vector unsigned int);
+vector short vec_vctz (vector short);
+vector unsigned short vec_vctz (vector unsigned short);
+vector signed char vec_vctz (vector signed char);
+vector unsigned char vec_vctz (vector unsigned char);
+
+vector signed char vec_vctzb (vector signed char);
+vector unsigned char vec_vctzb (vector unsigned char);
+
+vector long long vec_vctzd (vector long long);
+vector unsigned long long vec_vctzd (vector unsigned long long);
+
+vector short vec_vctzh (vector short);
+vector unsigned short vec_vctzh (vector unsigned short);
+
+vector int vec_vctzw (vector int);
+vector unsigned int vec_vctzw (vector unsigned int);
+
+vector int vec_vprtyb (vector int);
+vector unsigned int vec_vprtyb (vector unsigned int);
+vector long long vec_vprtyb (vector long long);
+vector unsigned long long vec_vprtyb (vector unsigned long long);
+
+vector int vec_vprtybw (vector int);
+vector unsigned int vec_vprtybw (vector unsigned int);
+
+vector long long vec_vprtybd (vector long long);
+vector unsigned long long vec_vprtybd (vector unsigned long long);
+@end smallexample
+
+
+If the ISA 3.00 additions to the vector/scalar (power9-vector)
+instruction set are available for 64-bit targets:
+
+@smallexample
+vector long vec_vprtyb (vector long);
+vector unsigned long vec_vprtyb (vector unsigned long);
+vector __int128_t vec_vprtyb (vector __int128_t);
+vector __uint128_t vec_vprtyb (vector __uint128_t);
+
+vector long vec_vprtybd (vector long);
+vector unsigned long vec_vprtybd (vector unsigned long);
+
+vector __int128_t vec_vprtybq (vector __int128_t);
+vector __uint128_t vec_vprtybq (vector __uint128_t);
+@end smallexample
+
 If the cryptographic instructions are enabled (@option{-mcrypto} or
 @option{-mcpu=power8}), the following builtins are enabled.
 
Index: gcc/testsuite/gcc.target/powerpc/p9-vparity.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-vparity.c       (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-vparity.c       (revision 0)
@@ -0,0 +1,107 @@
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-mcpu=power9 -O2 -mlra -mvsx-timode" } */
+
+#include <altivec.h>
+
+vector int
+parity_v4si_1s (vector int a)
+{
+  return vec_vprtyb (a);
+}
+
+vector int
+parity_v4si_2s (vector int a)
+{
+  return vec_vprtybw (a);
+}
+
+vector unsigned int
+parity_v4si_1u (vector unsigned int a)
+{
+  return vec_vprtyb (a);
+}
+
+vector unsigned int
+parity_v4si_2u (vector unsigned int a)
+{
+  return vec_vprtybw (a);
+}
+
+vector long long
+parity_v2di_1s (vector long long a)
+{
+  return vec_vprtyb (a);
+}
+
+vector long long
+parity_v2di_2s (vector long long a)
+{
+  return vec_vprtybd (a);
+}
+
+vector unsigned long long
+parity_v2di_1u (vector unsigned long long a)
+{
+  return vec_vprtyb (a);
+}
+
+vector unsigned long long
+parity_v2di_2u (vector unsigned long long a)
+{
+  return vec_vprtybd (a);
+}
+
+vector __int128_t
+parity_v1ti_1s (vector __int128_t a)
+{
+  return vec_vprtyb (a);
+}
+
+vector __int128_t
+parity_v1ti_2s (vector __int128_t a)
+{
+  return vec_vprtybq (a);
+}
+
+__int128_t
+parity_ti_3s (__int128_t a)
+{
+  return vec_vprtyb (a);
+}
+
+__int128_t
+parity_ti_4s (__int128_t a)
+{
+  return vec_vprtybq (a);
+}
+
+vector __uint128_t
+parity_v1ti_1u (vector __uint128_t a)
+{
+  return vec_vprtyb (a);
+}
+
+vector __uint128_t
+parity_v1ti_2u (vector __uint128_t a)
+{
+  return vec_vprtybq (a);
+}
+
+__uint128_t
+parity_ti_3u (__uint128_t a)
+{
+  return vec_vprtyb (a);
+}
+
+__uint128_t
+parity_ti_4u (__uint128_t a)
+{
+  return vec_vprtybq (a);
+}
+
+/* { dg-final { scan-assembler "vprtybd" } } */
+/* { dg-final { scan-assembler "vprtybq" } } */
+/* { dg-final { scan-assembler "vprtybw" } } */
Index: gcc/testsuite/gcc.target/powerpc/ctz-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ctz-3.c    (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ctz-3.c    (revision 0)
@@ -0,0 +1,62 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-mcpu=power9 -O2 -ftree-vectorize -fvect-cost-model=dynamic -fno-unroll-loops -fno-unroll-all-loops" } */
+
+#ifndef SIZE
+#define SIZE 1024
+#endif
+
+#ifndef ALIGN
+#define ALIGN 32
+#endif
+
+#define ALIGN_ATTR __attribute__((__aligned__(ALIGN)))
+
+#define DO_BUILTIN(PREFIX, TYPE, CTZ)                                  \
+TYPE PREFIX ## _a[SIZE] ALIGN_ATTR;                                    \
+TYPE PREFIX ## _b[SIZE] ALIGN_ATTR;                                    \
+                                                                       \
+void                                                                   \
+PREFIX ## _ctz (void)                                                  \
+{                                                                      \
+  unsigned long i;                                                     \
+                                                                       \
+  for (i = 0; i < SIZE; i++)                                           \
+    PREFIX ## _a[i] = CTZ (PREFIX ## _b[i]);                           \
+}
+
+#if !defined(DO_LONG_LONG) && !defined(DO_LONG) && !defined(DO_INT) && !defined(DO_SHORT) && !defined(DO_CHAR)
+#define DO_INT 1
+#endif
+
+#if DO_LONG_LONG
+/* At the moment, only int is auto vectorized.  */
+DO_BUILTIN (sll, long long,            __builtin_ctzll)
+DO_BUILTIN (ull, unsigned long long,   __builtin_ctzll)
+#endif
+
+#if defined(_ARCH_PPC64) && DO_LONG
+DO_BUILTIN (sl,  long,                 __builtin_ctzl)
+DO_BUILTIN (ul,  unsigned long,                __builtin_ctzl)
+#endif
+
+#if DO_INT
+DO_BUILTIN (si,  int,                  __builtin_ctz)
+DO_BUILTIN (ui,  unsigned int,         __builtin_ctz)
+#endif
+
+#if DO_SHORT
+DO_BUILTIN (ss,  short,                        __builtin_ctz)
+DO_BUILTIN (us,  unsigned short,       __builtin_ctz)
+#endif
+
+#if DO_CHAR
+DO_BUILTIN (sc,  signed char,          __builtin_ctz)
+DO_BUILTIN (uc,  unsigned char,                __builtin_ctz)
+#endif
+
+/* { dg-final { scan-assembler-times "vctzw" 2 } } */
+/* { dg-final { scan-assembler-not "cnttzd" } } */
+/* { dg-final { scan-assembler-not "cnttzw" } } */
Index: gcc/testsuite/gcc.target/powerpc/ctz-4.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ctz-4.c    (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ctz-4.c    (revision 0)
@@ -0,0 +1,110 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-mcpu=power9 -O2" } */
+
+#include <altivec.h>
+
+vector signed char
+count_trailing_zeros_v16qi_1s (vector signed char a)
+{
+  return vec_vctz (a);
+}
+
+vector signed char
+count_trailing_zeros_v16qi_2s (vector signed char a)
+{
+  return vec_vctzb (a);
+}
+
+vector unsigned char
+count_trailing_zeros_v16qi_1u (vector unsigned char a)
+{
+  return vec_vctz (a);
+}
+
+vector unsigned char
+count_trailing_zeros_v16qi_2u (vector unsigned char a)
+{
+  return vec_vctzb (a);
+}
+
+vector short
+count_trailing_zeros_v8hi_1s (vector short a)
+{
+  return vec_vctz (a);
+}
+
+vector short
+count_trailing_zeros_v8hi_2s (vector short a)
+{
+  return vec_vctzh (a);
+}
+
+vector unsigned short
+count_trailing_zeros_v8hi_1u (vector unsigned short a)
+{
+  return vec_vctz (a);
+}
+
+vector unsigned short
+count_trailing_zeros_v8hi_2u (vector unsigned short a)
+{
+  return vec_vctzh (a);
+}
+
+vector int
+count_trailing_zeros_v4si_1s (vector int a)
+{
+  return vec_vctz (a);
+}
+
+vector int
+count_trailing_zeros_v4si_2s (vector int a)
+{
+  return vec_vctzw (a);
+}
+
+vector unsigned int
+count_trailing_zeros_v4si_1u (vector unsigned int a)
+{
+  return vec_vctz (a);
+}
+
+vector unsigned int
+count_trailing_zeros_v4si_2u (vector unsigned int a)
+{
+  return vec_vctzw (a);
+}
+
+vector long long
+count_trailing_zeros_v2di_1s (vector long long a)
+{
+  return vec_vctz (a);
+}
+
+vector long long
+count_trailing_zeros_v2di_2s (vector long long a)
+{
+  return vec_vctzd (a);
+}
+
+vector unsigned long long
+count_trailing_zeros_v2di_1u (vector unsigned long long a)
+{
+  return vec_vctz (a);
+}
+
+vector unsigned long long
+count_trailing_zeros_v2di_2u (vector unsigned long long a)
+{
+  return vec_vctzd (a);
+}
+
+/* { dg-final { scan-assembler "vctzb" } } */
+/* { dg-final { scan-assembler "vctzd" } } */
+/* { dg-final { scan-assembler "vctzh" } } */
+/* { dg-final { scan-assembler "vctzw" } } */
+/* { dg-final { scan-assembler-not "cnttzd" } } */
+/* { dg-final { scan-assembler-not "cnttzw" } } */
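
For readers who want to sanity-check results from the new count trailing zeros
builtins, here is a minimal scalar sketch of what I would expect vctzw to
compute for each word element.  It is not part of the patch.  The names
ref_ctz32 and ref_vctzw are hypothetical, and the assumption that a zero
element yields the element width (32) is my reading of ISA 3.0, so please
check it against the ISA document before relying on it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference helper, not part of GCC: count the trailing zero
   bits of one 32-bit element.  Assumption: a zero input yields 32.  */
static unsigned
ref_ctz32 (uint32_t x)
{
  unsigned n = 0;

  if (x == 0)
    return 32;

  while ((x & 1) == 0)
    {
      x >>= 1;
      n++;
    }

  return n;
}

/* Hypothetical scalar model of a four-word vector: apply ref_ctz32 to each
   element independently.  */
static void
ref_vctzw (uint32_t dst[4], const uint32_t src[4])
{
  size_t i;

  for (i = 0; i < 4; i++)
    dst[i] = ref_ctz32 (src[i]);
}
```

Note that the scalar __builtin_ctz is undefined for a zero argument, which is
why the sketch handles zero explicitly instead of calling it.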

Index: gcc/config/rs6000/altivec.md
===================================================================
--- gcc/config/rs6000/altivec.md        (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)    (revision 236663)
+++ gcc/config/rs6000/altivec.md        (.../gcc/config/rs6000) (working copy)
@@ -207,6 +207,9 @@ (define_mode_attr VP_small [(V2DI "V4SI"
 (define_mode_attr VP_small_lc [(V2DI "v4si") (V4SI "v8hi") (V8HI "v16qi")])
 (define_mode_attr VU_char [(V2DI "w") (V4SI "h") (V8HI "b")])
 
+;; Vector negate
+(define_mode_iterator VNEG [V4SI V2DI])
+
 ;; Vector move instructions.
 (define_insn "*altivec_mov<mode>"
   [(set (match_operand:VM2 0 "nonimmediate_operand" "=Z,v,v,*Y,*r,*r,v,v,*r")
@@ -2754,20 +2757,28 @@ (define_expand "reduc_plus_scal_<mode>"
   DONE;
 })
 
+(define_insn "*p9_neg<mode>2"
+  [(set (match_operand:VNEG 0 "altivec_register_operand" "=v")
+       (neg:VNEG (match_operand:VNEG 1 "altivec_register_operand" "v")))]
+  "TARGET_P9_VECTOR"
+  "vneg<VI_char> %0,%1"
+  [(set_attr "type" "vecsimple")])
+
 (define_expand "neg<mode>2"
-  [(use (match_operand:VI 0 "register_operand" ""))
-   (use (match_operand:VI 1 "register_operand" ""))]
-  "TARGET_ALTIVEC"
-  "
+  [(set (match_operand:VI2 0 "register_operand" "")
+       (neg:VI2 (match_operand:VI2 1 "register_operand" "")))]
+  "<VI_unit>"
 {
-  rtx vzero;
+  if (!TARGET_P9_VECTOR || (<MODE>mode != V4SImode && <MODE>mode != V2DImode))
+    {
+      rtx vzero;
 
-  vzero = gen_reg_rtx (GET_MODE (operands[0]));
-  emit_insn (gen_altivec_vspltis<VI_char> (vzero, const0_rtx));
-  emit_insn (gen_sub<mode>3 (operands[0], vzero, operands[1])); 
-  
-  DONE;
-}")
+      vzero = gen_reg_rtx (GET_MODE (operands[0]));
+      emit_move_insn (vzero, CONST0_RTX (<MODE>mode));
+      emit_insn (gen_sub<mode>3 (operands[0], vzero, operands[1]));
+      DONE;
+    }
+})
 
 (define_expand "udot_prod<mode>"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
Index: gcc/testsuite/gcc.target/powerpc/p9-vneg.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-vneg.c  (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)     (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-vneg.c  (.../gcc/testsuite/gcc.target/powerpc)  (revision 236665)
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { powerpc64*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2" } */
+
+/* Verify P9 vector negate instructions.  */
+
+vector long long v2di_neg (vector long long a) { return -a; }
+vector int v4si_neg (vector int a) { return -a; }
+
+/* { dg-final { scan-assembler "vnegd" } } */
+/* { dg-final { scan-assembler "vnegw" } } */
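
When ISA 3.0 is not available, or the mode is not V4SI/V2DI, the neg<mode>2
expander above falls back to building a zero vector and subtracting the
operand from it.  A rough scalar sketch of that fallback for four word
elements follows.  The name ref_vneg_v4si is hypothetical, and the lanes are
modelled as unsigned so that the wrap-around is well defined in C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the fallback path: dst = 0 - src for each
   element.  Unsigned lanes give modulo 2^32 arithmetic, so the result is the
   two's complement negation of each element's bit pattern.  */
static void
ref_vneg_v4si (uint32_t dst[4], const uint32_t src[4])
{
  size_t i;

  for (i = 0; i < 4; i++)
    dst[i] = 0u - src[i];
}
```

On power9 I would expect a single vnegw (or vnegd for V2DI) in place of the
zero/subtract pair, which is what the scan-assembler directives in p9-vneg.c
look for.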
