Hi Richi,

>>
>> Yes, but you just missed the RC for 12.2 so please wait until after GCC 12.2
>> is released and the branch is open again.  The testcase looks mightly
>> complicated
>> so fallout there might be well possible as well ;)  I suppose it wasn't 
>> possible
>> to craft a simple C testcase after the analysis?
> 
> Thanks for the hints!  Let me give it a try next week and get back to you 
> then.
> 

As you suggested, I constructed one C testcase which has been verified on both 
i386
and ppc64 (failed w/o the patch while passed w/ that).

Is this attached patch ok for trunk?  And also ok for all release branches 
after a
week or so (also after frozen time)?

BR,
Kewen
From 8b63b3025d99a38cc0400ebc8d882cbcaf8a22cc Mon Sep 17 00:00:00 2001
From: Kewen Lin <li...@linux.ibm.com>
Date: Mon, 15 Aug 2022 01:30:48 -0500
Subject: [PATCH] vect: Don't allow vect_emulated_vector_p type in
 vectorizable_call [PR106322]

As PR106322 shows, in some cases for some vector type whose
TYPE_MODE is a scalar integral mode instead of a vector mode,
it's possible to obtain wrong target support information when
querying with the scalar integral mode.  For example, for the
test case in PR106322, on ppc64 32bit vectorizer gets vector
type "vector(2) short unsigned int" for scalar type "short
unsigned int", its mode is SImode instead of V2HImode.  The
target support querying checks umul_highpart optab with SImode
and considers it's supported, then vectorizer further generates
.MULH IFN call for that vector type.  Unfortunately it's wrong
to use SImode support for that vector type multiply highpart
here.

This patch is to teach vectorizable_call analysis not to allow
vect_emulated_vector_p type for both vectype_in and vectype_out
as Richi suggested.

        PR tree-optimization/106322

gcc/ChangeLog:

        * tree-vect-stmts.cc (vectorizable_call): Don't allow
        vect_emulated_vector_p type for both vectype_in and vectype_out.

gcc/testsuite/ChangeLog:

        * gcc.target/i386/pr106322.c: New test.
        * gcc.target/powerpc/pr106322.c: New test.
---
 gcc/testsuite/gcc.target/i386/pr106322.c    | 51 +++++++++++++++++++++
 gcc/testsuite/gcc.target/powerpc/pr106322.c | 50 ++++++++++++++++++++
 gcc/tree-vect-stmts.cc                      |  8 ++++
 3 files changed, 109 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106322.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106322.c

diff --git a/gcc/testsuite/gcc.target/i386/pr106322.c 
b/gcc/testsuite/gcc.target/i386/pr106322.c
new file mode 100644
index 00000000000..31333c5fdcc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106322.c
@@ -0,0 +1,51 @@
+/* { dg-do run } */
+/* { dg-require-effective-target ia32 } */
+/* { dg-options "-O2 -mtune=generic -march=i686" } */
+
+/* As PR106322, verify this can execute well (not abort).  */
+
+#define N 64
+typedef unsigned short int uh;
+typedef unsigned short int uw;
+uh a[N];
+uh b[N];
+uh c[N];
+uh e[N];
+
+__attribute__ ((noipa)) void
+foo ()
+{
+  for (int i = 0; i < N; i++)
+    c[i] = ((uw) b[i] * (uw) a[i]) >> 16;
+}
+
+__attribute__ ((optimize ("-O0"))) void
+init ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      a[i] = (uh) (0x7ABC - 0x5 * i);
+      b[i] = (uh) (0xEAB + 0xF * i);
+      e[i] = ((uw) b[i] * (uw) a[i]) >> 16;
+    }
+}
+
+__attribute__ ((optimize ("-O0"))) void
+check ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      if (c[i] != e[i])
+       __builtin_abort ();
+    }
+}
+
+int
+main ()
+{
+  init ();
+  foo ();
+  check ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/pr106322.c 
b/gcc/testsuite/gcc.target/powerpc/pr106322.c
new file mode 100644
index 00000000000..c05072d3416
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr106322.c
@@ -0,0 +1,50 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mdejagnu-cpu=power4" } */
+
+/* As PR106322, verify this can execute well (not abort).  */
+
+#define N 64
+typedef unsigned short int uh;
+typedef unsigned short int uw;
+uh a[N];
+uh b[N];
+uh c[N];
+uh e[N];
+
+__attribute__ ((noipa)) void
+foo ()
+{
+  for (int i = 0; i < N; i++)
+    c[i] = ((uw) b[i] * (uw) a[i]) >> 16;
+}
+
+__attribute__ ((optimize ("-O0"))) void
+init ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      a[i] = (uh) (0x7ABC - 0x5 * i);
+      b[i] = (uh) (0xEAB + 0xF * i);
+      e[i] = ((uw) b[i] * (uw) a[i]) >> 16;
+    }
+}
+
+__attribute__ ((optimize ("-O0"))) void
+check ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      if (c[i] != e[i])
+       __builtin_abort ();
+    }
+}
+
+int
+main ()
+{
+  init ();
+  foo ();
+  check ();
+
+  return 0;
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index f582d238984..c9dab217f05 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -3423,6 +3423,14 @@ vectorizable_call (vec_info *vinfo,
       return false;
     }
 
+  if (vect_emulated_vector_p (vectype_in) || vect_emulated_vector_p 
(vectype_out))
+  {
+      if (dump_enabled_p ())
+       dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                        "use emulated vector type for call\n");
+      return false;
+  }
+
   /* FORNOW */
   nunits_in = TYPE_VECTOR_SUBPARTS (vectype_in);
   nunits_out = TYPE_VECTOR_SUBPARTS (vectype_out);
-- 
2.27.0

Reply via email to