>
> optimize_function_for_speed ()?
>

Yes, here is the updated patch using optimize_function_for_speed_p ().

gcc/ChangeLog:

PR target/105034
* config/i386/i386-features.cc (pass_stv::gate()): Add
  optimize_function_for_speed_p ().

gcc/testsuite/ChangeLog:

PR target/105034
* gcc.target/i386/pr105034.c: New test.
---
 gcc/config/i386/i386-features.cc         |  3 ++-
 gcc/testsuite/gcc.target/i386/pr105034.c | 23 +++++++++++++++++++++++
 2 files changed, 25 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr105034.c

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index 6fe41c3c24f..a49c3aa1525 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -1911,7 +1911,8 @@ public:
   virtual bool gate (function *)
     {
       return ((!timode_p || TARGET_64BIT)
-       && TARGET_STV && TARGET_SSE2 && optimize > 1);
+       && TARGET_STV && TARGET_SSE2 && optimize > 1
+       && optimize_function_for_speed_p (cfun));
     }

   virtual unsigned int execute (function *)
diff --git a/gcc/testsuite/gcc.target/i386/pr105034.c b/gcc/testsuite/gcc.target/i386/pr105034.c
new file mode 100644
index 00000000000..d997e26e9ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr105034.c
@@ -0,0 +1,23 @@
+/* PR target/105034 */
+/* { dg-do compile } */
+/* { dg-options "-Os -msse4.1" } */
+
+#define max(a,b) (((a) > (b))? (a) : (b))
+#define min(a,b) (((a) < (b))? (a) : (b))
+
+int foo(int x)
+{
+  return max(x,0);
+}
+
+int bar(int x)
+{
+  return min(x,0);
+}
+
+unsigned int baz(unsigned int x)
+{
+  return min(x,1);
+}
+
+/* { dg-final { scan-assembler-not "xmm" } } */
-- 
2.18.1

On Thu, Apr 14, 2022 at 14:56, Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
> On Thu, Apr 14, 2022 at 3:18 AM Hongyu Wang via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Hi,
> >
> > From the -Os point of view, stv converts scalar registers to vector mode,
> > which introduces extra register conversions and increases instruction size.
> > Disabling stv under optimize_size avoids this code size increase, and there
> > is no need to touch ix86_size_cost, which has not been tuned for a long
> > time.
> >
> > Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,}.
> >
> > Ok for master?
> >
> > gcc/ChangeLog:
> >
> >         PR target/105034
> >         * config/i386/i386-features.cc (pass_stv::gate()): Block out
> >         optimize_size.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         PR target/105034
> >         * gcc.target/i386/pr105034.c: New test.
> > ---
> >  gcc/config/i386/i386-features.cc         |  3 ++-
> >  gcc/testsuite/gcc.target/i386/pr105034.c | 23 +++++++++++++++++++++++
> >  2 files changed, 25 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr105034.c
> >
> > diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
> > index 6fe41c3c24f..f57281e672f 100644
> > --- a/gcc/config/i386/i386-features.cc
> > +++ b/gcc/config/i386/i386-features.cc
> > @@ -1911,7 +1911,8 @@ public:
> >    virtual bool gate (function *)
> >      {
> >        return ((!timode_p || TARGET_64BIT)
> > -             && TARGET_STV && TARGET_SSE2 && optimize > 1);
> > +             && TARGET_STV && TARGET_SSE2 && optimize > 1
> > +             && !optimize_size);
>
> optimize_function_for_speed ()?
>
> >      }
> >
> >    virtual unsigned int execute (function *)
> > diff --git a/gcc/testsuite/gcc.target/i386/pr105034.c b/gcc/testsuite/gcc.target/i386/pr105034.c
> > new file mode 100644
> > index 00000000000..d997e26e9ed
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr105034.c
> > @@ -0,0 +1,23 @@
> > +/* PR target/105034 */
> > +/* { dg-do compile } */
> > +/* { dg-options "-Os -msse4.1" } */
> > +
> > +#define max(a,b) (((a) > (b))? (a) : (b))
> > +#define min(a,b) (((a) < (b))? (a) : (b))
> > +
> > +int foo(int x)
> > +{
> > +  return max(x,0);
> > +}
> > +
> > +int bar(int x)
> > +{
> > +  return min(x,0);
> > +}
> > +
> > +unsigned int baz(unsigned int x)
> > +{
> > +  return min(x,1);
> > +}
> > +
> > +/* { dg-final { scan-assembler-not "xmm" } } */
> > --
> > 2.18.1
> >
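
For illustration only (not part of the patch): a sketch of a case where
the per-function predicate suggested in the review might behave
differently from a plain !optimize_size check. The GCC manual documents
that a function marked with the cold attribute is optimized for size
rather than speed, so at -O2 such a function would presumably not pass
an optimize_function_for_speed_p () gate, even though -Os was not given
on the command line. The names hot_max, cold_max and check_results are
hypothetical, and whether stv actually converts either function at -O2
is an assumption to verify by inspecting the generated assembly.

```c
/* Illustration only.  Suggested build: gcc -O2 -msse4.1 -S, then
   compare the assembly of hot_max and cold_max.  */
#include <assert.h>

#define max(a,b) (((a) > (b))? (a) : (b))

/* Compiled with the command-line options, i.e. for speed at -O2.  */
int hot_max (int x)
{
  return max (x, 0);
}

/* Marked cold: documented by GCC as optimized for size rather than
   speed, although -Os is not on the command line.  */
__attribute__ ((cold))
int cold_max (int x)
{
  return max (x, 0);
}

/* Both variants must compute the same values; only the generated
   code may differ.  */
void check_results (void)
{
  assert (hot_max (5) == cold_max (5));
  assert (hot_max (-3) == cold_max (-3));
}
```

This only shows how the source-level setup could look; it makes no
claim about the exact instructions emitted for either function.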
