The following allows switching the x86 target to use the vectorizer
cost comparison mechanism to select between different vector mode
variants of a vectorization. The default is still not to do this,
but the new parameter allows opting in.
Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
For next stage1 I'll probably propose flipping the switch (or not
adding the switch at all). I'll follow up with a report on how
SPEC CPU 2017 behaves with this on vs. off before asking
whether we want this switch for GCC 16 or not (e.g. not if it
only has negative effects).
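
For illustration only (not part of the patch), opting in can be sketched
with the loop from the new testcase. The function below mirrors it; the
command line in the comment is an assumption on my side, since the
testcase itself relies on the testsuite's default vectorizer options
rather than spelling out an optimization level.

```c
/* Sketch of opting in, assuming a compiler built with this patch.
   Hypothetical invocation (flags other than the --param are assumed):

     gcc -O3 --param ix86-vect-compare-costs=1 -fopt-info-vec -c foo.c

   With the parameter set to 1, ix86_autovectorize_vector_modes returns
   VECT_COMPARE_COSTS, so the vectorizer compares the costs of the
   candidate vector modes instead of taking the first one that works.  */

void foo (int *block)
{
  /* Three iterations, each touching two adjacent ints at stride 9.  */
  for (int i = 0; i < 3; ++i)
    {
      int a = block[i*9];
      int b = block[i*9+1];
      block[i*9] = a + 10;
      block[i*9+1] = b + 10;
    }
}
```

The testcase expects the vectorizer dump to report that the loop is
vectorized using 8 byte vectors when the parameter is enabled.
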
PR target/123603
* config/i386/i386.opt (-param=ix86-vect-compare-costs=): Add.
* config/i386/i386.cc (ix86_autovectorize_vector_modes): Honor it.
* doc/invoke.texi (ix86-vect-compare-costs): Document.
* gcc.dg/vect/costmodel/x86_64/costmodel-pr123603.c: New testcase.
---
gcc/config/i386/i386.cc | 2 +-
gcc/config/i386/i386.opt | 4 ++++
gcc/doc/invoke.texi | 3 +++
.../vect/costmodel/x86_64/costmodel-pr123603.c | 15 +++++++++++++++
4 files changed, 23 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr123603.c
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 6bf4af8bbe3..a3d0f7cb649 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -25700,7 +25700,7 @@ ix86_autovectorize_vector_modes (vector_modes *modes, bool all)
if (TARGET_SSE2)
modes->safe_push (V4QImode);
- return 0;
+ return ix86_vect_compare_costs ? VECT_COMPARE_COSTS : 0;
}
/* Implemenation of targetm.vectorize.get_mask_mode. */
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 99bb674812b..ef9efabcff6 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -1249,6 +1249,10 @@ Enable conservative small loop unrolling.
Target Joined UInteger Var(ix86_vect_unroll_limit) Init(4) Param
Limit how much the autovectorizer may unroll a loop.
+-param=ix86-vect-compare-costs=
+Target Joined UInteger Var(ix86_vect_compare_costs) Init(0) IntegerRange(0, 1) Param Optimization
+Whether x86 vectorizer cost modeling compares costs of different vector sizes.
+
mlam=
Target RejectNegative Joined Enum(lam_type) Var(ix86_lam_type) Init(lam_none)
-mlam=[none|u48|u57] Instrument meta data position in user data pointers.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b703b531d75..5092e4ba9ad 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -18213,6 +18213,9 @@ the discovery is aborted.
@item ix86-vect-unroll-limit
Limit how much the autovectorizer may unroll a loop.
+@item ix86-vect-compare-costs
+Whether x86 vectorizer cost modeling compares costs of different vector sizes.
+
@end table
@end table
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr123603.c b/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr123603.c
new file mode 100644
index 00000000000..c074176a7e4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr123603.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-additional-options "--param ix86-vect-compare-costs=1" } */
+
+void foo (int *block)
+{
+ for (int i = 0; i < 3; ++i)
+ {
+ int a = block[i*9];
+ int b = block[i*9+1];
+ block[i*9] = a + 10;
+ block[i*9+1] = b + 10;
+ }
+}
+
+/* { dg-final { scan-tree-dump "optimized: loop vectorized using 8 byte vectors" "vect" } } */
--
2.51.0