[PATCH 2/3] Disable generating load/store vector pairs for block copies.

If the store vector pair instructions are disabled, do not generate block
copies that use load and store vector pair instructions.
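
As a rough illustration only (not code from the patch), the guard that
selects the vector pair path after this change can be modeled as a
standalone predicate.  The TargetFlags struct and the function name
use_vector_pair_copy are hypothetical stand-ins for the target macros
tested in expand_block_move, and align is assumed to be in bits:

```cpp
// Hypothetical model of the target macros tested in expand_block_move.
// These fields are illustrative; the real code tests TARGET_* macros.
struct TargetFlags
{
  bool mma;                     // stands in for TARGET_MMA
  bool block_ops_unaligned_vsx; // TARGET_BLOCK_OPS_UNALIGNED_VSX
  bool block_ops_vector_pair;   // TARGET_BLOCK_OPS_VECTOR_PAIR
  bool store_vector_pair;       // TARGET_STORE_VECTOR_PAIR (new test)
  bool strict_alignment;        // STRICT_ALIGNMENT
};

// Return true if a block copy of BYTES bytes with known alignment ALIGN
// (assumed to be in bits) may use paired vector loads and stores.
bool
use_vector_pair_copy (const TargetFlags &t, unsigned long bytes,
                      unsigned int align)
{
  return (t.mma && t.block_ops_unaligned_vsx
          && t.block_ops_vector_pair
          && t.store_vector_pair
          && bytes >= 32
          && (align >= 256 || !t.strict_alignment));
}
```

With store_vector_pair false the predicate fails for every size, so the
expansion would fall through to the later choices in the existing list
(single unaligned VSX, Altivec, and so on).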

I have built bootstrap compilers and run the regression tests on three
different systems:

    1)  Little endian power10 using the --with-cpu=power10 option.

    2)  Little endian power9 using the --with-cpu=power9 option.

    3)  Big endian power8 using the --with-cpu=power8 option.  On this system,
        both 64-bit and 32-bit code generation was tested.

There were no regressions in the runs.  Can I check this patch into the
trunk?  If there are no changes needed for the backports, can I check this
code into the active branches after a burn-in period?

2022-06-06   Michael Meissner  <meiss...@linux.ibm.com>

gcc/

        * config/rs6000/rs6000-string.cc (expand_block_move): If the store
        vector pair instructions are disabled, do not generate block
        copies using load and store vector pairs.
---
 gcc/config/rs6000/rs6000-string.cc | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-string.cc b/gcc/config/rs6000/rs6000-string.cc
index 59d901ac68d..1b18e043269 100644
--- a/gcc/config/rs6000/rs6000-string.cc
+++ b/gcc/config/rs6000/rs6000-string.cc
@@ -2787,14 +2787,16 @@ expand_block_move (rtx operands[], bool might_overlap)
       rtx src, dest;
       bool move_with_length = false;
 
-      /* Use OOmode for paired vsx load/store.  Use V2DI for single
-        unaligned vsx load/store, for consistency with what other
-        expansions (compare) already do, and so we can use lxvd2x on
-        p8.  Order is VSX pair unaligned, VSX unaligned, Altivec, VSX
-        with length < 16 (if allowed), then gpr load/store.  */
+      /* Use OOmode for paired vsx load/store unless the store vector pair
+        instructions are disabled.  Use V2DI for single unaligned vsx
+        load/store, for consistency with what other expansions (compare)
+        already do, and so we can use lxvd2x on p8.  Order is VSX pair
+        unaligned, VSX unaligned, Altivec, VSX with length < 16 (if allowed),
+        then gpr load/store.  */
 
       if (TARGET_MMA && TARGET_BLOCK_OPS_UNALIGNED_VSX
          && TARGET_BLOCK_OPS_VECTOR_PAIR
+         && TARGET_STORE_VECTOR_PAIR
          && bytes >= 32
          && (align >= 256 || !STRICT_ALIGNMENT))
        {
-- 
2.35.3


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com
