On 2021/6/12 04:16, Segher Boessenkool wrote:
On Thu, Jun 10, 2021 at 03:11:08PM +0800, Xionghu Luo wrote:
On 2021/6/10 00:24, Segher Boessenkool wrote:
    "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed && !TARGET_P9_VECTOR
     && !altivec_indexed_or_indirect_operand (operands[0], <MODE>mode)"
    [(const_int 0)]
{
    rs6000_emit_le_vsx_permute (operands[1], operands[1], <MODE>mode);
    rs6000_emit_le_vsx_permute (operands[0], operands[1], <MODE>mode);
    rs6000_emit_le_vsx_permute (operands[1], operands[1], <MODE>mode);
    DONE;
})

So it seems like it is only 3 insns in the very unlucky case?  Normally
it will end up as just one simple store?

I am afraid there is no "simple store" for *TImode on P8LE*.  There is only
stxvd2x, which rotates the elements (stvx requires the memory to be aligned,
so it is not a suitable pattern).  Every vsx_le_perm_store_v1ti must therefore
be split into 3 instructions for alternative 0, so it seems incorrect to force
the cost to be 4.

Often it could be done as just two insns though?  If the value stored is
not used elsewhere?

So we could make the first alternative cost 8 then as well, which will
also work out for combine, right?

Alternatively we could have what is now the second alternative be the
first, if that is realistic -- that one already has cost 8 (it is just
two machine instructions).

Attached is the patch that updates the length of the five
*vsx_le_perm_store_<xxx> patterns from 12 to 8.
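
For reference, here is a minimal sketch of the kind of store discussed above.
It is a hypothetical test case, not part of the patch: the type name v2df and
the function store_v2df are made up for illustration, and it uses the generic
GCC vector extension so that it builds on any target.  Since the stored value
is not used after the store, this is the case where the store is expected to
need only a permute plus a permuted store on P8 LE, i.e. two 4-byte
instructions, which matches a length of 8.

```c
/* Hypothetical example, not taken from the patch or the testsuite.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Store a V2DF value whose register is dead after the store.  */
void
store_v2df (v2df *p, v2df v)
{
  *p = v;
}
```

Compiling this with something like "gcc -O2 -mcpu=power8 -S" on powerpc64le
should show whether two or three instructions are emitted; the exact output
depends on the compiler version and on register allocation.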

--
Thanks,
Xionghu
From da82766fec6d51e468d9106f8d4c52de84585b08 Mon Sep 17 00:00:00 2001
From: Xionghu Luo <luo...@linux.ibm.com>
Date: Mon, 14 Jun 2021 21:05:24 -0500
Subject: [PATCH] rs6000: Fix vsx_le_perm_store_* instructions length

The vsx_le_perm_store_<mode> instructions can often be done as just two
insns, so make the cost 8, which will also work out for combine.

Discussions:
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572594.html

gcc/ChangeLog:

        * config/rs6000/vsx.md (*vsx_le_perm_store_<mode>): Change
        length from 12 to 8.
        (*vsx_le_perm_store_<mode>): Likewise.
        (*vsx_le_perm_store_v8hi): Likewise.
        (*vsx_le_perm_store_v16qi): Likewise.
        (*vsx_le_perm_store_<mode>): Likewise.
---
 gcc/config/rs6000/vsx.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index f1097326861..8c5865b8c34 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -613,7 +613,7 @@ (define_insn "*vsx_le_perm_store_<mode>"
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   [(set_attr "type" "vecstore")
-   (set_attr "length" "12")])
+   (set_attr "length" "8")])
 
 (define_split
   [(set (match_operand:VSX_D 0 "indexed_or_indirect_operand")
@@ -683,7 +683,7 @@ (define_insn "*vsx_le_perm_store_<mode>"
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   [(set_attr "type" "vecstore")
-   (set_attr "length" "12")])
+   (set_attr "length" "8")])
 
 (define_split
   [(set (match_operand:VSX_W 0 "indexed_or_indirect_operand")
@@ -758,7 +758,7 @@ (define_insn "*vsx_le_perm_store_v8hi"
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   [(set_attr "type" "vecstore")
-   (set_attr "length" "12")])
+   (set_attr "length" "8")])
 
 (define_split
   [(set (match_operand:V8HI 0 "indexed_or_indirect_operand")
@@ -843,7 +843,7 @@ (define_insn "*vsx_le_perm_store_v16qi"
   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR"
   "#"
   [(set_attr "type" "vecstore")
-   (set_attr "length" "12")])
+   (set_attr "length" "8")])
 
 (define_split
   [(set (match_operand:V16QI 0 "indexed_or_indirect_operand")
@@ -1016,7 +1016,7 @@ (define_insn "*vsx_le_perm_store_<mode>"
    #
    #"
   [(set_attr "type" "vecstore,store")
-   (set_attr "length" "12,8")
+   (set_attr "length" "8,8")
    (set_attr "isa" "<VSisa>,*")])
 
 (define_split
-- 
2.25.1
