Re: [PATCH] s390: Implement vec_set with vec_merge and, vec_duplicate.

2022-08-16 Thread Andreas Krebbel via Gcc-patches
On 8/12/22 16:48, Robin Dapp wrote:
> Hi,
> 
> similar to other backends this patch implements vec_set via
> vec_merge and vec_duplicate instead of an unspec.  This opens up
> more possibilites to combine instructions.
> 
> Bootstrapped and regtested. No regressions.
> 
> Is it OK?
> 
> Regards
>  Robin
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.md: Implement vec_set with vec_merge and
>   vec_duplicate.
>   * config/s390/vector.md: Likewise.
>   * config/s390/vx-builtins.md: Likewise.
>   * config/s390/s390.cc (s390_expand_vec_init): Emit new pattern.
>   (print_operand_address): New output modifier.
>   (print_operand): New output modifier.

The way you handle the element selector doesn't look right to me. It appears to 
be an index if it is
a CONST_INT and a bitmask otherwise. I don't think it is legal to change 
operand semantics like this
depending on the operand type. This would break e.g. if LRA would decide to 
load the immediate index
in a register.

Couldn't you make the shift part of the RTX instead and have the parameter 
always as an index?

Bye,

Andreas

> ---
> 
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index c86b26933d7a..ff89fb83360a 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -7073,11 +7073,10 @@ s390_expand_vec_init (rtx target, rtx vals)
>if (!general_operand (elem, GET_MODE (elem)))
>   elem = force_reg (inner_mode, elem);
> 
> -  emit_insn (gen_rtx_SET (target,
> -   gen_rtx_UNSPEC (mode,
> -   gen_rtvec (3, elem,
> -  GEN_INT (i), target),
> -   UNSPEC_VEC_SET)));
> +  emit_insn
> + (gen_rtx_SET
> +  (target, gen_rtx_VEC_MERGE
> +   (mode, gen_rtx_VEC_DUPLICATE (mode, elem), target, GEN_INT (1 << 
> i;
>  }
>  }
> 
> @@ -8057,6 +8056,8 @@ print_operand_address (FILE *file, rtx addr)
>  'S': print S-type memory reference (base+displacement).
>  'Y': print address style operand without index (e.g. shift count or
> setmem
>operand).
> +'P': print address-style operand without index but with the offset as
> +  if it were specified by a 'p' format flag.
> 
>  'b': print integer X as if it's an unsigned byte.
>  'c': print integer X as if it's an signed byte.
> @@ -8068,6 +8069,7 @@ print_operand_address (FILE *file, rtx addr)
>  'k': print the first nonzero SImode part of X.
>  'm': print the first SImode part unequal to -1 of X.
>  'o': print integer X as if it's an unsigned 32bit word.
> +'p': print N such that 2^N == X (X must be a power of 2 and const int).
>  's': "start" of contiguous bitmask X in either DImode or vector
> inner mode.
>  't': CONST_INT: "start" of contiguous bitmask X in SImode.
>CONST_VECTOR: Generate a bitmask for vgbm instruction.
> @@ -8237,6 +8239,16 @@ print_operand (FILE *file, rtx x, int code)
>print_shift_count_operand (file, x);
>return;
> 
> +case 'P':
> +  if (CONST_INT_P (x))
> + {
> +   ival = exact_log2 (INTVAL (x));
> +   fprintf (file, HOST_WIDE_INT_PRINT_DEC, ival);
> + }
> +  else
> + print_shift_count_operand (file, x);
> +  return;
> +
>  case 'K':
>/* Append @PLT to both local and non-local symbols in order to
> support
>Linux Kernel livepatching: patches contain individual functions and
> @@ -8321,6 +8333,9 @@ print_operand (FILE *file, rtx x, int code)
>   case 'o':
> ival &= 0x;
> break;
> + case 'p':
> +   ival = exact_log2 (INTVAL (x));
> +   break;
>   case 'e': case 'f':
>   case 's': case 't':
> {
> diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
> index f37d8fd33a15..a82db4c624fa 100644
> --- a/gcc/config/s390/s390.md
> +++ b/gcc/config/s390/s390.md
> @@ -183,7 +183,6 @@ (define_c_enum "unspec" [
> UNSPEC_VEC_GFMSUM_128
> UNSPEC_VEC_GFMSUM_ACCUM
> UNSPEC_VEC_GFMSUM_ACCUM_128
> -   UNSPEC_VEC_SET
> 
> UNSPEC_VEC_VSUMG
> UNSPEC_VEC_VSUMQ
> diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
> index c50451a8326c..bde3a39db3d4 100644
> --- a/gcc/config/s390/vector.md
> +++ b/gcc/config/s390/vector.md
> @@ -467,12 +467,17 @@ (define_insn "mov"
>  ; vec_set is supposed to *modify* an existing vector so operand 0 is
>  ; duplicated as input operand.
>  (define_expand "vec_set"
> -  [(set (match_operand:V0 "register_operand"  "")
> - (unspec:V [(match_operand: 1 "general_operand"   "")
> -(match_operand:SI2 "nonmemory_operand" "")
> -(match_dup 0)]
> -UNSPEC_VEC_SET))]
> -  "TARGET_VX")
> +  [(set (match_operand:V  0 "register_operand" "")
> + (vec_merge:V
> +   (vec_duplicate:V
> + (match_operand: 1 "general_operand" ""))

[PATCH] s390: Implement vec_set with vec_merge and, vec_duplicate.

2022-08-12 Thread Robin Dapp via Gcc-patches
Hi,

similar to other backends this patch implements vec_set via
vec_merge and vec_duplicate instead of an unspec.  This opens up
more possibilites to combine instructions.

Bootstrapped and regtested. No regressions.

Is it OK?

Regards
 Robin

gcc/ChangeLog:

* config/s390/s390.md: Implement vec_set with vec_merge and
vec_duplicate.
* config/s390/vector.md: Likewise.
* config/s390/vx-builtins.md: Likewise.
* config/s390/s390.cc (s390_expand_vec_init): Emit new pattern.
(print_operand_address): New output modifier.
(print_operand): New output modifier.
---

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index c86b26933d7a..ff89fb83360a 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -7073,11 +7073,10 @@ s390_expand_vec_init (rtx target, rtx vals)
   if (!general_operand (elem, GET_MODE (elem)))
elem = force_reg (inner_mode, elem);

-  emit_insn (gen_rtx_SET (target,
- gen_rtx_UNSPEC (mode,
- gen_rtvec (3, elem,
-GEN_INT (i), target),
- UNSPEC_VEC_SET)));
+  emit_insn
+   (gen_rtx_SET
+(target, gen_rtx_VEC_MERGE
+ (mode, gen_rtx_VEC_DUPLICATE (mode, elem), target, GEN_INT (1 << 
i;
 }
 }

@@ -8057,6 +8056,8 @@ print_operand_address (FILE *file, rtx addr)
 'S': print S-type memory reference (base+displacement).
 'Y': print address style operand without index (e.g. shift count or
setmem
 operand).
+'P': print address-style operand without index but with the offset as
+if it were specified by a 'p' format flag.

 'b': print integer X as if it's an unsigned byte.
 'c': print integer X as if it's an signed byte.
@@ -8068,6 +8069,7 @@ print_operand_address (FILE *file, rtx addr)
 'k': print the first nonzero SImode part of X.
 'm': print the first SImode part unequal to -1 of X.
 'o': print integer X as if it's an unsigned 32bit word.
+'p': print N such that 2^N == X (X must be a power of 2 and const int).
 's': "start" of contiguous bitmask X in either DImode or vector
inner mode.
 't': CONST_INT: "start" of contiguous bitmask X in SImode.
 CONST_VECTOR: Generate a bitmask for vgbm instruction.
@@ -8237,6 +8239,16 @@ print_operand (FILE *file, rtx x, int code)
   print_shift_count_operand (file, x);
   return;

+case 'P':
+  if (CONST_INT_P (x))
+   {
+ ival = exact_log2 (INTVAL (x));
+ fprintf (file, HOST_WIDE_INT_PRINT_DEC, ival);
+   }
+  else
+   print_shift_count_operand (file, x);
+  return;
+
 case 'K':
   /* Append @PLT to both local and non-local symbols in order to
support
 Linux Kernel livepatching: patches contain individual functions and
@@ -8321,6 +8333,9 @@ print_operand (FILE *file, rtx x, int code)
case 'o':
  ival &= 0x;
  break;
+   case 'p':
+ ival = exact_log2 (INTVAL (x));
+ break;
case 'e': case 'f':
case 's': case 't':
  {
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index f37d8fd33a15..a82db4c624fa 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -183,7 +183,6 @@ (define_c_enum "unspec" [
UNSPEC_VEC_GFMSUM_128
UNSPEC_VEC_GFMSUM_ACCUM
UNSPEC_VEC_GFMSUM_ACCUM_128
-   UNSPEC_VEC_SET

UNSPEC_VEC_VSUMG
UNSPEC_VEC_VSUMQ
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index c50451a8326c..bde3a39db3d4 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -467,12 +467,17 @@ (define_insn "mov"
 ; vec_set is supposed to *modify* an existing vector so operand 0 is
 ; duplicated as input operand.
 (define_expand "vec_set"
-  [(set (match_operand:V0 "register_operand"  "")
-   (unspec:V [(match_operand: 1 "general_operand"   "")
-  (match_operand:SI2 "nonmemory_operand" "")
-  (match_dup 0)]
-  UNSPEC_VEC_SET))]
-  "TARGET_VX")
+  [(set (match_operand:V0 "register_operand" "")
+   (vec_merge:V
+ (vec_duplicate:V
+   (match_operand: 1 "general_operand" ""))
+ (match_dup 0)
+ (match_operand:SI  2 "nonmemory_operand")))]
+  ""
+{
+  if (CONST_INT_P (operands[2]))
+operands[2] = GEN_INT (1 << INTVAL (operands[2]));
+})

 ; FIXME: Support also vector mode operands for 1
 ; FIXME: A target memory operand seems to be useful otherwise we end
@@ -480,28 +485,31 @@ (define_expand "vec_set"
 ; that itself?
 ; vlvgb, vlvgh, vlvgf, vlvgg, vleb, vleh, vlef, vleg, vleib, vleih,
vleif, vleig
 (define_insn "*vec_set"
-  [(set (match_operand:V0 "register_operand"  "=v,v,v")
-   (unspec:V [(match_operand: 1 "general_operand""d,R,K")
-