from:"Andreas Krebbel"

Re: [PATCH] testsuite/s390: Silence warning in pr80725.c

2022-04-14 Thread Andreas Krebbel via Gcc-patches

On 4/13/22 09:35, Robin Dapp wrote:
> Hi,
> 
> this test case checks that we do not ICE but FAILs because of
> -Wint-to-pointer-cast.  Silence this warning.
> 
> Is it OK?

Ok. Thanks!

Andreas

Re: [PATCH] testsuite: Skip pr105250.c for powerpc and s390 [PR105266]

2022-04-14 Thread Andreas Krebbel via Gcc-patches

On 4/14/22 05:10, Kewen.Lin wrote:
> Hi,
> 
> The test case pr105250.c is like its related pr105140.c, which
> suffers the error with message like "{AltiVec,vector} argument
> passed to unprototyped" on powerpc and s390.  So like commits
> r12-8025 and r12-8039, this fix is to add the dg-skip-if for
> powerpc*-*-* and s390*-*-*.
> 
> Tested on powerpc64le-linux-gnu P9; it should also work on s390,
> as with the similar PR105147.
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> -
> 
> gcc/testsuite/ChangeLog:
> 
>   PR testsuite/105266
>   * gcc.dg/pr105250.c: Skip for powerpc*-*-* and s390*-*-*.

Ok for s390. Thanks!

Andreas

[Committed] IBM zSystems: Add support for z16 as CPU name.

2022-04-12 Thread Andreas Krebbel via Gcc-patches

So far z16 was identified as arch14. After the machine has been
announced we can now add the real name.
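
The machine-type mapping that the driver-native.cc hunk below touches can be sketched in isolation as follows. This is only an illustration, not the GCC source: the function name `demo_cpu_from_machine_type` is hypothetical, and only the cases visible in the diff are reproduced (the older machine types are left out).

```c
/* Hypothetical stand-alone sketch of the machine-type switch.
   Only the cases visible in the diff are listed here.  */
const char *
demo_cpu_from_machine_type (unsigned int machine_type)
{
  switch (machine_type)
    {
    case 0x8562:
      return "z15";
    case 0x3931:
    case 0x3932:
      return "z16";
    default:
      /* Unknown (presumably newer) machines fall back to the most
         recent CPU name known to the compiler.  */
      return "z16";
    }
}
```

With this shape, an unrecognized machine type is treated like the newest supported CPU, which is what the changed default case in the diff does.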

gcc/ChangeLog:

* common/config/s390/s390-common.cc: Rename PF_ARCH14 to PF_Z16.
* config.gcc: Add z16 as march/mtune switch.
* config/s390/driver-native.cc (s390_host_detect_local_cpu):
Recognize z16 with -march=native.
* config/s390/s390-opts.h (enum processor_type): Rename
PROCESSOR_ARCH14 to PROCESSOR_3931_Z16.
* config/s390/s390.cc (PROCESSOR_ARCH14): Rename to ...
(PROCESSOR_3931_Z16): ... throughout the file.
(s390_processor processor_table): Add z16 as cpu string.
* config/s390/s390.h (enum processor_flags): Rename PF_ARCH14 to
PF_Z16.
(TARGET_CPU_ARCH14): Rename to ...
(TARGET_CPU_Z16): ... this.
(TARGET_CPU_ARCH14_P): Rename to ...
(TARGET_CPU_Z16_P): ... this.
(TARGET_ARCH14): Rename to ...
(TARGET_Z16): ... this.
(TARGET_ARCH14_P): Rename to ...
(TARGET_Z16_P): ... this.
* config/s390/s390.md (cpu_facility): Rename arch14 to z16 and
check TARGET_Z16 instead of TARGET_ARCH14.
* config/s390/s390.opt: Add z16 to processor_type.
* doc/invoke.texi: Document z16 and arch14.
---
 gcc/common/config/s390/s390-common.cc |  4 ++--
 gcc/config.gcc|  2 +-
 gcc/config/s390/driver-native.cc  |  6 +-
 gcc/config/s390/s390-opts.h   |  2 +-
 gcc/config/s390/s390.cc   | 14 --
 gcc/config/s390/s390.h| 16 
 gcc/config/s390/s390.md   |  6 +++---
 gcc/config/s390/s390.opt  |  5 -
 gcc/doc/invoke.texi   |  3 ++-
 9 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/gcc/common/config/s390/s390-common.cc b/gcc/common/config/s390/s390-common.cc
index caec2f14c6c..72a5ef47eaa 100644
--- a/gcc/common/config/s390/s390-common.cc
+++ b/gcc/common/config/s390/s390-common.cc
@@ -50,10 +50,10 @@ EXPORTED_CONST int processor_flags_table[] =
 /* z15 */PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
 | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX
 | PF_Z13 | PF_VX | PF_VXE | PF_Z14 | PF_VXE2 | PF_Z15,
-/* arch14 */ PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
+/* z16 */PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
 | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX
 | PF_Z13 | PF_VX | PF_VXE | PF_Z14 | PF_VXE2 | PF_Z15
-| PF_NNPA | PF_ARCH14
+| PF_NNPA | PF_Z16
   };
 
 /* Change optimizations to be performed, depending on the
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 48a5bbcf787..c5064dd3766 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -5532,7 +5532,7 @@ case "${target}" in
for which in arch tune; do
eval "val=\$with_$which"
case ${val} in
-   "" | native | z900 | z990 | z9-109 | z9-ec | z10 | z196 | zEC12 | z13 | z14 | z15 | arch5 | arch6 | arch7 | arch8 | arch9 | arch10 | arch11 | arch12 | arch13 | arch14 )
+   "" | native | z900 | z990 | z9-109 | z9-ec | z10 | z196 | zEC12 | z13 | z14 | z15 | z16 | arch5 | arch6 | arch7 | arch8 | arch9 | arch10 | arch11 | arch12 | arch13 | arch14 )
# OK
;;
*)
diff --git a/gcc/config/s390/driver-native.cc b/gcc/config/s390/driver-native.cc
index 48524c49251..b5eb222872d 100644
--- a/gcc/config/s390/driver-native.cc
+++ b/gcc/config/s390/driver-native.cc
@@ -123,8 +123,12 @@ s390_host_detect_local_cpu (int argc, const char **argv)
case 0x8562:
  cpu = "z15";
  break;
+   case 0x3931:
+   case 0x3932:
+ cpu = "z16";
+ break;
default:
- cpu = "arch14";
+ cpu = "z16";
  break;
}
}
diff --git a/gcc/config/s390/s390-opts.h b/gcc/config/s390/s390-opts.h
index 1ec84631a5f..4ef82ac5d34 100644
--- a/gcc/config/s390/s390-opts.h
+++ b/gcc/config/s390/s390-opts.h
@@ -38,7 +38,7 @@ enum processor_type
   PROCESSOR_2964_Z13,
   PROCESSOR_3906_Z14,
   PROCESSOR_8561_Z15,
-  PROCESSOR_ARCH14,
+  PROCESSOR_3931_Z16,
   PROCESSOR_NATIVE,
   PROCESSOR_max
 };
diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index d2af6d8813d..1342a2e7db0 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -337,7 +337,7 @@ const struct s390_processor processor_table[] =
   { "z13","z13",PROCESSOR_2964_Z13,_cost,  11 },
   { "z14","arch12", PROCESSOR_3906_Z14,_cost,  12 },
   { "z15","arch13", PROCESSOR_8561_Z15,_cost,  13 },
-  { "arch14", "arch14", PROCESSOR_ARCH14,  _cost,  14 },
+  { "z16","arch14", PROCESSOR_3931_Z16,_cost,  14 },
   {

[PATCH] v2 PR102024 - IBM Z: Add psabi diagnostics

2022-04-11 Thread Andreas Krebbel via Gcc-patches

v2:

- Remove redundant num_zero_width_bf_seen and num_fields_seen
  tracking. (Thanks Stefan Schulze-Frielinghaus)

Re-tested with testsuite and ABI tests.



For IBM Z in particular there is a problem with structs like:

struct A { float a; int :0; };

Our ABI document allows passing a struct in an FPR only if it has
exactly one member. On the other hand it says that structs of 1,2,4,8
bytes are passed in a GPR. So this struct is expected to be passed in
a GPR. Since we don't return structs in registers (regardless of the
number of members) it is always returned in memory.

Situation is as follows:

All compiler versions tested return it in memory - as expected.

gcc 11, gcc 12, g++ 12, and clang 13 pass it in a GPR - as expected.

g++ 11 as well as clang++ 13 pass it in an FPR.

For IBM Z we stick to the current GCC 12 behavior, i.e. zero-width
bitfields are NOT ignored.  A struct as above will be passed in a
GPR. The rationale behind this is that not affecting the C ABI is more
important here.

A patch for clang is in progress: https://reviews.llvm.org/D122388

In addition to the usual regression test I ran the compat and
struct-layout-1 testsuites comparing the compiler before and after the
patch.
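
As a rough illustration of the case described above (not part of the patch), the struct can be written and passed by value as shown here. The function name `demo_take_a` is made up for this sketch; which register class carries the argument depends on the target and the compiler version and cannot be observed from portable C.

```c
/* The struct from the description: one float member followed by a
   zero-width bit-field.  The bit-field introduces no named member.  */
struct A { float a; int :0; };

/* Hypothetical by-value consumer.  On IBM Z the question discussed in
   this thread is whether X arrives in a GPR or in an FPR.  */
float
demo_take_a (struct A x)
{
  return x.a;
}
```

On s390x, comparing the assembly produced by `gcc -S` and `g++ -S` for a call to such a function is one way to check which register is used.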

gcc/ChangeLog:
PR target/102024
* config/s390/s390-protos.h (s390_function_arg_vector): Remove
prototype.
* config/s390/s390.cc (s390_single_field_struct_p): New function.
(s390_function_arg_vector): Invoke s390_single_field_struct_p.
(s390_function_arg_float): Likewise.

gcc/testsuite/ChangeLog:
PR target/102024
* g++.target/s390/pr102024-1.C: New test.
* g++.target/s390/pr102024-2.C: New test.
* g++.target/s390/pr102024-3.C: New test.
* g++.target/s390/pr102024-4.C: New test.
* g++.target/s390/pr102024-5.C: New test.
* g++.target/s390/pr102024-6.C: New test.
---
 gcc/config/s390/s390-protos.h  |   1 -
 gcc/config/s390/s390.cc| 208 +++--
 gcc/testsuite/g++.target/s390/pr102024-1.C |  12 ++
 gcc/testsuite/g++.target/s390/pr102024-2.C |  14 ++
 gcc/testsuite/g++.target/s390/pr102024-3.C |  15 ++
 gcc/testsuite/g++.target/s390/pr102024-4.C |  15 ++
 gcc/testsuite/g++.target/s390/pr102024-5.C |  14 ++
 gcc/testsuite/g++.target/s390/pr102024-6.C |  12 ++
 8 files changed, 187 insertions(+), 104 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-1.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-2.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-3.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-4.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-5.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-6.C

diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index e6251595870..fd4acaae44a 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -49,7 +49,6 @@ extern void s390_function_profiler (FILE *, int);
 extern void s390_set_has_landing_pad_p (bool);
 extern bool s390_hard_regno_rename_ok (unsigned int, unsigned int);
 extern int s390_class_max_nregs (enum reg_class, machine_mode);
-extern bool s390_function_arg_vector (machine_mode, const_tree);
 extern bool s390_return_addr_from_memory(void);
 extern bool s390_fma_allowed_p (machine_mode);
 #if S390_USE_TARGET_ATTRIBUTE
diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index d2af6d8813d..c091d2a692a 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -12148,29 +12148,26 @@ s390_function_arg_size (machine_mode mode, const_tree type)
   gcc_unreachable ();
 }
 
-/* Return true if a function argument of type TYPE and mode MODE
-   is to be passed in a vector register, if available.  */
-
-bool
-s390_function_arg_vector (machine_mode mode, const_tree type)
+/* Return true if a variable of TYPE should be passed as single value
+   with type CODE. If STRICT_SIZE_CHECK_P is true the sizes of the
+   record type and the field type must match.
+
+   The ABI says that record types with a single member are treated
+   just like that member would be.  This function is a helper to
+   detect such cases.  The function also produces the proper
+   diagnostics for cases where the outcome might be different
+   depending on the GCC version.  */
+static bool
+s390_single_field_struct_p (enum tree_code code, const_tree type,
+   bool strict_size_check_p)
 {
-  if (!TARGET_VX_ABI)
-return false;
-
-  if (s390_function_arg_size (mode, type) > 16)
-return false;
-
-  /* No type info available for some library calls ...  */
-  if (!type)
-return VECTOR_MODE_P (mode);
-
-  /* The ABI says that record types with a single member are treated
- just like that member would be.  */
   int empty_base_seen = 0;
+  bool zero_width_bf_skipped_p = false;
   const_tree orig_type = type;
+
   while (TREE_CODE (type) == RECORD_TYPE)
 {
-  tree field,

Re: [PATCH] rs6000/testsuite: Skip pr105140.c

2022-04-06 Thread Andreas Krebbel via Gcc-patches

On 4/6/22 17:32, Segher Boessenkool wrote:
> This test fails with error "AltiVec argument passed to unprototyped
> function", but the code (in rs6000.c:invalid_arg_for_unprototyped_fn,
> from 2005) actually tests for any vector type argument.  It also does
> not fail on Darwin, not reflected here though.
> 
> Andreas, s390 has this same hook code, you may need to do the same?

Yes, thanks for the pointer. I've just committed the following:

IBM zSystems/testsuite: PR105147: Skip pr105140.c

pr105140.c fails on IBM zSystems with "vector argument passed to
unprototyped function".  s390_invalid_arg_for_unprototyped_fn in
s390.cc is triggered by that.

gcc/testsuite/ChangeLog:

PR target/105147
* gcc.dg/pr105140.c: Skip for s390*-*-*.
---
 gcc/testsuite/gcc.dg/pr105140.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/pr105140.c b/gcc/testsuite/gcc.dg/pr105140.c
index da34e7ad656..7d30985e850 100644
--- a/gcc/testsuite/gcc.dg/pr105140.c
+++ b/gcc/testsuite/gcc.dg/pr105140.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-Os -w -Wno-psabi" } */
-/* { dg-skip-if "PR105147" { powerpc*-*-* } } */
+/* { dg-skip-if "PR105147" { powerpc*-*-* s390*-*-* } } */

 typedef char __attribute__((__vector_size__ (16 * sizeof (char)))) U;
 typedef int __attribute__((__vector_size__ (16 * sizeof (int)))) V;

Re: [PATCH] testsuite/s390: Adapt test expectations.

2022-04-04 Thread Andreas Krebbel via Gcc-patches

On 4/4/22 13:52, Robin Dapp wrote:
> Hi,
> 
> some tests expect a convert instruction but nowadays the conversion is
> already done at compile time.  This results in a literal-pool load.
> Change the tests accordingly.
> 
> OK for trunk?
> 
> Regards
>  Robin
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/zvector/vec-double-compile.c: Expect vl
> instead of vc*.
>   * gcc.target/s390/zvector/vec-float-compile.c: Ditto.
>   * gcc.target/s390/zvector/vec-signed-compile.c: Ditto.
>   * gcc.target/s390/zvector/vec-unsigned-compile.c: Ditto.

I've seen Mike's comment but I'm not opposed to checking it in that way. These
kinds of comments have probably saved me a few hours of bisecting already. Next
time you might consider moving it to the commit message instead.

Ok. Thanks!

Bye,

Andreas

Re: [PATCH] testsuite/s390: Change nle -> h in ifcvt tests.

2022-04-04 Thread Andreas Krebbel via Gcc-patches

On 4/4/22 13:51, Robin Dapp wrote:
> Hi,
> 
> we have been emitting the "higher" variants instead of the "not less or
> equal" ones for a while.  Change the test expectations accordingly.
> 
> OK for trunk?
> 
> Regards
>  Robin
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/ifcvt-two-insns-bool.c: Change nle to h.
>   * gcc.target/s390/ifcvt-two-insns-int.c: Ditto.
>   * gcc.target/s390/ifcvt-two-insns-long.c: Ditto.

Ok. Thanks!

Andreas

Re: [PATCH] testsuite: Add -fno-tree-loop-distribute-patterns for s390.

2022-04-04 Thread Andreas Krebbel via Gcc-patches

On 4/4/22 13:51, Robin Dapp wrote:
> Hi,
> 
> in gcc.dg/Wuse-after-free-2.c we try to detect a use-after-free.  On
> s390 the test's while loop is converted into a rawmemchr builtin making
> it impossible to determine that the pointers *p and *q are related.
> 
> Therefore, disable the tree loop distribute patterns pass on s390 for
> this test.
> 
> OK for trunk?
> 
> Regards
>  Robin
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/Wuse-after-free-2.c:
>   Add -fno-tree-loop-distribute-patterns for s390*.

Ok. Thanks!

Andreas
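
For readers unfamiliar with the transformation mentioned in the quoted message, the following is a minimal sketch of the kind of loop that the loop-distribute-patterns pass can turn into a rawmemchr builtin. It is not the source of Wuse-after-free-2.c; the function name `demo_scan_for` is hypothetical.

```c
/* Hypothetical helper: advance P until the byte C is found.  There is
   no length bound, so the caller must guarantee that C occurs in the
   buffer.  A loop of this shape is a candidate for being replaced by
   a single rawmemchr call.  */
char *
demo_scan_for (char *p, char c)
{
  while (*p != c)
    ++p;
  return p;
}
```

Once the loop is a single builtin call, the returned pointer no longer visibly derives from the input pointer, which is the effect the message says hides the relationship between the two pointers from the use-after-free warning.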

[PATCH] PR102024 - IBM Z: Add psabi diagnostics

2022-03-25 Thread Andreas Krebbel via Gcc-patches

For IBM Z in particular there is a problem with structs like:

struct A { float a; int :0; };

Our ABI document allows passing a struct in an FPR only if it has
exactly one member. On the other hand it says that structs of 1,2,4,8
bytes are passed in a GPR. So this struct is expected to be passed in
a GPR. Since we don't return structs in registers (regardless of the
number of members) it is always returned in memory.

Situation is as follows:

All compiler versions tested return it in memory - as expected.

gcc 11, gcc 12, g++ 12, and clang 13 pass it in a GPR - as expected.

g++ 11 as well as clang++ 13 pass it in an FPR.

For IBM Z we stick to the current GCC 12 behavior, i.e. zero-width
bitfields are NOT ignored.  A struct as above will be passed in a
GPR. The rationale behind this is that not affecting the C ABI is more
important here.

A patch for clang is in progress: https://reviews.llvm.org/D122388

In addition to the usual regression test I ran the compat and
struct-layout-1 testsuites comparing the compiler before and after the
patch.

gcc/ChangeLog:
PR target/102024
* config/s390/s390-protos.h (s390_function_arg_vector): Remove
prototype.
* config/s390/s390.cc (s390_single_field_struct_p): New function.
(s390_function_arg_vector): Invoke s390_single_field_struct_p.
(s390_function_arg_float): Likewise.

gcc/testsuite/ChangeLog:
PR target/102024
* g++.target/s390/pr102024-1.C: New test.
* g++.target/s390/pr102024-2.C: New test.
* g++.target/s390/pr102024-3.C: New test.
* g++.target/s390/pr102024-4.C: New test.
* g++.target/s390/pr102024-5.C: New test.
* g++.target/s390/pr102024-6.C: New test.
---
 gcc/config/s390/s390-protos.h  |   1 -
 gcc/config/s390/s390.cc| 212 +++--
 gcc/testsuite/g++.target/s390/pr102024-1.C |  12 ++
 gcc/testsuite/g++.target/s390/pr102024-2.C |  14 ++
 gcc/testsuite/g++.target/s390/pr102024-3.C |  15 ++
 gcc/testsuite/g++.target/s390/pr102024-4.C |  15 ++
 gcc/testsuite/g++.target/s390/pr102024-5.C |  14 ++
 gcc/testsuite/g++.target/s390/pr102024-6.C |  12 ++
 8 files changed, 195 insertions(+), 100 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-1.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-2.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-3.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-4.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-5.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr102024-6.C

diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index e6251595870..fd4acaae44a 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -49,7 +49,6 @@ extern void s390_function_profiler (FILE *, int);
 extern void s390_set_has_landing_pad_p (bool);
 extern bool s390_hard_regno_rename_ok (unsigned int, unsigned int);
 extern int s390_class_max_nregs (enum reg_class, machine_mode);
-extern bool s390_function_arg_vector (machine_mode, const_tree);
 extern bool s390_return_addr_from_memory(void);
 extern bool s390_fma_allowed_p (machine_mode);
 #if S390_USE_TARGET_ATTRIBUTE
diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index d2af6d8813d..6cfa586b9cd 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -12148,29 +12148,29 @@ s390_function_arg_size (machine_mode mode, const_tree type)
   gcc_unreachable ();
 }
 
-/* Return true if a function argument of type TYPE and mode MODE
-   is to be passed in a vector register, if available.  */
-
-bool
-s390_function_arg_vector (machine_mode mode, const_tree type)
+/* Return true if a variable of TYPE should be passed as single value
+   with type CODE. If STRICT_SIZE_CHECK_P is true the sizes of the
+   record type and the field type must match.
+
+   The ABI says that record types with a single member are treated
+   just like that member would be.  This function is a helper to
+   detect such cases.  The function also produces the proper
+   diagnostics for cases where the outcome might be different
+   depending on the GCC version.  */
+static bool
+s390_single_field_struct_p (enum tree_code code, const_tree type,
+   bool strict_size_check_p)
 {
-  if (!TARGET_VX_ABI)
-return false;
-
-  if (s390_function_arg_size (mode, type) > 16)
-return false;
-
-  /* No type info available for some library calls ...  */
-  if (!type)
-return VECTOR_MODE_P (mode);
-
-  /* The ABI says that record types with a single member are treated
- just like that member would be.  */
   int empty_base_seen = 0;
+  bool zero_width_bf_seen_p = false;
   const_tree orig_type = type;
+  bool single_p = true;
+
   while (TREE_CODE (type) == RECORD_TYPE)
 {
-  tree field, single = NULL_TREE;
+  tree field, single_type = NULL_TREE;
+  int num_zero_width_bf_seen = 0;
+  int num_fields_seen = 0;

Re: [PATCH] s390: Fix up *cmp_and_trap_unsigned_int constraints [PR104775]

2022-03-07 Thread Andreas Krebbel via Gcc-patches

On 3/5/22 09:33, Jakub Jelinek wrote:
> Hi!
> 
> The following testcase fails to assemble due to clgte %r6,0(%r1,%r10)
> insn not being accepted by assembler.
> My rough understanding is that in the RSY-b insn format the spot
> in other formats used for index registers is used instead for M3 what
> kind of comparison it is, so this patch follows what other similar
> instructions use for constraint (i.e. one without index register).
> 
> Bootstrapped on s390x-linux, regtest there still pending, ok for
> trunk if it passes it?
> 
> 2022-03-05  Jakub Jelinek  
> 
>   PR target/104775
>   * config/s390/s390.md (*cmp_and_trap_unsigned_int): Use
>   S constraint instead of T in the last alternative.
> 
>   * gcc.target/s390/pr104775.c: New test.

Ok. Thanks for the fix!

Bye,

Andreas

Re: [PATCH] s390: Change SET rtx_cost handling.

2022-02-25 Thread Andreas Krebbel via Gcc-patches

On 2/25/22 12:38, Robin Dapp wrote:
> Hi,
> 
> the IF_THEN_ELSE detection currently prevents us from properly costing
> register-register moves which causes the lower-subreg pass to assume
> that a VR-VR move is as expensive as two GPR-GPR moves.
> 
> This patch adds handling for SETs containing REGs as well as MEMs and is
> inspired by the aarch64 implementation.
> 
> Bootstrapped and regtested on z900 up to z15. Is it OK?
> 
> Regards
>  Robin
> 
> --
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (s390_address_cost): Declare.
>   (s390_hard_regno_nregs): Declare.
>   (s390_rtx_costs): Add handling for REG and MEM in SET.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/vector/vec-sum-across-no-lower-subreg-1.c: New
> test.

Ok. Thanks

Andreas

Re: [PATCH] Check always_inline flag in s390_can_inline_p [PR104327]

2022-02-07 Thread Andreas Krebbel via Gcc-patches

On 2/7/22 09:11, Jakub Jelinek wrote:
...
> 1) formatting, = should be at the start of next line rather than end of the
>line
> 2) all_masks, always_inline_safe_masks and caller_required_masks aren't
>ever modified, perhaps make them const?
> 3) I wonder if there is any advantage to have all_masks with all the masks
>enumerated, compared to
>const HOST_WIDE_INT all_masks
>  = (caller_required_masks | must_match_masks | always_inline_safe_masks
>   | MASK_DEBUG_ARG | MASK_PACKED_STACK | MASK_ZVECTOR);
>i.e. when you add a new mask, instead of listing it in all_masks
>and one or more of the other vars you'd just stick it either in one
>or more of those vars or in all_masks.

I've just committed the patch with these changes. Thanks Jakub!

Andreas


diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 5c2a830f9f0..c6cfe41ad7b 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -16091,6 +16091,23 @@ s390_valid_target_attribute_p (tree fndecl,
 static bool
 s390_can_inline_p (tree caller, tree callee)
 {
+  /* Flags which if present in the callee are required in the caller as well.  */
+  const unsigned HOST_WIDE_INT caller_required_masks = MASK_OPT_HTM;
+
+  /* Flags which affect the ABI and in general prevent inlining.  */
+  unsigned HOST_WIDE_INT must_match_masks
+= (MASK_64BIT | MASK_ZARCH | MASK_HARD_DFP | MASK_SOFT_FLOAT
+   | MASK_LONG_DOUBLE_128 | MASK_OPT_VX);
+
+  /* Flags which we in general want to prevent inlining but accept for
+ always_inline.  */
+  const unsigned HOST_WIDE_INT always_inline_safe_masks
+= MASK_MVCLE | MASK_BACKCHAIN | MASK_SMALL_EXEC;
+
+  const HOST_WIDE_INT all_masks
+ = (caller_required_masks | must_match_masks | always_inline_safe_masks
+   | MASK_DEBUG_ARG | MASK_PACKED_STACK | MASK_ZVECTOR);
+
   tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
   tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);

@@ -16103,16 +16120,18 @@ s390_can_inline_p (tree caller, tree callee)

   struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree);
   struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
-  bool ret = true;

-  if ((caller_opts->x_target_flags & ~(MASK_SOFT_FLOAT | MASK_HARD_DFP))
-  != (callee_opts->x_target_flags & ~(MASK_SOFT_FLOAT | MASK_HARD_DFP)))
-ret = false;
+  /* If one of these triggers make sure to add proper handling of your
+ new flag to this hook.  */
+  gcc_assert (!(caller_opts->x_target_flags & ~all_masks));
+  gcc_assert (!(callee_opts->x_target_flags & ~all_masks));

-  /* Don't inline functions to be compiled for a more recent arch into a
- function for an older arch.  */
-  else if (caller_opts->x_s390_arch < callee_opts->x_s390_arch)
-ret = false;
+  bool always_inline
+= (DECL_DISREGARD_INLINE_LIMITS (callee)
+   && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee)));
+
+  if (!always_inline)
+must_match_masks |= always_inline_safe_masks;

   /* Inlining a hard float function into a soft float function is only
  allowed if the hard float function doesn't actually make use of
@@ -16120,16 +16139,27 @@ s390_can_inline_p (tree caller, tree callee)

  We are called from FEs for multi-versioning call optimization, so
  beware of ipa_fn_summaries not available.  */
-  else if (((TARGET_SOFT_FLOAT_P (caller_opts->x_target_flags)
-&& !TARGET_SOFT_FLOAT_P (callee_opts->x_target_flags))
-   || (!TARGET_HARD_DFP_P (caller_opts->x_target_flags)
-   && TARGET_HARD_DFP_P (callee_opts->x_target_flags)))
-  && (! ipa_fn_summaries
-  || ipa_fn_summaries->get
-  (cgraph_node::get (callee))->fp_expressions))
-ret = false;
+  if (always_inline && ipa_fn_summaries
+  && !ipa_fn_summaries->get(cgraph_node::get (callee))->fp_expressions)
+must_match_masks &= ~(MASK_HARD_DFP | MASK_SOFT_FLOAT);

-  return ret;
+  if ((caller_opts->x_target_flags & must_match_masks)
+  != (callee_opts->x_target_flags & must_match_masks))
+return false;
+
+  if (~(caller_opts->x_target_flags & caller_required_masks)
+  & (callee_opts->x_target_flags & caller_required_masks))
+return false;
+
+  /* Don't inline functions to be compiled for a more recent arch into a
+ function for an older arch.  */
+  if (caller_opts->x_s390_arch < callee_opts->x_s390_arch)
+return false;
+
+  if (!always_inline && caller_opts->x_s390_tune != callee_opts->x_s390_tune)
+return false;
+
+  return true;
 }
 #endif

diff --git a/gcc/testsuite/gcc.c-torture/compile/pr104327.c b/gcc/testsuite/gcc.c-torture/compile/pr104327.c
new file mode 100644
index 000..d54e5d58cc4
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr104327.c
@@ -0,0 +1,15 @@
+/* PR target/104327 */
+
+void foo (int *);
+
+static inline __attribute__((always_inline)) void
+bar (int *x)
+{
+  foo (x);
+}
+
+__attribute__((cold,

[PATCH] Check always_inline flag in s390_can_inline_p [PR104327]

2022-02-06 Thread Andreas Krebbel via Gcc-patches

MASK_MVCLE is set for -Os but not for other optimization levels. In
general it should not make much sense to inline across calls where the
flag is different but we have to allow it for always_inline.

The patch also rearranges the hook implementation a bit based on the
recommendations from Jakub and Martin in the PR.

Bootstrapped and regression tested on s390x with various arch flags.
Will commit after giving a few days for comments.
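
The mask handling described above can be sketched stand-alone as follows. This is a simplified illustration, not the hook itself: the `DEMO_MASK_*` values and the function name `demo_can_inline` are hypothetical (the real MASK_* macros are generated from s390.opt), and the arch, tune and fp_expressions checks of the real hook are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only.  */
enum
{
  DEMO_MASK_64BIT = 1u << 0,
  DEMO_MASK_SOFT_FLOAT = 1u << 1,
  DEMO_MASK_OPT_HTM = 1u << 2,
  DEMO_MASK_MVCLE = 1u << 3
};

bool
demo_can_inline (uint32_t caller_flags, uint32_t callee_flags,
                 bool always_inline)
{
  /* Flags which if present in the callee are required in the caller.  */
  const uint32_t caller_required = DEMO_MASK_OPT_HTM;
  /* Flags which must be identical in caller and callee.  */
  uint32_t must_match = DEMO_MASK_64BIT | DEMO_MASK_SOFT_FLOAT;
  /* Flags which may differ, but only for always_inline callees.  */
  const uint32_t always_inline_safe = DEMO_MASK_MVCLE;

  if (!always_inline)
    must_match |= always_inline_safe;

  if ((caller_flags & must_match) != (callee_flags & must_match))
    return false;

  /* Reject if the callee has a required flag that the caller lacks.  */
  if (~(caller_flags & caller_required) & (callee_flags & caller_required))
    return false;

  return true;
}
```

In this sketch a callee built with the MVCLE flag can be inlined into a caller without it only when `always_inline` is true, which mirrors the -Os case the patch is meant to handle.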

gcc/ChangeLog:

PR target/104327
* config/s390/s390.cc (s390_can_inline_p): Accept a few more flags
if always_inline is set. Don't inline when tune differs without
always_inline.

gcc/testsuite/ChangeLog:

PR target/104327
* gcc.c-torture/compile/pr104327.c: New test.
---
 gcc/config/s390/s390.cc   | 66 ++-
 .../gcc.c-torture/compile/pr104327.c  | 15 +
 2 files changed, 64 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr104327.c

diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 5c2a830f9f0..bbf2dd8dfb4 100644
--- a/gcc/config/s390/s390.cc
+++ b/gcc/config/s390/s390.cc
@@ -16091,6 +16091,25 @@ s390_valid_target_attribute_p (tree fndecl,
 static bool
 s390_can_inline_p (tree caller, tree callee)
 {
+  unsigned HOST_WIDE_INT all_masks =
+(MASK_64BIT | MASK_BACKCHAIN | MASK_DEBUG_ARG | MASK_ZARCH
+ | MASK_HARD_DFP | MASK_SOFT_FLOAT
+ | MASK_OPT_HTM | MASK_LONG_DOUBLE_128 | MASK_MVCLE | MASK_PACKED_STACK
+ | MASK_SMALL_EXEC | MASK_OPT_VX | MASK_ZVECTOR);
+
+  /* Flags which if present in the callee are required in the caller as well.  */
+  unsigned HOST_WIDE_INT caller_required_masks = MASK_OPT_HTM;
+
+  /* Flags which affect the ABI and in general prevent inlining.  */
+  unsigned HOST_WIDE_INT must_match_masks =
+(MASK_64BIT | MASK_ZARCH | MASK_HARD_DFP | MASK_SOFT_FLOAT
+ | MASK_LONG_DOUBLE_128 | MASK_OPT_VX);
+
+  /* Flags which we in general want to prevent inlining but accept for
+ always_inline.  */
+  unsigned HOST_WIDE_INT always_inline_safe_masks =
+MASK_MVCLE | MASK_BACKCHAIN | MASK_SMALL_EXEC;
+
   tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
   tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
 
@@ -16103,16 +16122,18 @@ s390_can_inline_p (tree caller, tree callee)
 
   struct cl_target_option *caller_opts = TREE_TARGET_OPTION (caller_tree);
   struct cl_target_option *callee_opts = TREE_TARGET_OPTION (callee_tree);
-  bool ret = true;
 
-  if ((caller_opts->x_target_flags & ~(MASK_SOFT_FLOAT | MASK_HARD_DFP))
-  != (callee_opts->x_target_flags & ~(MASK_SOFT_FLOAT | MASK_HARD_DFP)))
-ret = false;
+  /* If one of these triggers make sure to add proper handling of your
+ new flag to this hook.  */
+  gcc_assert (!(caller_opts->x_target_flags & ~all_masks));
+  gcc_assert (!(callee_opts->x_target_flags & ~all_masks));
 
-  /* Don't inline functions to be compiled for a more recent arch into a
- function for an older arch.  */
-  else if (caller_opts->x_s390_arch < callee_opts->x_s390_arch)
-ret = false;
+  bool always_inline
+= (DECL_DISREGARD_INLINE_LIMITS (callee)
+   && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee)));
+
+  if (!always_inline)
+must_match_masks |= always_inline_safe_masks;
 
   /* Inlining a hard float function into a soft float function is only
  allowed if the hard float function doesn't actually make use of
@@ -16120,16 +16141,27 @@ s390_can_inline_p (tree caller, tree callee)
 
  We are called from FEs for multi-versioning call optimization, so
  beware of ipa_fn_summaries not available.  */
-  else if (((TARGET_SOFT_FLOAT_P (caller_opts->x_target_flags)
-&& !TARGET_SOFT_FLOAT_P (callee_opts->x_target_flags))
-   || (!TARGET_HARD_DFP_P (caller_opts->x_target_flags)
-   && TARGET_HARD_DFP_P (callee_opts->x_target_flags)))
-  && (! ipa_fn_summaries
-  || ipa_fn_summaries->get
-  (cgraph_node::get (callee))->fp_expressions))
-ret = false;
+  if (always_inline && ipa_fn_summaries
+  && !ipa_fn_summaries->get(cgraph_node::get (callee))->fp_expressions)
+must_match_masks &= ~(MASK_HARD_DFP | MASK_SOFT_FLOAT);
 
-  return ret;
+  if ((caller_opts->x_target_flags & must_match_masks)
+  != (callee_opts->x_target_flags & must_match_masks))
+return false;
+
+  if (~(caller_opts->x_target_flags & caller_required_masks)
+  & (callee_opts->x_target_flags & caller_required_masks))
+return false;
+
+  /* Don't inline functions to be compiled for a more recent arch into a
+ function for an older arch.  */
+  if (caller_opts->x_s390_arch < callee_opts->x_s390_arch)
+return false;
+
+  if (!always_inline && caller_opts->x_s390_tune != callee_opts->x_s390_tune)
+return false;
+
+  return true;
 }
 #endif
 
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr104327.c

Re: [PATCH][GCC11] IBM Z: fix `section type conflict` with -mindirect-branch-table

2022-02-02 Thread Andreas Krebbel via Gcc-patches

On 2/2/22 12:57, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Ok for
> releases/gcc-11?
> 
> 
> 
> s390_code_end () puts indirect branch tables into separate sections and
> tries to switch back to wherever it was in the beginning by calling
> switch_to_section (current_function_section ()).
> 
> First of all, this is unnecessary - the other backends don't do it.
> 
> Furthermore, at this time there is no current function, but if the
> last processed function was cold, in_cold_section_p remains set.  This
> causes targetm.asm_out.function_section () to call
> targetm.section_type_flags (), which in absence of current function
> decl classifies the section as SECTION_WRITE.  This causes a section
> type conflict with the existing SECTION_CODE.
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.c (s390_code_end): Do not switch back to
>   code section.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/nobp-section-type-conflict.c: New test.

Ok. Thanks!

Andreas

> 
> (cherry picked from commit 8753b13a31c777cdab0265dae0b68534247908f7)
> ---
>  gcc/config/s390/s390.c|  1 -
>  .../s390/nobp-section-type-conflict.c | 22 +++
>  2 files changed, 22 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/s390/nobp-section-type-conflict.c
> 
> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
> index 8895dd7cc76..2d2e6522eb4 100644
> --- a/gcc/config/s390/s390.c
> +++ b/gcc/config/s390/s390.c
> @@ -16700,7 +16700,6 @@ s390_code_end (void)
> assemble_name_raw (asm_out_file, label_start);
> fputs ("-.\n", asm_out_file);
>   }
> -   switch_to_section (current_function_section ());
>   }
>  }
>  }
> diff --git a/gcc/testsuite/gcc.target/s390/nobp-section-type-conflict.c b/gcc/testsuite/gcc.target/s390/nobp-section-type-conflict.c
> new file mode 100644
> index 000..5d78bc99bb5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/nobp-section-type-conflict.c
> @@ -0,0 +1,22 @@
> +/* Checks that we don't get error: section type conflict with ‘put_page’.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-mindirect-branch=thunk-extern -mfunction-return=thunk-extern -mindirect-branch-table -O2" } */
> +
> +int a;
> +int b (void);
> +void c (int);
> +
> +static void
> +put_page (void)
> +{
> +  if (b ())
> +c (a);
> +}
> +
> +__attribute__ ((__section__ (".init.text"), __cold__)) void
> +d (void)
> +{
> +  put_page ();
> +  put_page ();
> +}

Re: [PATCH] IBM Z: fix `section type conflict` with -mindirect-branch-table

2022-02-01 Thread Andreas Krebbel via Gcc-patches

On 2/1/22 21:49, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
> 
> 
> s390_code_end () puts indirect branch tables into separate sections and
> tries to switch back to wherever it was in the beginning by calling
> switch_to_section (current_function_section ()).
> 
> First of all, this is unnecessary - the other backends don't do it.
> 
> Furthermore, at this time there is no current function, but if the
> last processed function was cold, in_cold_section_p remains set.  This
> causes targetm.asm_out.function_section () to call
> targetm.section_type_flags (), which in absence of current function
> decl classifies the section as SECTION_WRITE.  This causes a section
> type conflict with the existing SECTION_CODE.
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (s390_code_end): Do not switch back to
>   code section.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/nobp-section-type-conflict.c: New test.

Ok. Thanks!

Andreas


> ---
>  gcc/config/s390/s390.cc   |  1 -
>  .../s390/nobp-section-type-conflict.c | 22 +++
>  2 files changed, 22 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/s390/nobp-section-type-conflict.c
> 
> diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
> index 43c5c72554a..2db12d4ba4b 100644
> --- a/gcc/config/s390/s390.cc
> +++ b/gcc/config/s390/s390.cc
> @@ -16809,7 +16809,6 @@ s390_code_end (void)
> assemble_name_raw (asm_out_file, label_start);
> fputs ("-.\n", asm_out_file);
>   }
> -   switch_to_section (current_function_section ());
>   }
>  }
>  }
> diff --git a/gcc/testsuite/gcc.target/s390/nobp-section-type-conflict.c b/gcc/testsuite/gcc.target/s390/nobp-section-type-conflict.c
> new file mode 100644
> index 000..5d78bc99bb5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/nobp-section-type-conflict.c
> @@ -0,0 +1,22 @@
> +/* Checks that we don't get error: section type conflict with ‘put_page’.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-mindirect-branch=thunk-extern -mfunction-return=thunk-extern -mindirect-branch-table -O2" } */
> +
> +int a;
> +int b (void);
> +void c (int);
> +
> +static void
> +put_page (void)
> +{
> +  if (b ())
> +c (a);
> +}
> +
> +__attribute__ ((__section__ (".init.text"), __cold__)) void
> +d (void)
> +{
> +  put_page ();
> +  put_page ();
> +}

[PATCH] PR101260 regcprop: Add mode change check for copy reg

2022-01-21 Thread Andreas Krebbel via Gcc-patches

When propagating a multi-word register into an access with a smaller
mode the can_change_mode backend hook is already consulted for the
original register.  This however is also required for the intermediate
copy in copy_regno which might use a different register class.
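
The effect of the extra check can be pictured with a small self-contained
model.  This is only a sketch: the toy_ names, the register numbering and
the rule that a vector register refuses a TI to DI change are assumptions
made for illustration, not GCC's actual data structures.

```c
#include <stdbool.h>

/* Toy model, not GCC code: two register classes and two modes.  */
enum toy_class { TOY_GENERAL, TOY_VECTOR };
enum toy_mode { TOY_DI, TOY_TI };

/* Hypothetical register numbering: 0..15 general, 16..31 vector.  */
static enum toy_class
toy_regno_class (unsigned int regno)
{
  return regno < 16 ? TOY_GENERAL : TOY_VECTOR;
}

/* Stand-in for the backend's mode change answer: a vector register
   must not be narrowed from TI to DI.  */
static bool
toy_mode_change_ok (enum toy_mode from, enum toy_mode to, unsigned int regno)
{
  if (toy_regno_class (regno) == TOY_VECTOR && from == TOY_TI && to == TOY_DI)
    return false;
  return true;
}

/* Behaviour before the patch: only the original register is checked.  */
static bool
toy_propagate_old (enum toy_mode orig_mode, enum toy_mode copy_mode,
		   enum toy_mode new_mode, unsigned int regno,
		   unsigned int copy_regno)
{
  (void) copy_mode;
  (void) copy_regno;
  return toy_mode_change_ok (orig_mode, new_mode, regno);
}

/* Behaviour with the patch: the intermediate copy is checked as well.  */
static bool
toy_propagate_new (enum toy_mode orig_mode, enum toy_mode copy_mode,
		   enum toy_mode new_mode, unsigned int regno,
		   unsigned int copy_regno)
{
  return toy_mode_change_ok (orig_mode, new_mode, regno)
	 && toy_mode_change_ok (copy_mode, new_mode, copy_regno);
}
```

With regno 8 (general) and copy_regno 16 (vector) the old variant accepts
the TI to DI change while the new one rejects it.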

Bootstrapped on x86_64 and s390x. No testsuite regressions.

Ok for mainline?

gcc/ChangeLog:

PR rtl-optimization/101260
* regcprop.cc (maybe_mode_change): Invoke mode_change_ok also for
copy_regno.
---
 gcc/regcprop.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/regcprop.cc b/gcc/regcprop.cc
index 1a9bcf0a1ad..8e966f2b5ac 100644
--- a/gcc/regcprop.cc
+++ b/gcc/regcprop.cc
@@ -426,7 +426,8 @@ maybe_mode_change (machine_mode orig_mode, machine_mode copy_mode,
 
   if (orig_mode == new_mode)
 return gen_raw_REG (new_mode, regno);
-  else if (mode_change_ok (orig_mode, new_mode, regno))
+  else if (mode_change_ok (orig_mode, new_mode, regno)
+  && mode_change_ok (copy_mode, new_mode, copy_regno))
 {
   int copy_nregs = hard_regno_nregs (copy_regno, copy_mode);
   int use_nregs = hard_regno_nregs (copy_regno, new_mode);
-- 
2.34.1

Re: [PATCH] s390: Change costs for load on condition.

2022-01-21 Thread Andreas Krebbel via Gcc-patches

On 1/20/22 11:10, Robin Dapp wrote:
> Hi,
> 
> this patch is a follow-up patch to the recent ifcvt changes. It
> increased costs for a load on condition to 6.  This ensures that we
> if-convert sequences of three regular instructions (of cost 4) e.g. a
> compare and two SETs into two loads on condition (of cost 6).  With a
> cost of 5, four-insn sequences (three SETs) would also be if-converted.
> 
> The adjustment to the mov[qi/si]cc expander makes sure we if-convert a
> QImode/bool.  Before, combine would create a paradoxical subreg itself
> but need an additional insn.
> 
> Bootstrapped and regtested on s390x.
> 
> Is it OK?
> 
> Regards
>  Robin
> 
> --
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.cc (s390_rtx_costs): Increase costs for load
>   on condition.
>   * config/s390/s390.md: Change mov[qi/si]cc expander.

Could you please add two tests for the sequences which are improved here?
Just to make sure we notice if it breaks again.

Patch is ok. Thanks!
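
For reference, the cost reasoning in the quoted description can be
reproduced with a simplified model.  The sketch below assumes that a
compare followed by N SETs is weighed against N loads on condition and
that the conversion happens when the new sequence is not more expensive.
The real ifcvt cost computation has more inputs than that, and the toy_
names are hypothetical.

```c
#include <stdbool.h>

/* GCC expresses one regular instruction as COSTS_N_INSNS (1) == 4.  */
#define TOY_INSN_COST 4

/* Simplified assumption: the original sequence is one compare plus
   NUM_SETS SETs, the if-converted sequence is NUM_SETS loads on
   condition of cost LOC_COST each.  */
static bool
toy_if_convert_p (int num_sets, int loc_cost)
{
  int original = (1 + num_sets) * TOY_INSN_COST;
  int converted = num_sets * loc_cost;
  return converted <= original;
}
```

Under these assumptions a cost of 6 converts the three-insn sequence
(12 <= 12) but not the four-insn one (18 > 16), while a cost of 5 would
also convert the latter (15 <= 16).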

Andreas

Re: [PATCH] s390: Split CCSmode into CCSINT and CCSFP

2022-01-21 Thread Andreas Krebbel via Gcc-patches

On 1/20/22 17:13, Robin Dapp wrote:
> Hi,
> 
> this patch splits the CCSmode into an integer and a floating point
> variant.  This allows ifcvt to consider floating point compares which
> would be rejected before because they could not be reversed.
> 
> Bootstrapped and regtested on s390x.
> 
> Is it OK?
> 
> Regards
>  Robin
> 
> --
> 
> gcc/ChangeLog:
> 
>   * config/s390/predicates.md: Add CCSINTmode and CCSFPmode.
>   * config/s390/s390-modes.def (UNORDERED): Likewise.
>   (CC_MODE): Likewise.
>   * config/s390/s390.cc (s390_cc_modes_compatible): Likewise.
>   (s390_match_ccmode_set): Likewise.
>   (s390_select_ccmode): Likewise.
>   (s390_branch_condition_mask): Likewise.
>   (s390_reverse_condition): Likewise.
>   * config/s390/s390.h (REVERSIBLE_CC_MODE): Likewise.
>   * config/s390/s390.md: Likewise.
>   * config/s390/subst.md: Likewise.

> diff --git a/gcc/config/s390/predicates.md b/gcc/config/s390/predicates.md
> index 33194d3f3d6..ec47416cc1b 100644
> --- a/gcc/config/s390/predicates.md
> +++ b/gcc/config/s390/predicates.md
> @@ -325,7 +325,8 @@
>  case E_CCURmode:
>return GET_CODE (op) == LTU;
>
> -case E_CCSmode:
> +case E_CCSINTmode:
> +case E_CCSFPmode:
>return GET_CODE (op) == UNGT;

Can we get an UNGT for CCSINTmode here? Shouldn't this be just GT?

>
>  case E_CCSRmode:
> @@ -370,7 +371,8 @@
>  case E_CCURmode:
>return GET_CODE (op) == GEU;
>
> -case E_CCSmode:
> +case E_CCSINTmode:
> +case E_CCSFPmode:
>return GET_CODE (op) == LE;
>
>  case E_CCSRmode:
> diff --git a/gcc/config/s390/s390-modes.def b/gcc/config/s390/s390-modes.def
> index b419907960e..eafe1e12938 100644
> --- a/gcc/config/s390/s390-modes.def
> +++ b/gcc/config/s390/s390-modes.def
> @@ -48,12 +48,12 @@ CCUR: EQ  GTU  LTU NE (CLGF/R)
>
>  Signed compares
>
> -CCS:  EQ  LT   GT  UNORDERED  (LTGFR, LTGR, LTR, ICM/Y,
> -   LTDBR, LTDR, LTEBR, LTER,
> +CCSINT: EQ  LT   GT  UNORDERED  (LTGFR, LTGR, LTR, ICM/Y,

CC3 for signed integer compares should not occur. So perhaps '-' instead
of UNORDERED?

>    CG/R, C/R/Y, CGHI, CHI,
> -   CDB/R, CD/R, CEB/R, CE/R,
> -   ADB/R, AEB/R, SDB/R, SEB/R,
>    SRAG, SRA, SRDA)
> +CCSFP:  EQ  LT   GT  UNORDERED  (CDB/R, CD/R, CEB/R, CE/R,
> +   LTDBR, LTDR, LTEBR, LTER,
> +   ADB/R, AEB/R, SDB/R, SEB/R)
>  CCSR: EQ  GT   LT  UNORDERED  (CGF/R, CH/Y)
>  CCSFPS: EQ  LT   GT  UNORDERED  (KEB/R, KDB/R, KXBR, KDTR,
>  KXTR, WFK)
...
> @@ -2139,7 +2148,8 @@ s390_branch_condition_mask (rtx code)
>   }
>break;
>
> -case E_CCSmode:
> +case E_CCSINTmode:
> +case E_CCSFPmode:
>  case E_CCSFPSmode:
>switch (GET_CODE (code))
>   {

We will need a new switch statement for CCSINT without all the FP-only
comparison operators.
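
As background for why the integer and floating point variants need
different operator sets, here is a minimal sketch in plain C (not GCC
code, helper names are made up).  For integers the reverse of GT is LE.
For floating point it is UNLE, because with a NaN operand neither
a > b nor a <= b holds.

```c
#include <math.h>
#include <stdbool.h>

/* For integers exactly one of a > b and a <= b holds, so LE is the
   reverse of GT.  */
static bool
toy_int_reverse_is_le (long a, long b)
{
  return !(a > b) == (a <= b);
}

/* For floating point this only holds for ordered operands.  If either
   operand is a NaN, both a > b and a <= b are false.  */
static bool
toy_fp_reverse_is_le (double a, double b)
{
  return !(a > b) == (a <= b);
}
```

toy_fp_reverse_is_le (NAN, 1.0) yields false, which is the case an
integer-only mode never has to represent.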

Andreas

Re: [PATCH v2] Disable -fsplit-stack support on non-glibc targets

2022-01-20 Thread Andreas Krebbel via Gcc-patches

On 1/20/22 23:52, Richard Sandiford wrote:
> cc:ing the x86 and s390 maintainers
> 
> soeren--- via Gcc-patches  writes:
>> From: Sören Tempel 
>>
>> The -fsplit-stack option requires the pthread_t TCB definition in the
>> libc to provide certain struct fields at specific hardcoded offsets. As
>> far as I know, only glibc provides these fields at the required offsets.
>> Most notably, musl libc does not have these fields. However, since gcc
>> accesses the fields using a fixed offset, this does not cause a
>> compile-time error, but instead results in a silent memory corruption at
>> run-time with musl libc. For example, on s390x libgcc's
>> __stack_split_initialize CTOR will overwrite the cancel field in the
>> pthread_t TCB on musl.
>>
>> The -fsplit-stack option is used within the gcc code base itself by
>> gcc-go (if available). On musl-based systems with split-stack support
>> (i.e. s390x or x86) this causes Go programs compiled with gcc-go to
>> misbehave at run-time.
>>
>> This patch fixes gcc-go on musl by disabling -fsplit-stack in gcc itself
>> since it is not supported on non-glibc targets anyhow. This is achieved
>> by checking if gcc targets a glibc-based system. This check has been
>> added for x86 and s390x, the rs6000 config already checks for
>> TARGET_GLIBC_MAJOR. Other architectures do not have split-stack
>> support. With this patch applied, the gcc-go configure script will
>> detect that -fsplit-stack support is not available and will not use it.
>>
>> See https://www.openwall.com/lists/musl/2012/10/16/12
>>
>> This patch was written under the assumption that glibc is the only libc
>> implementation which supports the required fields at the required
>> offsets in the pthread_t TCB. The patch has been tested on Alpine Linux
>> Edge on the s390x and x86 architectures by bootstrapping Google's Go
>> implementation with gcc-go.
>>
>> Signed-off-by: Sören Tempel 
>>
>> gcc/ChangeLog:
>>
>>  * common/config/s390/s390-common.c (s390_supports_split_stack):
>>  Only support split-stack on glibc targets.
>>  * config/i386/gnu-user-common.h (STACK_CHECK_STATIC_BUILTIN): Ditto.
>>  * config/i386/gnu.h (defined): Ditto.

s390 parts are ok.

Thanks!
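
The shape of the check the patch adds can be sketched in isolation as
follows.  This is a toy model under stated assumptions: toy_options,
toy_libc and the error counter are hypothetical stand-ins for
gcc_options, x_linux_libc and error (), so that the snippet compiles on
its own.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the libc selection in gcc_options.  */
enum toy_libc { TOY_LIBC_GLIBC, TOY_LIBC_MUSL };
struct toy_options { enum toy_libc libc; };

/* Counts diagnostics; stands in for GCC's error ().  */
static int toy_error_count;

/* Split-stack is only reported as supported for glibc.  A diagnostic
   is only issued when REPORT is set.  */
static bool
toy_supports_split_stack (bool report, const struct toy_options *opts)
{
  if (opts->libc == TOY_LIBC_GLIBC)
    return true;

  if (report)
    toy_error_count++;
  return false;
}
```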

Andreas

>> ---
>> This version of the patch addresses feedback by Andrew Pinski and uses
>> OPTION_GLIBC as well as opts->x_linux_libc == LIBC_GLIBC to detect glibc
>> targets (instead of relying on TARGET_GLIBC_MAJOR).
>>
>>  gcc/common/config/s390/s390-common.c | 11 +--
>>  gcc/config/i386/gnu-user-common.h|  5 +++--
>>  gcc/config/i386/gnu.h|  6 +-
>>  3 files changed, 17 insertions(+), 5 deletions(-)
> 
> Sorry for the slow review.  The patch LGTM bar some minor formatting
> nits below, but target maintainers should have the final say.
> 
>> diff --git a/gcc/common/config/s390/s390-common.c b/gcc/common/config/s390/s390-common.c
>> index b6bc8501742..fc86e0bc5e7 100644
>> --- a/gcc/common/config/s390/s390-common.c
>> +++ b/gcc/common/config/s390/s390-common.c
>> @@ -116,13 +116,20 @@ s390_handle_option (struct gcc_options *opts ATTRIBUTE_UNUSED,
>>  
>>  /* -fsplit-stack uses a field in the TCB, available with glibc-2.23.
>> We don't verify it, since earlier versions just have padding at
>> -   its place, which works just as well.  */
>> +   its place, which works just as well. For other libc implementations
> 
> GCC style is to use 2 spaces after a full stop.  Same for the x86 part.
> 
>> +   we disable the feature entirely to avoid corrupting the TCB.  */
>>  
>>  static bool
>>  s390_supports_split_stack (bool report ATTRIBUTE_UNUSED,
>> struct gcc_options *opts ATTRIBUTE_UNUSED)
> 
> These parameters are no longer unused after the patch, so it'd be good
> to remove the attributes.
> 
>>  {
>> -  return true;
>> +  if (opts->x_linux_libc == LIBC_GLIBC) {
>> +return true;
>> +  } else {
>> +if (report)
>> +  error("%<-fsplit-stack%> currently only supported on GNU/Linux");
>> +return false;
>> +  }
> 
> Normal GCC formatting would be something like:
> 
>   if (opts->x_linux_libc == LIBC_GLIBC)
> return true;
> 
>   if (report)
> error ("%<-fsplit-stack%> currently only supported on GNU/Linux");
>   return false;
> 
> Sorry for the fussy rules.
> 
> Thanks,
> Richard
> 
>>  }
>>  
>>  #undef TARGET_DEFAULT_TARGET_FLAGS
>> diff --git a/gcc/config/i386/gnu-user-common.h b/gcc/config/i386/gnu-user-common.h
>> index 00226f5a455..6e13315b5a3 100644
>> --- a/gcc/config/i386/gnu-user-common.h
>> +++ b/gcc/config/i386/gnu-user-common.h
>> @@ -66,7 +66,8 @@ along with GCC; see the file COPYING3.  If not see
>>  #define STACK_CHECK_STATIC_BUILTIN 1
>>  
>>  /* We only build the -fsplit-stack support in libgcc if the
>> -   assembler has full support for the CFI directives.  */
>> -#if HAVE_GAS_CFI_PERSONALITY_DIRECTIVE
>> +   assembler has full support for the CFI directives and
>> +   targets glibc.  */
>> +#if HAVE_GAS_CFI_PERSONALITY_DIRECTIVE

Re: [PATCH] cprop_hardreg: Workaround for narrow mode != lowpart targets

2022-01-14 Thread Andreas Krebbel via Gcc-patches

On 1/14/22 20:41, Andreas Krebbel via Gcc-patches wrote:
> On 1/14/22 08:37, Richard Biener wrote:
> ...
>> Can the gist of this bug be put into the GCC bugzilla so the rev can
>> refer to it? 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034
> 
>> Can we have a testcase even?
> The testcase from Jakub is in the BZ. However, since it doesn't fail with
> head I didn't try to include it in my patch.
> 
>> I'm not quite understanding the problem but is it that, say,
>>
>>  (subreg:DI (reg:V2DI ..) 0)
>>
>> isn't the same as
>>
>>  (lowpart:DI (reg:V2DI ...) 0)
> 
> (reg:DI v0) does not match the lower order bits of (reg:TI v0)
> 
>> ?  The regcprop code looks more like asking whether the larger reg
>> is a composition of multiple other hardregs and will return the specific
>> hardreg corresponding to the lowpart - so like if on s390 the vector
>> registers overlap with some other regset.  But then doing the actual
>> accesses via the other regset regs doesn't actually work?  Isn't the
>> backend then lying to us (aka the mode_change_ok returns the
>> wrong answer)?
> 
> can_change_mode_class should do the right thing. We return false in case
> somebody wants to change TI to DI for a vector register. However, the hook
> never gets called like this from regcprop. regcprop only asks whether it is
> ok to change (reg:TI r8) to (reg:DI r8) and that's indeed ok.

After writing this I'm wondering whether this would be a better fix:

diff --git a/gcc/regcprop.c b/gcc/regcprop.c
index 18132425ab2..b6a3f4e3804 100644
--- a/gcc/regcprop.c
+++ b/gcc/regcprop.c
@@ -402,7 +402,8 @@ maybe_mode_change (machine_mode orig_mode, machine_mode copy_mode,

   if (orig_mode == new_mode)
 return gen_raw_REG (new_mode, regno);
-  else if (mode_change_ok (orig_mode, new_mode, regno))
+  else if (mode_change_ok (orig_mode, new_mode, regno)
+   && mode_change_ok (copy_mode, new_mode, copy_regno))
 {
   int copy_nregs = hard_regno_nregs (copy_regno, copy_mode);
   int use_nregs = hard_regno_nregs (copy_regno, new_mode);


Andreas

Re: [PATCH] cprop_hardreg: Workaround for narrow mode != lowpart targets

2022-01-14 Thread Andreas Krebbel via Gcc-patches

On 1/14/22 08:37, Richard Biener wrote:
...
> Can the gist of this bug be put into the GCC bugzilla so the rev can
> refer to it? 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104034

> Can we have a testcase even?
The testcase from Jakub is in the BZ. However, since it doesn't fail with
head I didn't try to include it in my patch.

> I'm not quite understanding the problem but is it that, say,
> 
>  (subreg:DI (reg:V2DI ..) 0)
> 
> isn't the same as
> 
>  (lowpart:DI (reg:V2DI ...) 0)

(reg:DI v0) does not match the lower order bits of (reg:TI v0)

> ?  The regcprop code looks more like asking whether the larger reg
> is a composition of multiple other hardregs and will return the specific
> hardreg corresponding to the lowpart - so like if on s390 the vector
> registers overlap with some other regset.  But then doing the actual
> accesses via the other regset regs doesn't actually work?  Isn't the
> backend then lying to us (aka the mode_change_ok returns the
> wrong answer)?

can_change_mode_class should do the right thing. We return false in case
somebody wants to change TI to DI for a vector register. However, the hook
never gets called like this from regcprop. regcprop only asks whether it is
ok to change (reg:TI r8) to (reg:DI r8) and that's indeed ok.

Before cprop we have:

(insn 175 176 174 3 (set (reg/v:TI 16 %f0 [orig:69 __comp ] [69])
(reg:TI 8 %r8)) -1
 (nil))

(insn 155 124 156 3 (set (reg:DI 6 %r6 [ __comp ])
(reg:DI 16 %f0)) 1277 {*movdi_64}
 (nil))

(insn 156 155 128 3 (set (reg:DI 7 %r7 [orig:69 __comp+8 ] [69])
(unspec:DI [
(reg:V2DI 16 %f0)
(const_int 1 [0x1])
] UNSPEC_VEC_EXTRACT)) 409 {*vec_extractv2di}
 (expr_list:REG_DEAD (reg:V2DI 16 %f0)
(nil)))

So a copy of reg pair r8/r9 is kept in v0==f0. The problem comes from cprop
assuming that (reg:DI f0) refers to the low part of f0. As a consequence it
replaces (reg:DI 16 %f0) with (reg:DI 9 %r9), which would be the DImode
lowpart of (reg:TI r8).
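
The mismatch can be illustrated with a small stand-alone model.  It is
not GCC code: the TImode value is represented as a struct with a high
and a low half and the toy_ helpers are hypothetical names for the two
kinds of DImode access described here.

```c
#include <stdint.h>

/* A 128-bit TImode value kept as a high and a low 64-bit half.  */
struct toy_ti { uint64_t hi, lo; };

/* In the GPR pair r8/r9 the first register holds the high half and the
   second one the low half, so the DImode lowpart of (reg:TI r8) is
   r9.  */
static uint64_t
toy_gpr_pair_lowpart (struct toy_ti r8_r9)
{
  return r8_r9.lo;
}

/* A DImode access to the vector register reads the first element,
   which is the high half of the TImode value.  */
static uint64_t
toy_vr_di_access (struct toy_ti v0)
{
  return v0.hi;
}
```

For any value whose halves differ the two accesses disagree, so
substituting one for the other changes the result.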

Insn 155 and 156 are the result of applying the following splitter:

; Split a VR -> GPR TImode move into 2 vector load GR from VR element.
; For the higher order bits we do simply a DImode move while the
; second part is done via vec extract.  Both will end up as vlgvg.
(define_split
  [(set (match_operand:TI 0 "register_operand" "")
(match_operand:TI 1 "register_operand" ""))]
  "TARGET_VX && reload_completed
   && GENERAL_REG_P (operands[0])
   && VECTOR_REG_P (operands[1])"
  [(set (match_dup 2) (match_dup 4))
   (set (match_dup 3) (unspec:DI [(match_dup 5) (const_int 1)]
 UNSPEC_VEC_EXTRACT))]
{
  operands[2] = operand_subword (operands[0], 0, 0, TImode);
  operands[3] = operand_subword (operands[0], 1, 0, TImode);
  operands[4] = gen_rtx_REG (DImode, REGNO (operands[1]));
  operands[5] = gen_rtx_REG (V2DImode, REGNO (operands[1]));
})

Introducing the (reg:DI 16 %f0) access to the TImode VR is something the
middle end is not expected to do - because we prevent it in
can_change_mode_class. However, I don't see anything wrong with doing that
in the splitter. In our backend this is well-defined as being the first
element in the vector register - the high part of the TImode vector
register value.

Unfortunately it confuses cprop :(

Andreas

> 
> How does the stage1 fix, aka "rewrite" of cprop, look like?  How can we
> be sure this hack isn't still present in 10 years from now?
> 
> Thanks,
> Richard.
> 
>> Bootstrapped and regression-tested on s390x.
>>
>> Ok?
>>
>> gcc/ChangeLog:
>>
>> * target.def (narrow_mode_refers_low_part_p): Add new target hook.
>> * config/s390/s390.c (s390_narrow_mode_refers_low_part_p):
>> Implement new target hook for IBM Z.
>> (TARGET_NARROW_MODE_REFERS_LOW_PART_P): New macro.
>> * regcprop.c (maybe_mode_change): Disable transformation depending
>> on the new target hook.
>> ---
>>  gcc/config/s390/s390.c | 14 ++
>>  gcc/regcprop.c |  3 ++-
>>  gcc/target.def | 12 +++-
>>  3 files changed, 27 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
>> index 056002e4a4a..aafc6d63be6 100644
>> --- a/gcc/config/s390/s390.c
>> +++ b/gcc/config/s390/s390.c
>> @@ -10488,6 +10488,18 @@ s390_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
>>return false;
>>  }
>>
>> +/* Implement TARGET_NARROW_MODE_REFERS_LOW_PART_P.  */
>> +
>> +static bool
>> +s390_narrow_mode_refers_low_part_p (unsigned int regno)
>> +{
>> +  if (reg_classes_intersect_p (VEC_REGS, REGNO_REG_CLASS (regno)))
>> +return false;
>> +
>> +  return true;
>> +}
>> +
>> +
>>  /* Implement TARGET_MODES_TIEABLE_P.  */
>>
>>  static bool
>> @@ -17472,6 +17484,8 @@ s390_vectorize_vec_perm_const (machine_mode vmode, rtx target, rtx op0, rtx op1,
>>  #undef TARGET_VECTORIZE_VEC_PERM_CONST
>>  #define

Re: [PATCH] cprop_hardreg: Workaround for narrow mode != lowpart targets

2022-01-13 Thread Andreas Krebbel via Gcc-patches

On 1/13/22 18:11, Andreas Krebbel via Gcc-patches wrote:
...
> @@ -5949,7 +5959,7 @@ register if floating point arithmetic is not being done.  As long as the\n\
>  floating registers are not in class @code{GENERAL_REGS}, they will not\n\
>  be used unless some pattern's constraint asks for one.",
>   bool, (unsigned int regno, machine_mode mode),
> - hook_bool_uint_mode_true)
> + hook_bool_uint_true)
>  
>  DEFHOOK
>  (modes_tieable_p,

That hunk was a copy and paste bug and does not belong to the patch.

Andreas

[PATCH] cprop_hardreg: Workaround for narrow mode != lowpart targets

2022-01-13 Thread Andreas Krebbel via Gcc-patches

The cprop_hardreg pass is built around the assumption that accessing a
register in a narrower mode is the same as accessing the lowpart of
the register.  This unfortunately is not true for vector registers on
IBM Z. This caused a miscompile of LLVM with GCC 8.5. The problem
could not be reproduced with upstream GCC unfortunately but we have to
assume that it is latent there. The right fix would require
substantial changes to the cprop pass and is certainly something we
would want for our platform. But since this would not be acceptable
for older GCCs I'll go with what Vladimir proposed in the RedHat BZ
and introduce a hopefully temporary and undocumented target hook to
disable that specific transformation in regcprop.c.

Here is the Red Hat BZ for reference:
https://bugzilla.redhat.com/show_bug.cgi?id=2028609

Bootstrapped and regression-tested on s390x.

Ok?

gcc/ChangeLog:

* target.def (narrow_mode_refers_low_part_p): Add new target hook.
* config/s390/s390.c (s390_narrow_mode_refers_low_part_p):
Implement new target hook for IBM Z.
(TARGET_NARROW_MODE_REFERS_LOW_PART_P): New macro.
* regcprop.c (maybe_mode_change): Disable transformation depending
on the new target hook.
---
 gcc/config/s390/s390.c | 14 ++
 gcc/regcprop.c |  3 ++-
 gcc/target.def | 12 +++-
 3 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 056002e4a4a..aafc6d63be6 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -10488,6 +10488,18 @@ s390_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
   return false;
 }
 
+/* Implement TARGET_NARROW_MODE_REFERS_LOW_PART_P.  */
+
+static bool
+s390_narrow_mode_refers_low_part_p (unsigned int regno)
+{
+  if (reg_classes_intersect_p (VEC_REGS, REGNO_REG_CLASS (regno)))
+return false;
+
+  return true;
+}
+
+
 /* Implement TARGET_MODES_TIEABLE_P.  */
 
 static bool
@@ -17472,6 +17484,8 @@ s390_vectorize_vec_perm_const (machine_mode vmode, rtx target, rtx op0, rtx op1,
 #undef TARGET_VECTORIZE_VEC_PERM_CONST
 #define TARGET_VECTORIZE_VEC_PERM_CONST s390_vectorize_vec_perm_const
 
+#undef TARGET_NARROW_MODE_REFERS_LOW_PART_P
+#define TARGET_NARROW_MODE_REFERS_LOW_PART_P s390_narrow_mode_refers_low_part_p
 
 struct gcc_target targetm = TARGET_INITIALIZER;
 
diff --git a/gcc/regcprop.c b/gcc/regcprop.c
index 1a9bcf0a1ad..aaf94ad9b51 100644
--- a/gcc/regcprop.c
+++ b/gcc/regcprop.c
@@ -426,7 +426,8 @@ maybe_mode_change (machine_mode orig_mode, machine_mode copy_mode,
 
   if (orig_mode == new_mode)
 return gen_raw_REG (new_mode, regno);
-  else if (mode_change_ok (orig_mode, new_mode, regno))
+  else if (mode_change_ok (orig_mode, new_mode, regno)
+  && targetm.narrow_mode_refers_low_part_p (regno))
 {
   int copy_nregs = hard_regno_nregs (copy_regno, copy_mode);
   int use_nregs = hard_regno_nregs (copy_regno, new_mode);
diff --git a/gcc/target.def b/gcc/target.def
index 8fd2533e90a..598eea501ff 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5446,6 +5446,16 @@ value that the middle-end intended.",
  bool, (machine_mode from, machine_mode to, reg_class_t rclass),
  hook_bool_mode_mode_reg_class_t_true)
 
+/* This hook is used to work around a problem in regcprop. Hardcoded
+assumptions currently prevent it from working correctly for targets
+where the low part of a multi-word register doesn't align to accessing
+the register with a narrower mode.  */
+DEFHOOK_UNDOC
+(narrow_mode_refers_low_part_p,
+"",
+bool, (unsigned int regno),
+hook_bool_unit_true)
+
 /* Change pseudo allocno class calculated by IRA.  */
 DEFHOOK
 (ira_change_pseudo_allocno_class,
@@ -5949,7 +5959,7 @@ register if floating point arithmetic is not being done.  As long as the\n\
 floating registers are not in class @code{GENERAL_REGS}, they will not\n\
 be used unless some pattern's constraint asks for one.",
  bool, (unsigned int regno, machine_mode mode),
- hook_bool_uint_mode_true)
+ hook_bool_uint_true)
 
 DEFHOOK
 (modes_tieable_p,
-- 
2.33.1

Re: [PATCH] IBM Z: Fix load-and-test peephole2 condition

2021-11-19 Thread Andreas Krebbel via Gcc-patches

On 11/19/21 10:45, Stefan Schulze Frielinghaus wrote:
...
> diff --git a/gcc/testsuite/gcc.target/s390/2029.c b/gcc/testsuite/gcc.target/s390/2029.c
> new file mode 100644
> index 000..1a6df4f4b89
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/2029.c
> @@ -0,0 +1,12 @@
> +/* { dg-do run } */
> +/* { dg-options "-Os -march=z10" } */

Although z10 is pretty old we will need an effective target check here.
Ok with that change.

Thanks!

Andreas

Re: [PATCH] Fix PR103028

2021-11-05 Thread Andreas Krebbel via Gcc-patches

On 11/5/21 20:34, Jeff Law wrote:
> 
> 
> On 11/5/2021 4:19 AM, Andreas Krebbel via Gcc-patches wrote:
>> This prevents find_cond_trap from being invoked after reload.  It may
>> generate compares which would require reloading.
>>
>> Bootstrapped and regression tested on s390x.
>>
>> Ok for mainline?
>>
>> gcc/ChangeLog:
>>
>>  PR rtl-optimization/103028
>>  * ifcvt.c (find_if_header): Invoke find_cond_trap only before
>>  reload.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  PR rtl-optimization/103028
>>  * gcc.dg/pr103028.c: New test.
> Shouldn't this be handled by the target by rejecting creating the trap 
> after reload has completed since the target seems to need new pseudos to 
> generate a conditional trap?  Otherwise we're penalizing targets which 
> don't need new pseudos to generate conditional traps.

In this case we do not explicitly create a new pseudo. It is rather that we
emit a pattern which would need to be handled by reload. I think passes
which run after reload are not allowed to emit patterns which would require
reloading, and it cannot be up to the backend to prevent this.

Instead of disabling this path after reload we could also try to check all
the to-be-emitted insns with constrain_operands to make sure at least one of
the alternatives is an immediate match. This should only reject cases which
are really broken. I didn't try this because I haven't seen anything like
this in ifcvt.c, while I have seen several places where we just bail out
once reload_completed is true.

Andreas

[PATCH] Fix PR103028

2021-11-05 Thread Andreas Krebbel via Gcc-patches

This prevents find_cond_trap from being invoked after reload.  It may
generate compares which would require reloading.

Bootstrapped and regression tested on s390x.

Ok for mainline?

gcc/ChangeLog:

PR rtl-optimization/103028
* ifcvt.c (find_if_header): Invoke find_cond_trap only before
reload.

gcc/testsuite/ChangeLog:

PR rtl-optimization/103028
* gcc.dg/pr103028.c: New test.
---
 gcc/ifcvt.c |  3 ++-
 gcc/testsuite/gcc.dg/pr103028.c | 16 
 2 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr103028.c

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 017944f4f79..1f5b9476ac2 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -4341,7 +4341,8 @@ find_if_header (basic_block test_bb, int pass)
   && cond_exec_find_if_block (_info))
 goto success;
 
-  if (targetm.have_trap ()
+  if (!reload_completed
+  && targetm.have_trap ()
   && optab_handler (ctrap_optab, word_mode) != CODE_FOR_nothing
   && find_cond_trap (test_bb, then_edge, else_edge))
 goto success;
diff --git a/gcc/testsuite/gcc.dg/pr103028.c b/gcc/testsuite/gcc.dg/pr103028.c
new file mode 100644
index 000..e299ac5d5b5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103028.c
@@ -0,0 +1,16 @@
+/* PR rtl-optimization/103028 */
+/* { dg-do compile } */
+/* { dg-options "-Og -fif-conversion2 -fharden-conditional-branches" } */
+
+/* This used to fail on s390x only with -march=z9-109 and -march=z9-ec */
+/* { dg-additional-options "-march=z9-ec" { target s390*-*-* } } */
+
+unsigned char x;
+int foo(void)
+{
+  unsigned long long i = x;
+  i = i + 0x8000;
+  if (i > 0x)
+return x;
+  return 0;
+}
-- 
2.31.1

[Committed] IBM Z: Define STACK_CHECK_MOVING_SP

2021-11-04 Thread Andreas Krebbel via Gcc-patches

With -fstack-check the stack probes emitted access memory below the
stack pointer.

Bootstrapped and regression tested on s390x.

Committed to mainline

gcc/ChangeLog:

* config/s390/s390.h (STACK_CHECK_MOVING_SP): New macro
definition.
---
 gcc/config/s390/s390.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h
index fb16a455a03..186c5c6200b 100644
--- a/gcc/config/s390/s390.h
+++ b/gcc/config/s390/s390.h
@@ -332,6 +332,11 @@ extern const char *s390_host_detect_local_cpu (int argc, const char **argv);
 
 #define STACK_SIZE_MODE (Pmode)
 
+/* Make the stack pointer to be moved downwards while issuing stack probes with
+   -fstack-check.  We need this to prevent memory below the stack pointer from
+   being accessed.  */
+#define STACK_CHECK_MOVING_SP 1
+
 #ifndef IN_LIBGCC2
 
 /* Width of a word, in units (bytes).  */
-- 
2.31.1

Re: [PATCH] IBM Z: Free bbs in s390_loop_unroll_adjust

2021-11-03 Thread Andreas Krebbel via Gcc-patches

On 11/2/21 18:31, Stefan Schulze Frielinghaus wrote:
> Bootstrapped and regtested on IBM Z.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.c (s390_loop_unroll_adjust): In case of early
>   exit free bbs.

Ok. Thanks!

Andreas

Re: [PATCH] IBM Z: ldist-{rawmemchr,strlen} tests require vector extensions

2021-11-02 Thread Andreas Krebbel via Gcc-patches

On 11/2/21 15:54, Stefan Schulze Frielinghaus wrote:
> The tests require vector extensions which are only available for z13 and
> later while using the z/Architecture.
> 
> Bootstrapped and regtested on IBM Z.  Ok for mainline?
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/tree-ssa/ldist-rawmemchr-1.c: For IBM Z set arch to z13
>   and use z/Architecture since the tests require vector extensions.
>   * gcc.dg/tree-ssa/ldist-rawmemchr-2.c: Likewise.
>   * gcc.dg/tree-ssa/ldist-strlen-1.c: Likewise.
>   * gcc.dg/tree-ssa/ldist-strlen-3.c: Likewise.

Ok. Thanks!

Andreas

Re: [PATCH] IBM Z: Fix address of operands will never be NULL warnings

2021-11-02 Thread Andreas Krebbel via Gcc-patches

On 10/30/21 12:43, Stefan Schulze Frielinghaus wrote:
> Since a recent enhancement of -Waddress a couple of warnings are emitted
> and turned into errors during bootstrap:
> 
> gcc/config/s390/s390.md:12087:25: error: the address of 'operands' will never be NULL [-Werror=address]
> 12087 |   "TARGET_HTM && operands != NULL
> build/gencondmd.c:59:12: note: 'operands' declared here
>59 | extern rtx operands[];
>   |^~~~
> 
> Fixed by removing those non-null checks.
> Bootstrapped and regtested on IBM Z.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.md ("*cc_to_int", "tabort", "*tabort_1",
>   "*tabort_1_plus"): Remove operands non-null check.

Ok. Thanks!
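
For readers not familiar with the warning, a minimal stand-alone sketch
of why the check is redundant.  toy_operands and toy_target_htm are
hypothetical stand-ins for the generated "extern rtx operands[];" and
TARGET_HTM.  The null test goes through a pointer variable here so that
the snippet itself compiles without the warning.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the generated operands array.  */
static void *toy_operands[4];

/* Hypothetical stand-in for TARGET_HTM.  */
static bool toy_target_htm = true;

/* Condition as written before the fix.  The array decays to the
   address of its first element, which can never be NULL.  */
static bool
toy_condition_before_fix (void)
{
  void **p = toy_operands;
  return toy_target_htm && p != NULL;
}

/* Condition after the fix: the null check is simply dropped.  */
static bool
toy_condition_after_fix (void)
{
  return toy_target_htm;
}
```

Both functions return the same value for either setting of
toy_target_htm.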

Andreas

Re: [PATCH] IBM Z: Provide rawmemchr{qi,hi,si} expander

2021-10-08 Thread Andreas Krebbel via Gcc-patches

On 10/8/21 16:23, Stefan Schulze Frielinghaus wrote:
> On Thu, Oct 07, 2021 at 11:16:24AM +0200, Andreas Krebbel wrote:
>> On 9/20/21 11:24, Stefan Schulze Frielinghaus wrote:
>>> This patch implements the rawmemchr expander as introduced in
>>> https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579649.html
>>>
>>> Bootstrapped and regtested in conjunction with the patch from above on
>>> IBM Z.  Ok for mainline?
>>>
>>
>>> From 551362cda54048dc1a51588112f11c070ed52020 Mon Sep 17 00:00:00 2001
>>> From: Stefan Schulze Frielinghaus 
>>> Date: Mon, 8 Feb 2021 10:35:39 +0100
>>> Subject: [PATCH 2/2] IBM Z: Provide rawmemchr{qi,hi,si} expander
>>>
>>> gcc/ChangeLog:
>>>
>>> * config/s390/s390-protos.h (s390_rawmemchrqi): Add prototype.
>>> (s390_rawmemchrhi): Add prototype.
>>> (s390_rawmemchrsi): Add prototype.
>>> * config/s390/s390.c (s390_rawmemchr): New function.
>>> (s390_rawmemchrqi): New function.
>>> (s390_rawmemchrhi): New function.
>>> (s390_rawmemchrsi): New function.
>>> * config/s390/s390.md (rawmemchr): New expander.
>>> (rawmemchr): New expander.
>>> * config/s390/vector.md (vec_vfees): Basically a copy of
>>> the pattern vfees from vx-builtins.md.
>>> * config/s390/vx-builtins.md (*vfees): Remove.
>>
>> Thanks! Would it make sense to also extend the strlen and movstr expanders
>> we have to support the additional character modes?
> 
> For strlen-like loops over non-character arrays the current
> implementation in the loop distribution pass uses rawmemchr and
> computes pointer difference in order to compute the length.  Thus we get
> strlen for free and don't need to reimplement it.

Good to know. Thanks!

...
> Please find a new version attached.  I did another bootstrap+regtest on
> IBM Z.  Ok for mainline?
> 
> Thanks for your detailed review!

Ok for mainline. Thanks!

Andreas
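
As a rough illustration of the point made in the thread above (strlen
falls out of rawmemchr plus a pointer difference), here is a minimal
sketch in plain C.  The names rawmemchr_u16 and strlen_u16 are
hypothetical and are not part of the patch; the scalar loop only stands
in for what the expander emits with vector instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-bit rawmemchr: scan forward until PAT
   is found.  Like rawmemchr, it assumes PAT is present; there is no
   length limit.  */
static const uint16_t *
rawmemchr_u16 (const uint16_t *s, uint16_t pat)
{
  assert (s != NULL);
  while (*s != pat)
    s++;
  return s;
}

/* strlen for a zero-terminated uint16_t array, expressed as rawmemchr
   plus a pointer difference.  */
static size_t
strlen_u16 (const uint16_t *s)
{
  return (size_t) (rawmemchr_u16 (s, 0) - s);
}
```

For an array { 7, 8, 9, 0, 1 } the sketch yields a length of 3, since
the terminating zero sits at index 3.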

Re: [PATCH] IBM Z: Provide rawmemchr{qi,hi,si} expander

2021-10-07 Thread Andreas Krebbel via Gcc-patches

aa;  \
> +  /* unaligned + 3rd load */   \
> +  q[19] = pattern; \
> +  assert ((rawmemchr_##T ([2]) == [19]));  \
> +  q[19] = (T) 0x;  \
> +  /* unaligned + 4th load */   \
> +  q[25] = pattern; \
> +  assert ((rawmemchr_##T ([2]) == [25]));  \
> +  q[25] = (T) 0x;  \
> +  /* aligned + 1st load */ \
> +  q[5] = pattern;  \
> +  assert ((rawmemchr_##T ([0]) == [5]));   \
> +  q[5] = (T) 0x;   \
> +  /* aligned + 2nd load */ \
> +  q[14] = pattern; \
> +  assert ((rawmemchr_##T ([0]) == [14]));  \
> +  q[14] = (T) 0x;  \
> +  /* aligned + 3rd load */ \
> +  q[19] = pattern; \
> +  assert ((rawmemchr_##T ([0]) == [19]));  \
> +  q[19] = (T) 0x;  \
> +  /* aligned + 4th load */ \
> +  q[25] = pattern; \
> +  assert ((rawmemchr_##T ([0]) == [25]));  \
> +  q[25] = (T) 0x;  \
> +  free (buf);  \
> +}
> +
> +runT(int8_t, (int8_t)0xde)
> +runT(uint8_t, 0xde)
> +runT(int16_t, (int16_t)0xdead)
> +runT(uint16_t, 0xdead)
> +runT(int32_t, (int32_t)0xdeadbeef)
> +runT(uint32_t, 0xdeadbeef)
> +
> +int main (void)
> +{
> +  run_uint8_t ();
> +  run_int8_t ();
> +  run_uint16_t ();
> +  run_int16_t ();
> +  run_uint32_t ();
> +  run_int32_t ();
> +  return 0;
> +}
> --
> 2.31.1
>

commit 221bc8cb66f74f0e0df932e2f755dad44f0d7637
Author: Andreas Krebbel 
Date:   Thu Oct 7 07:34:57 2021 +0200

rawmemchr improvements

diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index 0d9619e8254..87977b53bc2 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -66,9 +66,7 @@ s390_asm_declare_function_size (FILE *asm_out_file,
 const char *fnname ATTRIBUTE_UNUSED, tree decl);
 #endif
 
-extern void s390_rawmemchrqi(rtx dst, rtx src, rtx pat);
-extern void s390_rawmemchrhi(rtx dst, rtx src, rtx pat);
-extern void s390_rawmemchrsi(rtx dst, rtx src, rtx pat);
+extern void s390_rawmemchr(machine_mode elt_mode, rtx dst, rtx src, rtx pat);
 
 #ifdef RTX_CODE
 extern int s390_extra_constraint_str (rtx, int, const char *);
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 2b826e2ebf0..773d4a7d805 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -16569,18 +16569,16 @@ s390_excess_precision (enum excess_precision_type type)
 }
 #endif
 
-template 
-static void
-s390_rawmemchr(rtx dst, rtx src, rtx pat) {
+void
+s390_rawmemchr(machine_mode elt_mode, rtx dst, rtx src, rtx pat) {
+  machine_mode vec_mode = mode_for_vector (as_a  (elt_mode),
+	   16 / GET_MODE_SIZE (elt_mode)).require();
   rtx lens = gen_reg_rtx (V16QImode);
   rtx pattern = gen_reg_rtx (vec_mode);
   rtx loop_start = gen_label_rtx ();
   rtx loop_end = gen_label_rtx ();
   rtx addr = gen_reg_rtx (Pmode);
   rtx offset = gen_reg_rtx (Pmode);
-  rtx tmp = gen_reg_rtx (Pmode);
   rtx loadlen = gen_reg_rtx (SImode);
   rtx matchlen = gen_reg_rtx (SImode);
   rtx mem;
@@ -16594,20 +16592,21 @@ s390_rawmemchr(rtx dst, rtx src, rtx pat) {
   emit_insn (gen_vlbb (lens, gen_rtx_MEM (BLKmode, addr), GEN_INT (6)));
   emit_insn (gen_lcbb (loadlen, addr, GEN_INT (6)));
   lens = convert_to_mode (vec_mode, lens, 1);
-  emit_insn (gen_vec_vfees (lens, lens, pattern, GEN_INT (0)));
+  emit_insn (gen_vec_vfees (vec_mode, lens, lens, pattern, GEN_INT (0)));
   lens = convert_to_mode (V4SImode, lens, 1);
   emit_insn (gen_vec_extractv4sisi (matchlen, lens, GEN_INT (1)));
   lens = convert_to_mode (vec_mode, lens, 1);
   emit_cmp_and_jump_insns (matchlen, loadlen, LT, NULL_RTX, SImode, 1, loop_end);
-  force_expand_binop (Pmode, and_optab, addr, GEN_INT (15), tmp, 1, OPTAB_DIRECT);
-  force_expand_binop (Pmode, sub_optab, GEN_INT (16), tmp, tmp, 1, OPTAB_DIRECT);
-  force_expand_binop (Pmode, add_optab, addr, tmp, addr, 1, OPTAB_DIRECT);
+
+  force_expand_binop (Pmode, add_optab, addr, GEN_INT(16), addr, 1, OPTAB_DIRECT);
+  force_expand_binop (Pmode, and_optab, addr, GEN_INT(~HOST_WIDE_INT_UC(0xf)), addr, 1, OPTAB_DIRECT);
+
   // now, addr is 16-byte aligned
 
   mem = gen_rtx_MEM (vec_mode, addr);
   set_mem_align (mem, 128);
   emit_move_insn (lens, mem);
-  emit_insn (gen_vec_vfees (lens, lens, pattern, GEN_INT (VSTRING_FLAG_CS)));
+  emit_insn (gen_vec_vfees (vec_mode, lens, lens, pattern, GEN_INT (VSTRING_FLAG_CS)));
   add_int_reg_note (s390_emit_ccraw_jump (4,
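
The address adjustment in the hunk above replaces three operations
(and, sub, add) with two (add, and).  A small sketch, using plain
integers in place of RTL and hypothetical function names, suggests that
both sequences advance to the next 16-byte boundary strictly after the
current address:

```c
#include <assert.h>
#include <stdint.h>

/* Old sequence: tmp = addr & 15; tmp = 16 - tmp; addr = addr + tmp.  */
static uintptr_t
next_boundary_old (uintptr_t addr)
{
  uintptr_t tmp = addr & 15;
  tmp = 16 - tmp;
  return addr + tmp;
}

/* New sequence: addr = addr + 16; addr = addr & ~0xf.  */
static uintptr_t
next_boundary_new (uintptr_t addr)
{
  addr += 16;
  return addr & ~(uintptr_t) 0xf;
}

/* Compare both sequences over a few consecutive 16-byte blocks.  */
static void
check_equivalent (void)
{
  for (uintptr_t a = 0x1000; a < 0x1040; a++)
    {
      assert (next_boundary_old (a) == next_boundary_new (a));
      assert (next_boundary_new (a) % 16 == 0);
      assert (next_boundary_new (a) > a);
    }
}
```

Note that an already aligned address also moves forward by 16 in both
variants, which matches the "now, addr is 16-byte aligned" comment in
the patch.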

Re: [PATCH gcc-11 0/2] Backport kpatch changes

2021-09-30 Thread Andreas Krebbel via Gcc-patches

On 9/30/21 10:50, Ilya Leoshkevich wrote:
> Hi,
> 
> This series contains a backport of kpatch changes needed to support
> https://github.com/dynup/kpatch/pull/1203 so that it could be used in
> RHEL 9.  The patches have been in master for 4 months now without
> issues.
> 
> Bootstrapped and regtested on s390x-redhat-linux.
> 
> Ok for gcc-11?

Ok for both. Thanks!

Andreas

[Committed] IBM Z: TPF: Add cc clobber to profiling expanders

2021-09-22 Thread Andreas Krebbel via Gcc-patches

The code sequence emitted uses CC internally.

gcc/ChangeLog:

* config/s390/tpf.md (prologue_tpf, epilogue_tpf): Add cc clobber.
---
 gcc/config/s390/tpf.md | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/s390/tpf.md b/gcc/config/s390/tpf.md
index 297e9d1f755..35b37190705 100644
--- a/gcc/config/s390/tpf.md
+++ b/gcc/config/s390/tpf.md
@@ -21,7 +21,8 @@ (define_insn "prologue_tpf"
   [(unspec_volatile [(match_operand 0 "const_int_operand" "J")
 (match_operand 1 "const_int_operand" "J")]
UNSPECV_TPF_PROLOGUE)
-   (clobber (reg:DI 1))]
+   (clobber (reg:DI 1))
+   (clobber (reg:CC CC_REGNUM))]
   "TARGET_TPF_PROFILING"
   "larl\t%%r1,.+14\;tm\t%0,255\;bnz\t%1"
   [(set_attr "length"   "14")])
@@ -31,7 +32,8 @@ (define_insn "epilogue_tpf"
   [(unspec_volatile [(match_operand 0 "const_int_operand" "J")
 (match_operand 1 "const_int_operand" "J")]
UNSPECV_TPF_EPILOGUE)
-   (clobber (reg:DI 1))]
+   (clobber (reg:DI 1))
+   (clobber (reg:CC CC_REGNUM))]
   "TARGET_TPF_PROFILING"
   "larl\t%%r1,.+14\;tm\t%0,255\;bnz\t%1"
   [(set_attr "length"   "14")])
-- 
2.31.1

[Committed] IBM Z: Fix PR102222

2021-09-22 Thread Andreas Krebbel via Gcc-patches

Avoid emitting a strict low part move if the insv target actually
affects the whole target reg.

Bootstrapped and regression tested on s390x.

gcc/ChangeLog:

PR target/102222
* config/s390/s390.c (s390_expand_insv): Emit a normal move if it
is actually a full copy of the source operand into the target.
Don't emit a strict low part move if source and target mode match.

gcc/testsuite/ChangeLog:

* gcc.target/s390/pr10.c: New test.
---
 gcc/config/s390/s390.c   | 10 ++
 gcc/testsuite/gcc.target/s390/pr10.c | 16 
 2 files changed, 26 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/pr10.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 54dd6332c3a..e04385451cf 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -6414,6 +6414,15 @@ s390_expand_insv (rtx dest, rtx op1, rtx op2, rtx src)
   if (bitsize + bitpos > GET_MODE_BITSIZE (mode))
 return false;
 
+  /* Just a move.  */
+  if (bitpos == 0
+  && bitsize == GET_MODE_BITSIZE (GET_MODE (src))
+  && mode == GET_MODE (src))
+{
+  emit_move_insn (dest, src);
+  return true;
+}
+
   /* Generate INSERT IMMEDIATE (IILL et al).  */
   /* (set (ze (reg)) (const_int)).  */
   if (TARGET_ZARCH
@@ -6510,6 +6519,7 @@ s390_expand_insv (rtx dest, rtx op1, rtx op2, rtx src)
   && (bitpos & 32) == ((bitpos + bitsize - 1) & 32)
   && MEM_P (src)
   && (mode == DImode || mode == SImode)
+  && mode != smode
   && register_operand (dest, mode))
 {
   /* Emit a strict_low_part pattern if possible.  */
diff --git a/gcc/testsuite/gcc.target/s390/pr10.c b/gcc/testsuite/gcc.target/s390/pr10.c
new file mode 100644
index 000..47d075e47fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/pr10.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -m31 -mesa" } */
+
+struct squashfs_reg_inode_header_1 read_inode_inode;
+
+int read_inode_val;
+
+struct squashfs_reg_inode_header_1
+{
+  int file_size:32;
+} __attribute__((packed)) read_inode ();
+
+void foo (void)
+{
+  read_inode_inode.file_size = read_inode_val;
+}
-- 
2.31.1

[PATCH 1/5] IBM Z: Get rid of vec merge unspec

2021-07-29 Thread Andreas Krebbel via Gcc-patches

This patch gets rid of the unspecs we were using for the vector merge
instruction and replaces it with generic rtx.

gcc/ChangeLog:

* config/s390/s390-modes.def: Add more vector modes to support
concatenation of two vectors.
* config/s390/s390-protos.h (s390_expand_merge_perm_const): Add
prototype.
(s390_expand_merge): Likewise.
* config/s390/s390.c (s390_expand_merge_perm_const): New function.
(s390_expand_merge): New function.
* config/s390/s390.md (UNSPEC_VEC_MERGEH, UNSPEC_VEC_MERGEL):
Remove constant definitions.
* config/s390/vector.md (V_HW_2): Add mode iterators.
(VI_HW_4, V_HW_4): Rename VI_HW_4 to V_HW_4.
(vec_2x_nelts, vec_2x_wide): New mode attributes.
(*vmrhb, *vmrlb, *vmrhh, *vmrlh, *vmrhf, *vmrlf, *vmrhg, *vmrlg):
New pattern definitions.
(vec_widen_umult_lo_, vec_widen_umult_hi_)
(vec_widen_smult_lo_, vec_widen_smult_hi_)
(vec_unpacks_lo_v4sf, vec_unpacks_hi_v4sf, vec_unpacks_lo_v2df)
(vec_unpacks_hi_v2df): Adjust expanders to emit non-unspec RTX for
vec merge.
* config/s390/vx-builtins.md (V_HW_4): Remove mode iterator. Now
in vector.md.
(vec_mergeh, vec_mergel): Use s390_expand_merge to
emit vec merge pattern.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/long-double-asm-in-out-hard-fp-reg.c:
Instead of vpdi with 0 and 5 vmrlg and vmrhg are used now.
* gcc.target/s390/vector/long-double-asm-inout-hard-fp-reg.c: Likewise.
* gcc.target/s390/zvector/vec-types.h: New test.
* gcc.target/s390/zvector/vec_merge.c: New test.
---
 gcc/config/s390/s390-modes.def|  11 +-
 gcc/config/s390/s390-protos.h |   2 +
 gcc/config/s390/s390.c|  36 
 gcc/config/s390/s390.md   |   2 -
 gcc/config/s390/vector.md | 204 +++---
 gcc/config/s390/vx-builtins.md|  35 ++-
 .../long-double-asm-in-out-hard-fp-reg.c  |   8 +-
 .../long-double-asm-inout-hard-fp-reg.c   |   6 +-
 .../gcc.target/s390/zvector/vec-types.h   |  37 
 .../gcc.target/s390/zvector/vec_merge.c   |  88 
 10 files changed, 367 insertions(+), 62 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-types.h
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec_merge.c

diff --git a/gcc/config/s390/s390-modes.def b/gcc/config/s390/s390-modes.def
index 6d814fc490c..245c2b811d4 100644
--- a/gcc/config/s390/s390-modes.def
+++ b/gcc/config/s390/s390-modes.def
@@ -259,14 +259,17 @@ CC_MODE (CCVFANY);
 
 /* Vector modes.  */
 
-VECTOR_MODES (INT, 2);/* V2QI */
-VECTOR_MODES (INT, 4);/*V4QI V2HI */
-VECTOR_MODES (INT, 8);/*   V8QI V4HI V2SI */
-VECTOR_MODES (INT, 16);   /* V16QI V8HI V4SI V2DI */
+VECTOR_MODES (INT, 2);/*   V2QI */
+VECTOR_MODES (INT, 4);/*  V4QI V2HI */
+VECTOR_MODES (INT, 8);/* V8QI V4HI V2SI */
+VECTOR_MODES (INT, 16);   /*   V16QI V8HI V4SI V2DI */
+VECTOR_MODES (INT, 32);   /* V32QI V16HI V8SI V4DI V2TI */
 
 VECTOR_MODE (FLOAT, SF, 2);   /* V2SF */
 VECTOR_MODE (FLOAT, SF, 4);   /* V4SF */
+VECTOR_MODE (FLOAT, SF, 8);   /* V8SF */
 VECTOR_MODE (FLOAT, DF, 2);   /* V2DF */
+VECTOR_MODE (FLOAT, DF, 4);   /* V4DF */
 
 VECTOR_MODE (INT, QI, 1); /* V1QI */
 VECTOR_MODE (INT, HI, 1); /* V1HI */
diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index 289e018cf0f..4b03c6e99f5 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -122,6 +122,8 @@ extern void s390_expand_vec_compare_cc (rtx, enum rtx_code, rtx, rtx, bool);
 extern enum rtx_code s390_reverse_condition (machine_mode, enum rtx_code);
 extern void s390_expand_vcond (rtx, rtx, rtx, enum rtx_code, rtx, rtx);
 extern void s390_expand_vec_init (rtx, rtx);
+extern rtx s390_expand_merge_perm_const (machine_mode, bool);
+extern void s390_expand_merge (rtx, rtx, rtx, bool);
 extern rtx s390_build_signbit_mask (machine_mode);
 extern rtx s390_return_addr_rtx (int, rtx);
 extern rtx s390_back_chain_rtx (void);
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index b1d3b99784d..b1a9ca9d8aa 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -7014,6 +7014,42 @@ s390_expand_vec_init (rtx target, rtx vals)
 }
 }
 
+/* Return a parallel of constant integers to be used as permutation
+   vector for a vector merge operation in MODE.  If HIGH_P is true the
+   left-most elements of the source vectors are merged otherwise the
+   right-most elements.  */
+rtx
+s390_expand_merge_perm_const (machine_mode mode, bool high_p)
+{
+  int nelts = GET_MODE_NUNITS (mode);
+  rtx perm[16];
+  int addend = high_p ? 0 : nelts;
+
+  for (int i = 0; i < nelts; i++)
+

[PATCH 3/5] IBM Z: Remove redundant V_HW_64 mode iterator.

2021-07-29 Thread Andreas Krebbel via Gcc-patches

gcc/ChangeLog:

* config/s390/vector.md (V_HW_64): Remove mode iterator.
(*vec_load_pair): Use V_HW_2 instead of V_HW_64.
* config/s390/vx-builtins.md
(vec_scatter_element_SI): Use V_HW_2 instead of
V_HW_64.
---
 gcc/config/s390/vector.md  |  7 +++
 gcc/config/s390/vx-builtins.md | 14 +++---
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 6a6370b5275..b372bf171f7 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -36,7 +36,6 @@ (define_mode_iterator V_HW  [V16QI V8HI V4SI V2DI (V1TI 
"TARGET_VXE") V2DF
 (define_mode_iterator V_HW2 [V16QI V8HI V4SI V2DI V2DF (V4SF "TARGET_VXE")
 (V1TF "TARGET_VXE") (TF "TARGET_VXE")])
 
-(define_mode_iterator V_HW_64 [V2DI V2DF])
 (define_mode_iterator VT_HW_HSDT [V8HI V4SI V4SF V2DI V2DF V1TI V1TF TI TF])
 (define_mode_iterator V_HW_HSD [V8HI V4SI (V4SF "TARGET_VXE") V2DI V2DF])
 
@@ -1972,9 +1971,9 @@ (define_expand "vec_cmp"
 })
 
 (define_insn "*vec_load_pair"
-  [(set (match_operand:V_HW_64   0 "register_operand" 
"=v,v")
-   (vec_concat:V_HW_64 (match_operand: 1 "register_operand"  
"d,v")
-   (match_operand: 2 "register_operand"  
"d,v")))]
+  [(set (match_operand:V_HW_2   0 "register_operand" 
"=v,v")
+   (vec_concat:V_HW_2 (match_operand: 1 "register_operand"  "d,v")
+  (match_operand: 2 "register_operand"  
"d,v")))]
   "TARGET_VX"
   "@
vlvgp\t%v0,%1,%2
diff --git a/gcc/config/s390/vx-builtins.md b/gcc/config/s390/vx-builtins.md
index 3799e833187..3e7b8541887 100644
--- a/gcc/config/s390/vx-builtins.md
+++ b/gcc/config/s390/vx-builtins.md
@@ -452,17 +452,17 @@ (define_insn "vec_scatter_element_DI"
 
 ; A 31 bit target address is generated from 64 bit elements
 ; vsceg
-(define_insn "vec_scatter_element_SI"
+(define_insn "vec_scatter_element_SI"
   [(set (mem:
 (plus:SI (subreg:SI
-  (unspec: [(match_operand:V_HW_64 1 
"register_operand"   "v")
- (match_operand:QI  3 
"const_mask_operand" "C")]
+  (unspec: [(match_operand:V_HW_2 1 
"register_operand"   "v")
+ (match_operand:QI 3 
"const_mask_operand" "C")]
 UNSPEC_VEC_EXTRACT) 4)
- (match_operand:SI  2 
"address_operand"   "ZQ")))
-   (unspec: [(match_operand:V_HW_640 
"register_operand"   "v")
+ (match_operand:SI 2 
"address_operand"   "ZQ")))
+   (unspec: [(match_operand:V_HW_20 
"register_operand"   "v")
   (match_dup 3)] UNSPEC_VEC_EXTRACT))]
-  "TARGET_VX && !TARGET_64BIT && UINTVAL (operands[3]) < GET_MODE_NUNITS 
(mode)"
-  "vsce\t%v0,%O2(%v1,%R2),%3"
+  "TARGET_VX && !TARGET_64BIT && UINTVAL (operands[3]) < GET_MODE_NUNITS 
(mode)"
+  "vsce\t%v0,%O2(%v1,%R2),%3"
   [(set_attr "op_type" "VRV")])
 
 ; Element size and target address size is the same
-- 
2.31.1

[PATCH 2/5] IBM Z: Get rid of vpdi unspec

2021-07-29 Thread Andreas Krebbel via Gcc-patches

The patch gets rid of the unspec used for the vector permute double
immediate instruction and replaces it with generic rtx.

gcc/ChangeLog:

* config/s390/s390.md (UNSPEC_VEC_PERMI): Remove constant
definition.
* config/s390/vector.md (*vpdi1, *vpdi4): New pattern
definitions.
* config/s390/vx-builtins.md (*vec_permi): Emit generic rtx
instead of an unspec.

gcc/testsuite/ChangeLog:

* gcc.target/s390/zvector/vec-permi.c: Removed.
* gcc.target/s390/zvector/vec_permi.c: New test.
---
 gcc/config/s390/s390.md   |  1 -
 gcc/config/s390/vector.md | 26 
 gcc/config/s390/vx-builtins.md| 26 +++-
 .../gcc.target/s390/zvector/vec-permi.c   | 54 ---
 .../gcc.target/s390/zvector/vec_permi.c   | 66 +++
 5 files changed, 102 insertions(+), 71 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-permi.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec_permi.c

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index d896faee0fb..1b894a926ce 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -166,7 +166,6 @@ (define_c_enum "unspec" [
UNSPEC_VEC_PACK_UNSIGNED_SATURATE_CC
UNSPEC_VEC_PACK_UNSIGNED_SATURATE_GENCC
UNSPEC_VEC_PERM
-   UNSPEC_VEC_PERMI
UNSPEC_VEC_EXTEND
UNSPEC_VEC_STORE_LEN
UNSPEC_VEC_STORE_LEN_R
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 7507aec1c8e..6a6370b5275 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -767,6 +767,32 @@ (define_insn "*vec_perm"
   "vperm\t%v0,%v1,%v2,%v3"
   [(set_attr "op_type" "VRR")])
 
+
+; First DW of op1 and second DW of op2
+(define_insn "*vpdi1"
+  [(set (match_operand:V_HW_2   0 "register_operand" "=v")
+   (vec_select:V_HW_2
+(vec_concat:
+ (match_operand:V_HW_2 1 "register_operand"  "v")
+ (match_operand:V_HW_2 2 "register_operand"  "v"))
+(parallel [(const_int 0) (const_int 3)])))]
+  "TARGET_VX"
+  "vpdi\t%v0,%v1,%v2,1"
+  [(set_attr "op_type" "VRR")])
+
+; Second DW of op1 and first of op2
+(define_insn "*vpdi4"
+  [(set (match_operand:V_HW_2   0 "register_operand" "=v")
+   (vec_select:V_HW_2
+(vec_concat:
+ (match_operand:V_HW_2 1 "register_operand"  "v")
+ (match_operand:V_HW_2 2 "register_operand"  "v"))
+(parallel [(const_int 1) (const_int 2)])))]
+  "TARGET_VX"
+  "vpdi\t%v0,%v1,%v2,4"
+  [(set_attr "op_type" "VRR")])
+
+
 (define_insn "*vmrhb"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
 (vec_select:V16QI
diff --git a/gcc/config/s390/vx-builtins.md b/gcc/config/s390/vx-builtins.md
index 5abe43b9e53..3799e833187 100644
--- a/gcc/config/s390/vx-builtins.md
+++ b/gcc/config/s390/vx-builtins.md
@@ -403,28 +403,22 @@ (define_insn "vec_zperm"
   "vperm\t%v0,%v1,%v2,%v3"
   [(set_attr "op_type" "VRR")])
 
+; Incoming op3 is in vec_permi format and will be turned into a
+; permute vector consisting of op3 and op4.
 (define_expand "vec_permi"
-  [(set (match_operand:V_HW_64  0 "register_operand"   "")
-   (unspec:V_HW_64 [(match_operand:V_HW_64 1 "register_operand"   "")
-(match_operand:V_HW_64 2 "register_operand"   "")
-(match_operand:QI  3 "const_mask_operand" "")]
-   UNSPEC_VEC_PERMI))]
+  [(set (match_operand:V_HW_2   0 "register_operand" "")
+   (vec_select:V_HW_2
+(vec_concat:
+ (match_operand:V_HW_2 1 "register_operand" "")
+ (match_operand:V_HW_2 2 "register_operand" ""))
+(parallel [(match_operand:QI 3 "const_mask_operand" "") (match_dup 4)])))]
   "TARGET_VX"
 {
   HOST_WIDE_INT val = INTVAL (operands[3]);
-  operands[3] = GEN_INT ((val & 1) | (val & 2) << 1);
+  operands[3] = GEN_INT ((val & 2) >> 1);
+  operands[4] = GEN_INT ((val & 1) + 2);
 })
 
-(define_insn "*vec_permi"
-  [(set (match_operand:V_HW_64  0 "register_operand"  "=v")
-   (unspec:V_HW_64 [(match_operand:V_HW_64 1 "register_operand"   "v")
-(match_operand:V_HW_64 2 "register_operand"   "v")
-(match_operand:QI  3 "const_mask_operand" "C")]
-   UNSPEC_VEC_PERMI))]
-  "TARGET_VX && (UINTVAL (operands[3]) & 10) == 0"
-  "vpdi\t%v0,%v1,%v2,%b3"
-  [(set_attr "op_type" "VRR")])
-
 
 ; Vector replicate
 
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-permi.c b/gcc/testsuite/gcc.target/s390/zvector/vec-permi.c
deleted file mode 100644
index c0a852b9703..000
--- a/gcc/testsuite/gcc.target/s390/zvector/vec-permi.c
+++ /dev/null
@@ -1,54 +0,0 @@
-/* { dg-do compile } */
-/* { dg-options "-O3 -march=z13 -mzarch --save-temps" } */
-/* { dg-do run { target { s390_z13_hw } } } */
-
-/*
- * The vector intrinsic vec_permi(a, b, c) chooses one of the two

[PATCH 0/5] IBM Z: Implement TARGET_VECTORIZE_VEC_PERM_CONST

2021-07-29 Thread Andreas Krebbel via Gcc-patches

This patchset, after some prep work, provides an initial
implementation of the TARGET_VECTORIZE_VEC_PERM_CONST hook for IBM Z.
Only the vmrh, vmrl, and vpdi instructions are exploited so far.  More
instructions will be added with follow-on patches.

Bootstrapped and regression tested on s390x.

As expected various occurrences of the vperm instruction get replaced
with vmr* and vpdi.

I'll commit the patches after giving it a few days for comments.

Andreas Krebbel (5):
  IBM Z: Get rid of vec merge unspec
  IBM Z: Get rid of vpdi unspec
  IBM Z: Remove redundant V_HW_64 mode iterator.
  IBM Z: Implement TARGET_VECTORIZE_VEC_PERM_CONST for vector merge
  IBM Z: Implement TARGET_VECTORIZE_VEC_PERM_CONST for vpdi

 gcc/config/s390/s390-modes.def|  11 +-
 gcc/config/s390/s390-protos.h |   2 +
 gcc/config/s390/s390.c| 191 ++
 gcc/config/s390/s390.md   |   3 -
 gcc/config/s390/vector.md | 238 +++---
 gcc/config/s390/vx-builtins.md|  75 +++---
 .../long-double-asm-in-out-hard-fp-reg.c  |   8 +-
 .../long-double-asm-inout-hard-fp-reg.c   |   6 +-
 .../gcc.target/s390/vector/perm-merge.c   | 104 
 .../gcc.target/s390/vector/perm-vpdi.c|  49 
 .../gcc.target/s390/vector/vec-types.h|  35 +++
 .../gcc.target/s390/zvector/vec-permi.c   |  54 
 .../gcc.target/s390/zvector/vec-types.h   |  37 +++
 .../gcc.target/s390/zvector/vec_merge.c   |  88 +++
 .../gcc.target/s390/zvector/vec_permi.c   |  66 +
 15 files changed, 822 insertions(+), 145 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/perm-merge.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/perm-vpdi.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-types.h
 delete mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-permi.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-types.h
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec_merge.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec_permi.c

-- 
2.31.1

[PATCH 5/5] IBM Z: Implement TARGET_VECTORIZE_VEC_PERM_CONST for vpdi

2021-07-29 Thread Andreas Krebbel via Gcc-patches

This patch makes use of the vector permute double immediate
instruction for constant permute vectors.

gcc/ChangeLog:

* config/s390/s390.c (expand_perm_with_vpdi): New function.
(vectorize_vec_perm_const_1): Call expand_perm_with_vpdi.
* config/s390/vector.md (*vpdi1, @vpdi1): Enable a
parameterized expander.
(*vpdi4, @vpdi4): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/perm-vpdi.c: New test.
---
 gcc/config/s390/s390.c| 47 ++
 gcc/config/s390/vector.md |  5 +-
 .../gcc.target/s390/vector/perm-vpdi.c| 49 +++
 3 files changed, 98 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/perm-vpdi.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 684241b00b8..20c52c83c72 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -16981,6 +16981,50 @@ expand_perm_with_merge (const struct expand_vec_perm_d &d)
   return merge_lo_p || merge_hi_p;
 }
 
+/* Try to expand the vector permute operation described by D using the
+   vector permute doubleword immediate instruction vpdi.  Return true
+   if vpdi could be used.
+
+   VPDI allows 4 different immediate values (0, 1, 4, 5). The 0 and 5
+   cases are covered by vmrhg and vmrlg already.  So we only care
+   about the 1, 4 cases here.
+   1 - First element of src1 and second of src2
+   4 - Second element of src1 and first of src2  */
+static bool
+expand_perm_with_vpdi (const struct expand_vec_perm_d &d)
+{
+  bool vpdi1_p = false;
+  bool vpdi4_p = false;
+  rtx op0_reg, op1_reg;
+
+  // Only V2DI and V2DF are supported here.
+  if (d.nelt != 2)
+return false;
+
+  if (d.perm[0] == 0 && d.perm[1] == 3)
+vpdi1_p = true;
+
+  if (d.perm[0] == 1 && d.perm[1] == 2)
+vpdi4_p = true;
+
+  if (!vpdi1_p && !vpdi4_p)
+return false;
+
+  if (d.testing_p)
+return true;
+
+  op0_reg = force_reg (GET_MODE (d.op0), d.op0);
+  op1_reg = force_reg (GET_MODE (d.op1), d.op1);
+
+  if (vpdi1_p)
+emit_insn (gen_vpdi1 (d.vmode, d.target, op0_reg, op1_reg));
+
+  if (vpdi4_p)
+emit_insn (gen_vpdi4 (d.vmode, d.target, op0_reg, op1_reg));
+
+  return true;
+}
+
 /* Try to find the best sequence for the vector permute operation
described by D.  Return true if the operation could be
expanded.  */
@@ -16990,6 +17034,9 @@ vectorize_vec_perm_const_1 (const struct expand_vec_perm_d &d)
   if (expand_perm_with_merge (d))
 return true;
 
+  if (expand_perm_with_vpdi (d))
+return true;
+
   return false;
 }
 
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index b372bf171f7..1b0ae47ab49 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -768,7 +768,7 @@ (define_insn "*vec_perm"
 
 
 ; First DW of op1 and second DW of op2
-(define_insn "*vpdi1"
+(define_insn "@vpdi1"
   [(set (match_operand:V_HW_2   0 "register_operand" "=v")
(vec_select:V_HW_2
 (vec_concat:
@@ -780,7 +780,7 @@ (define_insn "*vpdi1"
   [(set_attr "op_type" "VRR")])
 
 ; Second DW of op1 and first of op2
-(define_insn "*vpdi4"
+(define_insn "@vpdi4"
   [(set (match_operand:V_HW_2   0 "register_operand" "=v")
(vec_select:V_HW_2
 (vec_concat:
@@ -926,7 +926,6 @@ (define_insn_and_split "tf_to_fprx2"
   operands[5] = simplify_gen_subreg (DFmode, operands[1], TFmode, 8);
 })
 
-; vec_perm_const for V2DI using vpdi?
 
 ;;
 ;; Vector integer arithmetic instructions
diff --git a/gcc/testsuite/gcc.target/s390/vector/perm-vpdi.c b/gcc/testsuite/gcc.target/s390/vector/perm-vpdi.c
new file mode 100644
index 000..cc925315b37
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/perm-vpdi.c
@@ -0,0 +1,49 @@
+/* { dg-do run { target { s390*-*-* } } } */
+/* { dg-options "-O3 -mzarch -march=z14 -mzvector --save-temps" } */
+
+/* { dg-final { scan-assembler-times "\tvmrhg\t" 3 } } */
+/* { dg-final { scan-assembler-times "\tvmrlg\t" 3 } } */
+/* { dg-final { scan-assembler-times "\tvpdi\t" 6 } } */
+
+#include "vec-types.h"
+#include 
+
+#define GEN_PERMI_BITS(VEC_TYPE, BITS) \
+  VEC_TYPE __attribute__((noinline))   \
+  permi_##BITS##_##VEC_TYPE(VEC_TYPE a, VEC_TYPE b) {  \
+return (VEC_TYPE){a[((BITS) & 2) >> 1], b[(BITS) & 1] }; }
+
+#define GEN_PERMI(VEC_TYPE)\
+  GEN_PERMI_BITS(VEC_TYPE, 0); \
+  GEN_PERMI_BITS(VEC_TYPE, 1); \
+  GEN_PERMI_BITS(VEC_TYPE, 2); \
+  GEN_PERMI_BITS(VEC_TYPE, 3); \
+
+GEN_PERMI(v2di)
+GEN_PERMI(uv2di)
+GEN_PERMI(v2df)
+
+
+#define CHECK_PERMI_BITS(VEC_TYPE, BITS)   \
+  VEC_TYPE r##BITS = permi_##BITS##_##VEC_TYPE (a, b); \
+  if (r##BITS[0] != ((BITS) & 2) >> 1  \
+  || r##BITS[1] != ((BITS) & 1) + 2)   \
+__builtin_abort();
+
+#define CHECK_PERMI(VEC_TYPE)  \
+  {
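
The comment in expand_perm_with_vpdi above lists the four vpdi
immediates (0, 1, 4, 5).  The following sketch shows one way to derive
the immediate from a two-element permutation.  vpdi_immediate is a
hypothetical helper and the closed formula is an assumption generalized
from that comment; the patch itself only matches the 1 and 4 cases
explicitly and leaves 0 and 5 to vmrhg and vmrlg.

```c
#include <assert.h>

/* Return the vpdi immediate for a two-element permutation that takes
   its first element from op0 (index 0 or 1) and its second element
   from op1 (index 2 or 3), or -1 if PERM does not have that shape.  */
static int
vpdi_immediate (const unsigned char perm[2])
{
  assert (perm != 0);
  if (perm[0] > 1 || perm[1] < 2 || perm[1] > 3)
    return -1;
  /* Value 4 selects the second doubleword of op0, value 1 selects the
     second doubleword of op1.  */
  return ((perm[0] & 1) << 2) | (perm[1] & 1);
}
```

The permutations { 0, 3 } and { 1, 2 } map to 1 and 4, the two cases
handled by the new function.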

[PATCH 4/5] IBM Z: Implement TARGET_VECTORIZE_VEC_PERM_CONST for vector merge

2021-07-29 Thread Andreas Krebbel via Gcc-patches

This patch implements the TARGET_VECTORIZE_VEC_PERM_CONST in the IBM Z
backend. The initial implementation only exploits the vector merge
instruction but there is more to come.

gcc/ChangeLog:

* config/s390/s390.c (MAX_VECT_LEN): Define macro.
(struct expand_vec_perm_d): Define struct.
(expand_perm_with_merge): New function.
(vectorize_vec_perm_const_1): New function.
(s390_vectorize_vec_perm_const): New function.
(TARGET_VECTORIZE_VEC_PERM_CONST): Define target macro.

gcc/testsuite/ChangeLog:

* gcc.target/s390/vector/perm-merge.c: New test.
* gcc.target/s390/vector/vec-types.h: New test.
---
 gcc/config/s390/s390.c| 108 ++
 .../gcc.target/s390/vector/perm-merge.c   | 104 +
 .../gcc.target/s390/vector/vec-types.h|  35 ++
 3 files changed, 247 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/perm-merge.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/vec-types.h

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index b1a9ca9d8aa..684241b00b8 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -16928,6 +16928,110 @@ s390_md_asm_adjust (vec , vec 
,
   return after_md_seq;
 }
 
+#define MAX_VECT_LEN   16
+
+struct expand_vec_perm_d
+{
+  rtx target, op0, op1;
+  unsigned char perm[MAX_VECT_LEN];
+  machine_mode vmode;
+  unsigned char nelt;
+  bool testing_p;
+};
+
+/* Try to expand the vector permute operation described by D using the
+   vector merge instructions vmrl and vmrh.  Return true if vector merge
+   could be used.  */
+static bool
+expand_perm_with_merge (const struct expand_vec_perm_d &d)
+{
+  bool merge_lo_p = true;
+  bool merge_hi_p = true;
+
+  if (d.nelt % 2)
+return false;
+
+  // For V4SI this checks for: { 0, 4, 1, 5 }
+  for (int telt = 0; telt < d.nelt; telt++)
+if (d.perm[telt] != telt / 2 + (telt % 2) * d.nelt)
+  {
+   merge_hi_p = false;
+   break;
+  }
+
+  if (!merge_hi_p)
+{
+  // For V4SI this checks for: { 2, 6, 3, 7 }
+  for (int telt = 0; telt < d.nelt; telt++)
+   if (d.perm[telt] != (telt + d.nelt) / 2 + (telt % 2) * d.nelt)
+ {
+   merge_lo_p = false;
+   break;
+ }
+}
+  else
+merge_lo_p = false;
+
+  if (d.testing_p)
+return merge_lo_p || merge_hi_p;
+
+  if (merge_lo_p || merge_hi_p)
+s390_expand_merge (d.target, d.op0, d.op1, merge_hi_p);
+
+  return merge_lo_p || merge_hi_p;
+}
+
+/* Try to find the best sequence for the vector permute operation
+   described by D.  Return true if the operation could be
+   expanded.  */
+static bool
+vectorize_vec_perm_const_1 (const struct expand_vec_perm_d &d)
+{
+  if (expand_perm_with_merge (d))
+return true;
+
+  return false;
+}
+
+/* Return true if we can emit instructions for the constant
+   permutation vector in SEL.  If OUTPUT, IN0, IN1 are non-null the
+   hook is supposed to emit the required INSNs.  */
+
+bool
+s390_vectorize_vec_perm_const (machine_mode vmode, rtx target, rtx op0, rtx op1,
+  const vec_perm_indices &sel)
+{
+  struct expand_vec_perm_d d;
+  unsigned char perm[MAX_VECT_LEN];
+  unsigned int i, nelt;
+
+  if (!s390_vector_mode_supported_p (vmode) || GET_MODE_SIZE (vmode) != 16)
+return false;
+
+  d.target = target;
+  d.op0 = op0;
+  d.op1 = op1;
+
+  d.vmode = vmode;
+  gcc_assert (VECTOR_MODE_P (d.vmode));
+  d.nelt = nelt = GET_MODE_NUNITS (d.vmode);
+  d.testing_p = target == NULL_RTX;
+
+  gcc_assert (target == NULL_RTX || REG_P (target));
+  gcc_assert (sel.length () == nelt);
+  gcc_checking_assert (sizeof (d.perm) == sizeof (perm));
+
+  for (i = 0; i < nelt; i++)
+{
+  unsigned char e = sel[i];
+  gcc_assert (e < 2 * nelt);
+  d.perm[i] = e;
+  perm[i] = e;
+}
+
+  return vectorize_vec_perm_const_1 (d);
+}
+
 /* Initialize GCC target structure.  */
 
 #undef  TARGET_ASM_ALIGNED_HI_OP
@@ -17238,6 +17342,10 @@ s390_md_asm_adjust (vec<rtx> &outputs, vec<rtx> &inputs,
 #undef TARGET_MD_ASM_ADJUST
 #define TARGET_MD_ASM_ADJUST s390_md_asm_adjust
 
+#undef TARGET_VECTORIZE_VEC_PERM_CONST
+#define TARGET_VECTORIZE_VEC_PERM_CONST s390_vectorize_vec_perm_const
+
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-s390.h"
diff --git a/gcc/testsuite/gcc.target/s390/vector/perm-merge.c b/gcc/testsuite/gcc.target/s390/vector/perm-merge.c
new file mode 100644
index 000..51b23ddd886
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/vector/perm-merge.c
@@ -0,0 +1,104 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mzarch -march=z14 -mzvector --save-temps" } */
+/* { dg-do run { target { s390_z14_hw } } } */
+
+/* { dg-final { scan-assembler-times "\tvmrhb\t" 2 } } */
+/* { dg-final { scan-assembler-times "\tvmrlb\t" 2 } } */
+/* { dg-final { scan-assembler-times "\tvmrhh\t" 2 } } */
+/* { dg-final { scan-assembler-times "\tvmrlh\t" 2 } } */
+/* { dg-final {
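
For readers following the index arithmetic in expand_perm_with_merge above, here is a minimal standalone sketch of the two pattern checks. It is not the GCC code: the function names is_merge_high and is_merge_low are made up for illustration, and only the index formulas are taken from the patch. A selector index below nelt refers to the first input vector, an index of nelt or above to the second one.

```c
#include <stdbool.h>

/* Illustrative only: returns true if PERM interleaves the first halves
   of both inputs.  For four elements this is { 0, 4, 1, 5 }.  */
bool
is_merge_high (const unsigned char *perm, unsigned int nelt)
{
  if (nelt % 2)
    return false;

  for (unsigned int i = 0; i < nelt; i++)
    if (perm[i] != i / 2 + (i % 2) * nelt)
      return false;

  return true;
}

/* Illustrative only: returns true if PERM interleaves the second halves
   of both inputs.  For four elements this is { 2, 6, 3, 7 }.  */
bool
is_merge_low (const unsigned char *perm, unsigned int nelt)
{
  if (nelt % 2)
    return false;

  for (unsigned int i = 0; i < nelt; i++)
    if (perm[i] != (i + nelt) / 2 + (i % 2) * nelt)
      return false;

  return true;
}
```

Any selector that matches neither check would have to be handled by some other expansion strategy, which is why vectorize_vec_perm_const_1 is written as a chain of attempts.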

Re: [PATCH] IBM Z: Fix 5 tests in 31-bit mode

2021-07-28 Thread Andreas Krebbel via Gcc-patches

On 7/23/21 2:47 PM, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
> 
> 
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/global-array-element-pic2.c: Add -mzarch, add
>   an expectation for 31-bit mode.
>   * gcc.target/s390/load-imm64-1.c: Use unsigned long long.
>   * gcc.target/s390/load-imm64-2.c: Likewise.
>   * gcc.target/s390/vector/long-double-vx-macro-off-on.c: Use
>   -mzarch.
>   * gcc.target/s390/vector/long-double-vx-macro-on-off.c:
>   Likewise.

Ok. Thanks!

Andreas

Re: [PATCH] Adjust docu of TARGET_VECTORIZE_VEC_PERM_CONST

2021-07-28 Thread Andreas Krebbel via Gcc-patches

On 7/28/21 9:43 AM, Richard Biener wrote:
> On Wed, Jul 28, 2021 at 8:44 AM Andreas Krebbel via Gcc-patches
>  wrote:
>>
>> There are also memory operands passed for in0 and in1.
>>
>> Ok for mainline?
> 
> They can also be constant vectors, I'd just not specify the operand
> kind - usually
> expanders are not limited as to what they feed down.

Right, I'll just replace "registers" with "operands" then. Ok?

 also to emit such a permutation.  In the former case @var{in0}, @var{in1}\n\
 and @var{out} are all null.  In the latter case @var{in0} and @var{in1} are\n\
 the source vectors and @var{out} is the destination vector; all three are\n\
-registers of mode @var{mode}.  @var{in1} is the same as @var{in0} if\n\
+operands of mode @var{mode}.  @var{in1} is the same as @var{in0} if\n\
 @var{sel} describes a permutation on one vector instead of two.\n\
 \n\
 Return true if the operation is possible, emitting instructions for it\n\

Andreas

Re: [PATCH] IBM Z: Enable LSan and TSan

2021-07-28 Thread Andreas Krebbel via Gcc-patches

On 7/27/21 10:04 PM, Ilya Leoshkevich via Gcc-patches wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
> 
> libsanitizer/ChangeLog:
> 
>   * configure.tgt (s390*-*-linux*): Enable LSan and TSan for
>   s390x.

Ok. Thanks!

Andreas

[PATCH] Adjust docu of TARGET_VECTORIZE_VEC_PERM_CONST

2021-07-28 Thread Andreas Krebbel via Gcc-patches

There are also memory operands passed for in0 and in1.

Ok for mainline?

gcc/ChangeLog:

* target.def: Describe in0 and in1 as being either register or
memory operands.
* doc/tm.texi: Regenerate.
---
 gcc/doc/tm.texi | 7 ---
 gcc/target.def  | 7 ---
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index c8f4abe3e41..31f188daf00 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6124,9 +6124,10 @@ This hook is used to test whether the target can permute up to two
 vectors of mode @var{mode} using the permutation vector @code{sel}, and
 also to emit such a permutation.  In the former case @var{in0}, @var{in1}
 and @var{out} are all null.  In the latter case @var{in0} and @var{in1} are
-the source vectors and @var{out} is the destination vector; all three are
-registers of mode @var{mode}.  @var{in1} is the same as @var{in0} if
-@var{sel} describes a permutation on one vector instead of two.
+the source vectors and @var{out} is the destination vector.  The destination
+vector is a register of mode @var{mode} while the source vectors can be either
+register or memory operands of mode @var{mode}.  @var{in1} is the same as
+@var{in0} if @var{sel} describes a permutation on one vector instead of two.
 
 Return true if the operation is possible, emitting instructions for it
 if rtxes are provided.
diff --git a/gcc/target.def b/gcc/target.def
index 2e40448e6c5..b368d81be63 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1860,9 +1860,10 @@ DEFHOOK
 vectors of mode @var{mode} using the permutation vector @code{sel}, and\n\
 also to emit such a permutation.  In the former case @var{in0}, @var{in1}\n\
 and @var{out} are all null.  In the latter case @var{in0} and @var{in1} are\n\
-the source vectors and @var{out} is the destination vector; all three are\n\
-registers of mode @var{mode}.  @var{in1} is the same as @var{in0} if\n\
-@var{sel} describes a permutation on one vector instead of two.\n\
+the source vectors and @var{out} is the destination vector.  The destination\n\
+vector is a register of mode @var{mode} while the source vectors can be either\n\
+register or memory operands of mode @var{mode}.  @var{in1} is the same as\n\
+@var{in0} if @var{sel} describes a permutation on one vector instead of two.\n\
 \n\
 Return true if the operation is possible, emitting instructions for it\n\
 if rtxes are provided.\n\
-- 
2.31.1

Re: [PATCH v3] IBM Z: Use @PLT symbols for local functions in 64-bit mode

2021-07-16 Thread Andreas Krebbel via Gcc-patches

On 7/12/21 9:23 PM, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
> 
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573614.html
> v1 -> v2: Do not use UNSPEC_PLT in 64-bit code and rename it to
>   UNSPEC_PLT31 (Ulrich, Andreas).  Do not append @PLT only to
>   weak symbols in non-PIC code (Ulrich).  Add TLS tests.
> 
> v2: https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574646.html
> v2 -> v3: Use %K in function_profiler() and s390_output_mi_thunk(),
>   add tests for these cases.
> 
> 
> 
> This helps with generating code for kernel hotpatches, which contain
> individual functions and are loaded more than 2G away from vmlinux.
> This should not create performance regressions for the normal use
> cases, because for local functions ld replaces @PLT calls with direct
> calls.
> 
> gcc/ChangeLog:
> 
>   * config/s390/predicates.md (bras_sym_operand): Accept all
>   functions in 64-bit mode, use UNSPEC_PLT31.
>   (larl_operand): Use UNSPEC_PLT31.
>   * config/s390/s390.c (s390_loadrelative_operand_p): Likewise.
>   (legitimize_pic_address): Likewise.
>   (s390_emit_tls_call_insn): Mark __tls_get_offset as function,
>   use UNSPEC_PLT31.
>   (s390_delegitimize_address): Use UNSPEC_PLT31.
>   (s390_output_addr_const_extra): Likewise.
>   (print_operand): Add @PLT to TLS calls, handle %K.
>   (s390_function_profiler): Mark __fentry__/_mcount as function,
>   use %K, use UNSPEC_PLT31.
>   (s390_output_mi_thunk): Use only UNSPEC_GOT, use %K.
>   (s390_emit_call): Use UNSPEC_PLT31.
>   (s390_emit_tpf_eh_return): Mark __tpf_eh_return as function.
>   * config/s390/s390.md (UNSPEC_PLT31): Rename from UNSPEC_PLT.
>   (*movdi_64): Use %K.
>   (reload_base_64): Likewise.
>   (*sibcall_brc): Likewise.
>   (*sibcall_brcl): Likewise.
>   (*sibcall_value_brc): Likewise.
>   (*sibcall_value_brcl): Likewise.
>   (*bras): Likewise.
>   (*brasl): Likewise.
>   (*bras_r): Likewise.
>   (*brasl_r): Likewise.
>   (*bras_tls): Likewise.
>   (*brasl_tls): Likewise.
>   (main_base_64): Likewise.
>   (reload_base_64): Likewise.
>   (@split_stack_call): Likewise.

Ok. Thanks!

Andreas

Re: [PATCH v2] IBM Z: Define NO_PROFILE_COUNTERS

2021-06-23 Thread Andreas Krebbel via Gcc-patches

On 6/24/21 12:42 AM, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
> 
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573348.html
> v1 -> v2: Use ATTRIBUTE_UNUSED, compact op[] array (Andreas).
>   I've also noticed that one of the nops that we generate for
>   -mnop-mcount is not needed now and removed it.  A couple
>   tests needed to be adjusted after that.
> 
> 
> 
> 
> s390 glibc does not need counters in the .data section, since it stores
> edge hits in its own data structure.  Therefore counters only waste
> space and confuse diffing tools (e.g. kpatch), so don't generate them.
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.c (s390_function_profiler): Ignore labelno
>   parameter.
>   * config/s390/s390.h (NO_PROFILE_COUNTERS): Define.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/mnop-mcount-m31-mzarch.c: Adapt to the new
>   prologue size.
>   * gcc.target/s390/mnop-mcount-m64.c: Likewise.

Ok. Thanks!

Andreas

Re: [PATCH] IBM Z: Define NO_PROFILE_COUNTERS

2021-06-22 Thread Andreas Krebbel via Gcc-patches

On 6/22/21 12:20 AM, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
> 
> 
> 
> s390 glibc does not need counters in the .data section, since it stores
> edge hits in its own data structure.  Therefore counters only waste
> space and confuse diffing tools (e.g. kpatch), so don't generate them.
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.c (s390_function_profiler): Ignore labelno
>   parameter.
>   * config/s390/s390.h (NO_PROFILE_COUNTERS): Define.

Just two minor nits below. Ok with these changes. Thanks!

Andreas

> ---
>  gcc/config/s390/s390.c | 14 ++
>  gcc/config/s390/s390.h |  2 ++
>  2 files changed, 4 insertions(+), 12 deletions(-)
> 
> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
> index 6bbeb640e1f..96c9a9db53b 100644
> --- a/gcc/config/s390/s390.c
> +++ b/gcc/config/s390/s390.c
> @@ -13110,17 +13110,13 @@ output_asm_nops (const char *user, int hw)
>  }
>  }
>  
> -/* Output assembler code to FILE to increment profiler label # LABELNO
> -   for profiling a function entry.  */
> +/* Output assembler code to FILE to call a profiler hook.  */
>  
>  void
> -s390_function_profiler (FILE *file, int labelno)
> +s390_function_profiler (FILE *file, int /* labelno */)

ATTRIBUTE_UNUSED?

>  {
>rtx op[8];
>  
> -  char label[128];
> -  ASM_GENERATE_INTERNAL_LABEL (label, "LP", labelno);
> -
>fprintf (file, "# function profiler \n");
>  
>op[0] = gen_rtx_REG (Pmode, RETURN_REGNUM);
> @@ -13128,10 +13124,6 @@ s390_function_profiler (FILE *file, int labelno)
>op[1] = gen_rtx_MEM (Pmode, plus_constant (Pmode, op[1], UNITS_PER_LONG));
>op[7] = GEN_INT (UNITS_PER_LONG);
>  
> -  op[2] = gen_rtx_REG (Pmode, 1);
> -  op[3] = gen_rtx_SYMBOL_REF (Pmode, label);
> -  SYMBOL_REF_FLAGS (op[3]) = SYMBOL_FLAG_LOCAL;
> -

Shouldn't we remove these two slots from the op array and renumber the
subsequent entries then?

>op[4] = gen_rtx_SYMBOL_REF (Pmode, flag_fentry ? "__fentry__" : "_mcount");
>if (flag_pic)
>  {
> @@ -13162,7 +13154,6 @@ s390_function_profiler (FILE *file, int labelno)
> output_asm_insn ("stg\t%0,%1", op);
> if (flag_dwarf2_cfi_asm)
>   output_asm_insn (".cfi_rel_offset\t%0,%7", op);
> -   output_asm_insn ("larl\t%2,%3", op);
> output_asm_insn ("brasl\t%0,%4", op);
> output_asm_insn ("lg\t%0,%1", op);
> if (flag_dwarf2_cfi_asm)
> @@ -13179,7 +13170,6 @@ s390_function_profiler (FILE *file, int labelno)
> output_asm_insn ("st\t%0,%1", op);
> if (flag_dwarf2_cfi_asm)
>   output_asm_insn (".cfi_rel_offset\t%0,%7", op);
> -   output_asm_insn ("larl\t%2,%3", op);
> output_asm_insn ("brasl\t%0,%4", op);
> output_asm_insn ("l\t%0,%1", op);
> if (flag_dwarf2_cfi_asm)
> diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h
> index 3b876160420..fb16a455a03 100644
> --- a/gcc/config/s390/s390.h
> +++ b/gcc/config/s390/s390.h
> @@ -787,6 +787,8 @@ CUMULATIVE_ARGS;
>  
>  #define PROFILE_BEFORE_PROLOGUE 1
>  
> +#define NO_PROFILE_COUNTERS 1
> +
>  
>  /* Trampolines for nested functions.  */
>  
>

Re: [PATCH] s390: Add more vcond_mask patterns.

2021-06-09 Thread Andreas Krebbel via Gcc-patches

On 6/9/21 2:47 PM, Robin Dapp wrote:
>> I think the real problem is the expander name. That's why it could not be
>> found by optab. The second mode needs to be the int vector mode of op3.
>> With that change the testcases work as expected:
>>
>> diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
>> index c80d582a300d..ab605b3d2cf3 100644
>> --- a/gcc/config/s390/vector.md
>> +++ b/gcc/config/s390/vector.md
>> @@ -715,7 +715,7 @@
>> DONE;
>>   })
>>
>> -(define_expand "vcond_mask_<mode><mode>"
>> +(define_expand "vcond_mask_<mode><tointvec>"
>> [(set (match_operand:V 0 "register_operand" "")
>>  (if_then_else:V
>>   (eq (match_operand:<TOINTVEC> 3 "register_operand" "")
> 
> Ah, yes, it's indeed much simpler that way.  Attached the revised 
> version with the small change and the new tests as a single patch now.
> 
> Regtest and bootstrap were successful.

Ok. Thanks!

Andreas

Re: [PATCH] IBM Z: Remove match_scratch workaround

2021-06-02 Thread Andreas Krebbel via Gcc-patches

On 6/2/21 4:21 AM, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
> 
> 
> 
> Since commit dd1ef00c45ba ("Fix bug in the define_subst handling that
> made match_scratch unusable for multi-alternative patterns.") the
> workaround for that bug in *ashrdi3_31 is not only no
> longer necessary, but actually breaks the build.
> 
> Get rid of it by using only one alternative in (match_scratch).  It
> will be replicated as many times as needed in order to match the
> pattern with which (define_subst) is used.
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.md (*ashrdi3_31): Use a single
>   constraint.
>   * config/s390/subst.md (cconly_subst): Use a single constraint
>   in (match_scratch).
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/ashr.c: New test.

Ok. Thanks!

Andreas

[Committed] IBM Z: Support vector _Bool language extension

2021-05-18 Thread Andreas Krebbel via Gcc-patches

_Bool needs to be defined as macro in order to trigger the
context-sensitive macro expansion mechanism.

Bootstrapped and regtested on s390x.

Committed to mainline.

gcc/ChangeLog:

* config/s390/s390-c.c (s390_cpu_cpp_builtins_internal): Define
_Bool as macro expanding to _Bool.

gcc/testsuite/ChangeLog:

* gcc.target/s390/zvector/vec-_Bool.c: New test.
---
 gcc/config/s390/s390-c.c  | 2 ++
 gcc/testsuite/gcc.target/s390/zvector/vec-_Bool.c | 7 +++
 2 files changed, 9 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec-_Bool.c

diff --git a/gcc/config/s390/s390-c.c b/gcc/config/s390/s390-c.c
index 7dbd8bf5da3..4cce2619ce4 100644
--- a/gcc/config/s390/s390-c.c
+++ b/gcc/config/s390/s390-c.c
@@ -367,6 +367,8 @@ s390_cpu_cpp_builtins_internal (cpp_reader *pfile,
   old_opts, opts, "vector=vector", "vector");
   s390_def_or_undef_macro (pfile, target_flag_set_p (MASK_ZVECTOR),
   old_opts, opts, "bool=bool", "bool");
+  s390_def_or_undef_macro (pfile, target_flag_set_p (MASK_ZVECTOR),
+  old_opts, opts, "_Bool=_Bool", "_Bool");
   if (TARGET_ZVECTOR_P (opts->x_target_flags) && __vector_keyword == NULL)
{
  __vector_keyword = get_identifier ("__vector");
diff --git a/gcc/testsuite/gcc.target/s390/zvector/vec-_Bool.c b/gcc/testsuite/gcc.target/s390/zvector/vec-_Bool.c
new file mode 100644
index 000..525b950253c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/zvector/vec-_Bool.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-march=z13 -mzvector" } */
+
+vector _Bool char bc;
+vector _Bool short int bs;
+vector _Bool int bi;
+vector _Bool long long bll;
-- 
2.31.1

Re: [PATCH 1/1 v2] PR100281 C++: Fix SImode pointer handling

2021-05-13 Thread Andreas Krebbel via Gcc-patches

v1 -> v2: build_reference_type_for_mode and build_pointer_type_for_mode now
pick pointer mode if MODE argument is VOIDmode.

Bootstrapped and regression tested on x86_64 and s390x.

Ok for mainline and GCC 11?

Andreas


gcc/cp/ChangeLog:

PR c++/100281
* cvt.c (cp_convert_to_pointer): Use the size of the target
pointer type.
* tree.c (cp_build_reference_type): Call
cp_build_reference_type_for_mode with VOIDmode.
(cp_build_reference_type_for_mode): Rename from
cp_build_reference_type.  Add MODE argument and invoke
build_reference_type_for_mode.
(strip_typedefs): Use build_pointer_type_for_mode and
cp_build_reference_type_for_mode for pointers and references.

gcc/ChangeLog:

PR c++/100281
* tree.c (build_reference_type_for_mode)
(build_pointer_type_for_mode): Pick pointer mode if MODE argument
is VOIDmode.
(build_reference_type, build_pointer_type): Invoke
build_*_type_for_mode with VOIDmode.

gcc/testsuite/ChangeLog:

PR c++/100281
* g++.target/s390/pr100281-1.C: New test.
* g++.target/s390/pr100281-2.C: New test.
---
 gcc/cp/cvt.c   |  2 +-
 gcc/cp/tree.c  | 25 ++-
 gcc/testsuite/g++.target/s390/pr100281-1.C | 10 
 gcc/testsuite/g++.target/s390/pr100281-2.C |  9 +++
 gcc/tree.c | 29 ++
 5 files changed, 57 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/s390/pr100281-1.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr100281-2.C

diff --git a/gcc/cp/cvt.c b/gcc/cp/cvt.c
index f1687e804d1..7fa6e8df52b 100644
--- a/gcc/cp/cvt.c
+++ b/gcc/cp/cvt.c
@@ -232,7 +232,7 @@ cp_convert_to_pointer (tree type, tree expr, bool dofold,
 {
   if (TYPE_PRECISION (intype) == POINTER_SIZE)
return build1 (CONVERT_EXPR, type, expr);
-  expr = cp_convert (c_common_type_for_size (POINTER_SIZE, 0), expr,
+  expr = cp_convert (c_common_type_for_size (TYPE_PRECISION (type), 0), expr,
 complain);
   /* Modes may be different but sizes should be the same.  There
 is supposed to be some integral type that is the same width
diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 7f148b4b158..35faeff065a 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -1206,12 +1206,14 @@ vla_type_p (tree t)
   return false;
 }

-/* Return a reference type node referring to TO_TYPE.  If RVAL is
+
+/* Return a reference type node of MODE referring to TO_TYPE.  If MODE
+   is VOIDmode the standard pointer mode will be picked.  If RVAL is
true, return an rvalue reference type, otherwise return an lvalue
reference type.  If a type node exists, reuse it, otherwise create
a new one.  */
 tree
-cp_build_reference_type (tree to_type, bool rval)
+cp_build_reference_type_for_mode (tree to_type, machine_mode mode, bool rval)
 {
   tree lvalue_ref, t;

@@ -1224,7 +1226,8 @@ cp_build_reference_type (tree to_type, bool rval)
   to_type = TREE_TYPE (to_type);
 }

-  lvalue_ref = build_reference_type (to_type);
+  lvalue_ref = build_reference_type_for_mode (to_type, mode, false);
+
   if (!rval)
 return lvalue_ref;

@@ -1250,7 +1253,7 @@ cp_build_reference_type (tree to_type, bool rval)
 SET_TYPE_STRUCTURAL_EQUALITY (t);
   else if (TYPE_CANONICAL (to_type) != to_type)
 TYPE_CANONICAL (t)
-  = cp_build_reference_type (TYPE_CANONICAL (to_type), rval);
+  = cp_build_reference_type_for_mode (TYPE_CANONICAL (to_type), mode, rval);
   else
 TYPE_CANONICAL (t) = t;

@@ -1260,6 +1263,16 @@ cp_build_reference_type (tree to_type, bool rval)

 }

+/* Return a reference type node referring to TO_TYPE.  If RVAL is
+   true, return an rvalue reference type, otherwise return an lvalue
+   reference type.  If a type node exists, reuse it, otherwise create
+   a new one.  */
+tree
+cp_build_reference_type (tree to_type, bool rval)
+{
+  return cp_build_reference_type_for_mode (to_type, VOIDmode, rval);
+}
+
 /* Returns EXPR cast to rvalue reference type, like std::move.  */

 tree
@@ -1561,11 +1574,11 @@ strip_typedefs (tree t, bool *remove_attributes, unsigned int flags)
 {
 case POINTER_TYPE:
   type = strip_typedefs (TREE_TYPE (t), remove_attributes, flags);
-  result = build_pointer_type (type);
+  result = build_pointer_type_for_mode (type, TYPE_MODE (t), false);
   break;
 case REFERENCE_TYPE:
   type = strip_typedefs (TREE_TYPE (t), remove_attributes, flags);
-  result = cp_build_reference_type (type, TYPE_REF_IS_RVALUE (t));
+  result = cp_build_reference_type_for_mode (type, TYPE_MODE (t), TYPE_REF_IS_RVALUE (t));
   break;
 case OFFSET_TYPE:
   t0 = strip_typedefs (TYPE_OFFSET_BASETYPE (t), remove_attributes, flags);
diff --git a/gcc/testsuite/g++.target/s390/pr100281-1.C

Re: [PATCH 1/1] PR100281 C++: Fix SImode pointer handling

2021-05-12 Thread Andreas Krebbel via Gcc-patches

Ping

On 4/30/21 8:32 AM, Andreas Krebbel via Gcc-patches wrote:
> The problem appears to be triggered by two locations in the front-end
> where non-POINTER_SIZE pointers aren't handled right now.
> 
> 1. An assertion in strip_typedefs is triggered because the alignment
> of the types don't match. This in turn is caused by creating the new
> type with build_pointer_type instead of taking the type of the
> original pointer into account.
> 
> 2. An assertion in cp_convert_to_pointer is triggered which expects
> the target type to always have POINTER_SIZE.
> 
> Bootstrapped and regression tested on x86_64 and s390x.
> 
> Ok for mainline?
> 
> gcc/cp/ChangeLog:
> 
>   PR c++/100281
>   * cvt.c (cp_convert_to_pointer): Use the size of the target
>   pointer type.
>   * tree.c (cp_build_reference_type): Call
>   cp_build_reference_type_for_mode with VOIDmode.
>   (cp_build_reference_type_for_mode): Rename from
>   cp_build_reference_type.  Add MODE argument and invoke
>   build_reference_type_for_mode if MODE isn't VOIDmode.
>   (strip_typedefs): Use build_pointer_type_for_mode and
>   cp_build_reference_type_for_mode for pointers and references.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c++/100281
>   * g++.target/s390/pr100281-1.C: New test.
>   * g++.target/s390/pr100281-2.C: New test.
> ---
>  gcc/cp/cvt.c   |  2 +-
>  gcc/cp/tree.c  | 28 +-
>  gcc/testsuite/g++.target/s390/pr100281-1.C | 10 
>  gcc/testsuite/g++.target/s390/pr100281-2.C |  9 +++
>  4 files changed, 42 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/g++.target/s390/pr100281-1.C
>  create mode 100644 gcc/testsuite/g++.target/s390/pr100281-2.C
> 
> diff --git a/gcc/cp/cvt.c b/gcc/cp/cvt.c
> index f1687e804d1..7fa6e8df52b 100644
> --- a/gcc/cp/cvt.c
> +++ b/gcc/cp/cvt.c
> @@ -232,7 +232,7 @@ cp_convert_to_pointer (tree type, tree expr, bool dofold,
>  {
>if (TYPE_PRECISION (intype) == POINTER_SIZE)
>   return build1 (CONVERT_EXPR, type, expr);
> -  expr = cp_convert (c_common_type_for_size (POINTER_SIZE, 0), expr,
> +  expr = cp_convert (c_common_type_for_size (TYPE_PRECISION (type), 0), expr,
>complain);
>/* Modes may be different but sizes should be the same.  There
>is supposed to be some integral type that is the same width
> diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
> index a8bfd5fc053..3817b499e46 100644
> --- a/gcc/cp/tree.c
> +++ b/gcc/cp/tree.c
> @@ -1201,12 +1201,14 @@ vla_type_p (tree t)
>return false;
>  }
>  
> -/* Return a reference type node referring to TO_TYPE.  If RVAL is
> +
> +/* Return a reference type node of MODE referring to TO_TYPE.  If MODE
> +   is VOIDmode the standard pointer mode will be picked.  If RVAL is
> true, return an rvalue reference type, otherwise return an lvalue
> reference type.  If a type node exists, reuse it, otherwise create
> a new one.  */
>  tree
> -cp_build_reference_type (tree to_type, bool rval)
> +cp_build_reference_type_for_mode (tree to_type, machine_mode mode, bool rval)
>  {
>tree lvalue_ref, t;
>  
> @@ -1219,7 +1221,11 @@ cp_build_reference_type (tree to_type, bool rval)
>to_type = TREE_TYPE (to_type);
>  }
>  
> -  lvalue_ref = build_reference_type (to_type);
> +  if (mode == VOIDmode)
> +lvalue_ref = build_reference_type (to_type);
> +  else
> +lvalue_ref = build_reference_type_for_mode (to_type, mode, false);
> +
>if (!rval)
>  return lvalue_ref;
>  
> @@ -1245,7 +1251,7 @@ cp_build_reference_type (tree to_type, bool rval)
>  SET_TYPE_STRUCTURAL_EQUALITY (t);
>else if (TYPE_CANONICAL (to_type) != to_type)
>  TYPE_CANONICAL (t) 
> -  = cp_build_reference_type (TYPE_CANONICAL (to_type), rval);
> +  = cp_build_reference_type_for_mode (TYPE_CANONICAL (to_type), mode, rval);
>else
>  TYPE_CANONICAL (t) = t;
>  
> @@ -1255,6 +1261,16 @@ cp_build_reference_type (tree to_type, bool rval)
>  
>  }
>  
> +/* Return a reference type node referring to TO_TYPE.  If RVAL is
> +   true, return an rvalue reference type, otherwise return an lvalue
> +   reference type.  If a type node exists, reuse it, otherwise create
> +   a new one.  */
> +tree
> +cp_build_reference_type (tree to_type, bool rval)
> +{
> +  return cp_build_reference_type_for_mode (to_type, VOIDmode, rval);
> +}
> +
>  /* Returns EXPR cast to rvalue reference type, like std::move.  */
>  
>  tree
> @@ -1556,11 +1572,11 @@ strip_typedefs (tree t, bool *remove_attributes, unsigned

Re: [PATCH] s390: Add more vcond_mask patterns.

2021-05-11 Thread Andreas Krebbel via Gcc-patches

Hi Robin,


On 5/5/21 5:18 PM, Robin Dapp wrote:
...
> diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
> index c80d582a300..7c730432d80 100644
> --- a/gcc/config/s390/vector.md
> +++ b/gcc/config/s390/vector.md
> @@ -36,6 +36,7 @@
>  (define_mode_iterator V_HW2 [V16QI V8HI V4SI V2DI V2DF (V4SF "TARGET_VXE")
>(V1TF "TARGET_VXE") (TF "TARGET_VXE")])
>
> +

whitespace diff?

>  (define_mode_iterator V_HW_64 [V2DI V2DF])
>  (define_mode_iterator VT_HW_HSDT [V8HI V4SI V4SF V2DI V2DF V1TI V1TF TI TF])
>  (define_mode_iterator V_HW_HSD [V8HI V4SI (V4SF "TARGET_VXE") V2DI V2DF])
> @@ -725,6 +726,26 @@
>"TARGET_VX"
>"operands[4] = CONST0_RTX (mode);")
>
> +(define_expand "vcond_mask_"
> +  [(set (match_operand:VX_VEC_CONV_BFP 0 "register_operand" "")
> + (if_then_else:VX_VEC_CONV_BFP
> +  (eq (match_operand:VX_VEC_CONV_INT 3 "register_operand" "")
> +  (match_dup 4))
> +  (match_operand:VX_VEC_CONV_BFP 2 "register_operand" "")
> +  (match_operand:VX_VEC_CONV_BFP 1 "register_operand" "")))]
> +  "TARGET_VX"
> +  "operands[4] = CONST0_RTX (mode);")

This should be covered by the existing pattern already.

> +
> +(define_expand "vcond_mask_"
> +  [(set (match_operand:VX_VEC_CONV_INT 0 "register_operand" "")
> + (if_then_else:VX_VEC_CONV_INT
> +  (eq (match_operand:VX_VEC_CONV_BFP 3 "register_operand" "")
> +  (match_dup 4))
> +  (match_operand:VX_VEC_CONV_INT 2 "register_operand" "")
> +  (match_operand:VX_VEC_CONV_INT 1 "register_operand" "")))]
> +  "TARGET_VX"
> +  "operands[4] = CONST0_RTX (mode);")

op3 is supposed to be a comparison result operand. A vector float mode looks
wrong here.

I think the real problem is the expander name. That's why it could not be found
by optab. The second mode needs to be the int vector mode of op3. With that
change the testcases work as expected:

diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index c80d582a300d..ab605b3d2cf3 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -715,7 +715,7 @@
   DONE;
 })

-(define_expand "vcond_mask_<mode><mode>"
+(define_expand "vcond_mask_<mode><tointvec>"
   [(set (match_operand:V 0 "register_operand" "")
(if_then_else:V
 (eq (match_operand:<TOINTVEC> 3 "register_operand" "")


> +
>
>  ; We only have HW support for byte vectors.  The middle-end is
>  ; supposed to lower the mode if required.
> diff --git a/gcc/testsuite/gcc.target/s390/vector/vcond-mixed-double.c b/gcc/testsuite/gcc.target/s390/vector/vcond-mixed-double.c
> new file mode 100644
> index 000..8795d08a732
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/vector/vcond-mixed-double.c
> @@ -0,0 +1,41 @@
> +/* Check for vectorization of mixed conditionals.  */
> +/* { dg-do compile { target { s390*-*-* } } } */
> +/* { dg-options "-O3 -march=z14 -mzarch" } */

I think you have to add -fdump-tree-vect-details here. Otherwise the dump scan
below will just be reported as "unresolved".

> +
> +double xd[1024];
> +double zd[1024];
> +double wd[1024];
> +
> +long xl[1024];
> +long zl[1024];
> +long wl[1024];
> +
> +void foold ()
> +{
> +  int i;
> +  for (i = 0; i < 1024; ++i)
> +zd[i] = xl[i] ? zd[i] : wd[i];
> +}
> +
> +void foodl ()
> +{
> +  int i;
> +  for (i = 0; i < 1024; ++i)
> +zl[i] = xd[i] ? zl[i] : wl[i];
> +}
> +
> +void foold2 ()
> +{
> +  int i;
> +  for (i = 0; i < 1024; ++i)
> +zd[i] = (xd[i] > 0) ? zd[i] : wd[i];
> +}
> +
> +void foold3 ()
> +{
> +  int i;
> +  for (i = 0; i < 1024; ++i)
> +zd[i] = (xd[i] > 0. & wd[i] < 0.) ? zd[i] : wd[i];
> +}
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
> diff --git a/gcc/testsuite/gcc.target/s390/vector/vcond-mixed-float.c b/gcc/testsuite/gcc.target/s390/vector/vcond-mixed-float.c
> new file mode 100644
> index 000..1153cace420
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/vector/vcond-mixed-float.c
> @@ -0,0 +1,41 @@
> +/* Check for vectorization of mixed conditionals.  */
> +/* { dg-do compile { target { s390*-*-* } } } */
> +/* { dg-options "-O3 -march=z15 -mzarch" } */

Likewise.

> +
> +float xf[1024];
> +float zf[1024];
> +float wf[1024];
> +
> +int xi[1024];
> +int zi[1024];
> +int wi[1024];
> +
> +void fooif ()
> +{
> +  int i;
> +  for (i = 0; i < 1024; ++i)
> +zf[i] = xi[i] ? zf[i] : wf[i];
> +}
> +
> +void foofi ()
> +{
> +  int i;
> +  for (i = 0; i < 1024; ++i)
> +zi[i] = xf[i] ? zi[i] : wi[i];
> +}
> +
> +void fooif2 ()
> +{
> +  int i;
> +  for (i = 0; i < 1024; ++i)
> +zf[i] = (xf[i] > 0) ? zf[i] : wf[i];
> +}
> +
> +void fooif3 ()
> +{
> +  int i;
> +  for (i = 0; i < 1024; ++i)
> +zf[i] = (xf[i] > 0.f & wf[i] < 0.f) ? zf[i] : wf[i];
> +}
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" } } */
> --
> 2.23.0
>

Andreas
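
For reference, the mixed conditionals in the two test files reduce to a lane-wise select where the mask elements and the data elements have different types of the same width. The following is only a scalar sketch of that operation, assuming 64-bit lanes as in vcond-mixed-double.c; the function name select_by_mask is hypothetical and this is not the vcond_mask expander itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative scalar model: for each lane, a non-zero integer mask
   element selects IF_TRUE, a zero mask element selects IF_FALSE.
   This mirrors loops such as  zd[i] = xl[i] ? zd[i] : wd[i];  */
void
select_by_mask (double *out, const double *if_true, const double *if_false,
                const int64_t *mask, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = mask[i] != 0 ? if_true[i] : if_false[i];
}
```

The point of the review comment is visible here: the mask is an integer vector even when the selected data is floating point, so the expander name has to carry the integer vector mode for the optab lookup to find it.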

Re: [PATCH] testsuite/s390: Fix risbg-ll-3.c f2_cconly test.

2021-05-11 Thread Andreas Krebbel via Gcc-patches

On 5/4/21 5:08 PM, Robin Dapp wrote:
> Hi,
> 
> instead of selecting bits 62 to (wraparound) 59 from r2 and inserting 
> them into r3, we select bits 60 to 62 from r3 and insert them into r2 
> nowadays.  Adjust the test accordingly.
> 
> Is this OK?
> 
> Regards
>   Robin
> 
> gcc/testsuite/ChangeLog:
> 
>  * gcc.target/s390/risbg-ll-3.c: Change match pattern.
> 

Ok. Thanks!

Andreas
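
As a rough illustration of what the adjusted match pattern describes, below is a small C model of a rotate-then-insert-selected-bits operation. It assumes the usual z/Architecture convention that bit 0 is the most significant bit and that a start bit greater than the end bit means the selected range wraps around. The function name risbg_model and its parameters are hypothetical, and the model ignores condition code effects.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model: rotate SRC left by ROT bits, then copy the bits
   START..END (bit 0 = most significant bit, wrapping if START > END)
   into DST.  The remaining bits of DST are kept, or cleared if
   ZERO_REST is set.  START and END are expected to be in 0..63.  */
uint64_t
risbg_model (uint64_t dst, uint64_t src, unsigned int start,
             unsigned int end, unsigned int rot, bool zero_rest)
{
  uint64_t from_start = ~UINT64_C (0) >> start;      /* bits START..63 */
  uint64_t up_to_end = ~UINT64_C (0) << (63 - end);  /* bits 0..END */
  uint64_t mask = start <= end ? (from_start & up_to_end)
                               : (from_start | up_to_end);

  rot &= 63;
  uint64_t rotated = rot ? (src << rot) | (src >> (64 - rot)) : src;

  return (rotated & mask) | (zero_rest ? 0 : (dst & ~mask));
}
```

In this model, "select bits 60 to 62 from r3 and insert them into r2" corresponds to risbg_model (r2, r3, 60, 62, 0, false), while the old form with a wraparound range from 62 to 59 selects every bit except 60 and 61.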

Re: [PATCH] IBM Z: Fix error checking for builtin vec_permi

2021-05-06 Thread Andreas Krebbel via Gcc-patches

On 5/6/21 9:56 AM, Marius Hillenbrand wrote:
> Hi,
> 
> this patch fixes the check of immediate operands to the builtin vec_permi and
> adds a new test for this built-in.
> 
> Regtested and bootstrapped on s390x.
> 
> Is it OK for master? Is it OK for backporting to gcc-11?
> 
> Regards,
> Marius
> 
> 
> --8<--8<-8<-
> 
> The builtin vec_permi is peculiar in that its immediate operand is
> encoded differently than the immediate operand that is backing the
> builtin. This fixes the check for the immediate operand, adding a
> regression test in the process.
> 
> This partially reverts commit 3191c1f4488d1f7563b563d7ae2a102a26f16d82
> 
> gcc/ChangeLog:
> 
> 2021-05-04  Marius Hillenbrand  
> 
> * config/s390/s390-builtins.def (O_M5, O1_M5, ...): Remove unused macros.
> (s390_vec_permi_s64, s390_vec_permi_b64, s390_vec_permi_u64)
> (s390_vec_permi_dbl, s390_vpdi): Use the O3_U2 type for the immediate
> operand.
>   * config/s390/s390.c (s390_const_operand_ok): Remove unused
>   values.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/s390/zvector/imm-range-error-1.c: Fix test for
>   __builtin_s390_vpdi.
> * gcc.target/s390/zvector/vec-permi.c: New test for builtin
>   vec_permi.

Ok for mainline and GCC 11 branch. Thanks for the fix!

Andreas
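
To illustrate the remark in the vec_permi message above that the builtin's immediate is encoded differently from the immediate of the instruction backing it, here is a small sketch. It assumes that the vec_permi selector is a 2-bit value (0..3) and that its two bits land in bit positions 2 and 0 of the 4-bit VPDI mask. That mapping is an assumption made for this sketch, not something stated in the patch, and both function names are hypothetical.

```c
#include <assert.h>

/* Nonzero if SEL is a valid 2-bit vec_permi selector.  */
static int
vec_permi_selector_ok (unsigned sel)
{
  return sel <= 3;
}

/* Assumed translation of the 2-bit selector into the 4-bit mask of
   the underlying instruction: selector bit 1 moves to mask bit 2,
   selector bit 0 stays in place.  */
static unsigned
vec_permi_to_vpdi_mask (unsigned sel)
{
  assert (vec_permi_selector_ok (sel));
  return ((sel & 2) << 1) | (sel & 1);
}
```

Under this assumption the selectors 0, 1, 2, 3 become the masks 0, 1, 4, 5, so a range check written for the instruction mask would accept 4 and reject 2 and 3, which is the wrong way round for the builtin.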

Re: [PATCH v2] IBM Z: Handle hard registers in s390_md_asm_adjust()

2021-05-05 Thread Andreas Krebbel via Gcc-patches

On 5/3/21 1:09 PM, Ilya Leoshkevich wrote:
> On Fri, 2021-04-30 at 08:49 +0200, Andreas Krebbel wrote:
>> On 4/28/21 3:48 AM, Ilya Leoshkevich wrote:
>>> Bootstrapped and regtested on s390x-redhat-linux.  Tested with
>>> valgrind
>>> too (PR 100278 is now fixed).  Ok for master?
>>>
>>> v1:
>>> https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568771.html
>>> v1 -> v2: Use the UNSPEC pattern, which is less efficient, but is
>>> more
>>>   on the "obviously correct" side than gen_raw_SUBREG().
>>>
>>>
>>>
>>> gen_fprx2_to_tf() and gen_tf_to_fprx2() cannot handle hard
>>> registers,
>>> since the subregs they create do not pass validation.  Change
>>> s390_md_asm_adjust() to manually copy between hard VRs and FPRs
>>> instead
>>> of using these two functions.
>>>
>>> gcc/ChangeLog:
>>>
>>> PR target/100217
>>> * config/s390/s390.c (s390_hard_fp_reg_p): New function.
>>> (s390_md_asm_adjust): Handle hard registers.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> PR target/100217
>>> * gcc.target/s390/vector/long-double-asm-in-out-hard-fp-
>>> reg.c: New test.
>>> * gcc.target/s390/vector/long-double-asm-inout-hard-fp-
>>> reg.c: New test.
>>
>> Ok. Thanks!
>>
>> Andreas
> 
> Thanks!
> 
> I forgot to ask: ok for gcc-11 branch?

Ok for GCC 11 branch as well. Thanks!

Andreas

Re: [PATCH v2] IBM Z: Handle hard registers in s390_md_asm_adjust()

2021-04-30 Thread Andreas Krebbel via Gcc-patches

On 4/28/21 3:48 AM, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Tested with valgrind
> too (PR 100278 is now fixed).  Ok for master?
> 
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568771.html
> v1 -> v2: Use the UNSPEC pattern, which is less efficient, but is more
>   on the "obviously correct" side than gen_raw_SUBREG().
> 
> 
> 
> gen_fprx2_to_tf() and gen_tf_to_fprx2() cannot handle hard registers,
> since the subregs they create do not pass validation.  Change
> s390_md_asm_adjust() to manually copy between hard VRs and FPRs instead
> of using these two functions.
> 
> gcc/ChangeLog:
> 
>   PR target/100217
>   * config/s390/s390.c (s390_hard_fp_reg_p): New function.
>   (s390_md_asm_adjust): Handle hard registers.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/100217
>   * gcc.target/s390/vector/long-double-asm-in-out-hard-fp-reg.c: New test.
>   * gcc.target/s390/vector/long-double-asm-inout-hard-fp-reg.c: New test.

Ok. Thanks!

Andreas

[PATCH 1/1] PR100281 C++: Fix SImode pointer handling

2021-04-30 Thread Andreas Krebbel via Gcc-patches

The problem appears to be triggered by two locations in the front-end
where non-POINTER_SIZE pointers aren't handled right now.

1. An assertion in strip_typedefs is triggered because the alignments
of the types don't match. This in turn is caused by creating the new
type with build_pointer_type instead of taking the type of the
original pointer into account.

2. An assertion in cp_convert_to_pointer is triggered which expects
the target type to always have POINTER_SIZE.

Bootstrapped and regression tested on x86_64 and s390x.

Ok for mainline?

gcc/cp/ChangeLog:

PR c++/100281
* cvt.c (cp_convert_to_pointer): Use the size of the target
pointer type.
* tree.c (cp_build_reference_type): Call
cp_build_reference_type_for_mode with VOIDmode.
(cp_build_reference_type_for_mode): Rename from
cp_build_reference_type.  Add MODE argument and invoke
build_reference_type_for_mode if MODE isn't VOIDmode.
(strip_typedefs): Use build_pointer_type_for_mode and
cp_build_reference_type_for_mode for pointers and references.

gcc/testsuite/ChangeLog:

PR c++/100281
* g++.target/s390/pr100281-1.C: New test.
* g++.target/s390/pr100281-2.C: New test.
---
 gcc/cp/cvt.c   |  2 +-
 gcc/cp/tree.c  | 28 +-
 gcc/testsuite/g++.target/s390/pr100281-1.C | 10 
 gcc/testsuite/g++.target/s390/pr100281-2.C |  9 +++
 4 files changed, 42 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/s390/pr100281-1.C
 create mode 100644 gcc/testsuite/g++.target/s390/pr100281-2.C

diff --git a/gcc/cp/cvt.c b/gcc/cp/cvt.c
index f1687e804d1..7fa6e8df52b 100644
--- a/gcc/cp/cvt.c
+++ b/gcc/cp/cvt.c
@@ -232,7 +232,7 @@ cp_convert_to_pointer (tree type, tree expr, bool dofold,
 {
   if (TYPE_PRECISION (intype) == POINTER_SIZE)
return build1 (CONVERT_EXPR, type, expr);
-  expr = cp_convert (c_common_type_for_size (POINTER_SIZE, 0), expr,
+  expr = cp_convert (c_common_type_for_size (TYPE_PRECISION (type), 0), expr,
 complain);
   /* Modes may be different but sizes should be the same.  There
 is supposed to be some integral type that is the same width
diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index a8bfd5fc053..3817b499e46 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -1201,12 +1201,14 @@ vla_type_p (tree t)
   return false;
 }
 
-/* Return a reference type node referring to TO_TYPE.  If RVAL is
+
+/* Return a reference type node of MODE referring to TO_TYPE.  If MODE
+   is VOIDmode the standard pointer mode will be picked.  If RVAL is
true, return an rvalue reference type, otherwise return an lvalue
reference type.  If a type node exists, reuse it, otherwise create
a new one.  */
 tree
-cp_build_reference_type (tree to_type, bool rval)
+cp_build_reference_type_for_mode (tree to_type, machine_mode mode, bool rval)
 {
   tree lvalue_ref, t;
 
@@ -1219,7 +1221,11 @@ cp_build_reference_type (tree to_type, bool rval)
   to_type = TREE_TYPE (to_type);
 }
 
-  lvalue_ref = build_reference_type (to_type);
+  if (mode == VOIDmode)
+lvalue_ref = build_reference_type (to_type);
+  else
+lvalue_ref = build_reference_type_for_mode (to_type, mode, false);
+
   if (!rval)
 return lvalue_ref;
 
@@ -1245,7 +1251,7 @@ cp_build_reference_type (tree to_type, bool rval)
 SET_TYPE_STRUCTURAL_EQUALITY (t);
   else if (TYPE_CANONICAL (to_type) != to_type)
 TYPE_CANONICAL (t) 
-  = cp_build_reference_type (TYPE_CANONICAL (to_type), rval);
+  = cp_build_reference_type_for_mode (TYPE_CANONICAL (to_type), mode, rval);
   else
 TYPE_CANONICAL (t) = t;
 
@@ -1255,6 +1261,16 @@ cp_build_reference_type (tree to_type, bool rval)
 
 }
 
+/* Return a reference type node referring to TO_TYPE.  If RVAL is
+   true, return an rvalue reference type, otherwise return an lvalue
+   reference type.  If a type node exists, reuse it, otherwise create
+   a new one.  */
+tree
+cp_build_reference_type (tree to_type, bool rval)
+{
+  return cp_build_reference_type_for_mode (to_type, VOIDmode, rval);
+}
+
 /* Returns EXPR cast to rvalue reference type, like std::move.  */
 
 tree
@@ -1556,11 +1572,11 @@ strip_typedefs (tree t, bool *remove_attributes, unsigned int flags)
 {
 case POINTER_TYPE:
   type = strip_typedefs (TREE_TYPE (t), remove_attributes, flags);
-  result = build_pointer_type (type);
+  result = build_pointer_type_for_mode (type, TYPE_MODE (t), false);
   break;
 case REFERENCE_TYPE:
   type = strip_typedefs (TREE_TYPE (t), remove_attributes, flags);
-  result = cp_build_reference_type (type, TYPE_REF_IS_RVALUE (t));
+  result = cp_build_reference_type_for_mode (type, TYPE_MODE (t), TYPE_REF_IS_RVALUE (t));
   break;
 case OFFSET_TYPE:
   t0 = strip_typedefs (TYPE_OFFSET_BASETYPE (t), remove_attributes, flags);

Re: [PATCH 1/1] PR100281 Fix SImode pointer handling

2021-04-28 Thread Andreas Krebbel via Gcc-patches

On 4/28/21 10:12 AM, Richard Biener wrote:
> On Wed, Apr 28, 2021 at 8:54 AM Andreas Krebbel via Gcc-patches
>  wrote:
>>
>> The problem appears to be triggered by two locations in the front-end
>> where non-POINTER_SIZE pointers aren't handled right now.
>>
>> 1. An assertion in strip_typedefs is triggered because the alignments
>> of the types don't match. This in turn is caused by creating the new
>> type with build_pointer_type instead of taking the type of the
>> original pointer into account.
>>
>> 2. An assertion in cp_convert_to_pointer is triggered which expects
>> the target type to always have POINTER_SIZE.
>>
>> Ok for mainline?
>>
>> gcc/cp/ChangeLog:
>>
>> PR c++/100281
>> * cvt.c (cp_convert_to_pointer): Use the size of the target
>> pointer type.
>> * tree.c (strip_typedefs): Use build_pointer_type_for_mode for
>> non-POINTER_SIZE pointers.
>>
>> gcc/testsuite/ChangeLog:
>>
>> PR c++/100281
>> * g++.target/s390/pr100281.C: New test.
>> ---
>>  gcc/cp/cvt.c |  2 +-
>>  gcc/cp/tree.c|  5 -
>>  gcc/testsuite/g++.target/s390/pr100281.C | 10 ++
>>  3 files changed, 15 insertions(+), 2 deletions(-)
>>  create mode 100644 gcc/testsuite/g++.target/s390/pr100281.C
>>
>> diff --git a/gcc/cp/cvt.c b/gcc/cp/cvt.c
>> index f1687e804d1..7fa6e8df52b 100644
>> --- a/gcc/cp/cvt.c
>> +++ b/gcc/cp/cvt.c
>> @@ -232,7 +232,7 @@ cp_convert_to_pointer (tree type, tree expr, bool dofold,
>>  {
>>if (TYPE_PRECISION (intype) == POINTER_SIZE)
>> return build1 (CONVERT_EXPR, type, expr);
>> -  expr = cp_convert (c_common_type_for_size (POINTER_SIZE, 0), expr,
>> +  expr = cp_convert (c_common_type_for_size (TYPE_PRECISION (type), 0), 
>> expr,
>>  complain);
>>/* Modes may be different but sizes should be the same.  There
>>  is supposed to be some integral type that is the same width
>> diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
>> index a8bfd5fc053..6f6b732c9c9 100644
>> --- a/gcc/cp/tree.c
>> +++ b/gcc/cp/tree.c
>> @@ -1556,7 +1556,10 @@ strip_typedefs (tree t, bool *remove_attributes, 
>> unsigned int flags)
>>  {
>>  case POINTER_TYPE:
>>type = strip_typedefs (TREE_TYPE (t), remove_attributes, flags);
>> -  result = build_pointer_type (type);
>> +  if (TYPE_PRECISION (t) == POINTER_SIZE)
>> +   result = build_pointer_type (type);
>> +  else
>> +   result = build_pointer_type_for_mode (type, TYPE_MODE (t), false);
> 
> I wonder under which circumstances re-using the original mode will fail?  In
> particular I do not like the TYPE_PRECISION check.  Supposedly you
> were thinking of playing safe?
> 
>>break;
>>  case REFERENCE_TYPE:
>>type = strip_typedefs (TREE_TYPE (t), remove_attributes, flags);
> 
> There's code below with exactly the same issue for reference types which
> would need adjustments to cp_build_reference_type.
> 
>> diff --git a/gcc/testsuite/g++.target/s390/pr100281.C 
>> b/gcc/testsuite/g++.target/s390/pr100281.C
>> new file mode 100644
>> index 000..f45798c3879
>> --- /dev/null
>> +++ b/gcc/testsuite/g++.target/s390/pr100281.C
>> @@ -0,0 +1,10 @@
>> +// PR C++/100281
>> +// { dg-do compile }
>> +
>> +typedef void * __attribute__((mode (SI))) __ptr32_t;
>> +
>> +void foo(){
>> +  unsigned int b = 100;
>> +  __ptr32_t a;
>> +  a = b; /* { dg-error "invalid conversion from 'unsigned int' to 
>> '__ptr32_t'.*" } */
>> +}
>> --
>> 2.30.2
>>

Like so?

diff --git a/gcc/cp/cvt.c b/gcc/cp/cvt.c
index f1687e804d1..7fa6e8df52b 100644
--- a/gcc/cp/cvt.c
+++ b/gcc/cp/cvt.c
@@ -232,7 +232,7 @@ cp_convert_to_pointer (tree type, tree expr, bool dofold,
 {
   if (TYPE_PRECISION (intype) == POINTER_SIZE)
return build1 (CONVERT_EXPR, type, expr);
-  expr = cp_convert (c_common_type_for_size (POINTER_SIZE, 0), expr,
+  expr = cp_convert (c_common_type_for_size (TYPE_PRECISION (type), 0), expr,
 complain);
   /* Modes may be different but sizes should be the same.  There
 is supposed to be some integral type that is the same width
diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index a8bfd5fc053..fe5c414c8d9 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -1201,12 +1201,14 @@ vla_type_p (tree t)
   return false;
 }

-/* Return a reference type node referrin

Re: [PATCH 1/1] PR100281 Fix SImode pointer handling

2021-04-28 Thread Andreas Krebbel via Gcc-patches

On 4/28/21 10:22 AM, Andreas Krebbel via Gcc-patches wrote:
> On 4/28/21 10:12 AM, Richard Biener wrote:
>> On Wed, Apr 28, 2021 at 8:54 AM Andreas Krebbel via Gcc-patches
>>  wrote:
>>>
>>> The problem appears to be triggered by two locations in the front-end
>>> where non-POINTER_SIZE pointers aren't handled right now.
>>>
>>> 1. An assertion in strip_typedefs is triggered because the alignments
>>> of the types don't match. This in turn is caused by creating the new
>>> type with build_pointer_type instead of taking the type of the
>>> original pointer into account.
>>>
>>> 2. An assertion in cp_convert_to_pointer is triggered which expects
>>> the target type to always have POINTER_SIZE.
>>>
>>> Ok for mainline?
>>>
>>> gcc/cp/ChangeLog:
>>>
>>> PR c++/100281
>>> * cvt.c (cp_convert_to_pointer): Use the size of the target
>>> pointer type.
>>> * tree.c (strip_typedefs): Use build_pointer_type_for_mode for
>>> non-POINTER_SIZE pointers.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> PR c++/100281
>>> * g++.target/s390/pr100281.C: New test.
>>> ---
>>>  gcc/cp/cvt.c |  2 +-
>>>  gcc/cp/tree.c|  5 -
>>>  gcc/testsuite/g++.target/s390/pr100281.C | 10 ++
>>>  3 files changed, 15 insertions(+), 2 deletions(-)
>>>  create mode 100644 gcc/testsuite/g++.target/s390/pr100281.C
>>>
>>> diff --git a/gcc/cp/cvt.c b/gcc/cp/cvt.c
>>> index f1687e804d1..7fa6e8df52b 100644
>>> --- a/gcc/cp/cvt.c
>>> +++ b/gcc/cp/cvt.c
>>> @@ -232,7 +232,7 @@ cp_convert_to_pointer (tree type, tree expr, bool 
>>> dofold,
>>>  {
>>>if (TYPE_PRECISION (intype) == POINTER_SIZE)
>>> return build1 (CONVERT_EXPR, type, expr);
>>> -  expr = cp_convert (c_common_type_for_size (POINTER_SIZE, 0), expr,
>>> +  expr = cp_convert (c_common_type_for_size (TYPE_PRECISION (type), 
>>> 0), expr,
>>>  complain);
>>>/* Modes may be different but sizes should be the same.  There
>>>  is supposed to be some integral type that is the same width
>>> diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
>>> index a8bfd5fc053..6f6b732c9c9 100644
>>> --- a/gcc/cp/tree.c
>>> +++ b/gcc/cp/tree.c
>>> @@ -1556,7 +1556,10 @@ strip_typedefs (tree t, bool *remove_attributes, 
>>> unsigned int flags)
>>>  {
>>>  case POINTER_TYPE:
>>>type = strip_typedefs (TREE_TYPE (t), remove_attributes, flags);
>>> -  result = build_pointer_type (type);
>>> +  if (TYPE_PRECISION (t) == POINTER_SIZE)
>>> +   result = build_pointer_type (type);
>>> +  else
>>> +   result = build_pointer_type_for_mode (type, TYPE_MODE (t), false);
>>
>> I wonder under which circumstances re-using the original mode will fail?  In
>> particular I do not like the TYPE_PRECISION check.  Supposedly you
>> were thinking of playing safe?
> 
> Yes. build_pointer_type_for_mode carries some additional logic compared
> to just build_pointer_type and I wanted to avoid impacting other targets
> that way.

build_pointer_type just calls build_pointer_type_for_mode. I'll drop the
check then and re-test.

> 
>>
>>>break;
>>>  case REFERENCE_TYPE:
>>>type = strip_typedefs (TREE_TYPE (t), remove_attributes, flags);
>>
>> There's code below with exactly the same issue for reference types which
>> would need adjustments to cp_build_reference_type.
> 
> Ok. I'll have a look.
> 
> Andreas
> 
>>
>>> diff --git a/gcc/testsuite/g++.target/s390/pr100281.C 
>>> b/gcc/testsuite/g++.target/s390/pr100281.C
>>> new file mode 100644
>>> index 000..f45798c3879
>>> --- /dev/null
>>> +++ b/gcc/testsuite/g++.target/s390/pr100281.C
>>> @@ -0,0 +1,10 @@
>>> +// PR C++/100281
>>> +// { dg-do compile }
>>> +
>>> +typedef void * __attribute__((mode (SI))) __ptr32_t;
>>> +
>>> +void foo(){
>>> +  unsigned int b = 100;
>>> +  __ptr32_t a;
>>> +  a = b; /* { dg-error "invalid conversion from 'unsigned int' to 
>>> '__ptr32_t'.*" } */
>>> +}
>>> --
>>> 2.30.2
>>>
>

Re: [PATCH 1/1] PR100281 Fix SImode pointer handling

2021-04-28 Thread Andreas Krebbel via Gcc-patches

On 4/28/21 10:12 AM, Richard Biener wrote:
> On Wed, Apr 28, 2021 at 8:54 AM Andreas Krebbel via Gcc-patches
>  wrote:
>>
>> The problem appears to be triggered by two locations in the front-end
>> where non-POINTER_SIZE pointers aren't handled right now.
>>
>> 1. An assertion in strip_typedefs is triggered because the alignments
>> of the types don't match. This in turn is caused by creating the new
>> type with build_pointer_type instead of taking the type of the
>> original pointer into account.
>>
>> 2. An assertion in cp_convert_to_pointer is triggered which expects
>> the target type to always have POINTER_SIZE.
>>
>> Ok for mainline?
>>
>> gcc/cp/ChangeLog:
>>
>> PR c++/100281
>> * cvt.c (cp_convert_to_pointer): Use the size of the target
>> pointer type.
>> * tree.c (strip_typedefs): Use build_pointer_type_for_mode for
>> non-POINTER_SIZE pointers.
>>
>> gcc/testsuite/ChangeLog:
>>
>> PR c++/100281
>> * g++.target/s390/pr100281.C: New test.
>> ---
>>  gcc/cp/cvt.c |  2 +-
>>  gcc/cp/tree.c|  5 -
>>  gcc/testsuite/g++.target/s390/pr100281.C | 10 ++
>>  3 files changed, 15 insertions(+), 2 deletions(-)
>>  create mode 100644 gcc/testsuite/g++.target/s390/pr100281.C
>>
>> diff --git a/gcc/cp/cvt.c b/gcc/cp/cvt.c
>> index f1687e804d1..7fa6e8df52b 100644
>> --- a/gcc/cp/cvt.c
>> +++ b/gcc/cp/cvt.c
>> @@ -232,7 +232,7 @@ cp_convert_to_pointer (tree type, tree expr, bool dofold,
>>  {
>>if (TYPE_PRECISION (intype) == POINTER_SIZE)
>> return build1 (CONVERT_EXPR, type, expr);
>> -  expr = cp_convert (c_common_type_for_size (POINTER_SIZE, 0), expr,
>> +  expr = cp_convert (c_common_type_for_size (TYPE_PRECISION (type), 0), 
>> expr,
>>  complain);
>>/* Modes may be different but sizes should be the same.  There
>>  is supposed to be some integral type that is the same width
>> diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
>> index a8bfd5fc053..6f6b732c9c9 100644
>> --- a/gcc/cp/tree.c
>> +++ b/gcc/cp/tree.c
>> @@ -1556,7 +1556,10 @@ strip_typedefs (tree t, bool *remove_attributes, 
>> unsigned int flags)
>>  {
>>  case POINTER_TYPE:
>>type = strip_typedefs (TREE_TYPE (t), remove_attributes, flags);
>> -  result = build_pointer_type (type);
>> +  if (TYPE_PRECISION (t) == POINTER_SIZE)
>> +   result = build_pointer_type (type);
>> +  else
>> +   result = build_pointer_type_for_mode (type, TYPE_MODE (t), false);
> 
> I wonder under which circumstances re-using the original mode will fail?  In
> particular I do not like the TYPE_PRECISION check.  Supposedly you
> were thinking of playing safe?

Yes. build_pointer_type_for_mode carries some additional logic compared
to just build_pointer_type and I wanted to avoid impacting other targets
that way.

> 
>>break;
>>  case REFERENCE_TYPE:
>>type = strip_typedefs (TREE_TYPE (t), remove_attributes, flags);
> 
> There's code below with exactly the same issue for reference types which
> would need adjustments to cp_build_reference_type.

Ok. I'll have a look.

Andreas

> 
>> diff --git a/gcc/testsuite/g++.target/s390/pr100281.C 
>> b/gcc/testsuite/g++.target/s390/pr100281.C
>> new file mode 100644
>> index 000..f45798c3879
>> --- /dev/null
>> +++ b/gcc/testsuite/g++.target/s390/pr100281.C
>> @@ -0,0 +1,10 @@
>> +// PR C++/100281
>> +// { dg-do compile }
>> +
>> +typedef void * __attribute__((mode (SI))) __ptr32_t;
>> +
>> +void foo(){
>> +  unsigned int b = 100;
>> +  __ptr32_t a;
>> +  a = b; /* { dg-error "invalid conversion from 'unsigned int' to 
>> '__ptr32_t'.*" } */
>> +}
>> --
>> 2.30.2
>>

[PATCH 1/1] PR100281 Fix SImode pointer handling

2021-04-27 Thread Andreas Krebbel via Gcc-patches

The problem appears to be triggered by two locations in the front-end
where non-POINTER_SIZE pointers aren't handled right now.

1. An assertion in strip_typedefs is triggered because the alignments
of the types don't match. This in turn is caused by creating the new
type with build_pointer_type instead of taking the type of the
original pointer into account.

2. An assertion in cp_convert_to_pointer is triggered which expects
the target type to always have POINTER_SIZE.

Ok for mainline?

gcc/cp/ChangeLog:

PR c++/100281
* cvt.c (cp_convert_to_pointer): Use the size of the target
pointer type.
* tree.c (strip_typedefs): Use build_pointer_type_for_mode for
non-POINTER_SIZE pointers.

gcc/testsuite/ChangeLog:

PR c++/100281
* g++.target/s390/pr100281.C: New test.
---
 gcc/cp/cvt.c |  2 +-
 gcc/cp/tree.c|  5 -
 gcc/testsuite/g++.target/s390/pr100281.C | 10 ++
 3 files changed, 15 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/s390/pr100281.C

diff --git a/gcc/cp/cvt.c b/gcc/cp/cvt.c
index f1687e804d1..7fa6e8df52b 100644
--- a/gcc/cp/cvt.c
+++ b/gcc/cp/cvt.c
@@ -232,7 +232,7 @@ cp_convert_to_pointer (tree type, tree expr, bool dofold,
 {
   if (TYPE_PRECISION (intype) == POINTER_SIZE)
return build1 (CONVERT_EXPR, type, expr);
-  expr = cp_convert (c_common_type_for_size (POINTER_SIZE, 0), expr,
+  expr = cp_convert (c_common_type_for_size (TYPE_PRECISION (type), 0), expr,
 complain);
   /* Modes may be different but sizes should be the same.  There
 is supposed to be some integral type that is the same width
diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index a8bfd5fc053..6f6b732c9c9 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -1556,7 +1556,10 @@ strip_typedefs (tree t, bool *remove_attributes, unsigned int flags)
 {
 case POINTER_TYPE:
   type = strip_typedefs (TREE_TYPE (t), remove_attributes, flags);
-  result = build_pointer_type (type);
+  if (TYPE_PRECISION (t) == POINTER_SIZE)
+   result = build_pointer_type (type);
+  else
+   result = build_pointer_type_for_mode (type, TYPE_MODE (t), false);
   break;
 case REFERENCE_TYPE:
   type = strip_typedefs (TREE_TYPE (t), remove_attributes, flags);
diff --git a/gcc/testsuite/g++.target/s390/pr100281.C b/gcc/testsuite/g++.target/s390/pr100281.C
new file mode 100644
index 000..f45798c3879
--- /dev/null
+++ b/gcc/testsuite/g++.target/s390/pr100281.C
@@ -0,0 +1,10 @@
+// PR C++/100281
+// { dg-do compile }
+
+typedef void * __attribute__((mode (SI))) __ptr32_t;
+
+void foo(){
+  unsigned int b = 100;
+  __ptr32_t a;
+  a = b; /* { dg-error "invalid conversion from 'unsigned int' to '__ptr32_t'.*" } */
+}
-- 
2.30.2

[PATCH] Fix PR88085

2021-04-20 Thread Andreas Krebbel via Gcc-patches

With the current handling of decl alignments it is impossible to
reduce the alignment requirement as part of a variable declaration.

This change has been proposed by Richard in the PR. It fixes the
align-1.c testcase on IBM Z.
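
A toy model of the one-line change in this patch, assuming that attrs.align starts out as the mode's default alignment when the MEM has no attributes yet. The struct and the two function names are made up for illustration and are not GCC's; alignments are given in bits.

```c
#include <assert.h>
#include <stddef.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical stand-in for the alignment field of the MEM attrs.  */
struct toy_attrs
{
  unsigned align;
};

/* Old behaviour: the alignment recorded so far (CUR_ALIGN, which is
   assumed to be the mode default if no attrs exist) can only grow.  */
static unsigned
old_align (unsigned cur_align, unsigned obj_align)
{
  assert (obj_align > 0);
  return MAX (cur_align, obj_align);
}

/* New behaviour: without pre-existing attrs the object's own
   alignment is used as is, so a reduced user alignment survives.  */
static unsigned
new_align (const struct toy_attrs *refattrs, unsigned obj_align)
{
  assert (obj_align > 0);
  return refattrs ? MAX (refattrs->align, obj_align) : obj_align;
}
```

With a mode default of 64 bits and a declaration aligned to 16 bits, old_align (64, 16) gives 64 while new_align (NULL, 16) gives 16.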

Bootstrapped on x86_64 and s390x. No regressions.

Ok for mainline?

gcc/ChangeLog:

PR middle-end/88085
* emit-rtl.c (set_mem_attributes_minus_bitpos): Use the user
alignment if there are no pre-existing mem attrs.
---
 gcc/emit-rtl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 07e908624a0..fc12fa927da 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -2124,7 +2124,7 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int objectp,
   unsigned int diff_align = known_alignment (obj_bitpos - bitpos);
   if (diff_align != 0)
obj_align = MIN (obj_align, diff_align);
-  attrs.align = MAX (attrs.align, obj_align);
+  attrs.align = refattrs ? MAX (refattrs->align, obj_align) : obj_align;
 }
 
   poly_uint64 const_size;
-- 
2.30.2

Re: [PATCH] testsuite: Fix up gcc.target/s390/zero-scratch-regs-1.c

2021-04-20 Thread Andreas Krebbel via Gcc-patches

On 4/20/21 9:17 AM, Stefan Schulze Frielinghaus wrote:
> Depending on whether GCC is configured using --with-mode=zarch or not,
> for the 31bit target instructions are generated either for ESA or
> z/Architecture.  For the sake of simplicity and robustness test only for
> the latter by adding manually option -mzarch.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/zero-scratch-regs-1.c: Force test to run for
>   z/Architecture only.
> 
> Ok for mainline?

Ok. Thanks!

Andreas

> 
> ---
>  .../gcc.target/s390/zero-scratch-regs-1.c | 95 ---
>  1 file changed, 40 insertions(+), 55 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/s390/zero-scratch-regs-1.c 
> b/gcc/testsuite/gcc.target/s390/zero-scratch-regs-1.c
> index c394c4b69e7..1c02c0c4e51 100644
> --- a/gcc/testsuite/gcc.target/s390/zero-scratch-regs-1.c
> +++ b/gcc/testsuite/gcc.target/s390/zero-scratch-regs-1.c
> @@ -1,65 +1,50 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fzero-call-used-regs=all -march=z13" } */
> +/* { dg-options "-O2 -fzero-call-used-regs=all -march=z13 -mzarch" } */
>  
>  /* Ensure that all call clobbered GPRs, FPRs, and VRs are zeroed and all call
> saved registers are kept. */
>  
>  void foo (void) { }
>  
> -/* { dg-final { scan-assembler-times "lhi\t" 6 { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lhi\t%r0,0" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lhi\t%r1,0" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lhi\t%r2,0" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lhi\t%r3,0" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lhi\t%r4,0" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lhi\t%r5,0" { target { ! lp64 } } } } */
> +/* { dg-final { scan-assembler-times "lghi\t" 6 } } */
> +/* { dg-final { scan-assembler "lghi\t%r0,0" } } */
> +/* { dg-final { scan-assembler "lghi\t%r1,0" } } */
> +/* { dg-final { scan-assembler "lghi\t%r2,0" } } */
> +/* { dg-final { scan-assembler "lghi\t%r3,0" } } */
> +/* { dg-final { scan-assembler "lghi\t%r4,0" } } */
> +/* { dg-final { scan-assembler "lghi\t%r5,0" } } */
>  
> -/* { dg-final { scan-assembler-times "lzdr\t" 14 { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f0" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f1" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f2" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f3" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f5" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f7" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f8" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f9" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f10" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f11" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f12" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f13" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f14" { target { ! lp64 } } } } */
> -/* { dg-final { scan-assembler "lzdr\t%f15" { target { ! lp64 } } } } */
> -
> -/* { dg-final { scan-assembler-times "lghi\t" 6 { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "lghi\t%r0,0" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "lghi\t%r1,0" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "lghi\t%r2,0" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "lghi\t%r3,0" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "lghi\t%r4,0" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "lghi\t%r5,0" { target { lp64 } } } } */
> -
> -/* { dg-final { scan-assembler-times "vzero\t" 24 { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "vzero\t%v0" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "vzero\t%v1" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "vzero\t%v2" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "vzero\t%v3" { target { lp64 } } } } */
> +/* { dg-final { scan-assembler-times "vzero\t" 30 { target { ! lp64 } } } } 
> */
> +/* { dg-final { scan-assembler-times "vzero\t" 24 { target {   lp64 } } } } 
> */
> +/* { dg-final { scan-assembler "vzero\t%v0" } } */
> +/* { dg-final { scan-assembler "vzero\t%v1" } } */
> +/* { dg-final { scan-assembler "vzero\t%v2" } } */
> +/* { dg-final { scan-assembler "vzero\t%v3" } } */
>  /* { dg-final { scan-assembler "vzero\t%v4" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "vzero\t%v5" { target { lp64 } } } } */
> +/* { dg-final { scan-assembler "vzero\t%v5" } } */
>  /* { dg-final { scan-assembler "vzero\t%v6" { target { lp64 } } } } */
> -/* { dg-final { scan-assembler "vzero\t%v7" { target { lp64 } } } } */
> -/* { dg-final {

Re: [PATCH] s390/testsuite: Fix oscbreak-1.c.

2021-04-20 Thread Andreas Krebbel via Gcc-patches

On 4/16/21 3:59 PM, Robin Dapp wrote:
> Hi,
> 
> checking for an osc break is somewhat brittle especially with many
> passes potentially introducing new insns and moving them around.
> Therefore, only run the test with -O1 -fschedule-insns in order to limit
> the influence of other passes.

Yeah, that's because of the very limited analysis we do in the backend
to detect such cases. In fact we probably would want to have an OSC
break in many of them as well.

For me the testcase appears to work with -O2 on all the -march levels.
I think -O2 would be preferred because that's what is most frequently
used.

> 
> Is it OK?

Yes, either with -O2 or the options you have proposed if -O2 doesn't
work out for you.

Thanks!

Andreas

> 
> Regards
>   Robin
> 
> --
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/s390/oscbreak-1.c: Compile with -O1
>   -fschedule-insns.
>

Re: [PATCH] IBM Z: Add alternative to *movdi_{31, 64} in order to load a DFP zero

2021-04-16 Thread Andreas Krebbel via Gcc-patches

On 4/12/21 3:40 PM, Stefan Schulze Frielinghaus wrote:
> Bootstrapped and regtested on IBM Z.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.md ("*movdi_31", "*movdi_64"): Add
> alternative in order to load a DFP zero.

Ok, thanks!

Andreas

> ---
>  gcc/config/s390/s390.md | 25 ++---
>  1 file changed, 14 insertions(+), 11 deletions(-)
> 
> diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
> index c10f25b2472..7faf775fbf2 100644
> --- a/gcc/config/s390/s390.md
> +++ b/gcc/config/s390/s390.md
> @@ -1868,9 +1868,9 @@
>  
>  (define_insn "*movdi_64"
>[(set (match_operand:DI 0 "nonimmediate_operand"
> - "=d,d,d,d,d, d,d,
> d,f,d,d,d,d,d,T,!*f,!*f,!*f,!R,!T,b,Q,d,t,Q,t,v,v,v,d,v,R,d")
> + "=d,d,d,d,d, d,d,
> d,f,d,!*f,d,d,d,d,T,!*f,!*f,!*f,!R,!T,b,Q,d,t,Q,t,v,v,v,d,v,R,d")
>  (match_operand:DI 1 "general_operand"
> - " K,N0HD0,N1HD0,N2HD0,N3HD0,Os,N0SD0,N1SD0,d,f,L,b,d,T,d, *f,  R,  
> T,*f,*f,d,K,t,d,t,Q,K,v,d,v,R,v,ZL"))]
> + " K,N0HD0,N1HD0,N2HD0,N3HD0,Os,N0SD0,N1SD0,d,f,j00,L,b,d,T,d, *f,  
> R,  T,*f,*f,d,K,t,d,t,Q,K,v,d,v,R,v,ZL"))]
>"TARGET_ZARCH"
>"@
> lghi\t%0,%h1
> @@ -1883,6 +1883,7 @@
> llilf\t%0,%k1
> ldgr\t%0,%1
> lgdr\t%0,%1
> +   lzdr\t%0
> lay\t%0,%a1
> lgrl\t%0,%1
> lgr\t%0,%1
> @@ -1906,13 +1907,13 @@
> vleg\t%v0,%1,0
> vsteg\t%v1,%0,0
> larl\t%0,%1"
> -  [(set_attr "op_type" "RI,RI,RI,RI,RI,RIL,RIL,RIL,RRE,RRE,RXY,RIL,RRE,RXY,
> +  [(set_attr "op_type" 
> "RI,RI,RI,RI,RI,RIL,RIL,RIL,RRE,RRE,RRE,RXY,RIL,RRE,RXY,
>  
> RXY,RR,RX,RXY,RX,RXY,RIL,SIL,*,*,RS,RS,VRI,VRR,VRS,VRS,
>  VRX,VRX,RIL")
> -   (set_attr "type" "*,*,*,*,*,*,*,*,floaddf,floaddf,la,larl,lr,load,store,
> +   (set_attr "type" 
> "*,*,*,*,*,*,*,*,floaddf,floaddf,fsimpdf,la,larl,lr,load,store,
>   floaddf,floaddf,floaddf,fstoredf,fstoredf,larl,*,*,*,*,
>   *,*,*,*,*,*,*,larl")
> -   (set_attr "cpu_facility" "*,*,*,*,*,extimm,extimm,extimm,dfp,dfp,longdisp,
> +   (set_attr "cpu_facility" 
> "*,*,*,*,*,extimm,extimm,extimm,dfp,dfp,*,longdisp,
>   z10,*,*,*,*,*,longdisp,*,longdisp,
>   z10,z10,*,*,*,*,vx,vx,vx,vx,vx,vx,*")
> (set_attr "z10prop" "z10_fwd_A1,
> @@ -1925,6 +1926,7 @@
>  z10_fwd_E1,
>  *,
>  *,
> + *,
>  z10_fwd_A1,
>  z10_fwd_A3,
>  z10_fr_E1,
> @@ -1942,7 +1944,7 @@
>  *,
>  *,*,*,*,*,*,*,
>  z10_super_A1")
> -   (set_attr "relative_long" "*,*,*,*,*,*,*,*,*,*,
> +   (set_attr "relative_long" "*,*,*,*,*,*,*,*,*,*,*,
>*,yes,*,*,*,*,*,*,*,*,
>yes,*,*,*,*,*,*,*,*,*,
>*,*,yes")
> @@ -2002,9 +2004,9 @@
>  
>  (define_insn "*movdi_31"
>[(set (match_operand:DI 0 "nonimmediate_operand"
> -"=d,d,Q,S,d  ,o,!*f,!*f,!*f,!R,!T,d")
> +"=d,d,Q,S,d  ,o,!*f,!*f,!*f,!*f,!R,!T,d")
>  (match_operand:DI 1 "general_operand"
> -" Q,S,d,d,dPT,d, *f,  R,  T,*f,*f,b"))]
> +" Q,S,d,d,dPT,d, *f,  R,  T,j00,*f,*f,b"))]
>"!TARGET_ZARCH"
>"@
> lm\t%0,%N0,%S1
> @@ -2016,12 +2018,13 @@
> ldr\t%0,%1
> ld\t%0,%1
> ldy\t%0,%1
> +   lzdr\t%0
> std\t%1,%0
> stdy\t%1,%0
> #"
> -  [(set_attr "op_type" "RS,RSY,RS,RSY,*,*,RR,RX,RXY,RX,RXY,*")
> -   (set_attr "type" 
> "lm,lm,stm,stm,*,*,floaddf,floaddf,floaddf,fstoredf,fstoredf,*")
> -   (set_attr "cpu_facility" 
> "*,longdisp,*,longdisp,*,*,*,*,longdisp,*,longdisp,z10")])
> +  [(set_attr "op_type" "RS,RSY,RS,RSY,*,*,RR,RX,RXY,RRE,RX,RXY,*")
> +   (set_attr "type" 
> "lm,lm,stm,stm,*,*,floaddf,floaddf,floaddf,fsimpdf,fstoredf,fstoredf,*")
> +   (set_attr "cpu_facility" 
> "*,longdisp,*,longdisp,*,*,*,*,longdisp,*,*,longdisp,z10")])
>  
>  ; For a load from a symbol ref we can use one of the target registers
>  ; together with larl to load the address.
>

[Committed] IBM Z: Fix error checking for immediate builtin operands

2021-04-14 Thread Andreas Krebbel via Gcc-patches

This fixes the error checking for two of the vector builtins which
accept irregular (e.g. non-contiguous) ranges of values.

Regression tested on s390x (--with-arch=arch13).
Applied to mainline. Needs to go into 9 and 10 branch as well.

gcc/ChangeLog:

* config/s390/s390-builtins.def (O_M5, O_M12, ...): Add new macros
for mask operand types.
(s390_vec_permi_s64, s390_vec_permi_b64, s390_vec_permi_u64)
(s390_vec_permi_dbl, s390_vpdi): Use the M5 type for the immediate
operand.
(s390_vec_msum_u128, s390_vmslg): Use the M12 type for the
immediate operand.
* config/s390/s390.c (s390_const_operand_ok): Check the new
operand types and generate a list of valid values.

gcc/testsuite/ChangeLog:

* gcc.target/s390/zvector/imm-range-error-1.c: New test.
* gcc.target/s390/zvector/vec_msum_u128-1.c: New test.
---
 gcc/config/s390/s390-builtins.def | 85 ---
 gcc/config/s390/s390.c| 35 ++--
 .../s390/zvector/imm-range-error-1.c  | 26 ++
 .../gcc.target/s390/zvector/vec_msum_u128-1.c | 45 ++
 4 files changed, 156 insertions(+), 35 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/imm-range-error-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec_msum_u128-1.c

diff --git a/gcc/config/s390/s390-builtins.def 
b/gcc/config/s390/s390-builtins.def
index 129d7124cba..f77ab750d22 100644
--- a/gcc/config/s390/s390-builtins.def
+++ b/gcc/config/s390/s390-builtins.def
@@ -29,6 +29,9 @@
 #undef O_U16
 #undef O_U32
 
+#undef O_M5
+#undef O_M12
+
 #undef O_S2
 #undef O_S3
 #undef O_S4
@@ -37,6 +40,7 @@
 #undef O_S12
 #undef O_S16
 #undef O_S32
+
 #undef O_ELEM
 #undef O_LIT
 
@@ -85,6 +89,16 @@
 #undef O3_U32
 #undef O4_U32
 
+#undef O1_M5
+#undef O2_M5
+#undef O3_M5
+#undef O4_M5
+
+#undef O1_M12
+#undef O2_M12
+#undef O3_M12
+#undef O4_M12
+
 #undef O1_S2
 #undef O2_S2
 #undef O3_S2
@@ -140,31 +154,34 @@
 #undef O_UIMM_P
 #undef O_SIMM_P
 
-#define O_U1   1 /* unsigned  1 bit literal */
-#define O_U2   2 /* unsigned  2 bit literal */
-#define O_U3   3 /* unsigned  3 bit literal */
-#define O_U4   4 /* unsigned  4 bit literal */
-#define O_U5   5 /* unsigned  5 bit literal */
-#define O_U8   6 /* unsigned  8 bit literal */
-#define O_U12  7 /* unsigned 16 bit literal */
-#define O_U16  8 /* unsigned 16 bit literal */
-#define O_U32  9 /* unsigned 32 bit literal */
-
-#define O_S2  10 /* signed  2 bit literal */
-#define O_S3  11 /* signed  3 bit literal */
-#define O_S4  12 /* signed  4 bit literal */
-#define O_S5  13 /* signed  5 bit literal */
-#define O_S8  14 /* signed  8 bit literal */
-#define O_S12 15 /* signed 12 bit literal */
-#define O_S16 16 /* signed 16 bit literal */
-#define O_S32 17 /* signed 32 bit literal */
-
-#define O_ELEM  18 /* Element selector requiring modulo arithmetic. */
-#define O_LIT   19 /* Operand must be a literal fitting the target type.  */
+#define O_U1     1 /* unsigned  1 bit literal */
+#define O_U2     2 /* unsigned  2 bit literal */
+#define O_U3     3 /* unsigned  3 bit literal */
+#define O_U4     4 /* unsigned  4 bit literal */
+#define O_U5     5 /* unsigned  5 bit literal */
+#define O_U8     6 /* unsigned  8 bit literal */
+#define O_U12    7 /* unsigned 16 bit literal */
+#define O_U16    8 /* unsigned 16 bit literal */
+#define O_U32    9 /* unsigned 32 bit literal */
+
+#define O_M5    10 /* matches bitmask of 5 */
+#define O_M12   11 /* matches bitmask of 12 */
+
+#define O_S2    12 /* signed  2 bit literal */
+#define O_S3    13 /* signed  3 bit literal */
+#define O_S4    14 /* signed  4 bit literal */
+#define O_S5    15 /* signed  5 bit literal */
+#define O_S8    16 /* signed  8 bit literal */
+#define O_S12   17 /* signed 12 bit literal */
+#define O_S16   18 /* signed 16 bit literal */
+#define O_S32   19 /* signed 32 bit literal */
+
+#define O_ELEM  20 /* Element selector requiring modulo arithmetic. */
+#define O_LIT   21 /* Operand must be a literal fitting the target type.  */
 
 #define O_SHIFT 5
 
-#define O_UIMM_P(X) ((X) >= O_U1 && (X) <= O_U32)
+#define O_UIMM_P(X) ((X) >= O_U1 && (X) <= O_M12)
 #define O_SIMM_P(X) ((X) >= O_S2 && (X) <= O_S32)
 #define O_IMM_P(X) ((X) == O_LIT || ((X) >= O_U1 && (X) <= O_S32))
 
@@ -213,6 +230,16 @@
 #define O3_U32 (O_U32 << (2 * O_SHIFT))
 #define O4_U32 (O_U32 << (3 * O_SHIFT))
 
+#define O1_M5 O_M5
+#define O2_M5 (O_M5 << O_SHIFT)
+#define O3_M5 (O_M5 << (2 * O_SHIFT))
+#define O4_M5 (O_M5 << (3 * O_SHIFT))
+
+#define O1_M12 O_M12
+#define O2_M12 (O_M12 << O_SHIFT)
+#define O3_M12 (O_M12 << (2 * O_SHIFT))
+#define O4_M12 (O_M12 << (3 * O_SHIFT))
+
 
 #define O1_S2 O_S2
 #define O2_S2 (O_S2 << O_SHIFT)
@@ -644,12 +671,12 @@ OB_DEF_VAR (s390_vec_perm_dbl,  s390_vperm,   
  0,
 B_DEF  (s390_vperm, vec_permv16qi,  0, 
 B_VX,   0,
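
For readers following the O_M5/O_M12 additions quoted above, here is a
minimal standalone sketch of how a "bitmask" operand type could be checked
and how the list of valid values could be generated.  It is not the code
from s390_const_operand_ok.  The helper names (operand_type,
mask_operand_ok, valid_mask_values) are hypothetical, and the assumed
semantics are that a value is accepted only if it sets no bit outside the
mask.

```c
#include <stdbool.h>
#include <stddef.h>

/* Operand type codes and field width as in the patch above.  */
#define O_M5    10
#define O_M12   11
#define O_SHIFT 5

/* Hypothetical helper: extract the type code of operand N (0-based)
   from packed flags, where each operand occupies O_SHIFT bits.  */
static unsigned
operand_type (unsigned flags, unsigned n)
{
  return (flags >> (n * O_SHIFT)) & ((1u << O_SHIFT) - 1);
}

/* Assumed semantics: a value is valid if it sets no bit outside MASK.  */
static bool
mask_operand_ok (unsigned value, unsigned mask)
{
  return (value & ~mask) == 0;
}

/* Store all valid values for MASK in OUT (at most CAP entries) and
   return how many were stored.  */
static size_t
valid_mask_values (unsigned mask, unsigned *out, size_t cap)
{
  size_t n = 0;

  for (unsigned v = 0; v <= mask && n < cap; v++)
    if (mask_operand_ok (v, mask))
      out[n++] = v;
  return n;
}
```

Under that assumption a mask of 5 admits 0, 1, 4 and 5, and a mask of 12
admits 0, 4, 8 and 12, which is the kind of non-contiguous range a plain
"unsigned N bit literal" check cannot express.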

Re: [PATCH] IBM Z: Add alternative to *movdi_{31, 64} in order to load a DFP zero

2021-04-12 Thread Andreas Krebbel via Gcc-patches

On 4/12/21 3:40 PM, Stefan Schulze Frielinghaus wrote:
> Bootstrapped and regtested on IBM Z.  Ok for mainline?
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.md ("*movdi_31", "*movdi_64"): Add
> alternative in order to load a DFP zero.

Ok. Thanks!

Andreas

> ---
>  gcc/config/s390/s390.md | 25 ++---
>  1 file changed, 14 insertions(+), 11 deletions(-)
> 
> diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
> index c10f25b2472..7faf775fbf2 100644
> --- a/gcc/config/s390/s390.md
> +++ b/gcc/config/s390/s390.md
> @@ -1868,9 +1868,9 @@
>  
>  (define_insn "*movdi_64"
>[(set (match_operand:DI 0 "nonimmediate_operand"
> - "=d,d,d,d,d, d,d,
> d,f,d,d,d,d,d,T,!*f,!*f,!*f,!R,!T,b,Q,d,t,Q,t,v,v,v,d,v,R,d")
> + "=d,d,d,d,d, d,d,
> d,f,d,!*f,d,d,d,d,T,!*f,!*f,!*f,!R,!T,b,Q,d,t,Q,t,v,v,v,d,v,R,d")
>  (match_operand:DI 1 "general_operand"
> - " K,N0HD0,N1HD0,N2HD0,N3HD0,Os,N0SD0,N1SD0,d,f,L,b,d,T,d, *f,  R,  
> T,*f,*f,d,K,t,d,t,Q,K,v,d,v,R,v,ZL"))]
> + " K,N0HD0,N1HD0,N2HD0,N3HD0,Os,N0SD0,N1SD0,d,f,j00,L,b,d,T,d, *f,  
> R,  T,*f,*f,d,K,t,d,t,Q,K,v,d,v,R,v,ZL"))]
>"TARGET_ZARCH"
>"@
> lghi\t%0,%h1
> @@ -1883,6 +1883,7 @@
> llilf\t%0,%k1
> ldgr\t%0,%1
> lgdr\t%0,%1
> +   lzdr\t%0
> lay\t%0,%a1
> lgrl\t%0,%1
> lgr\t%0,%1
> @@ -1906,13 +1907,13 @@
> vleg\t%v0,%1,0
> vsteg\t%v1,%0,0
> larl\t%0,%1"
> -  [(set_attr "op_type" "RI,RI,RI,RI,RI,RIL,RIL,RIL,RRE,RRE,RXY,RIL,RRE,RXY,
> +  [(set_attr "op_type" 
> "RI,RI,RI,RI,RI,RIL,RIL,RIL,RRE,RRE,RRE,RXY,RIL,RRE,RXY,
>  
> RXY,RR,RX,RXY,RX,RXY,RIL,SIL,*,*,RS,RS,VRI,VRR,VRS,VRS,
>  VRX,VRX,RIL")
> -   (set_attr "type" "*,*,*,*,*,*,*,*,floaddf,floaddf,la,larl,lr,load,store,
> +   (set_attr "type" 
> "*,*,*,*,*,*,*,*,floaddf,floaddf,fsimpdf,la,larl,lr,load,store,
>   floaddf,floaddf,floaddf,fstoredf,fstoredf,larl,*,*,*,*,
>   *,*,*,*,*,*,*,larl")
> -   (set_attr "cpu_facility" "*,*,*,*,*,extimm,extimm,extimm,dfp,dfp,longdisp,
> +   (set_attr "cpu_facility" 
> "*,*,*,*,*,extimm,extimm,extimm,dfp,dfp,*,longdisp,
>   z10,*,*,*,*,*,longdisp,*,longdisp,
>   z10,z10,*,*,*,*,vx,vx,vx,vx,vx,vx,*")
> (set_attr "z10prop" "z10_fwd_A1,
> @@ -1925,6 +1926,7 @@
>  z10_fwd_E1,
>  *,
>  *,
> + *,
>  z10_fwd_A1,
>  z10_fwd_A3,
>  z10_fr_E1,
> @@ -1942,7 +1944,7 @@
>  *,
>  *,*,*,*,*,*,*,
>  z10_super_A1")
> -   (set_attr "relative_long" "*,*,*,*,*,*,*,*,*,*,
> +   (set_attr "relative_long" "*,*,*,*,*,*,*,*,*,*,*,
>*,yes,*,*,*,*,*,*,*,*,
>yes,*,*,*,*,*,*,*,*,*,
>*,*,yes")
> @@ -2002,9 +2004,9 @@
>  
>  (define_insn "*movdi_31"
>[(set (match_operand:DI 0 "nonimmediate_operand"
> -"=d,d,Q,S,d  ,o,!*f,!*f,!*f,!R,!T,d")
> +"=d,d,Q,S,d  ,o,!*f,!*f,!*f,!*f,!R,!T,d")
>  (match_operand:DI 1 "general_operand"
> -" Q,S,d,d,dPT,d, *f,  R,  T,*f,*f,b"))]
> +" Q,S,d,d,dPT,d, *f,  R,  T,j00,*f,*f,b"))]
>"!TARGET_ZARCH"
>"@
> lm\t%0,%N0,%S1
> @@ -2016,12 +2018,13 @@
> ldr\t%0,%1
> ld\t%0,%1
> ldy\t%0,%1
> +   lzdr\t%0
> std\t%1,%0
> stdy\t%1,%0
> #"
> -  [(set_attr "op_type" "RS,RSY,RS,RSY,*,*,RR,RX,RXY,RX,RXY,*")
> -   (set_attr "type" 
> "lm,lm,stm,stm,*,*,floaddf,floaddf,floaddf,fstoredf,fstoredf,*")
> -   (set_attr "cpu_facility" 
> "*,longdisp,*,longdisp,*,*,*,*,longdisp,*,longdisp,z10")])
> +  [(set_attr "op_type" "RS,RSY,RS,RSY,*,*,RR,RX,RXY,RRE,RX,RXY,*")
> +   (set_attr "type" 
> "lm,lm,stm,stm,*,*,floaddf,floaddf,floaddf,fsimpdf,fstoredf,fstoredf,*")
> +   (set_attr "cpu_facility" 
> "*,longdisp,*,longdisp,*,*,*,*,longdisp,*,*,longdisp,z10")])
>  
>  ; For a load from a symbol ref we can use one of the target registers
>  ; together with larl to load the address.
>

Re: [PATCH] testsuite: Disable zero-scratch-regs-{8,9,10,11}.c on s390* [PR97680]

2021-03-30 Thread Andreas Krebbel via Gcc-patches

On 3/30/21 12:43 PM, Jakub Jelinek wrote:
> Hi!
> 
> These tests FAIL on s390*:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:
>  In function 'foo8':
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:71:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> compiler exited with status 1
> FAIL: c-c++-common/zero-scratch-regs-10.c  -Wc++-compat  (test for excess 
> errors)
> Excess errors:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:71:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:
>  In function 'foo8':
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:71:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> compiler exited with status 1
> FAIL: c-c++-common/zero-scratch-regs-11.c  -Wc++-compat  (test for excess 
> errors)
> Excess errors:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-10.c:71:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> In file included from 
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-8.c:5:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:
>  In function 'foo':
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> compiler exited with status 1
> FAIL: c-c++-common/zero-scratch-regs-8.c  -Wc++-compat  (test for excess 
> errors)
> Excess errors:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:
>  In function 'foo':
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> compiler exited with status 1
> FAIL: c-c++-common/zero-scratch-regs-9.c  -Wc++-compat  (test for excess 
> errors)
> Excess errors:
> /builddir/build/BUILD/gcc-11.0.1-20210324/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c:10:1:
>  sorry, unimplemented: '-fzero-call-used-regs' not supported on this target
> Like on powerpc or arm, they need backend support which isn't there and
> likely should be added for GCC 12.
> 
> Ok to skip the test on s390* until then?

Fine with me.

Thanks!

Andreas

> 
> 2021-03-30  Jakub Jelinek  
> 
>   PR testsuite/97680
>   * c-c++-common/zero-scratch-regs-8.c: Skip on s390.
>   * c-c++-common/zero-scratch-regs-9.c: Likewise.
>   * c-c++-common/zero-scratch-regs-10.c: Likewise.
>   * c-c++-common/zero-scratch-regs-11.c: Likewise.
> 
> --- gcc/testsuite/c-c++-common/zero-scratch-regs-8.c.jj   2020-11-11 
> 01:46:03.392696119 +0100
> +++ gcc/testsuite/c-c++-common/zero-scratch-regs-8.c  2021-03-30 
> 12:32:11.099667255 +0200
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
> +/* { dg-skip-if "not implemented" { s390*-*-* } } */
>  /* { dg-options "-O2 -fzero-call-used-regs=all-arg" } */
>  
>  #include "zero-scratch-regs-1.c"
> --- gcc/testsuite/c-c++-common/zero-scratch-regs-9.c.jj   2020-11-11 
> 01:46:03.392696119 +0100
> +++ gcc/testsuite/c-c++-common/zero-scratch-regs-9.c  2021-03-30 
> 12:32:26.707493760 +0200
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
> +/* { dg-skip-if "not implemented" { s390*-*-* } } */
>  /* { dg-options "-O2 -fzero-call-used-regs=all" } */
>  
>  #include "zero-scratch-regs-1.c"
> --- gcc/testsuite/c-c++-common/zero-scratch-regs-10.c.jj  2021-03-18 
> 15:32:56.459617723 +0100
> +++ gcc/testsuite/c-c++-common/zero-scratch-regs-10.c 2021-03-30 
> 12:31:56.468829910 +0200
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
> +/* { dg-skip-if "not implemented" { s390*-*-* } } */
>  /* { dg-skip-if "not implemented" { arm*-*-* } } */
>  /* { dg-options "-O2" } */
>  
> --- gcc/testsuite/c-c++-common/zero-scratch-regs-11.c.jj  2020-11-11 
> 01:46:03.392696119 +0100
> +++ gcc/testsuite/c-c++-common/zero-scratch-regs-11.c 2021-03-30 
> 12:32:46.012279152 +0200
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-skip-if "not implemented" { powerpc*-*-* } } */
> +/* { dg-skip-if "not implemented" { s390*-*-* } } */
>  /* { dg-options "-O2 -fzero-call-used-regs=all" } */
>  
>  #include "zero-scratch-regs-10.c"
> 
>   Jakub
>
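
As a quick illustration of the directive layout used in the patch above,
here is a hypothetical test file (not one of the zero-scratch-regs tests).
The function name and body are made up, and dg-do compile is used so the
file needs no main.  The point is only the ordering: dg-do first, then
the dg-skip-if lines, then dg-options.

```c
/* { dg-do compile } */
/* { dg-skip-if "not implemented" { s390*-*-* } } */
/* { dg-options "-O2" } */

/* Hypothetical test body.  The directives above are ordinary comments as
   far as the compiler is concerned; only the DejaGnu harness reads them.  */
int
add (int a, int b)
{
  return a + b;
}
```

Since the directives are comments, the file still builds with a plain
compiler invocation on any target; skipping happens only when the test is
run through the testsuite.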

Re: [PATCH] IBM Z: Fix "+fvm" constraint with long doubles

2021-03-16 Thread Andreas Krebbel via Gcc-patches

On 3/16/21 1:16 AM, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
> 
> 
> 
> When a long double is passed to an asm statement with a "+fvm"
> constraint, a LRA loop occurs.  This happens, because LRA chooses the
> widest register class in this case (VEC_REGS), but the code generated
> by s390_md_asm_adjust() always wants FP_REGS.  Mismatching register
> classes cause infinite reloading.
> 
> Fix by treating "fv" constraints as "v" in s390_md_asm_adjust().
> 
> gcc/ChangeLog:
> 
>   * config/s390/s390.c (f_constraint_p): Treat "fv" constraints
>   as "v".
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/vector/long-double-asm-fprvrmem.c: New test.

Ok. Thanks!

Andreas

> ---
>  gcc/config/s390/s390.c   | 12 ++--
>  .../s390/vector/long-double-asm-fprvrmem.c   | 11 +++
>  2 files changed, 21 insertions(+), 2 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/s390/vector/long-double-asm-fprvrmem.c
> 
> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
> index 151136bedbc..f7b1c03561e 100644
> --- a/gcc/config/s390/s390.c
> +++ b/gcc/config/s390/s390.c
> @@ -16714,13 +16714,21 @@ s390_shift_truncation_mask (machine_mode mode)
>  static bool
>  f_constraint_p (const char *constraint)
>  {
> +  bool seen_f_p = false;
> +  bool seen_v_p = false;
> +
>for (size_t i = 0, c_len = strlen (constraint); i < c_len;
> i += CONSTRAINT_LEN (constraint[i], constraint + i))
>  {
>if (constraint[i] == 'f')
> - return true;
> + seen_f_p = true;
> +  if (constraint[i] == 'v')
> + seen_v_p = true;
>  }
> -  return false;
> +
> +  /* Treat "fv" constraints as "v", because LRA will choose the widest 
> register
> +   * class.  */
> +  return seen_f_p && !seen_v_p;
>  }
>  
>  /* Implement TARGET_MD_ASM_ADJUST hook in order to fix up "f"
> diff --git a/gcc/testsuite/gcc.target/s390/vector/long-double-asm-fprvrmem.c 
> b/gcc/testsuite/gcc.target/s390/vector/long-double-asm-fprvrmem.c
> new file mode 100644
> index 000..f95656c5723
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/s390/vector/long-double-asm-fprvrmem.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=z14 -mzarch" } */
> +
> +long double
> +foo (long double x)
> +{
> +  x = x * x;
> +  asm("# %0" : "+fvm"(x));
> +  x = x + x;
> +  return x;
> +}
>
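
The decision made by the patched f_constraint_p above can be tried in
isolation with the following sketch.  It is a simplified stand-in, not the
GCC function: it assumes every constraint is a single character (the real
code advances by CONSTRAINT_LEN), and the name f_constraint_only_p is
made up for this example.

```c
#include <stdbool.h>
#include <string.h>

/* Simplified stand-in for the patched f_constraint_p.  Assumption: each
   constraint is one character long, so the loop steps by 1 instead of
   CONSTRAINT_LEN.  */
static bool
f_constraint_only_p (const char *constraint)
{
  bool seen_f_p = false;
  bool seen_v_p = false;

  for (size_t i = 0, c_len = strlen (constraint); i < c_len; i++)
    {
      if (constraint[i] == 'f')
        seen_f_p = true;
      if (constraint[i] == 'v')
        seen_v_p = true;
    }

  /* An "f" next to a "v" is treated like "v" alone, so no FPR pair
     conversion is requested.  */
  return seen_f_p && !seen_v_p;
}
```

With this, "f" and "=&f" are reported as f constraints, while "+fvm", "v"
and "m" are not, matching the intent stated in the commit message.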

[Committed] IBM Z: arch14 fix option string used for Binutils

2021-03-09 Thread Andreas Krebbel via Gcc-patches

gcc/ChangeLog:

* config/s390/s390.c (struct s390_processor processor_table):
Binutils name string must not be empty.
---
 gcc/config/s390/s390.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index de48271d6d4..151136bedbc 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -337,7 +337,7 @@ const struct s390_processor processor_table[] =
   { "z13","z13",PROCESSOR_2964_Z13,_cost,  11 },
   { "z14","arch12", PROCESSOR_3906_Z14,_cost,  12 },
   { "z15","arch13", PROCESSOR_8561_Z15,_cost,  13 },
-  { "arch14", "",   PROCESSOR_ARCH14,  _cost,  14 },
+  { "arch14", "arch14", PROCESSOR_ARCH14,  _cost,  14 },
   { "native", "",   PROCESSOR_NATIVE,  NULL, 0  }
 };
 
-- 
2.29.2

[Committed] IBM Z: Fix vcond-shift.c testcase.

2021-03-08 Thread Andreas Krebbel via Gcc-patches

Due to a common code change the comparison in the testcase is emitted
via vec_cmp instead of vcond.  The testcase checks for an optimization
currently only available via vcond.

Fixed by implementing the same optimization also in
s390_expand_vec_compare.

Bootstrapped and regression tested on s390x with -march=z15

This fixes the following testsuite fails:

< FAIL: gcc.target/s390/vector/vcond-shift.c scan-assembler-not vzero\\t*
< FAIL: gcc.target/s390/vector/vcond-shift.c scan-assembler-times 
vesrab\\t%v.?,%v.?,7 6
< FAIL: gcc.target/s390/vector/vcond-shift.c scan-assembler-times 
vesraf\\t%v.?,%v.?,31 6
< FAIL: gcc.target/s390/vector/vcond-shift.c scan-assembler-times 
vesrah\\t%v.?,%v.?,15 6

gcc/ChangeLog:

* config/s390/s390.c (s390_expand_vec_compare): Implement <0
comparison with arithmetic right shift.
(s390_expand_vcond): No need for a force_reg anymore.
s390_vec_compare will do it.
* config/s390/vector.md ("vec_cmp"): Accept also
immediate operands.
---
 gcc/config/s390/s390.c| 20 +++-
 gcc/config/s390/vector.md |  2 +-
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index f3d0d1ba596..c9aea21fe40 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -6569,6 +6569,7 @@ s390_expand_vec_compare (rtx target, enum rtx_code cond,
 
   if (GET_MODE_CLASS (GET_MODE (cmp_op1)) == MODE_VECTOR_FLOAT)
 {
+  cmp_op2 = force_reg (GET_MODE (cmp_op1), cmp_op2);
   switch (cond)
{
  /* NE a != b -> !(a == b) */
@@ -6607,6 +6608,19 @@ s390_expand_vec_compare (rtx target, enum rtx_code cond,
 }
   else
 {
+  /* Turn x < 0 into x >> (bits per element - 1)  */
+  if (cond == LT && cmp_op2 == CONST0_RTX (mode))
+   {
+ int shift = GET_MODE_BITSIZE (GET_MODE_INNER (mode)) - 1;
+ rtx res = expand_simple_binop (mode, ASHIFTRT, cmp_op1,
+GEN_INT (shift), target,
+0, OPTAB_DIRECT);
+ if (res != target)
+   emit_move_insn (target, res);
+ return;
+   }
+  cmp_op2 = force_reg (GET_MODE (cmp_op1), cmp_op2);
+
   switch (cond)
{
  /* NE: a != b -> !(a == b) */
@@ -6824,11 +6838,7 @@ s390_expand_vcond (rtx target, rtx then, rtx els,
   if (!REG_P (cmp_op1))
 cmp_op1 = force_reg (GET_MODE (cmp_op1), cmp_op1);
 
-  if (!REG_P (cmp_op2))
-cmp_op2 = force_reg (GET_MODE (cmp_op2), cmp_op2);
-
-  s390_expand_vec_compare (result_target, cond,
-  cmp_op1, cmp_op2);
+  s390_expand_vec_compare (result_target, cond, cmp_op1, cmp_op2);
 
   /* If the results are supposed to be either -1 or 0 we are done
  since this is what our compare instructions generate anyway.  */
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index bc52211c55e..c80d582a300 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -1589,7 +1589,7 @@ (define_expand "vec_cmp"
   [(set (match_operand:  0 "register_operand" "")
(match_operator: 1 "vcond_comparison_operator"
  [(match_operand:V_HW 2 "register_operand" "")
-  (match_operand:V_HW 3 "register_operand" "")]))]
+  (match_operand:V_HW 3 "nonmemory_operand" "")]))]
   "TARGET_VX"
 {
   s390_expand_vec_compare (operands[0], GET_CODE(operands[1]), operands[2], 
operands[3]);
-- 
2.29.2

Re: [PATCH v3] IBM Z: Fix usage of "f" constraint with long doubles

2021-03-07 Thread Andreas Krebbel via Gcc-patches

On 3/4/21 3:08 PM, Ilya Leoshkevich wrote:
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563799.html
> v1 -> v2:
> - Handle constraint modifiers, use AR constraint instead of R, add
>   testcases for & and %.
> 
> v2: https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564380.html
> v2 -> v3:
> - The main prereq is now committed:
>   https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566237.html
> - Dropped long-double-asm-abi.c test, because its prereq is not
>   approved (yet):
>   https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566218.html
> - Removed superfluous constraint pointer increment.
> 
> 
> 
> After switching the s390 backend to store long doubles in vector
> registers, "f" constraint broke when used with the former: long doubles
> correspond to TFmode, which in combination with "f" corresponds to
> hard regs %v0-%v15, however, asm users expect a %f0-%f15 pair.
> 
> Fix by using TARGET_MD_ASM_ADJUST hook to convert TFmode values to
> FPRX2mode and back.
> 
> gcc/ChangeLog:
> 
> 2020-12-14  Ilya Leoshkevich  
> 
>   * config/s390/s390.c (f_constraint_p): New function.
>   (s390_md_asm_adjust): Implement TARGET_MD_ASM_ADJUST.
>   (TARGET_MD_ASM_ADJUST): Likewise.
>   * config/s390/vector.md (fprx2_to_tf): Rename from *fprx2_to_tf,
>   add memory alternative.
>   (tf_to_fprx2): New pattern.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-12-14  Ilya Leoshkevich  
> 
>   * gcc.target/s390/vector/long-double-asm-commutative.c: New
>   test.
>   * gcc.target/s390/vector/long-double-asm-earlyclobber.c: New
>   test.
>   * gcc.target/s390/vector/long-double-asm-in-out.c: New test.
>   * gcc.target/s390/vector/long-double-asm-inout.c: New test.
>   * gcc.target/s390/vector/long-double-asm-matching.c: New test.
>   * gcc.target/s390/vector/long-double-asm-regmem.c: New test.
>   * gcc.target/s390/vector/long-double-volatile-from-i64.c: New
>   test.

Ok. Thanks!

Andreas

Re: [PATCH] IBM Z: Run mul-signed-overflow-*.c only on z14+

2021-03-03 Thread Andreas Krebbel via Gcc-patches

On 3/3/21 11:50 AM, Ilya Leoshkevich wrote:
> On Wed, 2021-03-03 at 07:50 +0100, Andreas Krebbel wrote:
>> On 3/2/21 11:59 PM, Ilya Leoshkevich wrote:
>>> mul-signed-overflow-*.c execution tests fail on z13, because they
>>> contain z14-specific instructions.  Fix by requiring s390_z14_hw
>>> target.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/s390/mul-signed-overflow-1.c: Run only on
>>> z14+.
>>> * gcc.target/s390/mul-signed-overflow-2.c: Likewise.
>>
>> I did that change yesterday already.
> 
> Ah, I haven't noticed.  One difference between our patches is, though,
> that I also have `dg-do compile` - this way, compile tests still run on
> z13.

Ok, that's a bit better indeed. Feel free to commit that change on top.

Andreas

Re: [PATCH] IBM Z: Run mul-signed-overflow-*.c only on z14+

2021-03-02 Thread Andreas Krebbel via Gcc-patches

On 3/2/21 11:59 PM, Ilya Leoshkevich wrote:
> mul-signed-overflow-*.c execution tests fail on z13, because they
> contain z14-specific instructions.  Fix by requiring s390_z14_hw
> target.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/mul-signed-overflow-1.c: Run only on z14+.
>   * gcc.target/s390/mul-signed-overflow-2.c: Likewise.

I did that change yesterday already.

Andreas

> ---
>  gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c | 3 ++-
>  gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c 
> b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c
> index fdf56d6e695..e8b1938dab7 100644
> --- a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c
> +++ b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c
> @@ -1,4 +1,5 @@
> -/* { dg-do run } */
> +/* { dg-do compile } */
> +/* { dg-do run { target { s390_z14_hw } } } */
>  /* z14 only because we need msrkc, msc, msgrkc, msgc  */
>  /* { dg-options "-O3 -march=z14 -mzarch --save-temps" } */
>  
> diff --git a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c 
> b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c
> index d0088188aa2..01328e1d286 100644
> --- a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c
> +++ b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c
> @@ -1,4 +1,5 @@
> -/* { dg-do run } */
> +/* { dg-do compile } */
> +/* { dg-do run { target { s390_z14_hw } } } */
>  /* z14 only because we need msrkc, msc, msgrkc, msgc  */
>  /* { dg-options "-O3 -march=z14 -mzarch --save-temps" } */
>  
>
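
The bodies of the mul-signed-overflow tests are not quoted in this thread.
As a rough idea of the kind of code that needs the z14 multiply
instructions mentioned in the test comment, here is a sketch of an
overflow-checked signed multiplication using the GCC/Clang builtin
__builtin_mul_overflow.  The wrapper name is hypothetical and this is an
assumption about what the tests exercise, not a copy of them.

```c
#include <stdbool.h>

/* Returns true if a * b overflows int.  The wrapped product is stored in
   *res either way.  __builtin_mul_overflow is a GCC/Clang builtin.  */
static bool
mul_overflows_int (int a, int b, int *res)
{
  return __builtin_mul_overflow (a, b, res);
}
```

Code like this compiles for any -march level, which is why keeping
dg-do compile unconditional and restricting only dg-do run to s390_z14_hw
still gives compile coverage on older machines.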

[Committed 2/2] IBM Z: arch14: New intrinsics

2021-03-02 Thread Andreas Krebbel via Gcc-patches

This adds support for 5 new builtins.

gcc/ChangeLog:

* config/s390/s390-builtin-types.def (BT_FN_V4SF_V8HI_UINT): New
builtin signature.
(BT_FN_V8HI_V8HI_UINT): Likewise.
(BT_FN_V8HI_V4SF_V4SF_UINT): Likewise.
* config/s390/s390-builtins.def (B_NNPA): New macro definition.
(s390_vclfnhs, s390_vclfnls, s390_vcrnfs, s390_vcfn, s390_vcnf):
New builtin definitions.
* config/s390/s390-c.c (s390_cpu_cpp_builtins_internal): Bump
vector extension version.
* config/s390/s390.c (s390_expand_builtin): Check if builtins are
available with current -march level.
* config/s390/s390.md (UNSPEC_NNPA_VCLFNHS_V8HI)
(UNSPEC_NNPA_VCLFNLS_V8HI, UNSPEC_NNPA_VCRNFS_V8HI)
(UNSPEC_NNPA_VCFN_V8HI, UNSPEC_NNPA_VCNF_V8HI): New constants.
* config/s390/vecintrin.h (vec_extend_to_fp32_hi): New macro.
(vec_extend_to_fp32_lo): Likewise.
(vec_round_from_fp32): Likewise.
(vec_convert_to_fp16): Likewise.
(vec_convert_from_fp16): Likewise.
* config/s390/vx-builtins.md (vclfnhs_v8hi): New insn pattern.
(vclfnls_v8hi): Likewise.
(vcrnfs_v8hi): Likewise.
(vcfn_v8hi): Likewise.
(vcnf_v8hi): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/s390/zvector/vec-nnpa-fp16-convert.c: New test.
* gcc.target/s390/zvector/vec-nnpa-fp32-convert-1.c: New test.
* gcc.target/s390/zvector/vec_convert_from_fp16.c: New test.
* gcc.target/s390/zvector/vec_convert_to_fp16.c: New test.
* gcc.target/s390/zvector/vec_extend_to_fp32_hi.c: New test.
* gcc.target/s390/zvector/vec_extend_to_fp32_lo.c: New test.
* gcc.target/s390/zvector/vec_round_from_fp32.c: New test.
---
 gcc/config/s390/s390-builtin-types.def|  3 +
 gcc/config/s390/s390-builtins.def | 12 
 gcc/config/s390/s390-c.c  |  2 +-
 gcc/config/s390/s390.c|  6 ++
 gcc/config/s390/s390.md   |  7 +++
 gcc/config/s390/vecintrin.h   |  6 ++
 gcc/config/s390/vx-builtins.md| 55 +++
 .../s390/zvector/vec-nnpa-fp16-convert.c  | 34 
 .../s390/zvector/vec-nnpa-fp32-convert-1.c| 27 +
 .../s390/zvector/vec_convert_from_fp16.c  | 12 
 .../s390/zvector/vec_convert_to_fp16.c| 12 
 .../s390/zvector/vec_extend_to_fp32_hi.c  | 12 
 .../s390/zvector/vec_extend_to_fp32_lo.c  | 12 
 .../s390/zvector/vec_round_from_fp32.c| 12 
 14 files changed, 211 insertions(+), 1 deletion(-)
 create mode 100644 
gcc/testsuite/gcc.target/s390/zvector/vec-nnpa-fp16-convert.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/zvector/vec-nnpa-fp32-convert-1.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/zvector/vec_convert_from_fp16.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec_convert_to_fp16.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/zvector/vec_extend_to_fp32_hi.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/zvector/vec_extend_to_fp32_lo.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec_round_from_fp32.c

diff --git a/gcc/config/s390/s390-builtin-types.def 
b/gcc/config/s390/s390-builtin-types.def
index a2b7d4a9a32..52ef5728539 100644
--- a/gcc/config/s390/s390-builtin-types.def
+++ b/gcc/config/s390/s390-builtin-types.def
@@ -267,6 +267,7 @@ DEF_FN_TYPE_2 (BT_FN_V2DI_V4SI_V4SI, BT_V2DI, BT_V4SI, 
BT_V4SI)
 DEF_FN_TYPE_2 (BT_FN_V4SF_FLT_INT, BT_V4SF, BT_FLT, BT_INT)
 DEF_FN_TYPE_2 (BT_FN_V4SF_V4SF_UCHAR, BT_V4SF, BT_V4SF, BT_UCHAR)
 DEF_FN_TYPE_2 (BT_FN_V4SF_V4SF_V4SF, BT_V4SF, BT_V4SF, BT_V4SF)
+DEF_FN_TYPE_2 (BT_FN_V4SF_V8HI_UINT, BT_V4SF, BT_V8HI, BT_UINT)
 DEF_FN_TYPE_2 (BT_FN_V4SI_BV4SI_V4SI, BT_V4SI, BT_BV4SI, BT_V4SI)
 DEF_FN_TYPE_2 (BT_FN_V4SI_INT_VOIDCONSTPTR, BT_V4SI, BT_INT, BT_VOIDCONSTPTR)
 DEF_FN_TYPE_2 (BT_FN_V4SI_UV4SI_UV4SI, BT_V4SI, BT_UV4SI, BT_UV4SI)
@@ -278,6 +279,7 @@ DEF_FN_TYPE_2 (BT_FN_V8HI_BV8HI_V8HI, BT_V8HI, BT_BV8HI, 
BT_V8HI)
 DEF_FN_TYPE_2 (BT_FN_V8HI_UV8HI_UV8HI, BT_V8HI, BT_UV8HI, BT_UV8HI)
 DEF_FN_TYPE_2 (BT_FN_V8HI_V16QI_V16QI, BT_V8HI, BT_V16QI, BT_V16QI)
 DEF_FN_TYPE_2 (BT_FN_V8HI_V4SI_V4SI, BT_V8HI, BT_V4SI, BT_V4SI)
+DEF_FN_TYPE_2 (BT_FN_V8HI_V8HI_UINT, BT_V8HI, BT_V8HI, BT_UINT)
 DEF_FN_TYPE_2 (BT_FN_V8HI_V8HI_V8HI, BT_V8HI, BT_V8HI, BT_V8HI)
 DEF_FN_TYPE_2 (BT_FN_VOID_UINT64PTR_UINT64, BT_VOID, BT_UINT64PTR, BT_UINT64)
 DEF_FN_TYPE_2 (BT_FN_VOID_V2DF_FLTPTR, BT_VOID, BT_V2DF, BT_FLTPTR)
@@ -345,6 +347,7 @@ DEF_FN_TYPE_3 (BT_FN_V4SI_V4SI_V4SI_V4SI, BT_V4SI, BT_V4SI, 
BT_V4SI, BT_V4SI)
 DEF_FN_TYPE_3 (BT_FN_V4SI_V8HI_V8HI_V4SI, BT_V4SI, BT_V8HI, BT_V8HI, BT_V4SI)
 DEF_FN_TYPE_3 (BT_FN_V8HI_UV8HI_UV8HI_INTPTR, BT_V8HI, BT_UV8HI, BT_UV8HI, 
BT_INTPTR)
 DEF_FN_TYPE_3 (BT_FN_V8HI_V16QI_V16QI_V8HI, BT_V8HI, BT_V16QI, BT_V16QI, 
BT_V8HI)
+DEF_FN_TYPE_3 (BT_FN_V8HI_V4SF_V4SF_UINT, BT_V8HI, BT_V4SF, BT_V4SF,

[Committed 1/2] IBM Z: arch14: Add command line options

2021-03-02 Thread Andreas Krebbel via Gcc-patches

Prepare GCC for a future architecture extension.

gcc/ChangeLog:

* common/config/s390/s390-common.c (processor_flags_table): New entry.
* config.gcc: Enable arch14 for --with-arch and --with-tune.
* config/s390/driver-native.c (s390_host_detect_local_cpu): Pick
arch14 for unknown CPU models.
* config/s390/s390-opts.h (enum processor_type): Add PROCESSOR_ARCH14.
* config/s390/s390.c (s390_issue_rate): Add case for PROCESSOR_ARCH14.
(s390_get_sched_attrmask): Likewise.
(s390_get_unit_mask): Likewise.
* config/s390/s390.h (enum processor_flags): Add PF_NNPA and PF_ARCH14.
(TARGET_CPU_ARCH14, TARGET_CPU_ARCH14_P, TARGET_CPU_NNPA)
(TARGET_CPU_NNPA_P, TARGET_ARCH14, TARGET_ARCH14_P, TARGET_NNPA)
(TARGET_NNPA_P): New macro definitions.
* config/s390/s390.md ("cpu_facility", "enabled"): Add arch14 and nnpa.
* config/s390/s390.opt: Add PROCESSOR_ARCH14.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add check for nnpa facility.
---
 gcc/common/config/s390/s390-common.c  |  4 
 gcc/config.gcc|  2 +-
 gcc/config/s390/driver-native.c   |  2 +-
 gcc/config/s390/s390-opts.h   |  1 +
 gcc/config/s390/s390.c|  4 
 gcc/config/s390/s390.h| 20 +++-
 gcc/config/s390/s390.md   | 12 ++--
 gcc/config/s390/s390.opt  |  3 +++
 gcc/testsuite/lib/target-supports.exp | 16 
 9 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/gcc/common/config/s390/s390-common.c 
b/gcc/common/config/s390/s390-common.c
index d066cf7395b..b6bc8501742 100644
--- a/gcc/common/config/s390/s390-common.c
+++ b/gcc/common/config/s390/s390-common.c
@@ -48,8 +48,12 @@ EXPORTED_CONST int processor_flags_table[] =
 | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX
 | PF_Z13 | PF_VX | PF_VXE | PF_Z14,
 /* z15 */PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
+| PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX
+| PF_Z13 | PF_VX | PF_VXE | PF_Z14 | PF_VXE2 | PF_Z15,
+/* arch14 */ PF_IEEE_FLOAT | PF_ZARCH | PF_LONG_DISPLACEMENT
 | PF_EXTIMM | PF_DFP | PF_Z10 | PF_Z196 | PF_ZEC12 | PF_TX
 | PF_Z13 | PF_VX | PF_VXE | PF_Z14 | PF_VXE2 | PF_Z15
+| PF_NNPA | PF_ARCH14
   };
 
 /* Change optimizations to be performed, depending on the
diff --git a/gcc/config.gcc b/gcc/config.gcc
index c8853009e55..966cbc888cb 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -5122,7 +5122,7 @@ case "${target}" in
for which in arch tune; do
eval "val=\$with_$which"
case ${val} in
-   "" | native | z900 | z990 | z9-109 | z9-ec | z10 | z196 
| zEC12 | z13 | z14 | z15 | arch5 | arch6 | arch7 | arch8 | arch9 | arch10 | 
arch11 | arch12 | arch13 )
+   "" | native | z900 | z990 | z9-109 | z9-ec | z10 | z196 
| zEC12 | z13 | z14 | z15 | arch5 | arch6 | arch7 | arch8 | arch9 | arch10 | 
arch11 | arch12 | arch13 | arch14 )
# OK
;;
*)
diff --git a/gcc/config/s390/driver-native.c b/gcc/config/s390/driver-native.c
index 4a065a52c17..c0247154c0b 100644
--- a/gcc/config/s390/driver-native.c
+++ b/gcc/config/s390/driver-native.c
@@ -124,7 +124,7 @@ s390_host_detect_local_cpu (int argc, const char **argv)
  cpu = "z15";
  break;
default:
- cpu = "z15";
+ cpu = "arch14";
  break;
}
}
diff --git a/gcc/config/s390/s390-opts.h b/gcc/config/s390/s390-opts.h
index d5751809ba5..4141b4d36dd 100644
--- a/gcc/config/s390/s390-opts.h
+++ b/gcc/config/s390/s390-opts.h
@@ -38,6 +38,7 @@ enum processor_type
   PROCESSOR_2964_Z13,
   PROCESSOR_3906_Z14,
   PROCESSOR_8561_Z15,
+  PROCESSOR_ARCH14,
   PROCESSOR_NATIVE,
   PROCESSOR_max
 };
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 9d2cee950d0..fcb26316632 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -337,6 +337,7 @@ const struct s390_processor processor_table[] =
   { "z13","z13",PROCESSOR_2964_Z13,_cost,  11 },
   { "z14","arch12", PROCESSOR_3906_Z14,_cost,  12 },
   { "z15","arch13", PROCESSOR_8561_Z15,_cost,  13 },
+  { "arch14", "",   PROCESSOR_ARCH14,  _cost,  14 },
   { "native", "",   PROCESSOR_NATIVE,  NULL, 0  }
 };
 
@@ -8409,6 +8410,7 @@ s390_issue_rate (void)
 case PROCESSOR_2827_ZEC12:
 case PROCESSOR_2964_Z13:
 case PROCESSOR_3906_Z14:
+case PROCESSOR_ARCH14:
 default:
   return 1;
 }
@@ -14768,6 +14770,7 @@ s390_get_sched_attrmask (rtx_insn *insn)
mask |= S390_SCHED_ATTR_MASK_GROUPOFTWO;
   break;
 case PROCESSOR_8561_Z15:

[Committed] IBM Z: Run mul-signed-overflow tests only on z14

2021-03-02 Thread Andreas Krebbel via Gcc-patches

gcc/testsuite/ChangeLog:

* gcc.target/s390/mul-signed-overflow-1.c: Run only on z14.
* gcc.target/s390/mul-signed-overflow-2.c: Run only on z14.
---
 gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c | 2 +-
 gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c 
b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c
index fdf56d6e695..be95acc54aa 100644
--- a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c
+++ b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-1.c
@@ -1,4 +1,4 @@
-/* { dg-do run } */
+/* { dg-do run { target { s390_z14_hw } } } */
 /* z14 only because we need msrkc, msc, msgrkc, msgc  */
 /* { dg-options "-O3 -march=z14 -mzarch --save-temps" } */
 
diff --git a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c 
b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c
index d0088188aa2..f5fbf276c5f 100644
--- a/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c
+++ b/gcc/testsuite/gcc.target/s390/mul-signed-overflow-2.c
@@ -1,4 +1,4 @@
-/* { dg-do run } */
+/* { dg-do run { target { s390_z14_hw } } } */
 /* z14 only because we need msrkc, msc, msgrkc, msgc  */
 /* { dg-options "-O3 -march=z14 -mzarch --save-temps" } */
 
-- 
2.29.2

Re: [PATCH] IBM Z: Fix testcase vcond-shift.c

2021-03-01 Thread Andreas Krebbel via Gcc-patches

On 3/1/21 5:00 PM, Stefan Schulze Frielinghaus wrote:
> As of commit 3a6e3ad38a17a03ee0139b49a0946e7b9ded1eb1 expressions
> x CMP y ? -1 : 0 are folded into x CMP y.  Due to this we do not see
> shifts anymore after expand in our testcases but comparisons.  Thus
> replace instructions vesraX by corresponding vchX.  Keep testcases
> vchX_{lt,gt} where only a relational comparison is done and no shift in
> order to keep test coverage for vectorization.

The vcond-shift optimization verified by the testcase is currently implemented
in s390_expand_vcond but due to the common code change we go the vec_cmp route
now. So we probably should do the same also in s390_expand_vec_compare now.
Perhaps like this ... it appears to fix the testcase for me:

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 9d2cee950d0b..9d9f5a0f6f4e 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -6562,6 +6562,7 @@ s390_expand_vec_compare (rtx target, enum rtx_code cond,

   if (GET_MODE_CLASS (GET_MODE (cmp_op1)) == MODE_VECTOR_FLOAT)
 {
+  cmp_op2 = force_operand (cmp_op2, 0);
   switch (cond)
{
  /* NE a != b -> !(a == b) */
@@ -6600,6 +6601,19 @@ s390_expand_vec_compare (rtx target, enum rtx_code cond,
 }
   else
 {
+  /* Turn x < 0 into x >> (bits - 1)  */
+  if (cond == LT && cmp_op2 == CONST0_RTX (mode))
+   {
+ int shift = GET_MODE_BITSIZE (GET_MODE_INNER (mode)) - 1;
+ rtx res = expand_simple_binop (mode, ASHIFTRT, cmp_op1,
+GEN_INT (shift), target,
+0, OPTAB_DIRECT);
+ if (res != target)
+   emit_move_insn (target, res);
+ return;
+   }
+  cmp_op2 = force_operand (cmp_op2, 0);
+
   switch (cond)
{
  /* NE: a != b -> !(a == b) */
diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index bc52211c55e5..c80d582a300d 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -1589,7 +1589,7 @@
   [(set (match_operand:  0 "register_operand" "")
(match_operator: 1 "vcond_comparison_operator"
  [(match_operand:V_HW 2 "register_operand" "")
-  (match_operand:V_HW 3 "register_operand" "")]))]
+  (match_operand:V_HW 3 "nonmemory_operand" "")]))]
   "TARGET_VX"
 {
   s390_expand_vec_compare (operands[0], GET_CODE(operands[1]), operands[2], 
operands[3]);
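
The new branch in s390_expand_vec_compare relies on a scalar identity that can
be checked in plain C. The following is only a sketch, assuming two's
complement integers and an arithmetic right shift for negative signed values
(implementation-defined in ISO C, but what GCC provides). The helper names are
made up for illustration and do not exist in GCC:

```c
#include <stdint.h>

/* Per-element result of a vector compare "x < 0":
   all bits set if x is negative, zero otherwise.  */
static int32_t lt_zero_mask_compare(int32_t x)
{
    return x < 0 ? -1 : 0;
}

/* The same mask computed with a shift by (bits - 1).
   Assumes >> on a negative signed value is an arithmetic shift,
   which replicates the sign bit into every position.  */
static int32_t lt_zero_mask_shift(int32_t x)
{
    return x >> 31;
}
```

For 32-bit elements this is the shift count 31 that the original vesraf scan
pattern in the testcase expects; 16-bit and 8-bit elements use 15 and 7.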

Andreas


> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/s390/vector/vcond-shift.c: Replace vesraX
>   instructions by corresponding vchX instructions.
> ---
>  .../gcc.target/s390/vector/vcond-shift.c  | 31 ++-
>  1 file changed, 17 insertions(+), 14 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/s390/vector/vcond-shift.c 
> b/gcc/testsuite/gcc.target/s390/vector/vcond-shift.c
> index a6b4e97aa50..9e472aef960 100644
> --- a/gcc/testsuite/gcc.target/s390/vector/vcond-shift.c
> +++ b/gcc/testsuite/gcc.target/s390/vector/vcond-shift.c
> @@ -3,10 +3,13 @@
>  /* { dg-do compile { target { s390*-*-* } } } */
>  /* { dg-options "-O3 -march=z13 -mzarch" } */
>  
> -/* { dg-final { scan-assembler-times "vesraf\t%v.?,%v.?,31" 6 } } */
> -/* { dg-final { scan-assembler-times "vesrah\t%v.?,%v.?,15" 6 } } */
> -/* { dg-final { scan-assembler-times "vesrab\t%v.?,%v.?,7" 6 } } */
> -/* { dg-final { scan-assembler-not "vzero\t*" } } */
> +/* { dg-final { scan-assembler-times "vzero\t" 9 } } */
> +/* { dg-final { scan-assembler-times "vchf\t" 6 } } */
> +/* { dg-final { scan-assembler-times "vesraf\t%v.?,%v.?,1" 2 } } */
> +/* { dg-final { scan-assembler-times "vchh\t" 6 } } */
> +/* { dg-final { scan-assembler-times "vesrah\t%v.?,%v.?,1" 2 } } */
> +/* { dg-final { scan-assembler-times "vchb\t" 6 } } */
> +/* { dg-final { scan-assembler-times "vesrab\t%v.?,%v.?,1" 2 } } */
>  /* { dg-final { scan-assembler-times "vesrlf\t%v.?,%v.?,31" 4 } } */
>  /* { dg-final { scan-assembler-times "vesrlh\t%v.?,%v.?,15" 4 } } */
>  /* { dg-final { scan-assembler-times "vesrlb\t%v.?,%v.?,7" 4 } } */
> @@ -15,19 +18,19 @@
>  #define ITER(X) (2 * (16 / sizeof (X[1])))
>  
>  void
> -vesraf_div (int *x)
> +vchf_vesraf_div (int *x)
>  {
>int i;
>int *xx = __builtin_assume_aligned (x, 8);
>  
>/* Should expand to (xx + (xx < 0 ? 1 : 0)) >> 1
> - which in turn should get simplified to (xx + (xx >> 31)) >> 1.  */
> + which in turn should get simplified to (xx - (xx < 0)) >> 1.  */
>for (i = 0; i < ITER (xx); i++)
>  xx[i] = xx[i] / 2;
>  }
>  
>  void
> -vesrah_div (short *x)
> +vchh_vesrah_div (short *x)
>  {
>int i;
>short *xx = __builtin_assume_aligned (x, 8);
> @@ -38,7 +41,7 @@ vesrah_div (short *x)
>  
>  
>  void
> -vesrab_div (signed char *x)
> +vchb_vesrab_div (signed char *x)
>  {
>int i;
>signed char *xx = __builtin_assume_aligned (x, 8);
> @@ -50,7 +53,7 @@ vesrab_div (signed char *x)
>  
>  
>  int
> -vesraf_lt (int *x)
>

Re: [PATCH 0/2] IBM Z: Fix long double <-> DFP conversions

2021-02-19 Thread Andreas Krebbel via Gcc-patches

On 2/18/21 1:57 PM, Ilya Leoshkevich wrote:
> This series fixes PR99134.  Patch 1 is factored out from the pending
> [1], patch 2 is the actual fix.  Bootstrapped and regtested on
> s390x-redhat-linux.  Ok for master?
> 
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564380.html
> 
> Ilya Leoshkevich (2):
>   IBM Z: Improve FPRX2 <-> TF conversions
>   IBM Z: Fix long double <-> DFP conversions

Ok. Thanks!

Andreas

Re: [PATCH] IBM Z: Fix usage of "f" constraint with long doubles

2021-01-26 Thread Andreas Krebbel via Gcc-patches

On 1/18/21 10:54 PM, Ilya Leoshkevich wrote:
...

> +static rtx_insn *
> +s390_md_asm_adjust (vec , vec ,
> + vec _modes,
> + vec , vec & /*clobbers*/,
> + HARD_REG_SET & /*clobbered_regs*/)
> +{
> +  if (!TARGET_VXE)
> +/* Long doubles are stored in FPR pairs - nothing to do.  */
> +return NULL;
> +
> +  rtx_insn *after_md_seq = NULL, *after_md_end = NULL;
> +
> +  unsigned ninputs = inputs.length ();
> +  unsigned noutputs = outputs.length ();
> +  for (unsigned i = 0; i < noutputs; i++)
> +{
> +  if (GET_MODE (outputs[i]) != TFmode)
> + /* Not a long double - nothing to do.  */
> + continue;
> +  const char *constraint = constraints[i];
> +  bool allows_mem, allows_reg, is_inout;
> +  bool ok = parse_output_constraint (, i, ninputs, noutputs,
> +  _mem, _reg, _inout);
> +  gcc_assert (ok);
> +  if (strcmp (constraint, "=f") != 0)
> + /* Long double with a constraint other than "=f" - nothing to do.  */
> + continue;

What about other constraint modifiers like & and %? Don't we need to handle
matching constraints as well here?
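
To illustrate the concern: a plain strcmp against "=f" misses strings such as
"=&f". A rough sketch of a more tolerant check, assuming that the only
modifiers of interest are the leading characters '=', '+', '&' and '%'. The
helper is hypothetical and is not what the patch does. Matching constraints
such as "0" would still need a lookup of the operand they refer to, which this
sketch does not attempt:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: return true if CONSTRAINT is "f" once any
   leading modifier characters (=, +, &, %) have been skipped.  */
static bool is_f_constraint(const char *constraint)
{
    /* Guard against '\0' explicitly: strchr would otherwise match
       the terminator of the modifier list.  */
    while (*constraint != '\0' && strchr("=+&%", *constraint) != NULL)
        constraint++;
    return strcmp(constraint, "f") == 0;
}
```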

> +  gcc_assert (allows_reg);
> +  gcc_assert (!allows_mem);
> +  gcc_assert (!is_inout);
> +  /* Copy output value from a FPR pair into a vector register.  */
> +  rtx fprx2 = gen_reg_rtx (FPRX2mode);
> +  push_to_sequence2 (after_md_seq, after_md_end);
> +  emit_insn (gen_fprx2_to_tf (outputs[i], fprx2));
> +  after_md_seq = get_insns ();
> +  after_md_end = get_last_insn ();
> +  end_sequence ();
> +  outputs[i] = fprx2;
> +}
> +
> +  for (unsigned i = 0; i < ninputs; i++)
> +{
> +  if (GET_MODE (inputs[i]) != TFmode)
> + /* Not a long double - nothing to do.  */
> + continue;
> +  const char *constraint = constraints[noutputs + i];
> +  bool allows_mem, allows_reg;
> +  bool ok = parse_input_constraint (, i, ninputs, noutputs, 0,
> + constraints.address (), _mem,
> + _reg);
> +  gcc_assert (ok);
> +  if (strcmp (constraint, "f") != 0 && strcmp (constraint, "=f") != 0)
> + /* Long double with a constraint other than "f" (or "=f" for inout
> +operands) - nothing to do.  */
> + continue;
> +  gcc_assert (allows_reg);
> +  gcc_assert (!allows_mem);
> +  /* Copy input value from a vector register into a FPR pair.  */
> +  rtx fprx2 = gen_reg_rtx (FPRX2mode);
> +  emit_insn (gen_tf_to_fprx2 (fprx2, inputs[i]));
> +  inputs[i] = fprx2;
> +  input_modes[i] = FPRX2mode;
> +}
> +
> +  return after_md_seq;
> +}
> +
>  /* Initialize GCC target structure.  */
>  
>  #undef  TARGET_ASM_ALIGNED_HI_OP
> @@ -16995,6 +17065,9 @@ s390_shift_truncation_mask (machine_mode mode)
>  #undef TARGET_MAX_ANCHOR_OFFSET
>  #define TARGET_MAX_ANCHOR_OFFSET 0xfff
>  
> +#undef TARGET_MD_ASM_ADJUST
> +#define TARGET_MD_ASM_ADJUST s390_md_asm_adjust
> +
>  struct gcc_target targetm = TARGET_INITIALIZER;
>  
>  #include "gt-s390.h"
> diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
> index 0e3c31f5d4f..1332a65a1d1 100644
> --- a/gcc/config/s390/vector.md
> +++ b/gcc/config/s390/vector.md
> @@ -616,12 +616,23 @@ (define_insn "*vec_tf_to_v1tf_vr"
> vlvgp\t%v0,%1,%N1"
>[(set_attr "op_type" "VRR,VRX,VRX,VRI,VRR")])
>  
> -(define_insn "*fprx2_to_tf"
> -  [(set (match_operand:TF   0 "nonimmediate_operand" "=v")
> - (subreg:TF (match_operand:FPRX2 1 "general_operand"   "f") 0))]
> +(define_insn_and_split "fprx2_to_tf"
> +  [(set (match_operand:TF   0 "nonimmediate_operand" "=v,R")
> + (subreg:TF (match_operand:FPRX2 1 "general_operand"   "f,f") 0))]
>"TARGET_VXE"
> -  "vmrhg\t%v0,%1,%N1"
> -  [(set_attr "op_type" "VRR")])
> +  "@
> +   vmrhg\t%v0,%1,%N1
> +   #"
> +  "!(MEM_P (operands[0]) && MEM_VOLATILE_P (operands[0]))"
> +  [(set (match_dup 2) (match_dup 3))
> +   (set (match_dup 4) (match_dup 5))]
> +{
> +  operands[2] = simplify_gen_subreg (DFmode, operands[0], TFmode, 0);
> +  operands[3] = simplify_gen_subreg (DFmode, operands[1], FPRX2mode, 0);
> +  operands[4] = simplify_gen_subreg (DFmode, operands[0], TFmode, 8);
> +  operands[5] = simplify_gen_subreg (DFmode, operands[1], FPRX2mode, 8);
> +}
> +  [(set_attr "op_type" "VRR,*")])

Splitting an address like this might cause the displacement to overflow in the
second part. This would require an additional reg to make the address valid
again, which in turn will be a problem after reload. You can use the 'AR'
constraint for the memory alternative. That way reload will make sure the
address is offsettable.
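
A small arithmetic sketch of the overflow case, assuming the short
displacement of an "R" memory operand is an unsigned 12-bit field (0..4095).
The helper names are invented for illustration:

```c
#include <stdbool.h>

/* Assumed limit of a short (12-bit unsigned) displacement.  */
enum { MAX_SHORT_DISP = 4095 };

/* True if DISP can be encoded as a short displacement.  */
static bool short_disp_ok(long disp)
{
    return disp >= 0 && disp <= MAX_SHORT_DISP;
}

/* A 16-byte access at DISP is split into two 8-byte accesses at
   DISP and DISP + 8.  True only if both halves stay encodable.  */
static bool split_halves_ok(long disp)
{
    return short_disp_ok(disp) && short_disp_ok(disp + 8);
}
```

With these assumptions a displacement of 4088 is valid for the unsplit access,
but the second half would land at 4096 and need an extra register.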

Andreas


>  
>  (define_insn "*vec_ti_to_v1ti"
>[(set (match_operand:V1TI   0 "nonimmediate_operand" 
> "=v,v,R,  v,  v,v")
> @@ -753,6 +764,21 @@ (define_insn "*tf_to_fprx2_1"
>"vpdi\t%V0,%v1,%V0,5"
>[(set_attr "op_type"

[PATCH] tree-optimization/98221 - fix wrong unpack operation used for big-endian

2021-01-11 Thread Andreas Krebbel via Gcc-patches

The vec-abi-varargs-1.c testcase on IBM Z currently fails.

While adding an SI mode vector to a DI mode vector the first is unpacked using:

  _28 = BIT_INSERT_EXPR <{ 0, 0, 0, 0 }, _2, 0>;
  _34 = [vec_unpack_lo_expr] _28;

However, on big endian targets lo refers to the right hand side of the vector -
in this case the zeroes.
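
The effect can be modelled in scalar C. This is only a sketch of the
element selection described above, with invented names. It assumes that on
little-endian "lo" selects the elements starting at index 0, and that on
big-endian it selects the upper-indexed half instead:

```c
#include <stdint.h>

enum half { HALF_LO, HALF_HI };

/* Index of the first input element selected by H in a vector of
   NELTS elements.  On big-endian the roles of lo and hi are swapped
   with respect to element indices.  */
static int half_first_index(enum half h, int nelts, int big_endian)
{
    int takes_front = big_endian ? (h == HALF_HI) : (h == HALF_LO);
    return takes_front ? 0 : nelts / 2;
}

/* Widen the selected half of IN (NELTS 32-bit elements) into OUT
   (NELTS / 2 64-bit elements).  */
static void unpack_half(const int32_t *in, int nelts, enum half h,
                        int big_endian, int64_t *out)
{
    int first = half_first_index(h, nelts, big_endian);
    for (int i = 0; i < nelts / 2; i++)
        out[i] = in[first + i];
}
```

For an input { 42, 0, 0, 0 } with the value inserted at element 0, the model
returns the 42 for lo on little-endian and for hi on big-endian, while lo on
big-endian yields only zeroes.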

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

2021-01-11  Andreas Krebbel  

* tree-ssa-forwprop.c (simplify_vector_constructor): For
big-endian, use UNPACK[_FLOAT]_HI.
---
 gcc/tree-ssa-forwprop.c | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
index 8a1a1237647..0706fd862de 100644
--- a/gcc/tree-ssa-forwprop.c
+++ b/gcc/tree-ssa-forwprop.c
@@ -2392,6 +2392,17 @@ simplify_vector_constructor (gimple_stmt_iterator *gsi)
 some simple special cases via VEC_[UN]PACK[_FLOAT]_LO_EXPR.  */
  optab optab;
  tree halfvectype, dblvectype;
+ enum tree_code unpack_op;
+
+ if (!BYTES_BIG_ENDIAN)
+   unpack_op = (FLOAT_TYPE_P (TREE_TYPE (type))
+? VEC_UNPACK_FLOAT_LO_EXPR
+: VEC_UNPACK_LO_EXPR);
+ else
+   unpack_op = (FLOAT_TYPE_P (TREE_TYPE (type))
+? VEC_UNPACK_FLOAT_HI_EXPR
+: VEC_UNPACK_HI_EXPR);
+
  if (CONVERT_EXPR_CODE_P (conv_code)
  && (2 * TYPE_PRECISION (TREE_TYPE (TREE_TYPE (orig[0])))
  == TYPE_PRECISION (TREE_TYPE (type)))
@@ -2405,9 +2416,7 @@ simplify_vector_constructor (gimple_stmt_iterator *gsi)
 represented as scalar bitmasks.  See PR95528.  */
  && (VECTOR_MODE_P (TYPE_MODE (dblvectype))
  || VECTOR_BOOLEAN_TYPE_P (dblvectype))
- && (optab = optab_for_tree_code (FLOAT_TYPE_P (TREE_TYPE (type))
-  ? VEC_UNPACK_FLOAT_LO_EXPR
-  : VEC_UNPACK_LO_EXPR,
+ && (optab = optab_for_tree_code (unpack_op,
   dblvectype,
   optab_default))
  && (optab_handler (optab, TYPE_MODE (dblvectype))
@@ -2430,11 +2439,7 @@ simplify_vector_constructor (gimple_stmt_iterator *gsi)
orig[0], TYPE_SIZE (dblvectype),
bitsize_zero_node);
  gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
- gimple_assign_set_rhs_with_ops (gsi,
- FLOAT_TYPE_P (TREE_TYPE (type))
- ? VEC_UNPACK_FLOAT_LO_EXPR
- : VEC_UNPACK_LO_EXPR,
- dbl);
+ gimple_assign_set_rhs_with_ops (gsi, unpack_op, dbl);
}
  else if (CONVERT_EXPR_CODE_P (conv_code)
   && (TYPE_PRECISION (TREE_TYPE (TREE_TYPE (orig[0])))
-- 
2.26.2

Re: [PATCH] IBM Z: Fix constraints in vpdi patterns

2021-01-08 Thread Andreas Krebbel via Gcc-patches

On 1/8/21 5:35 PM, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
> 
> 
> 
> The destination register is only partially overwritten, so + should be
> used instead of =.
> 
> gcc/ChangeLog:
> 
> 2021-01-08  Ilya Leoshkevich  
> 
>   * config/s390/vector.md (*tf_to_fprx2_0): Rename from
>   *mov_tf_to_fprx2_0 for consistency, fix constraint.
>   (*tf_to_fprx2_1): Rename from *mov_tf_to_fprx2_1 for
>   consistency, fix constraint.

Ok, thanks!

Andreas

Re: [PATCH v2] IBM Z: Introduce __LONG_DOUBLE_VX__ macro

2021-01-08 Thread Andreas Krebbel via Gcc-patches

On 1/8/21 2:14 PM, Ilya Leoshkevich wrote:
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563034.html
> v1 -> v2: Use TARGET_VXE_P instead of TARGET_Z14_P.
> 
> 
> 
> Give end users the opportunity to find out whether long doubles are
> stored in floating-point register pairs or in vector registers, so that
> they could fine-tune their asm statements.
> 
> gcc/ChangeLog:
> 
> 2020-12-14  Ilya Leoshkevich  
> 
>   * config/s390/s390-c.c (s390_def_or_undef_macro): Accept
>   callables instead of mask values.
>   (struct target_flag_set_p): New predicate.
>   (s390_cpu_cpp_builtins_internal): Define or undefine
>   __LONG_DOUBLE_VX__ macro.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-12-14  Ilya Leoshkevich  
> 
>   * gcc.target/s390/vector/long-double-vx-macro-off.c: New test.
>   * gcc.target/s390/vector/long-double-vx-macro-on.c: New test.

Ok, thanks!
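
For reference, a sketch of how user code might consume the macro. Only the
macro name __LONG_DOUBLE_VX__ comes from the patch; the enum and function are
hypothetical. On targets other than s390 the macro is simply undefined, so the
fallback branch is taken:

```c
/* Hypothetical classification of where a long double lives.  */
enum ld_storage { LD_IN_FPR_PAIR, LD_IN_VECTOR_REG };

/* Pick the storage class at compile time based on the macro
   introduced by the patch.  */
static enum ld_storage long_double_storage(void)
{
#ifdef __LONG_DOUBLE_VX__
    return LD_IN_VECTOR_REG;   /* e.g. use a "v" constraint in asm */
#else
    return LD_IN_FPR_PAIR;     /* e.g. use an "f" constraint in asm */
#endif
}
```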

Andreas

Re: [PATCH] IBM Z: Introduce __LONG_DOUBLE_VX__ macro

2021-01-08 Thread Andreas Krebbel via Gcc-patches

On 1/8/21 1:17 AM, Ilya Leoshkevich wrote:

> +  s390_def_or_undef_macro (
> +  pfile,
> +  [] (const struct cl_target_option *opts) { return TARGET_Z14_P (opts); 
> },
> +  old_opts, opts, "__LONG_DOUBLE_VX__", "__LONG_DOUBLE_VX__");

Shouldn't this rather check TARGET_VXE_P instead?

Bye,

Andreas

Re: [PATCH] IBM Z: Fix check_effective_target_s390_z14_hw

2021-01-05 Thread Andreas Krebbel via Gcc-patches

On 1/5/21 7:37 PM, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on z14.  Ok for master?
> 
> 
> 
> Commit 2f473f4b065d ("IBM Z: Do not run long double tests on old
> machines") introduced a predicate for tests that must run only on z14+.
> However, due to a syntax error, the predicate always returns false.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-12-10  Ilya Leoshkevich  
> 
>   * gcc.target/s390/s390.exp: Replace %% with %.

Ok. Thanks!

Andreas

Re: [PATCH] IBM Z: Build autovec-*-signaling-eq.c tests with exceptions

2020-12-08 Thread Andreas Krebbel via Gcc-patches

On 12/3/20 2:22 AM, Ilya Leoshkevich wrote:
> According to
> https://gcc.gnu.org/pipermail/gcc/2020-November/234344.html, GCC is
> allowed to perform optimizations that remove floating point traps,
> since they do not affect the modeled control flow.  This interferes with
> two signaling comparison tests, where (a <= b && a >= b) is turned into
> (a <= b && a == b) by test_for_singularity, into ((a <= b) & (a == b))
> by vectorizer and then into (a == b) by eliminate_redundant_comparison.
> 
> Fix by making traps affect the control flow by turning them into
> exceptions.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-12-03  Ilya Leoshkevich  
> 
>   * gcc.target/s390/zvector/autovec-double-signaling-eq.c: Build
>   with exceptions.
>   * gcc.target/s390/zvector/autovec-float-signaling-eq.c:
>   Likewise.

Ok. Thanks!

Andreas

[Committed] IBM Z: Change Pmode to word_mode for stack probes

2020-12-07 Thread Andreas Krebbel via Gcc-patches

In s390.c we are still using Pmode for the stack probes. This breaks
with -m31 -mzarch where Pmode != word_mode.

The patch also adds a new target check to s390.exp which allows us to
implement zarch specific checks in the testcases.
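
The target check uses the usual negative-array-size trick to test the word
size at compile time. A standalone sketch of that idiom, assuming a compiler
that supports the mode attribute (GCC or Clang). The typedef and macro names
are made up:

```c
/* Integer type with the width of a machine word (compiler extension).
   With -m31 -mzarch this is 8 bytes while pointers are 4 bytes.  */
typedef int word_t __attribute__((__mode__(__word__)));

/* Compiles only if COND is true: an array of size -1 is an error.  */
#define COMPILE_TIME_CHECK(name, cond) typedef char name[(cond) ? 1 : -1]

/* Sanity check for this sketch: words are 4 or 8 bytes wide.  */
COMPILE_TIME_CHECK(word_is_4_or_8_bytes,
                   sizeof(word_t) == 4 || sizeof(word_t) == 8);

/* Width of a machine word in bytes.  */
static unsigned word_size_bytes(void)
{
    return (unsigned) sizeof(word_t);
}
```

The check in s390.exp compares against 8 only, so it fails to compile, and the
effective target evaluates to false, whenever the word size is not 8 bytes.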

Bootstrapped and regression tested on s390x with and without zarch
default.

gcc/ChangeLog:

* config/s390/s390.c (s390_emit_stack_probe): Change Pmode to
word_mode.

gcc/testsuite/ChangeLog:

* gcc.target/s390/s390.exp: New target check s390_zarch.
* gcc.target/s390/stack-clash-1.c: Use s390_zarch instead of lp64.
* gcc.target/s390/stack-clash-2.c: Likewise.
* gcc.target/s390/stack-clash-3.c: Likewise.
* gcc.target/s390/stack-clash-5.c: New test.
---
 gcc/config/s390/s390.c|  2 +-
 gcc/testsuite/gcc.target/s390/s390.exp|  7 +++
 gcc/testsuite/gcc.target/s390/stack-clash-1.c |  4 ++--
 gcc/testsuite/gcc.target/s390/stack-clash-2.c |  4 ++--
 gcc/testsuite/gcc.target/s390/stack-clash-3.c |  4 ++--
 gcc/testsuite/gcc.target/s390/stack-clash-5.c | 10 ++
 6 files changed, 24 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/stack-clash-5.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index fb48102559d..2f839882d96 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -11082,7 +11082,7 @@ s390_prologue_plus_offset (rtx target, rtx reg, rtx 
offset, bool frame_related_p
 static void
 s390_emit_stack_probe (rtx addr)
 {
-  rtx mem = gen_rtx_MEM (Pmode, addr);
+  rtx mem = gen_rtx_MEM (word_mode, addr);
   MEM_VOLATILE_P (mem) = 1;
   emit_insn (gen_probe_stack (mem));
 }
diff --git a/gcc/testsuite/gcc.target/s390/s390.exp 
b/gcc/testsuite/gcc.target/s390/s390.exp
index 00e0555d55c..d76d80dd0f3 100644
--- a/gcc/testsuite/gcc.target/s390/s390.exp
+++ b/gcc/testsuite/gcc.target/s390/s390.exp
@@ -202,6 +202,13 @@ proc check_effective_target_s390_z14_hw { } {
}
 }] "-march=z14 -m64 -mzarch" ] } { return 0 } else { return 1 }
 }
+# Return 1 if the default compiler options enable z/Architecture mode
+proc check_effective_target_s390_zarch { } {
+return [check_no_compiler_messages s390_zarch object {
+   int dummy[sizeof (int __attribute__((__mode__(__word__)))) == 8
+ ? 1 : -1];
+}]
+}
 
 # If a testcase doesn't have special options, use these.
 global DEFAULT_CFLAGS
diff --git a/gcc/testsuite/gcc.target/s390/stack-clash-1.c 
b/gcc/testsuite/gcc.target/s390/stack-clash-1.c
index 3d29cab9446..45221c4ef82 100644
--- a/gcc/testsuite/gcc.target/s390/stack-clash-1.c
+++ b/gcc/testsuite/gcc.target/s390/stack-clash-1.c
@@ -13,5 +13,5 @@ void large_stack() {
 
 /* We use a compare for the stack probe.  There needs to be one inside
a loop and another for the remaining bytes.  */
-/* { dg-final { scan-assembler-times "cg\t" 2 { target lp64 } } } */
-/* { dg-final { scan-assembler-times "c\t" 2 { target { ! lp64 } } } } */
+/* { dg-final { scan-assembler-times "cg\t" 2 { target s390_zarch } } } */
+/* { dg-final { scan-assembler-times "c\t" 2 { target { ! s390_zarch } } } } */
diff --git a/gcc/testsuite/gcc.target/s390/stack-clash-2.c 
b/gcc/testsuite/gcc.target/s390/stack-clash-2.c
index e554ad5ed0d..20f645de347 100644
--- a/gcc/testsuite/gcc.target/s390/stack-clash-2.c
+++ b/gcc/testsuite/gcc.target/s390/stack-clash-2.c
@@ -13,5 +13,5 @@ foo ()
 /* For alloca a common code routine emits the probes.  Make sure the
"probe_stack" expander is used in that case. We want to use mem
compares instead of stores.  */
-/* { dg-final { scan-assembler-times "cg\t" 5 { target lp64 } } } */
-/* { dg-final { scan-assembler-times "c\t" 5 { target { ! lp64 } } } } */
+/* { dg-final { scan-assembler-times "cg\t" 5 { target s390_zarch } } } */
+/* { dg-final { scan-assembler-times "c\t" 5 { target { ! s390_zarch } } } } */
diff --git a/gcc/testsuite/gcc.target/s390/stack-clash-3.c 
b/gcc/testsuite/gcc.target/s390/stack-clash-3.c
index 929d3fbb365..12a2d34cacf 100644
--- a/gcc/testsuite/gcc.target/s390/stack-clash-3.c
+++ b/gcc/testsuite/gcc.target/s390/stack-clash-3.c
@@ -13,5 +13,5 @@ foo ()
 /* For alloca a common code routine emits the probes.  Make sure the
"probe_stack" expander is used in that case. We want to use mem
compares instead of stores.  */
-/* { dg-final { scan-assembler-times "cg\t" 5 { target lp64 } } } */
-/* { dg-final { scan-assembler-times "c\t" 5 { target { ! lp64 } } } } */
+/* { dg-final { scan-assembler-times "cg\t" 5 { target s390_zarch } } } */
+/* { dg-final { scan-assembler-times "c\t" 5 { target { ! s390_zarch } } } } */
diff --git a/gcc/testsuite/gcc.target/s390/stack-clash-5.c 
b/gcc/testsuite/gcc.target/s390/stack-clash-5.c
new file mode 100644
index 000..81e202e2aab
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/stack-clash-5.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -m31 -mzarch -fstack-clash-protection" } */
+
+extern void bar

[Committed] IBM Z: Fix mode in probe_stack pattern

2020-12-03 Thread Andreas Krebbel via Gcc-patches

The probe pattern uses Pmode but the middle-end wants to emit a
word_mode probe check.  This - as usual - breaks on Z with -m31 -mzarch
where word_mode doesn't match Pmode.

Bootstrapped and regression-tested on s390x.

gcc/ChangeLog:

* config/s390/s390.md ("@probe_stack2"): Change mode
iterator to W.

gcc/testsuite/ChangeLog:

* gcc.target/s390/stack-clash-4.c: New test.
---
 gcc/config/s390/s390.md   |  6 +++---
 gcc/testsuite/gcc.target/s390/stack-clash-4.c | 10 ++
 2 files changed, 13 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/stack-clash-4.c

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index d4cfbdf6732..d6d8965a740 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -6,8 +6,8 @@ (define_expand "allocate_stack"
 
 (define_expand "@probe_stack2"
   [(set (reg:CCZ CC_REGNUM)
-   (compare:CCZ (reg:P 0)
-(match_operand 0 "memory_operand")))
+   (compare:CCZ (reg:W 0)
+(match_operand:W 0 "memory_operand")))
(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
   "")
 
@@ -11125,7 +11125,7 @@ (define_expand "probe_stack"
   [(match_operand 0 "memory_operand")]
   ""
 {
-  emit_insn (gen_probe_stack2 (Pmode, operands[0]));
+  emit_insn (gen_probe_stack2 (word_mode, operands[0]));
   DONE;
 })
 
diff --git a/gcc/testsuite/gcc.target/s390/stack-clash-4.c 
b/gcc/testsuite/gcc.target/s390/stack-clash-4.c
new file mode 100644
index 000..619d99ddf69
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/stack-clash-4.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -m31 -mzarch -fstack-clash-protection" } */
+
+extern void c(char*);
+
+void
+a() {
+  char *b = __builtin_alloca(3);
+  c(b);
+}
-- 
2.28.0

Re: [PATCH v2] IBM Z: Use llihf and oilf to load large immediates into GPRs

2020-12-02 Thread Andreas Krebbel via Gcc-patches

On 12/2/20 4:08 PM, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
> 
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2020-December/560822.html
> 
> v1 -> v2:
> - Use SYMBOL_REF_P.
> - Fix usage of gcc_assert.
> - Use GEN_INT.
> 
> 
> 
> Currently GCC loads large immediates into GPRs from the literal pool,
> which is not as efficient as loading two halves with llihf and oilf.
> 
> gcc/ChangeLog:
> 
> 2020-11-30  Ilya Leoshkevich  
> 
>   * config/s390/s390-protos.h (s390_const_int_pool_entry_p): New
>   function.
>   * config/s390/s390.c (s390_const_int_pool_entry_p): New
>   function.
>   * config/s390/s390.md: Add define_peephole2 that produces llihf
>   and oilf.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-11-30  Ilya Leoshkevich  
> 
>   * gcc.target/s390/load-imm64-1.c: New test.
>   * gcc.target/s390/load-imm64-2.c: New test.

Ok. Thanks!

Andreas

Re: [PATCH] IBM Z: Use llihf and oilf to load large immediates into GPRs

2020-12-01 Thread Andreas Krebbel via Gcc-patches

On 12/2/20 2:34 AM, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  There are slight
> improvements in all SPEC benchmarks, no regressions that could not be
> "fixed" by adding nops.  Ok for master?
> 
> 
> 
> Currently GCC loads large immediates into GPRs from the literal pool,
> which is not as efficient as loading two halves with llihf and oilf.
> 
> gcc/ChangeLog:
> 
> 2020-11-30  Ilya Leoshkevich  
> 
>   * config/s390/s390-protos.h (s390_const_int_pool_entry_p): New
>   function.
>   * config/s390/s390.c (s390_const_int_pool_entry_p): New
>   function.
>   * config/s390/s390.md: Add define_peephole2 that produces llihf
>   and oilf.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-11-30  Ilya Leoshkevich  
> 
>   * gcc.target/s390/load-imm64-1.c: New test.
>   * gcc.target/s390/load-imm64-2.c: New test.
> ---
>  gcc/config/s390/s390-protos.h|  1 +
>  gcc/config/s390/s390.c   | 31 
>  gcc/config/s390/s390.md  | 22 ++
>  gcc/testsuite/gcc.target/s390/load-imm64-1.c | 10 +++
>  gcc/testsuite/gcc.target/s390/load-imm64-2.c | 10 +++
>  5 files changed, 74 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/s390/load-imm64-1.c
>  create mode 100644 gcc/testsuite/gcc.target/s390/load-imm64-2.c
> 
> diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
> index ad2f7f77c18..eb10c3f4bbb 100644
> --- a/gcc/config/s390/s390-protos.h
> +++ b/gcc/config/s390/s390-protos.h
> @@ -135,6 +135,7 @@ extern void s390_split_access_reg (rtx, rtx *, rtx *);
>  extern void print_operand_address (FILE *, rtx);
>  extern void print_operand (FILE *, rtx, int);
>  extern void s390_output_pool_entry (rtx, machine_mode, unsigned int);
> +extern bool s390_const_int_pool_entry_p (rtx, HOST_WIDE_INT *);
>  extern int s390_label_align (rtx_insn *);
>  extern int s390_agen_dep_p (rtx_insn *, rtx_insn *);
>  extern rtx_insn *s390_load_got (void);
> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
> index 02f18366aa1..e3d68d3543b 100644
> --- a/gcc/config/s390/s390.c
> +++ b/gcc/config/s390/s390.c
> @@ -9400,6 +9400,37 @@ s390_output_pool_entry (rtx exp, machine_mode mode, 
> unsigned int align)
>  }
>  }
>  
> +/* Return true if MEM refers to an integer constant in the literal pool.  If
> +   VAL is not nullptr, then also fill it with the constant's value.  */
> +
> +bool
> +s390_const_int_pool_entry_p (rtx mem, HOST_WIDE_INT *val)
> +{
> +  /* Try to match the following:
> + - (mem (unspec [(symbol_ref) (reg)] UNSPEC_LTREF)).
> + - (mem (symbol_ref)).  */
> +
> +  if (!MEM_P (mem))
> +return false;
> +
> +  rtx addr = XEXP (mem, 0);
> +  rtx sym;
> +  if (GET_CODE (addr) == UNSPEC && XINT (addr, 1) == UNSPEC_LTREF)
> +sym = XVECEXP (addr, 0, 0);
> +  else
> +sym = addr;
> +
> +  if (GET_CODE (sym) != SYMBOL_REF || !CONSTANT_POOL_ADDRESS_P (sym))
!SYMBOL_REF_P (sym)

> +return false;
> +
> +  rtx val_rtx = get_pool_constant (sym);
> +  if (!CONST_INT_P (val_rtx))
> +return false;
> +
> +  if (val != nullptr)
> +*val = INTVAL (val_rtx);
> +  return true;
> +}
Alternatively you probably could have returned the RTX instead and use
gen_highpart / gen_lowpart in the peephole. But no need to change that.
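
In scalar terms the split into two halves looks roughly like this. It is a
sketch only: the masks are my assumption of what the two steps need (the
constants in the quoted hunk are truncated in this archive), and the function
names are invented:

```c
#include <stdint.h>

/* Upper 32 bits of VAL, kept in place.  */
static uint64_t high_half(uint64_t val)
{
    return val & 0xffffffff00000000ULL;
}

/* Lower 32 bits of VAL.  */
static uint64_t low_half(uint64_t val)
{
    return val & 0x00000000ffffffffULL;
}

/* Model of the two-instruction sequence: first set the register to
   the high half (llihf-like, low half zero), then OR in the low half
   (oilf-like).  */
static uint64_t load_in_two_steps(uint64_t val)
{
    uint64_t reg = high_half(val);
    reg |= low_half(val);
    return reg;
}
```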

>  
>  /* Return an RTL expression representing the value of the return address
> for the frame COUNT steps up from the current frame.  FRAME is the
> diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
> index 910415a5974..79e9a75ba2f 100644
> --- a/gcc/config/s390/s390.md
> +++ b/gcc/config/s390/s390.md
> @@ -2116,6 +2116,28 @@ (define_peephole2
>[(set (match_dup 0) (plus:DI (match_dup 1) (match_dup 2)))]
>"")
>  
> +; Split loading of 64-bit constants into GPRs into llihf + oilf -
> +; counterintuitively, using oilf is faster than iilf.  oilf clobbers
> +; cc, so cc must be dead.
> +(define_peephole2
> +  [(set (match_operand:DI 0 "register_operand" "")
> +(match_operand:DI 1 "memory_operand" ""))]
> +  "TARGET_64BIT
> +   && TARGET_EXTIMM
> +   && GENERAL_REG_P (operands[0])
> +   && s390_const_int_pool_entry_p (operands[1], nullptr)
> +   && peep2_reg_dead_p (1, gen_rtx_REG (CCmode, CC_REGNUM))"
> +  [(set (match_dup 0) (match_dup 2))
> +   (parallel
> +[(set (match_dup 0) (ior:DI (match_dup 0) (match_dup 3)))
> + (clobber (reg:CC CC_REGNUM))])]
> +{
> +  HOST_WIDE_INT val;
> +  gcc_assert (s390_const_int_pool_entry_p (operands[1], ));

This probably breaks with checking disabled.
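
The general pitfall, sketched in plain C. The macro below is a stand-in that
never evaluates its argument, similar to standard assert under NDEBUG. It is
not GCC's definition of gcc_assert, whose behaviour depends on the build
configuration. All names here are hypothetical:

```c
#include <stdbool.h>

/* Stand-in for an assertion compiled without checking: the argument
   is type-checked but never evaluated.  */
#define ASSERT_NO_CHECKING(expr) ((void) (0 && (expr)))

/* Fills *VAL as a side effect and reports success.  */
static bool get_value(long *val)
{
    *val = 42;
    return true;
}

/* Fragile: the side effect lives inside the assertion, so VAL keeps
   its initial value when the assertion is compiled out.  */
static long fragile(void)
{
    long val = -1;
    ASSERT_NO_CHECKING(get_value(&val));
    return val;
}

/* Robust: call first, then assert on the stored result.  */
static long robust(void)
{
    long val = -1;
    bool ok = get_value(&val);
    ASSERT_NO_CHECKING(ok);
    (void) ok;
    return val;
}
```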

> +  operands[2] = gen_rtx_CONST_INT (DImode, val & 0xffffffff00000000);
> +  operands[3] = gen_rtx_CONST_INT (DImode, val & 0x00000000ffffffff);

ULL for the constants?
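
A small sketch of what the suffixed masks could look like, in plain C. The
helper names high_half, low_half and recombine are made up for illustration,
and the mask values are an assumption derived from the llihf + oilf comment
(upper 32 bits first, then the lower 32 bits OR-ed in).

```c
#include <assert.h>
#include <stdint.h>

/* Upper 32 bits of VAL, as the first (llihf-style) load would set them.  */
static uint64_t
high_half (uint64_t val)
{
  return val & 0xffffffff00000000ULL;
}

/* Lower 32 bits of VAL, as the second (oilf-style) OR would add them.  */
static uint64_t
low_half (uint64_t val)
{
  return val & 0x00000000ffffffffULL;
}

/* OR-ing both halves must give back the original constant.  */
static uint64_t
recombine (uint64_t val)
{
  return high_half (val) | low_half (val);
}
```

The ULL suffix states the 64-bit unsigned type explicitly instead of leaving
the choice of literal type to the host compiler.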

> +})
> +
>  ;
>  ; movsi instruction pattern(s).
>  ;
> diff --git a/gcc/testsuite/gcc.target/s390/load-imm64-1.c b/gcc/testsuite/gcc.target/s390/load-imm64-1.c
> new file mode 100644
> index 000..db0a89395aa
> ---

Re: [PATCH 2/2] gcc/testsuite/s390: Add test cases for float_t

2020-12-01 Thread Andreas Krebbel via Gcc-patches

On 11/25/20 6:06 PM, Marius Hillenbrand wrote:
> Add two test cases that check for acceptable combinations of float_t and
> FLT_EVAL_METHOD on s390x.
> 
> Tested against an as-is glibc and one modified so that it derives
> float_t from FLT_EVAL_METHOD.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-11-25  Marius Hillenbrand  
> 
>   * gcc.target/s390/float_t-1.c: New test.
>   * gcc.target/s390/float_t-2.c: New test.
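
For readers unfamiliar with the pairing being tested, a rough sketch in plain
C of what an "acceptable combination" means. This is not the content of the
new test files; the helper name float_t_matches_eval_method is hypothetical,
and comparing sizes is only an approximation of comparing types.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Return 1 if float_t has the size ISO C pairs with FLT_EVAL_METHOD:
   0 -> float, 1 -> double, 2 -> long double.  Other (implementation
   defined or indeterminable) values are accepted as-is.  */
static int
float_t_matches_eval_method (void)
{
  switch (FLT_EVAL_METHOD)
    {
    case 0:
      return sizeof (float_t) == sizeof (float);
    case 1:
      return sizeof (float_t) == sizeof (double);
    case 2:
      return sizeof (float_t) == sizeof (long double);
    default:
      return 1;
    }
}
```

On a target where the C library ties float_t to double while the compiler
reports FLT_EVAL_METHOD 0, this helper would return 0.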

Ok. Applied to mainline.

Thanks!

Andreas

Re: [PATCH 1/2] IBM Z: Configure excess precision for float at compile-time

2020-12-01 Thread Andreas Krebbel via Gcc-patches

On 11/25/20 6:06 PM, Marius Hillenbrand wrote:
> Historically, float_t has been defined as double on s390 and gcc would
> emit double precision insns for evaluating float expressions when in
> standard-compliant mode. Configure that behavior at compile-time as prep
> for changes in glibc: When glibc ties float_t to double, keep the old
> behavior; when glibc derives float_t from FLT_EVAL_METHOD (as on most
> other archs), revert to the default behavior (i.e.,
> FLT_EVAL_METHOD_PROMOTE_TO_FLOAT). Provide a configure option
> --enable-s390-excess-float-precision to override the check.
> 
> gcc/ChangeLog:
> 
> 2020-11-25  Marius Hillenbrand  
> 
>   * configure.ac: Add configure option
>   --enable-s390-excess-float-precision and check to derive default
>   from glibc.
>   * config/s390/s390.c: Guard s390_excess_precision with an ifdef
>   for ENABLE_S390_EXCESS_FLOAT_PRECISION.
>   * doc/install.texi: Document --enable-s390-excess-float-precision.
>   * configure: Regenerate.
>   * config.in: Regenerate.

Ok. Applied to mainline.

Thanks!

Andreas

Re: [PATCH] IBM Z: Restrict vec_cmp on z13

2020-11-24 Thread Andreas Krebbel via Gcc-patches

On 24.11.20 12:55, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on s390x-redhat-linux.  Ok for master?
> 
> 
> 
> Commit 5d9ade39b872 ("IBM Z: Fix PR97326: Enable fp compares in
> vec_cmp") made it possible to create rtxes that describe signaling
> comparisons on z13, which are not supported by the hardware.  Restrict
> this by using vcond_comparison_operator predicate.
> 
> gcc/ChangeLog:
> 
> 2020-11-24  Ilya Leoshkevich  
> 
>   * config/s390/vector.md: Use vcond_comparison_operator
>   predicate.

Ok. Thanks!

Andreas

Re: [PATCH] IBM Z: Update autovec-*-quiet-uneq expectations

2020-11-23 Thread Andreas Krebbel via Gcc-patches

On 23.11.20 22:38, Ilya Leoshkevich wrote:
> Commit 229752afe315 ("VEC_COND_EXPR optimizations") has improved code
> generation: we no longer need "vx x,x,-1", which turned out to be
> superfluous.  Instead, we simply swap 0 and -1 arguments of the
> preceding "vsel".
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-11-23  Ilya Leoshkevich  
> 
>   * gcc.target/s390/zvector/autovec-double-quiet-uneq.c: Expect
>   that "vx" is not emitted.
>   * gcc.target/s390/zvector/autovec-float-quiet-uneq.c: Likewise.

Ok. Thanks!

Andreas

Re: [PATCH] IBM Z: Do not run long double tests on old machines

2020-11-16 Thread Andreas Krebbel via Gcc-patches

On 13.11.20 23:23, Ilya Leoshkevich wrote:
> Bootstrapped and regtested on z13 s390x-redhat-linux.  Ok for master?
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-11-12  Ilya Leoshkevich  
> 
>   * gcc.target/s390/s390.exp (check_effective_target_s390_z14_hw):
>   New predicate.
>   * gcc.target/s390/vector/long-double-caller-abi-run.c: Use the
>   new predicate.
>   * gcc.target/s390/vector/long-double-copysign.c: Likewise.
>   * gcc.target/s390/vector/long-double-from-double.c: Likewise.
>   * gcc.target/s390/vector/long-double-from-float.c: Likewise.
>   * gcc.target/s390/vector/long-double-from-i16.c: Likewise.
>   * gcc.target/s390/vector/long-double-from-i32.c: Likewise.
>   * gcc.target/s390/vector/long-double-from-i64.c: Likewise.
>   * gcc.target/s390/vector/long-double-from-i8.c: Likewise.
>   * gcc.target/s390/vector/long-double-from-u16.c: Likewise.
>   * gcc.target/s390/vector/long-double-from-u32.c: Likewise.
>   * gcc.target/s390/vector/long-double-from-u64.c: Likewise.
>   * gcc.target/s390/vector/long-double-from-u8.c: Likewise.
>   * gcc.target/s390/vector/long-double-to-double.c: Likewise.
>   * gcc.target/s390/vector/long-double-to-float.c: Likewise.
>   * gcc.target/s390/vector/long-double-to-i16.c: Likewise.
>   * gcc.target/s390/vector/long-double-to-i32.c: Likewise.
>   * gcc.target/s390/vector/long-double-to-i64.c: Likewise.
>   * gcc.target/s390/vector/long-double-to-i8.c: Likewise.
>   * gcc.target/s390/vector/long-double-to-u16.c: Likewise.
>   * gcc.target/s390/vector/long-double-to-u32.c: Likewise.
>   * gcc.target/s390/vector/long-double-to-u64.c: Likewise.
>   * gcc.target/s390/vector/long-double-to-u8.c: Likewise.
>   * gcc.target/s390/vector/long-double-wfaxb.c: Likewise.
>   * gcc.target/s390/vector/long-double-wfdxb.c: Likewise.
>   * gcc.target/s390/vector/long-double-wfsxb-1.c: Likewise.

Ok. Thanks!

Andreas

Re: [PATCH] IBM Z: Define vec_vfees instruction pattern

2020-11-12 Thread Andreas Krebbel via Gcc-patches

On 12.11.20 13:21, Stefan Schulze Frielinghaus wrote:
> Bootstrapped and regtested on IBM Z.  Ok for master?
> 
> gcc/ChangeLog:
> 
>   * config/s390/vector.md ("vec_vfees"): New insn pattern.
> ---
>  gcc/config/s390/vector.md | 26 ++
>  1 file changed, 26 insertions(+)
> 
> diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
> index 31d323930b2..4333a2191ae 100644
> --- a/gcc/config/s390/vector.md
> +++ b/gcc/config/s390/vector.md
> @@ -1798,6 +1798,32 @@
>"vll\t%v0,%1,%2"
>[(set_attr "op_type" "VRS")])
>  
> +; vfeebs, vfeehs, vfeefs
> +; vfeezbs, vfeezhs, vfeezfs
> +(define_insn "vec_vfees"
> +  [(set (match_operand:VI_HW_QHS 0 "register_operand" "=v")
> + (unspec:VI_HW_QHS [(match_operand:VI_HW_QHS 1 "register_operand" "v")
> +(match_operand:VI_HW_QHS 2 "register_operand" "v")
> +(match_operand:QI 3 "const_mask_operand" "C")]
> +   UNSPEC_VEC_VFEE))
> +   (set (reg:CCRAW CC_REGNUM)
> + (unspec:CCRAW [(match_dup 1)
> +(match_dup 2)
> +(match_dup 3)]
> +   UNSPEC_VEC_VFEECC))]
> +  "TARGET_VX"
> +{
> +  unsigned HOST_WIDE_INT flags = UINTVAL (operands[3]);
> +
> +  gcc_assert (!(flags & ~(VSTRING_FLAG_ZS | VSTRING_FLAG_CS)));
> +  flags &= ~VSTRING_FLAG_CS;
> +
> +  if (flags == VSTRING_FLAG_ZS)
> +return "vfeezs\t%v0,%v1,%v2";
> +  return "vfees\t%v0,%v1,%v2";
> +}
> +  [(set_attr "op_type" "VRR")])
> +
>  ; vfenebs, vfenehs, vfenefs
>  ; vfenezbs, vfenezhs, vfenezfs
>  (define_insn "vec_vfenes"
> 

Since this is mostly a copy of the pattern in vx-builtins.md I think we should
remove the other version then.

I also would prefer this to be committed together with the code making use of
the expander. So far this would be dead code - right?

Andreas

Re: [PATCH] IBM Z: Fix output template for "*vfees"

2020-11-12 Thread Andreas Krebbel via Gcc-patches

On 12.11.20 13:25, Stefan Schulze Frielinghaus wrote:
> Bootstrapped and regtested on IBM Z.  Ok for master?
> 
> gcc/ChangeLog:
> 
>   * config/s390/vx-builtins.md ("*vfees"): Fix output
> template.
> ---
>  gcc/config/s390/vx-builtins.md | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config/s390/vx-builtins.md b/gcc/config/s390/vx-builtins.md
> index 010db4d1115..0c2e7170223 100644
> --- a/gcc/config/s390/vx-builtins.md
> +++ b/gcc/config/s390/vx-builtins.md
> @@ -1395,7 +1395,7 @@
>  
>if (flags == VSTRING_FLAG_ZS)
>  return "vfeezs\t%v0,%v1,%v2";
> -  return "vfees\t%v0,%v1,%v2,%b3";
> +  return "vfees\t%v0,%v1,%v2";
>  }
>[(set_attr "op_type" "VRR")])
>  
> 

Ok. Thanks!

Andreas

[Committed 2/2] IBM Z: Fix PR97326: Enable fp compares in vec_cmp

2020-11-11 Thread Andreas Krebbel via Gcc-patches

Bootstrapped and regression tested on s390x.

gcc/ChangeLog:

PR target/97326
* config/s390/vector.md: Support vector floating point modes in
vec_cmp.
---
 gcc/config/s390/vector.md | 22 --
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/gcc/config/s390/vector.md b/gcc/config/s390/vector.md
index 58b8999f2db..fef68644625 100644
--- a/gcc/config/s390/vector.md
+++ b/gcc/config/s390/vector.md
@@ -145,6 +145,16 @@ (define_mode_attr TOINTVEC [(V1QI "V1QI") (V2QI "V2QI") (V4QI "V4QI") (V8QI "V8Q
(V1SF "V1SI") (V2SF "V2SI") (V4SF "V4SI")
(V1DF "V1DI") (V2DF "V2DI")
(V1TF "V1TI") (TF "V1TI")])
+
+(define_mode_attr tointvec [(V1QI "v1qi") (V2QI "v2qi") (V4QI "v4qi") (V8QI "v8qi") (V16QI "v16qi")
+   (V1HI "v1hi") (V2HI "v2hi") (V4HI "v4hi") (V8HI "v8hi")
+   (V1SI "v1si") (V2SI "v2si") (V4SI "v4si")
+   (V1DI "v1di") (V2DI "v2di")
+   (V1TI "v1ti")
+   (V1SF "v1si") (V2SF "v2si") (V4SF "v4si")
+   (V1DF "v1di") (V2DF "v2di")
+   (V1TF "v1ti") (TF   "v1ti")])
+
 (define_mode_attr vw [(SF "w") (V1SF "w") (V2SF "v") (V4SF "v")
  (DF "w") (V1DF "w") (V2DF "v")
  (TF "w") (V1TF "w")])
@@ -1546,14 +1556,14 @@ (define_expand "copysign3"
 })
 
 ;;
-;; Integer compares
+;; Compares
 ;;
 
-(define_expand "vec_cmp"
-  [(set (match_operand:VI_HW0 "register_operand" "")
-   (match_operator:VI_HW   1 ""
- [(match_operand:VI_HW 2 "register_operand" "")
-  (match_operand:VI_HW 3 "register_operand" "")]))]
+(define_expand "vec_cmp<mode><tointvec>"
+  [(set (match_operand:<TOINTVEC>  0 "register_operand" "")
+   (match_operator:<TOINTVEC> 1 ""
+ [(match_operand:V_HW 2 "register_operand" "")
+  (match_operand:V_HW 3 "register_operand" "")]))]
   "TARGET_VX"
 {
   s390_expand_vec_compare (operands[0], GET_CODE(operands[1]), operands[2], operands[3]);
-- 
2.25.1
