[avr,committed] Fix PR90622

2023-05-21 Thread Georg-Johann Lay

This patch fixes a minor optimization issue for an avr specific builtin.
Applied as obvious.

https://gcc.gnu.org/r14-1025

Johann

--


target/90622: __builtin_avr_insert_bits: Use BLD/BST for one bit in place.

If just one bit is inserted in the same position like with:
__builtin_avr_insert_bits (0xF2FF, src, dst);
a BLD/BST sequence is better than XOR/AND/XOR.  Thus, don't fold that
case to the latter sequence.

gcc/
PR target/90622
* config/avr/avr.cc (avr_fold_builtin) [AVR_BUILTIN_INSERT_BITS]:
Don't fold to XOR / AND / XOR if just one bit is copied to the
same position.
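
To make the case above concrete, here is a minimal portable C model of the documented nibble-map semantics of __builtin_avr_insert_bits, next to the XOR/AND/XOR merge that the fold would otherwise produce. The function names are hypothetical and are not part of avr.cc. The sketch writes the map as the full 32-bit value 0xFFFFF2FF; that this is what the commit message's 0xF2FF stands for is an assumption.

```c
#include <stdint.h>

/* Model of the builtin: nibble n of MAP selects result bit n.
   0xF keeps bit n of VAL, 0..7 takes that bit of BITS.  */
uint8_t insert_bits_model(uint32_t map, uint8_t bits, uint8_t val)
{
    uint8_t result = 0;
    for (int n = 0; n < 8; n++) {
        unsigned nibble = (map >> (4 * n)) & 0xFu;
        unsigned bit;
        if (nibble == 0xFu)
            bit = (val >> n) & 1u;       /* keep bit n of val */
        else if (nibble <= 7u)
            bit = (bits >> nibble) & 1u; /* take bit 'nibble' of bits */
        else
            bit = 0u; /* 8..0xE is undefined for the builtin; the model picks 0 */
        result |= (uint8_t)(bit << n);
    }
    return result;
}

/* The "vanilla logic" merge for a single in-place bit (bit 2 here):
   three logic operations, where BST + BLD would need two.  */
uint8_t merge_bit2_xor_and_xor(uint8_t src, uint8_t dst)
{
    return (uint8_t)(((src ^ dst) & 0x04u) ^ dst);
}
```

Both functions agree for every src/dst pair when the map is 0xFFFFF2FF, which is why the choice between them is purely a code-size and speed question.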

diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc
index d5af40f7091..9fa50ca230d 100644
--- a/gcc/config/avr/avr.cc
+++ b/gcc/config/avr/avr.cc
@@ -14425,10 +14425,13 @@ avr_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, tree *arg,

 if (changed)
   return build_call_expr (fndecl, 3, tmap, tbits, tval);

-/* If bits don't change their position we can use vanilla logic
-   to merge the two arguments.  */
+/* If bits don't change their position, we can use vanilla logic
+   to merge the two arguments...  */

-   if (avr_map_metric (map, MAP_NONFIXED_0_7) == 0)
+if (avr_map_metric (map, MAP_NONFIXED_0_7) == 0
+// ...except when we are copying just one bit. In that
+// case, BLD/BST is better than XOR/AND/XOR, see PR90622.
+&& avr_map_metric (map, MAP_FIXED_0_7) != 1)
   {
 int mask_f = avr_map_metric (map, MAP_MASK_PREIMAGE_F);
 tree tres, tmask = build_int_cst (val_type, mask_f ^ 0xff);


Re: [PATCH 7/7] Expand directly for single bit test

2023-05-21 Thread David Edelsohn via Gcc-patches
Hi, Andrew

Thanks for this series of patches to improve do_store_flag.  Unfortunately
this specific patch in the series has caused a bootstrap failure on
powerpc-aix.  I bisected this failure to this specific patch. Note that I
am building as 32 bit, so this could be a specific issue about bit size.

* expr.cc (fold_single_bit_test): Rename to ...
(expand_single_bit_test): This and expand directly.
(do_store_flag): Update for the rename function.


Thanks, David


Re: [PATCH 7/7] Expand directly for single bit test

2023-05-21 Thread Andrew Pinski via Gcc-patches
On Sun, May 21, 2023 at 11:17 AM David Edelsohn via Gcc-patches
 wrote:
>
> Hi, Andrew
>
> Thanks for this series of patches to improve do_store_flag.  Unfortunately
> this specific patch in the series has caused a bootstrap failure on
> powerpc-aix.  I bisected this failure to this specific patch. Note that I
> am building as 32 bit, so this could be a specific issue about bit size.
>
> * expr.cc (fold_single_bit_test): Rename to ...
> (expand_single_bit_test): This and expand directly.
> (do_store_flag): Update for the rename function.

Did this include the fix I did for big-endian at
r14-1022-g7f3df8e65c71e5 ? I had found that I broke big-endian last
night with that patch and pushed the fix once I figured out what I did
wrong.
If you already tested with the fix included, I will try to look into it
as soon as possible.

Thanks,
Andrew

>
>
> Thanks, David


[pushed] wwwdocs: readings: Adjust link to Arm architectures

2023-05-21 Thread Gerald Pfeifer
arm.com does some interesting special effects with URLs; hopefully this
simplification is somewhat resilient.

Pushed.

Gerald
---
 htdocs/readings.html | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/htdocs/readings.html b/htdocs/readings.html
index 6813b84f..26f2af7a 100644
--- a/htdocs/readings.html
+++ b/htdocs/readings.html
@@ -65,7 +65,7 @@ names.
   The 64-bit execution state of the ARM Architecture, first introduced
   by the ARMv8-A architecture.
   Manufacturer: Various, by license from ARM.
-  <a href="https://developer.arm.com/architectures/cpu-architecture">ARM Documentation</a>
+  <a href="https://developer.arm.com/architectures">ARM Documentation</a>
  
 
  andes (nds32)
@@ -84,7 +84,7 @@ names.
  ARM
   Manufacturer: Various, by license from ARM.
   CPUs include: ARM7TDMI, and the Cortex-A, Cortex-R and Cortex-M series.
-  <a href="https://developer.arm.com/architectures/cpu-architecture">ARM Documentation</a>
+  <a href="https://developer.arm.com/architectures">ARM Documentation</a>
   <a href="https://developer.arm.com/documentation/ihi0036/latest/">Application Binary Interface (ABI) for the ARM Architecture</a>
  
 
-- 
2.40.1


Re: [PATCH 7/7] Expand directly for single bit test

2023-05-21 Thread Andrew Pinski via Gcc-patches
On Sun, May 21, 2023 at 11:25 AM Andrew Pinski  wrote:
>
> On Sun, May 21, 2023 at 11:17 AM David Edelsohn via Gcc-patches
>  wrote:
> >
> > Hi, Andrew
> >
> > Thanks for this series of patches to improve do_store_flag.  Unfortunately
> > this specific patch in the series has caused a bootstrap failure on
> > powerpc-aix.  I bisected this failure to this specific patch. Note that I
> > am building as 32 bit, so this could be a specific issue about bit size.
> >
> > * expr.cc (fold_single_bit_test): Rename to ...
> > (expand_single_bit_test): This and expand directly.
> > (do_store_flag): Update for the rename function.
>
> Did this include the fix I did for big-endian at
> r14-1022-g7f3df8e65c71e5 ? I had found that I broke big-endian last
> night with that patch and pushed the fix once I figured out what I did
> wrong.
> If you already tried post the fix, I will try to look into it as soon
> as possible.

I just re-read my message and I think it might have been confusing.
Last night I noticed that the patch you pointed out broke big-endian
targets, and I pushed r14-1022-g7f3df8e65c71e5 as the fix. I am wondering
whether your testing included this fix.
If yes, then I will try to figure out how I broke this target too.

Thanks,
Andrew

>
> Thanks,
> Andrew
>
> >
> >
> > Thanks, David


Re: [PATCH 7/7] Expand directly for single bit test

2023-05-21 Thread Jeff Law via Gcc-patches




On 5/21/23 12:25, Andrew Pinski via Gcc-patches wrote:
> On Sun, May 21, 2023 at 11:17 AM David Edelsohn via Gcc-patches
>  wrote:
>>
>> Hi, Andrew
>>
>> Thanks for this series of patches to improve do_store_flag.  Unfortunately
>> this specific patch in the series has caused a bootstrap failure on
>> powerpc-aix.  I bisected this failure to this specific patch. Note that I
>> am building as 32 bit, so this could be a specific issue about bit size.
>>
>>  * expr.cc (fold_single_bit_test): Rename to ...
>>  (expand_single_bit_test): This and expand directly.
>>  (do_store_flag): Update for the rename function.
>
> Did this include the fix I did for big-endian at
> r14-1022-g7f3df8e65c71e5 ? I had found that I broke big-endian last
> night with that patch and pushed the fix once I figured out what I did
> wrong.
> If you already tried post the fix, I will try to look into it as soon
> as possible.

FWIW, the various failing hosts from yesterday in my tester have all 
returned to successful builds after the BE fixes.  m32r, iq2000, moxie, 
sh3eb, h8300.


There's a very reasonable chance the PPC bug is the same underlying issue.

Jeff


[PATCH] Fortran: checking and simplification of RESHAPE intrinsic [PR103794]

2023-05-21 Thread Harald Anlauf via Gcc-patches
Dear all,

checking and simplification of the RESHAPE intrinsic could fail in
various ways for sufficiently complicated arguments, like array
constructors.  Debugging revealed that in these cases we determined
that the array arguments were constant but we did not properly
simplify and expand the constructors.

A possible solution is to extend the test for constant arrays -
which already does an expansion for initialization expressions -
to also perform an expansion for small constructors in the
non-initialization case.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From bfb708fdb6c313473a3054be710c630dcdebf69d Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Sun, 21 May 2023 22:25:29 +0200
Subject: [PATCH] Fortran: checking and simplification of RESHAPE intrinsic
 [PR103794]

gcc/fortran/ChangeLog:

	PR fortran/103794
	* check.cc (gfc_check_reshape): Expand constant arguments SHAPE and
	ORDER before checking.
	* gfortran.h (gfc_is_constant_array_expr): Add prototype.
	* iresolve.cc (gfc_resolve_reshape): Expand constant argument SHAPE.
	* simplify.cc (is_constant_array_expr): If array is determined to be
	constant, expand small array constructors if needed.
	(gfc_is_constant_array_expr): Wrapper for is_constant_array_expr.
	(gfc_simplify_reshape): Fix check for insufficient elements in SOURCE
	when no padding specified.

gcc/testsuite/ChangeLog:

	PR fortran/103794
	* gfortran.dg/reshape_10.f90: New test.
	* gfortran.dg/reshape_11.f90: New test.
---
 gcc/fortran/check.cc |  6 +++--
 gcc/fortran/gfortran.h   |  1 +
 gcc/fortran/iresolve.cc  |  2 +-
 gcc/fortran/simplify.cc  | 25 ++---
 gcc/testsuite/gfortran.dg/reshape_10.f90 | 34 
 gcc/testsuite/gfortran.dg/reshape_11.f90 | 15 +++
 6 files changed, 77 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/reshape_10.f90
 create mode 100644 gcc/testsuite/gfortran.dg/reshape_11.f90

diff --git a/gcc/fortran/check.cc b/gcc/fortran/check.cc
index 3dd1711aa14..4086dc71d34 100644
--- a/gcc/fortran/check.cc
+++ b/gcc/fortran/check.cc
@@ -4723,7 +4723,7 @@ gfc_check_reshape (gfc_expr *source, gfc_expr *shape,
 }

   gfc_simplify_expr (shape, 0);
-  shape_is_const = gfc_is_constant_expr (shape);
+  shape_is_const = gfc_is_constant_array_expr (shape);

   if (shape->expr_type == EXPR_ARRAY && shape_is_const)
 {
@@ -4732,6 +4732,8 @@ gfc_check_reshape (gfc_expr *source, gfc_expr *shape,
   for (i = 0; i < shape_size; ++i)
 	{
 	  e = gfc_constructor_lookup_expr (shape->value.constructor, i);
+	  if (e == NULL)
+	break;
 	  if (e->expr_type != EXPR_CONSTANT)
 	continue;

@@ -4764,7 +4766,7 @@ gfc_check_reshape (gfc_expr *source, gfc_expr *shape,
   if (!type_check (order, 3, BT_INTEGER))
 	return false;

-  if (order->expr_type == EXPR_ARRAY && gfc_is_constant_expr (order))
+  if (order->expr_type == EXPR_ARRAY && gfc_is_constant_array_expr (order))
 	{
 	  int i, order_size, dim, perm[GFC_MAX_DIMENSIONS];
 	  gfc_expr *e;
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 9dd6b45f112..8cfa8fd3afd 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -3970,6 +3970,7 @@ bool gfc_fix_implicit_pure (gfc_namespace *);

 void gfc_convert_mpz_to_signed (mpz_t, int);
 gfc_expr *gfc_simplify_ieee_functions (gfc_expr *);
+bool gfc_is_constant_array_expr (gfc_expr *);
 bool gfc_is_size_zero_array (gfc_expr *);

 /* trans-array.cc  */
diff --git a/gcc/fortran/iresolve.cc b/gcc/fortran/iresolve.cc
index 7880aba63bb..571e1bd3441 100644
--- a/gcc/fortran/iresolve.cc
+++ b/gcc/fortran/iresolve.cc
@@ -2424,7 +2424,7 @@ gfc_resolve_reshape (gfc_expr *f, gfc_expr *source, gfc_expr *shape,
   break;
 }

-  if (shape->expr_type == EXPR_ARRAY && gfc_is_constant_expr (shape))
+  if (shape->expr_type == EXPR_ARRAY && gfc_is_constant_array_expr (shape))
 {
   gfc_constructor *c;
   f->shape = gfc_get_shape (f->rank);
diff --git a/gcc/fortran/simplify.cc b/gcc/fortran/simplify.cc
index 6ba2040e61c..3f77203e62e 100644
--- a/gcc/fortran/simplify.cc
+++ b/gcc/fortran/simplify.cc
@@ -254,12 +254,19 @@ is_constant_array_expr (gfc_expr *e)
 	break;
   }

-  /* Check and expand the constructor.  */
-  if (!array_OK && gfc_init_expr_flag && e->rank == 1)
+  /* Check and expand the constructor.  We do this when either
+ gfc_init_expr_flag is set or for not too large array constructors.  */
+  bool expand;
+  expand = (e->rank == 1
+	&& e->shape
+	&& (mpz_cmp_ui (e->shape[0], flag_max_array_constructor) < 0));
+
+  if (!array_OK && (gfc_init_expr_flag || expand) && e->rank == 1)
 {
+  bool saved_init_expr_flag = gfc_init_expr_flag;
   array_OK = gfc_reduce_init_expr (e);
   /* gfc_reduce_init_expr resets the flag.  */
-  gfc_init_expr_flag = true;
+  gfc_init_expr_flag = saved_init_expr_flag;
 }
   else
 return array_

Re: [PATCH 7/7] Expand directly for single bit test

2023-05-21 Thread David Edelsohn via Gcc-patches
On Sun, May 21, 2023 at 11:25 AM Andrew Pinski  wrote:

> On Sun, May 21, 2023 at 11:17 AM David Edelsohn via Gcc-patches
>  wrote:
> >
> > Hi, Andrew
> >
> > Thanks for this series of patches to improve do_store_flag.
> Unfortunately
> > this specific patch in the series has caused a bootstrap failure on
> > powerpc-aix.  I bisected this failure to this specific patch. Note that I
> > am building as 32 bit, so this could be a specific issue about bit size.
> >
> > * expr.cc (fold_single_bit_test): Rename to ...
> > (expand_single_bit_test): This and expand directly.
> > (do_store_flag): Update for the rename function.
>
> Did this include the fix I did for big-endian at
> r14-1022-g7f3df8e65c71e5 ? I had found that I broke big-endian last
> night with that patch and pushed the fix once I figured out what I did
> wrong.
> If you already tried post the fix, I will try to look into it as soon
> as possible.
>
>
The big-endian patch fixed the issue for Power also.

Thanks, David


[PATCH V12] VECT: Fix issue of multiple-rgroup for length is counting elements

2023-05-21 Thread juzhe . zhong
From: Ju-Zhe Zhong 

Address comments from Richard by splitting out the fix for multiple-rgroup
handling when the length counts elements.

This patch fixes the handling of multiple rgroups when the length counts
elements.

Before this patch, the multiple_rgroup_run tests fail:
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c execution test
FAIL: gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c execution test

After this patch, these tests all pass.
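
For orientation, the loop shape these tests exercise can be sketched in plain C as below. This is a hand-written instance of the test_1 macro body from the patch with concrete types chosen for illustration (the function name and the int8_t/int16_t choice are assumptions, not taken from the testsuite). Each scalar iteration stores two elements to f but only one to d, so a length-controlled vectorization presumably needs separate length controls for the two accesses, which is the multiple-rgroup situation the patch title refers to.

```c
#include <stdint.h>

/* Hypothetical concrete instance of the test loop: two stores to f and
   one store to d per scalar iteration.  */
void fill_two_rgroups(int8_t *restrict f, int16_t *restrict d,
                      int8_t x, int8_t x2, int16_t y, int n)
{
    for (int i = 0; i < n; ++i) {
        f[i * 2 + 0] = x;   /* first element of the pair */
        f[i * 2 + 1] = x2;  /* second element of the pair */
        d[i] = y;           /* single element per iteration */
    }
}
```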

gcc/ChangeLog:

* tree-vect-loop.cc (vect_get_loop_len): Fix issue for multiple-rgroup of length.
* tree-vect-stmts.cc (vectorizable_store): Ditto.
(vectorizable_load): Ditto.
* tree-vectorizer.h (vect_get_loop_len): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c: New test.

---
 .../rvv/autovec/partial/multiple_rgroup-1.c   |   6 +
 .../rvv/autovec/partial/multiple_rgroup-1.h   | 304 ++
 .../rvv/autovec/partial/multiple_rgroup-2.c   |   6 +
 .../rvv/autovec/partial/multiple_rgroup-2.h   | 546 ++
 .../autovec/partial/multiple_rgroup_run-1.c   |  19 +
 .../autovec/partial/multiple_rgroup_run-2.c   |  19 +
 gcc/tree-vect-loop.cc |  26 +-
 gcc/tree-vect-stmts.cc|  28 +-
 gcc/tree-vectorizer.h |   5 +-
 9 files changed, 944 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-2.h
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_run-2.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
new file mode 100644
index 000..69cc3be78f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax" } */
+
+#include "multiple_rgroup-1.h"
+
+TEST_ALL (test_1)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
new file mode 100644
index 000..fbc49f4855d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup-1.h
@@ -0,0 +1,304 @@
+#include 
+#include 
+
+#define test_1(TYPE1, TYPE2)                                                   \
+  void __attribute__ ((noinline, noclone))                                     \
+  test_1_##TYPE1_##TYPE2 (TYPE1 *__restrict f, TYPE2 *__restrict d, TYPE1 x,   \
+                          TYPE1 x2, TYPE2 y, int n)                            \
+  {                                                                            \
+    for (int i = 0; i < n; ++i)                                                \
+      {                                                                        \
+        f[i * 2 + 0] = x;                                                      \
+        f[i * 2 + 1] = x2;                                                     \
+        d[i] = y;                                                              \
+  }   

Re: [RFC V2] RISC-V : Support rv64 ilp32

2023-05-21 Thread Guo Ren via Gcc-patches
On Fri, May 19, 2023 at 3:35 PM Kito Cheng  wrote:
>
> I am concern about we didn't define POINTERS_EXTEND_UNSIGNED here, and
> also concern about the code model stuffs, I know currently Guo-Ren's
> implementation is rely on some MMU trick, but I am not sure does it
> also applicable on embedded applications.
There are two ways:
 - Limit address < 2GB. (Fortunately, most MCUs have a limit on their
address of less than 2GB.)
 - A Zjpm-like hardware extension could mask the highest 32 bits of the address.

>
>
> > diff --git a/gcc/config/riscv/linux.h b/gcc/config/riscv/linux.h
> > index b9557a75dc7..4f33c88ef6e 100644
> > --- a/gcc/config/riscv/linux.h
> > +++ b/gcc/config/riscv/linux.h
> > @@ -58,7 +58,7 @@ along with GCC; see the file COPYING3.  If not see
> >"%{mabi=ilp32:_ilp32}"
> >
> >  #define LINK_SPEC "\
> > --melf" XLEN_SPEC DEFAULT_ENDIAN_SPEC "riscv" LD_EMUL_SUFFIX " \
> > +-melf" ABI_LEN_SPEC DEFAULT_ENDIAN_SPEC "riscv" LD_EMUL_SUFFIX " \
> >  %{mno-relax:--no-relax} \
> >  %{mbig-endian:-EB} \
> >  %{mlittle-endian:-EL} \
> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > index 5f44f6dc5c9..09ab940447d 100644
> > --- a/gcc/config/riscv/riscv.cc
> > +++ b/gcc/config/riscv/riscv.cc
> > @@ -6291,10 +6291,6 @@ riscv_option_override (void)
> >&& riscv_abi != ABI_LP64 && riscv_abi != ABI_ILP32E)
> >  error ("z*inx requires ABI ilp32, ilp32e or lp64");
> >
> > -  /* We do not yet support ILP32 on RV64.  */
> > -  if (BITS_PER_WORD != POINTER_SIZE)
> > -error ("ABI requires %<-march=rv%d%>", POINTER_SIZE);
>
> It seems to also make -march=rv32g -mabi=lp64 become acceptable?
>
> >



-- 
Best Regards
 Guo Ren


Re: [RFC V2] RISC-V : Support rv64 ilp32

2023-05-21 Thread Guo Ren via Gcc-patches
On Mon, May 22, 2023 at 10:51 AM Guo Ren  wrote:
>
> On Fri, May 19, 2023 at 3:35 PM Kito Cheng  wrote:
> >
> > I am concern about we didn't define POINTERS_EXTEND_UNSIGNED here, and
> > also concern about the code model stuffs, I know currently Guo-Ren's
> > implementation is rely on some MMU trick, but I am not sure does it
> > also applicable on embedded applications.
> There are two ways:
>  - Limit address < 2GB. (Fortunately, most MCUs have a limit on their
> address of less than 2GB.)
>  - The zjpm liked hardware extension could mask the highest 32 bits of the 
> address.
I guess your question is: shall we start the work on GCC's
POINTERS_EXTEND_UNSIGNED?
If the SoC's DRAM start address is above 2GB, we need GCC's
POINTERS_EXTEND_UNSIGNED.
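
The 2GB boundary can be illustrated with a small C sketch. It assumes that a 32-bit pointer held in a 64-bit register is sign-extended unless the target defines POINTERS_EXTEND_UNSIGNED to request zero-extension; the function names are hypothetical and only model the arithmetic, not any GCC internals.

```c
#include <stdint.h>

/* Sign-extend a 32-bit address to 64 bits (the assumed default).  */
uint64_t extend_signed(uint32_t addr)
{
    if (addr & UINT32_C(0x80000000))
        return UINT64_C(0xFFFFFFFF00000000) | addr;
    return (uint64_t)addr;
}

/* Zero-extend a 32-bit address to 64 bits.  */
uint64_t extend_unsigned(uint32_t addr)
{
    return (uint64_t)addr;
}

/* Model of a hardware extension that ignores the upper 32 address bits.  */
uint64_t mask_upper_bits(uint64_t addr)
{
    return addr & UINT64_C(0xFFFFFFFF);
}
```

Addresses below 2GB come out the same either way, which is why the first option works without compiler changes; at or above 2GB the sign-extended form only reaches the right location if something masks the upper bits.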

>
> >
> >
> > > diff --git a/gcc/config/riscv/linux.h b/gcc/config/riscv/linux.h
> > > index b9557a75dc7..4f33c88ef6e 100644
> > > --- a/gcc/config/riscv/linux.h
> > > +++ b/gcc/config/riscv/linux.h
> > > @@ -58,7 +58,7 @@ along with GCC; see the file COPYING3.  If not see
> > >"%{mabi=ilp32:_ilp32}"
> > >
> > >  #define LINK_SPEC "\
> > > --melf" XLEN_SPEC DEFAULT_ENDIAN_SPEC "riscv" LD_EMUL_SUFFIX " \
> > > +-melf" ABI_LEN_SPEC DEFAULT_ENDIAN_SPEC "riscv" LD_EMUL_SUFFIX " \
> > >  %{mno-relax:--no-relax} \
> > >  %{mbig-endian:-EB} \
> > >  %{mlittle-endian:-EL} \
> > > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > > index 5f44f6dc5c9..09ab940447d 100644
> > > --- a/gcc/config/riscv/riscv.cc
> > > +++ b/gcc/config/riscv/riscv.cc
> > > @@ -6291,10 +6291,6 @@ riscv_option_override (void)
> > >&& riscv_abi != ABI_LP64 && riscv_abi != ABI_ILP32E)
> > >  error ("z*inx requires ABI ilp32, ilp32e or lp64");
> > >
> > > -  /* We do not yet support ILP32 on RV64.  */
> > > -  if (BITS_PER_WORD != POINTER_SIZE)
> > > -error ("ABI requires %<-march=rv%d%>", POINTER_SIZE);
> >
> > It seems to also make -march=rv32g -mabi=lp64 become acceptable?
> >
> > >
>
>
>
> --
> Best Regards
>  Guo Ren



-- 
Best Regards
 Guo Ren


Re: [PATCH] vect: Don't retry if the previous analysis fails

2023-05-21 Thread Kewen.Lin via Gcc-patches
on 2023/5/17 14:32, Richard Biener wrote:
> On Wed, May 17, 2023 at 8:06 AM Kewen.Lin  wrote:
>>
>> Hi,
>>
>> When working on a cost tweaking patch, I found that a newly
>> added test case has different dumpings with stage-1 and
>> bootstrapped gcc.  By looking into it, the apparent reason
>> is vect_analyze_loop_2 doesn't get slp_done_for_suggested_uf
>> set expectedly, the following retrying will use the garbage
>> slp_done_for_suggested_uf instead.  In fact, the setting of
>> slp_done_for_suggested_uf only happens when the previous
>> analysis succeeds, for the mentioned test case, its previous
>> analysis does fail, it's unexpected to use the value of
>> slp_done_for_suggested_uf any more.
>>
>> In function vect_analyze_loop_1, we only return success when
>> res is true, which is the result of 1st analysis.  It means
>> we never try to vectorize with unroll_vinfo if the previous
>> analysis fails.  So this patch shouldn't break anything, and
>> just stop some useless analysis early.
>>
>> Bootstrapped and regtested on x86_64-redhat-linux,
>> aarch64-linux-gnu and powerpc64{,le}-linux-gnu.
>>
>> Is it ok for trunk?
> 
> OK for trunk and affected branches.

Pushed as r14-926 and backported in r13-7364 & r12-9633.  Thanks!

BR,
Kewen


Re: [PATCH 1/2] vect: Refactor code for index == count in vect_transform_slp_perm_load_1

2023-05-21 Thread Kewen.Lin via Gcc-patches
on 2023/5/18 14:12, Richard Biener wrote:
> On Wed, May 17, 2023 at 9:19 AM Kewen.Lin  wrote:
>>
>> Hi Richi,
>>
>> on 2023/5/17 14:34, Richard Biener wrote:
>>> On Wed, May 17, 2023 at 8:09 AM Kewen.Lin  wrote:

>>>> Hi,
>>>>
>>>> This patch is to refactor the handlings for the case (index
>>>> == count) in a loop of vect_transform_slp_perm_load_1, in
>>>> order to prepare a subsequent adjustment on *nperm.  This
>>>> patch doesn't have any functional changes.
>>>
>>> The diff is impossible to be reviewed - can you explain the
>>> refactoring you have done or also attach a patch more clearly
>>> showing what you change?
>>
>> Sorry, I should have made it more clear.
>> It mainly to combine these two hunks:
>>
>>   if (index == count && !noop_p)
>> {
>>// A ...
>>// ++*n_perms;
>> }
>>
>>   if (index == count)
>> {
>>if (!analyze_only)
>>  {
>> if (!noop_p)
>>// B1 ...
>>
>> // B2 ...
>>
>> for ...
>>   {
>>  if (!noop_p)
>> // B3 building VEC_PERM_EXPR
>>  else
>> // B4 building nothing (no uses for B2 and its seq)
>>   }
>>  }
>>// B5
>> }
>>
>> The former can be part of the latter, so it becomes to:
>>
>>   if (index == count)
>> {
>>if (!noop_p)
>>  {
>>// A ...
>>// ++*n_perms;
>>
>>if (!analyze_only)
>>  {
>> // B1 ...
>> // B2 ...
>> for ...
>>// B3 building VEC_PERM_EXPR
>>  }
>>  }
>>else if (!analyze_only)
>>  {
>> // no B2 since no any further uses here.
>> for ...
>>   // B4 building nothing
>>  }
>> // B5 ...
>> }
> 
> Ah, thanks - that made reviewing easy.  1/2 is OK for trunk.

Thanks for the review!  Pushed as r14-1028.
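
The restructuring outlined in the quoted pseudo-code can be checked mechanically with a small C sketch. Each labelled block (A, B1 ... B5) is reduced to a flag, and the two functions return the set of blocks that would run for a given combination of conditions. The names are hypothetical; the per-iteration loop around B3/B4 is collapsed to a single flag.

```c
#include <stdbool.h>

enum {
    BLK_A  = 1 << 0,
    BLK_B1 = 1 << 1,
    BLK_B2 = 1 << 2,
    BLK_B3 = 1 << 3,
    BLK_B4 = 1 << 4,
    BLK_B5 = 1 << 5
};

/* Control flow before the refactoring: two separate "index == count" hunks.  */
unsigned blocks_before(bool at_count, bool noop_p, bool analyze_only)
{
    unsigned run = 0;
    if (at_count && !noop_p)
        run |= BLK_A;
    if (at_count) {
        if (!analyze_only) {
            if (!noop_p)
                run |= BLK_B1;
            run |= BLK_B2;
            run |= noop_p ? BLK_B4 : BLK_B3;
        }
        run |= BLK_B5;
    }
    return run;
}

/* Control flow after the refactoring: one hunk, split on noop_p first.  */
unsigned blocks_after(bool at_count, bool noop_p, bool analyze_only)
{
    unsigned run = 0;
    if (at_count) {
        if (!noop_p) {
            run |= BLK_A;
            if (!analyze_only)
                run |= BLK_B1 | BLK_B2 | BLK_B3;
        } else if (!analyze_only) {
            run |= BLK_B4; /* B2 dropped: its result has no uses here */
        }
        run |= BLK_B5;
    }
    return run;
}
```

The two functions differ only in B2 for the no-op, non-analysis case, matching the remark in the quoted text that B2 has no further uses there.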

BR,
Kewen


Re: [PATCH v2] rs6000: Add buildin for mffscrn instructions

2023-05-21 Thread Kewen.Lin via Gcc-patches
Hi Carl,

on 2023/5/19 05:12, Carl Love via Gcc-patches wrote:
> GCC maintainers:
> 
> version 2.  Fixed an issue with the test case.  The dg-options line was
> missing.
> 
> The following patch adds an overloaded builtin.  There are two possible
> arguments for the builtin.  The builtin definitions are:
> 
>   double __builtin_mffscrn (unsigned long int);
>   double __builtin_mffscrn (double);
> 

We already have one bif, __builtin_set_fpscr_rn, for setting RN; apparently
these two are mainly for direct mapping to mffscrn{,i} and for getting the
FPSCR bits.  I'm curious what requirement calls for these two built-in functions?
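
As a reading aid for the documentation text quoted later in this review, the effect described there can be modelled in C on a 64-bit FPSCR image. This is only a sketch under stated assumptions: the function and macro names are hypothetical, the real builtin returns the bits in a double rather than an integer, and the masks are derived from the quoted bit ranges using the ISA convention that bit 0 is the most significant of 64 bits.

```c
#include <stdint.h>

/* ISA bit k corresponds to value bit (63 - k).  */
#define FPSCR_DRN_MASK   (UINT64_C(0x7) << 32)  /* ISA bits 29:31 */
#define FPSCR_CTRL_MASK  UINT64_C(0xFF)         /* ISA bits 56:63 */
#define FPSCR_RN_MASK    UINT64_C(0x3)          /* ISA bits 62:63 */

/* Return the old control bits, then copy the low two bits of ARG into RN.  */
uint64_t model_mffscrn(uint64_t *fpscr, uint64_t arg)
{
    uint64_t old_control = *fpscr & (FPSCR_DRN_MASK | FPSCR_CTRL_MASK);
    *fpscr = (*fpscr & ~FPSCR_RN_MASK) | (arg & FPSCR_RN_MASK);
    return old_control;
}
```

The point of the model is that the call both reads the previous control bits and installs a new rounding mode, so a caller can restore the old mode later from the returned value.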

> The patch has been tested on Power 10 with no regressions.  
> 
> Please let me know if the patch is acceptable for mainline.  Thanks.
> 
> Carl
> 
> 
> rs6000: Add buildin for mffscrn instructions
> 

s/buildin/built-in/

> This patch adds overloaded __builtin_mffscrn for the move From FPSCR
> Control & Set R instruction with an immediate argument.  It also adds the
> builtin with a floating point register argument.  A new runnable test is
> added for the new builtin.

s/Set R/Set RN/

> 
> gcc/
> 
>   * config/rs6000/rs6000-builtins.def (__builtin_mffscrni,
>   __builtin_mffscrnd): Add builtin definitions.
>   * config/rs6000/rs6000-overload.def (__builtin_mffscrn): Add
>   overloaded definition.
>   * doc/extend.texi: Add documentation for __builtin_mffscrn.
> 
> gcc/testsuite/
> 
>   * gcc.target/powerpc/builtin-mffscrn.c: Add testcase for new
>   builtin.
> ---
>  gcc/config/rs6000/rs6000-builtins.def |   7 ++
>  gcc/config/rs6000/rs6000-overload.def |   5 +
>  gcc/doc/extend.texi   |   8 ++
>  .../gcc.target/powerpc/builtin-mffscrn.c  | 106 ++
>  4 files changed, 126 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/builtin-mffscrn.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def 
> b/gcc/config/rs6000/rs6000-builtins.def
> index 92d9b46e1b9..67125473684 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -2875,6 +2875,13 @@
>pure vsc __builtin_vsx_xl_len_r (void *, signed long);
>  XL_LEN_R xl_len_r {}
>  
> +; Immediate instruction only uses the least significant two bits of the
> +; const int.
> +  double __builtin_mffscrni (const int<2>);
> +MFFSCRNI rs6000_mffscrni {}
> +
> +  double __builtin_mffscrnd (double);
> +MFFSCRNF rs6000_mffscrn {}
>  

Why are they put in [power9-64] rather than [power9]?  IMHO [power9] is the
correct stanza for them.  Besides, the {nosoft} attribute is required.

>  ; Builtins requiring hardware support for IEEE-128 floating-point.
>  [ieee128-hw]
> diff --git a/gcc/config/rs6000/rs6000-overload.def 
> b/gcc/config/rs6000/rs6000-overload.def
> index c582490c084..adda2df69ea 100644
> --- a/gcc/config/rs6000/rs6000-overload.def
> +++ b/gcc/config/rs6000/rs6000-overload.def
> @@ -78,6 +78,11 @@
>  ; like after a required newline, but nowhere else.  Lines beginning with
>  ; a semicolon are also treated as blank lines.
>  
> +[MFFSCR, __builtin_mffscrn, __builtin_mffscrn]
> +  double __builtin_mffscrn (const int<2>);
> +MFFSCRNI
> +  double __builtin_mffscrn (double);
> +MFFSCRNF
>  
>  [BCDADD, __builtin_bcdadd, __builtin_vec_bcdadd]
>vsq __builtin_vec_bcdadd (vsq, vsq, const int);
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index ed8b9c8a87b..f16c046051a 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -18455,6 +18455,9 @@ int __builtin_dfp_dtstsfi_ov_td (unsigned int 
> comparison, _Decimal128 value);
>  
>  double __builtin_mffsl(void);
>  
> +double __builtin_mffscrn (unsigned long int);
> +double __builtin_mffscrn (double);

s/unsigned long int/const int/

Note that this section is for all configurations, while your implementation
makes __builtin_mffscrn power9 only.  So if the intention (requirement) is to
make it available for all configurations as well, we need to deal with the
cases without support for the actual hw insns mffscrn{,i}, just like the
existing handling for mffsl etc.

> +
>  @end smallexample
>  The @code{__builtin_byte_in_set} function requires a
>  64-bit environment supporting ISA 3.0 or later.  This function returns
> @@ -18511,6 +18514,11 @@ the FPSCR.  The instruction is a lower latency 
> version of the @code{mffs}
>  instruction.  If the @code{mffsl} instruction is not available, then the
>  builtin uses the older @code{mffs} instruction to read the FPSCR.
>  
> +The @code{__builtin_mffscrn} returns the contents of the control bits in the
> +FPSCR, bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI, RN).  The
> +contents of bits [62:63] of the unsigned long int or double argument are 
> placed
> +into bits [62:63] of the FPSCR (RN).
> +

I know this description is copied from the ISA doc, but this part is for GCC documentation,
the doc

[PATCH] RISC-V: Add missing torture-init and torture-finish for rvv.exp

2023-05-21 Thread Kito Cheng via Gcc-patches
Hi Vineet:

Could you help test this patch?  It resolves the issue on our machine,
but I would like to make sure it also works in other environments.

Thanks :)

---

We got a bunch of the following error messages for multilib runs:

ERROR: torture-init: torture_without_loops is not empty as expected
ERROR: tcl error code NONE

It seems we need torture-init and torture-finish around the test
loop.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/rvv.exp: Add torture-init and
torture-finish.
---
 gcc/testsuite/gcc.target/riscv/rvv/rvv.exp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
index bc99cc0c3cf4..19179564361a 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
+++ b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
@@ -39,6 +39,7 @@ if [istarget riscv32-*-*] then {
 
 # Initialize `dg'.
 dg-init
+torture-init
 
 # Main loop.
 set CFLAGS "$DEFAULT_CFLAGS -march=$gcc_march -mabi=$gcc_mabi -O3"
@@ -69,5 +70,7 @@ foreach op $AUTOVEC_TEST_OPTS {
 dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/vls-vlmax/*.\[cS\]]] \
    "-std=c99 -O3 -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax" $CFLAGS
 
+torture-finish
+
 # All done.
 dg-finish
-- 
2.40.1



Re: [PATCHv2, rs6000] Splat vector small V2DI constants with ISA 2.07 instructions [PR104124]

2023-05-21 Thread Kewen.Lin via Gcc-patches
Hi Haochen,

on 2023/5/4 16:56, HAO CHEN GUI wrote:
> Hi,
>   This patch adds a new insn for vector splat with small V2DI constants on P8.
> If the value of the constant is in RANGE (-16, 15) and not 0 or -1, it can be
> loaded with vspltisw and vupkhsw on P8. It should be more efficient than
> loading the vector from the TOC.
>
>   Compared to last version, the main change is to move the constant check from
> easy_altivec_constant to easy_vector_constant and remove some unnecessary
> mode checks.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> 
> Thanks
> Gui Haochen
> 
> ChangeLog
> 2023-05-04  Haochen Gui 
> 
> gcc/
>   PR target/104124
>   * config/rs6000/altivec.md (*altivec_vupkhs_direct): Rename
>   to...
>   (altivec_vupkhs_direct): ...this.
>   * config/rs6000/constraints.md (wT constraint): New constant for a
>   vector constraint that can be loaded with vspltisw and vupkhsw.
>   * config/rs6000/predicates.md (vspltisw_vupkhsw_constant_split): New
>   predicate for wT constraint.
>   (easy_vector_constant): Call vspltisw_vupkhsw_constant_p to Check if
>   a vector constant can be synthesized with a vspltisw and a vupkhsw.
>   * config/rs6000/rs6000-protos.h (vspltisw_vupkhsw_constant_p): Declare.
>   * config/rs6000/rs6000.cc (vspltisw_vupkhsw_constant_p): Call
>   * (vspltisw_vupkhsw_constant_p): New function to return true if OP
>   mode is V2DI and can be synthesized with vupkhsw and vspltisw.
>   * config/rs6000/vsx.md (*vspltisw_v2di_split): New insn to load up
>   constants with vspltisw and vupkhsw.
> 
> gcc/testsuite/
>   PR target/104124
>   * gcc.target/powerpc/pr104124.c: New.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index 49b0c964f4d..2c932854c33 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -2542,7 +2542,7 @@ (define_insn "altivec_vupkhs"
>  }
>[(set_attr "type" "vecperm")])
> 
> -(define_insn "*altivec_vupkhs_direct"
> +(define_insn "altivec_vupkhs_direct"
>[(set (match_operand:VP 0 "register_operand" "=v")
>   (unspec:VP [(match_operand: 1 "register_operand" "v")]
>UNSPEC_VUNPACK_HI_SIGN_DIRECT))]
> diff --git a/gcc/config/rs6000/constraints.md 
> b/gcc/config/rs6000/constraints.md
> index c4a6ccf4efb..e7f185660c0 100644
> --- a/gcc/config/rs6000/constraints.md
> +++ b/gcc/config/rs6000/constraints.md
> @@ -144,6 +144,10 @@ (define_constraint "wS"
>"@internal Vector constant that can be loaded with XXSPLTIB & sign 
> extension."
>(match_test "xxspltib_constant_split (op, mode)"))
> 
> +(define_constraint "wT"
> +  "@internal Vector constant that can be loaded with vspltisw & vupkhsw."
> +  (match_test "vspltisw_vupkhsw_constant_split (op, mode)"))
> +
>  ;; ISA 3.0 DS-form instruction that has the bottom 2 bits 0 and no update 
> form.
>  ;; Used by LXSD/STXSD/LXSSP/STXSSP.  In contrast to "Y", the multiple-of-four
>  ;; offset is enforced for 32-bit too.
> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
> index 52c65534e51..ff0f625d508 100644
> --- a/gcc/config/rs6000/predicates.md
> +++ b/gcc/config/rs6000/predicates.md
> @@ -694,6 +694,16 @@ (define_predicate "xxspltib_constant_split"
>return num_insns > 1;
>  })
> 
> +;; Return true if the operand is a constant that can be loaded with a 
> vspltisw
> +;; instruction and then a vupkhsw instruction.
> +
> +(define_predicate "vspltisw_vupkhsw_constant_split"
> +  (match_code "const_vector")
> +{
> +  int value;
> +
> +  return vspltisw_vupkhsw_constant_p (op, mode, &value);

Just "return vspltisw_vupkhsw_constant_p (op, mode);"?

> +})
> 
>  ;; Return 1 if the operand is constant that can loaded directly with a 
> XXSPLTIB
>  ;; instruction.
> @@ -742,6 +752,11 @@ (define_predicate "easy_vector_constant"
>&& xxspltib_constant_p (op, mode, &num_insns, &value))
>   return true;
> 
> +  /* V2DI constant within RANGE (-16, 15) can be synthesized with a
> +  vspltisw and a vupkhsw.  */
> +  if (vspltisw_vupkhsw_constant_p (op, mode, &value))
> + return true;
> +
>return easy_altivec_constant (op, mode);
>  }
> 
> diff --git a/gcc/config/rs6000/rs6000-protos.h 
> b/gcc/config/rs6000/rs6000-protos.h
> index 1a4fc1df668..ba39a73abf8 100644
> --- a/gcc/config/rs6000/rs6000-protos.h
> +++ b/gcc/config/rs6000/rs6000-protos.h
> @@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, 
> rtx, int, int, int,
> 
>  extern int easy_altivec_constant (rtx, machine_mode);
>  extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
> +extern bool vspltisw_vupkhsw_constant_p (rtx, machine_mode, int *);

Use "vspltisw_vupkhsw_constant_p (rtx, machine_mode, int * = nullptr);"; the last
argument can be optional (nullptr if it's not specified).
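
To illustrate what the predicate has to accept, here is a small C model of the check and of the two-instruction synthesis described in the patch. All names are hypothetical and nothing here is rs6000.cc code. The model assumes vspltisw splats a 5-bit signed immediate into four words and vupkhsw sign-extends two of those words into doublewords; since all words are equal after the splat, which half counts as "high" does not matter. C has no default arguments, so the optional out-parameter suggested above is modelled by allowing NULL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if both elements are the same value in [-16, 15], excluding 0 and -1
   (all-zeros and all-ones presumably already have cheaper forms).
   VALUE may be NULL when the caller does not need the constant.  */
bool small_v2di_splat_p(const int64_t elt[2], int *value)
{
    if (elt[0] != elt[1])
        return false;
    if (elt[0] < -16 || elt[0] > 15)
        return false;
    if (elt[0] == 0 || elt[0] == -1)
        return false;
    if (value != NULL)
        *value = (int)elt[0];
    return true;
}

/* Model of vspltisw: replicate the immediate into four 32-bit words.  */
void model_vspltisw(int imm, int32_t word[4])
{
    for (int i = 0; i < 4; i++)
        word[i] = imm;
}

/* Model of vupkhsw: sign-extend two words into two doublewords.  */
void model_vupkhsw(const int32_t word[4], int64_t dword[2])
{
    dword[0] = (int64_t)word[0];
    dword[1] = (int64_t)word[1];
}

/* Build the V2DI constant with the two modelled instructions.  */
bool synthesize_v2di(const int64_t elt[2], int64_t out[2])
{
    int value;
    int32_t word[4];
    if (!small_v2di_splat_p(elt, &value))
        return false;
    model_vspltisw(value, word);
    model_vupkhsw(word, out);
    return true;
}
```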

>  extern int vspltis_shifted (rtx);
>  extern HOST_WIDE_INT const_vector_