[PATCH] config.gcc: Support mips*64*-linux-muslabi64 as ABI64 by default

2024-09-22 Thread YunQiang Su
LLVM has introduced support for this triple.  Let's sync with it.

gcc
* config.gcc: Add mips*64*-linux-muslabi64 triple support.
---
 gcc/config.gcc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index f09ce9f63a0..848fe7da717 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -2684,7 +2684,7 @@ mips*-*-linux*)   # Linux MIPS, either endian.
default_mips_arch=mips64r2
enable_mips_multilibs="yes"
;;
-   mips64*-*-linux-gnuabi64 | mipsisa64*-*-linux-gnuabi64)
+   mips64*-*-linux-gnuabi64 | mipsisa64*-*-linux-gnuabi64 | mips*64*-linux-muslabi64)
default_mips_abi=64
enable_mips_multilibs="yes"
;;
-- 
2.45.2
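
To see which target triples the patched `case` arm would pick up, here is a minimal sketch that checks a triple against the three patterns from the diff. It assumes POSIX `fnmatch` with no flags behaves closely enough to the shell's `case` matching for these patterns; the function name `selects_abi64_by_default` is hypothetical and not part of config.gcc.

```cpp
#include <fnmatch.h>  // POSIX glob matching, assumed close to shell `case` patterns
#include <string>

// Returns true if `triple` matches one of the patterns on the patched
// `case` arm, i.e. the arm that sets default_mips_abi=64.
bool selects_abi64_by_default(const std::string &triple) {
  static const char *const kPatterns[] = {
      "mips64*-*-linux-gnuabi64",
      "mipsisa64*-*-linux-gnuabi64",
      "mips*64*-linux-muslabi64",  // pattern added by the patch
  };
  for (const char *pattern : kPatterns)
    if (fnmatch(pattern, triple.c_str(), 0) == 0)
      return true;
  return false;
}
```

With this sketch, a triple such as `mips64el-unknown-linux-muslabi64` matches the new pattern, while `mips64el-unknown-linux-muslabin32` does not.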



[PATCH] Git ignores .vscode

2024-09-05 Thread YunQiang Su
ChangeLog
* .gitignore: Add .vscode.
---
 .gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitignore b/.gitignore
index 93a16b0b950..f044fe16b5f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -38,6 +38,7 @@ cscope.out
 
 .local.vimrc
 .lvimrc
+.vscode
 
 .clang-format
 .clang-tidy
-- 
2.39.3 (Apple Git-146)



[PATCH] RISC-V: Fix out of index in riscv_select_multilib_by_abi

2024-09-05 Thread YunQiang Su
commit b5c2aae48723c9098a8a3dab1409b30fd87bbf56
Author: YunQiang Su 
Date:   Thu Sep 5 15:14:43 2024 +0800

RISC-V: Lookup reversely in riscv_select_multilib_by_abi

The reverse loop added by that commit starts at multilib_infos.size (),
which is one past the end.  The last element has index
   multilib_infos.size () - 1

gcc
* common/config/riscv/riscv-common.cc(riscv_select_multilib_by_abi):
Fix out of index problem.
---
 gcc/common/config/riscv/riscv-common.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 2c1ce7fc7cb..bd42fd01532 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -2079,7 +2079,7 @@ riscv_select_multilib_by_abi (
   const std::string &riscv_current_abi_str,
   const std::vector &multilib_infos)
 {
-  for (ssize_t i = multilib_infos.size (); i >= 0; --i)
+  for (ssize_t i = multilib_infos.size () - 1; i >= 0; --i)
 if (riscv_current_abi_str == multilib_infos[i].abi_str)
   return xstrdup (multilib_infos[i].path.c_str ());
 
-- 
2.39.3 (Apple Git-146)
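
The off-by-one is easier to see in isolation. The following is a minimal sketch of a reverse lookup, not the GCC implementation: `MultilibInfo` and `select_multilib_by_abi` are hypothetical stand-ins, `std::ptrdiff_t` replaces `ssize_t` to stay within standard C++, and an empty string stands in for "no match".

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for GCC's multilib record; only the two fields
// used by the loop in the patch are modelled.
struct MultilibInfo {
  std::string abi_str;
  std::string path;
};

// Search from the last entry to the first and return the path of the first
// ABI match, or an empty string when nothing matches.
std::string select_multilib_by_abi(const std::string &abi,
                                   const std::vector<MultilibInfo> &infos) {
  // Start at size() - 1, the last valid index.  Starting at size() would
  // read one element past the end on the first iteration.
  for (std::ptrdiff_t i = static_cast<std::ptrdiff_t>(infos.size()) - 1;
       i >= 0; --i) {
    const MultilibInfo &info = infos[static_cast<std::size_t>(i)];
    if (info.abi_str == abi)
      return info.path;
  }
  return "";
}
```

A signed index is what lets `i >= 0` terminate the loop; with an unsigned index the condition would always hold. With a fallback entry `{"lp64d", "."}` listed first and `{"lp64d", "lib64/lp64d"}` listed last, the reverse search returns `lib64/lp64d`.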



Re: [PATCH] RISC-V: Lookup reversely in riscv_select_multilib_by_abi

2024-09-05 Thread YunQiang Su
Kito Cheng wrote on Thu, Sep 5, 2024 at 16:36:
>
> LGTM, thanks for catching this, but commit log seems not right?
> should it be -print-multi-directory or -print-multi-os-directory
> rather than --print-multilib-os-dir?

Yes, it is a typo.
I used `--print-multilib-os-dir`, and, as you said, `-print-multi-directory`
has the same problem.

> (I guess should be -print-multi-directory per your output)
>
> Anyway, you can go ahead and push that after the fix:)
>
>
> On Thu, Sep 5, 2024 at 3:30 PM YunQiang Su  wrote:
> >
> > From: YunQiang Su 
> >
> > When use --print-multilib-os-dir, gcc outputs different value
> > with full -march option and the base one only.
> >
> > $ ./gcc/xgcc --print-multilib-os-dir -mabi=lp64d -march=rv64gc
> > lib64/lp64d
> >
> > $ ./gcc/xgcc --print-multilib-os-dir -mabi=lp64d -march=rv64gc_zba
> > .
> >
> > The reason is that in multilib.h, the fallback value of multilib
> > is listed as the 1st one in `multilib_raw[]`.
> >
> > gcc
> > * common/config/riscv/riscv-common.cc(riscv_select_multilib_by_abi):
> > look up reversely as the fallback path is listed as the 1st one.
> > ---
> >  gcc/common/config/riscv/riscv-common.cc | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
> > index 62c6e1dab1f..2c1ce7fc7cb 100644
> > --- a/gcc/common/config/riscv/riscv-common.cc
> > +++ b/gcc/common/config/riscv/riscv-common.cc
> > @@ -2079,7 +2079,7 @@ riscv_select_multilib_by_abi (
> >const std::string &riscv_current_abi_str,
> >const std::vector &multilib_infos)
> >  {
> > -  for (size_t i = 0; i < multilib_infos.size (); ++i)
> > +  for (ssize_t i = multilib_infos.size (); i >= 0; --i)
> >  if (riscv_current_abi_str == multilib_infos[i].abi_str)
> >return xstrdup (multilib_infos[i].path.c_str ());
> >
> > --
> > 2.39.3 (Apple Git-146)
> >


[PATCH] RISC-V: Lookup reversely in riscv_select_multilib_by_abi

2024-09-05 Thread YunQiang Su
From: YunQiang Su 

When --print-multilib-os-dir is used, gcc prints different values
for a full -march option and for the base one only.

$ ./gcc/xgcc --print-multilib-os-dir -mabi=lp64d -march=rv64gc
lib64/lp64d

$ ./gcc/xgcc --print-multilib-os-dir -mabi=lp64d -march=rv64gc_zba
.

The reason is that in multilib.h, the fallback multilib entry
is listed first in `multilib_raw[]`, so a forward search finds it
before the more specific entry.

gcc
* common/config/riscv/riscv-common.cc(riscv_select_multilib_by_abi):
look up reversely as the fallback path is listed as the 1st one.
---
 gcc/common/config/riscv/riscv-common.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 62c6e1dab1f..2c1ce7fc7cb 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -2079,7 +2079,7 @@ riscv_select_multilib_by_abi (
   const std::string &riscv_current_abi_str,
   const std::vector &multilib_infos)
 {
-  for (size_t i = 0; i < multilib_infos.size (); ++i)
+  for (ssize_t i = multilib_infos.size (); i >= 0; --i)
 if (riscv_current_abi_str == multilib_infos[i].abi_str)
   return xstrdup (multilib_infos[i].path.c_str ());
 
-- 
2.39.3 (Apple Git-146)



Re: [PATCH] MIPS: Add some floating point instructions support for MIPSr6

2024-09-02 Thread YunQiang Su
Jie Mei wrote on Fri, Jul 26, 2024 at 14:50:
>
> This patch adds some floating point instructions from mips32r6,
> for instance, MINA/MAXA.fmt, RINT.fmt, CLASS.fmt etc.
>
> Also add built-in functions to MIPSr6 to better handle tests
> for MIPSr6.
>
> gcc/ChangeLog:
>
> * config/mips/i6400.md (i6400_fpu_minmax): Include
> fclass type.
> (i6400_fpu_fadd): Include frint type.
> * config/mips/mips.cc (AVAIL_NON_MIPS16): Add an entry
> for __builtin_mipsr6_xxx.

Since libc has fmaxmag/fminmag, maybe we should use __builtin_fmaxmag.
`rint`/`rintf` also exist in libc.


> (MIPSR6_BUILTIN_PURE): Same as above.
> (CODE_FOR_mipsr6_min_a_s, CODE_FOR_mipsr6_min_a_d)
> (CODE_FOR_mipsr6_max_a_s, CODE_FOR_mipsr6_max_a_d)
> (CODE_FOR_mipsr6_rint_s, CODE_FOR_mipsr6_rint_d)
> (CODE_FOR_mipsr6_class_s, CODE_FOR_mipsr6_class_d):
> New code_aliasing macros.
> (mips_builtins): Add mips32r6 min_a_s, min_a_d, max_a_s,
> max_a_d, rint_s, rint_d, class_s, class_d builtins.
> * config/mips/mips.h (ISA_HAS_FRINT): Define a new macro.
> (ISA_HAS_FCLASS): Same as above.
> * config/mips/mips.md (UNSPEC_FRINT): New unspec.
> (UNSPEC_FCLASS): Same as above.
> (type): Add frint and fclass.
> (fmin_a_): Generates MINA.fmt instructions.
> (fmax_a_): Generates MAXA.fmt instructions.
> (frint_): Generates RINT.fmt instructions.
> (fclass_): Generates CLASS.fmt instructions.
> * config/mips/p6600.md (p6600_fpu_fadd): Include
> frint type.
> (p6600_fpu_fabs): Include fclass type.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/mips/mips-class.c: New tests for MIPSr6
> * gcc.target/mips/mips-minamaxa.c: Same as above.
> * gcc.target/mips/mips-rint.c: Same as above.
> ---
>  gcc/config/mips/i6400.md  |  8 +--
>  gcc/config/mips/mips.cc   | 28 ++
>  gcc/config/mips/mips.h|  4 ++
>  gcc/config/mips/mips.md   | 52 ++-
>  gcc/config/mips/p6600.md  |  8 +--
>  gcc/testsuite/gcc.target/mips/mips-class.c| 17 ++
>  gcc/testsuite/gcc.target/mips/mips-minamaxa.c | 31 +++
>  gcc/testsuite/gcc.target/mips/mips-rint.c | 17 ++
>  8 files changed, 155 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/mips/mips-class.c
>  create mode 100644 gcc/testsuite/gcc.target/mips/mips-minamaxa.c
>  create mode 100644 gcc/testsuite/gcc.target/mips/mips-rint.c
>
> diff --git a/gcc/config/mips/i6400.md b/gcc/config/mips/i6400.md
> index d6f691ee217..48ce980e1c2 100644
> --- a/gcc/config/mips/i6400.md
> +++ b/gcc/config/mips/i6400.md
> @@ -219,16 +219,16 @@
> (eq_attr "type" "fabs,fneg,fmove"))
>"i6400_fpu_short, i6400_fpu_apu")
>
> -;; min, max
> +;; min, max, fclass
>  (define_insn_reservation "i6400_fpu_minmax" 2
>(and (eq_attr "cpu" "i6400")
> -   (eq_attr "type" "fminmax"))
> +   (eq_attr "type" "fminmax,fclass"))
>"i6400_fpu_short+i6400_fpu_logic")
>
> -;; fadd, fsub, fcvt
> +;; fadd, fsub, fcvt, frint
>  (define_insn_reservation "i6400_fpu_fadd" 4
>(and (eq_attr "cpu" "i6400")
> -   (eq_attr "type" "fadd,fcvt"))
> +   (eq_attr "type" "fadd,fcvt,frint"))
>"i6400_fpu_long, i6400_fpu_apu")
>
>  ;; fmul
> diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
> index 6c797b62164..14a1f23eb70 100644
> --- a/gcc/config/mips/mips.cc
> +++ b/gcc/config/mips/mips.cc
> @@ -15775,6 +15775,7 @@ AVAIL_NON_MIPS16 (dspr2_32, !TARGET_64BIT && TARGET_DSPR2)
>  AVAIL_NON_MIPS16 (loongson, TARGET_LOONGSON_MMI)
>  AVAIL_MIPS16E2_OR_NON_MIPS16 (cache, TARGET_CACHE_BUILTIN)
>  AVAIL_NON_MIPS16 (msa, TARGET_MSA)
> +AVAIL_NON_MIPS16 (r6, mips_isa_rev >= 6)
>
>  /* Construct a mips_builtin_description from the given arguments.
>
> @@ -15940,6 +15941,14 @@ AVAIL_NON_MIPS16 (msa, TARGET_MSA)
>  "__builtin_msa_" #INSN,  MIPS_BUILTIN_DIRECT_NO_TARGET,\
>  FUNCTION_TYPE, mips_builtin_avail_msa, false }
>
> +/* Define a MIPSr6 MIPS_BUILTIN_DIRECT pure function __builtin_mipsr6_
> +   for instruction CODE_FOR_mipsr6_.  FUNCTION_TYPE is a builtin_description
> +   field.  */
> +#define MIPSR6_BUILTIN_PURE(INSN, FUNCTION_TYPE)   \
> +{ CODE_FOR_mipsr6_ ## INSN, MIPS_FP_COND_f,    \
> +"__builtin_mipsr6_" #INSN,  MIPS_BUILTIN_DIRECT,   \
> +FUNCTION_TYPE, mips_builtin_avail_r6, true }
> +
>  #define CODE_FOR_mips_sqrt_ps CODE_FOR_sqrtv2sf2
>  #define CODE_FOR_mips_addq_ph CODE_FOR_addv2hi3
>  #define CODE_FOR_mips_addu_qb CODE_FOR_addv4qi3
> @@ -16177,6 +16186,15 @@ AVAIL_NON_MIPS16 (msa, TARGET_MSA)
>  #define CODE_FOR_msa_ldi_w CODE_FOR_msa_ldiv4si
>  #define CODE_FOR_msa_ldi_d CODE_FOR_msa_ldiv2di
>
> +#define CODE_FOR_mipsr6_min_a_s CODE_FOR_fmin_a_sf
> +#define CODE

Re: [PING] [PATCH V3 07/10] mips: Adjust dot-product backend patterns

2024-08-28 Thread YunQiang Su
Victor Do Nascimento wrote on Wed, Aug 28, 2024 at 23:15:
>
> Hello,
>
> Gentle reminder for this simple renaming patch :)
>

Approved, but it would be better if we could add a test case for it.

> Thanks,
> Victor
>
>
> On 8/15/24 09:44, Victor Do Nascimento wrote:
> > Following the migration of the dot_prod optab from a direct to a
> > conversion-type optab, ensure all back-end patterns incorporate the
> > second machine mode into pattern names.
> >
> > gcc/ChangeLog:
> >
> >   * config/mips/loongson-mmi.md (sdot_prodv4hi): Renamed to...
> >   (sdot_prodv2siv4hi): ...this.
> > ---
> >   gcc/config/mips/loongson-mmi.md | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/gcc/config/mips/loongson-mmi.md b/gcc/config/mips/loongson-mmi.md
> > index dd166bfa4c9..4d958730139 100644
> > --- a/gcc/config/mips/loongson-mmi.md
> > +++ b/gcc/config/mips/loongson-mmi.md
> > @@ -394,7 +394,7 @@ (define_insn "loongson_pmaddhw"
> > "pmaddhw\t%0,%1,%2"
> > [(set_attr "type" "fmul")])
> >
> > -(define_expand "sdot_prodv4hi"
> > +(define_expand "sdot_prodv2siv4hi"
> > [(match_operand:V2SI 0 "register_operand" "")
> >  (match_operand:V4HI 1 "register_operand" "")
> >  (match_operand:V4HI 2 "register_operand" "")


[PATCH] MIPS: Support vector reduc for MSA

2024-08-26 Thread YunQiang Su
From: YunQiang Su 

We have SHF.fmt and HADD_S/U.fmt with MSA, which can be used for
vector reduction.

For min/max of U8/S8, we can use
SHF.B W1, W0, 0xb1  # swap the bytes within each halfword
MIN.B W1, W1, W0
SHF.H W2, W1, 0xb1  # swap the halfwords within each word
MIN.B W2, W2, W1
SHF.W W3, W2, 0xb1  # swap the words within each doubleword
MIN.B W3, W3, W2
SHF.W W4, W3, 0x4e  # swap the two doublewords
MIN.B W4, W4, W3

For plus of S8/U8, we can use HADD
HADD.H  W0, W0, W0
HADD.W  W0, W0, W0
HADD.D  W0, W0, W0
SHF.W   W1, W0, 0x4e  # swap the two doublewords
ADDV.D  W1, W1, W0
COPY_S.B  T0, W1  # COPY_U.B for U8

We can do the same for S16/U16/S32/U32/S64/U64/FLOAT/DOUBLE.
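
The shuffle-then-MIN sequence is a log-step (butterfly) reduction. As a rough scalar model, and not the code the patch emits, each step can be read as "combine lane i with lane i XOR stride" for strides 1, 2, 4 and 8; the names `V16`, `butterfly_min_step` and `reduc_smin_model` below are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

using V16 = std::array<std::int8_t, 16>;  // models one 128-bit vector of S8 lanes

// One "shuffle, then MIN" step: lane i is combined with lane i ^ stride,
// which is what swapping neighbours of width `stride` achieves.
V16 butterfly_min_step(const V16 &v, std::size_t stride) {
  V16 out{};
  for (std::size_t i = 0; i < v.size(); ++i)
    out[i] = std::min(v[i], v[i ^ stride]);
  return out;
}

// After the steps with stride 1, 2, 4 and 8, every lane holds the minimum
// of all 16 inputs, so lane 0 can be extracted as the scalar result.
std::int8_t reduc_smin_model(V16 v) {
  for (std::size_t stride = 1; stride < v.size(); stride *= 2)
    v = butterfly_min_step(v, stride);
  return v[0];
}
```

Four steps suffice for 16 lanes because each step doubles the number of inputs that every lane has seen.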

gcc

* config/mips/mips-msa.md: (MSA_NO_HADD): we have HADD for
S8/U8/S16/U16/S32/U32 only.
reduc_smin_scal_: New define pattern.
reduc_smax_scal_: Ditto.
reduc_umin_scal_: Ditto.
reduc_umax_scal_: Ditto.
reduc_plus_scal_: Ditto.
reduc_plus_scal_v4si: Ditto.
reduc_plus_scal_v8hi: Ditto.
reduc_plus_scal_v16qi: Ditto.
reduc__scal_: Ditto.
* config/mips/mips-protos.h: New function mips_expand_msa_reduc.
* config/mips/mips.cc: New function mips_expand_msa_reduc.
* config/mips/mips.md: Define any_bitwise iterator.

gcc/testsuite:

gcc.target/mips/msa-reduc.c: New tests.
---
 gcc/config/mips/mips-msa.md   | 128 ++
 gcc/config/mips/mips-protos.h |   1 +
 gcc/config/mips/mips.cc   |  41 +++
 gcc/config/mips/mips.md   |   4 +
 gcc/testsuite/gcc.target/mips/msa-reduc.c | 119 
 5 files changed, 293 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/mips/msa-reduc.c

diff --git a/gcc/config/mips/mips-msa.md b/gcc/config/mips/mips-msa.md
index 377c63f0d35..976f296402e 100644
--- a/gcc/config/mips/mips-msa.md
+++ b/gcc/config/mips/mips-msa.md
@@ -125,6 +125,9 @@ (define_mode_iterator IMSA_WH  [V4SI V8HI])
 ;; Only floating-point modes.
 (define_mode_iterator FMSA [V2DF V4SF])
 
+;; Only used for reduce_plus_scal: V4SI, V8HI, V16QI have HADD.
+(define_mode_iterator MSA_NO_HADD [V2DF V4SF V2DI])
+
 ;; The attribute gives the integer vector mode with same size.
 (define_mode_attr VIMODE
   [(V2DF "V2DI")
@@ -2802,3 +2805,128 @@ (define_insn "msa__v_"
   (set_attr "mode" "TI")
   (set_attr "compact_form" "never")
   (set_attr "branch_likely" "no")])
+
+
+;; Vector reduction operation
+(define_expand "reduc_smin_scal_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:MSA 1 "register_operand")]
+  "ISA_HAS_MSA"
+{
+  rtx tmp = gen_reg_rtx (mode);
+  mips_expand_msa_reduc (gen_smin3, tmp, operands[1]);
+  emit_insn (gen_vec_extract (operands[0], tmp,
+ const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_smax_scal_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:MSA 1 "register_operand")]
+  "ISA_HAS_MSA"
+{
+  rtx tmp = gen_reg_rtx (mode);
+  mips_expand_msa_reduc (gen_smax3, tmp, operands[1]);
+  emit_insn (gen_vec_extract (operands[0], tmp,
+ const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_umin_scal_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:IMSA 1 "register_operand")]
+  "ISA_HAS_MSA"
+{
+  rtx tmp = gen_reg_rtx (mode);
+  mips_expand_msa_reduc (gen_umin3, tmp, operands[1]);
+  emit_insn (gen_vec_extract (operands[0], tmp,
+ const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_umax_scal_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:IMSA 1 "register_operand")]
+  "ISA_HAS_MSA"
+{
+  rtx tmp = gen_reg_rtx (mode);
+  mips_expand_msa_reduc (gen_umax3, tmp, operands[1]);
+  emit_insn (gen_vec_extract (operands[0], tmp,
+ const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_plus_scal_"
+  [(match_operand: 0 "register_operand")
+   (match_operand:MSA_NO_HADD 1 "register_operand")]
+  "ISA_HAS_MSA"
+{
+  rtx tmp = gen_reg_rtx (mode);
+  mips_expand_msa_reduc (gen_add3, tmp, operands[1]);
+  emit_insn (gen_vec_extract (operands[0], tmp,
+ const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_plus_scal_v4si"
+  [(match_operand:SI 0 "register_operand")
+   (match_operand:V4SI 1 "register_operand")]
+  "ISA_HAS_MSA"
+{
+  rtx tmp = gen_reg_rtx (SImode);
+  rtx tmp1 = gen_reg_rtx (V2DImode);
+  emit_insn (gen_msa_hadd_s_d (tmp1, operands[1], operands[1]));

[PATCH] MIPS: Include missing mips16.S in libgcc/lib1funcs.S

2024-08-23 Thread YunQiang Su
The include of mips16.S has been missing since
commit 29b74545531f6afbee9fc38c267524326dbfbedf
Date:   Thu Jun 1 10:14:24 2023 +0800

MIPS: Add speculation_barrier support

Without mips16.S included, some symbols are missing for mips16, so
some software fails to build.

libgcc/ChangeLog:

* config/mips/lib1funcs.S: Include mips16.S.
---
 libgcc/config/mips/lib1funcs.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libgcc/config/mips/lib1funcs.S b/libgcc/config/mips/lib1funcs.S
index fa8114b37d9..324a84e7846 100644
--- a/libgcc/config/mips/lib1funcs.S
+++ b/libgcc/config/mips/lib1funcs.S
@@ -19,7 +19,7 @@ a copy of the GCC Runtime Library Exception along with this program;
 see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 .  */
 
-//#include "mips16.S"
+#include "mips16.S"
 
 #ifdef L_speculation_barrier
 
-- 
2.39.3 (Apple Git-146)



Re: [PATCH] Build/Cross: Look for target headers from include if sys-include doesn't exist

2024-08-20 Thread YunQiang Su
YunQiang Su wrote on Fri, Jul 5, 2024 at 16:14:
>
> Ping again.

Ping.


Re: [PATCH 2/2] RISC-V: Allow uninitialized preferred_else_value for RVV

2024-07-11 Thread YunQiang Su
Richard Biener wrote on Thu, Jul 11, 2024 at 20:21:
>
> On Thu, Jul 11, 2024 at 2:13 PM YunQiang Su  wrote:
> >
> > From: YunQiang Su 
> >
> > PR target/115840.
> >
> > In riscv_preferred_else_value, we create an uninitialized tmp var
> > for else value, instead of the 0 (as default_preferred_else_value)
> > or the pre-exists VAR (as aarch64 does), so that we can use agnostic
> > policy.
> >
> > The problem is that `warn_uninit` will emit a warning:
> >   ({anonymous})’ may be used uninitialized
> >
> > Let's mark this tmp var as "allow_uninitialized".
> >
> > This problem is found when I try to build glibc with V extension.
> >
> > gcc
> > PR target/115840.
> > * config/riscv/riscv.cc(riscv_preferred_else_value): Mark
> > tmp_var as allow_unitialized.
> >
> > gcc/testsuite
> > * gcc.dg/vect/pr115840.c: New testcase.
> > ---
> >  gcc/config/riscv/riscv.cc|  6 +-
> >  gcc/testsuite/gcc.dg/vect/pr115840.c | 11 +++
> >  2 files changed, 16 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/vect/pr115840.c
> >
> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > index 61fa74e9322..08159d7cbbc 100644
> > --- a/gcc/config/riscv/riscv.cc
> > +++ b/gcc/config/riscv/riscv.cc
> > @@ -11431,7 +11431,11 @@ riscv_preferred_else_value (unsigned ifn, tree vectype, unsigned int nops,
> > tree *ops)
> >  {
> >if (riscv_v_ext_mode_p (TYPE_MODE (vectype)))
> > -return get_or_create_ssa_default_def (cfun, create_tmp_var (vectype));
> > +{
> > +  tree tmp_var = create_tmp_var (vectype);
> > +  TREE_ALLOW_UNINITIALIZED (tmp_var) = 1;
>
> Does it work when you do
>
>  TREE_NO_WARNING (tmp_var) = 1;
>

Thanks.  It works.  I did notice it, but I was worried that
TREE_NO_WARNING might also suppress other, unrelated warnings.

> ?
>
> > +  return get_or_create_ssa_default_def (cfun, tmp_var);
> > +}
> >
> >return default_preferred_else_value (ifn, vectype, nops, ops);
> >  }
> > diff --git a/gcc/testsuite/gcc.dg/vect/pr115840.c b/gcc/testsuite/gcc.dg/vect/pr115840.c
> > new file mode 100644
> > index 000..09dc9e4eb7c
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/vect/pr115840.c
> > @@ -0,0 +1,11 @@
> > +/* { dg-do compile } */
> > +/* { dg-additional-options "-Wall -Werror" } */
> > +
> > +double loads[16];
> > +
> > +void
> > +foo (double loadavg[], int count)
> > +{
> > +  for (int i = 0; i < count; i++)
> > +loadavg[i] = loads[i] / 1.5;
> > +}
> > --
> > 2.45.1
> >


[PATCH v2] RISC-V: NO_WARNING preferred else value for RVV

2024-07-11 Thread YunQiang Su
From: YunQiang Su 

PR target/115840.

In riscv_preferred_else_value, we create an uninitialized tmp var
for the else value, instead of 0 (as default_preferred_else_value does)
or the pre-existing VAR (as aarch64 does), so that we can use the
agnostic policy.

The problem is that `warn_uninit` will emit a warning:
  '({anonymous})' may be used uninitialized

Let's mark this tmp var as NO_WARNING.

This problem was found when I tried to build glibc with the V extension.

gcc
PR target/115840.
* config/riscv/riscv.cc(riscv_preferred_else_value): Mark
tmp_var as NO_WARNING.

gcc/testsuite
* gcc.dg/vect/pr115840.c: New testcase.
---
 gcc/config/riscv/riscv.cc|  6 +-
 gcc/testsuite/gcc.dg/vect/pr115840.c | 11 +++
 2 files changed, 16 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr115840.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 61fa74e9322..276998a992b 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -11431,7 +11431,11 @@ riscv_preferred_else_value (unsigned ifn, tree vectype, unsigned int nops,
tree *ops)
 {
   if (riscv_v_ext_mode_p (TYPE_MODE (vectype)))
-return get_or_create_ssa_default_def (cfun, create_tmp_var (vectype));
+{
+  tree tmp_var = create_tmp_var (vectype);
+  TREE_NO_WARNING (tmp_var) = 1;
+  return get_or_create_ssa_default_def (cfun, tmp_var);
+}
 
   return default_preferred_else_value (ifn, vectype, nops, ops);
 }
diff --git a/gcc/testsuite/gcc.dg/vect/pr115840.c b/gcc/testsuite/gcc.dg/vect/pr115840.c
new file mode 100644
index 000..09dc9e4eb7c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr115840.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Wall -Werror" } */
+
+double loads[16];
+
+void
+foo (double loadavg[], int count)
+{
+  for (int i = 0; i < count; i++)
+loadavg[i] = loads[i] / 1.5;
+}
-- 
2.45.1
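
To illustrate what the "else value" means here, the following is a small scalar model of a conditional multiply. It is only a sketch: `cond_mul` is a hypothetical name, plain `int` lanes stand in for vector elements, and the model says nothing about how RVV hardware implements the agnostic or undisturbed policies.

```cpp
#include <cstddef>
#include <vector>

// Scalar model of a conditional multiply: active lanes get a[i] * b[i],
// inactive lanes take the caller-supplied else value.
std::vector<int> cond_mul(const std::vector<bool> &mask,
                          const std::vector<int> &a,
                          const std::vector<int> &b,
                          const std::vector<int> &else_value) {
  std::vector<int> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i)
    out[i] = mask[i] ? a[i] * b[i] : else_value[i];
  return out;
}
```

A consumer that only reads the active lanes sees the same result whatever the else value is. That is why leaving the else value undefined, rather than forcing 0 or an existing variable, gives the backend room to choose the agnostic policy.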



[PATCH 1/2] Add allow_uninitialized to tree_base.u.bits for VAR_DECL

2024-07-11 Thread YunQiang Su
From: YunQiang Su 

An uninitialized internal temporary variable may be useful in some cases,
such as for COND_LEN_MUL etc. on RISC-V with the V extension: if a
constant or a pre-existing VAR is used, we have to use the "undisturbed"
policy; if an uninitialized VAR is used, we can use "agnostic".
With "agnostic", the microarchitecture can omit copying part of
the VAR.

gcc
* tree-core.h(tree_base): Add u.bits.allow_uninitialized.
* tree.h: Add new macro TREE_ALLOW_UNINITIALIZED.
* tree-ssa-uninit.cc(warn_uninit): Don't warn if VAR is
marked as allow_uninitialized.
---
 gcc/tree-core.h| 5 -
 gcc/tree-ssa-uninit.cc | 4 
 gcc/tree.h | 4 
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 27c569c7702..984201199f6 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1101,7 +1101,10 @@ struct GTY(()) tree_base {
   unsigned nameless_flag : 1;
   unsigned atomic_flag : 1;
   unsigned unavailable_flag : 1;
-  unsigned spare0 : 2;
+  /* Don't warn if uninitialized.  RISC-V V has tail agnostic/undisturbed
+policies, which may benefit from an uninitialized var.  */
+  unsigned allow_uninitialized : 1;
+  unsigned spare0 : 1;
 
   unsigned spare1 : 8;
 
diff --git a/gcc/tree-ssa-uninit.cc b/gcc/tree-ssa-uninit.cc
index 726684e472a..12861e1dbc9 100644
--- a/gcc/tree-ssa-uninit.cc
+++ b/gcc/tree-ssa-uninit.cc
@@ -142,6 +142,10 @@ warn_uninit (opt_code opt, tree t, tree var, gimple *context,
   if (!has_undefined_value_p (t))
 return;
 
+  /* VAR may mark itself as allow_uninitialized.  */
+  if (TREE_ALLOW_UNINITIALIZED (var))
+return;
+
   /* Ignore COMPLEX_EXPR as initializing only a part of a complex
  turns in a COMPLEX_EXPR with the not initialized part being
  set to its previous (undefined) value.  */
diff --git a/gcc/tree.h b/gcc/tree.h
index 28e8e71b036..381780fde2e 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -3311,6 +3311,10 @@ extern void decl_fini_priority_insert (tree, priority_type);
 #define VAR_DECL_IS_VIRTUAL_OPERAND(NODE) \
   (VAR_DECL_CHECK (NODE)->base.u.bits.saturating_flag)
 
+/* In a VAR_DECL, nonzero if NODE is allowed to be uninitialized.  */
+#define TREE_ALLOW_UNINITIALIZED(NODE) \
+  (VAR_DECL_CHECK (NODE)->base.u.bits.allow_uninitialized)
+
 /* In a VAR_DECL, nonzero if this is a non-local frame structure.  */
 #define DECL_NONLOCAL_FRAME(NODE)  \
   (VAR_DECL_CHECK (NODE)->base.default_def_flag)
-- 
2.45.1
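
The patch takes one bit out of `spare0` instead of adding a new field. As a simplified sketch of why that keeps the structure size unchanged, the two structs below model only the handful of bit-fields visible in the diff; `BitsBefore`, `BitsAfter` and `warning_suppressed` are hypothetical names, and the real `tree_base` has many more members.

```cpp
// Simplified model of the flag bits before the patch.
struct BitsBefore {
  unsigned nameless_flag : 1;
  unsigned atomic_flag : 1;
  unsigned unavailable_flag : 1;
  unsigned spare0 : 2;
  unsigned spare1 : 8;
};

// Same bits after the patch: one spare bit becomes allow_uninitialized.
struct BitsAfter {
  unsigned nameless_flag : 1;
  unsigned atomic_flag : 1;
  unsigned unavailable_flag : 1;
  unsigned allow_uninitialized : 1;  // carved out of spare0
  unsigned spare0 : 1;
  unsigned spare1 : 8;
};

// The total bit count is unchanged, so the size should not grow.
static_assert(sizeof(BitsBefore) == sizeof(BitsAfter),
              "reusing a spare bit must not change the size");

// Models the early return added to warn_uninit: a set flag means the
// uninitialized-use warning is skipped for this variable.
bool warning_suppressed(const BitsAfter &bits) {
  return bits.allow_uninitialized != 0;
}
```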



[PATCH 2/2] RISC-V: Allow uninitialized preferred_else_value for RVV

2024-07-11 Thread YunQiang Su
From: YunQiang Su 

PR target/115840.

In riscv_preferred_else_value, we create an uninitialized tmp var
for the else value, instead of 0 (as default_preferred_else_value does)
or the pre-existing VAR (as aarch64 does), so that we can use the
agnostic policy.

The problem is that `warn_uninit` will emit a warning:
  ‘({anonymous})’ may be used uninitialized

Let's mark this tmp var as "allow_uninitialized".

This problem was found when I tried to build glibc with the V extension.

gcc
PR target/115840.
* config/riscv/riscv.cc(riscv_preferred_else_value): Mark
tmp_var as allow_uninitialized.

gcc/testsuite
* gcc.dg/vect/pr115840.c: New testcase.
---
 gcc/config/riscv/riscv.cc|  6 +-
 gcc/testsuite/gcc.dg/vect/pr115840.c | 11 +++
 2 files changed, 16 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr115840.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 61fa74e9322..08159d7cbbc 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -11431,7 +11431,11 @@ riscv_preferred_else_value (unsigned ifn, tree vectype, unsigned int nops,
tree *ops)
 {
   if (riscv_v_ext_mode_p (TYPE_MODE (vectype)))
-return get_or_create_ssa_default_def (cfun, create_tmp_var (vectype));
+{
+  tree tmp_var = create_tmp_var (vectype);
+  TREE_ALLOW_UNINITIALIZED (tmp_var) = 1;
+  return get_or_create_ssa_default_def (cfun, tmp_var);
+}
 
   return default_preferred_else_value (ifn, vectype, nops, ops);
 }
diff --git a/gcc/testsuite/gcc.dg/vect/pr115840.c b/gcc/testsuite/gcc.dg/vect/pr115840.c
new file mode 100644
index 000..09dc9e4eb7c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr115840.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Wall -Werror" } */
+
+double loads[16];
+
+void
+foo (double loadavg[], int count)
+{
+  for (int i = 0; i < count; i++)
+loadavg[i] = loads[i] / 1.5;
+}
-- 
2.45.1



Re: [PATCH] Build/Cross: Look for target headers from include if sys-include doesn't exist

2024-07-05 Thread YunQiang Su
Ping again.


[PATCH v3] MIPS: Output $0 instead of 0 for conditional trap if one operand is zero

2024-07-05 Thread YunQiang Su
We have done so for MIPSr6, which removes support for conditional
traps with an immediate operand.  To be consistent, let's do so for
pre-R6 as well.

We also add 2 new tests:
1) be sure that $0 is used.
2) be sure we expand the conditional trap compare with a constant,
   instead of leaving it to GAS.

The reason we decided to do so for MIPSr6 is that we found a problem
with code like
.set noreorder
.set nomacro
teq $2,0
GAS expands it instead of converting it to `teq $2,$0`:
li  $3,0
teq $2,$3
This is wrong, as we asked for `nomacro`.

GCC works well with `teq $2,IMM` if IMM is not zero.  To be sure that
it stays so in the future, let's add a test for it.

gcc
* config/mips/mips.md(conditional_trap): Output $0 instead of
IMM0.

gcc/testsuite:
* gcc.target/mips/trap-compare-0.c: Testcase to be sure that
$0 is used instead of IMM0 for conditional trap.
* gcc.target/mips/trap-compare-imm-r6.c: Testcase to be sure
that we expand conditional trap compare with constant.
---
 gcc/config/mips/mips.md   |  2 +-
 .../gcc.target/mips/trap-compare-0.c  | 31 
 .../gcc.target/mips/trap-compare-imm-r6.c | 36 +++
 3 files changed, 68 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/trap-compare-0.c
 create mode 100644 gcc/testsuite/gcc.target/mips/trap-compare-imm-r6.c

diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index fd64d3d001a..591ae3cb438 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -1254,7 +1254,7 @@ (define_insn "*conditional_trap"
 (match_operand:GPR 2 "arith_operand" "dI")])
(const_int 0))]
   "ISA_HAS_COND_TRAPI"
-  "t%C0\t%z1,%2"
+  "t%C0\t%z1,%z2"
   [(set_attr "type" "trap")])
 
 ;;
diff --git a/gcc/testsuite/gcc.target/mips/trap-compare-0.c b/gcc/testsuite/gcc.target/mips/trap-compare-0.c
new file mode 100644
index 000..fb0078d34f1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/trap-compare-0.c
@@ -0,0 +1,31 @@
+/* Check that we use $0 instead of 0 in conditional trap.  */
+/* { dg-do compile } */
+/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */
+
+NOMIPS16
+void teq0 (int i) {
+  if (i == 0)
+__builtin_trap();
+}
+
+NOMIPS16
+void tne0 (int i) {
+  if (i != 0)
+__builtin_trap();
+}
+
+NOMIPS16
+void tge0 (int i) {
+  if (i >= 0)
+__builtin_trap();
+}
+NOMIPS16
+void tlt0 (int i) {
+  if (i < 0)
+__builtin_trap();
+}
+
+/* { dg-final { scan-assembler "teq0:.*\tteq\t\\\$4,\\\$0" } } */
+/* { dg-final { scan-assembler "tne0:.*\ttne\t\\\$4,\\\$0" } } */
+/* { dg-final { scan-assembler "tge0:.*\ttge\t\\\$4,\\\$0" } } */
+/* { dg-final { scan-assembler "tlt0:.*\ttlt\t\\\$4,\\\$0" } } */
diff --git a/gcc/testsuite/gcc.target/mips/trap-compare-imm-r6.c b/gcc/testsuite/gcc.target/mips/trap-compare-imm-r6.c
new file mode 100644
index 000..b12e40672d5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/trap-compare-imm-r6.c
@@ -0,0 +1,36 @@
+/* Check that no teq $2,imm macro is used for R6.  */
+/* { dg-do compile } */
+/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */
+/* { dg-options "isa_rev>=6" } */
+
+NOMIPS16
+void teq5 (int i) {
+  if (i == 5)
+__builtin_trap();
+}
+
+NOMIPS16
+void tne5 (int i) {
+  if (i != 5)
+__builtin_trap();
+}
+
+NOMIPS16
+void tge5 (int i) {
+  if (i >= 5)
+__builtin_trap();
+}
+NOMIPS16
+void tlt5 (int i) {
+  if (i < 5)
+__builtin_trap();
+}
+
+/* { dg-final { scan-assembler "teq5:.*\tli\t\\\$2,5.*\tteq\t\\\$4,\\\$2" } } */
+/* { dg-final { scan-assembler-not "teq5:.*\tteq\t\\\$4,5" } } */
+/* { dg-final { scan-assembler "tne5:.*\tli\t\\\$2,5.*\ttne\t\\\$4,\\\$2" } } */
+/* { dg-final { scan-assembler-not "tne5:.*\ttne\t\\\$4,5" } } */
+/* { dg-final { scan-assembler "tge5:.*\tli\t\\\$2,4.*\ttge\t\\\$2,\\\$4" } } */
+/* { dg-final { scan-assembler-not "tge5:.*\ttge\t\\\$4,5" } } */
+/* { dg-final { scan-assembler "tlt5:.*\tli\t\\\$2,4.*\ttge\t\\\$2,\\\$4" } } */
+/* { dg-final { scan-assembler-not "tlt5:.*\ttlt\t\\\$4,5" } } */
-- 
2.39.3 (Apple Git-146)
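
The one-character change from `%2` to `%z2` in the output template can be modelled outside GCC. The sketch below assumes, based on the patch description, that the `z` modifier prints `$0` for a constant zero and prints any other operand normally; `Operand`, `print_z` and `conditional_trap` are hypothetical names, not GCC functions.

```cpp
#include <string>

// Hypothetical, simplified operand: either a register or an immediate.
struct Operand {
  bool is_register;
  long value;  // register number, or the immediate value
};

// Model of printing an operand with the 'z' modifier: a constant zero is
// printed as the hard-wired zero register "$0".
std::string print_z(const Operand &op) {
  if (!op.is_register && op.value == 0)
    return "$0";
  return op.is_register ? "$" + std::to_string(op.value)
                        : std::to_string(op.value);
}

// Build the trap text for a condition suffix such as "eq" or "ne".
std::string conditional_trap(const std::string &cond, const Operand &lhs,
                             const Operand &rhs) {
  return "t" + cond + "\t" + print_z(lhs) + "," + print_z(rhs);
}
```

Under this model a compare of register 2 against constant 0 prints `$2,$0`, so the assembler never has to expand an immediate form, while a non-zero immediate such as 5 is still printed as-is.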



[PATCH] MIPS/testsuite: Fix umips-save-restore-1.c

2024-06-28 Thread YunQiang Su
With some recent optimizations, -O1/-O2/-O3 can achieve almost the same
performance/size with plain stack loads/stores.  Thus lwm/swm will
save/restore fewer callee-saved registers.  In fact, only $16 is saved
with swm.

To be sure that this optimization still exists, let's add 2 more
function calls, so that lwm/swm becomes much more profitable.

If we add only one more call, -O1 will still use stack loads/stores.

gcc/testsuite
* gcc.target/mips/umips-save-restore-1.c: Be sure lwm/swm
are used for more callee-saved registers with 2 additional
function calls.
---
 gcc/testsuite/gcc.target/mips/umips-save-restore-1.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/mips/umips-save-restore-1.c b/gcc/testsuite/gcc.target/mips/umips-save-restore-1.c
index ff1ea4b339a..0e2c4dcc844 100644
--- a/gcc/testsuite/gcc.target/mips/umips-save-restore-1.c
+++ b/gcc/testsuite/gcc.target/mips/umips-save-restore-1.c
@@ -7,12 +7,14 @@ int bar (int, int, int, int, int);
 MICROMIPS int
 foo (int n, int a, int b, int c, int d)
 {
-  int i, j;
+  int i, j, k, l;
 
   i = bar (n, a, b, c, d);
   j = bar (n, a, b, c, d);
-  return i + j;
+  k = bar (n, a, b, c, d);
+  l = bar (n, a, b, c, d);
+  return i + j + k + l;
 }
 
-/* { dg-final { scan-assembler "\tswm\t\\\$16-\\\$2(0|1),\\\$31" } } */
-/* { dg-final { scan-assembler "\tlwm\t\\\$16-\\\$2(0|1),\\\$31" } } */
+/* { dg-final { scan-assembler "\tswm\t\\\$16-\\\$2(2|3),\\\$31" } } */
+/* { dg-final { scan-assembler "\tlwm\t\\\$16-\\\$2(2|3),\\\$31" } } */
-- 
2.39.3 (Apple Git-146)



[PATCH] MIPS/testsuite: Add -mfpxx to call-clobbered-1.c

2024-06-27 Thread YunQiang Su
The scan-assembler-times rules only fit -mfp32 and -mfpxx.
The test fails if we are configured as FP64 by default, as it has
one fewer sdc1/ldc1 pair.

gcc/testsuite
* gcc.target/mips/call-clobbered-1.c: Add -mfpxx.
---
 gcc/testsuite/gcc.target/mips/call-clobbered-1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/mips/call-clobbered-1.c 
b/gcc/testsuite/gcc.target/mips/call-clobbered-1.c
index 77294aa3c2d..8880ad13684 100644
--- a/gcc/testsuite/gcc.target/mips/call-clobbered-1.c
+++ b/gcc/testsuite/gcc.target/mips/call-clobbered-1.c
@@ -1,6 +1,6 @@
 /* Check that we handle call-clobbered FPRs correctly.  */
 /* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */
-/* { dg-options "isa>=2 -mabi=32 -mhard-float -ffixed-f0 -ffixed-f1 -ffixed-f2 
-ffixed-f3 -ffixed-f4 -ffixed-f5 -ffixed-f6 -ffixed-f7 -ffixed-f8 -ffixed-f9 
-ffixed-f10 -ffixed-f11 -ffixed-f12 -ffixed-f13 -ffixed-f14 -ffixed-f15 
-ffixed-f16 -ffixed-f17 -ffixed-f18 -ffixed-f19" } */
+/* { dg-options "isa>=2 -mabi=32 -mfpxx -mhard-float -ffixed-f0 -ffixed-f1 
-ffixed-f2 -ffixed-f3 -ffixed-f4 -ffixed-f5 -ffixed-f6 -ffixed-f7 -ffixed-f8 
-ffixed-f9 -ffixed-f10 -ffixed-f11 -ffixed-f12 -ffixed-f13 -ffixed-f14 
-ffixed-f15 -ffixed-f16 -ffixed-f17 -ffixed-f18 -ffixed-f19" } */
 
 void bar (void);
 double a;
-- 
2.39.3 (Apple Git-146)



[PATCH] MIPS: Support more cases with alien mode of SHF.DF

2024-06-27 Thread YunQiang Su
Currently, we only support the cases that strictly fit the instructions.
For example, for V16QImode, we only support shuffles like
(0<=N0, N1, N2, N3<=3 here)
N0,     N1,     N2,     N3
N0+4,   N1+4,   N2+4,   N3+4
N0+8,   N1+8,   N2+8,   N3+8
N0+12,  N1+12,  N2+12,  N3+12

In fact we can support more cases by trying other SHF.DF
instructions that do not strictly fit the mode.

1) We can use SHF.H to support more cases for V16QImode:
(M0/M1/M2/M3 are 0 or 2 or 4 or 6)
M0,     M0+1,   M1,     M1+1
M2,     M2+1,   M3,     M3+1
M0+8,   M0+9,   M1+8,   M1+9
M2+8,   M2+9,   M3+8,   M3+9

2) We can use SHF.W to support some cases for V16QImode:
(M0/M1/M2/M3 are 0 or 4 or 8 or 12)
M0, M0+1,   M0+2,   M0+3
M1, M1+1,   M1+2,   M1+3
M2, M2+1,   M2+2,   M2+3
M3, M3+1,   M3+2,   M3+3

3) We can use SHF.W to support some cases for V8HImode:
(M0/M1/M2/M3 are 0 or 2 or 4 or 6)
M0, M0+1
M1, M1+1
M2, M2+1
M3, M3+1

4) We can also use SHF.W to swap the 2 parts of V2DF or V2DI.
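
The cases above can be sketched in plain C.  This is only an
illustration under stated assumptions, not the code from the patch:
shf_for_v16qi is a hypothetical helper that takes a V16QI permutation as
16 byte indices and tries shf.b, then shf.h, then shf.w.  The return
value packs the element width above the 8-bit immediate, similar in
spirit to the 0x100/0x200/0x400 tags used in mips-msa.md; the
narrowest-first order is my assumption.

```c
/* Hypothetical helper, not GCC code: find a single SHF.df for a V16QI
   byte permutation.  Returns (width << 8) | imm8 with width 1, 2 or 4
   for shf.b, shf.h or shf.w, or -1 if no single SHF fits.  */
static int
shf_for_v16qi (const int perm[16])
{
  for (int w = 1; w <= 4; w *= 2)
    {
      int sel[4] = { -1, -1, -1, -1 };
      int ok = 1;

      for (int e = 0; e < 16 / w && ok; e++)
        {
          int base = perm[e * w];

          /* Each element must be copied whole from an aligned source.  */
          if (base < 0 || base > 15 || base % w != 0)
            {
              ok = 0;
              break;
            }
          for (int k = 1; k < w; k++)
            if (perm[e * w + k] != base + k)
              ok = 0;

          /* SHF only moves elements inside their own group of four,
             and every group must use the same four selectors.  */
          int src = base / w;
          if (src / 4 != e / 4)
            ok = 0;
          else if (sel[e % 4] == -1)
            sel[e % 4] = src % 4;
          else if (sel[e % 4] != src % 4)
            ok = 0;
        }
      if (!ok)
        continue;

      /* Two bits per selector, lowest element first.  */
      int imm = 0;
      for (int i = 0; i < 4; i++)
        imm |= sel[i] << (2 * i);
      return (w << 8) | imm;
    }
  return -1;
}
```

With this model the identity permutation yields immediate 0xe4 (the
"no shuffle needed" value), and swapping the two 8-byte halves is only
matched at word width, which corresponds to case 4).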

gcc
* config/mips/mips-protos.h: New function mips_msa_shf_i8.
* config/mips/mips.cc(mips_const_vector_shuffle_set_p):
Support more cases by trying alien mode instructions.
(mips_msa_shf_i8): New function to get the correct MSA SHF
instruction and IMM.
---
 gcc/config/mips/mips-msa.md   |  35 
 gcc/config/mips/mips-protos.h |   1 +
 gcc/config/mips/mips.cc   | 149 ++
 3 files changed, 170 insertions(+), 15 deletions(-)

diff --git a/gcc/config/mips/mips-msa.md b/gcc/config/mips/mips-msa.md
index 0081b688ce9..377c63f0d35 100644
--- a/gcc/config/mips/mips-msa.md
+++ b/gcc/config/mips/mips-msa.md
@@ -125,9 +125,6 @@ (define_mode_iterator IMSA_WH  [V4SI V8HI])
 ;; Only floating-point modes.
 (define_mode_iterator FMSA [V2DF V4SF])
 
-;; Only used for immediate set shuffle elements instruction.
-(define_mode_iterator MSA_WHB_W [V4SI V8HI V16QI V4SF])
-
 ;; The attribute gives the integer vector mode with same size.
 (define_mode_attr VIMODE
   [(V2DF "V2DI")
@@ -2520,21 +2517,29 @@ (define_insn "msa_sat_u_"
(set_attr "mode" "")])
 
 (define_insn "msa_shf_"
-  [(set (match_operand:MSA_WHB_W 0 "register_operand" "=f")
-   (vec_select:MSA_WHB_W
- (match_operand:MSA_WHB_W 1 "register_operand" "f")
+  [(set (match_operand:MSA 0 "register_operand" "=f")
+   (vec_select:MSA
+ (match_operand:MSA 1 "register_operand" "f")
  (match_operand 2 "par_const_vector_shf_set_operand" "")))]
   "ISA_HAS_MSA"
 {
-  HOST_WIDE_INT val = 0;
-  unsigned int i;
-
-  /* We convert the selection to an immediate.  */
-  for (i = 0; i < 4; i++)
-val |= INTVAL (XVECEXP (operands[2], 0, i)) << (2 * i);
-
-  operands[2] = GEN_INT (val);
-  return "shf.\t%w0,%w1,%X2";
+  HOST_WIDE_INT rval = mips_msa_shf_i8 (operands);
+  /* 0b11100100 means that there is no shf needed at all.  This RTL
+ should be optimized out in some pass.  */
+  if ((rval & 0xff) == 0xe4)
+gcc_unreachable ();
+  operands[2] = GEN_INT (rval & 0xff);
+  switch (rval & 0xff00)
+  {
+  default: gcc_unreachable ();
+  case 0x400:
+return "shf.w\t%w0,%w1,%X2";
+  case 0x200:
+return "shf.h\t%w0,%w1,%X2";
+  case 0x100:
+return "shf.b\t%w0,%w1,%X2";
+  }
+  gcc_unreachable ();
 }
   [(set_attr "type" "simd_shf")
(set_attr "mode" "")])
diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips-protos.h
index 75f80984c03..90b4c87fdea 100644
--- a/gcc/config/mips/mips-protos.h
+++ b/gcc/config/mips/mips-protos.h
@@ -387,6 +387,7 @@ extern mulsidi3_gen_fn mips_mulsidi3_gen_fn (enum rtx_code);
 extern void mips_register_frame_header_opt (void);
 extern void mips_expand_vec_cond_expr (machine_mode, machine_mode, rtx *, 
bool);
 extern void mips_expand_vec_cmp_expr (rtx *);
+extern HOST_WIDE_INT mips_msa_shf_i8 (rtx *);
 
 extern void mips_emit_speculation_barrier_function (void);
 
diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index 7d4791157d1..6c797b62164 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -2079,6 +2079,72 @@ mips_const_vector_shuffle_set_p (rtx op, machine_mode 
mode)
   int nsets = nunits / 4;
   int set = 0;
   int i, j;
+  int val[4];
+  bool ok;
+
+  /* We support swapping 2 Doubleword part with shf.w.  */
+  if (ISA_HAS_MSA && (mode == V2DFmode || mode == V2DImode))
+{
+  if (!IN_RANGE (INTVAL (XVECEXP (op, 0, 0)), 0, 1)
+ || !IN_RANGE (INTVAL (XVECEXP (op, 0, 1)), 0, 1))
+   return false;
+}
+
+  if (ISA_HAS_MSA && mode == V16QImode)
+{
+ /* We can use shf.w if the elements are in-order inner 32bit.  */
+  ok = true;
+  for (j = 0; j < 4; j++)
+   {
+ val[0] = INTVAL (XVECEXP (op, 0, j * 4));
+ val[1] = INTVAL (XVECEXP (op, 0, j * 4 + 1));
+ val[2] = INTVAL (XVECEXP (op, 0, j * 4 + 2));
+ 

[PATCH] Testsuite/MIPS: Fix msa.c: test7_v2f64, test7_v4f32, test43_v2i64

2024-06-27 Thread YunQiang Su
BNEGI.W/D are used for test7_v2f64 and test7_v4f32 now.  It is
an improvement, since we can save an instruction.
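
As a rough illustration of why a bit-negate instruction can stand in for
a floating-point subtract: toggling the top bit of an IEEE 754 value
flips its sign.  The sketch below assumes the test7 functions amount to
a negation (the message does not say so explicitly) and uses a
hypothetical helper name.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of "bnegi.w with bit 31": negate a float by
   toggling its IEEE 754 sign bit instead of computing 0 - x.  */
static float
negate_by_bit_toggle (float x)
{
  uint32_t bits;

  memcpy (&bits, &x, sizeof bits);   /* reinterpret without aliasing UB */
  bits ^= UINT32_C (1) << 31;        /* toggle the sign bit */
  memcpy (&x, &bits, sizeof bits);
  return x;
}
```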

ILVR.D is used for test43_v2i64 now, instead of INSVE.D.

gcc/testsuite
* gcc.target/mips/msa.c: Fix test7_v2f64, test7_v4f32 and
test43_v2i64.
---
 gcc/testsuite/gcc.target/mips/msa.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/mips/msa.c 
b/gcc/testsuite/gcc.target/mips/msa.c
index b741f35556f..62d0606dfef 100644
--- a/gcc/testsuite/gcc.target/mips/msa.c
+++ b/gcc/testsuite/gcc.target/mips/msa.c
@@ -90,8 +90,8 @@
 /* { dg-final { scan-assembler-times "test7_v8u16:.*subv.h.*test7_v8u16" 1 } } 
*/
 /* { dg-final { scan-assembler-times "test7_v4u32:.*subv.w.*test7_v4u32" 1 } } 
*/
 /* { dg-final { scan-assembler-times "test7_v2u64:.*subv.d.*test7_v2u64" 1 } } 
*/
-/* { dg-final { scan-assembler-times "test7_v4f32:.*fsub.w.*test7_v4f32" 1 } } 
*/
-/* { dg-final { scan-assembler-times "test7_v2f64:.*fsub.d.*test7_v2f64" 1 } } 
*/
+/* { dg-final { scan-assembler-times "test7_v4f32:.*bnegi.w.*test7_v4f32" 1 } 
} */
+/* { dg-final { scan-assembler-times "test7_v2f64:.*bnegi.d.*test7_v2f64" 1 } 
} */
 /* { dg-final { scan-assembler-times "test8_v16i8:.*xor.v.*test8_v16i8" 1 } } 
*/
 /* { dg-final { scan-assembler-times "test8_v8i16:.*xor.v.*test8_v8i16" 1 } } 
*/
 /* { dg-final { scan-assembler-times "test8_v4i32:.*xor.v.*test8_v4i32" 1 } } 
*/
@@ -401,7 +401,7 @@
 /* { dg-final { scan-assembler-times "test43_v16i8:.*insve.b.*test43_v16i8" 1 
} } */
 /* { dg-final { scan-assembler-times "test43_v8i16:.*insve.h.*test43_v8i16" 1 
} } */
 /* { dg-final { scan-assembler-times "test43_v4i32:.*insve.w.*test43_v4i32" 1 
} } */
-/* { dg-final { scan-assembler-times "test43_v2i64:.*insve.d.*test43_v2i64" 1 
} } */
+/* { dg-final { scan-assembler-times "test43_v2i64:.*ilvr.d.*test43_v2i64" 1 } 
} */
 /* { dg-final { scan-assembler-times "test44_v16i8:.*copy_s.b.*test44_v16i8" 1 
} } */
 /* { dg-final { scan-assembler-times "test44_v8i16:.*copy_s.h.*test44_v8i16" 1 
} } */
 /* { dg-final { scan-assembler-times "test44_v4i32:.*copy_s.w.*test44_v4i32" 1 
} } */
-- 
2.39.3 (Apple Git-146)



Re: [PATCH v2] MIPS: Output $0 for conditional trap if !ISA_HAS_COND_TRAPI

2024-06-27 Thread YunQiang Su
Maciej W. Rozycki wrote on Fri, 28 Jun 2024 at 01:01:
>
> On Thu, 27 Jun 2024, YunQiang Su wrote:
>
> > >  The missed optimisation in GAS, which used not to trigger pre-R6, is
> > > irrelevant from this change's point of view and just adds noise.  I'm
> > > surprised that it worked even in the first place, as I reckon GCC is
> > > supposed to emit regular MIPS code in the `.set nomacro' mode nowadays,
> >
> > In fact, GCC works well if IMM is not zero in mips_expand_conditional_trap
> >
> >   mode = GET_MODE (XEXP (comparison, 0));
> >   op0 = force_reg (mode, op0);
> >   if (!(ISA_HAS_COND_TRAPI
> > ? arith_operand (op1, mode)
> > : reg_or_0_operand (op1, mode)))
> > op1 = force_reg (mode, op1);  // <- here
> >
> > This problem happens due to that GCC trust GAS so much ;)
> > It believe that GAS can recognize `TEQ $2,0`.
>
>  Nope, the use of `reg_or_0_operand' (and the `J' constraint) implies the
> use of the `z' print operand modifier in the output template, there's no
> immediate operand expected to be ever produced from the output template in
> this case, which is the very bug (from commit 82f84ecbb47c ("MIPS32R6 and
> MIPS64R6 support") BTW) you have fixed.
>
>  It is by pure chance that it worked before, because TEQ is an assembly
> macro (and `.set nomacro' should warn about it and with -Werror ultimately

In fact it doesn't work.  I found this problem when I tried to fix some
GCC testcases.

> prevent assembly from succeeding) rather than a direct machine operation.
> It wouldn't have worked in the latter case at all (i.e. with some other
> instructions; there are existing examples in mips.md).
>
> > >  Overall ISTM there is no need for distinct insns for ISA_HAS_COND_TRAPI
> > > and !ISA_HAS_COND_TRAPI cases each and this would better be sorted with
> > > predicates and constraints, especially as the output pattern is the same
> > > in both cases anyway.  This would prevent special-casing from being needed
> > > in `mips_expand_conditional_trap' as well.
> > >
> >
> > I agree. The patch should be quite simple
> >
> >[(trap_if (match_operator:GPR 0 "trap_comparison_operator"
> > [(match_operand:GPR 1 "reg_or_0_operand" 
> > "dJ")
> >  (match_operand:GPR 2 "arith_operand" 
> > "dI")])
> > (const_int 0))]
> >"ISA_HAS_COND_TRAPI"
> > -  "t%C0\t%z1,%2"
> > +  "t%C0\t%z1,%z2"
> >[(set_attr "type" "trap")])
>
>  Nope, this is wrong.
>

> in both cases anyway.  This would prevent special-casing from being needed
> in `mips_expand_conditional_trap' as well.

We cannot make `mips_expand_conditional_trap' simpler at this point.
Pre-R6 has TEQI, so we can use it if IMM can be represented in
16 bits.
For R6, or when IMM is out of the 16-bit range, we have to emit more
RTL/insns to load it into a reg.
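
The three situations described here can be sketched as a small decision
function.  This is a model only: the enum and function names are
hypothetical, and the signed 16-bit range is my assumption for what
"represented in 16 bits" means.

```c
#include <stdbool.h>

/* Hypothetical model of how the second trap operand ends up being
   emitted; none of these names exist in GCC.  */
enum trap_operand_kind
{
  TRAP_OP_ZERO_REG,   /* constant zero, printed as $0 */
  TRAP_OP_IMM16,      /* immediate form, e.g. TEQI (pre-R6 only) */
  TRAP_OP_FORCE_REG   /* must be loaded into a register first */
};

static bool
fits_signed_16 (long v)
{
  return v >= -32768 && v <= 32767;
}

static enum trap_operand_kind
classify_trap_operand (bool has_cond_trapi, long imm)
{
  if (imm == 0)
    return TRAP_OP_ZERO_REG;
  if (has_cond_trapi && fits_signed_16 (imm))
    return TRAP_OP_IMM16;
  return TRAP_OP_FORCE_REG;
}
```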

Yes, we can merge the two templates into

(define_insn "*conditional_trap"
  [(trap_if (match_operator:GPR 0 "trap_comparison_operator"
[(match_operand:GPR 1 "reg_or_0_operand" "dJ")
 (match_operand:GPR 2 "arith_operand" "dI")])
(const_int 0))]
  "ISA_HAS_COND_TRAP"
  {
 if (!ISA_HAS_COND_TRAPI && !reg_or_0_operand(operands[2],
GET_MODE(operands[2])))
gcc_unreachable();
 return "t%C0\t%z1,%z2";
   }
  [(set_attr "type" "trap")])


>   Maciej


Re: [PATCH v2] MIPS: Output $0 for conditional trap if !ISA_HAS_COND_TRAPI

2024-06-27 Thread YunQiang Su
Maciej W. Rozycki wrote on Thu, 27 Jun 2024 at 00:07:
>
> On Thu, 20 Jun 2024, YunQiang Su wrote:
>
> > MIPSr6 removes condition trap instructions with imm, so the instruction
> > like `teq $2,imm` will be converted to
> >   li $at, imm
> >   teq $2, $at
> >
> > The current version of Gas cannot detect if imm is zero, and output
> >   teq $2, $0
> > Let's do it in GCC.
>
>  This description should state that the change is a fix for an actual bug
> in GCC where the output pattern does not match the constraints supplied,
> and what consequences this has that the fix addressed.  There is no `imm'
> in the general sense here, just the special case of zero.
>
>  The missed optimisation in GAS, which used not to trigger pre-R6, is
> irrelevant from this change's point of view and just adds noise.  I'm
> surprised that it worked even in the first place, as I reckon GCC is
> supposed to emit regular MIPS code in the `.set nomacro' mode nowadays,

In fact, GCC works well if IMM is not zero, in mips_expand_conditional_trap:

  mode = GET_MODE (XEXP (comparison, 0));
  op0 = force_reg (mode, op0);
  if (!(ISA_HAS_COND_TRAPI
? arith_operand (op1, mode)
: reg_or_0_operand (op1, mode)))
op1 = force_reg (mode, op1);  // <- here

This problem happens because GCC trusts GAS so much ;)
It believes that GAS can recognize `TEQ $2,0`.


> which is the only way to guarantee that instruction lengths known to GCC
> do not accidentally disagree with what the assembler has produced, such
> as in the case of the bug your change has addressed.
>
>  Overall ISTM there is no need for distinct insns for ISA_HAS_COND_TRAPI
> and !ISA_HAS_COND_TRAPI cases each and this would better be sorted with
> predicates and constraints, especially as the output pattern is the same
> in both cases anyway.  This would prevent special-casing from being needed
> in `mips_expand_conditional_trap' as well.
>

I agree. The patch should be quite simple:

   [(trap_if (match_operator:GPR 0 "trap_comparison_operator"
[(match_operand:GPR 1 "reg_or_0_operand" "dJ")
 (match_operand:GPR 2 "arith_operand" "dI")])
(const_int 0))]
   "ISA_HAS_COND_TRAPI"
-  "t%C0\t%z1,%2"
+  "t%C0\t%z1,%z2"
   [(set_attr "type" "trap")])

I haven't done so, because I am wondering whether there is some
performance difference between them.

>   Maciej


Re: [PATCH] Add a late-combine pass [PR106594]

2024-06-25 Thread YunQiang Su
Just FYI. This patch changes the results of gcc.target/mips/madd-8.c
and gcc.target/mips/msub-8.c.

-PASS: gcc.target/mips/madd-8.c   -O2   scan-assembler \tmul\t
-PASS: gcc.target/mips/madd-8.c   -O2   scan-assembler-not \tmadd\t
-PASS: gcc.target/mips/madd-8.c   -O2   scan-assembler-not \tmflo\t
-PASS: gcc.target/mips/madd-8.c   -O2   scan-assembler-not \tmtlo\t
+FAIL: gcc.target/mips/madd-8.c   -O2   scan-assembler \tmul\t
+FAIL: gcc.target/mips/madd-8.c   -O2   scan-assembler-not \tmadd\t
+FAIL: gcc.target/mips/madd-8.c   -O2   scan-assembler-not \tmflo\t
+FAIL: gcc.target/mips/madd-8.c   -O2   scan-assembler-not \tmtlo\t

-FAIL: gcc.target/mips/msub-8.c   -O2   scan-assembler \tmul\t
-FAIL: gcc.target/mips/msub-8.c   -O2   scan-assembler-not \tmflo\t
-FAIL: gcc.target/mips/msub-8.c   -O2   scan-assembler-not \tmsub\t
-FAIL: gcc.target/mips/msub-8.c   -O2   scan-assembler-not \tmtlo\t
+PASS: gcc.target/mips/msub-8.c   -O2   scan-assembler \tmul\t
+PASS: gcc.target/mips/msub-8.c   -O2   scan-assembler-not \tmflo\t
+PASS: gcc.target/mips/msub-8.c   -O2   scan-assembler-not \tmsub\t
+PASS: gcc.target/mips/msub-8.c   -O2   scan-assembler-not \tmtlo\t

Quite interesting.  I will investigate the real reason.


Re: [PATCH] Build/Cross: Look for target headers from include if sys-include doesn't exist

2024-06-21 Thread YunQiang Su
YunQiang Su wrote on Fri, 14 Jun 2024 at 20:12:
>
> PR 115416
>
> When we build a cross toolchain, while without --with-sysroot,
> target headers are expected in
>   ${test_exec_prefix}/${target_noncanonical}/sys-include
> while it is true only with --with-headers option is used. In other
> cases, the path should be
>   ${test_exec_prefix}/${target_noncanonical}/include
> such as Debian's cross toolchain.
>
> Debian's cross toolchain has directory structures like:
>/usr//lib
> /include
> /bin/
>
> For this case, we cannot use "--prefix=/usr --with-sysroot=/", as
> gcc/configure will use headers of build, aka in /usr/include to detect
> features.  And fixinclude also uses the headers of build.
>
> Let's use the `include` if `sys-include` doesn't exist.
>
> For Makefile.in, the compare @includedir@ and $(prefix)/include is not
> correct, as the --includedir option is used to set where the headers
> should be installed.
>
> gcc:
> PR 115415.
> configure.ac: Set target_header_dir and CROSS_SYSTEM_HEADER_DIR
> to ${test_exec_prefix}/${target_noncanonical}/include when cross
> and without --with-sysroot and without --with-headers.
> configure: Regenerate.
> Makefile.in: Set CROSS_SYSTEM_HEADER_DIR as configure, and don't
> compare @includedir@ and $(prefix)/include.
> ---
>  gcc/Makefile.in  | 6 +-
>  gcc/configure| 8 ++--
>  gcc/configure.ac | 4 
>  3 files changed, 11 insertions(+), 7 deletions(-)
>

Gentle ping.


Re: [PATCH] Build: Set gcc_cv_as_mips_explicit_relocs if gcc_cv_as_mips_explicit_relocs_pcrel

2024-06-21 Thread YunQiang Su
> >
> >  And FAOD I think a stub check has to remain even after the removal and
> > just cause `configure' to bail out if an unsupported obsolete version of
> > GAS has been identified.
> >

Ohh, I think that we shouldn't remove it now, as I have worked out
the PCREL patch, and I am still waiting for your response about PCREL
support in Binutils.

My plan is, once Binutils is ready, I can submit my GCC patch.
I don't want to rewrite it.

And then, we can remove all no_explicit_relocs support. I mean that I
plan to remove all `TARGET_EXPLICIT_RELOCS` macro related code in
mips.cc/mips.h/mips.md etc.


Re: [PATCH] Build: Set gcc_cv_as_mips_explicit_relocs if gcc_cv_as_mips_explicit_relocs_pcrel

2024-06-21 Thread YunQiang Su
Maciej W. Rozycki wrote on Fri, 21 Jun 2024 at 22:00:
>
> On Fri, 21 Jun 2024, Maciej W. Rozycki wrote:
>
> > > Yeah, agreed FWIW.  This was necessary while the feature was relatively
> > > new, and while we still supported IRIX as, but I can't see any reasonable
> > > justification for using such an ancient binutils with modern GCC.
> > >
> > > Getting rid of -mno-explicit-relocs altogether might simplify things.
> >
> >  FWIW I tend to agree too, although I think the current mess has to be
> > fixed first (and backported to the release branches) before going forward
> > with the removal.
>
>  And FAOD I think a stub check has to remain even after the removal and
> just cause `configure' to bail out if an unsupported obsolete version of
> GAS has been identified.
>

Sure. And it is also useful to emit an error if we cannot find MIPS binutils.
In fact, I sometimes run into a problem when I forget to install MIPS binutils first.

>   Maciej


Re: [PATCH] Build: Set gcc_cv_as_mips_explicit_relocs if gcc_cv_as_mips_explicit_relocs_pcrel

2024-06-21 Thread YunQiang Su
Maciej W. Rozycki wrote on Fri, 21 Jun 2024 at 20:55:
>
> On Fri, 21 Jun 2024, Richard Sandiford wrote:
>
> > > We check gcc_cv_as_mips_explicit_relocs if 
> > > gcc_cv_as_mips_explicit_relocs_pcrel
> > > only, while gcc_cv_as_mips_explicit_relocs is used by later code.
> > >
> > > Maybe, it is time for use to set gcc_cv_as_mips_explicit_relocs always 
> > > now,
> > > as it has been in Binutils for more than 20 years.
> >
> > Yeah, agreed FWIW.  This was necessary while the feature was relatively
> > new, and while we still supported IRIX as, but I can't see any reasonable
> > justification for using such an ancient binutils with modern GCC.
> >
> > Getting rid of -mno-explicit-relocs altogether might simplify things.
>
>  FWIW I tend to agree too, although I think the current mess has to be
> fixed first (and backported to the release branches) before going forward
> with the removal.
>

Sure.

>  And AFAICT the proposed change is the wrong one: it has to be analysed
> how we came at the current breakage and then the state reproducing how it
> used to work before recreated.
>
>  Perhaps we need to check for general explicit reloc support first, before
> following with PC-relative relocs.  It seems natural to me this way,
> because you can't have support for PC-relative relocs (narrower scope)
> unless you have general explicit reloc support (wider scope) in the first
> place, so I wonder why we came up with what we have now.
>

I guess that we can assume that each of these stages
(some-future-one/pcrel/base) is a strict superset of the previous one.

So we can detect the newest one first; if it is OK, all older ones are
also available.
If we check the oldest one first, we will have some trouble with AC_DEFINE,
as we may emit multiple "#define MIPS_EXPLICIT_RELOCS".
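
A minimal sketch of this "newest first" idea, assuming the superset
relation holds.  The level names and numeric values below are
hypothetical stand-ins (only MIPS_EXPLICIT_RELOCS_BASE is named in this
thread), and the two flags model the results of the assembler probes.

```c
/* Hypothetical levels, ordered so that a higher level implies every
   lower one.  */
enum reloc_level
{
  RELOCS_NONE = 0,
  RELOCS_BASE = 1,
  RELOCS_PCREL = 2
};

/* Probe from newest to oldest and stop at the first hit, so exactly one
   level (one "#define") is ever produced.  */
static enum reloc_level
detect_reloc_level (int as_supports_pcrel, int as_supports_base)
{
  if (as_supports_pcrel)
    return RELOCS_PCREL;   /* implies BASE under the superset assumption */
  if (as_supports_base)
    return RELOCS_BASE;
  return RELOCS_NONE;
}
```

Because the function returns at the first successful probe, there is no
path on which two different definitions could be emitted.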

>   Maciej


Re: [PATCH] testsuite/ubsan/overflow-div-3.c: Use SIGTRAP for MIPS

2024-06-20 Thread YunQiang Su
>
>  Then GCC emits the wrong trap instruction, wherever it comes from and
> whatever has caused it.  The correct ones for integer division by zero

Thanks so much. It is not a bug in the Linux kernel or GCC.
It is my bug ;) and QEMU's.

QEMU didn't pass the code of TEQ correctly, and I hadn't run this test on
real hardware.


[PATCH] testsuite/ubsan/overflow-div-3.c: Use SIGTRAP for MIPS

2024-06-20 Thread YunQiang Su
The DIV instructions of MIPS don't trap by themselves if the divisor
is zero.  The compiler emits a conditional trap instruction for it.
So the signal will be SIGTRAP instead of SIGFPE.
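
The approach of the patch can be sketched outside the testsuite as
follows.  This is only an illustration: the names are hypothetical, and
the handler is exercised with raise() rather than a real division by
zero.  Which signal a MIPS system actually delivers depends on the
kernel or emulator, as the follow-up in this thread points out.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stddef.h>

/* Signal the test expects; mirrors the #ifdef __mips in the patch.  */
#ifdef __mips
static const int expected_signal = SIGTRAP;
#else
static const int expected_signal = SIGFPE;
#endif

static volatile sig_atomic_t trap_count;

static void
on_signal (int signo)
{
  (void) signo;
  trap_count++;
}

/* Install the handler for whichever signal was selected above.
   Returns 0 on success, like sigaction itself.  */
static int
install_handler (void)
{
  struct sigaction s;

  sigemptyset (&s.sa_mask);
  s.sa_handler = on_signal;
  s.sa_flags = 0;
  return sigaction (expected_signal, &s, NULL);
}
```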

gcc/testsuite
* c-c++-common/ubsan/overflow-div-3.c: Use SIGTRAP for MIPS.
---
 gcc/testsuite/c-c++-common/ubsan/overflow-div-3.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/c-c++-common/ubsan/overflow-div-3.c 
b/gcc/testsuite/c-c++-common/ubsan/overflow-div-3.c
index 479dffb0304..eef58aca832 100644
--- a/gcc/testsuite/c-c++-common/ubsan/overflow-div-3.c
+++ b/gcc/testsuite/c-c++-common/ubsan/overflow-div-3.c
@@ -7,6 +7,11 @@
 #include 
 
 int cnt;
+#ifdef __mips
+ int sig = SIGTRAP;
+#else
+ int sig = SIGFPE;
+#endif
 
 __attribute__((noipa)) int
 foo (int x, int y)
@@ -30,7 +35,7 @@ main (void)
   sigemptyset (&s.sa_mask);
   s.sa_handler = handler;
   s.sa_flags = 0;
-  sigaction (SIGFPE, &s, NULL);
+  sigaction (sig, &s, NULL);
   volatile int a = foo (42, 0);
   cnt++;
   volatile int b = foo (INT_MIN, -1);
-- 
2.39.3 (Apple Git-146)



Re: [PATCH] MIPS: Use Reg0 instead of const0_rtx for TRAP

2024-06-19 Thread YunQiang Su
YunQiang Su wrote on Thu, 20 Jun 2024 at 11:20:
>
> Maciej W. Rozycki wrote on Thu, 20 Jun 2024 at 01:24:
> >
> > On Wed, 19 Jun 2024, YunQiang Su wrote:
> >
> > > MIPSr6 removes condition trap instructions with imm, so the instruction
> > > like `teq $2,imm` will be converted to
> > >   li $at, imm
> > >   teq $2, $at
> > >
> > > The current version of Gas cannot detect if imm is zero, and output
> > >   teq $2, $0
> > > Let's do it in GCC.
> >
> >  It seems like an output pattern issue with `*conditional_trap_reg'
> > insn to me.
> >
>
> Yes. You are right. We should update `*conditional_trap_reg'.
>
> > > diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
> > > index 48924116937..ba1e6214656 100644
> > > --- a/gcc/config/mips/mips.cc
> > > +++ b/gcc/config/mips/mips.cc
> > > @@ -6026,7 +6026,7 @@ mips_expand_conditional_trap (rtx comparison)
> > >
> > >emit_insn (gen_rtx_TRAP_IF (VOIDmode,
> > > gen_rtx_fmt_ee (code, mode, op0, op1),
> > > -   const0_rtx));
> > > +   gen_rtx_REG (mode, GP_REG_FIRST)));
> >
> >  IOW this just papers over the actual issue.
> >
>
> I think that we still need it, as it will make the RTL more easy to 
> understand.
> I think that we should make the surprise in RTL as less as possible.
>

Ohh, you are right. It seems some RTL optimization passes strongly
prefer const0_rtx. It is not easy to use REG0 here.

> >  FWIW,
> >
> >   Maciej


[PATCH v2] MIPS: Output $0 for conditional trap if !ISA_HAS_COND_TRAPI

2024-06-19 Thread YunQiang Su
MIPSr6 removes the conditional trap instructions with an immediate, so
an instruction like `teq $2,imm` will be converted to
  li $at, imm
  teq $2, $at

The current version of GAS cannot detect that imm is zero and output
  teq $2, $0
Let's do it in GCC.

gcc
* config/mips/mips.md(conditional_trap_reg): Output $0 instead
of 0 if !ISA_HAS_COND_TRAPI.
---
 gcc/config/mips/mips.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index 9962313602a..fd64d3d001a 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -1245,7 +1245,7 @@ (define_insn "*conditional_trap_reg"
 (match_operand:GPR 2 "reg_or_0_operand" "dJ")])
(const_int 0))]
   "ISA_HAS_COND_TRAP && !ISA_HAS_COND_TRAPI"
-  "t%C0\t%z1,%2"
+  "t%C0\t%z1,%z2"
   [(set_attr "type" "trap")])
 
 (define_insn "*conditional_trap"
-- 
2.39.3 (Apple Git-146)



Re: [PATCH] MIPS: Use Reg0 instead of const0_rtx for TRAP

2024-06-19 Thread YunQiang Su
Maciej W. Rozycki wrote on Thu, 20 Jun 2024 at 01:24:
>
> On Wed, 19 Jun 2024, YunQiang Su wrote:
>
> > MIPSr6 removes condition trap instructions with imm, so the instruction
> > like `teq $2,imm` will be converted to
> >   li $at, imm
> >   teq $2, $at
> >
> > The current version of Gas cannot detect if imm is zero, and output
> >   teq $2, $0
> > Let's do it in GCC.
>
>  It seems like an output pattern issue with `*conditional_trap_reg'
> insn to me.
>

Yes. You are right. We should update `*conditional_trap_reg'.

> > diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
> > index 48924116937..ba1e6214656 100644
> > --- a/gcc/config/mips/mips.cc
> > +++ b/gcc/config/mips/mips.cc
> > @@ -6026,7 +6026,7 @@ mips_expand_conditional_trap (rtx comparison)
> >
> >emit_insn (gen_rtx_TRAP_IF (VOIDmode,
> > gen_rtx_fmt_ee (code, mode, op0, op1),
> > -   const0_rtx));
> > +   gen_rtx_REG (mode, GP_REG_FIRST)));
>
>  IOW this just papers over the actual issue.
>

I think that we still need it, as it will make the RTL easier to understand.
I think that we should keep surprises in the RTL to a minimum.

>  FWIW,
>
>   Maciej


Re: [PATCH] build: Fix missing variable quotes and typo

2024-06-19 Thread YunQiang Su
Collin Funk wrote on Thu, 20 Jun 2024 at 07:40:
>
> I've just fixed the quotes and that typo in one patch.  I hope you don't
> mind.  When using Autoconf 2.69 and Automake 1.15.1 that copyright diff
> goes away.  I'm not familiar with the gcc-autoregen bot but I think this
> should make it happy.
>
> -- >8 --
>
> When dlopen and pthread_create are in libc the variable is
> set to "none required", therefore running configure will show
> the following errors:
>
> ./configure: line 8997: test: too many arguments
> ./configure: line 8999: test: too many arguments
> ./configure: line 9003: test: too many arguments
> ./configure: line 9005: test: =: unary operator expected
>
> ChangeLog:
>
> PR bootstrap/115453
> * configure.ac: Quote variable result of AC_SEARCH_LIBS.  Fix
> typo ac_cv_search_pthread_crate.
> * configure: Regenerate.
>
> Signed-off-by: Collin Funk 
> ---

I committed it. And if you are using git format-patch, you can add the
-v2/-v3/-v4 option when you are resending an updated patch.


[PATCH] Build: Set gcc_cv_as_mips_explicit_relocs if gcc_cv_as_mips_explicit_relocs_pcrel

2024-06-19 Thread YunQiang Su
We check gcc_cv_as_mips_explicit_relocs if gcc_cv_as_mips_explicit_relocs_pcrel
only, while gcc_cv_as_mips_explicit_relocs is used by later code.

Maybe it is time for us to always set gcc_cv_as_mips_explicit_relocs now,
as it has been in Binutils for more than 20 years.

gcc
* configure.ac: Set gcc_cv_as_mips_explicit_relocs if
gcc_cv_as_mips_explicit_relocs_pcrel.
* configure: Regenerate.
---
 gcc/configure| 2 ++
 gcc/configure.ac | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/gcc/configure b/gcc/configure
index 9dc0b65dfaa..ad998105da3 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -30278,6 +30278,8 @@ $as_echo "#define MIPS_EXPLICIT_RELOCS 
MIPS_EXPLICIT_RELOCS_BASE" >>confdefs.h
 
 fi
 
+else
+  gcc_cv_as_mips_explicit_relocs=yes
 fi
 
 if test x$gcc_cv_as_mips_explicit_relocs = xno; then \
diff --git a/gcc/configure.ac b/gcc/configure.ac
index b2243e9954a..c51d3ca5f1b 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -5255,6 +5255,8 @@ LCF0:
 [  lw $4,%gp_rel(foo)($4)],,
   [AC_DEFINE(MIPS_EXPLICIT_RELOCS, MIPS_EXPLICIT_RELOCS_BASE,
 [Define if assembler supports %reloc.])])
+else
+  gcc_cv_as_mips_explicit_relocs=yes
 fi
 
 if test x$gcc_cv_as_mips_explicit_relocs = xno; then \
-- 
2.39.3 (Apple Git-146)



Re: [gcc r15-1436] build: Fix missing variable quotes

2024-06-19 Thread YunQiang Su
Thanks.  Sorry for the noise. I have reverted
   8088374a868aacab4dff208ec3e3fde790a1d9a3
   c6a9ab8c920f297c4efd289182aef9fbc73f5906

I will submit and backport the modification of gcc_cv_as_mips_explicit_relocs
separately.

@Collin Funk Can you send a new correct/full patch?


[PATCH] MIPS: Implement vcond_mask optabs for MSA

2024-06-19 Thread YunQiang Su
Currently, we have `mips_expand_vec_cond_expr`, which calculates
cmp_res first.  We can just add an extra argument to ask it
to use operands[3] as cmp_res instead of calculating it from operands[4]
and operands[5].
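
A scalar model of this idea, as a sketch only.  The function name, the
fixed lane count and the fixed less-than comparison are assumptions made
for illustration; in the real expander the comparison code comes from
operands[3] when no mask is supplied.

```c
#include <stdbool.h>
#include <stdint.h>

#define NELTS 4

/* Hypothetical model: when MASK is true, OP3 already holds the per-lane
   result (all ones or all zeros) and is used directly; otherwise the
   lane mask is computed from CMP0 < CMP1 first.  */
static void
vec_cond_model (int32_t dst[NELTS],
                const int32_t if_true[NELTS], const int32_t if_false[NELTS],
                const int32_t op3[NELTS],
                const int32_t cmp0[NELTS], const int32_t cmp1[NELTS],
                bool mask)
{
  for (int i = 0; i < NELTS; i++)
    {
      int32_t cmp_res = mask ? op3[i] : (cmp0[i] < cmp1[i] ? -1 : 0);

      /* Bitwise select, the same for both paths.  */
      dst[i] = (if_true[i] & cmp_res) | (if_false[i] & ~cmp_res);
    }
}
```

Both paths share the final select, which is why a single flag is enough
to reuse the existing expander for vcond_mask.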

gcc
* config/mips/mips.cc(mips_expand_vec_cond_expr): Add extra
argument to indicate that operands[3] is cmp_res already.
* config/mips/mips-msa.md(vcond_mask): Define new expand.
(vcondu): Use mips_expand_vec_cond_expr with 4th argument.
(vcond): Ditto.
---
 gcc/config/mips/mips-msa.md   | 17 +++--
 gcc/config/mips/mips-protos.h |  2 +-
 gcc/config/mips/mips.cc   | 18 --
 3 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/gcc/config/mips/mips-msa.md b/gcc/config/mips/mips-msa.md
index 779157f2a0c..0081b688ce9 100644
--- a/gcc/config/mips/mips-msa.md
+++ b/gcc/config/mips/mips-msa.md
@@ -411,6 +411,19 @@ (define_expand "vec_set"
   DONE;
 })
 
+(define_expand "vcond_mask_"
+  [(match_operand:MSA 0 "register_operand")
+   (match_operand:MSA 1 "reg_or_m1_operand")
+   (match_operand:MSA 2 "reg_or_0_operand")
+   (match_operand:IMSA 3 "register_operand")]
+  "ISA_HAS_MSA
+   && (GET_MODE_NUNITS (mode) == GET_MODE_NUNITS (mode))"
+{
+  mips_expand_vec_cond_expr (mode, mode, operands, true);
+  DONE;
+})
+
+
 (define_expand "vcondu"
   [(match_operand:MSA 0 "register_operand")
(match_operand:MSA 1 "reg_or_m1_operand")
@@ -421,7 +434,7 @@ (define_expand "vcondu"
   "ISA_HAS_MSA
&& (GET_MODE_NUNITS (mode) == GET_MODE_NUNITS (mode))"
 {
-  mips_expand_vec_cond_expr (mode, mode, operands);
+  mips_expand_vec_cond_expr (mode, mode, operands, 
false);
   DONE;
 })
 
@@ -435,7 +448,7 @@ (define_expand "vcond"
   "ISA_HAS_MSA
&& (GET_MODE_NUNITS (mode) == GET_MODE_NUNITS (mode))"
 {
-  mips_expand_vec_cond_expr (mode, mode, operands);
+  mips_expand_vec_cond_expr (mode, mode, operands, 
false);
   DONE;
 })
 
diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips-protos.h
index fcc0a0ae663..75f80984c03 100644
--- a/gcc/config/mips/mips-protos.h
+++ b/gcc/config/mips/mips-protos.h
@@ -385,7 +385,7 @@ extern mulsidi3_gen_fn mips_mulsidi3_gen_fn (enum rtx_code);
 #endif
 
 extern void mips_register_frame_header_opt (void);
-extern void mips_expand_vec_cond_expr (machine_mode, machine_mode, rtx *);
+extern void mips_expand_vec_cond_expr (machine_mode, machine_mode, rtx *, 
bool);
 extern void mips_expand_vec_cmp_expr (rtx *);
 
 extern void mips_emit_speculation_barrier_function (void);
diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index b7acf041903..b1219385096 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -22777,14 +22777,20 @@ mips_expand_vec_cmp_expr (rtx *operands)
 
 void
 mips_expand_vec_cond_expr (machine_mode mode, machine_mode vimode,
-  rtx *operands)
+  rtx *operands, bool mask)
 {
-  rtx cond = operands[3];
-  rtx cmp_op0 = operands[4];
-  rtx cmp_op1 = operands[5];
-  rtx cmp_res = gen_reg_rtx (vimode);
+  rtx cmp_res;
+  if (mask)
+cmp_res = operands[3];
+  else
+{
+  rtx cond = operands[3];
+  rtx cmp_op0 = operands[4];
+  rtx cmp_op1 = operands[5];
+  cmp_res = gen_reg_rtx (vimode);
 
-  mips_expand_msa_cmp (cmp_res, GET_CODE (cond), cmp_op0, cmp_op1);
+  mips_expand_msa_cmp (cmp_res, GET_CODE (cond), cmp_op0, cmp_op1);
+}
 
   /* We handle the following cases:
  1) r = a CMP b ? -1 : 0
-- 
2.39.3 (Apple Git-146)



[PATCH] MIPS: Use Reg0 instead of const0_rtx for TRAP

2024-06-19 Thread YunQiang Su
MIPSr6 removes the conditional trap instructions that take an immediate,
so an instruction like `teq $2,imm` is expanded by the assembler to
  li $at, imm
  teq $2, $at

The current version of GAS does not detect that imm is zero and emit
  teq $2, $0
instead.  Let's do it in GCC.
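
As a rough illustration (not part of the patch), the C sketch below shows
the kind of source that can reach this path: a comparison guarding
`__builtin_trap ()`.  The name `checked_div` is made up for this example,
and whether the guard is actually emitted as a conditional trap such as
`teq` depends on the target and options; that lowering is an assumption.

```c
/* Hypothetical helper: trap when the divisor is zero, otherwise divide.
   On MIPS the guard may be emitted as a conditional trap that compares
   DEN against zero; that is an assumption of this sketch.  */
static int
checked_div (int num, int den)
{
  if (den == 0)
    __builtin_trap ();
  return num / den;
}
```

Calling `checked_div (6, 3)` returns normally; only a zero divisor reaches
the trap.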

gcc
* config/mips/mips.cc(mips_expand_conditional_trap): Use Reg0
instead of const0_rtx.
---
 gcc/config/mips/mips.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index 48924116937..ba1e6214656 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -6026,7 +6026,7 @@ mips_expand_conditional_trap (rtx comparison)
 
   emit_insn (gen_rtx_TRAP_IF (VOIDmode,
  gen_rtx_fmt_ee (code, mode, op0, op1),
- const0_rtx));
+ gen_rtx_REG (mode, GP_REG_FIRST)));
 }
 
 /* Initialize *CUM for a call to a function of type FNTYPE.  */
-- 
2.39.3 (Apple Git-146)



[PATCH] MIPS: Set condmove cost to SET(REG, REG)

2024-06-18 Thread YunQiang Su
On most uarchs, the cost of a condmove is the same as that of other
normal integer instructions, so it should be COSTS_N_INSNS (1).

In GCC 12 and earlier, condmove was always enabled; since GCC 13,
we compare the costs before using it.

The generic rtx_cost gives a result of COSTS_N_INSNS (2).
Let's define it as COSTS_N_INSNS (1) in mips_rtx_costs.
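
For readers unfamiliar with the instructions being costed, here is a
small C model of MOVZ/MOVN and of how a selection like `!k ? 100 : 1000`
can be done without a branch.  The function names are invented for this
sketch; it only models the semantics and says nothing about what GCC
will emit for a given input.

```c
/* Hypothetical models of the MIPS IV conditional moves:
   MOVZ rd, rs, rt  copies rs to rd when rt == 0;
   MOVN rd, rs, rt  copies rs to rd when rt != 0.  */
static long
movz (long rd, long rs, long rt)
{
  return rt == 0 ? rs : rd;
}

static long
movn (long rd, long rs, long rt)
{
  return rt != 0 ? rs : rd;
}

/* Branchless form of "!k ? 100 : 1000": start with the value for
   k != 0 and conditionally overwrite it.  */
static long
select_movz (int k)
{
  long r = 1000;
  return movz (r, 100, k);
}

/* The same selection written with MOVN.  */
static long
select_movn (int k)
{
  long r = 100;
  return movn (r, 1000, k);
}
```

Each variant is one load of a constant followed by one conditional move,
which is why costing the move like a plain register move is reasonable
on cores where it has no extra latency.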

gcc
* config/mips/mips.cc(mips_rtx_costs): Set condmove cost.
* config/mips/mips.md(mov_on_,
mov_on__mips16e2,
mov_on__ne,
mov_on__ne_mips16e2): Define name by
removing the leading *, so that we can use CODE_FOR_.

gcc/testsuite
* gcc.target/mips/movcc-2.c: Add k?100:1000 test.
---
 gcc/config/mips/mips.cc | 24 
 gcc/config/mips/mips.md |  8 
 gcc/testsuite/gcc.target/mips/movcc-2.c | 14 ++
 3 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index b7acf041903..48924116937 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -4692,6 +4692,30 @@ mips_rtx_costs (rtx x, machine_mode mode, int outer_code,
  *total = mips_set_reg_reg_cost (GET_MODE (SET_DEST (x)));
  return true;
}
+  int insn_code;
+  if (register_operand (SET_DEST (x), VOIDmode)
+ && GET_CODE (SET_SRC (x)) == IF_THEN_ELSE)
+   insn_code = recog_memoized (make_insn_raw (x));
+  else
+   insn_code = -1;
+  switch (insn_code)
+   {
+   /* MIPS16e2 ones may be listed here, while the only known CPU core
+  that implements MIPS16e2 is interAptiv.  The Dependency delays
+  of MOVN/MOVZ on interAptiv is 3.  */
+   case CODE_FOR_movsi_on_si:
+   case CODE_FOR_movdi_on_si:
+   case CODE_FOR_movsi_on_di:
+   case CODE_FOR_movdi_on_di:
+   case CODE_FOR_movsi_on_si_ne:
+   case CODE_FOR_movdi_on_si_ne:
+   case CODE_FOR_movsi_on_di_ne:
+   case CODE_FOR_movdi_on_di_ne:
+ *total = mips_set_reg_reg_cost (GET_MODE (SET_DEST (x)));
+ return true;
+   default:
+ break;
+   }
   return false;
 
 default:
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index 508fb1afa6c..9962313602a 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -7492,7 +7492,7 @@ (define_insn "insn_pseudo"
 
 ;; MIPS4 Conditional move instructions.
 
-(define_insn "*mov_on_"
+(define_insn "mov_on_"
   [(set (match_operand:GPR 0 "register_operand" "=d,d")
(if_then_else:GPR
 (match_operator 4 "equality_operator"
@@ -7507,7 +7507,7 @@ (define_insn "*mov_on_"
   [(set_attr "type" "condmove")
(set_attr "mode" "")])
 
-(define_insn "*mov_on__mips16e2"
+(define_insn "mov_on__mips16e2"
   [(set (match_operand:GPR 0 "register_operand" "=d,d,d,d")
(if_then_else:GPR
 (match_operator 4 "equality_operator"
@@ -7525,7 +7525,7 @@ (define_insn "*mov_on__mips16e2"
(set_attr "mode" "")
(set_attr "extended_mips16" "yes")])
 
-(define_insn "*mov_on__ne"
+(define_insn "mov_on__ne"
   [(set (match_operand:GPR 0 "register_operand" "=d,d")
(if_then_else:GPR
 (match_operand:GPR2 1 "register_operand" ",")
@@ -7538,7 +7538,7 @@ (define_insn "*mov_on__ne"
   [(set_attr "type" "condmove")
(set_attr "mode" "")])
 
-(define_insn "*mov_on__ne_mips16e2"
+(define_insn "mov_on__ne_mips16e2"
   [(set (match_operand:GPR 0 "register_operand" "=d,d,d,d")
   (if_then_else:GPR
(match_operand:GPR2 1 "register_operand" 
",,t,t")
diff --git a/gcc/testsuite/gcc.target/mips/movcc-2.c b/gcc/testsuite/gcc.target/mips/movcc-2.c
index 1926e6460d1..cbda3c8febc 100644
--- a/gcc/testsuite/gcc.target/mips/movcc-2.c
+++ b/gcc/testsuite/gcc.target/mips/movcc-2.c
@@ -3,6 +3,8 @@
 /* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */
 /* { dg-final { scan-assembler "\tmovz\t" } } */
 /* { dg-final { scan-assembler "\tmovn\t" } } */
+/* { dg-final { scan-assembler "\tmovz\t" } } */
+/* { dg-final { scan-assembler "\tmovn\t" } } */
 
 void ext_long (long);
 
@@ -17,3 +19,15 @@ sub5 (long i, long j, int k)
 {
   ext_long (!k ? i : j);
 }
+
+NOMIPS16 long
+sub6 (int k)
+{
+  return !k ? 100 : 1000;
+}
+
+NOMIPS16 long
+sub7 (int k)
+{
+  return !k ? 100 : 1000;
+}
-- 
2.39.3 (Apple Git-146)



Re: [PATCH] build: Fix missing variable quotes

2024-06-18 Thread YunQiang Su
OK for trunk?


-- 
YunQiang Su


[RFC] MIPS: Use SLL+BGEZ for one bit test on pre-R2

2024-06-18 Thread YunQiang Su
PR target/111376.
Currently, we are using LUI/ANDI/BEQZ for one-bit tests if bitpos>=16,
while in fact we can use SLL/BGEZ.

Note:
1) if bitpos<16, we can use ANDI/BEQZ.
2) For R2+, we have EXT.
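
The equivalence behind the SLL/BGEZ form can be sketched in plain C.
The helper names below are made up for this example; the shift amount
(bit size - 1 - bitpos) is the one computed in the split of this patch.

```c
#include <stdint.h>

/* Reference: test bit POS of X with a mask (the LUI/ANDI style).  */
static int
bit_set_mask (uint32_t x, unsigned pos)
{
  return (x & (UINT32_C (1) << pos)) != 0;
}

/* Shift bit POS into the sign bit, then test the sign (the SLL/BGEZ
   style).  The shift amount is bitsize - 1 - pos.  */
static int
bit_set_shift (uint32_t x, unsigned pos)
{
  uint32_t shifted = x << (31 - pos);
  /* Converting an out-of-range value to int32_t is
     implementation-defined in ISO C; GCC reduces it modulo 2^32,
     so the top bit becomes the sign.  */
  return (int32_t) shifted < 0;
}
```

For bitpos 16 the shift is 15, which matches the `sll $4,$4,15` expected
by the new test.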

Known problems:
  1. On some uarch, SLL has more delay, such as 74K:
 See the talk in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111376.
  2. We haven't tested it on any real pre-R2 hardware for performance.
     So I'd like to request some testing here.
---
 gcc/config/mips/mips.md   | 33 +++
 .../gcc.target/mips/mips3-one-bit-test.c  | 55 +++
 2 files changed, 88 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/mips/mips3-one-bit-test.c

diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index 806fd29cf97..508fb1afa6c 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -6256,6 +6256,39 @@ (define_insn "*branch_bit_inverted"
 }
   [(set_attr "type" "branch")
(set_attr "branch_likely" "no")])
+
+(define_insn_and_split "*branch_on_bit"
+  [(set (pc)
+   (if_then_else
+   (match_operator 0 "equality_operator"
+   [(zero_extract:GPR (match_operand:GPR 2 "register_operand" "d")
+(const_int 1)
+(match_operand:GPR 3 "const_int_operand"))
+(const_int 0)])
+   (label_ref (match_operand 1))
+   (pc)))]
+  "!ISA_HAS_BBIT && !ISA_HAS_EXT_INS && !TARGET_MIPS16 && UINTVAL (operands[3]) >= 16"
+  "#"
+  "&& !reload_completed"
+  [(set (match_dup 4)
+   (ashift:GPR (match_dup 2) (match_dup 3)))
+   (set (pc)
+   (if_then_else
+   (match_op_dup 0 [(match_dup 4) (const_int 0)])
+   (label_ref (match_operand 1))
+   (pc)))]
+{
+  int shift = GET_MODE_BITSIZE (mode) - 1 - INTVAL (operands[3]);
+  operands[3] = GEN_INT (shift);
+  operands[4] = gen_reg_rtx (mode);
+
+  if (GET_CODE (operands[0]) == EQ)
+operands[0] = gen_rtx_GE (mode, operands[4], const0_rtx);
+  else
+operands[0] = gen_rtx_LT (mode, operands[4], const0_rtx);
+}
+[(set_attr "type" "branch")])
+
 
 ;;
 ;;  
diff --git a/gcc/testsuite/gcc.target/mips/mips3-one-bit-test.c b/gcc/testsuite/gcc.target/mips/mips3-one-bit-test.c
new file mode 100644
index 000..50672e71d73
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/mips3-one-bit-test.c
@@ -0,0 +1,55 @@
+/* { dg-options "-mips3 -mgp64" } */
+/* FIXME: -Os fails due to rtx_cost: PR115473.  */
+/* { dg-skip-if "code quality test" { *-*-* } { "-O0" "-Os" } { "" } } */
+/* { dg-final { scan-assembler "f32_15:.*andi\t\\\$4,\\\$4,0x8000" } } */
+/* { dg-final { scan-assembler "f32:.*sll\t\\\$4,\\\$4,15" } } */
+/* { dg-final { scan-assembler "f64_15:.*andi\t\\\$4,\\\$4,0x8000" } } */
+/* { dg-final { scan-assembler "f64:.*dsll\t\\\$4,\\\$4,47" } } */
+
+/* Test to make sure we can use sll+bgez to test one bit.
+   See PR111376.  */
+
+int f1();
+int f2();
+
+/* If the bits is < 16, we can use andi+beqz.  */
+NOMIPS16 int
+f32_15(int a)
+{
+  int p = (a & (1<<15));
+  if (p)
+return f1();
+  else
+return f2();
+}
+
+/* If the bits >= 16, we can use sll+bgez.  */
+NOMIPS16 int
+f32(int a)
+{
+  int p = (a & (1<<16));
+  if (p)
+return f1();
+  else
+return f2();
+}
+
+NOMIPS16 int
+f64_15(long long a)
+{
+  long long p = (a & (1LL<<15));
+  if (p)
+return f1();
+  else
+return f2();
+}
+
+NOMIPS16 int
+f64(long long a)
+{
+  long long p = (a & (1LL<<16));
+  if (p)
+return f1();
+  else
+return f2();
+}
-- 
2.39.3 (Apple Git-146)



Re: [PATCH] tree-optimization/115254 - don't account single-lane SLP against discovery limit

2024-06-16 Thread YunQiang Su
Richard Biener wrote on Thu, Jun 6, 2024 at 14:20:
>
> On Thu, 6 Jun 2024, YunQiang Su wrote:
>
> > Richard Biener wrote on Tue, May 28, 2024 at 17:47:
> > >
> > > The following avoids accounting single-lane SLP to the discovery
> > > limit.  As the two testcases show this makes discovery fail,
> > > unfortunately even not the same across targets.  The following
> > > should fix two FAILs for GCN as a side-effect.
> > >
> > > Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
> > >
> > > PR tree-optimization/115254
> > > * tree-vect-slp.cc (vect_build_slp_tree): Only account
> > > multi-lane SLP to limit.
> > >
> > > * gcc.dg/vect/slp-cond-2-big-array.c: Expect 4 times SLP.
> > > * gcc.dg/vect/slp-cond-2.c: Likewise.
> >
> > With this patch, MIPS/MSA still has only 3 times SLP.
> > I am digging the problem
>
> I bet it's an issue with missed permutes.  f3() requires interleaving
> of two VnQImode vectors.
>

Thanks. This problem disappears when I try to implement vcond_mask.


Re: [PATCH] LoongArch: Use bstrins for "value & (-1u << const)"

2024-06-14 Thread YunQiang Su
Xi Ruoyao wrote on Sun, Jun 9, 2024 at 21:50:
>
> A move/bstrins pair is as fast as a (addi.w|lu12i.w|lu32i.d|lu52i.d)/and
> pair, and twice fast as a srli/slli pair.  When the src reg and the dst

Just wondering: why not adjust the RTX cost of bstrins vs. srli/slli?
It may benefit more cases.

> reg happens to be the same, the move instruction can be optimized away.
>


[PATCH] Build/Cross: Look for target headers from include if sys-include doesn't exist

2024-06-14 Thread YunQiang Su
PR 115416

When we build a cross toolchain without --with-sysroot,
target headers are expected in
  ${test_exec_prefix}/${target_noncanonical}/sys-include
but this is true only when the --with-headers option is used. In other
cases, the path should be
  ${test_exec_prefix}/${target_noncanonical}/include
as in Debian's cross toolchain.

Debian's cross toolchain has directory structures like:
   /usr//lib
/include
/bin/

For this case, we cannot use "--prefix=/usr --with-sysroot=/", as
gcc/configure would then use the build machine's headers, i.e. those in
/usr/include, to detect features.  fixincludes would also use the build
machine's headers.

Let's use the `include` if `sys-include` doesn't exist.
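
The fallback can be sketched in C as follows.  This is only an
illustration of the selection rule; the real change is shell code in
configure, and the helper names `is_directory` and
`pick_target_header_dir` are invented here.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Return nonzero if PATH names an existing directory.  */
static int
is_directory (const char *path)
{
  struct stat st;
  return stat (path, &st) == 0 && S_ISDIR (st.st_mode);
}

/* Hypothetical helper: write the chosen header directory into BUF.
   Prefer <exec_prefix>/<target>/sys-include; fall back to
   <exec_prefix>/<target>/include when sys-include does not exist.
   Return 1 if sys-include was used, 0 if the fallback was taken.  */
static int
pick_target_header_dir (const char *exec_prefix, const char *target,
                        char *buf, size_t size)
{
  snprintf (buf, size, "%s/%s/sys-include", exec_prefix, target);
  if (is_directory (buf))
    return 1;
  snprintf (buf, size, "%s/%s/include", exec_prefix, target);
  return 0;
}
```

Note that the fallback path is used even if `include` is missing too,
which mirrors the shell logic: only the existence of `sys-include` is
checked.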

For Makefile.in, comparing @includedir@ with $(prefix)/include is not
correct, as the --includedir option sets where the headers
should be installed.

gcc:
PR 115415.
configure.ac: Set target_header_dir and CROSS_SYSTEM_HEADER_DIR
to ${test_exec_prefix}/${target_noncanonical}/include when cross
and without --with-sysroot and without --with-headers.
configure: Regenerate.
Makefile.in: Set CROSS_SYSTEM_HEADER_DIR as configure, and don't
compare @includedir@ and $(prefix)/include.
---
 gcc/Makefile.in  | 6 +-
 gcc/configure| 8 ++--
 gcc/configure.ac | 4 
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index f5adb647d3f..349f988dc08 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -560,11 +560,7 @@ LINKER_PLUGIN_API_H = $(srcdir)/../include/plugin-api.h
 # Default native SYSTEM_HEADER_DIR, to be overridden by targets.
 NATIVE_SYSTEM_HEADER_DIR = @NATIVE_SYSTEM_HEADER_DIR@
 # Default cross SYSTEM_HEADER_DIR, to be overridden by targets.
-ifeq (@includedir@,$(prefix)/include)
-  CROSS_SYSTEM_HEADER_DIR = @CROSS_SYSTEM_HEADER_DIR@
-else
-  CROSS_SYSTEM_HEADER_DIR = @includedir@
-endif
+CROSS_SYSTEM_HEADER_DIR = @CROSS_SYSTEM_HEADER_DIR@
 
 # autoconf sets SYSTEM_HEADER_DIR to one of the above.
 # Purge it of unnecessary internal relative paths
diff --git a/gcc/configure b/gcc/configure
index aaf5899cc03..d11e97d1758 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -15124,6 +15124,10 @@ if test x$host != x$target || test "x$TARGET_SYSTEM_ROOT" != x ||
 target_header_dir="${with_build_sysroot}${native_system_header_dir}"
   elif test "x$with_sysroot" = x; then
 target_header_dir="${test_exec_prefix}/${target_noncanonical}/sys-include"
+if ! test -d ${target_header_dir};then
+  target_header_dir="${test_exec_prefix}/${target_noncanonical}/include"
+fi
+CROSS_SYSTEM_HEADER_DIR=${target_header_dir}
   elif test "x$with_sysroot" = xyes; then
 
target_header_dir="${test_exec_prefix}/${target_noncanonical}/sys-root${native_system_header_dir}"
   else
@@ -21410,7 +21414,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 21413 "configure"
+#line 21417 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -21516,7 +21520,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 21519 "configure"
+#line 21523 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
diff --git a/gcc/configure.ac b/gcc/configure.ac
index f8d67efeb98..54e6776747e 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -2512,6 +2512,10 @@ if test x$host != x$target || test "x$TARGET_SYSTEM_ROOT" != x ||
 target_header_dir="${with_build_sysroot}${native_system_header_dir}"
   elif test "x$with_sysroot" = x; then
 target_header_dir="${test_exec_prefix}/${target_noncanonical}/sys-include"
+if ! test -d ${target_header_dir};then
+  target_header_dir="${test_exec_prefix}/${target_noncanonical}/include"
+fi
+CROSS_SYSTEM_HEADER_DIR=${target_header_dir}
   elif test "x$with_sysroot" = xyes; then
 
target_header_dir="${test_exec_prefix}/${target_noncanonical}/sys-root${native_system_header_dir}"
   else
-- 
2.39.3 (Apple Git-146)



Re: [PATCH] build: Fix missing variable quotes

2024-06-14 Thread YunQiang Su
Sam James wrote on Fri, Jun 14, 2024 at 09:02:
>
> Collin Funk  writes:
>
> > When dlopen and pthread_create are in libc the variable is
> > set to "none required", therefore running configure will show
> > the following errors:
> >
> > ./configure: line 8997: test: too many arguments
> > ./configure: line 8999: test: too many arguments
> > ./configure: line 9003: test: too many arguments
> > ./configure: line 9005: test: =: unary operator expected
> >
> > ChangeLog:
> >
> >   * configure.ac: Quote variable result of AC_SEARCH_LIBS.
> > * configure: Regenerate.
>
> This is PR115453 (which also needs to address a 'crate' typo).
>

I noticed another similar problem. I guess that we can put them in a
single patch:
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -5317,7 +5327,7 @@ x:

 AC_MSG_CHECKING(assembler and linker for explicit JALR relocation)
 gcc_cv_as_ld_jalr_reloc=no
-if test $gcc_cv_as_mips_explicit_relocs = yes; then
+if test x$gcc_cv_as_mips_explicit_relocs = xyes; then
   if test $in_tree_ld = yes ; then
     if test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" -ge 20 -o "$gcc_cv_gld_major_version" -gt 2 \
&& test $in_tree_ld_is_elf = yes; then


Re: [PATCH v2 1/2] driver: Use -as/ld/objcopy as final fallback instead of native ones for cross

2024-06-10 Thread YunQiang Su
Richard Sandiford wrote on Thu, Jun 6, 2024 at 17:54:
>
> YunQiang Su  writes:
> > YunQiang Su wrote on Wed, May 29, 2024 at 10:02:
> >>
> >> Richard Sandiford wrote on Wed, May 29, 2024 at 05:28:
> >> >
> >> > YunQiang Su  writes:
> >> > > If `find_a_program` cannot find `as/ld/objcopy` and we are a cross toolchain,
> >> > > the final fallback is `as/ld` of system.  In fact, we can have a try with
> >> > > -as/ld/objcopy before fallback to native as/ld/objcopy.
> >> > >
> >> > > This patch is derivatived from Debian's patch:
> >> > >   gcc-search-prefixed-as-ld.diff
> >> >
> >> > I'm probably making you repeat a previous discussion, sorry, but could
> >> > you describe the use case in more detail?  The current approach to
> >> > handling cross toolchains has been used for many years.  Presumably
> >> > this patch is supporting a different way of organising things,
> >> > but I wasn't sure from the description what it was.
> >> >
> >> > AIUI, we currently assume that cross as, ld and objcopy will be
> >> > installed under those names in $prefix/$target_alias/bin (aka $tooldir/bin).
> >> > E.g.:
> >> >
> >> >bin/aarch64-elf-as = aarch64-elf/bin/as
> >> >
> >> > GCC should then find as in aarch64-elf/bin.
> >> >
> >> > Is that not true in your case?
> >> >
> >>
> >> Yes. This patch is only about the final fallback. I mean aarch64-elf/bin/as
> >> still has higher priority than bin/aarch64-elf-as.
> >>
> >> In the current code, we find gas with:
> >> /prefix/aarch64-elf/bin/as > $PATH/as
> >>
> >> And this patch a new one between them:
> >> /prefix/aarch64-elf/bin/as > $PATH/aarch64-elf-as > $PATH/as
> >>
> >> > To be clear, I'm not saying the patch is wrong.  I'm just trying to
> >> > understand why the patch is needed.
> >> >
> >>
> >> Yes. If gcc is configured correctly, it is not so useful.
> >> In some case for some lazy user, it may be useful,
> >> for example, the binutils installed into different prefix with libc etc.
> >>
> >> For example, binutils is installed into /usr/aarch64-elf/bin, while
> >> libc is installed into /usr/local/aarch64-elf/.
> >>
> >
> > Any idea about it? Is it a use case making sense?
>
> Yeah, I think it makes sense.  GCC and binutils are separate packages.
> Users could cherry-pick a GCC installation and a separate binutils
> installation rather than bundling them together into a single
> toolchain.  And not everyone will have permission to change $tooldir.
>
> So I agree we should support searching the user's path for an
> as/ld/etc. based on the tool prefix.  Unfortunately, I don't think
> I understand the code & constraints well enough to do a review.
>
> In particular, it seems unfortunate that we need to do a trial
> subcommand invocation before committing to the prefixed name.
> And, if we continue to search for "as" in the user's path as a fallback,
> it's not 100% obvious that "${triple}-as" later in the path should trump
> "as" earlier in the path.
>
> In some ways, it seems more consistent to do the replacement without
> first doing a trial invocation.  But I don't know whether that would
> break existing use cases.  (To be clear, I wouldn't feel comfortable

Yes. This is also my worry as some users may set $PATH manually
to a path which contains target `as`, such as
   export PATH="/usr/aarch64-linux-gnu/bin:$PATH"

> approving a patch to do that without buy-in from other maintainers.)
>
> Thanks,
> Richard


Re: [PATCH] ifcvt.cc: Prevent excessive if-conversion for conditional moves

2024-06-09 Thread YunQiang Su
> > The rtx_cost may consider the compare operation in `seq` as quite expensive.
> Overall it sounds like a target issue to me -- ie, now that we're
> testing for profitability instead of just assuming it's profitable some
> targets need adjustment.  Either in their costing model or in the
> testsuite expectations.
>

Yes, you are right. I found the real problem, in mips-cpus.def:

MIPS_CPU ("mips32", PROCESSOR_4KC, MIPS_ISA_MIPS32,
PTF_AVOID_BRANCHLIKELY_ALWAYS)
MIPS_CPU ("mips64", PROCESSOR_5KC, MIPS_ISA_MIPS64,
PTF_AVOID_BRANCHLIKELY_ALWAYS)
MIPS_CPU ("mips64r2", PROCESSOR_5KC, MIPS_ISA_MIPS64R2,
PTF_AVOID_BRANCHLIKELY_ALWAYS)
MIPS_CPU ("mips64r3", PROCESSOR_5KC, MIPS_ISA_MIPS64R3,
PTF_AVOID_BRANCHLIKELY_ALWAYS)
MIPS_CPU ("mips64r5", PROCESSOR_5KC, MIPS_ISA_MIPS64R5,
PTF_AVOID_BRANCHLIKELY_ALWAYS)

Here PROCESSOR_4KC and PROCESSOR_5KC are both FPU-less.

> Jeff
>


-- 
YunQiang Su


Re: [PATCH] ifcvt.cc: Prevent excessive if-conversion for conditional moves

2024-06-09 Thread YunQiang Su
YunQiang Su wrote on Sun, Jun 9, 2024 at 18:25:
>
> > >
> > > gcc/ChangeLog:
> > >
> > >   * ifcvt.cc (cond_move_process_if_block):
> > >   Consider the result of targetm.noce_conversion_profitable_p()
> > >   when replacing the original sequence with the converted one.
> > THanks.  I pushed this to the trunk.
> >
>
> Sorry for the delay report. With this patch the test
> gcc.target/mips/movcc-3.c fails.
>

The problem may be caused by the difference between `seq` and `edge e`.
In `seq`, there may be a compare operation, while
`default_max_noce_ifcvt_seq_cost` only counts the branch operation.

The rtx_cost may consider the compare operation in `seq` as quite expensive.


-- 
YunQiang Su


Re: [PATCH] ifcvt.cc: Prevent excessive if-conversion for conditional moves

2024-06-09 Thread YunQiang Su
> >
> > gcc/ChangeLog:
> >
> >   * ifcvt.cc (cond_move_process_if_block):
> >   Consider the result of targetm.noce_conversion_profitable_p()
> >   when replacing the original sequence with the converted one.
> THanks.  I pushed this to the trunk.
>

Sorry for the delayed report. With this patch the test
gcc.target/mips/movcc-3.c fails.


> Jeff



-- 
YunQiang Su


[PATCH v2] MIPS: Use signaling fcmp instructions for LT/LE/LTGT

2024-06-08 Thread YunQiang Su
LT/LE: c.lt.fmt/c.le.fmt on pre-R6 and cmp.lt.fmt/cmp.le.fmt have
different semantics:
   c.lt.fmt signals for all NaNs, including qNaN;
   cmp.lt.fmt signals only for sNaN, not for qNaN;
   cmp.slt.fmt has the same semantics as c.lt.fmt;
   lt/le of RTL signal for qNaN.

However, in `s__using_`, the RTL operations
`lt`/`le` are converted to c/cmp's lt/le, which is correct for C.cond.fmt
but not for CMP.cond.fmt. Let's convert them to slt/sle if ISA_HAS_CCF.

For LTGT, which signals for qNaN, `sne` of R6 has the same semantics, while
pre-R6 has only the inverse one, `ngl`.  Thus for RTL we have to use `uneq`
as the operator, and introduce a new CC mode, CCEmode, to mark it as signaling.

This patch can fix
   gcc.dg/torture/pr91323.c for pre-R6;
   gcc.dg/torture/builtin-iseqsig-* for R6.
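
The mnemonic selection described above can be modelled in a few lines of
stand-alone C.  This is a sketch that follows the logic of
mips_output_compare in this patch; the function `select_fcond` and its
`has_ccf` parameter (standing in for ISA_HAS_CCF) are invented for the
illustration.

```c
#include <string.h>

/* Hypothetical stand-alone model of the condition-name selection.
   HAS_CCF stands in for ISA_HAS_CCF (R6 CMP.cond.fmt); CC_MODE is
   the lower-case CC mode name, e.g. "cc" or "cce".  */
static const char *
select_fcond (const char *fcond, const char *cc_mode, int has_ccf)
{
  if (has_ccf)
    {
      /* R6: cmp.lt/cmp.le are quiet, so use the signaling forms.  */
      if (strcmp (fcond, "lt") == 0)
        return "slt";
      if (strcmp (fcond, "le") == 0)
        return "sle";
    }
  else if (strcmp (cc_mode, "cce") == 0)
    {
      /* Pre-R6: LTGT was reversed to UNEQ and tagged with CCEmode;
         emit the signaling inverse, ngl.  */
      if (strcmp (fcond, "ueq") == 0)
        return "ngl";
    }
  return fcond;
}
```

Any other condition name is passed through unchanged, so `eq` stays `eq`
on both R6 and pre-R6.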

gcc:
* config/mips/mips-modes.def: New CC_MODE CCE.
* config/mips/mips-protos.h(mips_output_compare): New function.
* config/mips/mips.cc(mips_allocate_fcc): Set CCEmode count=1.
(mips_emit_compare): Use CCEmode for LTGT/LT/LE for pre-R6.
(mips_output_compare): New function. Convert lt/le to slt/sle
for R6; convert ueq to ngl for CCEmode.
(mips_hard_regno_mode_ok_uncached): Mention CCEmode.
* config/mips/mips.h: Mention CCEmode for LOAD_EXTEND_OP.
* config/mips/mips.md(FPCC): Add CCE.
(define_mode_iterator MOVECC): Mention CCE.
(define_mode_attr reg): Add CCE with "z".
(define_mode_attr fpcmp): Add CCE with "c".
(define_code_attr fcond): ltgt should use sne instead of ne.
(s__using_): call mips_output_compare.
---
 gcc/config/mips/mips-modes.def |  1 +
 gcc/config/mips/mips-protos.h  |  2 ++
 gcc/config/mips/mips.cc| 48 +++---
 gcc/config/mips/mips.h |  2 +-
 gcc/config/mips/mips.md| 19 +-
 5 files changed, 61 insertions(+), 11 deletions(-)

diff --git a/gcc/config/mips/mips-modes.def b/gcc/config/mips/mips-modes.def
index 323570928fc..21f50a22546 100644
--- a/gcc/config/mips/mips-modes.def
+++ b/gcc/config/mips/mips-modes.def
@@ -54,4 +54,5 @@ ADJUST_ALIGNMENT (CCV4, 16);
 CC_MODE (CCDSP);
 
 /* For floating point conditions in FP registers.  */
+CC_MODE (CCE);
 CC_MODE (CCF);
diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips-protos.h
index 835f42128b9..fcc0a0ae663 100644
--- a/gcc/config/mips/mips-protos.h
+++ b/gcc/config/mips/mips-protos.h
@@ -394,4 +394,6 @@ extern bool mips_bit_clear_p (enum machine_mode, unsigned HOST_WIDE_INT);
 extern void mips_bit_clear_info (enum machine_mode, unsigned HOST_WIDE_INT,
  int *, int *);
 
+extern const char *mips_output_compare (const char *fpcmp, const char *fcond,
+   const char *fmt, const char *fpcc_mode, bool swap);
 #endif /* ! GCC_MIPS_PROTOS_H */
diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index 278d9446482..b7acf041903 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -5659,7 +5659,7 @@ mips_allocate_fcc (machine_mode mode)
 
   gcc_assert (TARGET_HARD_FLOAT && ISA_HAS_8CC);
 
-  if (mode == CCmode)
+  if (mode == CCmode || mode == CCEmode)
 count = 1;
   else if (mode == CCV2mode)
 count = 2;
@@ -5788,17 +5788,57 @@ mips_emit_compare (enum rtx_code *code, rtx *op0, rtx *op1, bool need_eq_ne_p)
  /* Three FP conditions cannot be implemented by reversing the
 operands for C.cond.fmt, instead a reversed condition code is
 required and a test for false.  */
+ machine_mode ccmode = CCmode;
+ switch (*code)
+   {
+   case LTGT:
+   case LT:
+   case LE:
+ ccmode = CCEmode;
+ break;
+   default:
+ break;
+   }
  *code = mips_reversed_fp_cond (&cmp_code) ? EQ : NE;
  if (ISA_HAS_8CC)
-   *op0 = mips_allocate_fcc (CCmode);
+   *op0 = mips_allocate_fcc (ccmode);
  else
-   *op0 = gen_rtx_REG (CCmode, FPSW_REGNUM);
+   *op0 = gen_rtx_REG (ccmode, FPSW_REGNUM);
}
 
   *op1 = const0_rtx;
   mips_emit_binary (cmp_code, *op0, cmp_op0, cmp_op1);
 }
 }
+
+
+const char *
+mips_output_compare (const char *fpcmp, const char *fcond,
+   const char *fmt, const char *fpcc_mode, bool swap)
+{
+  const char *fc = fcond;
+
+  if (ISA_HAS_CCF)
+{
+  /* c.lt.fmt is signaling, while cmp.lt.fmt is quiet.  */
+  if (strcmp (fcond, "lt") == 0)
+   fc = "slt";
+  else if (strcmp (fcond, "le") == 0)
+   fc = "sle";
+}
+  else if (strcmp (fpcc_mode, "cce") == 0)
+{
+  /* It was LTGT, while we have only inverse one.  It was then converted
+to UNEQ by mips_reversed_fp_cond, and we used CCEmode to mark it.
+Lets convert it back to ngl now.  */
+  if (strcmp (fcond, "ueq") == 0)
+   fc = "ngl";
+}
+  if (swap)
+return concat(fpcmp, ".", fc, 

[PATCH] MIPS/testsuite: add -mno-branch-likely to r10k-cache-barrier-13.c

2024-06-08 Thread YunQiang Su
In mips.cc(mips_reorg_process_insns), there is this claim:

Also delete cache barriers if the last instruction
was an annulled branch.  INSN will not be speculatively
executed.

And with -O1 on mips64, we can generate binary code like this,
which fails this test.

gcc/testsuite
* gcc.target/mips/r10k-cache-barrier-13.c: Add -mno-branch-likely
option.
---
 gcc/testsuite/gcc.target/mips/r10k-cache-barrier-13.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/mips/r10k-cache-barrier-13.c b/gcc/testsuite/gcc.target/mips/r10k-cache-barrier-13.c
index ee9c84b5988..ac005fb08b3 100644
--- a/gcc/testsuite/gcc.target/mips/r10k-cache-barrier-13.c
+++ b/gcc/testsuite/gcc.target/mips/r10k-cache-barrier-13.c
@@ -1,4 +1,4 @@
-/* { dg-options "-mr10k-cache-barrier=store" } */
+/* { dg-options "-mr10k-cache-barrier=store -mno-branch-likely" } */
 
 /* Test that indirect calls are protected.  */
 
-- 
2.39.3 (Apple Git-146)



[PATCH] MIPS: Use signaling fcmp instructions for LT/LE/LTGT

2024-06-07 Thread YunQiang Su
LT/LE: c.lt.fmt/c.le.fmt on pre-R6 and cmp.lt.fmt/cmp.le.fmt have
different semantics:
   c.lt.fmt signals for all NaNs, including qNaN;
   cmp.lt.fmt signals only for sNaN, not for qNaN;
   cmp.slt.fmt has the same semantics as c.lt.fmt;
   lt/le of RTL signal for qNaN.

However, in `s__using_`, the RTL operations
`lt`/`le` are converted to c/cmp's lt/le, which is correct for C.cond.fmt
but not for CMP.cond.fmt. Let's convert them to slt/sle if ISA_HAS_CCF.

For LTGT, which signals for qNaN, `sne` of R6 has the same semantics, while
pre-R6 has only the inverse one, `ngl`.  Thus for RTL we have to use `uneq`
as the operator, and introduce a new CC mode, CCEmode, to mark it as signaling.

This patch can fix
   gcc.dg/torture/pr91323.c for pre-R6;
   gcc.dg/torture/builtin-iseqsig-* for R6.

gcc:
* config/mips/mips-modes.def: New CC_MODE CCE.
* config/mips/mips-protos.h(mips_output_compare): New function.
* config/mips/mips.cc(mips_allocate_fcc): Set CCEmode count=1.
(mips_emit_compare): Use CCEmode for LTGT/LT/LE for pre-R6.
(mips_output_compare): New function. Convert lt/le to slt/sle
for R6; convert ueq to ngl for CCEmode.
(mips_hard_regno_mode_ok_uncached): Mention CCEmode.
* config/mips/mips.h: Mention CCEmode for LOAD_EXTEND_OP.
* config/mips/mips.md(FPCC): Add CCE.
(define_mode_attr reg): Add CCE with "z".
(define_mode_attr fpcmp): Add CCE with "c".
(define_code_attr fcond): ltgt should use sne instead of ne.
(s__using_): call mips_output_compare.
---
 gcc/config/mips/mips-modes.def |  1 +
 gcc/config/mips/mips-protos.h  |  2 ++
 gcc/config/mips/mips.cc| 48 +++---
 gcc/config/mips/mips.h |  2 +-
 gcc/config/mips/mips.md| 16 +++-
 5 files changed, 58 insertions(+), 11 deletions(-)

diff --git a/gcc/config/mips/mips-modes.def b/gcc/config/mips/mips-modes.def
index 323570928fc..21f50a22546 100644
--- a/gcc/config/mips/mips-modes.def
+++ b/gcc/config/mips/mips-modes.def
@@ -54,4 +54,5 @@ ADJUST_ALIGNMENT (CCV4, 16);
 CC_MODE (CCDSP);
 
 /* For floating point conditions in FP registers.  */
+CC_MODE (CCE);
 CC_MODE (CCF);
diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips-protos.h
index 835f42128b9..fcc0a0ae663 100644
--- a/gcc/config/mips/mips-protos.h
+++ b/gcc/config/mips/mips-protos.h
@@ -394,4 +394,6 @@ extern bool mips_bit_clear_p (enum machine_mode, unsigned HOST_WIDE_INT);
 extern void mips_bit_clear_info (enum machine_mode, unsigned HOST_WIDE_INT,
  int *, int *);
 
+extern const char *mips_output_compare (const char *fpcmp, const char *fcond,
+   const char *fmt, const char *fpcc_mode, bool swap);
 #endif /* ! GCC_MIPS_PROTOS_H */
diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index 278d9446482..b7acf041903 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -5659,7 +5659,7 @@ mips_allocate_fcc (machine_mode mode)
 
   gcc_assert (TARGET_HARD_FLOAT && ISA_HAS_8CC);
 
-  if (mode == CCmode)
+  if (mode == CCmode || mode == CCEmode)
 count = 1;
   else if (mode == CCV2mode)
 count = 2;
@@ -5788,17 +5788,57 @@ mips_emit_compare (enum rtx_code *code, rtx *op0, rtx *op1, bool need_eq_ne_p)
  /* Three FP conditions cannot be implemented by reversing the
 operands for C.cond.fmt, instead a reversed condition code is
 required and a test for false.  */
+ machine_mode ccmode = CCmode;
+ switch (*code)
+   {
+   case LTGT:
+   case LT:
+   case LE:
+ ccmode = CCEmode;
+ break;
+   default:
+ break;
+   }
  *code = mips_reversed_fp_cond (&cmp_code) ? EQ : NE;
  if (ISA_HAS_8CC)
-   *op0 = mips_allocate_fcc (CCmode);
+   *op0 = mips_allocate_fcc (ccmode);
  else
-   *op0 = gen_rtx_REG (CCmode, FPSW_REGNUM);
+   *op0 = gen_rtx_REG (ccmode, FPSW_REGNUM);
}
 
   *op1 = const0_rtx;
   mips_emit_binary (cmp_code, *op0, cmp_op0, cmp_op1);
 }
 }
+
+
+const char *
+mips_output_compare (const char *fpcmp, const char *fcond,
+   const char *fmt, const char *fpcc_mode, bool swap)
+{
+  const char *fc = fcond;
+
+  if (ISA_HAS_CCF)
+{
+  /* c.lt.fmt is signaling, while cmp.lt.fmt is quiet.  */
+  if (strcmp (fcond, "lt") == 0)
+   fc = "slt";
+  else if (strcmp (fcond, "le") == 0)
+   fc = "sle";
+}
+  else if (strcmp (fpcc_mode, "cce") == 0)
+{
+  /* It was LTGT, while we have only inverse one.  It was then converted
+to UNEQ by mips_reversed_fp_cond, and we used CCEmode to mark it.
+Lets convert it back to ngl now.  */
+  if (strcmp (fcond, "ueq") == 0)
+   fc = "ngl";
+}
+  if (swap)
+return concat(fpcmp, ".", fc, ".", fmt, "\t%Z0%2,%1", NULL);
+  return concat(fpcmp,

[committed] MIPS: Need COSTS_N_INSNS in mips_insn_cost

2024-06-05 Thread YunQiang Su
From: YunQiang Su 

In mips_insn_cost, COSTS_N_INSNS is missing when we return the cost
if count * ratio > 0.
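
To show why the unit matters, here is a small stand-alone C sketch.
The COSTS_N_INSNS definition mirrors the one in GCC's rtl.h; the two
functions are hypothetical stand-ins that take the instruction count
and perf ratio as plain arguments instead of reading insn attributes.

```c
/* Mirrors the definition in GCC's rtl.h.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Cost as computed before the fix: a raw instruction count, which is
   not in the units the rest of the cost model uses.  */
static int
cost_before (int count, int ratio)
{
  return count * ratio;
}

/* Cost as computed after the fix: the count is scaled first.  */
static int
cost_after (int count, int ratio)
{
  return COSTS_N_INSNS (count) * ratio;
}
```

With count = 1 and ratio = 1, the old formula yields 1 while a single
instruction is COSTS_N_INSNS (1) = 4 everywhere else, so such insns
looked four times cheaper than they should.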

gcc
* config/mips/mips.cc(mips_insn_cost): Add missing COSTS_N_INSNS
to count.
---
 gcc/config/mips/mips.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index b478cddc8ad..278d9446482 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -4199,7 +4199,7 @@ mips_insn_cost (rtx_insn *x, bool speed)
 
   count = get_attr_insn_count (x);
   ratio = get_attr_perf_ratio (x);
-  cost = count * ratio;
+  cost = COSTS_N_INSNS (count) * ratio;
   if (cost > 0)
 return cost;
 
-- 
2.39.3 (Apple Git-146)



Re: [PATCH] tree-optimization/115254 - don't account single-lane SLP against discovery limit

2024-06-05 Thread YunQiang Su
On Tue, May 28, 2024 at 17:47, Richard Biener wrote:
>
> The following avoids accounting single-lane SLP to the discovery
> limit.  As the two testcases show this makes discovery fail,
> unfortunately even not the same across targets.  The following
> should fix two FAILs for GCN as a side-effect.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
>
> PR tree-optimization/115254
> * tree-vect-slp.cc (vect_build_slp_tree): Only account
> multi-lane SLP to limit.
>
> * gcc.dg/vect/slp-cond-2-big-array.c: Expect 4 times SLP.
> * gcc.dg/vect/slp-cond-2.c: Likewise.

With this patch, MIPS/MSA still reports "vectorizing stmts using SLP"
only 3 times.  I am looking into the problem.


> ---
>  .../gcc.dg/vect/slp-cond-2-big-array.c|  2 +-
>  gcc/testsuite/gcc.dg/vect/slp-cond-2.c|  2 +-
>  gcc/tree-vect-slp.cc  | 31 +++
>  3 files changed, 20 insertions(+), 15 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-cond-2-big-array.c 
> b/gcc/testsuite/gcc.dg/vect/slp-cond-2-big-array.c
> index cb7eb94b3a3..9a9f63c0b8d 100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-cond-2-big-array.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-cond-2-big-array.c
> @@ -128,4 +128,4 @@ main ()
>return 0;
>  }
>
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" 
> } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" 
> } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-cond-2.c 
> b/gcc/testsuite/gcc.dg/vect/slp-cond-2.c
> index 1dcee46cd95..08bbb3dbec6 100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-cond-2.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-cond-2.c
> @@ -128,4 +128,4 @@ main ()
>return 0;
>  }
>
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" 
> } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" 
> } } */
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 0dd9a4daf6a..bbfde8849c1 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -1725,21 +1725,26 @@ vect_build_slp_tree (vec_info *vinfo,
>SLP_TREE_SCALAR_STMTS (res) = stmts;
>bst_map->put (stmts.copy (), res);
>
> -  if (*limit == 0)
> +  /* Single-lane SLP doesn't have the chance of run-away, do not account
> + it to the limit.  */
> +  if (stmts.length () > 1)
>  {
> -  if (dump_enabled_p ())
> -   dump_printf_loc (MSG_NOTE, vect_location,
> -"SLP discovery limit exceeded\n");
> -  /* Mark the node invalid so we can detect those when still in use
> -as backedge destinations.  */
> -  SLP_TREE_SCALAR_STMTS (res) = vNULL;
> -  SLP_TREE_DEF_TYPE (res) = vect_uninitialized_def;
> -  res->failed = XNEWVEC (bool, group_size);
> -  memset (res->failed, 0, sizeof (bool) * group_size);
> -  memset (matches, 0, sizeof (bool) * group_size);
> -  return NULL;
> +  if (*limit == 0)
> +   {
> + if (dump_enabled_p ())
> +   dump_printf_loc (MSG_NOTE, vect_location,
> +"SLP discovery limit exceeded\n");
> + /* Mark the node invalid so we can detect those when still in use
> +as backedge destinations.  */
> + SLP_TREE_SCALAR_STMTS (res) = vNULL;
> + SLP_TREE_DEF_TYPE (res) = vect_uninitialized_def;
> + res->failed = XNEWVEC (bool, group_size);
> + memset (res->failed, 0, sizeof (bool) * group_size);
> + memset (matches, 0, sizeof (bool) * group_size);
> + return NULL;
> +   }
> +  --*limit;
>  }
> -  --*limit;
>
>if (dump_enabled_p ())
>  dump_printf_loc (MSG_NOTE, vect_location,
> --
> 2.35.3
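
The accounting change in the quoted patch can be sketched in isolation as follows. This is only an illustration under stated assumptions: account_slp_node is a hypothetical helper, and the real code in vect_build_slp_tree also invalidates the node and fills the failure arrays when the limit is hit.

```c
#include <assert.h>

/* Hypothetical helper modelling the limit accounting.  LANES is the
   number of scalar stmts in the node, LIMIT the remaining discovery
   budget.  Returns 1 if discovery may continue, 0 if the limit is
   exceeded.  */
int
account_slp_node (unsigned lanes, unsigned *limit)
{
  /* Single-lane SLP cannot run away, so it is not charged.  */
  if (lanes > 1)
    {
      if (*limit == 0)
        return 0;
      --*limit;
    }
  return 1;
}

/* Self-check: with an exhausted budget, single-lane nodes still pass
   while multi-lane nodes are rejected.  */
void
account_slp_node_selfcheck (void)
{
  unsigned limit = 0;
  assert (account_slp_node (1, &limit) == 1);
  assert (account_slp_node (2, &limit) == 0);
}
```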



-- 
YunQiang Su


Re: [PATCH v2 1/2] driver: Use -as/ld/objcopy as final fallback instead of native ones for cross

2024-06-05 Thread YunQiang Su
On Wed, May 29, 2024 at 10:02, YunQiang Su wrote:
>
> On Wed, May 29, 2024 at 05:28, Richard Sandiford wrote:
> >
> > YunQiang Su  writes:
> > > If `find_a_program` cannot find `as/ld/objcopy` and we are a cross
> > > toolchain, the final fallback is the system's `as/ld`.  In fact, we can
> > > try the triple-prefixed as/ld/objcopy before falling back to the native
> > > as/ld/objcopy.
> > >
> > > This patch is derived from Debian's patch:
> > >   gcc-search-prefixed-as-ld.diff
> >
> > I'm probably making you repeat a previous discussion, sorry, but could
> > you describe the use case in more detail?  The current approach to
> > handling cross toolchains has been used for many years.  Presumably
> > this patch is supporting a different way of organising things,
> > but I wasn't sure from the description what it was.
> >
> > AIUI, we currently assume that cross as, ld and objcopy will be
> > installed under those names in $prefix/$target_alias/bin (aka $tooldir/bin).
> > E.g.:
> >
> >bin/aarch64-elf-as = aarch64-elf/bin/as
> >
> > GCC should then find as in aarch64-elf/bin.
> >
> > Is that not true in your case?
> >
>
> Yes, it is. This patch only changes the final fallback: aarch64-elf/bin/as
> still has higher priority than bin/aarch64-elf-as.
>
> In the current code, we look for gas in this order:
> /prefix/aarch64-elf/bin/as > $PATH/as
>
> This patch adds a new candidate between them:
> /prefix/aarch64-elf/bin/as > $PATH/aarch64-elf-as > $PATH/as
>
> > To be clear, I'm not saying the patch is wrong.  I'm just trying to
> > understand why the patch is needed.
> >
>
> Right. If gcc is configured correctly, this is not very useful.
> It may still help users who do not set everything up by hand, for
> example when binutils is installed under a different prefix than libc:
> binutils in /usr/aarch64-elf/bin and libc in /usr/local/aarch64-elf/.
>

Any thoughts on this? Does this use case make sense?


Re: [PATCH] expmed: TRUNCATE value1 if needed in store_bit_field_using_insv

2024-06-05 Thread YunQiang Su
On Wed, Jun 5, 2024 at 23:20, Richard Sandiford wrote:
>
> YunQiang Su  writes:
> > On Wed, Jun 5, 2024 at 22:14, Richard Sandiford wrote:
> >>
> >> YunQiang Su  writes:
> >> > PR target/113179.
> >> >
> >> > In `store_bit_field_using_insv`, we just use SUBREG if
> >> > value_mode >= op_mode, while in some ports, a sign_extend will be needed,
> >> > such as MIPS64:
> >> >   If either GPR rs or GPR rt does not contain sign-extended 32-bit
> >> >   values (bits 63..31 equal), then the result of the operation is
> >> >   UNPREDICTABLE.
> >> >
> >> > The problem happens for the code like:
> >> >   struct xx {
> >> > int a:4;
> >> > int b:24;
> >> > int c:3;
> >> > int d:1;
> >> >   };
> >> >
> >> >   void xx (struct xx *a, long long b) {
> >> > a->d = b;
> >> >   }
> >> >
> >> > In the above code, the hard register that contains `b` may not be
> >> > properly sign-extended.
> >> >
> >> > gcc/
> >> >   PR target/113179
> >> >   * expmed.c(store_bit_field_using_insv): TRUNCATE value1 if
> >> >   needed.
> >> >
> >> > gcc/testsuite
> >> >   PR target/113179
> >> >   * gcc.target/mips/pr113179.c: New tests.
> >> > ---
> >> >  gcc/expmed.cc| 12 +---
> >> >  gcc/testsuite/gcc.target/mips/pr113179.c | 18 ++
> >> >  2 files changed, 27 insertions(+), 3 deletions(-)
> >> >  create mode 100644 gcc/testsuite/gcc.target/mips/pr113179.c
> >> >
> >> > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> >> > index 4ec035e4843..6a582593da8 100644
> >> > --- a/gcc/expmed.cc
> >> > +++ b/gcc/expmed.cc
> >> > @@ -704,9 +704,15 @@ store_bit_field_using_insv (const extraction_insn 
> >> > *insv, rtx op0,
> >> >   }
> >> > else
> >> >   {
> >> > -   tmp = gen_lowpart_if_possible (op_mode, value1);
> >> > -   if (! tmp)
> >> > - tmp = gen_lowpart (op_mode, force_reg (value_mode, 
> >> > value1));
> >> > +   if (targetm.mode_rep_extended (op_mode, value_mode))
> >> > + tmp = simplify_gen_unary (TRUNCATE, op_mode,
> >> > +   value1, value_mode);
> >> > +   else
> >> > + {
> >> > +   tmp = gen_lowpart_if_possible (op_mode, value1);
> >> > +   if (! tmp)
> >> > + tmp = gen_lowpart (op_mode, force_reg (value_mode, 
> >> > value1));
> >> > + }
> >> >   }
> >> > value1 = tmp;
> >> >   }
> >>
> >> I notice this patch is already applied.  Was it approved?  I didn't
> >> see an approval in my feed or in the archives.
> >>
> >
> > Sorry. I had assumed that it only affects MIPS targets, since only MIPS
> > defines
> >   targetm.mode_rep_extended
> >
> >> Although it might not make any difference on current targets,
> >> I think the conditional should logically be based on
> >> TRULY_NOOP_TRUNCATION(_MODES_P) rather than targetm.mode_rep_extended.
> >>
> >> TRULY_NOOP_TRUNCATION is a correctness question: can I use subregs
> >> to do this truncation?  targetm.mode_rep_extended is instead an
> >> optimisation question: can I assume that a particular extension is free?
> >> Here we're trying to avoid a subreg for correctness, rather than trying
> >> to take advantage of a cheap extension.
> >>
> >> So I think the code should be:
> >>
> >>   if (GET_MODE_SIZE (value_mode) < GET_MODE_SIZE (op_mode))
> >> {
> >>   tmp = simplify_subreg (op_mode, value1, value_mode, 0);
> >>   if (! tmp)
> >> tmp = simplify_gen_subreg (op_mode,
> >>force_reg (value_mode, value1),
> >>value_mode, 0);
> >> }
> >>   else if (GET_MODE_SIZE (op_mode) < GET_MODE_SIZE (value_mode)
> >>&& !TRULY_NOOP_TRUNCATION_MODES_P (op_mode, value_mode))
> &

Re: [PATCH] expmed: TRUNCATE value1 if needed in store_bit_field_using_insv

2024-06-05 Thread YunQiang Su
On Wed, Jun 5, 2024 at 22:14, Richard Sandiford wrote:
>
> YunQiang Su  writes:
> > PR target/113179.
> >
> > In `store_bit_field_using_insv`, we just use SUBREG if
> > value_mode >= op_mode, while in some ports, a sign_extend will be needed,
> > such as MIPS64:
> >   If either GPR rs or GPR rt does not contain sign-extended 32-bit
> >   values (bits 63..31 equal), then the result of the operation is
> >   UNPREDICTABLE.
> >
> > The problem happens for the code like:
> >   struct xx {
> > int a:4;
> > int b:24;
> > int c:3;
> > int d:1;
> >   };
> >
> >   void xx (struct xx *a, long long b) {
> > a->d = b;
> >   }
> >
> > In the above code, the hard register that contains `b` may not be
> > properly sign-extended.
> >
> > gcc/
> >   PR target/113179
> >   * expmed.c(store_bit_field_using_insv): TRUNCATE value1 if
> >   needed.
> >
> > gcc/testsuite
> >   PR target/113179
> >   * gcc.target/mips/pr113179.c: New tests.
> > ---
> >  gcc/expmed.cc| 12 +---
> >  gcc/testsuite/gcc.target/mips/pr113179.c | 18 ++
> >  2 files changed, 27 insertions(+), 3 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/mips/pr113179.c
> >
> > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> > index 4ec035e4843..6a582593da8 100644
> > --- a/gcc/expmed.cc
> > +++ b/gcc/expmed.cc
> > @@ -704,9 +704,15 @@ store_bit_field_using_insv (const extraction_insn 
> > *insv, rtx op0,
> >   }
> > else
> >   {
> > -   tmp = gen_lowpart_if_possible (op_mode, value1);
> > -   if (! tmp)
> > - tmp = gen_lowpart (op_mode, force_reg (value_mode, value1));
> > +   if (targetm.mode_rep_extended (op_mode, value_mode))
> > + tmp = simplify_gen_unary (TRUNCATE, op_mode,
> > +   value1, value_mode);
> > +   else
> > + {
> > +   tmp = gen_lowpart_if_possible (op_mode, value1);
> > +   if (! tmp)
> > + tmp = gen_lowpart (op_mode, force_reg (value_mode, 
> > value1));
> > + }
> >   }
> > value1 = tmp;
> >   }
>
> I notice this patch is already applied.  Was it approved?  I didn't
> see an approval in my feed or in the archives.
>

Sorry. I had assumed that it only affects MIPS targets, since only MIPS defines
  targetm.mode_rep_extended

> Although it might not make any difference on current targets,
> I think the conditional should logically be based on
> TRULY_NOOP_TRUNCATION(_MODES_P) rather than targetm.mode_rep_extended.
>
> TRULY_NOOP_TRUNCATION is a correctness question: can I use subregs
> to do this truncation?  targetm.mode_rep_extended is instead an
> optimisation question: can I assume that a particular extension is free?
> Here we're trying to avoid a subreg for correctness, rather than trying
> to take advantage of a cheap extension.
>
> So I think the code should be:
>
>   if (GET_MODE_SIZE (value_mode) < GET_MODE_SIZE (op_mode))
> {
>   tmp = simplify_subreg (op_mode, value1, value_mode, 0);
>   if (! tmp)
> tmp = simplify_gen_subreg (op_mode,
>force_reg (value_mode, value1),
>value_mode, 0);
> }
>   else if (GET_MODE_SIZE (op_mode) < GET_MODE_SIZE (value_mode)
>&& !TRULY_NOOP_TRUNCATION_MODES_P (op_mode, value_mode))

In fact I don't think so. On targets other than MIPS, this is safe even if
!TRULY_NOOP_TRUNCATION_MODES_P (op_mode, value_mode),
because the INS instruction can safely use the low part of a register.

It is only unsafe according to the MIPS ISA documents, which say:
 If either GPR rs or GPR rt does not contain sign-extended 32-bit
 values (bits 63..31 equal), then the result of the operation is
 UNPREDICTABLE.

This is very annoying. I haven't noticed a similar requirement in any other
ISA documents.
In fact I don't know any real MIPS hardware that is "UNPREDICTABLE" in
this case.
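
To make the ISA requirement concrete, here is a small C model of the "bits 63..31 equal" condition and of what an explicit TRUNCATE restores. It is only a sketch: both function names are hypothetical, and truncate_to_si models the usual MIPS64 behaviour of a 32-bit operation sign-extending its result, not any particular GCC pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if REG, viewed as a 64-bit GPR, holds a sign-extended 32-bit
   value, i.e. bits 63..31 are all equal.  */
int
is_sign_extended_32 (uint64_t reg)
{
  uint64_t high = reg >> 31;    /* bits 63..31, 33 bits wide */
  return high == 0 || high == 0x1ffffffffULL;
}

/* Model of an explicit truncation to 32 bits: keep the low 32 bits and
   replicate bit 31 into bits 63..32.  */
uint64_t
truncate_to_si (uint64_t reg)
{
  uint64_t low = reg & 0xffffffffULL;
  if (low & 0x80000000ULL)
    low |= 0xffffffff00000000ULL;
  return low;
}

/* Self-check: an arbitrary `long long` value need not satisfy the
   condition, but its truncation always does.  */
void
sign_extension_selfcheck (void)
{
  uint64_t b = 0x123456789abcdef0ULL;
  assert (!is_sign_extended_32 (b));
  assert (is_sign_extended_32 (truncate_to_si (b)));
}
```

A plain SUBREG corresponds to passing `b` through unchanged, which is why the register handed to INS may violate the condition.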

> tmp = simplify_gen_unary (TRUNCATE, op_mode, value1, value_mode);
>   else
> {
>   tmp = gen_lowpart_if_possible (op_mode, value1);
>   if (! tmp)
> tmp = gen_lowpart (op_mode, force_reg (value_mode, value1));
> }
>
> (also includes unnesting of the else).  Could you try chang

Re: [PATCH 49/52] mips: New hook implementation mips_c_mode_for_floating_type

2024-06-03 Thread YunQiang Su
On Mon, Jun 3, 2024 at 11:03, Kewen Lin wrote:
>
> This is to add new port specific hook implementation
> mips_c_mode_for_floating_type, remove macros FLOAT_TYPE_SIZE
> and DOUBLE_TYPE_SIZE, rename LONG_DOUBLE_TYPE_SIZE to
> MIPS_LONG_DOUBLE_TYPE_SIZE since we poison LONG_DOUBLE_TYPE_SIZE
> but some subtarget wants to redefine it and some macro defines
> need it.
>

Looks good to me if the framework is approved.

> gcc/ChangeLog:
>
> * config/mips/mips.cc (mips_c_mode_for_floating_type): New function.
> (TARGET_C_MODE_FOR_FLOATING_TYPE): New macro.
> * config/mips/mips.h (UNITS_PER_FPVALUE): Replace 
> LONG_DOUBLE_TYPE_SIZE
> with MIPS_LONG_DOUBLE_TYPE_SIZE.
> (MAX_FIXED_MODE_SIZE): Likewise.
> (STRUCTURE_SIZE_BOUNDARY): Likewise.
> (BIGGEST_ALIGNMENT): Likewise.
> (FLOAT_TYPE_SIZE): Remove.
> (DOUBLE_TYPE_SIZE): Remove.
> (LONG_DOUBLE_TYPE_SIZE): Rename to ...
> (MIPS_LONG_DOUBLE_TYPE_SIZE): ... this.
> * config/mips/n32-elf.h (LONG_DOUBLE_TYPE_SIZE): Rename to ...
> (MIPS_LONG_DOUBLE_TYPE_SIZE): ... this.
> ---
>  gcc/config/mips/mips.cc   | 14 ++
>  gcc/config/mips/mips.h| 13 ++---
>  gcc/config/mips/n32-elf.h |  4 ++--
>  3 files changed, 22 insertions(+), 9 deletions(-)
>
> diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
> index b63d40a357b..3e5035a385c 100644
> --- a/gcc/config/mips/mips.cc
> +++ b/gcc/config/mips/mips.cc
> @@ -22972,6 +22972,18 @@ mips_asm_file_end (void)
>  file_end_indicate_exec_stack ();
>  }
>
> +/* Implement TARGET_C_MODE_FOR_FLOATING_TYPE.  Return TFmode or DFmode
> +   for TI_LONG_DOUBLE_TYPE which is for long double type, go with the
> +   default one for the others.  */
> +
> +static machine_mode
> +mips_c_mode_for_floating_type (enum tree_index ti)
> +{
> +  if (ti == TI_LONG_DOUBLE_TYPE)
> +return MIPS_LONG_DOUBLE_TYPE_SIZE == 64 ? DFmode : TFmode;
> +  return default_mode_for_floating_type (ti);
> +}
> +
>  void
>  mips_bit_clear_info (enum machine_mode mode, unsigned HOST_WIDE_INT m,
>   int *start_pos, int *size)
> @@ -23340,6 +23352,8 @@ mips_bit_clear_p (enum machine_mode mode, unsigned 
> HOST_WIDE_INT m)
>  #undef TARGET_ASM_FILE_END
>  #define TARGET_ASM_FILE_END mips_asm_file_end
>
> +#undef TARGET_C_MODE_FOR_FLOATING_TYPE
> +#define TARGET_C_MODE_FOR_FLOATING_TYPE mips_c_mode_for_floating_type
>
>  struct gcc_target targetm = TARGET_INITIALIZER;
>
> diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
> index 9d965966f2f..7a9b18c8096 100644
> --- a/gcc/config/mips/mips.h
> +++ b/gcc/config/mips/mips.h
> @@ -1654,7 +1654,7 @@ FP_ASM_SPEC "\
>  #define UNITS_PER_FPVALUE  \
>(TARGET_SOFT_FLOAT_ABI ? 0   \
> : TARGET_SINGLE_FLOAT ? UNITS_PER_FPREG \
> -   : LONG_DOUBLE_TYPE_SIZE / BITS_PER_UNIT)
> +   : MIPS_LONG_DOUBLE_TYPE_SIZE / BITS_PER_UNIT)
>
>  /* The number of bytes in a double.  */
>  #define UNITS_PER_DOUBLE (TYPE_PRECISION (double_type_node) / BITS_PER_UNIT)
> @@ -1665,9 +1665,8 @@ FP_ASM_SPEC "\
>  #define LONG_TYPE_SIZE (TARGET_LONG64 ? 64 : 32)
>  #define LONG_LONG_TYPE_SIZE 64
>
> -#define FLOAT_TYPE_SIZE 32
> -#define DOUBLE_TYPE_SIZE 64
> -#define LONG_DOUBLE_TYPE_SIZE (TARGET_NEWABI ? 128 : 64)
> +/* LONG_DOUBLE_TYPE_SIZE gets poisoned, so add MIPS_ prefix.  */
> +#define MIPS_LONG_DOUBLE_TYPE_SIZE (TARGET_NEWABI ? 128 : 64)
>
>  /* Define the sizes of fixed-point types.  */
>  #define SHORT_FRACT_TYPE_SIZE 8
> @@ -1684,7 +1683,7 @@ FP_ASM_SPEC "\
>
>  /* long double is not a fixed mode, but the idea is that, if we
> support long double, we also want a 128-bit integer type.  */
> -#define MAX_FIXED_MODE_SIZE LONG_DOUBLE_TYPE_SIZE
> +#define MAX_FIXED_MODE_SIZE MIPS_LONG_DOUBLE_TYPE_SIZE
>
>  /* Width in bits of a pointer.  */
>  #ifndef POINTER_SIZE
> @@ -1705,10 +1704,10 @@ FP_ASM_SPEC "\
>  #define STRUCTURE_SIZE_BOUNDARY 8
>
>  /* There is no point aligning anything to a rounder boundary than
> -   LONG_DOUBLE_TYPE_SIZE, unless under MSA the bigggest alignment is
> +   MIPS_LONG_DOUBLE_TYPE_SIZE, unless under MSA the bigggest alignment is
> BITS_PER_MSA_REG.  */
>  #define BIGGEST_ALIGNMENT \
> -  (ISA_HAS_MSA ? BITS_PER_MSA_REG : LONG_DOUBLE_TYPE_SIZE)
> +  (ISA_HAS_MSA ? BITS_PER_MSA_REG : MIPS_LONG_DOUBLE_TYPE_SIZE)
>
>  /* All accesses must be aligned.  */
>  #define STRICT_ALIGNMENT (!ISA_HAS_UNALIGNED_ACCESS)
> diff --git a/gcc/config/mips/n32-elf.h b/gcc/config/mips/n32-elf.h
> index 94a90d847f0..01c8a852539 100644
> --- a/gcc/config/mips/n32-elf.h
> +++ b/gcc/config/mips/n32-elf.h
> @@ -26,5 +26,5 @@ along with GCC; see the file COPYING3.  If not see
>  #define NO_DOLLAR_IN_LABEL
>
>  /* Force n32 to use 64-bit long doubles.  */
> -#undef LONG_DOUBLE_TYPE_SIZE
> -#define LONG_DOUBLE_TYPE_SIZE 64
> +#undef MIPS_LONG_DOUBLE_TYPE_SIZE
> +#define MIPS_LONG_DOUBLE_TYPE_SIZE 64
> --
> 2.43.0
>


Re: [PATCH v2 1/2] driver: Use -as/ld/objcopy as final fallback instead of native ones for cross

2024-05-28 Thread YunQiang Su
On Wed, May 29, 2024 at 05:28, Richard Sandiford wrote:
>
> YunQiang Su  writes:
> > If `find_a_program` cannot find `as/ld/objcopy` and we are a cross
> > toolchain, the final fallback is the system's `as/ld`.  In fact, we can
> > try the triple-prefixed as/ld/objcopy before falling back to the native
> > as/ld/objcopy.
> >
> > This patch is derived from Debian's patch:
> >   gcc-search-prefixed-as-ld.diff
>
> I'm probably making you repeat a previous discussion, sorry, but could
> you describe the use case in more detail?  The current approach to
> handling cross toolchains has been used for many years.  Presumably
> this patch is supporting a different way of organising things,
> but I wasn't sure from the description what it was.
>
> AIUI, we currently assume that cross as, ld and objcopy will be
> installed under those names in $prefix/$target_alias/bin (aka $tooldir/bin).
> E.g.:
>
>bin/aarch64-elf-as = aarch64-elf/bin/as
>
> GCC should then find as in aarch64-elf/bin.
>
> Is that not true in your case?
>

Yes, it is. This patch only changes the final fallback: aarch64-elf/bin/as
still has higher priority than bin/aarch64-elf-as.

In the current code, we look for gas in this order:
/prefix/aarch64-elf/bin/as > $PATH/as

This patch adds a new candidate between them:
/prefix/aarch64-elf/bin/as > $PATH/aarch64-elf-as > $PATH/as
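
The priority order above can be sketched as a simple first-match lookup. This is an illustration only: first_available, exists_fn, fake_exists and as_candidates are hypothetical names, and the real driver goes through find_a_program and pex_one rather than a candidate list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate: nonzero if PATH names a usable program.  */
typedef int (*exists_fn) (const char *path);

/* Return the first candidate accepted by EXISTS, or NULL.  CANDIDATES is
   NULL-terminated and ordered from highest to lowest priority.  */
const char *
first_available (const char *const *candidates, exists_fn exists)
{
  for (size_t i = 0; candidates[i] != NULL; i++)
    if (exists (candidates[i]))
      return candidates[i];
  return NULL;
}

/* Pretend that only the prefixed and the native assembler are installed.  */
int
fake_exists (const char *path)
{
  return strcmp (path, "aarch64-elf-as") == 0 || strcmp (path, "as") == 0;
}

/* Search order with the patch applied, for target aarch64-elf.  */
const char *const as_candidates[] = {
  "/prefix/aarch64-elf/bin/as", /* tooldir, still highest priority */
  "aarch64-elf-as",             /* new: prefixed tool found via $PATH */
  "as",                         /* old final fallback: native as */
  NULL
};

/* Self-check: the prefixed tool wins over the native one.  */
void
search_order_selfcheck (void)
{
  const char *found = first_available (as_candidates, fake_exists);
  assert (found != NULL && strcmp (found, "aarch64-elf-as") == 0);
}
```

Without the middle entry, the same lookup would return the native "as", which cannot assemble code for the target.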

> To be clear, I'm not saying the patch is wrong.  I'm just trying to
> understand why the patch is needed.
>

Right. If gcc is configured correctly, this is not very useful.
It may still help users who do not set everything up by hand, for
example when binutils is installed under a different prefix than libc:
binutils in /usr/aarch64-elf/bin and libc in /usr/local/aarch64-elf/.

> Thanks,
> Richard
>
> >
> > gcc
> >   * gcc.cc(execute): Looks for -as/ld/objcopy before fallback
> >   to native as/ld/objcopy.
> > ---
> >  gcc/gcc.cc | 20 
> >  1 file changed, 20 insertions(+)
> >
> > diff --git a/gcc/gcc.cc b/gcc/gcc.cc
> > index 830a4700a87..3dc6348d761 100644
> > --- a/gcc/gcc.cc
> > +++ b/gcc/gcc.cc
> > @@ -3293,6 +3293,26 @@ execute (void)
> >string = find_a_program(commands[0].prog);
> >if (string)
> >   commands[0].argv[0] = string;
> > +  else if (*cross_compile != '0'
> > + && !strcmp (commands[0].argv[0], commands[0].prog)
> > + && (!strcmp (commands[0].prog, "as")
> > + || !strcmp (commands[0].prog, "ld")
> > + || !strcmp (commands[0].prog, "objcopy")))
> > + {
> > +   string = concat (DEFAULT_REAL_TARGET_MACHINE, "-",
> > + commands[0].prog, NULL);
> > +   const char *string_args[] = {string, "--version", NULL};
> > +   int exit_status = 0;
> > +   int err = 0;
> > +   const char *errmsg = pex_one (PEX_SEARCH, string,
> > +   CONST_CAST (char **, string_args), string,
> > +   NULL, NULL, &exit_status, &err);
> > +   if (errmsg == NULL && exit_status == 0 && err == 0)
> > + {
> > +   commands[0].argv[0] = string;
> > +   commands[0].prog = string;
> > + }
> > + }
> >  }
> >
> >for (n_commands = 1, i = 0; argbuf.iterate (i, &arg); i++)


Re: [PATCH v2 1/2] driver: Use -as/ld/objcopy as final fallback instead of native ones for cross

2024-05-28 Thread YunQiang Su
On Wed, May 22, 2024 at 17:54, YunQiang Su wrote:
>
> If `find_a_program` cannot find `as/ld/objcopy` and we are a cross toolchain,
> the final fallback is the system's `as/ld`.  In fact, we can try the
> triple-prefixed as/ld/objcopy before falling back to the native as/ld/objcopy.
>
> This patch is derived from Debian's patch:
>   gcc-search-prefixed-as-ld.diff
>
> gcc
> * gcc.cc(execute): Looks for -as/ld/objcopy before fallback
> to native as/ld/objcopy.

ping. OK for the trunk?

> ---
>  gcc/gcc.cc | 20 
>  1 file changed, 20 insertions(+)
>
> diff --git a/gcc/gcc.cc b/gcc/gcc.cc
> index 830a4700a87..3dc6348d761 100644
> --- a/gcc/gcc.cc
> +++ b/gcc/gcc.cc
> @@ -3293,6 +3293,26 @@ execute (void)
>string = find_a_program(commands[0].prog);
>if (string)
> commands[0].argv[0] = string;
> +  else if (*cross_compile != '0'
> +   && !strcmp (commands[0].argv[0], commands[0].prog)
> +   && (!strcmp (commands[0].prog, "as")
> +   || !strcmp (commands[0].prog, "ld")
> +   || !strcmp (commands[0].prog, "objcopy")))
> +   {
> + string = concat (DEFAULT_REAL_TARGET_MACHINE, "-",
> +   commands[0].prog, NULL);
> + const char *string_args[] = {string, "--version", NULL};
> + int exit_status = 0;
> + int err = 0;
> + const char *errmsg = pex_one (PEX_SEARCH, string,
> + CONST_CAST (char **, string_args), string,
> + NULL, NULL, &exit_status, &err);
> + if (errmsg == NULL && exit_status == 0 && err == 0)
> +   {
> + commands[0].argv[0] = string;
> + commands[0].prog = string;
> +   }
> +   }
>  }
>
>for (n_commands = 1, i = 0; argbuf.iterate (i, &arg); i++)
> --
> 2.39.2
>


[PATCH] MIPS16: Mark $2/$3 as clobbered if GP is used

2024-05-28 Thread YunQiang Su
PR target/84790.
The gp init sequence
li      $2,%hi(_gp_disp)
addiu   $3,$pc,%lo(_gp_disp)
sll     $2,16
addu    $2,$3
is generated directly in `mips_output_function_prologue`, and does
not appear in the RTL.

So the IRA/IPA passes are not aware that $2/$3 have been clobbered,
and they may keep values in these registers across (local) function calls.

Let's mark $2/$3 as clobbered both:
  - Just after the UNSPEC_GP RTL of a function;
  - Just after a function call.

Reported-by: Matthias Schiffer 
Origin-Patch-by: Felix Fietkau .

gcc
* config/mips/mips.cc(mips16_gp_pseudo_reg): Mark
MIPS16_PIC_TEMP and MIPS_PROLOGUE_TEMP clobbered.
(mips_emit_call_insn): Mark MIPS16_PIC_TEMP and
MIPS_PROLOGUE_TEMP clobbered if MIPS16 and CALL_CLOBBERED_GP.
---
 gcc/config/mips/mips.cc | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index b63d40a357b..b478cddc8ad 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -3233,6 +3233,9 @@ mips_emit_call_insn (rtx pattern, rtx orig_addr, rtx 
addr, bool lazy_p)
 {
   rtx post_call_tmp_reg = gen_rtx_REG (word_mode, POST_CALL_TMP_REG);
   clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn), post_call_tmp_reg);
+  clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn), MIPS16_PIC_TEMP);
+  clobber_reg (&CALL_INSN_FUNCTION_USAGE (insn),
+   MIPS_PROLOGUE_TEMP (word_mode));
 }
 
   return insn;
@@ -3329,7 +3332,13 @@ mips16_gp_pseudo_reg (void)
   rtx set = gen_load_const_gp (cfun->machine->mips16_gp_pseudo_rtx);
   rtx_insn *insn = emit_insn_after (set, scan);
   INSN_LOCATION (insn) = 0;
-
+  /* NewABI support hasn't been implement.  NewABI should generate RTL
+sequence instead of ASM sequence directly.  */
+  if (mips_current_loadgp_style () == LOADGP_OLDABI)
+   {
+ emit_clobber (MIPS16_PIC_TEMP);
+ emit_clobber (MIPS_PROLOGUE_TEMP (Pmode));
+   }
   pop_topmost_sequence ();
 }
 
-- 
2.39.2



[PATCH] MIPS/testsuite: Fix bseli.b fail in msa-builtins.c

2024-05-28 Thread YunQiang Su
commit 05daf617ea22e1d818295ed2d037456937e23530
Author: Jeff Law 
Date:   Sat May 25 12:39:05 2024 -0600

[committed] [v2] More logical op simplifications in simplify-rtx.cc

performs some simplifications, after which `bseli.b $w1,$w0,255` is
recognized as equivalent to `or.v $w1,$w0,$w1`.  As a result, no bseli.b
instruction is generated.

Let's use 254 instead of 255 to test the generation of `bseli.b`.

gcc/testsuite

* gcc.target/mips/msa-builtins.c: Use 254 instead of 255 for
bseli.b, as `bseli.b $w0,$w1,255` is same as `or.v $w0,$w0,$w1`.
---
 gcc/testsuite/gcc.target/mips/msa-builtins.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/mips/msa-builtins.c 
b/gcc/testsuite/gcc.target/mips/msa-builtins.c
index a679f065f34..6a146b3e6ae 100644
--- a/gcc/testsuite/gcc.target/mips/msa-builtins.c
+++ b/gcc/testsuite/gcc.target/mips/msa-builtins.c
@@ -705,7 +705,7 @@
 #define BNEG(T) NOMIPS16 T FN (bneg, T ## _DF) (T i, T j) { return BUILTIN 
(bneg, T ## _DF) (i, j); }
 #define BNEGI(T) NOMIPS16 T FN (bnegi, T ## _DF) (T i) { return BUILTIN 
(bnegi, T ## _DF) (i, 0); }
 #define BSEL(T) NOMIPS16 T FN (bsel, v) (T i, T j, T k) { return BUILTIN 
(bsel, v) (i, j, k); }
-#define BSELI(T) NOMIPS16 T FN (bseli, T ## _DF) (T i, T j) { return BUILTIN 
(bseli, T ## _DF) (i, j, U8MAX); }
+#define BSELI(T) NOMIPS16 T FN (bseli, T ## _DF) (T i, T j) { return BUILTIN 
(bseli, T ## _DF) (i, j, U8MAX-1); }
 #define BSET(T) NOMIPS16 T FN (bset, T ## _DF) (T i, T j) { return BUILTIN 
(bset, T ## _DF) (i, j); }
 #define BSETI(T) NOMIPS16 T FN (bseti, T ## _DF) (T i) { return BUILTIN 
(bseti, T ## _DF) (i, 0); }
 #define NLOC(T) NOMIPS16 T FN (nloc, T ## _DF) (T i) { return BUILTIN (nloc, T 
## _DF) (i); }
-- 
2.39.2
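
The equivalence that motivates this change can be checked with a per-byte C model. This is a sketch under an assumption: bseli_b_lane follows my reading of the MSA description of bseli.b (bits of the destination select between the source and the immediate), and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* One byte lane of bseli.b: where a bit of WD is 0 take the bit from WS,
   where it is 1 take the bit from the immediate I8.  */
uint8_t
bseli_b_lane (uint8_t wd, uint8_t ws, uint8_t i8)
{
  return (uint8_t) ((ws & (uint8_t) ~wd) | (i8 & wd));
}

/* Return 1 if, for every pair of byte values, the result with immediate
   IMM equals a plain bitwise OR of the two inputs.  */
int
bseli_is_or (uint8_t imm)
{
  for (unsigned wd = 0; wd < 256; wd++)
    for (unsigned ws = 0; ws < 256; ws++)
      if (bseli_b_lane ((uint8_t) wd, (uint8_t) ws, imm) != (wd | ws))
        return 0;
  return 1;
}

/* Self-check: 255 degenerates to OR, 254 does not.  */
void
bseli_selfcheck (void)
{
  assert (bseli_is_or (255));
  assert (!bseli_is_or (254));
}
```

With 254 the lowest bit of the immediate is 0, so the result differs from an OR whenever the lowest bit of the destination is set and that of the source is clear.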



[PATCH v2 2/2] driver: Search -as/ld/objcopy before non-triple ones

2024-05-22 Thread YunQiang Su
When looking for as/ld/objcopy, `find_a_program/file_at_path` only
tries the raw name, and will not find the one with the target-triple
prefix.

This patch is derived from Debian's patch:
gcc-search-prefixed-as-ld.diff

gcc
* gcc.cc(for_each_path): Add more space for -.
(file_at_path): Search -as/ld/objcopy before
non-triple ones.
---
 gcc/gcc.cc | 13 +
 1 file changed, 13 insertions(+)

diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index 3dc6348d761..0fa2eafea84 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -2820,6 +2820,8 @@ for_each_path (const struct path_prefix *paths,
{
  len = paths->max_len + extra_space + 1;
  len += MAX (MAX (suffix_len, multi_os_dir_len), multiarch_len);
+ /* triplet prefix for as, ld.  */
+ len += MAX (strlen (DEFAULT_REAL_TARGET_MACHINE), multiarch_len) + 2;
  path = XNEWVEC (char, len);
}
 
@@ -3033,6 +3035,17 @@ file_at_path (char *path, void *data)
   struct file_at_path_info *info = (struct file_at_path_info *) data;
   size_t len = strlen (path);
 
+  /* search for the -as / -ld / objcopy first.  */
+  if (! strcmp (info->name, "as") || ! strcmp (info->name, "ld")
+   || ! strcmp (info->name, "objcopy"))
+{
+  struct file_at_path_info prefix_info = *info;
+  prefix_info.name = concat (DEFAULT_REAL_TARGET_MACHINE, "-",
+   info->name, NULL);
+  prefix_info.name_len = strlen (prefix_info.name);
+  if (file_at_path (path, &prefix_info))
+   return path;
+}
   memcpy (path + len, info->name, info->name_len);
   len += info->name_len;
 
-- 
2.39.2
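
The name construction used by this patch can be sketched outside the driver as follows. The helpers is_prefixable_tool and make_prefixed_name are hypothetical; the patch itself uses libiberty's concat with DEFAULT_REAL_TARGET_MACHINE, while this sketch uses only the C standard library and takes the triple as a parameter.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Nonzero if NAME is one of the tools the patch looks up with a
   triple prefix.  */
int
is_prefixable_tool (const char *name)
{
  return strcmp (name, "as") == 0
         || strcmp (name, "ld") == 0
         || strcmp (name, "objcopy") == 0;
}

/* Return a newly allocated "TRIPLE-NAME" string, or NULL on allocation
   failure.  The caller frees the result.  */
char *
make_prefixed_name (const char *triple, const char *name)
{
  /* Triple, '-', name and the terminating NUL.  */
  size_t len = strlen (triple) + 1 + strlen (name) + 1;
  char *buf = malloc (len);
  if (buf != NULL)
    snprintf (buf, len, "%s-%s", triple, name);
  return buf;
}

/* Self-check with an example triple.  */
void
prefixed_name_selfcheck (void)
{
  char *p = make_prefixed_name ("aarch64-elf", "ld");
  assert (p != NULL && strcmp (p, "aarch64-elf-ld") == 0);
  free (p);
}
```

Freeing the string after use matters in a function such as file_at_path, which is called once per search directory.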



[PATCH v2 1/2] driver: Use -as/ld/objcopy as final fallback instead of native ones for cross

2024-05-22 Thread YunQiang Su
If `find_a_program` cannot find `as/ld/objcopy` and we are a cross toolchain,
the final fallback is the system's `as/ld`.  In fact, we can try the
triple-prefixed as/ld/objcopy before falling back to the native as/ld/objcopy.

This patch is derived from Debian's patch:
  gcc-search-prefixed-as-ld.diff

gcc
* gcc.cc(execute): Looks for -as/ld/objcopy before fallback
to native as/ld/objcopy.
---
 gcc/gcc.cc | 20 
 1 file changed, 20 insertions(+)

diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index 830a4700a87..3dc6348d761 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -3293,6 +3293,26 @@ execute (void)
   string = find_a_program(commands[0].prog);
   if (string)
commands[0].argv[0] = string;
+  else if (*cross_compile != '0'
+   && !strcmp (commands[0].argv[0], commands[0].prog)
+   && (!strcmp (commands[0].prog, "as")
+   || !strcmp (commands[0].prog, "ld")
+   || !strcmp (commands[0].prog, "objcopy")))
+   {
+ string = concat (DEFAULT_REAL_TARGET_MACHINE, "-",
+   commands[0].prog, NULL);
+ const char *string_args[] = {string, "--version", NULL};
+ int exit_status = 0;
+ int err = 0;
+ const char *errmsg = pex_one (PEX_SEARCH, string,
+ CONST_CAST (char **, string_args), string,
+ NULL, NULL, &exit_status, &err);
+ if (errmsg == NULL && exit_status == 0 && err == 0)
+   {
+ commands[0].argv[0] = string;
+ commands[0].prog = string;
+   }
+   }
 }
 
   for (n_commands = 1, i = 0; argbuf.iterate (i, &arg); i++)
-- 
2.39.2



Re: [PATCH] Add %[zt][diox] support to pretty-print

2024-05-22 Thread YunQiang Su
On Wed, May 22, 2024 at 17:33, Jakub Jelinek wrote:
>
> On Wed, May 22, 2024 at 05:23:33PM +0800, YunQiang Su wrote:
> > On Wed, May 22, 2024 at 17:14, Jakub Jelinek wrote:
> > >
> > > On Wed, May 22, 2024 at 05:05:30PM +0800, YunQiang Su wrote:
> > > > > --- gcc/gcc.cc.jj   2024-02-09 14:54:09.141489744 +0100
> > > > > +++ gcc/gcc.cc  2024-02-09 22:04:37.655678742 +0100
> > > > > @@ -2410,8 +2410,7 @@ read_specs (const char *filename, bool m
> > > > >   if (*p1++ != '<' || p[-2] != '>')
> > > > > fatal_error (input_location,
> > > > >  "specs %%include syntax malformed after "
> > > > > -"%ld characters",
> > > > > -(long) (p1 - buffer + 1));
> > > > > +"%td characters", p1 - buffer + 1);
> > > > >
> > > >
> > > > Should we postpone using %td in gcc itself?  We may use an older
> > > > compiler to build gcc.
> > > > My main workstation runs Debian Bookworm, which has GCC 12, and I
> > > > get some warnings:
> > >
> > > That is fine and expected.  During stage1 such warnings are intentionally
> > > not fatal, only in stage2+ when we know it is the same version of gcc
> > > we want those can be fatal.
> >
> > There may be only one stage in some cases.
> > For example, we may already have a full binutils/libc stack and just
> > build a cross-gcc.  For the target libraries, such as libgcc, this is
> > fine; for the host executables, however, it will be a problem.
>
> That is still ok, it is just a warning about unknown gcc format specifiers,
> at runtime the code from the compiler being built will be used and that
> handles those.  We have added dozens of these over years, %td/%zd certainly
> aren't an exception.  Just try to build with some older gcc version, say
> 4.8.5, and you'll see far more such warnings.

Thanks for your explanation. It's OK with me if it works well at runtime.
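
For reference, the same length modifiers exist in standard C99 printf, which is what the older host compiler checks format strings against. The sketch below uses plain snprintf, not GCC's pretty-printer; format_offset and format_size are hypothetical helpers that mirror the read_specs change quoted below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format the distance between two pointers with %td, the ptrdiff_t
   length modifier, so no cast to long is needed.  */
int
format_offset (char *out, size_t out_size, const char *start, const char *pos)
{
  ptrdiff_t off = pos - start + 1;
  return snprintf (out, out_size, "%td characters", off);
}

/* Same idea for size_t with %zu.  */
int
format_size (char *out, size_t out_size, size_t n)
{
  return snprintf (out, out_size, "%zu bytes", n);
}

/* Self-check with a short buffer.  */
void
format_selfcheck (void)
{
  char buf[32];
  const char *s = "abcdef";
  format_offset (buf, sizeof buf, s, s + 4);
  assert (strcmp (buf, "5 characters") == 0);
}
```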

> But also as recommended, you shouldn't be building cross-gcc with old
> version of gcc, you should use same version of the native compiler to
> build the cross compiler.
>
> https://gcc.gnu.org/install/build.html
>
> "To build a cross compiler, we recommend first building and installing a 
> native
> compiler. You can then use the native GCC compiler to build the cross
> compiler."
>
> Jakub
>


-- 
YunQiang Su


Re: [PATCH] Add %[zt][diox] support to pretty-print

2024-05-22 Thread YunQiang Su
On Wed, May 22, 2024 at 17:14, Jakub Jelinek wrote:
>
> On Wed, May 22, 2024 at 05:05:30PM +0800, YunQiang Su wrote:
> > > --- gcc/gcc.cc.jj   2024-02-09 14:54:09.141489744 +0100
> > > +++ gcc/gcc.cc  2024-02-09 22:04:37.655678742 +0100
> > > @@ -2410,8 +2410,7 @@ read_specs (const char *filename, bool m
> > >   if (*p1++ != '<' || p[-2] != '>')
> > > fatal_error (input_location,
> > >  "specs %%include syntax malformed after "
> > > -"%ld characters",
> > > -(long) (p1 - buffer + 1));
> > > +"%td characters", p1 - buffer + 1);
> > >
> >
> > Should we use %td later for gcc itself? Since we may use older
> > compiler to build gcc.
> > My major workstation is Debian Bookworm, which has GCC 12, and then I
> > get some warnings:
>
> That is fine and expected.  During stage1 such warnings are intentionally
> not fatal, only in stage2+ when we know it is the same version of gcc
> we want those can be fatal.

There may be only one stage in some cases.
For example, we may have a full binutils/libc stack and just build a cross-gcc.
For all target libraries, such as libgcc, it is OK, while for host
executables it will be a problem.

> Otherwise we could never add any new modifiers...
>
> Jakub
>


-- 
YunQiang Su


Re: [PATCH] Add %[zt][diox] support to pretty-print

2024-05-22 Thread YunQiang Su
f the above.
> %lld, %lli, %llo, %llu, %llx: long long versions.
> %wd, %wi, %wo, %wu, %wx: HOST_WIDE_INT versions.
> +   %zd, %zi, %zo, %zu, %zx: size_t versions.
> +   %td, %ti, %to, %tu, %tx: ptrdiff_t versions.
> %f: double
> %c: character.
> %s: string.
> @@ -1422,7 +1447,7 @@ pp_format (pretty_printer *pp,
>   obstack_1grow (&buffer->chunk_obstack, *p);
>   p++;
> }
> -  while (strchr ("qwl+#", p[-1]));
> +  while (strchr ("qwlzt+#", p[-1]));
>
>if (p[-1] == '.')
> {
> @@ -1524,6 +1549,16 @@ pp_format (pretty_printer *pp,
>   wide = true;
>   continue;
>
> +   case 'z':
> + gcc_assert (!precision);
> + precision = 3;
> + continue;
> +
> +   case 't':
> + gcc_assert (!precision);
> + precision = 4;
> + continue;
> +
> case 'l':
>   /* We don't support precision beyond that of "long long".  */
>   gcc_assert (precision < 2);
> @@ -1570,8 +1605,8 @@ pp_format (pretty_printer *pp,
>   if (wide)
> pp_wide_integer (pp, va_arg (*text->m_args_ptr, HOST_WIDE_INT));
>   else
> -   pp_integer_with_precision
> - (pp, *text->m_args_ptr, precision, int, "d");
> +   pp_integer_with_precision (pp, *text->m_args_ptr, precision,
> +  int, "d");
>   break;
>
> case 'o':
> @@ -1579,8 +1614,8 @@ pp_format (pretty_printer *pp,
>     pp_scalar (pp, "%" HOST_WIDE_INT_PRINT "o",
>va_arg (*text->m_args_ptr, unsigned HOST_WIDE_INT));
>   else
> -   pp_integer_with_precision
> - (pp, *text->m_args_ptr, precision, unsigned, "o");
> +   pp_integer_with_precision (pp, *text->m_args_ptr, precision,
> +  unsigned, "o");
>   break;
>
> case 's':
> @@ -1599,8 +1634,8 @@ pp_format (pretty_printer *pp,
> pp_scalar (pp, HOST_WIDE_INT_PRINT_UNSIGNED,
>va_arg (*text->m_args_ptr, unsigned HOST_WIDE_INT));
>   else
> -   pp_integer_with_precision
> - (pp, *text->m_args_ptr, precision, unsigned, "u");
> +   pp_integer_with_precision (pp, *text->m_args_ptr, precision,
> +  unsigned, "u");
>   break;
>
> case 'f':
> @@ -1629,8 +1664,8 @@ pp_format (pretty_printer *pp,
> pp_scalar (pp, HOST_WIDE_INT_PRINT_HEX,
>va_arg (*text->m_args_ptr, unsigned HOST_WIDE_INT));
>   else
> -   pp_integer_with_precision
> - (pp, *text->m_args_ptr, precision, unsigned, "x");
> +   pp_integer_with_precision (pp, *text->m_args_ptr, precision,
> +  unsigned, "x");
>   break;
>
> case '.':
> @@ -2774,6 +2809,18 @@ test_pp_format ()
>ASSERT_PP_FORMAT_2 ("17 12345678", "%wo %x", (HOST_WIDE_INT)15, 
> 0x12345678);
>ASSERT_PP_FORMAT_2 ("0xcafebabe 12345678", "%wx %x", 
> (HOST_WIDE_INT)0xcafebabe,
>   0x12345678);
> +  ASSERT_PP_FORMAT_2 ("-27 12345678", "%zd %x", (ssize_t)-27, 0x12345678);
> +  ASSERT_PP_FORMAT_2 ("-5 12345678", "%zi %x", (ssize_t)-5, 0x12345678);
> +  ASSERT_PP_FORMAT_2 ("10 12345678", "%zu %x", (size_t)10, 0x12345678);
> +  ASSERT_PP_FORMAT_2 ("17 12345678", "%zo %x", (size_t)15, 0x12345678);
> +  ASSERT_PP_FORMAT_2 ("cafebabe 12345678", "%zx %x", (size_t)0xcafebabe,
> + 0x12345678);
> +  ASSERT_PP_FORMAT_2 ("-27 12345678", "%td %x", (ptrdiff_t)-27, 0x12345678);
> +  ASSERT_PP_FORMAT_2 ("-5 12345678", "%ti %x", (ptrdiff_t)-5, 0x12345678);
> +  ASSERT_PP_FORMAT_2 ("10 12345678", "%tu %x", (ptrdiff_t)10, 0x12345678);
> +  ASSERT_PP_FORMAT_2 ("17 12345678", "%to %x", (ptrdiff_t)15, 0x12345678);
> +  ASSERT_PP_FORMAT_2 ("1afebabe 12345678", "%tx %x", (ptrdiff_t)0x1afebabe,
> + 0x12345678);
>ASSERT_PP_FORMAT_2 ("1.00 12345678", "%f %x", 1.0, 0x12345678);
>ASSERT_PP_FORMAT_2 ("A 12345678", "%c %x", 'A', 0x12345678);
>ASSERT_PP_FORMAT_2 ("hello world 12345678", "%s %x", "hello world",
>
> Jakub
>


-- 
YunQiang Su


Re: [PATCH] driver: Use -as/ld as final fallback instead of as/ld for cross

2024-05-21 Thread YunQiang Su
Andrew Pinski wrote on Tue, May 21, 2024 at 20:23:
>
> On Tue, May 21, 2024 at 5:12 AM YunQiang Su  wrote:
> >
> > If `find_a_program` cannot find `as/ld` and we are a cross toolchain,
> > the final fallback is the system's `as/ld`.  In fact, we can try the
> > target-prefixed as/ld (DEFAULT_REAL_TARGET_MACHINE followed by -as/-ld)
> > before falling back to the native as/ld.
> >
> > This patch is derived from Debian's patch:
> >   gcc-search-prefixed-as-ld.diff
> >
> > gcc
> > * gcc.cc (execute): Look for the target-prefixed as/ld before
> > falling back to native as/ld.
> > ---
> >  gcc/gcc.cc | 21 +
> >  1 file changed, 21 insertions(+)
> >
> > diff --git a/gcc/gcc.cc b/gcc/gcc.cc
> > index 830a4700a87..8a1bdb5e3e2 100644
> > --- a/gcc/gcc.cc
> > +++ b/gcc/gcc.cc
> > @@ -3293,6 +3293,27 @@ execute (void)
> >string = find_a_program(commands[0].prog);
> >if (string)
> > commands[0].argv[0] = string;
> > +  else if (*cross_compile != '0'
> > +   && (!strcmp (commands[0].argv[0], "as")
> > +   || !strcmp (commands[0].argv[0], "ld")))
> > +   {
> > + string = XNEWVEC (char, strlen (commands[0].argv[0]) + 2
> > + + strlen (DEFAULT_REAL_TARGET_MACHINE));
> > + strcpy (string, DEFAULT_REAL_TARGET_MACHINE);
> > + strcat (string, "-");
> > + strcat (string, commands[0].argv[0]);
> > + const char *string_args[] = {string, "--version", NULL};
> > + int exit_status = 0;
> > + int err = 0;
> > + const char *errmsg = pex_one (PEX_SEARCH, string,
> > + CONST_CAST (char **, string_args), string,
> > + NULL, NULL, &exit_status, &err);
>
> I think this should be handled under find_a_program instead of
> execute. That should simplify things slightly.

Maybe. But these seem to be two different problems.
`find_a_program` won't try to find any as/ld in the user's PATH
directories, such as /usr/bin.

My patch tries to solve this problem: if `find_a_program` fails to find
any usable ld/as, then let's fall back to the target-prefixed assembler
in /usr/bin (DEFAULT_REAL_TARGET_MACHINE followed by -as) instead of
/usr/bin/as.

Yes, we should also make `find_a_program` look for the target-prefixed
as in its search path, but I guess that should be done by another patch.

> You should also most likely use concat here instead of
> XNEWVEC/strcpy/strcat which will also simplify the code.
> Like string = concat (DEFAULT_REAL_TARGET_MACHINE, "-", commands[0].prog);
>
> I think this should be done for more than just as/ld but also objcopy
> (which is used for gsplit-dwarf).
> Is there a reason why you are needing to try to execute with
> "--version" as an argument here?
>

I am trying to make it possible to fall back to the system's ld/as if the
target-prefixed as/ld doesn't exist.
Running it with `--version` is how I test whether the target-prefixed
as/ld is usable.

> Thanks,
> Andrew Pinski
>
> > + if (errmsg == NULL && exit_status == 0 && err == 0)
> > +   {
> > + commands[0].argv[0] = string;
> > + commands[0].prog = string;
> > +   }
> > +   }
> >  }
> >
> >for (n_commands = 1, i = 0; argbuf.iterate (i, &arg); i++)
> > --
> > 2.39.2
> >


[PATCH] driver: Use -as/ld as final fallback instead of as/ld for cross

2024-05-21 Thread YunQiang Su
If `find_a_program` cannot find `as/ld` and we are a cross toolchain,
the final fallback is the system's `as/ld`.  In fact, we can try the
target-prefixed as/ld (DEFAULT_REAL_TARGET_MACHINE followed by -as/-ld)
before falling back to the native as/ld.

This patch is derived from Debian's patch:
  gcc-search-prefixed-as-ld.diff

gcc
* gcc.cc (execute): Look for the target-prefixed as/ld before
falling back to native as/ld.
---
 gcc/gcc.cc | 21 +
 1 file changed, 21 insertions(+)

diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index 830a4700a87..8a1bdb5e3e2 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -3293,6 +3293,27 @@ execute (void)
   string = find_a_program(commands[0].prog);
   if (string)
commands[0].argv[0] = string;
+  else if (*cross_compile != '0'
+   && (!strcmp (commands[0].argv[0], "as")
+   || !strcmp (commands[0].argv[0], "ld")))
+   {
+ string = XNEWVEC (char, strlen (commands[0].argv[0]) + 2
+ + strlen (DEFAULT_REAL_TARGET_MACHINE));
+ strcpy (string, DEFAULT_REAL_TARGET_MACHINE);
+ strcat (string, "-");
+ strcat (string, commands[0].argv[0]);
+ const char *string_args[] = {string, "--version", NULL};
+ int exit_status = 0;
+ int err = 0;
+ const char *errmsg = pex_one (PEX_SEARCH, string,
+ CONST_CAST (char **, string_args), string,
+ NULL, NULL, &exit_status, &err);
+ if (errmsg == NULL && exit_status == 0 && err == 0)
+   {
+ commands[0].argv[0] = string;
+ commands[0].prog = string;
+   }
+   }
 }
 
   for (n_commands = 1, i = 0; argbuf.iterate (i, &arg); i++)
-- 
2.39.2



[PATCH] MIPS: Remove -m(no-)lra option

2024-05-15 Thread YunQiang Su
PR target/113955
The `-mlra` option was introduced in 2014 for MIPS, and has been the
default since then.  It's time for us to drop no-lra support by
dropping the -m(no-)lra options.

gcc:
* config/mips/mips.cc(mips_option_override):
Drop mips_lra_flag variable;
(mips_lra_p): Removed.
(TARGET_LRA_P): Remove definition here to use the default one.
* config/mips/mips.md(*mul_acc_si, *mul_acc_si_r3900,
  *mul_sub_si): Drop mips_lra_flag variable.
* config/mips/mips.opt(-mlra): Removed.
* config/mips/mips.opt.urls(mlra): Removed.
---
 gcc/config/mips/mips.cc   | 12 
 gcc/config/mips/mips.md   | 24 +++-
 gcc/config/mips/mips.opt  |  4 
 gcc/config/mips/mips.opt.urls |  2 --
 4 files changed, 3 insertions(+), 39 deletions(-)

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index ce764a5cb35..b63d40a357b 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -20391,8 +20391,6 @@ mips_option_override (void)
 error ("unsupported combination: %s", "-mfp64 -mfpxx");
   else if (ISA_MIPS1 && !TARGET_FLOAT32)
 error ("%<-march=%s%> requires %<-mfp32%>", mips_arch_info->name);
-  else if (TARGET_FLOATXX && !mips_lra_flag)
-error ("%<-mfpxx%> requires %<-mlra%>");
 
   /* End of code shared with GAS.  */
 
@@ -22871,14 +22869,6 @@ mips_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED,
   return NO_REGS;
 }
 
-/* Implement TARGET_LRA_P.  */
-
-static bool
-mips_lra_p (void)
-{
-  return mips_lra_flag;
-}
-
 /* Implement TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS.  */
 
 static reg_class_t
@@ -23307,8 +23297,6 @@ mips_bit_clear_p (enum machine_mode mode, unsigned 
HOST_WIDE_INT m)
 
 #undef TARGET_SPILL_CLASS
 #define TARGET_SPILL_CLASS mips_spill_class
-#undef TARGET_LRA_P
-#define TARGET_LRA_P mips_lra_p
 #undef TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS
 #define TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS 
mips_ira_change_pseudo_allocno_class
 
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index 26f758c90dd..7de85123e7c 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -1781,13 +1781,7 @@ (define_insn "*mul_acc_si"
(set_attr "mode""SI")
(set_attr "insn_count" "1,1,2")
(set (attr "enabled")
-(cond [(and (eq_attr "alternative" "0")
-(match_test "!mips_lra_flag"))
-  (const_string "yes")
-   (and (eq_attr "alternative" "1")
-(match_test "mips_lra_flag"))
-  (const_string "yes")
-   (eq_attr "alternative" "2")
+(cond [(eq_attr "alternative" "1,2")
   (const_string "yes")]
   (const_string "no")))])
 
@@ -1811,13 +1805,7 @@ (define_insn "*mul_acc_si_r3900"
(set_attr "mode""SI")
(set_attr "insn_count" "1,1,1,2")
(set (attr "enabled")
-(cond [(and (eq_attr "alternative" "0")
-(match_test "!mips_lra_flag"))
-  (const_string "yes")
-   (and (eq_attr "alternative" "1")
-(match_test "mips_lra_flag"))
-  (const_string "yes")
-   (eq_attr "alternative" "2,3")
+(cond [(eq_attr "alternative" "1,2,3")
   (const_string "yes")]
   (const_string "no")))])
 
@@ -2039,13 +2027,7 @@ (define_insn "*mul_sub_si"
(set_attr "mode" "SI")
(set_attr "insn_count" "1,1,2")
(set (attr "enabled")
-(cond [(and (eq_attr "alternative" "0")
-(match_test "!mips_lra_flag"))
-  (const_string "yes")
-   (and (eq_attr "alternative" "1")
-(match_test "mips_lra_flag"))
-  (const_string "yes")
-   (eq_attr "alternative" "2")
+(cond [(eq_attr "alternative" "1,2")
   (const_string "yes")]
   (const_string "no")))])
 
diff --git a/gcc/config/mips/mips.opt b/gcc/config/mips/mips.opt
index c1abb36212f..99fe9301900 100644
--- a/gcc/config/mips/mips.opt
+++ b/gcc/config/mips/mips.opt
@@ -413,10 +413,6 @@ msynci
 Target Mask(SYNCI)
 Use synci instruction to invalidate i-cache.
 
-mlra
-Target Var(mips_lra_flag) Init(1) Save
-Use LRA instead of reload.
-
 mlxc1-sxc1
 Target Var(mips_lxc1_sxc1) Init(1)
 Use lwxc1/swxc1/ldxc1/sdxc1 instructions where applicable.
diff --git a/gcc/config/mips/mips.opt.urls b/gcc/config/mips/mips.opt.urls
index 9d166646d65..5921d6929b2 100644
--- a/gcc/config/mips/mips.opt.urls
+++ b/gcc/config/mips/mips.opt.urls
@@ -222,8 +222,6 @@ UrlSuffix(gcc/MIPS-Options.html#index-msym32)
 msynci
 UrlSuffix(gcc/MIPS-Options.html#index-msynci)
 
-; skipping UrlSuffix for 'mlra' due to finding no URLs
-
 mlxc1-sxc1
 UrlSuffix(gcc/MIPS-Options.html#index-mlxc1-sxc1)
 
-- 
2.39.2



[PATCH] MIPS: Support constraint 'w' for MSA instruction

2024-05-08 Thread YunQiang Su
Support syntax like:
asm volatile ("fmadd.d %w0, %w1, %w2" : "+w"(a): "w"(b), "w"(c));

gcc
* config/mips/constraints.md: Add new constraint 'w'.

gcc/testsuite
* gcc.target/mips/msa-inline-asm.c: New test.
---
 gcc/config/mips/constraints.md | 3 +++
 gcc/testsuite/gcc.target/mips/msa-inline-asm.c | 9 +
 2 files changed, 12 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/mips/msa-inline-asm.c

diff --git a/gcc/config/mips/constraints.md b/gcc/config/mips/constraints.md
index a96028dd746..f5c88179038 100644
--- a/gcc/config/mips/constraints.md
+++ b/gcc/config/mips/constraints.md
@@ -29,6 +29,9 @@ (define_register_constraint "t" "T_REG"
 (define_register_constraint "f" "TARGET_HARD_FLOAT ? FP_REGS : NO_REGS"
   "A floating-point register (if available).")
 
+(define_register_constraint "w" "ISA_HAS_MSA ? FP_REGS : NO_REGS"
+  "A MIPS SIMD register (if available).")
+
 (define_register_constraint "h" "NO_REGS"
   "Formerly the @code{hi} register.  This constraint is no longer supported.")
 
diff --git a/gcc/testsuite/gcc.target/mips/msa-inline-asm.c 
b/gcc/testsuite/gcc.target/mips/msa-inline-asm.c
new file mode 100644
index 000..bdf6816ab3b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/msa-inline-asm.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-mno-mips16 -mfp64 -mhard-float -mmsa" } */
+
+double
+f(double a, double b, double c) {
+  asm volatile ("fmadd.d %w0, %w1, %w2" : "+w"(a): "w"(b), "w"(c));
+  return a;
+}
+/* { dg-final { scan-assembler "fmadd.d \\\$w0, \\\$w\[0-9\]*, \\\$w\[0-9\]*" 
} }  */
-- 
2.39.2



Re: [PATCH v3] MIPS: Add MIN/MAX.fmt instructions support for MIPS R6

2024-04-28 Thread YunQiang Su
I will apply this patch.
However, we still have a problem with code like:
```
float max(float a, float b) { return a>=b?a:b; }
```
If it is compiled with `-ffinite-math-only -fsigned-zeros -O2
-mips32r6 -mabi=32`, `max.s` could still be used, since the
max.fmt/min.fmt instructions of MIPSr6 process +0/-0 correctly.


Re: [PATCH v2] MIPS: Add MIN/MAX.fmt instructions support for MIPS R6

2024-04-28 Thread YunQiang Su
Xi Ruoyao wrote on Tue, Mar 26, 2024 at 18:10:
>
> On Tue, 2024-03-26 at 11:15 +0800, YunQiang Su wrote:
>
> /* snip */
>
> > With -ffinite-math-only -fno-signed-zeros, it does work with
> > x >= y ? x : y
> > while without `-ffinite-math-only -fno-signed-zeros`, it cannot.
> > @Xi Ruoyao Is it expected by IEEE?
>
> When y is (quiet) NaN and x is not, fmax(x, y) should produce x but x >=
> y ? x : y should produce y.  Thus -ffinite-math-only is needed.
>
> When x is +0.0 and y is -0.0, x >= y ? x : y should produce +0.0 but
> fmax(x, y) may produce +0.0 or -0.0 (IEEE allows both and I don't see a
> more strict requirement in MIPS 6.06 manual either).  Thus -fno-signed-
> zeros is needed.
>

Yes, MIPS 6.06 requires that `max.f Y,+0,-0` produce +0.
There is a table after the description of the max.fmt instruction:
Table 4.1, Special Cases for FP MAX, MIN, MAXA, MINA.
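
To make the two special cases discussed above concrete, here is a small
sketch in portable C. `ternary_max` is the source-level expression from
this thread; `maxnum_like` is a hypothetical reference function written
for this example that ignores a quiet NaN operand and prefers +0 over -0.
It is not a model of the hardware instruction, only of the two behaviours
mentioned here.

```c
#include <math.h>

/* The expression from the thread, unchanged.  */
static float
ternary_max (float x, float y)
{
  return x >= y ? x : y;
}

/* Hypothetical reference: a NaN operand is ignored when the other
   operand is a number, and +0 is treated as larger than -0.  */
static float
maxnum_like (float x, float y)
{
  if (isnan (x))
    return y;
  if (isnan (y))
    return x;
  if (x == 0.0f && y == 0.0f)
    return signbit (x) ? y : x;   /* Prefer the positive zero.  */
  return x >= y ? x : y;
}
```

With y = NaN and x = 1.0f, `ternary_max` returns NaN while `maxnum_like`
returns 1.0f, which is why -ffinite-math-only matters. With x = -0.0f and
y = +0.0f, `ternary_max` returns -0.0f while `maxnum_like` returns +0.0f,
which is the signed-zero difference.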

> --
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University


Re: [PATCH] config-ml.in: Fix multi-os-dir search

2024-04-28 Thread YunQiang Su
Jeff Law wrote on Wed, Jan 3, 2024 at 01:00:
>
>
>
> On 1/1/24 09:48, YunQiang Su wrote:
> > When building multilib libraries, CC/CXX etc are set with an option
> > -B*/lib/, instead of -B/lib/.
> > This causes trouble in some cases, for example when building a
> > cross toolchain based on Debian's cross packages:
> >
> >    Suppose the libc6-dev-i386-amd64-cross package is installed on
> >    a non-x86 machine.  This package puts its files in
> >    /usr/x86_64-linux-gnu/lib32.  The following configure will fail
> >    when building libgcc for i386, complaining that the libc is not
> >    an i386 one:
> >   ../configure --enable-multilib --enable-multilib \
> >  --target=x86_64-linux-gnu
> >
> > Let's insert a "-B*/lib/`CC ${flags} --print-multi-os-directory`"
> > before "-B*/lib/".
> >
> > This patch is based on the patch used by Debian now.
> >
> > ChangeLog
> >
> >   * config-ml.in: Insert an -B option with multi-os-dir into
> >   compiler commands used to build libraries.
> I would prefer this to wait for gcc-15.   I'll go ahead and ACK it for
> gcc-15 though.
>

I noticed that the gcc-14 branch has been created and that basever is
now 15.0.
Is it time for this patch now?

> What would also be valuable would be to extract out the rest of the
> multiarch patches from the Debian patches and get those into into GCC
> proper.
>
> Jeff


[PATCH] expmed: TRUNCATE value1 if needed in store_bit_field_using_insv

2024-04-28 Thread YunQiang Su
PR target/113179.

In `store_bit_field_using_insv`, we just use SUBREG if value_mode
>= op_mode, while in some ports, a sign_extend will be needed,
such as MIPS64:
  If either GPR rs or GPR rt does not contain sign-extended 32-bit
  values (bits 63..31 equal), then the result of the operation is
  UNPREDICTABLE.

The problem happens for the code like:
  struct xx {
    int a:4;
    int b:24;
    int c:3;
    int d:1;
  };

  void xx (struct xx *a, long long b) {
    a->d = b;
  }

In the above code, the hard register that contains `b` may not be
properly sign-extended.
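
As a sketch of the property in question, the following portable C shows
what "a sign-extended 32-bit value in a 64-bit register" means. The
helper names are hypothetical and exist only for this example; on MIPS64
the equivalent operation is the `sll $3,$5,0` that the new test scans for.

```c
#include <stdint.h>

/* Hypothetical helper: keep bits 31..0 of V and replicate bit 31 into
   bits 63..32, without relying on implementation-defined conversions.  */
static int64_t
sign_extend_32 (int64_t v)
{
  uint32_t low = (uint32_t) v;   /* Modular conversion keeps bits 31..0.  */
  return (int64_t) (low ^ UINT32_C (0x80000000)) - INT64_C (0x80000000);
}

/* Nonzero if V already has the form a 32-bit operation expects,
   that is, bits 63..31 are all equal.  */
static int
is_sign_extended_32 (int64_t v)
{
  return v == sign_extend_32 (v);
}
```

A `long long` argument such as 0x100000001 is not in this form, so using
it directly as the 32-bit source operand is what the TRUNCATE in this
patch is meant to avoid.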

gcc/
PR target/113179
* expmed.cc (store_bit_field_using_insv): TRUNCATE value1 if
needed.

gcc/testsuite
PR target/113179
* gcc.target/mips/pr113179.c: New tests.
---
 gcc/expmed.cc| 12 +---
 gcc/testsuite/gcc.target/mips/pr113179.c | 18 ++
 2 files changed, 27 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/pr113179.c

diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index 4ec035e4843..6a582593da8 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -704,9 +704,15 @@ store_bit_field_using_insv (const extraction_insn *insv, 
rtx op0,
}
  else
{
- tmp = gen_lowpart_if_possible (op_mode, value1);
- if (! tmp)
-   tmp = gen_lowpart (op_mode, force_reg (value_mode, value1));
+ if (targetm.mode_rep_extended (op_mode, value_mode))
+   tmp = simplify_gen_unary (TRUNCATE, op_mode,
+ value1, value_mode);
+ else
+   {
+ tmp = gen_lowpart_if_possible (op_mode, value1);
+ if (! tmp)
+   tmp = gen_lowpart (op_mode, force_reg (value_mode, value1));
+   }
}
  value1 = tmp;
}
diff --git a/gcc/testsuite/gcc.target/mips/pr113179.c 
b/gcc/testsuite/gcc.target/mips/pr113179.c
new file mode 100644
index 000..f32c5a16765
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/pr113179.c
@@ -0,0 +1,18 @@
+/* Check if the operand of INS is sign-extended on MIPS64.  */
+/* { dg-options "-mips64r2 -mabi=64" } */
+/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */
+
+struct xx {
+int a:1;
+int b:24;
+int c:6;
+int d:1;
+};
+
+long long xx (struct xx *a, long long b) {
+a->d = b;
+return b+1;
+}
+
+/* { dg-final { scan-assembler "\tsll\t\\\$3,\\\$5,0" } } */
+/* { dg-final { scan-assembler "\tdaddiu\t\\\$2,\\\$5,1" } } */
-- 
2.39.2



Re: [PATCH] mips: Fix C23 (...) functions returning large aggregates [PR114175]

2024-03-28 Thread YunQiang Su
Xi Ruoyao wrote on Wed, Mar 20, 2024 at 15:12:
>
> We were assuming TYPE_NO_NAMED_ARGS_STDARG_P don't have any named
> arguments and there is nothing to advance, but that is not the case
> for (...) functions returning by hidden reference which have one such
> artificial argument.  This is causing gcc.dg/c23-stdarg-{6,8,9}.c to
> fail.
>
> Fix the issue by checking if arg.type is NULL, as r14-9503 explains.
>
> gcc/ChangeLog:
>
> PR target/114175
> * config/mips/mips.cc (mips_setup_incoming_varargs): Only skip
> mips_function_arg_advance for TYPE_NO_NAMED_ARGS_STDARG_P
> functions if arg.type is NULL.
> ---
>
> Bootstrapped and regtested on mips64el-linux-gnuabi64.  Ok for trunk?
>

Thanks. LGTM.

>  gcc/config/mips/mips.cc | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
> index 68e2ae8d8fa..ce764a5cb35 100644
> --- a/gcc/config/mips/mips.cc
> +++ b/gcc/config/mips/mips.cc
> @@ -6834,7 +6834,13 @@ mips_setup_incoming_varargs (cumulative_args_t cum,
>   argument.  Advance a local copy of CUM past the last "real" named
>   argument, to find out how many registers are left over.  */
>local_cum = *get_cumulative_args (cum);
> -  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl)))
> +
> +  /* For a C23 variadic function w/o any named argument, and w/o an
> + artifical argument for large return value, skip advancing args.
> + There is such an artifical argument iff. arg.type is non-NULL
> + (PR 114175).  */
> +  if (!TYPE_NO_NAMED_ARGS_STDARG_P (TREE_TYPE (current_function_decl))
> +  || arg.type != NULL_TREE)
>  mips_function_arg_advance (pack_cumulative_args (&local_cum), arg);
>
>/* Found out how many registers we need to save.  */
> --
> 2.44.0
>


Re: [PATCH v2] MIPS: Add MIN/MAX.fmt instructions support for MIPS R6

2024-03-25 Thread YunQiang Su
Jie Mei wrote on Mon, Mar 25, 2024 at 17:46:
>
> This patch adds the smin/smax RTL mode for the
> min/max.fmt instructions.
>
> Also, since the min/max.fmt instructions apply to the
> IEEE 754-2008 "minNum" and "maxNum" operations, this
> patch also provides the new "fmin3" and
> "fmax3" modes.
>
> gcc/ChangeLog:
>
> * config/mips/i6400.md (i6400_fpu_minmax): New
> define_insn_reservation.
> * config/mips/mips.h (ISA_HAS_FMIN_FMAX): Define new macro.
> * config/mips/mips.md (UNSPEC_FMIN): New unspec.
> (UNSPEC_FMAX): Same as above.
> (type): Add fminmax.
> (smin3): Generates MIN.fmt instructions.
> (smax3): Generates MAX.fmt instructions.
> (fmin3): Generates MIN.fmt instructions.
> (fmax3): Generates MAX.fmt instructions.
> * config/mips/p6600.md (p6600_fpu_fabs): Include fminmax
> type.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/mips/mips-minmax.c: New test for MIPS R6.
> ---
>  gcc/config/mips/i6400.md|  6 +++
>  gcc/config/mips/mips.h  |  2 +
>  gcc/config/mips/mips.md | 50 -
>  gcc/config/mips/p6600.md|  2 +-
>  gcc/testsuite/gcc.target/mips/mips-minmax.c | 40 +
>  5 files changed, 97 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/mips/mips-minmax.c
>
> diff --git a/gcc/config/mips/i6400.md b/gcc/config/mips/i6400.md
> index 9f216fe0210..d6f691ee217 100644
> --- a/gcc/config/mips/i6400.md
> +++ b/gcc/config/mips/i6400.md
> @@ -219,6 +219,12 @@
> (eq_attr "type" "fabs,fneg,fmove"))
>"i6400_fpu_short, i6400_fpu_apu")
>
> +;; min, max
> +(define_insn_reservation "i6400_fpu_minmax" 2
> +  (and (eq_attr "cpu" "i6400")
> +   (eq_attr "type" "fminmax"))
> +  "i6400_fpu_short+i6400_fpu_logic")
> +
>  ;; fadd, fsub, fcvt
>  (define_insn_reservation "i6400_fpu_fadd" 4
>(and (eq_attr "cpu" "i6400")
> diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
> index 7145d23c650..5ce984ac99b 100644
> --- a/gcc/config/mips/mips.h
> +++ b/gcc/config/mips/mips.h
> @@ -1259,6 +1259,8 @@ struct mips_cpu_info {
>  #define ISA_HAS_9BIT_DISPLACEMENT  (mips_isa_rev >= 6  \
>  || ISA_HAS_MIPS16E2)
>
> +#define ISA_HAS_FMIN_FMAX  (mips_isa_rev >= 6)
> +
>  /* ISA has data indexed prefetch instructions.  This controls use of
> 'prefx', along with TARGET_HARD_FLOAT and TARGET_DOUBLE_FLOAT.
> (prefx is a cop1x instruction, so can only be used if FP is
> diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
> index b0fb5850a9e..26f758c90dd 100644
> --- a/gcc/config/mips/mips.md
> +++ b/gcc/config/mips/mips.md
> @@ -97,6 +97,10 @@
>UNSPEC_GET_FCSR
>UNSPEC_SET_FCSR
>
> +  ;; Floating-point unspecs.
> +  UNSPEC_FMIN
> +  UNSPEC_FMAX
> +
>;; HI/LO moves.
>UNSPEC_MFHI
>UNSPEC_MTHI
> @@ -370,6 +374,7 @@
>  ;; frsqrt   floating point reciprocal square root
>  ;; frsqrt1  floating point reciprocal square root step1
>  ;; frsqrt2  floating point reciprocal square root step2
> +;; fminmax  floating point min/max
>  ;; dspmac   DSP MAC instructions not saturating the accumulator
>  ;; dspmacsatDSP MAC instructions that saturate the accumulator
>  ;; accext   DSP accumulator extract instructions
> @@ -387,8 +392,8 @@
> 
> prefetch,prefetchx,condmove,mtc,mfc,mthi,mtlo,mfhi,mflo,const,arith,logical,
> shift,slt,signext,clz,pop,trap,imul,imul3,imul3nc,imadd,idiv,idiv3,move,
> fmove,fadd,fmul,fmadd,fdiv,frdiv,frdiv1,frdiv2,fabs,fneg,fcmp,fcvt,fsqrt,
> -   frsqrt,frsqrt1,frsqrt2,dspmac,dspmacsat,accext,accmod,dspalu,dspalusat,
> -   multi,atomic,syncloop,nop,ghost,multimem,
> +   frsqrt,frsqrt1,frsqrt2,fminmax,dspmac,dspmacsat,accext,accmod,dspalu,
> +   dspalusat,multi,atomic,syncloop,nop,ghost,multimem,
> simd_div,simd_fclass,simd_flog2,simd_fadd,simd_fcvt,simd_fmul,simd_fmadd,
> simd_fdiv,simd_bitins,simd_bitmov,simd_insert,simd_sld,simd_mul,simd_fcmp,
> simd_fexp2,simd_int_arith,simd_bit,simd_shift,simd_splat,simd_fill,
> @@ -7971,6 +7976,47 @@
>[(set_attr "move_type" "load")
> (set_attr "insn_count" "2")])
>
> +;;
> +;;  Float point MIN/MAX
> +;;
> +
> +(define_insn "smin3"
> +  [(set (match_operand:SCALARF 0 "register_operand" "=f")
> +   (smin:SCALARF (match_operand:SCALARF 1 "register_operand" "f")
> + (match_operand:SCALARF 2 "register_operand" "f")))]
> +  "ISA_HAS_FMIN_FMAX"
> +  "min.\t%0,%1,%2"
> +  [(set_attr "type" "fminmax")
> +   (set_attr "mode" "")])
> +
> +(define_insn "smax3"
> +  [(set (match_operand:SCALARF 0 "register_operand" "=f")
> +   (smax:SCALARF (match_operand:SCALARF 1 "register_operand" "f")
> + (match_operand:SCALARF 2 "register_operand" "f")))]
> +  "ISA_HAS_FMIN_FMAX"
> +  "max.\t%0,%1,%2"
> +  [(set_attr "type" "fminmax")
> +  (set_attr "mode" "")])

[PATCH] MIPS: Predefine __mips_strict_alignment if STRICT_ALIGNMENT

2024-03-20 Thread YunQiang Su
Arm32 predefines __ARM_FEATURE_UNALIGNED unless -mno-unaligned-access
is used, and RISC-V predefines __riscv_misaligned_avoid.

Let's define __mips_strict_alignment if MIPSr6 is targeted and
-mstrict-align is used.

Note that this macro is always defined for pre-R6.
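
A possible use in user code, assuming this patch is applied, is sketched
below. Only `__mips_strict_alignment` comes from the patch;
`UNALIGNED_ACCESS_IS_CHEAP`, `load_u32` and `buffers_equal` are
hypothetical names for this example, and the macro test is only
meaningful when compiling for MIPS.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pick a strategy at compile time from the proposed predefined macro.  */
#if defined (__mips_strict_alignment)
# define UNALIGNED_ACCESS_IS_CHEAP 0
#else
# define UNALIGNED_ACCESS_IS_CHEAP 1
#endif

/* Load 4 bytes from a possibly unaligned address.  memcpy keeps this
   well defined; the compiler may turn it into a single load when
   unaligned access is allowed.  */
static uint32_t
load_u32 (const unsigned char *p)
{
  uint32_t v;
  memcpy (&v, p, sizeof v);
  return v;
}

/* Compare N bytes, a word at a time when unaligned loads are cheap,
   otherwise byte by byte.  Returns 1 if equal, 0 if not.  */
static int
buffers_equal (const unsigned char *a, const unsigned char *b, size_t n)
{
  size_t i = 0;
  if (UNALIGNED_ACCESS_IS_CHEAP)
    for (; i + 4 <= n; i += 4)
      if (load_u32 (a + i) != load_u32 (b + i))
        return 0;
  for (; i < n; i++)
    if (a[i] != b[i])
      return 0;
  return 1;
}
```

The result is the same on both paths; the macro only selects which loop
does the work.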

gcc
* config/mips/mips.h (TARGET_CPU_CPP_BUILTINS): Predefine
__mips_strict_alignment if STRICT_ALIGNMENT.
---
 gcc/config/mips/mips.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 6444a68dfd5..616a275b918 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -694,6 +694,9 @@ struct mips_cpu_info {
builtin_define ("__mips_compact_branches_always");  \
   else \
builtin_define ("__mips_compact_branches_optimal"); \
+   \
+  if (STRICT_ALIGNMENT)\
+   builtin_define ("__mips_strict_alignment"); \
 }  \
   while (0)
 
-- 
2.39.2



Re: [PATCH] Predefine __STRICT_ALIGN__ if STRICT_ALIGNMENT

2024-03-17 Thread YunQiang Su
Sam James wrote on Sun, Mar 17, 2024 at 14:04:
>
> YunQiang Su  writes:
>
> > Arm32 predefines __ARM_FEATURE_UNALIGNED unless -mno-unaligned-access
> > is used, and RISC-V predefines __riscv_misaligned_avoid, while other
> > ports that support -mstrict-align/-mno-unaligned-access don't have such
> > a macro, and these backend macros are only available for c-family.
> > Note: Arm64 always predefines __ARM_FEATURE_UNALIGNED: see #111555.
>
> I would say tag the bug even if you're not fixing it, as it was related
> enough for you to cite it.
>

I am not sure that it is a bug for AArch64. This macro may be used to
determine whether the hardware supports misaligned access, and
perhaps all AArch64 CPUs support it.

That should be decided by the Arm people.


[PATCH] Predefine __STRICT_ALIGN__ if STRICT_ALIGNMENT

2024-03-16 Thread YunQiang Su
Arm32 predefines __ARM_FEATURE_UNALIGNED unless -mno-unaligned-access
is used, and RISC-V predefines __riscv_misaligned_avoid, while other ports
that support -mstrict-align/-mno-unaligned-access don't have such a
macro, and these backend macros are only available for c-family.
Note: Arm64 always predefines __ARM_FEATURE_UNALIGNED: see #111555.

Let's add a generic one.

__STRICT_ALIGN__ is used instead of __STRICT_ALIGNMENT__, because
the latter is already used by some software, such as lzo2, syslinux, etc.

gcc
* cppbuiltin.cc: Predefine __STRICT_ALIGN__ if
STRICT_ALIGNMENT.
---
 gcc/cppbuiltin.cc | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/cppbuiltin.cc b/gcc/cppbuiltin.cc
index c4bfc2917dc..d32efdf9a07 100644
--- a/gcc/cppbuiltin.cc
+++ b/gcc/cppbuiltin.cc
@@ -123,6 +123,9 @@ define_builtin_macros_for_compilation_flags (cpp_reader 
*pfile)
 
   cpp_define_formatted (pfile, "__FINITE_MATH_ONLY__=%d",
flag_finite_math_only);
+
+  if (STRICT_ALIGNMENT)
+cpp_define (pfile, "__STRICT_ALIGN__");
 }
 
 
-- 
2.39.2



[commit] Regenerate opt.urls

2024-03-15 Thread YunQiang Su
Fixes: acc38ff59976 ("MIPS: Add -m(no-)strict-align option")

gcc/ChangeLog:

* config/riscv/riscv.opt.urls: Regenerated.
* config/rs6000/sysv4.opt.urls: Likewise.
* config/xtensa/xtensa.opt.urls: Likewise.
---
 gcc/config/riscv/riscv.opt.urls   | 2 +-
 gcc/config/rs6000/sysv4.opt.urls  | 2 +-
 gcc/config/xtensa/xtensa.opt.urls | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv.opt.urls b/gcc/config/riscv/riscv.opt.urls
index f40795866cf..da31820e234 100644
--- a/gcc/config/riscv/riscv.opt.urls
+++ b/gcc/config/riscv/riscv.opt.urls
@@ -44,7 +44,7 @@ UrlSuffix(gcc/RISC-V-Options.html#index-mshorten-memrefs)
 ; skipping UrlSuffix for 'mcmodel=' due to finding no URLs
 
 mstrict-align
-UrlSuffix(gcc/RISC-V-Options.html#index-mstrict-align-3)
+UrlSuffix(gcc/RISC-V-Options.html#index-mstrict-align-4)
 
 ; skipping UrlSuffix for 'mexplicit-relocs' due to finding no URLs
 
diff --git a/gcc/config/rs6000/sysv4.opt.urls b/gcc/config/rs6000/sysv4.opt.urls
index f8d58d6602c..c155cddfa36 100644
--- a/gcc/config/rs6000/sysv4.opt.urls
+++ b/gcc/config/rs6000/sysv4.opt.urls
@@ -12,7 +12,7 @@ mbit-align
 UrlSuffix(gcc/RS_002f6000-and-PowerPC-Options.html#index-mbit-align)
 
 mstrict-align
-UrlSuffix(gcc/RS_002f6000-and-PowerPC-Options.html#index-mstrict-align-4)
+UrlSuffix(gcc/RS_002f6000-and-PowerPC-Options.html#index-mstrict-align-5)
 
 mrelocatable
 UrlSuffix(gcc/RS_002f6000-and-PowerPC-Options.html#index-mrelocatable)
diff --git a/gcc/config/xtensa/xtensa.opt.urls 
b/gcc/config/xtensa/xtensa.opt.urls
index 146db23d1e3..1f193a7da0c 100644
--- a/gcc/config/xtensa/xtensa.opt.urls
+++ b/gcc/config/xtensa/xtensa.opt.urls
@@ -33,5 +33,5 @@ mabi=windowed
 UrlSuffix(gcc/Xtensa-Options.html#index-mabi_003dwindowed)
 
 mstrict-align
-UrlSuffix(gcc/Xtensa-Options.html#index-mstrict-align-5)
+UrlSuffix(gcc/Xtensa-Options.html#index-mstrict-align-6)
 
-- 
2.39.2



Re: CI for "Option handling: add documentation URLs"

2024-03-15 Thread YunQiang Su
Great work. The CI works well now: it blames me ;)
https://builder.sourceware.org/buildbot/#/builders/269/builds/3846

When I added the '-mstrict-align' option to MIPS,
riscv.opt.urls, sysv4.opt.urls, and xtensa.opt.urls changed as well.
(Why are they affected?)

So what's the best practice for such cases?
Should I push a new commit, or is a single commit preferred?

-- 
YunQiang Su


[commit] MIPS: Add -m(no-)strict-align option

2024-03-14 Thread YunQiang Su
We added support for -m(no-)unaligned-access 2 years ago, while
most other ports currently prefer -m(no-)strict-align.
Let's support -m(no-)strict-align, and keep -m(no-)unaligned-access
as an alias.
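
For illustration only, here is a minimal sketch of the kind of source code
whose generated instructions these options influence.  The helper names
(demo_load_u32, demo_store_u32) are hypothetical and not part of this
patch.  The code itself is portable C; the assumption is that with
-mno-strict-align on MIPSr6 the compiler may use a single word access for
the memcpy, while with -mstrict-align it has to avoid a direct unaligned
load/store.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address that may be misaligned.
   memcpy keeps the access well-defined in C; how it is lowered
   (one word load or several narrower accesses) is up to the compiler
   and the alignment options in effect.  */
uint32_t
demo_load_u32 (const void *p)
{
  uint32_t v;
  assert (p != NULL);
  memcpy (&v, p, sizeof v);
  return v;
}

/* Store a 32-bit value to an address that may be misaligned.  */
void
demo_store_u32 (void *p, uint32_t v)
{
  assert (p != NULL);
  memcpy (p, &v, sizeof v);
}
```

Compiling such a file once with each option and comparing the assembly
is a simple way to see the difference.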

gcc
* config/mips/mips.opt: Support -mstrict-align, and use
TARGET_STRICT_ALIGN as the flag; keep -m(no-)unaligned-access
as alias.
* config/mips/mips.h: Use TARGET_STRICT_ALIGN.
* config/mips/mips.opt.urls: Regenerate.
* doc/invoke.texi: Document -m(no-)strict-align for MIPSr6.
---
 gcc/config/mips/mips.h|  2 +-
 gcc/config/mips/mips.opt  | 12 ++--
 gcc/config/mips/mips.opt.urls |  6 ++
 gcc/doc/invoke.texi   | 18 --
 4 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 7145d23c650..6444a68dfd5 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -251,7 +251,7 @@ struct mips_cpu_info {
 || ISA_HAS_MSA))
 
 /* ISA load/store instructions can handle unaligned address */
-#define ISA_HAS_UNALIGNED_ACCESS (TARGET_UNALIGNED_ACCESS \
+#define ISA_HAS_UNALIGNED_ACCESS (!TARGET_STRICT_ALIGN \
 && (mips_isa_rev >= 6))
 
 /* The ISA compression flags that are currently in effect.  */
diff --git a/gcc/config/mips/mips.opt b/gcc/config/mips/mips.opt
index ce36942aabe..c1abb36212f 100644
--- a/gcc/config/mips/mips.opt
+++ b/gcc/config/mips/mips.opt
@@ -429,9 +429,17 @@ mtune=
 Target RejectNegative Joined Var(mips_tune_option) ToLower 
Enum(mips_arch_opt_value)
 -mtune=PROCESSOR   Optimize the output for PROCESSOR.
 
+mstrict-align
+Target Var(TARGET_STRICT_ALIGN) Init(0)
+Don't generate code with unaligned load store, only valid for MIPS R6.
+
 munaligned-access
-Target Var(TARGET_UNALIGNED_ACCESS) Init(1)
-Generate code with unaligned load store, valid for MIPS R6.
+Target RejectNegative Alias(mstrict-align) NegativeAlias
+Generate code with unaligned load store for R6 (alias of -mno-strict-align).
+
+mno-unaligned-access
+Target RejectNegative Alias(mstrict-align)
+Don't generate code with unaligned load store for R6 (alias of -mstrict-align).
 
 muninit-const-in-rodata
 Target Var(TARGET_UNINIT_CONST_IN_RODATA)
diff --git a/gcc/config/mips/mips.opt.urls b/gcc/config/mips/mips.opt.urls
index 96aba041026..9d166646d65 100644
--- a/gcc/config/mips/mips.opt.urls
+++ b/gcc/config/mips/mips.opt.urls
@@ -233,9 +233,15 @@ UrlSuffix(gcc/MIPS-Options.html#index-mmadd4)
 mtune=
 UrlSuffix(gcc/MIPS-Options.html#index-mtune-10)
 
+mstrict-align
+UrlSuffix(gcc/MIPS-Options.html#index-mstrict-align-3)
+
 munaligned-access
 UrlSuffix(gcc/MIPS-Options.html#index-munaligned-access-1)
 
+mno-unaligned-access
+UrlSuffix(gcc/MIPS-Options.html#index-mno-unaligned-access-1)
+
 muninit-const-in-rodata
 UrlSuffix(gcc/MIPS-Options.html#index-muninit-const-in-rodata)
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 85c938d4a14..864768fd2f4 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1143,7 +1143,8 @@ Objective-C and Objective-C++ Dialects}.
 -mcheck-zero-division  -mno-check-zero-division
 -mdivide-traps  -mdivide-breaks
 -mload-store-pairs  -mno-load-store-pairs
--munaligned-access  -mno-unaligned-access
+-mstrict-align  -mno-strict-align
+-mno-unaligned-access  -munaligned-access
 -mmemcpy  -mno-memcpy  -mlong-calls  -mno-long-calls
 -mmad  -mno-mad  -mimadd  -mno-imadd  -mfused-madd  -mno-fused-madd  -nocpp
 -mfix-24k  -mno-fix-24k
@@ -28561,14 +28562,19 @@ instructions to enable load/store bonding.  This 
option is enabled by
 default but only takes effect when the selected architecture is known
 to support bonding.
 
+@opindex mstrict-align
+@opindex mno-strict-align
 @opindex munaligned-access
 @opindex mno-unaligned-access
-@item -munaligned-access
+@item -mstrict-align
+@itemx -mno-strict-align
+@itemx -munaligned-access
 @itemx -mno-unaligned-access
-Enable (disable) direct unaligned access for MIPS Release 6.
-MIPSr6 requires load/store unaligned-access support,
-by hardware or trap&emulate.
-So @option{-mno-unaligned-access} may be needed by kernel.
+Disable (enable) direct unaligned access for MIPS Release 6.
+MIPSr6 requires load/store unaligned-access support, by hardware or
+trap&emulate.  So @option{-mstrict-align} may be needed by kernel.  The
+options @option{-munaligned-access} and @option{-mno-unaligned-access}
+are obsoleted, and only for backward-compatible.
 
 @opindex mmemcpy
 @opindex mno-memcpy
-- 
2.39.2



[commit] invoke.texi: Fix some skipping UrlSuffix problem for MIPS

2024-02-21 Thread YunQiang Su
The problem is that mips.opt.urls contains these lines:
  ; skipping UrlSuffix for 'mabi=' due to finding no URLs
  ; skipping UrlSuffix for 'mno-flush-func' due to finding no URLs
  ; skipping UrlSuffix for 'mexplicit-relocs' due to finding no URLs

The following lines are not fixed by this patch because we don't
document these options:
  ; skipping UrlSuffix for 'mlra' due to finding no URLs
  ; skipping UrlSuffix for 'mdebug' due to finding no URLs
  ; skipping UrlSuffix for 'meb' due to finding no URLs
  ; skipping UrlSuffix for 'mel' due to finding no URLs

gcc
* doc/invoke.texi(MIPS Options): Fix skipping UrlSuffix
problem of mabi=, mno-flush-func, mexplicit-relocs;
add missing leading - of mbranch-cost option.
* config/mips/mips.opt.urls: Regenerate.
---
 gcc/config/mips/mips.opt.urls | 12 ++--
 gcc/doc/invoke.texi   | 14 +-
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/gcc/config/mips/mips.opt.urls b/gcc/config/mips/mips.opt.urls
index ff2f0aee0e3..96aba041026 100644
--- a/gcc/config/mips/mips.opt.urls
+++ b/gcc/config/mips/mips.opt.urls
@@ -6,7 +6,8 @@ UrlSuffix(gcc/MIPS-Options.html#index-EB-2)
 EL
 UrlSuffix(gcc/MIPS-Options.html#index-EL-2)
 
-; skipping UrlSuffix for 'mabi=' due to finding no URLs
+mabi=
+UrlSuffix(gcc/MIPS-Options.html#index-mabi-3)
 
 mabicalls
 UrlSuffix(gcc/MIPS-Options.html#index-mabicalls)
@@ -65,9 +66,15 @@ UrlSuffix(gcc/MIPS-Options.html#index-membedded-data)
 meva
 UrlSuffix(gcc/MIPS-Options.html#index-meva)
 
+mexplicit-relocs=
+UrlSuffix(gcc/MIPS-Options.html#index-mexplicit-relocs-2)
+
 mexplicit-relocs
 UrlSuffix(gcc/MIPS-Options.html#index-mexplicit-relocs-2)
 
+mno-explicit-relocs
+UrlSuffix(gcc/MIPS-Options.html#index-mno-explicit-relocs-2)
+
 mextern-sdata
 UrlSuffix(gcc/MIPS-Options.html#index-mextern-sdata)
 
@@ -173,7 +180,8 @@ UrlSuffix(gcc/MIPS-Options.html#index-mno-float)
 mmcu
 UrlSuffix(gcc/MIPS-Options.html#index-mmcu-1)
 
-; skipping UrlSuffix for 'mno-flush-func' due to finding no URLs
+mno-flush-func
+UrlSuffix(gcc/MIPS-Options.html#index-mno-flush-func-1)
 
 mno-mdmx
 UrlSuffix(gcc/MIPS-Options.html#index-mno-mdmx)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 8219a6a5947..58527e1ea3c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -27995,11 +27995,7 @@ Aliases of @option{-minterlink-compressed} and
 @option{-mno-interlink-compressed}.  These options predate the microMIPS ASE
 and are retained for backwards compatibility.
 
-@opindex mabi=32
-@opindex mabi=o64
-@opindex mabi=n32
-@opindex mabi=64
-@opindex mabi=eabi
+@opindex mabi
 @item -mabi=32
 @itemx -mabi=o64
 @itemx -mabi=n32
@@ -28486,9 +28482,8 @@ Enable (disable) use of the @code{%hi()} and 
@code{%lo()} assembler
 relocation operators.  This option has been superseded by
 @option{-mexplicit-relocs} but is retained for backwards compatibility.
 
-@opindex mexplicit-relocs=none
-@opindex mexplicit-relocs=base
-@opindex mexplicit-relocs=pcrel
+@opindex mexplicit-relocs
+@opindex mno-explicit-relocs
 @item -mexplicit-relocs=none
 @itemx -mexplicit-relocs=base
 @itemx -mexplicit-relocs=pcrel
@@ -28767,6 +28762,7 @@ Disable the insertion of cache barriers.  This is the 
default setting.
 @end table
 
 @opindex mflush-func
+@opindex mno-flush-func
 @item -mflush-func=@var{func}
 @itemx -mno-flush-func
 Specifies the function to call to flush the I and D caches, or to not
@@ -28778,7 +28774,7 @@ depends on the target GCC was configured for, but 
commonly is either
 @code{_flush_func} or @code{__cpu_flush}.
 
 @opindex mbranch-cost
-@item mbranch-cost=@var{num}
+@item -mbranch-cost=@var{num}
 Set the cost of branches to roughly @var{num} ``simple'' instructions.
 This cost is only a heuristic and is not guaranteed to produce
 consistent results across releases.  A zero cost redundantly selects
-- 
2.39.2



Re: CI for "Option handling: add documentation URLs"

2024-02-21 Thread YunQiang Su
Mark Wielaard wrote on Mon, Feb 19, 2024 at 06:58:
>
> Hi David,
>
> On Thu, Jan 04, 2024 at 09:57:09AM -0500, David Malcolm wrote:
> > I've pushed the .opt.urls patch kit to gcc trunk [1], so hopefully the
> > CI check you wrote can go live now.
>
> And then I was on vacation myself and forgot. I am sorry.
>
> So, I did try the regenerate-opt-urls locally, and it did generate the
> attached diff. Which seems to show we really need this automated.
>
> Going over the diff. The -Winfinite-recursion in rust does indeed seem
> new.  As do the -mapx-inline-asm-use-gpr32 and mevex512 for i386.  And
> the avr options -mskip-bug, -mflmap and mrodata-in-ram.  The change in
> common.opt.urls for -Wuse-after-free comes from it being moved from
> c++ to the c-family. The changes in mips.opt.urls seem to come from
> commit 46df1369 "doc/invoke: Remove duplicate explicit-relocs entry of
> MIPS".
>

For MIPS, it's due to malformed patches to invoke.texi.
I will fix them.

> The changes in c.opt.urls seem mostly reordering. The sorting makes
> more sense after the diff imho. And must have come from commit
> 4666cbde5 "Sort warning options in c-family/c.opt".
>
> Also the documentation for -Warray-parameter was fixed.
>
> So I think the regenerate-opt-urls check does work as intended. So
> lets automate it, because it looks like nobody regenerated the
> url.opts after updating the documentation.
>
> But we should first apply this diff. Could you double check it is
> sane/correct?
>
> Thanks,
>
> Mark



-- 
YunQiang Su


Re: [PATCH] MIPS: Fix wrong MSA FP vector negation

2024-02-04 Thread YunQiang Su
Xi Ruoyao wrote on Mon, Feb 5, 2024 at 02:01:
>
> We expanded (neg x) to (minus const0 x) for MSA FP vectors, this is
> wrong because -0.0 is not 0 - 0.0.  This causes some Python tests to
> fail when Python is built with MSA enabled.
>
> Use the bnegi.df instructions to simply reverse the sign bit instead.
>
> gcc/ChangeLog:
>
> * config/mips/mips-msa.md (elmsgnbit): New define_mode_attr.
> (neg2): Change the mode iterator from MSA to IMSA because
> in FP arithmetic we cannot use (0 - x) for -x.
> (neg2): New define_insn to implement FP vector negation,
> using a bnegi instruction to negate the sign bit.
> ---
>
> Bootstrapped and regtested on mips64el-linux-gnuabi64.  Ok for trunk
> and/or release branches?
>
>  gcc/config/mips/mips-msa.md | 18 +++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
>

LGTM, though I think we also need a test case.
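
As a rough scalar illustration of what such a test would check (not the
MSA test itself), the sketch below contrasts negation by subtraction with
negation by flipping the sign bit.  The function names are hypothetical,
and it assumes IEEE 754 binary64 doubles, the default rounding mode, and
no -ffast-math style flags.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* "Negate" as 0 - x.  The volatile keeps the compiler from folding
   the subtraction away.  For x == +0.0 the result is +0.0, not -0.0.  */
double
demo_neg_by_subtract (double x)
{
  volatile double zero = 0.0;
  return zero - x;
}

/* Negate by toggling bit 63, which is what a bit-negate of the sign
   bit does for each 64-bit element.  */
double
demo_neg_by_signbit (double x)
{
  uint64_t bits;
  assert (sizeof bits == sizeof x);
  memcpy (&bits, &x, sizeof bits);
  bits ^= (uint64_t) 1 << 63;
  memcpy (&x, &bits, sizeof x);
  return x;
}

/* Return the raw sign bit (0 or 1) of X.  */
int
demo_sign_bit (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return (int) (bits >> 63);
}
```

With x = +0.0 the subtraction form keeps the sign bit clear, while the
bit-flip form sets it, which is the difference the quoted patch fixes.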

> diff --git a/gcc/config/mips/mips-msa.md b/gcc/config/mips/mips-msa.md
> index 83d9a08e360..920161ed1d8 100644
> --- a/gcc/config/mips/mips-msa.md
> +++ b/gcc/config/mips/mips-msa.md
> @@ -231,6 +231,10 @@ (define_mode_attr bitimm
> (V4SI  "uimm5")
> (V2DI  "uimm6")])
>
> +;; The index of sign bit in FP vector elements.
> +(define_mode_attr elmsgnbit [(V2DF "63") (V4DF "63")
> +(V4SF "31") (V8SF "31")])
> +
>  (define_expand "vec_init"
>[(match_operand:MSA 0 "register_operand")
> (match_operand:MSA 1 "")]
> @@ -597,9 +601,9 @@ (define_expand "abs2"
>  })
>
>  (define_expand "neg2"
> -  [(set (match_operand:MSA 0 "register_operand")
> -   (minus:MSA (match_dup 2)
> -  (match_operand:MSA 1 "register_operand")))]
> +  [(set (match_operand:IMSA 0 "register_operand")
> +   (minus:IMSA (match_dup 2)
> +  (match_operand:IMSA 1 "register_operand")))]
>"ISA_HAS_MSA"
>  {
>rtx reg = gen_reg_rtx (mode);
> @@ -607,6 +611,14 @@ (define_expand "neg2"
>operands[2] = reg;
>  })
>
> +(define_insn "neg2"
> +  [(set (match_operand:FMSA 0 "register_operand" "=f")
> +   (neg (match_operand:FMSA 1 "register_operand" "f")))]
> +  "ISA_HAS_MSA"
> +  "bnegi.\t%w0,%w1,"
> +  [(set_attr "type" "simd_bit")
> +   (set_attr "mode" "")])
> +
>  (define_expand "msa_ldi"
>[(match_operand:IMSA 0 "register_operand")
> (match_operand 1 "const_imm10_operand")]
> --
> 2.43.0
>


[PATCH] MIPS: Accept arguments for -mexplicit-relocs

2024-01-19 Thread YunQiang Su
GAS has supported explicit relocs since 2001, and %pcrel_hi/low were
introduced in 2014.  In the future, we may introduce more.

Let's convert the -mexplicit-relocs option to accept the arguments:
none, base, pcrel.

We also update gcc/configure.ac so that the default value is the
option that gas supports when GCC itself is built.
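
The ordering of the three values matters, because the target macros in
this patch test them with >=.  The following is only a sketch of that
idea: the enum mirrors the one added to mips-opts.h, but the demo_ names
and the string parser are hypothetical (GCC generates the real option
parsing from mips.opt).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same ordering as enum mips_explicit_relocs: none < base < pcrel.  */
enum demo_explicit_relocs {
  DEMO_EXPLICIT_RELOCS_NONE,
  DEMO_EXPLICIT_RELOCS_BASE,
  DEMO_EXPLICIT_RELOCS_PCREL
};

/* Hypothetical helper: map an option argument to its enum value.
   Returns 1 on success and 0 for an unknown argument.  */
int
demo_parse_explicit_relocs (const char *arg, enum demo_explicit_relocs *out)
{
  static const struct {
    const char *name;
    enum demo_explicit_relocs value;
  } table[] = {
    { "none", DEMO_EXPLICIT_RELOCS_NONE },
    { "base", DEMO_EXPLICIT_RELOCS_BASE },
    { "pcrel", DEMO_EXPLICIT_RELOCS_PCREL },
  };

  assert (arg != NULL && out != NULL);
  for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
    if (strcmp (arg, table[i].name) == 0)
      {
        *out = table[i].value;
        return 1;
      }
  return 0;
}

/* Counterpart of TARGET_EXPLICIT_RELOCS: true for base and pcrel.  */
int
demo_target_explicit_relocs (enum demo_explicit_relocs v)
{
  return v >= DEMO_EXPLICIT_RELOCS_BASE;
}

/* Counterpart of TARGET_EXPLICIT_RELOCS_PCREL: true only for pcrel.  */
int
demo_target_explicit_relocs_pcrel (enum demo_explicit_relocs v)
{
  return v >= DEMO_EXPLICIT_RELOCS_PCREL;
}
```

Because pcrel is the largest value, selecting it also satisfies the base
check, so a future, stronger level could be appended without changing
the existing comparisons.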

gcc
* configure.ac: Detect the explicit relocs support for
mips, and define C macro MIPS_EXPLICIT_RELOCS.
* config.in: Regenerated.
* configure: Regenerated.
* doc/invoke.texi(MIPS Options): Add -mexplicit-relocs.
* config/mips/mips-opts.h: Define enum mips_explicit_relocs.
* config/mips/mips.cc(mips_set_compression_mode): Sorry if
!TARGET_EXPLICIT_RELOCS instead of just set it.
* config/mips/mips.h: Define TARGET_EXPLICIT_RELOCS and
TARGET_EXPLICIT_RELOCS_PCREL with mips_opt_explicit_relocs.
* config/mips/mips.opt: Introduce -mexplicit-relocs= option
and define -m(no-)explicit-relocs as aliases.
---
 gcc/config.in   |  6 +
 gcc/config/mips/mips-opts.h |  7 +
 gcc/config/mips/mips.cc |  5 ++--
 gcc/config/mips/mips.h  |  8 ++
 gcc/config/mips/mips.opt| 25 --
 gcc/configure   | 51 -
 gcc/configure.ac| 21 +++
 gcc/doc/invoke.texi | 16 
 8 files changed, 124 insertions(+), 15 deletions(-)

diff --git a/gcc/config.in b/gcc/config.in
index 99fd2d89fe3..ce1d073833f 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -2356,6 +2356,12 @@
 #endif
 
 
+/* Define if assembler supports %reloc. */
+#ifndef USED_FOR_TARGET
+#undef MIPS_EXPLICIT_RELOCS
+#endif
+
+
 /* Define if host mkdir takes a single argument. */
 #ifndef USED_FOR_TARGET
 #undef MKDIR_TAKES_ONE_ARG
diff --git a/gcc/config/mips/mips-opts.h b/gcc/config/mips/mips-opts.h
index 57bdbdfa721..4b0c2c09a3d 100644
--- a/gcc/config/mips/mips-opts.h
+++ b/gcc/config/mips/mips-opts.h
@@ -53,4 +53,11 @@ enum mips_cb_setting {
   MIPS_CB_OPTIMAL,
   MIPS_CB_ALWAYS
 };
+
+/* Enumerates the setting of the -mexplicit-relocs= option.  */
+enum mips_explicit_relocs {
+  MIPS_EXPLICIT_RELOCS_NONE,
+  MIPS_EXPLICIT_RELOCS_BASE,
+  MIPS_EXPLICIT_RELOCS_PCREL
+};
 #endif
diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index 30e99811ff6..68e2ae8d8fa 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -20033,8 +20033,6 @@ mips_set_compression_mode (unsigned int 
compression_mode)
 call.  */
   flag_move_loop_invariants = 0;
 
-  target_flags |= MASK_EXPLICIT_RELOCS;
-
   /* Experiments suggest we get the best overall section-anchor
 results from using the range of an unextended LW or SW.  Code
 that makes heavy use of byte or short accesses can do better
@@ -20064,6 +20062,9 @@ mips_set_compression_mode (unsigned int 
compression_mode)
 
   if (TARGET_MSA)
sorry ("MSA MIPS16 code");
+
+  if (!TARGET_EXPLICIT_RELOCS)
+   sorry ("MIPS16 requires %<-mexplicit-relocs%>");
 }
   else
 {
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 8768933ba37..7145d23c650 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -145,6 +145,14 @@ struct mips_cpu_info {
  || TARGET_MICROMIPS)  \
 && mips_cb != MIPS_CB_NEVER)
 
+/* True if assembler support %gp_rel etc.  */
+#define TARGET_EXPLICIT_RELOCS \
+  (mips_opt_explicit_relocs >= MIPS_EXPLICIT_RELOCS_BASE)
+
+/* True if assembler support %pcrel_hi/%pcrel_lo.  */
+#define TARGET_EXPLICIT_RELOCS_PCREL \
+  (mips_opt_explicit_relocs >= MIPS_EXPLICIT_RELOCS_PCREL)
+
 /* True if the output file is marked as ".abicalls; .option pic0"
(-call_nonpic).  */
 #define TARGET_ABICALLS_PIC0 \
diff --git a/gcc/config/mips/mips.opt b/gcc/config/mips/mips.opt
index e8b411a8ffe..ce36942aabe 100644
--- a/gcc/config/mips/mips.opt
+++ b/gcc/config/mips/mips.opt
@@ -145,9 +145,30 @@ meva
 Target Var(TARGET_EVA)
 Use Enhanced Virtual Addressing instructions.
 
+Enum
+Name(mips_explicit_relocs) Type(int)
+The code model option names for -mexplicit-relocs:
+
+EnumValue
+Enum(mips_explicit_relocs) String(none) Value(MIPS_EXPLICIT_RELOCS_NONE)
+
+EnumValue
+Enum(mips_explicit_relocs) String(base) Value(MIPS_EXPLICIT_RELOCS_BASE)
+
+EnumValue
+Enum(mips_explicit_relocs) String(pcrel) Value(MIPS_EXPLICIT_RELOCS_PCREL)
+
+mexplicit-relocs=
+Target RejectNegative Joined Enum(mips_explicit_relocs) 
Var(mips_opt_explicit_relocs) Init(MIPS_EXPLICIT_RELOCS)
+Use %reloc() assembly operators.
+
 mexplicit-relocs
-Target Mask(EXPLICIT_RELOCS)
-Use NewABI-style %reloc() assembly operators.
+Target RejectNegative Alias(mexplicit-relocs=,base)
+Use %reloc() assembly operators (for backward compatibility).
+
+mno-explicit-relocs
+Target RejectNegative Alias(mexplicit-relocs=,none)
+Don't use %reloc() assembly operators (for backward compatibility).
 
 mextern-sdata

[commit] Sanitizer/MIPS: Use $t9 for preemptible function call

2024-01-17 Thread YunQiang Su
From: YunQiang Su 

Currently, almost all MIPS shared libraries rely on $t9 to get the
address of the current function, instead of PCREL instructions,
even on MIPSr6. So we have to set $t9 properly.

To get the address of preemptible function, we need the help of GOT.
MIPS/O32 has .cpload, which can help to generate 3 instructions to get GOT.
For __mips64, we can get GOT by:

lui $t8, %hi(%neg(%gp_rel(SANITIZER_STRINGIFY(TRAMPOLINE(func)))))
daddu $t8, $t8, $t9
daddiu $t8, $t8, %lo(%neg(%gp_rel(SANITIZER_STRINGIFY(TRAMPOLINE(func)))))

And then get the address of __interceptor_func, and jump to it

ld $t9, %got_disp(_interceptor" SANITIZER_STRINGIFY(func) ")($t8)
jr $t9
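
To make the macro plumbing easier to follow, here is a small sketch of
how such an assembly template is assembled from string literals by the
preprocessor.  All DEMO_ names and the symbol my_func are hypothetical
stand-ins for the sanitizer macros; the instruction text follows the
__mips64 variant in this patch.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification so that macro arguments are expanded
   before being turned into a string literal.  */
#define DEMO_STRINGIFY_(x) #x
#define DEMO_STRINGIFY(x) DEMO_STRINGIFY_(x)

/* Hypothetical stand-in for the trampoline naming macro.  */
#define DEMO_TRAMPOLINE(name) demo_trampoline_##name

/* Builds the n64 tail-call sequence by literal concatenation:
   derive the GOT base in $t8 from $t9, load the interceptor address
   into $t9, then jump through $t9.  */
#define DEMO_MIPS64_TAIL_CALL(t_func, i_func)           \
  "lui $t8, %hi(%neg(%gp_rel(" t_func ")))\n"           \
  "daddu $t8, $t8, $t9\n"                               \
  "daddiu $t8, $t8, %lo(%neg(%gp_rel(" t_func ")))\n"   \
  "ld $t9, %got_disp(" i_func ")($t8)\n"                \
  "jr $t9\n"

/* Return the expanded template for the hypothetical symbol my_func.  */
const char *
demo_tail_call_text (void)
{
  static const char text[] =
    DEMO_MIPS64_TAIL_CALL (DEMO_STRINGIFY (DEMO_TRAMPOLINE (my_func)),
                           "__interceptor_" DEMO_STRINGIFY (my_func));
  /* Sanity check: the sequence must contain the jump through $t9.  */
  assert (strstr (text, "jr $t9") != NULL);
  return text;
}
```

Printing the returned string shows the five instructions with the
trampoline and interceptor names already substituted.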

Upstream-Commit: 0a64367a72f1634321f5051221f05f2f364bd882

libsanitizer

* interception/interception.h (substitution_##func_name):
Use macro C_ASM_TAIL_CALL.
* sanitizer_common/sanitizer_asm.h: Define C_ASM_TAIL_CALL
for MIPS with help of t9.
---
 libsanitizer/interception/interception.h  |  5 ++--
 libsanitizer/sanitizer_common/sanitizer_asm.h | 23 +++
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/libsanitizer/interception/interception.h b/libsanitizer/interception/interception.h
index 9d8b60b2eef..58e969378a9 100644
--- a/libsanitizer/interception/interception.h
+++ b/libsanitizer/interception/interception.h
@@ -205,8 +205,9 @@ const interpose_substitution substitution_##func_name[] 
\
  ASM_TYPE_FUNCTION_STR "\n"
\
SANITIZER_STRINGIFY(TRAMPOLINE(func)) ":\n" 
\
SANITIZER_STRINGIFY(CFI_STARTPROC) "\n" 
\
-   SANITIZER_STRINGIFY(ASM_TAIL_CALL) " __interceptor_"
\
- SANITIZER_STRINGIFY(ASM_PREEMPTIBLE_SYM(func)) "\n"   
\
+   C_ASM_TAIL_CALL(SANITIZER_STRINGIFY(TRAMPOLINE(func)),  
\
+   "__interceptor_"
\
+ SANITIZER_STRINGIFY(ASM_PREEMPTIBLE_SYM(func))) "\n"  
\
SANITIZER_STRINGIFY(CFI_ENDPROC) "\n"   
\
".size  " SANITIZER_STRINGIFY(TRAMPOLINE(func)) ", "
\
 ".-" SANITIZER_STRINGIFY(TRAMPOLINE(func)) "\n"
\
diff --git a/libsanitizer/sanitizer_common/sanitizer_asm.h b/libsanitizer/sanitizer_common/sanitizer_asm.h
index bbb18cfbdf1..3af66a4e449 100644
--- a/libsanitizer/sanitizer_common/sanitizer_asm.h
+++ b/libsanitizer/sanitizer_common/sanitizer_asm.h
@@ -53,6 +53,29 @@
 # define ASM_TAIL_CALL tail
 #endif
 
+// Currently, almost all of the shared libraries rely on the value of
+// $t9 to get the address of current function, instead of PCREL, even
+// on MIPSr6. To be compatiable with them, we have to set $t9 properly.
+// MIPS uses GOT to get the address of preemptible functions.
+#if defined(__mips64)
+#  define C_ASM_TAIL_CALL(t_func, i_func)   \
+"lui $t8, %hi(%neg(%gp_rel(" t_func ")))\n" \
+"daddu $t8, $t8, $t9\n" \
+"daddiu $t8, $t8, %lo(%neg(%gp_rel(" t_func ")))\n" \
+"ld $t9, %got_disp(" i_func ")($t8)\n"  \
+"jr $t9\n"
+#elif defined(__mips__)
+#  define C_ASM_TAIL_CALL(t_func, i_func)   \
+".setnoreorder\n"   \
+".cpload $t9\n" \
+".setreorder\n" \
+"lw $t9, %got(" i_func ")($gp)\n"   \
+"jr $t9\n"
+#elif defined(ASM_TAIL_CALL)
+#  define C_ASM_TAIL_CALL(t_func, i_func)   \
+SANITIZER_STRINGIFY(ASM_TAIL_CALL) " " i_func
+#endif
+
 #if defined(__ELF__) && defined(__x86_64__) || defined(__i386__) || \
 defined(__riscv)
 # define ASM_PREEMPTIBLE_SYM(sym) sym@plt
-- 
2.39.2



Re: [PATCH] combine: Don't optimize SIGN_EXTEND of MEM on WORD_REGISTER_OPERATIONS targets [PR113010]

2024-01-16 Thread YunQiang Su
Greg McGary wrote on Wed, Jan 17, 2024 at 06:20:
>
> The sign bit of a sign-extending load cannot be known until runtime,
> so don't attempt to simplify it in the combiner.
>
> 2024-01-11  Greg McGary  
>
> PR rtl-optimization/113010
> * combine.cc (expand_compound_operation): Don't simplify
> SIGN_EXTEND of a MEM on WORD_REGISTER_OPERATIONS targets
>
> * gcc.c-torture/execute/pr113010.c: New test.
> ---
>  gcc/combine.cc | 5 +
>  gcc/testsuite/gcc.c-torture/execute/pr113010.c | 9 +
>  2 files changed, 14 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr113010.c
>
> diff --git a/gcc/combine.cc b/gcc/combine.cc
> index 812553c091e..ba587184dfc 100644
> --- a/gcc/combine.cc
> +++ b/gcc/combine.cc
> @@ -7208,6 +7208,11 @@ expand_compound_operation (rtx x)
>if (len == 0)
> return x;
>
> +  /* Sign-extending loads can never be simplified at compile time.  */
> +  if (WORD_REGISTER_OPERATIONS && MEM_P (XEXP (x, 0))
> + && load_extend_op (inner_mode) == SIGN_EXTEND)
> +   return x;
> +
>break;
>
>  case ZERO_EXTRACT:
> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr113010.c b/gcc/testsuite/gcc.c-torture/execute/pr113010.c
> new file mode 100644
> index 000..a95c613c1df
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/execute/pr113010.c
> @@ -0,0 +1,9 @@
> +int minus_1 = -1;
> +
> +int
> +main ()
> +{
> +  if ((0, 0xul) >= minus_1)

There is a warning option:

-Wsign-compare
Warn when a comparison between signed and unsigned values could
produce an incorrect result when the signed value is converted to unsigned.
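
A small sketch of the conversion that this warning is about, with
hypothetical function names.  It only relies on the standard C rule that
an int compared against an unsigned long is first converted to unsigned
long; the casts are written out so the code itself compiles without the
warning.

```c
#include <assert.h>
#include <limits.h>

/* What the comparison actually sees: the signed value converted to
   unsigned long.  For -1 this is ULONG_MAX.  */
unsigned long
demo_as_unsigned_long (int s)
{
  return (unsigned long) s;
}

/* Same result as writing "u >= s" directly, but with the implicit
   conversion made explicit.  */
int
demo_ge_mixed (unsigned long u, int s)
{
  return u >= (unsigned long) s;
}

/* Self-check of the conversion rule described above.  */
void
demo_self_check (void)
{
  assert (demo_as_unsigned_long (-1) == ULONG_MAX);
}
```

So whether a given unsigned constant compares >= -1 depends on the width
of unsigned long on the target, which is why such a comparison is easy
to misread.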

> +__builtin_abort ();
> +  return 0;
> +}
> --
> 2.34.1
>


-- 
YunQiang Su


Re: [PATCH v2] LoongArch: testsuite:Added additional vectorization "-mlsx" option.

2024-01-14 Thread YunQiang Su
Xi Ruoyao wrote on Mon, Jan 15, 2024 at 12:11:
>
> On Mon, 2024-01-15 at 09:29 +0800, chenxiaolong wrote:
> > At 21:13 +0800 on Saturday, 2024-01-13, Xi Ruoyao wrote:
> > > At 15:28 +0800 on Saturday 2024-01-13, chenxiaolong wrote:
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >   * gcc.dg/pr104992.c: Added additional "-mlsx" compilation
> > > > options.
> > > >   * gcc.dg/signbit-2.c: Ditto.
> > > >   * gcc.dg/tree-ssa/scev-16.c: Ditto.
> > > >   * gfortran.dg/graphite/vect-pr40979.f90: Ditto.
> > > >   * gfortran.dg/vect/fast-math-mgrid-resid.f: Ditto.
> > >
> > > I don't feel it right about the changes to pr104992.c and scev-16.c
> > > because no other architectures add special options there.  Why are we
> > > so special?
>
> >
> > Because on the LoongArch architecture, GCC requires the addition of
> > vectorization options in order to generate vector code. Use the
> > check_effective_target_vect_cmdline_needed command in the lib/target-
> > supports.exp file to set whether the command line option is needed to
> > enable vectorizations. For example, ia64,x86,aarch64, and riscv
> > architectures, vectorization is enabled by default.
>
> But no.  The default baseline of 32-bit x86 is i686, which is basically
> a Pentium III launched in 1999 without any vector instructions.
>
> We are still missing something here.
>
There is a line
 #define vector __attribute__((vector_size(4*sizeof(int))))
I guess it is this syntax that needs to be supported.




-- 
YunQiang Su


Re: MIPS: the method of getting GOT address for PIC code

2024-01-14 Thread YunQiang Su
YunQiang Su wrote on Fri, Aug 25, 2023 at 15:16:
>
> When working on LLVM, I found this problem
> https://github.com/llvm/llvm-project/issues/64974.
> Maybe it's time for us to reconsider the way of getting GOT address
> for PIC code.
>

I have my draft patch pushed to GitHub:
https://github.com/wzssyqa/gcc/tree/pcrel
And the patch is also attached.

Any comment is welcome.

-- 
YunQiang Su


0001-MIPS-PCREL-support.patch
Description: Binary data


[PATCH] MIPS: avoid $gp store if global_pointer is not $gp

2024-01-14 Thread YunQiang Su
$GP is used for expanding GOT loads, and in later passes
we try to use a temporary register instead.

If that succeeds, there is no need to store and reload $gp. An example
of failure is a function that calls a preemptible function.

We shouldn't use $GP for any other purpose in the code we generate.
If a user's inline asm code clobbers $GP, it's their duty to save
and restore $GP.

gcc
* config/mips/mips.cc (mips_compute_frame_info): If another
register is used as global_pointer, mark $GP live false.

gcc/testsuite
* gcc.target/mips/mips.exp (mips_option_groups):
Add -mxgot/-mno-xgot options.
* gcc.target/mips/xgot-n32-avoid-gp.c: New test.
* gcc.target/mips/xgot-n32-need-gp.c: New test.
---
 gcc/config/mips/mips.cc   |  2 ++
 gcc/testsuite/gcc.target/mips/mips.exp|  1 +
 gcc/testsuite/gcc.target/mips/xgot-n32-avoid-gp.c | 11 +++
 gcc/testsuite/gcc.target/mips/xgot-n32-need-gp.c  | 11 +++
 4 files changed, 25 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/mips/xgot-n32-avoid-gp.c
 create mode 100644 gcc/testsuite/gcc.target/mips/xgot-n32-need-gp.c

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index e752019b5e2..30e99811ff6 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -11353,6 +11353,8 @@ mips_compute_frame_info (void)
  in, which is why the global_pointer field is initialised here and not
  earlier.  */
   cfun->machine->global_pointer = mips_global_pointer ();
+  if (cfun->machine->global_pointer != GLOBAL_POINTER_REGNUM)
+df_set_regs_ever_live (GLOBAL_POINTER_REGNUM, false);
 
   offset = frame->args_size + frame->cprestore_size;
 
diff --git a/gcc/testsuite/gcc.target/mips/mips.exp b/gcc/testsuite/gcc.target/mips/mips.exp
index 9f8d533cfa5..e028bc93b40 100644
--- a/gcc/testsuite/gcc.target/mips/mips.exp
+++ b/gcc/testsuite/gcc.target/mips/mips.exp
@@ -266,6 +266,7 @@ set mips_option_groups {
 stack-protector "-fstack-protector"
 stdlib "REQUIRES_STDLIB"
 unaligned-access "-m(no-|)unaligned-access"
+xgot "-m(no-|)xgot"
 }
 
 for { set option 0 } { $option < 32 } { incr option } {
diff --git a/gcc/testsuite/gcc.target/mips/xgot-n32-avoid-gp.c b/gcc/testsuite/gcc.target/mips/xgot-n32-avoid-gp.c
new file mode 100644
index 000..3f52fc5a765
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/xgot-n32-avoid-gp.c
@@ -0,0 +1,11 @@
+/* Check if we skip store and load gp if there is no stub function call.  */
+/* { dg-options "-mips64r2 -mxgot -mabi=n32 -fPIC" } */
+
+extern int a;
+int
+foo ()
+{
+  return a;
+}
+/* { dg-final { scan-assembler-not "\tsd\t\\\$28," } } */
+/* { dg-final { scan-assembler-not "\tld\t\\\$28," } } */
diff --git a/gcc/testsuite/gcc.target/mips/xgot-n32-need-gp.c b/gcc/testsuite/gcc.target/mips/xgot-n32-need-gp.c
new file mode 100644
index 000..631409cb7fe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/xgot-n32-need-gp.c
@@ -0,0 +1,11 @@
+/* We cannot skip store and load gp if there is stub function call.  */
+/* { dg-options "-mips64r2 -mxgot -mabi=n32 -fPIC" } */
+
+extern int f();
+int
+foo ()
+{
+  return f();
+}
+/* { dg-final { scan-assembler "\tsd\t\\\$28," } } */
+/* { dg-final { scan-assembler "\tld\t\\\$28," } } */
-- 
2.39.2



Re: [pushed][PR112918][LRA]: Fixing IRA ICE on m68k

2024-01-12 Thread YunQiang Su
Vladimir Makarov wrote on Thu, Jan 11, 2024 at 22:35:
>
> The following patch fixes
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112918
>
> The patch was successfully bootstrapped and tested on x86_64, aarch64,
> ppc64le

This patch causes some ICE on MIPS:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113354

PS: how to test cross build for mips:

1. apt install g++-multilib-mipsel-linux-gnu
2. apply patch:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/641619.html
3. ../configure --target=mipsel-linux-gnu \
  --includedir=/usr/mipsel-linux-gnu/include --enable-multilib \
  --with-arch-32=mips32r2 --with-fp-32=xx \
  --enable-multiarch --enable-targets=all \
  --with-arch-64=mips64r2 --prefix=/usr --disable-libsanitizer
4. make -j

-- 
YunQiang Su


[commit] MIPS: Add ATTRIBUTE_UNUSED to mips_start_function_definition

2024-01-11 Thread YunQiang Su
Fix build warning:
  mips.cc: warning: unused parameter 'decl'.
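
As a standalone illustration of the technique (not the GCC code), the
sketch below marks a parameter as intentionally unused.  GCC's
ATTRIBUTE_UNUSED expands to the __unused__ attribute; the DEMO_ macro,
the function, and the label it writes are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for ATTRIBUTE_UNUSED, limited to compilers that understand
   the GNU attribute syntax.  */
#if defined (__GNUC__)
# define DEMO_ATTRIBUTE_UNUSED __attribute__ ((__unused__))
#else
# define DEMO_ATTRIBUTE_UNUSED
#endif

/* Format a function label into BUF, optionally preceded by a mips16
   directive.  DECL is kept in the signature for interface reasons but
   is not used, so it is annotated to silence -Wunused-parameter.
   Returns the number of characters snprintf would have written.  */
int
demo_format_label (char *buf, size_t size, const char *name, int mips16_p,
                   void *decl DEMO_ATTRIBUTE_UNUSED)
{
  assert (buf != NULL && name != NULL);
  return snprintf (buf, size, "%s%s:\n",
                   mips16_p ? "\t.set\tmips16\n" : "", name);
}
```

The annotation documents that leaving the parameter unused is
deliberate, so the signature can stay compatible with its callers.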

gcc
* config/mips/mips.cc (mips_start_function_definition):
Add ATTRIBUTE_UNUSED.
---
 gcc/config/mips/mips.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index 60b336e43d0..e752019b5e2 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -7330,7 +7330,8 @@ mips_start_unique_function (const char *name)
function contains MIPS16 code.  */
 
 static void
-mips_start_function_definition (const char *name, bool mips16_p, tree decl)
+mips_start_function_definition (const char *name, bool mips16_p,
+   tree decl ATTRIBUTE_UNUSED)
 {
   if (mips16_p)
 fprintf (asm_out_file, "\t.set\tmips16\n");
-- 
2.39.2



Re: Ping^3: [PATCH] Add a late-combine pass [PR106594]

2024-01-05 Thread YunQiang Su
I have tested this patch on mips64el: No regression.


[committed] MIPS/testsuite: Include stdio.h in mipscop tests

2024-01-03 Thread YunQiang Su
gcc/testsuite

* gcc.c-torture/compile/mipscop-1.c: Include stdio.h.
* gcc.c-torture/compile/mipscop-2.c: Ditto.
* gcc.c-torture/compile/mipscop-3.c: Ditto.
* gcc.c-torture/compile/mipscop-4.c: Ditto.
---
 gcc/testsuite/gcc.c-torture/compile/mipscop-1.c | 1 +
 gcc/testsuite/gcc.c-torture/compile/mipscop-2.c | 1 +
 gcc/testsuite/gcc.c-torture/compile/mipscop-3.c | 1 +
 gcc/testsuite/gcc.c-torture/compile/mipscop-4.c | 1 +
 4 files changed, 4 insertions(+)

diff --git a/gcc/testsuite/gcc.c-torture/compile/mipscop-1.c b/gcc/testsuite/gcc.c-torture/compile/mipscop-1.c
index 8a40ba1c6b7..2ba0ee79b8b 100644
--- a/gcc/testsuite/gcc.c-torture/compile/mipscop-1.c
+++ b/gcc/testsuite/gcc.c-torture/compile/mipscop-1.c
@@ -1,5 +1,6 @@
 /* { dg-do compile { target mips*-*-* } } */
 
+#include <stdio.h>
 register unsigned int cp0count asm ("$c0r1");
 
 int __attribute__ ((nomips16))
diff --git a/gcc/testsuite/gcc.c-torture/compile/mipscop-2.c b/gcc/testsuite/gcc.c-torture/compile/mipscop-2.c
index 94df41d65f9..6fffc8ec098 100644
--- a/gcc/testsuite/gcc.c-torture/compile/mipscop-2.c
+++ b/gcc/testsuite/gcc.c-torture/compile/mipscop-2.c
@@ -1,5 +1,6 @@
 /* { dg-do compile { target mips*-*-* } } */
 
+#include <stdio.h>
 register unsigned int c3r1 asm ("$c3r1");
 
 extern unsigned int b, c;
diff --git a/gcc/testsuite/gcc.c-torture/compile/mipscop-3.c b/gcc/testsuite/gcc.c-torture/compile/mipscop-3.c
index cb4bd4d3efb..03e30117bc1 100644
--- a/gcc/testsuite/gcc.c-torture/compile/mipscop-3.c
+++ b/gcc/testsuite/gcc.c-torture/compile/mipscop-3.c
@@ -1,5 +1,6 @@
 /* { dg-do compile { target mips*-*-* } } */
 
+#include <stdio.h>
 register unsigned int c3r1 asm ("$c3r1"), c3r2 asm ("$c3r2");
 
 extern unsigned int b, c;
diff --git a/gcc/testsuite/gcc.c-torture/compile/mipscop-4.c b/gcc/testsuite/gcc.c-torture/compile/mipscop-4.c
index 263fc5cacc1..7e000c1c68a 100644
--- a/gcc/testsuite/gcc.c-torture/compile/mipscop-4.c
+++ b/gcc/testsuite/gcc.c-torture/compile/mipscop-4.c
@@ -1,5 +1,6 @@
 /* { dg-do compile { target mips*-*-* } } */
 
+#include <stdio.h>
 register unsigned long c3r1 asm ("$c3r1"), c3r2 asm ("$c3r2");
 
 extern unsigned long b, c;
-- 
2.39.2



[committed] MIPS: Add pattern insqisi_extended and inshisi_extended

2024-01-03 Thread YunQiang Su
This match pattern allows combining (zero_extract:DI 8, 24, QI)
with a sign-extend into a 32-bit INS instruction on TARGET_64BIT.

For SI mode, if the sign bit is modified by bitops, we need a
sign-extend operation.  The 32-bit INS instruction guarantees that its
result is sign-extended, and a QImode source register is safe for INS, too.

(insn 19 18 20 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
(const_int 8 [0x8])
(const_int 24 [0x18]))
(subreg:DI (reg:QI 205) 0)) "../xx.c":7:29 -1
 (nil))
(insn 20 19 23 2 (set (reg/v:DI 200 [ val ])
(sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0))) "../xx.c":7:29 -1
 (nil))

Combine tries to merge them into:

(insn 20 19 23 2 (set (reg/v:DI 200 [ val ])
(sign_extend:DI (ior:SI (and:SI (subreg:SI (reg/v:DI 200 [ val ]) 0)
(const_int 16777215 [0xffffff]))
(ashift:SI (subreg:SI (reg:QI 205 [ MEM[(const unsigned char *)buf_8(D) + 3B] ]) 0)
(const_int 24 [0x18]))))) "../xx.c":7:29 18 {*insv_extended}
 (expr_list:REG_DEAD (reg:QI 205 [ MEM[(const unsigned char *)buf_8(D) + 3B] ])
(nil)))

The same is done for the 16/16 pair:
(insn 13 12 14 2 (set (zero_extract:DI (reg/v:DI 198 [ val ])
(const_int 16 [0x10])
(const_int 16 [0x10]))
(subreg:DI (reg:HI 201 [ MEM[(const short unsigned int *)buf_6(D) + 2B] ]) 0)) "xx.c":5:30 286 {*insvdi}
 (expr_list:REG_DEAD (reg:HI 201 [ MEM[(const short unsigned int *)buf_6(D) + 2B] ])
(nil)))
(insn 14 13 17 2 (set (reg/v:DI 198 [ val ])
(sign_extend:DI (subreg:SI (reg/v:DI 198 [ val ]) 0))) "xx.c":5:30 241 {extendsidi2}
 (nil))
=>
(insn 14 13 17 2 (set (reg/v:DI 198 [ val ])
(sign_extend:DI (ior:SI (ashift:SI (subreg:SI (reg:HI 201 [ MEM[(const short unsigned int *)buf_6(D) + 2B] ]) 0)
(const_int 16 [0x10]))
(zero_extend:SI (subreg:HI (reg/v:DI 198 [ val ]) 0))))) "xx.c":5:30 284 {*inshisi_extended}
 (expr_list:REG_DEAD (reg:HI 201 [ MEM[(const short unsigned int *)buf_6(D) + 2B] ])
(nil)))

Let's accept these patterns, and set the cost to 1 instruction.
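
As a rough illustration of why a single 32-bit INS can stand in for the insert plus sign-extend pair, the sketch below models both forms in plain C for the 8-bit field at bit 24. The helper names (`sext32`, `insert_then_sign_extend`, `ins32_model`) are made up for this note and are not part of GCC or the patch; it is a model of the RTL semantics, not of the hardware.

```c
#include <stdint.h>

/* Sign-extend a 32-bit value to 64 bits without relying on
   implementation-defined narrowing conversions.  */
static int64_t
sext32 (uint32_t x)
{
  return (int64_t) (x ^ UINT32_C (0x80000000)) - INT64_C (0x80000000);
}

/* Model of the two original insns: a 64-bit zero_extract insert of
   8 bits at bit 24, followed by sign_extend of the low 32 bits.  */
static int64_t
insert_then_sign_extend (int64_t val, uint8_t src)
{
  uint64_t v = (uint64_t) val;
  v = (v & ~(UINT64_C (0xff) << 24)) | ((uint64_t) src << 24);
  return sext32 ((uint32_t) v);
}

/* Model of the combined pattern:
   (sign_extend:DI (ior:SI (and:SI val 0xffffff) (ashift:SI src 24))).  */
static int64_t
ins32_model (int64_t val, uint8_t src)
{
  uint32_t lo = (uint32_t) val;
  lo = (lo & UINT32_C (0x00ffffff)) | ((uint32_t) src << 24);
  return sext32 (lo);
}
```

Both helpers return the same value for any input, since only the low 32 bits survive the final sign extension; that is the property the new patterns rely on.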

gcc

PR rtl-optimization/104914
* config/mips/mips.md (insqisi_extended): New patterns.
(inshisi_extended): Ditto.

gcc/testsuite

* gcc.target/mips/pr104914.c: New test.
---
 gcc/config/mips/mips.md  | 24 +++
 gcc/testsuite/gcc.target/mips/pr104914.c | 25 
 2 files changed, 49 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/mips/pr104914.c

diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index e1762ce105b..17dfcbd6722 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -4419,6 +4419,30 @@ (define_insn "*extzv_truncsi_exts"
   [(set_attr "type" "arith")
(set_attr "mode" "SI")])
 
+(define_insn "*insqisi_extended"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+(sign_extend:DI
+  (ior:SI (and:SI (subreg:SI (match_dup 0) 0)
+   (const_int 16777215))
+ (ashift:SI
+   (subreg:SI (match_operand:QI 1 "register_operand" "d") 0)
+   (const_int 24)))))]
+  "TARGET_64BIT && !TARGET_MIPS16 && ISA_HAS_EXT_INS"
+  "ins\t%0,%1,24,8"
+  [(set_attr "mode" "SI")
+   (set_attr "perf_ratio" "1")])
+
+(define_insn "*inshisi_extended"
+  [(set (match_operand:DI 0 "register_operand" "=d")
+(sign_extend:DI
+  (ior:SI
+   (ashift:SI (subreg:SI (match_operand:HI 1 "register_operand" "d") 0)
+ (const_int 16))
+   (zero_extend:SI (subreg:HI (match_dup 0) 0)))))]
+  "TARGET_64BIT && !TARGET_MIPS16 && ISA_HAS_EXT_INS"
+  "ins\t%0,%1,16,16"
+  [(set_attr "mode" "SI")
+   (set_attr "perf_ratio" "1")])
 
 (define_expand "insvmisalign"
   [(set (zero_extract:GPR (match_operand:BLK 0 "memory_operand")
diff --git a/gcc/testsuite/gcc.target/mips/pr104914.c b/gcc/testsuite/gcc.target/mips/pr104914.c
new file mode 100644
index 000..5dd10e84c17
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/pr104914.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-mabi=64" } */
+
+extern void abort (void);
+extern void exit (int);
+
+NOMIPS16 int test (const unsigned char *buf)
+{
+  int val;
+  ((unsigned char*)&val)[0] = *buf++;
+  ((unsigned char*)&val)[1] = *buf++;
+  ((unsigned char*)&val)[2] = *buf++;
+  ((unsigned char*)&val)[3] = *buf++;
+  if(val > 0)
+return 1;
+  else
+return 0;
+}
+
+int main ()
+{
+  if (test("\xff\xff\xff\xff") != 0)
+abort();
+  exit(0);
+}
-- 
2.39.2



[committed] MIPS: Implement TARGET_INSN_COST

2024-01-03 Thread YunQiang Su
When combining instructions, the generic `rtx_cost` may
overestimate the cost of the resulting RTL: the RTL may
be quite complex, and `rtx_cost` has no information that
it can be converted to simple hardware instruction(s).

In this case, let's use `insn_count * perf_ratio` to
estimate the cost when both attributes are available
(that is, when the product is greater than 0).
Otherwise, fall back to pattern_cost.

When not optimizing for speed, the insn length would be
the natural cost; for now this case also falls back to
pattern_cost (see the FIXME in the code).
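
The selection logic described above can be sketched in standalone C as follows. This is only a model under stated assumptions: `struct insn_info` and `sketch_insn_cost` are hypothetical names, and the fields stand in for what the real hook obtains from recog_memoized, the insn attributes, and pattern_cost.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-insn data the hook consults.  */
struct insn_info
{
  bool recognized;   /* insn was recognized, or is an asm */
  int insn_count;    /* value of the "insn_count" attribute */
  int perf_ratio;    /* value of "perf_ratio"; 0 means "not set" */
  int pattern_cost;  /* what the generic pattern_cost would return */
};

/* Prefer insn_count * perf_ratio when optimizing for speed and the
   product is positive; otherwise use the generic pattern cost.  */
static int
sketch_insn_cost (const struct insn_info *insn, bool speed)
{
  if (insn->recognized && speed)
    {
      int cost = insn->insn_count * insn->perf_ratio;
      if (cost > 0)
        return cost;
    }
  return insn->pattern_cost;
}
```

With perf_ratio left at its default of 0, the product is 0 and the generic cost is used, so patterns that do not set the attribute behave as before.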

gcc

* config/mips/mips.cc (mips_insn_cost): New function.

gcc/testsuite

* gcc.target/mips/data-sym-multi-pool.c: Skip -Os and -O0.
---
 gcc/config/mips/mips.cc   | 33 +++
 .../gcc.target/mips/data-sym-multi-pool.c |  2 +-
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index 3131749d6ea..46b7d9b64ff 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -4170,6 +4170,37 @@ mips_set_reg_reg_cost (machine_mode mode)
 }
 }
 
+/* Implement TARGET_INSN_COST.  */
+
+static int
+mips_insn_cost (rtx_insn *x, bool speed)
+{
+  int cost;
+  int count;
+  int ratio;
+
+  if (recog_memoized (x) < 0
+  && GET_CODE (PATTERN (x)) != ASM_INPUT
+  && asm_noperands (PATTERN (x)) < 0)
+goto pattern_cost;
+
+  /* FIXME: return get_attr_length?  More tests may be needed.  */
+  if (!speed)
+goto pattern_cost;
+
+  count = get_attr_insn_count (x);
+  ratio = get_attr_perf_ratio (x);
+  cost = count * ratio;
+  if (cost > 0)
+return cost;
+
+pattern_cost:
+  cost = pattern_cost (PATTERN (x), speed);
+  /* If the cost is zero, then it's likely a complex insn.
+ FIXME: Return COSTS_N_INSNS (2)?  More tests are needed.  */
+  return cost;
+}
+
 /* Implement TARGET_RTX_COSTS.  */
 
 static bool
@@ -23069,6 +23100,8 @@ mips_bit_clear_p (enum machine_mode mode, unsigned HOST_WIDE_INT m)
 #define TARGET_RTX_COSTS mips_rtx_costs
 #undef TARGET_ADDRESS_COST
 #define TARGET_ADDRESS_COST mips_address_cost
+#undef  TARGET_INSN_COST
+#define TARGET_INSN_COST mips_insn_cost
 
 #undef TARGET_NO_SPECULATION_IN_DELAY_SLOTS_P
 #define TARGET_NO_SPECULATION_IN_DELAY_SLOTS_P mips_no_speculation_in_delay_slots_p
diff --git a/gcc/testsuite/gcc.target/mips/data-sym-multi-pool.c b/gcc/testsuite/gcc.target/mips/data-sym-multi-pool.c
index 3cf2d4f0248..8643095ff9f 100644
--- a/gcc/testsuite/gcc.target/mips/data-sym-multi-pool.c
+++ b/gcc/testsuite/gcc.target/mips/data-sym-multi-pool.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-mips16 -mcode-readable=yes -fno-tree-vrp -fno-tree-dominator-opts" } */
-/* { dg-skip-if "per-function expected output" { *-*-* } { "-flto" } { "" } } */
+/* { dg-skip-if "per-function expected output" { *-*-* } { "-flto" "-O0" "-Os" } { "" } } */
 
 /* This testcase generates multiple constant pools within a function body.  */
 
-- 
2.39.2



[committed] MIPS: define_attr perf_ratio in mips.md

2024-01-03 Thread YunQiang Su
The accurate cost of a pattern can be computed as
 insn_count * perf_ratio

The default value is 0 rather than 1, so that we can
tell whether the attribute has actually been set for a
pattern.  Since it is not yet set for most patterns,
any user must first check that its value is greater
than 0.

This attr will be used in `mips_insn_cost`.

gcc

* config/mips/mips.md (perf_ratio): New attribute.
---
 gcc/config/mips/mips.md | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index 6d47241ea3a..e1762ce105b 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -312,6 +312,10 @@ (define_attr "sync_insn2" "nop,and,xor,not"
 ;; "11" specifies MEMMODEL_ACQUIRE.
 (define_attr "sync_memmodel" "" (const_int 10))
 
+;; Performance ratio.  Add this attr to the slow INSNs.
+;; Used by mips_insn_cost.
+(define_attr "perf_ratio" "" (const_int 0))
+
 ;; Accumulator operand for madd patterns.
 (define_attr "accum_in" "none,0,1,2,3,4,5" (const_string "none"))
 
-- 
2.39.2



[PATCH] config-ml.in: Fix multi-os-dir search

2024-01-01 Thread YunQiang Su
When building multilib libraries, CC/CXX etc. are set with an option
-B*/lib/, instead of -B/lib/.
This causes trouble in some cases, for example when building a
cross toolchain based on Debian's cross packages:

  Suppose the libc6-dev-i386-amd64-cross package is installed on
  a non-x86 machine.  This package puts its files in
  /usr/x86_64-linux-gnu/lib32.  The following configure will fail
  when building libgcc for i386, complaining that the libc is not
  an i386 one:
 ../configure --enable-multilib --enable-multilib \
--target=x86_64-linux-gnu

Let's insert a "-B*/lib/`CC ${flags} --print-multi-os-directory`"
before "-B*/lib/".

This patch is based on the patch used by Debian now.
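
The argument rewriting described above (done in shell in config-ml.in) can be modelled in C roughly as below. This is a sketch only: `is_b_lib_option` and `rewrite_arg` are hypothetical names, and the `../lib32` value used in the usage note is an assumed example of what `--print-multi-os-directory` might print for a 32-bit multilib.

```c
#include <stdio.h>
#include <string.h>

/* Return nonzero if ARG matches the shell pattern -B*\/lib/,
   i.e. starts with "-B" and ends with "/lib/".  */
static int
is_b_lib_option (const char *arg)
{
  size_t n = strlen (arg);
  return n >= 7
	 && strncmp (arg, "-B", 2) == 0
	 && strcmp (arg + n - 5, "/lib/") == 0;
}

/* Write the rewritten form of ARG into OUT.  A matching -B option is
   emitted twice: first with MULTI_OSDIR appended, then unchanged, so
   the multilib-specific directory is searched first.  */
static int
rewrite_arg (const char *arg, const char *multi_osdir,
	     char *out, size_t outsz)
{
  if (is_b_lib_option (arg))
    return snprintf (out, outsz, "%s%s %s", arg, multi_osdir, arg);
  return snprintf (out, outsz, "%s", arg);
}
```

For example, rewriting `-B/usr/x86_64-linux-gnu/lib/` with `../lib32` yields `-B/usr/x86_64-linux-gnu/lib/../lib32 -B/usr/x86_64-linux-gnu/lib/`, while other arguments pass through unchanged.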

ChangeLog

* config-ml.in: Insert an -B option with multi-os-dir into
compiler commands used to build libraries.
---
 config-ml.in | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/config-ml.in b/config-ml.in
index 68854a4f16c..645cac822fd 100644
--- a/config-ml.in
+++ b/config-ml.in
@@ -514,7 +514,12 @@ multi-do:
else \
  if [ -d ../$${dir}/$${lib} ]; then \
flags=`echo $$i | sed -e 's/^[^;]*;//' -e 's/@/ -/g'`; \
-   if (cd ../$${dir}/$${lib}; $(MAKE) $(FLAGS_TO_PASS) \
+   libsuffix_=`$${compiler} $${flags} --print-multi-os-directory`; \
+   if (cd ../$${dir}/$${lib}; $(MAKE) $(subst \
+   -B$(build_tooldir)/lib/, \
+   -B$(build_tooldir)/lib/$${libsuffix_}/ \
+   -B$(build_tooldir)/lib/, \
+   $(FLAGS_TO_PASS)) \
CFLAGS="$(CFLAGS) $${flags}" \
CCASFLAGS="$(CCASFLAGS) $${flags}" \
FCFLAGS="$(FCFLAGS) $${flags}" \
@@ -768,6 +773,7 @@ if [ -n "${multidirs}" ] && [ -z "${ml_norecursion}" ]; then
# Create a regular expression that matches any string as long
# as ML_POPDIR.
popdir_rx=`echo "${ML_POPDIR}" | sed 's,.,.,g'`
+   multi_osdir=`${CC-gcc} ${flags} --print-multi-os-directory 2>/dev/null`
CC_=
for arg in ${CC}; do
  case $arg in
@@ -775,6 +781,8 @@ if [ -n "${multidirs}" ] && [ -z "${ml_norecursion}" ]; then
CC_="${CC_}"`echo "X${arg}" | sed -n "s/X\\(-[BIL]${popdir_rx}\\).*/\\1/p"`/${ml_dir}`echo "X${arg}" | sed -n "s/X-[BIL]${popdir_rx}\\(.*\\)/\1/p"`' ' ;;
  "${ML_POPDIR}"/*)
CC_="${CC_}"`echo "X${arg}" | sed -n "s/X\\(${popdir_rx}\\).*/\\1/p"`/${ml_dir}`echo "X${arg}" | sed -n "s/X${popdir_rx}\\(.*\\)/\\1/p"`' ' ;;
+ -B*/lib/)
+   CC_="${CC_}${arg}${multi_osdir} ${arg} " ;;
  *)
CC_="${CC_}${arg} " ;;
  esac
@@ -787,6 +795,8 @@ if [ -n "${multidirs}" ] && [ -z "${ml_norecursion}" ]; then
CXX_="${CXX_}"`echo "X${arg}" | sed -n "s/X\\(-[BIL]${popdir_rx}\\).*/\\1/p"`/${ml_dir}`echo "X${arg}" | sed -n "s/X-[BIL]${popdir_rx}\\(.*\\)/\\1/p"`' ' ;;
  "${ML_POPDIR}"/*)
CXX_="${CXX_}"`echo "X${arg}" | sed -n "s/X\\(${popdir_rx}\\).*/\\1/p"`/${ml_dir}`echo "X${arg}" | sed -n "s/X${popdir_rx}\\(.*\\)/\\1/p"`' ' ;;
+ -B*/lib/)
+   CXX_="${CXX_}${arg}${multi_osdir} ${arg} " ;;
  *)
CXX_="${CXX_}${arg} " ;;
  esac
@@ -799,6 +809,8 @@ if [ -n "${multidirs}" ] && [ -z "${ml_norecursion}" ]; then
F77_="${F77_}"`echo "X${arg}" | sed -n "s/X\\(-[BIL]${popdir_rx}\\).*/\\1/p"`/${ml_dir}`echo "X${arg}" | sed -n "s/X-[BIL]${popdir_rx}\\(.*\\)/\\1/p"`' ' ;;
  "${ML_POPDIR}"/*)
F77_="${F77_}"`echo "X${arg}" | sed -n "s/X\\(${popdir_rx}\\).*/\\1/p"`/${ml_dir}`echo "X${arg}" | sed -n "s/X${popdir_rx}\\(.*\\)/\\1/p"`' ' ;;
+ -B*/lib/)
+   F77_="${F77_}${arg}${multi_osdir} ${arg} " ;;
  *)
F77_="${F77_}${arg} " ;;
  esac
@@ -811,6 +823,8 @@ if [ -n "${multidirs}" ] && [ -z "${ml_norecursion}" ]; then
GFORTRAN_="${GFORTRAN_}"`echo "X${arg}" | sed -n "s/X\\(-[BIL]${popdir_rx}\\).*/\\1/p"`/${ml_dir}`echo "X${arg}" | sed -n "s/X-[BIL]${popdir_rx}\\(.*\\)/\\1/p"`' ' ;;
  "${ML_POPDIR}"/*)
GFORTRAN_="${GFORTRAN_}"`echo "X${arg}" | sed -n "s/X\\(${popdir_rx}\\).*/\\1/p"`/${ml_dir}`echo "X${arg}" | sed -n "s/X${popdir_rx}\\(.*\\)/\\1/p"`' ' ;;
+ -B*/lib/)
+   GFORTRAN_="${GFORTRAN_}${arg}${multi_osdir} ${arg} " ;;
  *)
GFORTRAN_="${GFORTRAN_}${arg} " ;;
  esac
@@ -823,6 +837,8 @@ if [ -n "${multidirs}" ] && [ -z "${ml_norecursion}" ]; then
GOC_="${GOC_}"`echo "X${arg}" | sed -n "s/X\\(-[BIL]${popdir_rx}\\).*/\\1/p"`/${ml_dir}`echo "X${arg}" | sed -n "s/X-[BIL]${popdir_rx}\\(.*\\)/\\1/p"`' ' ;;
  "${ML_POPDIR}"/*)
GOC_="${GOC_}"`echo "X${arg}" | sed -n 
"s/X\\(${popdir_rx}\\).*/\\1/p"`/${ml_dir}`echo "X${arg}" | se

Re: Ping^3: [PATCH] Add a late-combine pass [PR106594]

2023-12-31 Thread YunQiang Su
Richard Sandiford wrote on Sat, Dec 30, 2023 at 23:35:
>
> Ping^3
>

I am testing it on MIPS.

> --- a/gcc/common/config/aarch64/aarch64-common.cc
> +++ b/gcc/common/config/aarch64/aarch64-common.cc
> @@ -55,6 +55,7 @@ static const struct default_options aarch_option_optimization_table[] =
>  { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
>  /* Enable redundant extension instructions removal at -O2 and higher.  */
>  { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
> +{ OPT_LEVELS_2_PLUS, OPT_flate_combine_instructions, NULL, 1 },
>  #if (TARGET_DEFAULT_ASYNC_UNWIND_TABLES == 1)

This hunk needs a refresh here.

>  { OPT_LEVELS_ALL, OPT_fasynchronous_unwind_tables, NULL, 1 },
>  { OPT_LEVELS_ALL, OPT_funwind_tables, NULL, 1},
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 5a9284d635c..d0576ac97cf 100644

