Re: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-23 Thread Jeff Law




On 12/23/23 17:49, Roger Sayle wrote:


> Hi YunQiang (and Jeff),
>
>> MIPS claims TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) ==
>> true based on that the hard register is always sign-extended, but
>> here the hard register is polluted by zero_extract.
>
> I suspect that the bug here is that the MIPS backend shouldn't be
> returning true for TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode).
> It's true that the backend stores SImode values in DImode registers
> by sign extending them, but this doesn't mean that any DImode pseudo
> register can be truncated to an SImode pseudo just by SUBREG/register
> naming.  As you point out, if the high bits of a DImode value are
> random, truncation isn't a no-op, and requires an explicit
> sign-extension instruction.

What's exceedingly weird is T_N_T_M_P (DImode, SImode) isn't actually a
truncation!  The output precision is first, the input precision is
second.  The docs explicitly state the output precision should be
smaller than the input precision (which makes sense for truncation).


That's where I'd start with trying to untangle this mess.





> I agree with Jeff there's an invariant that isn't correctly being
> modelled by the MIPS machine description.  A machine description
> probably shouldn't define an addsi3 pattern if what it actually
> supports is
> (sign_extend:DI (truncate:SI (plus:DI (reg:DI x) (reg:DI y)))).
> Trying to model this as SImode addition plus a SUBREG_PROMOTED
> flag is less than ideal.

It's less than ideal, but we ended up taking a similar approach in the
RV world.  We actually have a subset of 32bit instructions in rv64,
including a 32bit add.


The semantics are that it's a (sign_extend:DI (plus:SI (op1) (op2)))

Modeling it that way was actually critical in eliminating redundant sign 
extensions.
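
To make the semantics above concrete, here is a minimal C++ sketch of what such a 32bit add computes on a 64bit register.  The names sext32 and addw_model are made up for this illustration; they are not GCC or RISC-V identifiers, and the sketch only mirrors the RTL (sign_extend:DI (plus:SI (op1) (op2))) quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend a 32-bit pattern to 64 bits using only well-defined
// unsigned-to-signed arithmetic.
constexpr int64_t
sext32 (uint32_t v)
{
  return static_cast<int64_t> (v ^ UINT32_C (0x80000000))
	 - INT64_C (0x80000000);
}

// Model of (sign_extend:DI (plus:SI op1 op2)): add the low 32 bits with
// wraparound, ignore the upper halves of the inputs, then sign-extend.
constexpr int64_t
addw_model (int64_t op1, int64_t op2)
{
  return sext32 (static_cast<uint32_t> (op1) + static_cast<uint32_t> (op2));
}

// Runtime checks of the model on a few corner cases.
inline void
addw_examples ()
{
  // 32-bit overflow wraps and the result is sign-extended.
  assert (addw_model (INT64_C (0x7fffffff), 1) == -INT64_C (0x80000000));
  assert (addw_model (-1, 1) == 0);
  // Garbage in the upper 32 bits of an input does not reach the result.
  assert (addw_model (INT64_C (0x100000005), 7) == 12);
}
```

Because the result is always a sign-extended 32-bit value, a following explicit sign extension of that result would be redundant, which is the property the paragraph above relies on.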


But regardless, it looks like there's something weird going on in the 
MIPS port.


jeff


Re: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-23 Thread Jeff Law




On 12/23/23 15:46, YunQiang Su wrote:

> Jeff Law wrote on Sunday, December 24, 2023 at 00:51:
>>
>> On 12/23/23 01:58, YunQiang Su wrote:
>>> On TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) == true platforms,
>>> if bits 31 and above are polluted by a bit operation, we need a
>>> truncate.  Let's emit one, using the same hard register as
>>> input and output.  The RTL may look like:
>>>
>>> (insn 21 20 24 2 (set (subreg/s/u:SI (reg/v:DI 200 [ val ]) 0)
>>>   (truncate:SI (reg/v:DI 200 [ val ]))) "../xx.c":7:29 -1
>>>    (nil))
>>>
>>> We use the /s/u flags to mark it as really needed, because in
>>> combine_simplify_rtx this insn may be considered as already
>>> truncated, so let's skip this combination.
>>>
>>> gcc/ChangeLog:
>>>   PR: 104914.
>>>   * combine.cc (try_combine): Skip combine with truncate if
>>>    dest is subreg and has /u/s flags on platforms
>>>    TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) == true.
>>>    * expr.cc (expand_assignment): Emit a truncate insn, if
>>>    31+ bits is polluted for SImode.
>>>
>>> gcc/testsuite/ChangeLog:
>>>    PR: 104914.
>>>    * gcc.target/mips/pr104914.c: New testcase.
>>
>> I would suggest you show the RTL before/after whatever transformation
>> has caused problems on your target and explain why you think the
>> transformation is incorrect.



> Before this patch, the RTL is like this:
>   (insn 19 18 20 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
>                 (const_int 8 [0x8])
>                 (const_int 24 [0x18]))
>             (subreg:DI (reg:QI 205) 0)) "../xx.c":7:29 -1
>        (nil))
>   (insn 20 19 23 2 (set (reg/v:DI 200 [ val ])
>             (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0))) "../xx.c":7:29 -1
>        (nil))
>   (jump_insn 23 20 24 2 (set (pc)
>             (if_then_else (lt (subreg/s/u:SI (reg/v:DI 200 [ val ]) 0)
>                     (const_int 0 [0]))
>                 (label_ref 32)
>                 (pc))) "../xx.c":10:5 -1
>        (int_list:REG_BR_PROB 440234148 (nil))
>    -> 32)
>
> Then, during combine,
>   (insn 20 19 23 2 (set (reg/v:DI 200 [ val ])
>             (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0))) "../xx.c":7:29 -1
>        (nil))
> is converted to
>   (note 20 19 23 2 NOTE_INSN_DELETED)
> MIPS claims TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) == true
> based on that the hard register is always sign-extended, but here
> the hard register is polluted by zero_extract.
>
> If we just patch combine.cc to make it not eat the sign_extend here,
> the sign_extend will still disappear in later passes, because MIPS defines
> sign_extend as "emit_note (NOTE_INSN_DELETED)".
>
> So I tried to insert a new truncate RTX here:
>   (insn 21 20 24 2 (set (reg/v:DI 200 [ val ])
>             (truncate:SI (reg/v:DI 200 [ val ]))) "../xx.c":7:29 -1
>        (nil))
> This is the RTL for this C code:
>   int32_t fun (int64_t arg) {
>     int32_t a = (int32_t) arg;
>     return a;
>   }
> But the `reload` pass then hits an ICE.  I haven't dug into the real problem.
> If the new RTX is
>   (insn 21 20 24 2 (set (subreg/s/u:SI (reg/v:DI 200 [ val ]) 0)
>             (truncate:SI (reg/v:DI 200 [ val ]))) "../xx.c":7:29 -1
>        (nil))
> the `reload` pass happily accepts it, and it is then converted to the
> hard instruction
>   # this instruction makes sure the register is properly sign-extended.
>   `sll $rN, $rN, 0`
>
> The problem is that simplify-rtx (called by combine) will believe that
> REG 200 has been truncated to SImode, as the dest has a
> subreg:SI.
>
> So I use the /s/u flags to tell combine not to do so.


>> Focus on the RTL semantics as well as the target specific semantics
>> because both are critically important here.
>>
>> I strongly suspect you're just papering over a problem elsewhere.
>
> Yes. I also guess so.  Any new idea?

Well, I see multiple intertwined issues and I think MIPS has largely
mucked this up.


At a high level DI -> SI truncation is not a nop on MIPS64.  We must 
explicitly sign extend the value from SI->DI to preserve the invariant 
that SI mode objects are extended to DImode.  If we fail to do that, 
then the SImode conditional branch patterns simply aren't going to work.
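
As an illustration of the invariant described above, the following C++ sketch models a 64-bit register that is supposed to hold a sign-extended SImode value.  All names here (sext32, is_canonical_si, insert_bits, sll0) are hypothetical helpers written for this example, not GCC or MIPS identifiers, and the bit positions assume little-endian bit numbering for the zero_extract shown earlier in the thread.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend a 32-bit pattern to 64 bits with well-defined arithmetic.
constexpr int64_t
sext32 (uint32_t v)
{
  return static_cast<int64_t> (v ^ UINT32_C (0x80000000))
	 - INT64_C (0x80000000);
}

// True if bits 63..32 of REG are copies of bit 31, i.e. REG holds a
// canonical (sign-extended) SImode value.
constexpr bool
is_canonical_si (uint64_t reg)
{
  return reg == static_cast<uint64_t> (sext32 (static_cast<uint32_t> (reg)));
}

// Model of a zero_extract store: overwrite LEN bits of REG starting at
// bit POS with the low LEN bits of VAL.  Requires LEN < 64.
constexpr uint64_t
insert_bits (uint64_t reg, unsigned pos, unsigned len, uint64_t val)
{
  return (reg & ~(((UINT64_C (1) << len) - 1) << pos))
	 | ((val & ((UINT64_C (1) << len) - 1)) << pos);
}

// Model of `sll $rN, $rN, 0` on a 64-bit target: keep the low 32 bits
// and re-create the sign extension.
constexpr uint64_t
sll0 (uint64_t reg)
{
  return static_cast<uint64_t> (sext32 (static_cast<uint32_t> (reg)));
}

inline void
pollution_example ()
{
  uint64_t reg = 0;				// canonical SImode zero
  uint64_t polluted = insert_bits (reg, 24, 8, 0x80);
  assert (polluted == UINT64_C (0x80000000));
  // Bit 31 is now set but bits 63..32 are still zero.
  assert (!is_canonical_si (polluted));
  // A test of the 64-bit sign bit sees "non-negative" ...
  assert ((polluted >> 63) == 0);
  // ... although the 32-bit value is negative.
  assert (((polluted >> 31) & 1) == 1);
  // The explicit truncation restores the invariant.
  assert (is_canonical_si (sll0 (polluted)));
}
```

The example shows why treating the DI -> SI step as a no-op is unsafe once the high bits have been written by a bit-field insertion: the register no longer satisfies the property that the SImode branch patterns assume.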


What doesn't make sense to me is that for truncation, the output mode is 
going to be smaller than the input mode.  Which makes logical sense and 
is codified in the documentation:



@deftypefn {Target Hook} bool TARGET_TRULY_NOOP_TRUNCATION (poly_uint64 
@var{outprec}, poly_uint64 @var{inprec})
This hook returns true if it is safe to ``convert'' a value of
@var{inprec} bits to one of @var{outprec} bits (where @var{outprec} is
smaller than @var{inprec}) by merely operating on it as if it had only
@var{outprec} bits.  The default returns true unconditionally, which
is correct for most machines.  When @code{TARGET_TRULY_NOOP_TRUNCATION}
returns false, the machine description should provide a @code{trunc}
optab to specify the RTL that performs the required truncation.



Yet the implementation in the mips backend:


static bool
mips_truly_noop_truncation (poly_uint64 outprec, poly_uint64 inprec)
{
  return !TARGET_64BIT || inprec 

RE: [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

2023-12-23 Thread Li, Pan2
Thanks Jeff for comments.

> Isn't this going to XPASS for non-vector configurations?

Yes, I think we still need something like riscv_v here.

> If I understand correctly, the test requires loop unrolling and its 
> associated variable expansion to trigger the desired behavior.  VLA 
> style vectorization is inhibiting loop unrolling and thus we get the 
> failure?

Yes, exactly.

> So the natural question here is whether or not aarch64 SVE sees the same 
> failure, if not, why?  If so, then can we conditionalize this on an 
> effective target test (check_effective_target_vect_variable_length perhaps?)

Sure, will have a try for this.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Sunday, December 24, 2023 1:20 AM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com; richard.guent...@gmail.com
Subject: Re: [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with 
variable factor



On 12/23/23 04:07, pan2...@intel.com wrote:
> From: Pan Li 
> 
> This patch would like to XFAIL the test case pr30957-1.c for RVV when
> building the ELF with some configurations (listed at the end of the log).
> The loop will be vectorized during vect_transform_loop with a variable factor.
> It won't benefit from unrolling/peeling, so loop->unroll is marked as 1.
> Of course, unroll_loops then does nothing when loop->unroll is 1.
> 
> After this patch the loops vectorized with a variable factor for RVV
> will be treated as XFAIL by the tree dump.
> 
> That is, the configurations below will be treated as XFAIL, and we still need
> further investigation for the failures of other configurations.
> 
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * 
> 

RE: [PATCH v2] RISC-V: XFail the signbit-5 run test for RVV

2023-12-23 Thread Li, Pan2
Committed, thanks Jeff.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Saturday, December 23, 2023 11:38 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
kito.ch...@gmail.com; richard.guent...@gmail.com; tamar.christ...@arm.com
Subject: Re: [PATCH v2] RISC-V: XFail the signbit-5 run test for RVV



On 12/23/23 05:39, pan2...@intel.com wrote:
> From: Pan Li 
> 
> This patch would like to XFail the signbit-5 run test case for
> RVV, given that the case has a limitation stated at the beginning
> of the test file: "This test does not work when the truth type does
> not match vector type."  That is, the RVV vector truth type is not an
> integer type.
> 
> The riscv-sim target board shown below will pick up `-march=rv64gcv`
> when building the run test ELF.  Thus, RVV cannot bypass this test
> case the way aarch64_sve does with the additional option `-march=armv8-a`.
> 
>   riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow
> 
> For RVV, we leverage dg-xfail-run-if for this case, like `amdgcn`.
> 
> The signbit-5.c test passes with the configurations below, but we need
> further investigation for the failures of other configurations.
> 
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow
> * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
> * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
> * riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
> * 
> 

Re: Fortran: Use non conflicting file extensions for intermediates [PR81615]

2023-12-23 Thread Rimvydas Jasinskas
Documentation part.
Running makeinfo on gcc/fortran/gfortran.texi does not seem to produce any
new warnings.
Is there a specific reason why -fc-prototypes (Interoperability
Options section) is excluded from the manpage?

Regards,
Rimvydas
From 3adb6cd8a2aa1576a8ff63b280239e725f1f112e Mon Sep 17 00:00:00 2001
From: Rimvydas Jasinskas 
Date: Sat, 23 Dec 2023 18:59:09 +
Subject: Fortran: Add Developer Options mini-section to documentation

Separate out -fdump-* options to the new section.  Sort by option name.

While there, document -save-temps intermediates.

gcc/fortran/ChangeLog:

	* invoke.texi: Add Developer Options section.  Move '-fdump-*'
	to it.  Add small examples about changed '-save-temps' behavior.

Signed-off-by: Rimvydas Jasinskas 
---
 gcc/fortran/invoke.texi | 117 ++--
 1 file changed, 77 insertions(+), 40 deletions(-)

diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi
index c7fd019a7c5..6b85fb8dff0 100644
--- a/gcc/fortran/invoke.texi
+++ b/gcc/fortran/invoke.texi
@@ -94,12 +94,13 @@ one is not the default.
  compiled.
 * Preprocessing Options::  Enable and customize preprocessing.
 * Error and Warning Options:: How picky should the compiler be?
-* Debugging Options::   Symbol tables, measurements, and debugging dumps.
+* Debugging Options::   Symbol tables, measurements.
 * Directory Options::   Where to find module files
 * Link Options ::   Influencing the linking step
 * Runtime Options:: Influencing runtime behavior
 * Code Gen Options::Specifying conventions for function calls, data layout
 and register usage.
+* Developer Options::   Printing GNU Fortran specific info, debugging dumps.
 * Interoperability Options::  Options for interoperability with other
   languages.
 * Environment Variables:: Environment variables that affect @command{gfortran}.
@@ -159,9 +160,8 @@ and warnings}.
 }
 
 @item Debugging Options
-@xref{Debugging Options,,Options for debugging your program or GNU Fortran}.
-@gccoptlist{-fbacktrace -fdump-fortran-optimized -fdump-fortran-original
--fdebug-aux-vars -fdump-fortran-global -fdump-parse-tree -ffpe-trap=@var{list}
+@xref{Debugging Options,,Options for debugging your program}.
+@gccoptlist{-fbacktrace -fdebug-aux-vars -ffpe-trap=@var{list}
 -ffpe-summary=@var{list}
 }
 
@@ -201,6 +201,12 @@ and warnings}.
 -fpack-derived -frealloc-lhs -frecursive -frepack-arrays
 -fshort-enums -fstack-arrays
 }
+
+@item Developer Options
+@xref{Developer Options,,GNU Fortran Developer Options}.
+@gccoptlist{-fdump-fortran-global -fdump-fortran-optimized
+-fdump-fortran-original -fdump-parse-tree -save-temps
+}
 @end table
 
 @node Fortran Dialect Options
@@ -1280,40 +1286,14 @@ and other GNU compilers.
 Some of these have no effect when compiling programs written in Fortran.
 
 @node Debugging Options
-@section Options for debugging your program or GNU Fortran
+@section Options for debugging your program
 @cindex options, debugging
 @cindex debugging information options
 
 GNU Fortran has various special options that are used for debugging
-either your program or the GNU Fortran compiler.
+your program.
 
 @table @gcctabopt
-@opindex @code{fdump-fortran-original}
-@item -fdump-fortran-original
-Output the internal parse tree after translating the source program
-into internal representation.  This option is mostly useful for
-debugging the GNU Fortran compiler itself. The output generated by
-this option might change between releases. This option may also
-generate internal compiler errors for features which have only
-recently been added.
-
-@opindex @code{fdump-fortran-optimized}
-@item -fdump-fortran-optimized
-Output the parse tree after front-end optimization.  Mostly useful for
-debugging the GNU Fortran compiler itself. The output generated by
-this option might change between releases.  This option may also
-generate internal compiler errors for features which have only
-recently been added.
-
-@opindex @code{fdump-parse-tree}
-@item -fdump-parse-tree
-Output the internal parse tree after translating the source program
-into internal representation.  Mostly useful for debugging the GNU
-Fortran compiler itself. The output generated by this option might
-change between releases. This option may also generate internal
-compiler errors for features which have only recently been added. This
-option is deprecated; use @code{-fdump-fortran-original} instead.
-
 @opindex @code{fdebug-aux-vars}
 @item -fdebug-aux-vars
 Renames internal variables created by the gfortran front end and makes
@@ -1323,14 +1303,6 @@ useful for debugging the compiler's code generation together with
 @code{-fdump-tree-original} and enabling debugging of the executable
 program by using @code{-g} or @code{-ggdb3}.
 
-@opindex @code{fdump-fortran-global}
-@item -fdump-fortran-global
-Output a list of the global identifiers after translating into
-middle-end 

RE: Re: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-23 Thread Roger Sayle
> There's a PR in Bugzilla around this representational issue on MIPS, but I
> can't find it straight away.

Found it.  It's PR rtl-optimization/104914, where we've already
discussed this in comments #15 and #16.

> -----Original Message-----
> From: Roger Sayle
> Sent: 24 December 2023 00:50
> To: 'gcc-patches@gcc.gnu.org' ; 'YunQiang Su'
> 
> Cc: 'Jeff Law'
> Subject: Re: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode
> 
> 
> Hi YunQiang (and Jeff),
> 
> > MIPS claims TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) == true
> > based on that the hard register is always sign-extended, but here the
> > hard register is polluted by zero_extract.
> 
> I suspect that the bug here is that the MIPS backend shouldn't be returning
> true for TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode).   It's true
> that the backend stores SImode values in DImode registers by sign extending
> them, but this doesn't mean that any DImode pseudo register can be truncated
> to an SImode pseudo just by SUBREG/register naming.  As you point out, if the
> high bits of a DImode value are random, truncation isn't a no-op, and requires
> an explicit sign-extension instruction.
> 
> There's a PR in Bugzilla around this representational issue on MIPS, but I
> can't find it straight away.
> 
> Out of curiosity, how badly affected is the testsuite if mips.cc's
> mips_truly_noop_truncation (poly_uint64 outprec, poly_uint64 inprec) is
> changed to just return !TARGET_64BIT ?
> 
> I agree with Jeff there's an invariant that isn't correctly being modelled
> by the MIPS machine description.  A machine description probably shouldn't
> define an addsi3 pattern if what it actually supports is
> (sign_extend:DI (truncate:SI (plus:DI (reg:DI x) (reg:DI y)))).
> Trying to model this as SImode addition plus a SUBREG_PROMOTED flag
> is less than ideal.
> 
> Just my thoughts.  I'm curious what other folks think.
> 
> Cheers,
> Roger
> --




Re: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-23 Thread Roger Sayle


Hi YunQiang (and Jeff),

> MIPS claims TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) == true
> based on that the hard register is always sign-extended, but here
> the hard register is polluted by zero_extract.

I suspect that the bug here is that the MIPS backend shouldn't be returning
true for TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode).   It's true
that the backend stores SImode values in DImode registers by sign extending
them, but this doesn't mean that any DImode pseudo register can be truncated
to an SImode pseudo just by SUBREG/register naming.  As you point out, if
the high bits of a DImode value are random, truncation isn't a no-op, and
requires an explicit sign-extension instruction.

There's a PR in Bugzilla around this representational issue on MIPS, but I
can't find it straight away.

Out of curiosity, how badly affected is the testsuite if mips.cc's
mips_truly_noop_truncation (poly_uint64 outprec, poly_uint64 inprec)
is changed to just return !TARGET_64BIT ?

I agree with Jeff there's an invariant that isn't correctly being modelled
by the MIPS machine description.  A machine description probably shouldn't
define an addsi3 pattern if what it actually supports is
(sign_extend:DI (truncate:SI (plus:DI (reg:DI x) (reg:DI y)))).
Trying to model this as SImode addition plus a SUBREG_PROMOTED flag
is less than ideal.

Just my thoughts.  I'm curious what other folks think.

Cheers,
Roger
--




[committed] CRIS: Fix PR middle-end/113109; "throw" failing

2023-12-23 Thread Hans-Peter Nilsson
No test-case, but the regress-367 from r14-6674-g4759383245ac97 is
"back" to regress-10 for cris-elf+cris-sim with this patch applied
to gcc from that revision.

Also, I wonder why none of those other targets with a MEM for
EH_RETURN_HANDLER_RTX with an address relative to
frame_pointer_rtx (as opposed to hard_frame_pointer_rtx or
virtual_incoming_args_rtx) see the same problem.

Oh well.  Merry Xmas.

brgds, H-P

-- >8 --
TL;DR: the "dse1" pass removed the eh-return-address store.  The
PA also marks its EH_RETURN_HANDLER_RTX as volatile, for the same
reason, as does visium.  See PR32769 - it's the same thing on PA.

Conceptually, it's logical that stores to incoming args are
optimized out on the return path or if no loads are seen -
at least before epilogue expansion, when the subsequent load
isn't seen in the RTL, as is the case for the "dse1" pass.
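
The reasoning above can be pictured with a toy model.  The following C++ sketch is not GCC's dse pass; it is a deliberately simplified dead-store filter over a made-up instruction list (the types Op and Insn and the function toy_dse are invented for this example).  It only illustrates why a store with no visible later load is dropped unless its memory reference is marked volatile.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { Store, Load };

struct Insn
{
  Op op;
  int addr;		// Abstract memory location.
  bool is_volatile;	// Stands in for MEM_VOLATILE_P on the reference.
};

// Toy dead-store filter: keep every load and every volatile access;
// keep a non-volatile store only if some later insn in the sequence
// loads from the same location.
inline std::vector<Insn>
toy_dse (const std::vector<Insn> &insns)
{
  std::vector<Insn> kept;
  for (std::size_t i = 0; i < insns.size (); ++i)
    {
      const Insn &insn = insns[i];
      bool needed = insn.op == Op::Load || insn.is_volatile;
      for (std::size_t j = i + 1; !needed && j < insns.size (); ++j)
	if (insns[j].op == Op::Load && insns[j].addr == insn.addr)
	  needed = true;
      if (needed)
	kept.push_back (insn);
    }
  return kept;
}
```

In this model, a lone non-volatile store (the analogue of the eh-return-address store before the epilogue's load exists in the RTL) is removed, while the same store marked volatile survives.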

I haven't looked into why this problem, that appeared for the PA
already in 2007, was seen for CRIS only recently (with
r14-6674-g4759383245ac97).

PR middle-end/113109
* config/cris/cris.cc (cris_eh_return_handler_rtx): New function.
* config/cris/cris-protos.h (cris_eh_return_handler_rtx): Prototype.
* config/cris/cris.h (EH_RETURN_HANDLER_RTX): Redefine to call
cris_eh_return_handler_rtx.
---
 gcc/config/cris/cris-protos.h |  1 +
 gcc/config/cris/cris.cc   | 16 
 gcc/config/cris/cris.h|  3 +--
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/gcc/config/cris/cris-protos.h b/gcc/config/cris/cris-protos.h
index 666e04f9eeec..06678c723b56 100644
--- a/gcc/config/cris/cris-protos.h
+++ b/gcc/config/cris/cris-protos.h
@@ -28,6 +28,7 @@ extern bool cris_reload_address_legitimized (rtx, machine_mode, int, int, int);
 extern int cris_side_effect_mode_ok (enum rtx_code, rtx *, int, int,
  int, int, int);
 extern rtx cris_return_addr_rtx (int, rtx);
+extern rtx cris_eh_return_handler_rtx ();
 extern rtx cris_split_movdx (rtx *);
 extern bool cris_base_p (const_rtx, bool);
 extern bool cris_base_or_autoincr_p (const_rtx, bool);
diff --git a/gcc/config/cris/cris.cc b/gcc/config/cris/cris.cc
index 7705c25ed6c0..38a4dd29114d 100644
--- a/gcc/config/cris/cris.cc
+++ b/gcc/config/cris/cris.cc
@@ -1382,6 +1382,22 @@ cris_return_addr_rtx (int count, rtx frameaddr ATTRIBUTE_UNUSED)
 : NULL_RTX;
 }
 
+/* Setting the EH return return address is done by a *store* to a memory
+   address expressed as relative to "*incoming* args".  That store will
+   be optimized away, unless the MEM is marked as volatile.  N.B.: no
+   optimization opportunities are expected to be lost due to this hack;
+   __builtin_eh_return isn't called from elsewhere than the EH machinery
+   in libgcc.  */
+
+rtx
+cris_eh_return_handler_rtx ()
+{
+  rtx ret = cris_return_addr_rtx (0, NULL_RTX);
+  gcc_assert (MEM_P (ret));
+  MEM_VOLATILE_P (ret) = true;
+  return ret;
+}
+
 /* Accessor used in cris.md:return because cfun->machine isn't available
there.  */
 
diff --git a/gcc/config/cris/cris.h b/gcc/config/cris/cris.h
index 087b226ee475..ced356088302 100644
--- a/gcc/config/cris/cris.h
+++ b/gcc/config/cris/cris.h
@@ -551,8 +551,7 @@ enum reg_class
 #define CRIS_STACKADJ_REG CRIS_STRUCT_VALUE_REGNUM
 #define EH_RETURN_STACKADJ_RTX gen_rtx_REG (SImode, CRIS_STACKADJ_REG)
 
-#define EH_RETURN_HANDLER_RTX \
-  cris_return_addr_rtx (0, NULL)
+#define EH_RETURN_HANDLER_RTX cris_eh_return_handler_rtx ()
 
 #define INIT_EXPANDERS cris_init_expanders ()
 
-- 
2.30.2



[ARC PATCH] Table-driven ashlsi implementation for better code/rtx_costs.

2023-12-23 Thread Roger Sayle

One of the cool features of the H8 backend is its use of tables to select
optimal shift implementations for different CPU variants.  This patch
borrows (plagiarizes) that idiom for SImode left shifts in the ARC backend
(for CPUs without a barrel-shifter).  This provides a convenient mechanism
for both selecting the best implementation strategy (for speed vs. size),
and providing accurate rtx_costs [without duplicating a lot of logic].
Left shift RTX costs are especially important for use in synth_mult.
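
As a rough picture of the idiom described above, here is a minimal C++ sketch of a shift-strategy table that is consulted both when choosing an implementation and when reporting a cost.  The table contents, the two-row layout and the names (ShiftAlg, ShiftInfo, toy_ashl_table, lookup_shift) are invented for illustration and are much smaller than the table in the patch.

```cpp
#include <cassert>

enum ShiftAlg { ALG_MOVE, ALG_INLINE, ALG_LOOP };

struct ShiftInfo
{
  ShiftAlg alg;		// Which implementation strategy to emit.
  unsigned cost;	// Cost in abstract "instruction" units.
};

// Rows: 0 = optimize for size, 1 = optimize for speed.
// Columns: shift count 0..3.  All entries are made-up example values.
static const ShiftInfo toy_ashl_table[2][4] = {
  { { ALG_MOVE, 1 }, { ALG_INLINE, 1 }, { ALG_INLINE, 2 }, { ALG_LOOP, 4 } },
  { { ALG_MOVE, 1 }, { ALG_INLINE, 1 }, { ALG_INLINE, 2 }, { ALG_INLINE, 3 } },
};

// Single lookup shared by the code generator and the cost function, so
// the two can never disagree about what a given shift will expand to.
inline const ShiftInfo &
lookup_shift (bool for_speed, unsigned count)
{
  assert (count < 4);
  return toy_ashl_table[for_speed ? 1 : 0][count];
}

// What a cost hook would report for a shift by COUNT.
inline unsigned
toy_shift_cost (bool for_speed, unsigned count)
{
  return lookup_shift (for_speed, count).cost;
}
```

The design point is that the strategy and its cost live in one table entry, so changing the preferred expansion for a shift count updates the reported cost at the same time.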

An example improvement is:

int foo(int x) { return 32768*x; }

which is now generated with -O2 -mcpu=em -mswap as:

foo:    bmsk_s  r0,r0,16
        swap    r0,r0
        j_s.d   [blink]
        ror     r0,r0

where previously the ARC backend would generate a loop:

foo:    mov     lp_count,15
        lp      2f
        add     r0,r0,r0
        nop
2:      # end single insn loop
        j_s     [blink]
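
To see why the three-instruction sequence is equivalent to a left shift by 15 (32768 is 2^15), the following C++ sketch models the instructions on 32-bit values.  The semantics are assumptions read off the listing rather than quoted from the ISA manual: bmsk keeps bits 0..n, swap exchanges the two 16-bit halves, and ror rotates right by one bit.  The function names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Assumed model of `bmsk_s r0,r0,n`: keep bits 0..n, clear the rest.
constexpr uint32_t
bmsk (uint32_t x, unsigned n)
{
  return n >= 31 ? x : x & ((UINT32_C (1) << (n + 1)) - 1);
}

// Assumed model of `swap r0,r0`: exchange the upper and lower halves.
constexpr uint32_t
swap16 (uint32_t x)
{
  return (x << 16) | (x >> 16);
}

// Assumed model of `ror r0,r0`: rotate right by one bit.
constexpr uint32_t
ror1 (uint32_t x)
{
  return (x >> 1) | (x << 31);
}

// The sequence from the listing: mask to 17 bits, swap halves, rotate.
constexpr uint32_t
shl15_via_swap (uint32_t x)
{
  return ror1 (swap16 (bmsk (x, 16)));
}

inline void
shl15_examples ()
{
  const uint32_t samples[] = { 0u, 1u, 0x1ffffu, 0x12345678u, 0xffffffffu };
  for (uint32_t x : samples)
    assert (shl15_via_swap (x) == static_cast<uint32_t> (x << 15));
}
```

Only the low 17 bits of the input can survive a shift by 15, which is why the mask comes first: after the swap they sit in bits 16..31 plus bit 0, and the single rotate moves them to bits 15..31.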


Tested with a cross-compiler to arc-linux hosted on x86_64,
with no new (compile-only) regressions from make -k check.
Ok for mainline if this passes Claudiu's and/or Jeff's testing?
[Thanks again to Jeff for finding the typo in my last ARC patch]

2023-12-23  Roger Sayle  

gcc/ChangeLog
* config/arc/arc.cc (arc_shift_alg): New enumerated type for
left shift implementation strategies.
(arc_shift_info): Type for each entry of the shift strategy table.
(arc_shift_context_idx): Return an integer value for each code
generation context, used as an index.
(arc_ashl_alg): Table indexed by context and shifted bit count.
(arc_split_ashl): Use the arc_ashl_alg table to select SImode
left shift implementation.
(arc_rtx_costs) : Use the arc_ashl_alg table to
provide accurate costs, when optimizing for speed or size.


Thanks in advance,
Roger
--

diff --git a/gcc/config/arc/arc.cc b/gcc/config/arc/arc.cc
index 3f4eb5a5736..925bffaa7d6 100644
--- a/gcc/config/arc/arc.cc
+++ b/gcc/config/arc/arc.cc
@@ -4222,6 +4222,253 @@ output_shift_loop (enum rtx_code code, rtx *operands)
   return "";
 }
 
+/* See below where shifts are handled for explanation of this enum.  */
+enum arc_shift_alg
+{
+  SHIFT_MOVE,  /* Register-to-register move.  */
+  SHIFT_LOOP,  /* Zero-overhead loop implementation.  */
+  SHIFT_INLINE,/* Mmultiple LSHIFTs and LSHIFT-PLUSs.  */ 
+  SHIFT_AND_ROT,/* Bitwise AND, then ROTATERTs.  */
+  SHIFT_SWAP,  /* SWAP then multiple LSHIFTs/LSHIFT-PLUSs.  */
+  SHIFT_AND_SWAP_ROT   /* Bitwise AND, then SWAP, then ROTATERTs.  */
+};
+
+struct arc_shift_info {
+  enum arc_shift_alg alg;
+  unsigned int cost;
+};
+
+/* Return shift algorithm context, an index into the following tables.
+ * 0 for -Os (optimize for size)   3 for -O2 (optimized for speed)
+ * 1 for -Os -mswap TARGET_V2  4 for -O2 -mswap TARGET_V2
+ * 2 for -Os -mswap !TARGET_V2 5 for -O2 -mswap !TARGET_V2  */
+static unsigned int
+arc_shift_context_idx ()
+{
+  if (optimize_function_for_size_p (cfun))
+{
+  if (!TARGET_SWAP)
+   return 0;
+  if (TARGET_V2)
+   return 1;
+  return 2;
+}
+  else
+{
+  if (!TARGET_SWAP)
+   return 3;
+  if (TARGET_V2)
+   return 4;
+  return 5;
+}
+}
+
+static const arc_shift_info arc_ashl_alg[6][32] = {
+  {  /* 0: -Os.  */
+{ SHIFT_MOVE, COSTS_N_INSNS (1) },  /*  0 */
+{ SHIFT_INLINE,   COSTS_N_INSNS (1) },  /*  1 */
+{ SHIFT_INLINE,   COSTS_N_INSNS (2) },  /*  2 */
+{ SHIFT_INLINE,   COSTS_N_INSNS (2) },  /*  3 */
+{ SHIFT_INLINE,   COSTS_N_INSNS (3) },  /*  4 */
+{ SHIFT_INLINE,   COSTS_N_INSNS (3) },  /*  5 */
+{ SHIFT_INLINE,   COSTS_N_INSNS (3) },  /*  6 */
+{ SHIFT_INLINE,   COSTS_N_INSNS (4) },  /*  7 */
+{ SHIFT_INLINE,   COSTS_N_INSNS (4) },  /*  8 */
+{ SHIFT_INLINE,   COSTS_N_INSNS (4) },  /*  9 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 10 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 11 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 12 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 13 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 14 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 15 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 16 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 17 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 18 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 19 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 20 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 21 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 22 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 23 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 24 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 25 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 26 */
+{ SHIFT_LOOP, COSTS_N_INSNS (4) },  /* 27 */
+ 
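The table-driven selection described in the ChangeLog above can be sketched outside of GCC as follows. This is only an illustration: the names ShiftAlg, ShiftInfo, context_idx and os_row_excerpt are hypothetical, costs are plain instruction counts rather than COSTS_N_INSNS values, and only the first four entries of the -Os row shown in the patch are reproduced.

```cpp
// Hypothetical stand-ins for the patch's arc_shift_alg / arc_shift_info.
enum class ShiftAlg { Move, Loop, Inline };

struct ShiftInfo {
  ShiftAlg alg;
  unsigned cost;  // instruction count, not COSTS_N_INSNS units
};

// Mirrors the numbering in the patch comment:
//   0: -Os            3: -O2
//   1: -Os swap V2    4: -O2 swap V2
//   2: -Os swap !V2   5: -O2 swap !V2
constexpr unsigned context_idx(bool optimize_size, bool has_swap, bool is_v2) {
  return (optimize_size ? 0u : 3u) + (!has_swap ? 0u : (is_v2 ? 1u : 2u));
}

// Excerpt of the -Os row for shift counts 0..3, copied from the patch.
constexpr ShiftInfo os_row_excerpt[4] = {
  { ShiftAlg::Move,   1 },  // shift by 0
  { ShiftAlg::Inline, 1 },  // shift by 1
  { ShiftAlg::Inline, 2 },  // shift by 2
  { ShiftAlg::Inline, 2 },  // shift by 3
};
```

Both the splitter and the cost hook would then read the same entry, which is presumably what keeps the reported cost in step with the code actually emitted.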

[PATCH v2] libstdc++: Use _GLIBCXX_USE_BUILTIN_TRAIT

2023-12-23 Thread Ken Matsui
This patch uses _GLIBCXX_USE_BUILTIN_TRAIT macro instead of __has_builtin
in the type_traits header for traits that have a corresponding fallback
non-built-in implementation.  This macro allows toggling the use of
built-in traits in the type_traits header through the
_GLIBCXX_DO_NOT_USE_BUILTIN_TRAITS macro, without needing to modify the
source code.

libstdc++-v3/ChangeLog:

* include/std/type_traits: Use _GLIBCXX_USE_BUILTIN_TRAIT.

Signed-off-by: Ken Matsui 
Reviewed-by: Patrick Palka 
---
 libstdc++-v3/include/std/type_traits | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index f00c07f94f9..ba35ffb27fa 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1481,7 +1481,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : public __bool_constant<__is_base_of(_Base, _Derived)>
 { };
 
-#if __has_builtin(__is_convertible)
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_convertible)
   template
 struct is_convertible
 : public __bool_constant<__is_convertible(_From, _To)>
@@ -1531,7 +1531,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #ifdef __cpp_lib_is_nothrow_convertible // C++ >= 20
 
-#if __has_builtin(__is_nothrow_convertible)
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_nothrow_convertible)
   /// is_nothrow_convertible_v
   template
 inline constexpr bool is_nothrow_convertible_v
@@ -1606,7 +1606,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { using type = _Tp; };
 
   /// remove_cv
-#if __has_builtin(__remove_cv)
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__remove_cv)
   template
 struct remove_cv
 { using type = __remove_cv(_Tp); };
@@ -1672,7 +1672,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Reference transformations.
 
   /// remove_reference
-#if __has_builtin(__remove_reference)
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__remove_reference)
   template
 struct remove_reference
 { using type = __remove_reference(_Tp); };
@@ -3537,7 +3537,7 @@ template
* @{
*/
 #ifdef __cpp_lib_remove_cvref // C++ >= 20
-# if __has_builtin(__remove_cvref)
+# if _GLIBCXX_USE_BUILTIN_TRAIT(__remove_cvref)
   template
 struct remove_cvref
 { using type = __remove_cvref(_Tp); };
-- 
2.43.0
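For readers who want to see how such a toggle can work, here is a minimal sketch. It is not the libstdc++ definition: MY_USE_BUILTIN_TRAIT, MY_DO_NOT_USE_BUILTIN_TRAITS and my_remove_cv are hypothetical names chosen for this example, and the fallback is an ordinary partial-specialization implementation.

```cpp
#include <type_traits>

// Hypothetical toggle: defining MY_DO_NOT_USE_BUILTIN_TRAITS forces the
// fallback path even when the compiler provides the built-in.
#ifdef MY_DO_NOT_USE_BUILTIN_TRAITS
# define MY_USE_BUILTIN_TRAIT(BT) 0
#elif defined(__has_builtin)
# define MY_USE_BUILTIN_TRAIT(BT) __has_builtin(BT)
#else
# define MY_USE_BUILTIN_TRAIT(BT) 0
#endif

#if MY_USE_BUILTIN_TRAIT(__remove_cv)
// Built-in path: the compiler computes the type directly.
template<typename T>
  struct my_remove_cv
  { using type = __remove_cv(T); };
#else
// Fallback path: strip qualifiers with partial specializations.
template<typename T>
  struct my_remove_cv
  { using type = T; };

template<typename T>
  struct my_remove_cv<const T>
  { using type = T; };

template<typename T>
  struct my_remove_cv<volatile T>
  { using type = T; };

template<typename T>
  struct my_remove_cv<const volatile T>
  { using type = T; };
#endif
```

Compiling the same translation unit with and without -DMY_DO_NOT_USE_BUILTIN_TRAITS exercises both paths without editing the header, which is the point of the macro.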



Re: Re: [PATCH v3 0/6] RISC-V: Support XTheadVector extension

2023-12-23 Thread 钟居哲
I suggest you first send the patch that supports theadvector by only adding 
the "th." prefix.
Once that is done, we can discuss the rest.



juzhe.zh...@rivai.ai
 
From: joshua
Sent: 2023-12-23 11:37
To: juzhe.zh...@rivai.ai; gcc-patches
Cc: Jim Wilson; palmer; andrew; philipp.tomsich; jeffreyalaw; christoph.muellner
Subject: Re: Re: [PATCH v3 0/6] RISC-V: Support XTheadVector extension
Hi Juzhe,

Sorry, but I'm not very familiar with the group_overlap framework. Could you 
take this pattern as an example to show how to disable an alternative for a 
particular target?

Joshua

--
From: juzhe.zh...@rivai.ai
Sent: Friday, December 22, 2023 18:32
To: "cooper.joshua"; "gcc-patches"
Cc: Jim Wilson; palmer; andrew; "philipp.tomsich"; jeffreyalaw; 
"christoph.muellner"; jinma; "cooper.qu"
Subject: Re: Re: [PATCH v3 0/6] RISC-V: Support XTheadVector extension

Yeah.

(define_insn "@pred_msbc"
  [(set (match_operand: 0 "register_operand""=vr, vr, ")
  (unspec:
 [(minus:VI
   (match_operand:VI 1 "register_operand" "  0, vr,  vr")
   (match_operand:VI 2 "register_operand" " vr,  0,  vr"))
  (match_operand: 3 "register_operand"" vm, vm,  vm")
  (unspec:
[(match_operand 4 "vector_length_operand" " rK, rK,  rK")
 (match_operand 5 "const_int_operand" "  i,  i,   i")
 (reg:SI VL_REGNUM)
 (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMSBC))]
  "TARGET_VECTOR"
  "vmsbc.vvm\t%0,%1,%2,%3"
  [(set_attr "type" "vicalu")
   (set_attr "mode" "")
   (set_attr "vl_op_idx" "4")
   (set (attr "avl_type_idx") (const_int 5))])

You should use an attribute to disable the constraints of alternative 0 and 
alternative 1.


juzhe.zh...@rivai.ai
 
From: joshua
Sent: 2023-12-22 18:29
To: juzhe.zh...@rivai.ai; gcc-patches
Cc: Jim Wilson; palmer; andrew; philipp.tomsich; jeffreyalaw; 
christoph.muellner; jinma; cooper.qu
Subject: Re: Re: [PATCH v3 0/6] RISC-V: Support XTheadVector extension
Hi Juzhe,
What xtheadvector needs to handle is just that the destination vector register 
cannot overlap the source vector register group for instructions like 
vmadc/vmsbc. That is not what group_overlap means. We need to add "&" to the 
registers in the corresponding xtheadvector patterns, whereas RVV 1.0 doesn't 
have this constraint.

(define_insn "@pred_th_msbc"
  [(set (match_operand: 0 "register_operand""=")
(unspec:
[(minus:VI
  (match_operand:VI 1 "register_operand" "  vr")
  (match_operand:VI 2 "register_operand" " vr"))
(match_operand: 3 "register_operand"" vm")
(unspec:
  [(match_operand 4 "vector_length_operand" " rK")
(match_operand 5 "const_int_operand" "  i")
(reg:SI VL_REGNUM)
(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMSBC))]
  "TARGET_XTHEADVECTOR"
  "vmsbc.vvm\t%0,%1,%2,%3"
  [(set_attr "type" "vicalu")
  (set_attr "mode" "")
  (set_attr "vl_op_idx" "4")
  (set (attr "avl_type_idx") (const_int 5))])

Joshua







--
From: juzhe.zh...@rivai.ai
Sent: Friday, December 22, 2023 16:07
To: "cooper.joshua"; "gcc-patches"
Cc: Jim Wilson; palmer; andrew; "philipp.tomsich"; jeffreyalaw; 
"christoph.muellner"; jinma; "cooper.qu"
Subject: Re: Re: [PATCH v3 0/6] RISC-V: Support XTheadVector extension

Do you mean theadvector doesn't want the current RVV 1.0 register overlap 
rules, as follows?
The destination EEW is smaller than the source EEW and the overlap is in the 
lowest-numbered part of the source register group (e.g., when LMUL=1, vnsrl.wi 
v0, v0, 3 is legal, but a destination of v1 is not).
The destination EEW is greater than the source EEW, the source EMUL is at least 
1, and the overlap is in the highest-numbered part of the destination register 
group (e.g., when LMUL=8, vzext.vf4 v0, v6 is legal, but a source of v0, v2, or 
v4 is not).

If so, I suggest disabling the overlap constraint using an attribute. You can 
learn more details from 

(set_attr "group_overlap"


juzhe.zh...@rivai.ai
 
From: joshua
Sent: 2023-12-22 11:33
To: 钟居哲; gcc-patches
Cc: jim.wilson.gcc; palmer; andrew; philipp.tomsich; Jeff Law; Christoph 
Müllner; jinma; Cooper Qu
Subject: Re: [PATCH v3 0/6] RISC-V: Support XTheadVector extension
Hi Juzhe,

Thank you for your comprehensive comments.

Classifying the theadvector intrinsics into 3 kinds is really important for 
making our patchset more organized. 

For 1) and 3), I will split out the patches soon and hope they will be merged 
quickly.
For 2), according to the differences between vector and xtheadvector, it can be 
classified into 3 kinds.

The first is renamed load/store, renamed narrowing integer right shift, renamed 
narrowing fixed-point clip, etc. I think we can use the ASM target hook to 
rewrite the whole string of the instructions, although it will still be a lot 
of work.
The second is missing pseudo instructions like vneg/vfneg. We will add these pseudo 
instructions in binutils to make xtheadvector more compatible with 

Re: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-23 Thread YunQiang Su
Jeff Law wrote on Sunday, 2023-12-24 at 00:51:
>
>
>
> On 12/23/23 01:58, YunQiang Su wrote:
> > On TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)) == true platforms,
> > if 31 or above bits is polluted by an bitops, we will need an
> > truncate. Let's emit one, and mark let's use the same hardreg
> > as in and out, the RTL may like:
> >
> > (insn 21 20 24 2 (set (subreg/s/u:SI (reg/v:DI 200 [ val ]) 0)
> >  (truncate:SI (reg/v:DI 200 [ val ]))) "../xx.c":7:29 -1
> >   (nil))
> >
> > We use /s/u flags to mark it as really needed, as in
> > combine_simplify_rtx, this insn may be considered as truncated,
> > so let's skip this combination.
> >
> > gcc/ChangeLog:
> >  PR: 104914.
> >  * combine.cc (try_combine): Skip combine with truncate if
> >   dest is subreg and has /u/s flags on platforms
> >   TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)) == true.
> >   * expr.cc (expand_assignment): Emit a truncate insn, if
> >   31+ bits is polluted for SImode.
> >
> > gcc/testsuite/ChangeLog:
> >   PR: 104914.
> >   * gcc.target/mips/pr104914.c: New testcase.
> I would suggest you show the RTL before/after whatever transformation
> has caused problems on your target and explain why you think the
> transformation is incorrect.
>

Before this patch, the RTL looks like this:
 (insn 19 18 20 2 (set (zero_extract:DI (reg/v:DI 200 [ val ])
   (const_int 8 [0x8])
   (const_int 24 [0x18]))
   (subreg:DI (reg:QI 205) 0)) "../xx.c":7:29 -1
(nil))
  (insn 20 19 23 2 (set (reg/v:DI 200 [ val ])
   (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0)))
"../xx.c":7:29 -1
   (nil))
 (jump_insn 23 20 24 2 (set (pc)
   (if_then_else (lt (subreg/s/u:SI (reg/v:DI 200 [ val ]) 0)
   (const_int 0 [0]))
(label_ref 32)
   (pc))) "../xx.c":10:5 -1
   (int_list:REG_BR_PROB 440234148 (nil))
  -> 32)

Then, during combine,
  (insn 20 19 23 2 (set (reg/v:DI 200 [ val ])
 (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val ]) 0)))
"../xx.c":7:29 -1
  (nil))
is converted to
  (note 20 19 23 2 NOTE_INSN_DELETED)
MIPS claims TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) == true
on the basis that the hard register is always sign-extended, but here
the hard register is polluted by zero_extract.
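To make the invariant concrete, here is a small standalone model of the situation. It is only a sketch: the helper names are hypothetical, the register is modelled as a plain 64-bit integer, and bit positions are assumed to be counted from the least significant bit (so the 8-bit field at position 24 covers bits 24..31).

```cpp
#include <cstdint>

// Model of the DImode zero_extract store: overwrite bits 24..31 of the
// 64-bit register image with `byte`, leaving all other bits untouched.
constexpr std::uint64_t insert_byte_at_24(std::uint64_t reg, std::uint8_t byte) {
  return (reg & ~(std::uint64_t{0xff} << 24)) | (std::uint64_t{byte} << 24);
}

// True if bits 63..31 are all equal, i.e. the register holds a properly
// sign-extended SImode value.
constexpr bool is_sign_extended_si(std::uint64_t reg) {
  return (reg >> 31) == 0 || (reg >> 31) == 0x1ffffffffULL;
}

// Model of what `sll $rN, $rN, 0` does on a 64-bit MIPS register:
// keep the low 32 bits and replicate bit 31 into bits 63..32.
constexpr std::uint64_t sign_extend_si(std::uint64_t reg) {
  return ((reg & 0xffffffffULL) ^ 0x80000000ULL) - 0x80000000ULL;
}
```

Starting from a register holding 1, inserting the byte 0x80 yields 0x0000000080000001: bit 31 is set but bits 63..32 are still zero, so the image is no longer a valid SImode value until something like sign_extend_si is applied.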

If we just patch combine.cc so that it does not eat the sign_extend,
the sign_extend will still disappear in later passes, because MIPS defines
sign_extend as "emit_note (NOTE_INSN_DELETED)".

So I tried to insert a new truncate RTX here,
(insn 21 20 24 2 (set (reg/v:DI 200 [ val ])
 (truncate:SI (reg/v:DI 200 [ val ]))) "../xx.c":7:29 -1
 (nil))
This is the RTL for this C code
 int32_t fun (int64_t arg) {
  int32_t a = (int32_t) arg;
  return a;
 }
But then the `reload` pass hits an ICE. I haven't dug into the real problem.
If the new RTX is
(insn 21 20 24 2 (set (subreg/s/u:SI (reg/v:DI 200 [ val ]) 0)
   (truncate:SI (reg/v:DI 200 [ val ]))) "../xx.c":7:29 -1
   (nil))
the `reload` pass will happily accept it, and it is then converted to the
hard instruction
 # this instruction makes sure the register is properly sign-extended.
 `sll $rN, $rN, 0`

The problem is that simple-rtx (called by combine) will believe that
REG 200 has been truncated to SImode, as the dest has a
subreg:SI.

So I use the /s/u flags to tell combine not to do so.

> Focus on the RTL semantics as well as the target specific semantics
> because both are critically important here.
>
> I strongly suspect you're just papering over a problem elsewhere.
>

Yes, I suspect so too.  Any new ideas?
In the previous threads, you suggested that we could just insert a
truncate instruction right before the comparison.
That still has some problems:
 1. There may be no comparison right after the zero_extract, but
 instead some normal calculation, such as add/sub.
 Then the calculation will get a malformed register, which the
 ISA document says is UNPREDICTABLE.
 2. Inserting an RTX before every comparison will cause a performance
 regression, since in most cases it is not needed.
 3. Inserting an RTX before the comparison still needs a dirty hack
 like this one.

>
> > ---
> >   gcc/combine.cc   | 23 +-
> >   gcc/expr.cc  | 17 
> >   gcc/testsuite/gcc.target/mips/pr104914.c | 25 
> >   3 files changed, 64 insertions(+), 1 deletion(-)
> >   create mode 100644 gcc/testsuite/gcc.target/mips/pr104914.c
> >
> > diff --git a/gcc/combine.cc b/gcc/combine.cc
> > index 1cda4dd57f2..04b9c414053 100644
> > --- a/gcc/combine.cc
> > +++ b/gcc/combine.cc
> > @@ -3294,6 +3294,28 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn 
> > *i1, rtx_insn *i0,
> > n_occurrences = 0;/* `subst' counts here */
> > 

[PATCH v2 8/8] libstdc++: Optimize std::is_unbounded_array compilation performance

2023-12-23 Thread Ken Matsui
This patch optimizes the compilation performance of
std::is_unbounded_array by dispatching to the new
__is_unbounded_array built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_unbounded_array_v): Use
__is_unbounded_array built-in trait.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index d53911b2fa0..a548982236d 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -3670,11 +3670,16 @@ template
   /// True for a type that is an array of unknown bound.
   /// @ingroup variable_templates
   /// @since C++20
+# if _GLIBCXX_USE_BUILTIN_TRAIT(__is_unbounded_array)
+  template
+inline constexpr bool is_unbounded_array_v = __is_unbounded_array(_Tp);
+# else
   template
 inline constexpr bool is_unbounded_array_v = false;
 
   template
 inline constexpr bool is_unbounded_array_v<_Tp[]> = true;
+# endif
 
   /// True for a type that is an array of known bound.
   /// @since C++20
-- 
2.43.0
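For reference, the fallback branch of this patch relies on a partial specialization for arrays of unknown bound. A standalone sketch of that technique, using the hypothetical name my_is_unbounded_array_v and plain C++14 variable templates rather than the library's own declarations, might look like this:

```cpp
// Primary template: most types are not arrays of unknown bound.
template<typename T>
  constexpr bool my_is_unbounded_array_v = false;

// Partial specialization: matches T[] only, never T[N].
template<typename T>
  constexpr bool my_is_unbounded_array_v<T[]> = true;
```

Only the outermost bound matters, so int[][3] is reported as unbounded while int[2][3] is not, matching the expectations in the is_unbounded_array.C test of patch 7/8.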



[PATCH v2 7/8] c++: Implement __is_unbounded_array built-in trait

2023-12-23 Thread Ken Matsui
This patch implements a built-in trait for std::is_unbounded_array.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_unbounded_array.
* constraint.cc (diagnose_trait_expr): Handle
CPTK_IS_UNBOUNDED_ARRAY.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of
__is_unbounded_array.
* g++.dg/ext/is_unbounded_array.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc  |  3 ++
 gcc/cp/cp-trait.def   |  1 +
 gcc/cp/semantics.cc   |  4 ++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C  |  3 ++
 gcc/testsuite/g++.dg/ext/is_unbounded_array.C | 37 +++
 5 files changed, 48 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_unbounded_array.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 44bf8dc5c5b..defb2ac0ee8 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3828,6 +3828,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_TRIVIALLY_COPYABLE:
   inform (loc, "  %qT is not trivially copyable", t1);
   break;
+case CPTK_IS_UNBOUNDED_ARRAY:
+  inform (loc, "  %qT is not an unbounded array", t1);
+  break;
 case CPTK_IS_UNION:
   inform (loc, "  %qT is not a union", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 18e2d0f3480..05514a51c21 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -92,6 +92,7 @@ DEFTRAIT_EXPR (IS_TRIVIAL, "__is_trivial", 1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, "__is_trivially_assignable", 2)
 DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
+DEFTRAIT_EXPR (IS_UNBOUNDED_ARRAY, "__is_unbounded_array", 1)
 DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
 DEFTRAIT_EXPR (IS_VOLATILE, "__is_volatile", 1)
 DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
"__reference_constructs_from_temporary", 2)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 01a7ccc5225..176eda629ff 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12501,6 +12501,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_TRIVIALLY_COPYABLE:
   return trivially_copyable_p (type1);
 
+case CPTK_IS_UNBOUNDED_ARRAY:
+  return array_of_unknown_bound_p (type1);
+
 case CPTK_IS_UNION:
   return type_code1 == UNION_TYPE;
 
@@ -12677,6 +12680,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_REFERENCE:
 case CPTK_IS_SAME:
 case CPTK_IS_SCOPED_ENUM:
+case CPTK_IS_UNBOUNDED_ARRAY:
 case CPTK_IS_UNION:
 case CPTK_IS_VOLATILE:
   break;
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 96b7a89e4f1..b1430e9bd8b 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -158,6 +158,9 @@
 #if !__has_builtin (__is_trivially_copyable)
 # error "__has_builtin (__is_trivially_copyable) failed"
 #endif
+#if !__has_builtin (__is_unbounded_array)
+# error "__has_builtin (__is_unbounded_array) failed"
+#endif
 #if !__has_builtin (__is_union)
 # error "__has_builtin (__is_union) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_unbounded_array.C 
b/gcc/testsuite/g++.dg/ext/is_unbounded_array.C
new file mode 100644
index 000..283a74e1a0a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_unbounded_array.C
@@ -0,0 +1,37 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)  \
+  SA(TRAIT(TYPE) == EXPECT);   \
+  SA(TRAIT(const TYPE) == EXPECT); \
+  SA(TRAIT(volatile TYPE) == EXPECT);  \
+  SA(TRAIT(const volatile TYPE) == EXPECT)
+
+class ClassType { };
+class IncompleteClass;
+union IncompleteUnion;
+
+SA_TEST_CATEGORY(__is_unbounded_array, int[2], false);
+SA_TEST_CATEGORY(__is_unbounded_array, int[], true);
+SA_TEST_CATEGORY(__is_unbounded_array, int[2][3], false);
+SA_TEST_CATEGORY(__is_unbounded_array, int[][3], true);
+SA_TEST_CATEGORY(__is_unbounded_array, float*[2], false);
+SA_TEST_CATEGORY(__is_unbounded_array, float*[], true);
+SA_TEST_CATEGORY(__is_unbounded_array, float*[2][3], false);
+SA_TEST_CATEGORY(__is_unbounded_array, float*[][3], true);
+SA_TEST_CATEGORY(__is_unbounded_array, ClassType[2], false);
+SA_TEST_CATEGORY(__is_unbounded_array, ClassType[], true);
+SA_TEST_CATEGORY(__is_unbounded_array, ClassType[2][3], false);
+SA_TEST_CATEGORY(__is_unbounded_array, ClassType[][3], true);
+SA_TEST_CATEGORY(__is_unbounded_array, IncompleteClass[2][3], false);
+SA_TEST_CATEGORY(__is_unbounded_array, IncompleteClass[][3], true);
+SA_TEST_CATEGORY(__is_unbounded_array, 

[PATCH v2 5/8] c++: Implement __is_pointer built-in trait

2023-12-23 Thread Ken Matsui
This patch implements a built-in trait for std::is_pointer.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_pointer.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_POINTER.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_pointer.
* g++.dg/ext/is_pointer.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 ++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 ++
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 ++
 gcc/testsuite/g++.dg/ext/is_pointer.C| 51 
 5 files changed, 62 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_pointer.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index d9ba62d443a..44bf8dc5c5b 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3795,6 +3795,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_POD:
   inform (loc, "  %qT is not a POD type", t1);
   break;
+case CPTK_IS_POINTER:
+  inform (loc, "  %qT is not a pointer", t1);
+  break;
 case CPTK_IS_POLYMORPHIC:
   inform (loc, "  %qT is not a polymorphic type", t1);
   break;
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index e9347453829..18e2d0f3480 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -82,6 +82,7 @@ DEFTRAIT_EXPR (IS_NOTHROW_CONVERTIBLE, 
"__is_nothrow_convertible", 2)
 DEFTRAIT_EXPR (IS_OBJECT, "__is_object", 1)
 DEFTRAIT_EXPR (IS_POINTER_INTERCONVERTIBLE_BASE_OF, 
"__is_pointer_interconvertible_base_of", 2)
 DEFTRAIT_EXPR (IS_POD, "__is_pod", 1)
+DEFTRAIT_EXPR (IS_POINTER, "__is_pointer", 1)
 DEFTRAIT_EXPR (IS_POLYMORPHIC, "__is_polymorphic", 1)
 DEFTRAIT_EXPR (IS_REFERENCE, "__is_reference", 1)
 DEFTRAIT_EXPR (IS_SAME, "__is_same", 2)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index c7971c969ae..01a7ccc5225 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12471,6 +12471,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_POD:
   return pod_type_p (type1);
 
+case CPTK_IS_POINTER:
+  return TYPE_PTR_P (type1);
+
 case CPTK_IS_POLYMORPHIC:
   return CLASS_TYPE_P (type1) && TYPE_POLYMORPHIC_P (type1);
 
@@ -12670,6 +12673,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_MEMBER_OBJECT_POINTER:
 case CPTK_IS_MEMBER_POINTER:
 case CPTK_IS_OBJECT:
+case CPTK_IS_POINTER:
 case CPTK_IS_REFERENCE:
 case CPTK_IS_SAME:
 case CPTK_IS_SCOPED_ENUM:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index b2e2f2f694d..96b7a89e4f1 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -125,6 +125,9 @@
 #if !__has_builtin (__is_pod)
 # error "__has_builtin (__is_pod) failed"
 #endif
+#if !__has_builtin (__is_pointer)
+# error "__has_builtin (__is_pointer) failed"
+#endif
 #if !__has_builtin (__is_polymorphic)
 # error "__has_builtin (__is_polymorphic) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_pointer.C 
b/gcc/testsuite/g++.dg/ext/is_pointer.C
new file mode 100644
index 000..d6e39565950
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_pointer.C
@@ -0,0 +1,51 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+SA(!__is_pointer(int));
+SA(__is_pointer(int*));
+SA(__is_pointer(int**));
+
+SA(__is_pointer(const int*));
+SA(__is_pointer(const int**));
+SA(__is_pointer(int* const));
+SA(__is_pointer(int** const));
+SA(__is_pointer(int* const* const));
+
+SA(__is_pointer(volatile int*));
+SA(__is_pointer(volatile int**));
+SA(__is_pointer(int* volatile));
+SA(__is_pointer(int** volatile));
+SA(__is_pointer(int* volatile* volatile));
+
+SA(__is_pointer(const volatile int*));
+SA(__is_pointer(const volatile int**));
+SA(__is_pointer(const int* volatile));
+SA(__is_pointer(volatile int* const));
+SA(__is_pointer(int* const volatile));
+SA(__is_pointer(const int** volatile));
+SA(__is_pointer(volatile int** const));
+SA(__is_pointer(int** const volatile));
+SA(__is_pointer(int* const* const volatile));
+SA(__is_pointer(int* volatile* const volatile));
+SA(__is_pointer(int* const volatile* const volatile));
+
+SA(!__is_pointer(int&));
+SA(!__is_pointer(const int&));
+SA(!__is_pointer(volatile int&));
+SA(!__is_pointer(const volatile int&));
+
+SA(!__is_pointer(int&&));
+SA(!__is_pointer(const int&&));
+SA(!__is_pointer(volatile int&&));
+SA(!__is_pointer(const volatile int&&));
+
+SA(!__is_pointer(int[3]));
+SA(!__is_pointer(const int[3]));
+SA(!__is_pointer(volatile int[3]));
+SA(!__is_pointer(const volatile int[3]));
+
+SA(!__is_pointer(int(int)));
+SA(__is_pointer(int(*const)(int)));
+SA(__is_pointer(int(*volatile)(int)));
+SA(__is_pointer(int(*const 

[PATCH v2 6/8] libstdc++: Optimize std::is_pointer compilation performance

2023-12-23 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_pointer
by dispatching to the new __is_pointer built-in trait.

libstdc++-v3/ChangeLog:

* include/bits/cpp_type_traits.h (__is_pointer): Use
__is_pointer built-in trait.  Optimize its implementation.
* include/std/type_traits (is_pointer): Likewise.
(is_pointer_v): Likewise.

Co-authored-by: Jonathan Wakely 
Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/bits/cpp_type_traits.h | 29 ++
 libstdc++-v3/include/std/type_traits| 44 +
 2 files changed, 65 insertions(+), 8 deletions(-)

diff --git a/libstdc++-v3/include/bits/cpp_type_traits.h 
b/libstdc++-v3/include/bits/cpp_type_traits.h
index 4312f32a4e0..c348df97f72 100644
--- a/libstdc++-v3/include/bits/cpp_type_traits.h
+++ b/libstdc++-v3/include/bits/cpp_type_traits.h
@@ -363,6 +363,13 @@ __INT_N(__GLIBCXX_TYPE_INT_N_3)
   //
   // Pointer types
   //
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_pointer)
+  template
+struct __is_pointer : __truth_type<_IsPtr>
+{
+  enum { __value = _IsPtr };
+};
+#else
   template
 struct __is_pointer
 {
@@ -377,6 +384,28 @@ __INT_N(__GLIBCXX_TYPE_INT_N_3)
   typedef __true_type __type;
 };
 
+  template
+struct __is_pointer<_Tp* const>
+{
+  enum { __value = 1 };
+  typedef __true_type __type;
+};
+
+  template
+struct __is_pointer<_Tp* volatile>
+{
+  enum { __value = 1 };
+  typedef __true_type __type;
+};
+#endif
+
+  template
+struct __is_pointer<_Tp* const volatile>
+{
+  enum { __value = 1 };
+  typedef __true_type __type;
+};
+
   //
   // An arithmetic type is an integer type or a floating point type
   //
diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 30b0778e58a..d53911b2fa0 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -542,19 +542,33 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : public true_type { };
 #endif
 
-  template
-struct __is_pointer_helper
+  /// is_pointer
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_pointer)
+  template
+struct is_pointer
+: public __bool_constant<__is_pointer(_Tp)>
+{ };
+#else
+  template
+struct is_pointer
 : public false_type { };
 
   template
-struct __is_pointer_helper<_Tp*>
+struct is_pointer<_Tp*>
 : public true_type { };
 
-  /// is_pointer
   template
-struct is_pointer
-: public __is_pointer_helper<__remove_cv_t<_Tp>>::type
-{ };
+struct is_pointer<_Tp* const>
+: public true_type { };
+
+  template
+struct is_pointer<_Tp* volatile>
+: public true_type { };
+
+  template
+struct is_pointer<_Tp* const volatile>
+: public true_type { };
+#endif
 
   /// is_lvalue_reference
   template
@@ -3252,8 +3266,22 @@ template 
   inline constexpr bool is_array_v<_Tp[_Num]> = true;
 #endif
 
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_pointer)
+template 
+  inline constexpr bool is_pointer_v = __is_pointer(_Tp);
+#else
 template 
-  inline constexpr bool is_pointer_v = is_pointer<_Tp>::value;
+  inline constexpr bool is_pointer_v = false;
+template 
+  inline constexpr bool is_pointer_v<_Tp*> = true;
+template 
+  inline constexpr bool is_pointer_v<_Tp* const> = true;
+template 
+  inline constexpr bool is_pointer_v<_Tp* volatile> = true;
+template 
+  inline constexpr bool is_pointer_v<_Tp* const volatile> = true;
+#endif
+
 template 
   inline constexpr bool is_lvalue_reference_v = false;
 template 
-- 
2.43.0
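The non-built-in branch in this patch replaces the remove_cv-based helper with one partial specialization per cv-qualification of the pointer itself. A self-contained sketch of that approach, under the hypothetical name my_is_pointer (this is not the libstdc++ code), could be:

```cpp
#include <type_traits>

// Primary template: not a pointer.
template<typename T>
  struct my_is_pointer : std::false_type { };

// One specialization per top-level cv-qualification of the pointer.
template<typename T>
  struct my_is_pointer<T*> : std::true_type { };

template<typename T>
  struct my_is_pointer<T* const> : std::true_type { };

template<typename T>
  struct my_is_pointer<T* volatile> : std::true_type { };

template<typename T>
  struct my_is_pointer<T* const volatile> : std::true_type { };
```

Compared with wrapping the argument in remove_cv first, this form avoids instantiating an extra helper template for each query, at the price of four specializations instead of one.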



[PATCH v2 4/8] libstdc++: Optimize std::is_volatile compilation performance

2023-12-23 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_volatile
by dispatching to the new __is_volatile built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_volatile): Use __is_volatile
built-in trait.
(is_volatile_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 12 
 1 file changed, 12 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index f40831de838..30b0778e58a 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -851,6 +851,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #endif
 
   /// is_volatile
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_volatile)
+  template
+struct is_volatile
+: public __bool_constant<__is_volatile(_Tp)>
+{ };
+#else
   template
 struct is_volatile
 : public false_type { };
@@ -858,6 +864,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 struct is_volatile<_Tp volatile>
 : public true_type { };
+#endif
 
   /// is_trivial
   template
@@ -3344,10 +3351,15 @@ template 
   inline constexpr bool is_function_v<_Tp&&> = false;
 #endif
 
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_volatile)
+template 
+  inline constexpr bool is_volatile_v = __is_volatile(_Tp);
+#else
 template 
   inline constexpr bool is_volatile_v = false;
 template 
   inline constexpr bool is_volatile_v = true;
+#endif
 
 template 
   inline constexpr bool is_trivial_v = __is_trivial(_Tp);
-- 
2.43.0



[PATCH v2 3/8] c++: Implement __is_volatile built-in trait

2023-12-23 Thread Ken Matsui
This patch implements a built-in trait for std::is_volatile.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_volatile.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_VOLATILE.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_volatile.
* g++.dg/ext/is_volatile.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 +++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
 gcc/testsuite/g++.dg/ext/is_volatile.C   | 20 
 5 files changed, 31 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_volatile.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index f1b07aa2853..d9ba62d443a 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3828,6 +3828,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_UNION:
   inform (loc, "  %qT is not a union", t1);
   break;
+case CPTK_IS_VOLATILE:
+  inform (loc, "  %qT is not a volatile type", t1);
+  break;
 case CPTK_REF_CONSTRUCTS_FROM_TEMPORARY:
   inform (loc, "  %qT is not a reference that binds to a temporary "
  "object of type %qT (direct-initialization)", t1, t2);
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 36faed9c0b3..e9347453829 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -92,6 +92,7 @@ DEFTRAIT_EXPR (IS_TRIVIALLY_ASSIGNABLE, 
"__is_trivially_assignable", 2)
 DEFTRAIT_EXPR (IS_TRIVIALLY_CONSTRUCTIBLE, "__is_trivially_constructible", -1)
 DEFTRAIT_EXPR (IS_TRIVIALLY_COPYABLE, "__is_trivially_copyable", 1)
 DEFTRAIT_EXPR (IS_UNION, "__is_union", 1)
+DEFTRAIT_EXPR (IS_VOLATILE, "__is_volatile", 1)
 DEFTRAIT_EXPR (REF_CONSTRUCTS_FROM_TEMPORARY, 
"__reference_constructs_from_temporary", 2)
 DEFTRAIT_EXPR (REF_CONVERTS_FROM_TEMPORARY, 
"__reference_converts_from_temporary", 2)
 DEFTRAIT_TYPE (REMOVE_CV, "__remove_cv", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 364d87ee34d..c7971c969ae 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12501,6 +12501,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_UNION:
   return type_code1 == UNION_TYPE;
 
+case CPTK_IS_VOLATILE:
+  return CP_TYPE_VOLATILE_P (type1);
+
 case CPTK_REF_CONSTRUCTS_FROM_TEMPORARY:
   return ref_xes_from_temporary (type1, type2, /*direct_init=*/true);
 
@@ -12671,6 +12674,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_SAME:
 case CPTK_IS_SCOPED_ENUM:
 case CPTK_IS_UNION:
+case CPTK_IS_VOLATILE:
   break;
 
 case CPTK_IS_LAYOUT_COMPATIBLE:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index e3640faeb96..b2e2f2f694d 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -158,6 +158,9 @@
 #if !__has_builtin (__is_union)
 # error "__has_builtin (__is_union) failed"
 #endif
+#if !__has_builtin (__is_volatile)
+# error "__has_builtin (__is_volatile) failed"
+#endif
 #if !__has_builtin (__reference_constructs_from_temporary)
 # error "__has_builtin (__reference_constructs_from_temporary) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_volatile.C 
b/gcc/testsuite/g++.dg/ext/is_volatile.C
new file mode 100644
index 000..80a1cfc880d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_volatile.C
@@ -0,0 +1,20 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+class ClassType { };
+using cClassType = const ClassType;
+using vClassType = volatile ClassType;
+using cvClassType = const volatile ClassType;
+
+// Positive tests.
+SA(__is_volatile(volatile int));
+SA(__is_volatile(const volatile int));
+SA(__is_volatile(vClassType));
+SA(__is_volatile(cvClassType));
+
+// Negative tests.
+SA(!__is_volatile(int));
+SA(!__is_volatile(const int));
+SA(!__is_volatile(ClassType));
+SA(!__is_volatile(cClassType));
-- 
2.43.0
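
For readers of the archive, the test above relies on the trait reporting only top-level volatile qualification. A minimal sketch of that behaviour follows. It uses the standard std::is_volatile rather than the new built-in so that it compiles on compilers without this patch, and the ClassType definition is only an illustrative stand-in for the one in the test.

```cpp
#include <type_traits>

// Illustrative stand-in for the ClassType used in the test case.
class ClassType { };

// Top-level volatile qualification is what the trait reports.
static_assert(std::is_volatile<volatile int>::value, "volatile int");
static_assert(std::is_volatile<const volatile ClassType>::value, "cv class");
static_assert(std::is_volatile<int* volatile>::value, "volatile pointer");

// Qualifiers on a pointee or behind a reference are not top-level.
static_assert(!std::is_volatile<volatile int*>::value, "pointer to volatile");
static_assert(!std::is_volatile<volatile int&>::value, "reference to volatile");
static_assert(!std::is_volatile<int>::value, "plain int");
```

With the patch applied, __is_volatile would be expected to give the same answers for these types, since std::is_volatile is meant to dispatch to it.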



[PATCH v2 2/8] libstdc++: Optimize std::is_const compilation performance

2023-12-23 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_const
by dispatching to the new __is_const built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_const): Use __is_const built-in
trait.
(is_const_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 12 
 1 file changed, 12 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index f00c07f94f9..f40831de838 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -835,6 +835,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Type properties.
 
   /// is_const
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_const)
+  template<typename _Tp>
+    struct is_const
+    : public __bool_constant<__is_const(_Tp)>
+    { };
+#else
   template<typename _Tp>
     struct is_const
     : public false_type { };
@@ -842,6 +848,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
     struct is_const<_Tp const>
     : public true_type { };
+#endif
 
   /// is_volatile
   template<typename _Tp>
@@ -3315,10 +3322,15 @@ template <typename _Tp>
   inline constexpr bool is_member_pointer_v = is_member_pointer<_Tp>::value;
 #endif
 
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_const)
+template <typename _Tp>
+  inline constexpr bool is_const_v = __is_const(_Tp);
+#else
 template <typename _Tp>
   inline constexpr bool is_const_v = false;
 template <typename _Tp>
   inline constexpr bool is_const_v<const _Tp> = true;
+#endif
 
 #if _GLIBCXX_USE_BUILTIN_TRAIT(__is_function)
 template <typename _Tp>
-- 
2.43.0
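
The dispatch pattern in the patch above can be sketched outside libstdc++ roughly as follows. This is only an illustration: the sketch namespace and the SKETCH_USE_BUILTIN_IS_CONST macro are hypothetical stand-ins (the real header uses _GLIBCXX_USE_BUILTIN_TRAIT), and the fallback branch is the usual partial-specialization approach.

```cpp
#include <type_traits>

// Hypothetical feature test; __has_builtin itself may be missing on
// older compilers, so guard its use.
#if defined(__has_builtin)
# if __has_builtin(__is_const)
#  define SKETCH_USE_BUILTIN_IS_CONST 1
# endif
#endif

namespace sketch {
#ifdef SKETCH_USE_BUILTIN_IS_CONST
  // One primary template, answered directly by the compiler.
  template<typename T>
    struct is_const : std::integral_constant<bool, __is_const(T)> { };
#else
  // Fallback: primary template plus a partial specialization.
  template<typename T>
    struct is_const : std::false_type { };
  template<typename T>
    struct is_const<T const> : std::true_type { };
#endif
}

// Both branches are expected to agree on these cases.
static_assert(sketch::is_const<const int>::value, "const int");
static_assert(sketch::is_const<const volatile int>::value, "cv int");
static_assert(!sketch::is_const<int>::value, "plain int");
static_assert(!sketch::is_const<const int*>::value, "pointer to const");
```

The built-in branch avoids instantiating a partial specialization for each queried type, which is presumably where the compile-time savings reported in the cover letter come from.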



[PATCH v2 1/8] c++: Implement __is_const built-in trait

2023-12-23 Thread Ken Matsui
This patch implements a built-in trait for std::is_const.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_const.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONST.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_const.
* g++.dg/ext/is_const.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 +++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
 gcc/testsuite/g++.dg/ext/is_const.C  | 20 
 5 files changed, 31 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_const.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index eeacead52a5..f1b07aa2853 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3734,6 +3734,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_CLASS:
   inform (loc, "  %qT is not a class", t1);
   break;
+case CPTK_IS_CONST:
+  inform (loc, "  %qT is not a const type", t1);
+  break;
 case CPTK_IS_CONSTRUCTIBLE:
   if (!t2)
 inform (loc, "  %qT is not default constructible", t1);
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 394f006f20f..36faed9c0b3 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -64,6 +64,7 @@ DEFTRAIT_EXPR (IS_ASSIGNABLE, "__is_assignable", 2)
 DEFTRAIT_EXPR (IS_BASE_OF, "__is_base_of", 2)
 DEFTRAIT_EXPR (IS_BOUNDED_ARRAY, "__is_bounded_array", 1)
 DEFTRAIT_EXPR (IS_CLASS, "__is_class", 1)
+DEFTRAIT_EXPR (IS_CONST, "__is_const", 1)
 DEFTRAIT_EXPR (IS_CONSTRUCTIBLE, "__is_constructible", -1)
 DEFTRAIT_EXPR (IS_CONVERTIBLE, "__is_convertible", 2)
 DEFTRAIT_EXPR (IS_EMPTY, "__is_empty", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index e6dba29ee81..364d87ee34d 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12415,6 +12415,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_CLASS:
   return NON_UNION_CLASS_TYPE_P (type1);
 
+case CPTK_IS_CONST:
+  return CP_TYPE_CONST_P (type1);
+
 case CPTK_IS_CONSTRUCTIBLE:
   return is_xible (INIT_EXPR, type1, type2);
 
@@ -12657,6 +12660,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_ARRAY:
 case CPTK_IS_BOUNDED_ARRAY:
 case CPTK_IS_CLASS:
+case CPTK_IS_CONST:
 case CPTK_IS_ENUM:
 case CPTK_IS_FUNCTION:
 case CPTK_IS_MEMBER_FUNCTION_POINTER:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 02b4b4d745d..e3640faeb96 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -71,6 +71,9 @@
 #if !__has_builtin (__is_class)
 # error "__has_builtin (__is_class) failed"
 #endif
+#if !__has_builtin (__is_const)
+# error "__has_builtin (__is_const) failed"
+#endif
 #if !__has_builtin (__is_constructible)
 # error "__has_builtin (__is_constructible) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_const.C 
b/gcc/testsuite/g++.dg/ext/is_const.C
new file mode 100644
index 000..8a0e8df72a9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_const.C
@@ -0,0 +1,20 @@
+// { dg-do compile { target c++11 } }
+
+#define SA(X) static_assert((X),#X)
+
+class ClassType { };
+using cClassType = const ClassType;
+using vClassType = volatile ClassType;
+using cvClassType = const volatile ClassType;
+
+// Positive tests.
+SA(__is_const(const int));
+SA(__is_const(const volatile int));
+SA(__is_const(cClassType));
+SA(__is_const(cvClassType));
+
+// Negative tests.
+SA(!__is_const(int));
+SA(!__is_const(volatile int));
+SA(!__is_const(ClassType));
+SA(!__is_const(vClassType));
-- 
2.43.0



[PATCH v2 0/8] Optimize more type traits

2023-12-23 Thread Ken Matsui
This patch series implements the __is_const, __is_volatile, __is_pointer,
and __is_unbounded_array built-in traits, which were split out of my
previous patch series "Optimize type traits compilation performance"
because they showed performance regressions there.  I have confirmed that
this patch series does not cause any performance regression.  The main
causes of the earlier regressions were the exhaustiveness of the
benchmarks and the instability of the benchmark results.  Here are the
new benchmark results:

is_const: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_const.md#sat-dec-23-090605-am-pst-2023

time: -4.36603%, peak memory: -0.300891%, total memory: -0.247934%

is_volatile_v: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_volatile_v.md#sat-dec-23-091518-am-pst-2023

time: -4.06816%, peak memory: -0.609298%, total memory: -0.659134%

is_pointer: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_pointer.md#sat-dec-23-124903-pm-pst-2023

time: -2.47124%, peak memory: -2.98207%, total memory: -4.0811%

is_unbounded_array_v: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_unbounded_array_v.md#sat-dec-23-010046-pm-pst-2023

time: -1.50025%, peak memory: -1.07386%, total memory: -2.32394%

Changes in v2:

- Removed testsuite_tr1.h includes from the testcases.

Ken Matsui (8):
  c++: Implement __is_const built-in trait
  libstdc++: Optimize std::is_const compilation performance
  c++: Implement __is_volatile built-in trait
  libstdc++: Optimize std::is_volatile compilation performance
  c++: Implement __is_pointer built-in trait
  libstdc++: Optimize std::is_pointer compilation performance
  c++: Implement __is_unbounded_array built-in trait
  libstdc++: Optimize std::is_unbounded_array compilation performance

 gcc/cp/constraint.cc  | 12 +++
 gcc/cp/cp-trait.def   |  4 +
 gcc/cp/semantics.cc   | 16 
 gcc/testsuite/g++.dg/ext/has-builtin-1.C  | 12 +++
 gcc/testsuite/g++.dg/ext/is_const.C   | 20 +
 gcc/testsuite/g++.dg/ext/is_pointer.C | 51 +
 gcc/testsuite/g++.dg/ext/is_unbounded_array.C | 37 ++
 gcc/testsuite/g++.dg/ext/is_volatile.C| 20 +
 libstdc++-v3/include/bits/cpp_type_traits.h   | 29 
 libstdc++-v3/include/std/type_traits  | 73 +--
 10 files changed, 266 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_const.C
 create mode 100644 gcc/testsuite/g++.dg/ext/is_pointer.C
 create mode 100644 gcc/testsuite/g++.dg/ext/is_unbounded_array.C
 create mode 100644 gcc/testsuite/g++.dg/ext/is_volatile.C

-- 
2.43.0



Re: [PATCH 1/8] c++: Implement __is_const built-in trait

2023-12-23 Thread Ken Matsui
On Sat, Dec 23, 2023 at 1:36 PM Ken Matsui  wrote:
>
> This patch implements a built-in trait for std::is_const.
>
> gcc/cp/ChangeLog:
>
> * cp-trait.def: Define __is_const.
> * constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONST.
> * semantics.cc (trait_expr_value): Likewise.
> (finish_trait_expr): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/ext/has-builtin-1.C: Test existence of __is_const.
> * g++.dg/ext/is_const.C: New test.
>
> Signed-off-by: Ken Matsui 
> ---
>  gcc/cp/constraint.cc |  3 +++
>  gcc/cp/cp-trait.def  |  1 +
>  gcc/cp/semantics.cc  |  4 
>  gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
>  gcc/testsuite/g++.dg/ext/is_const.C  | 19 +++
>  5 files changed, 30 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/ext/is_const.C
>
> diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> index eeacead52a5..f1b07aa2853 100644
> --- a/gcc/cp/constraint.cc
> +++ b/gcc/cp/constraint.cc
> @@ -3734,6 +3734,9 @@ diagnose_trait_expr (tree expr, tree args)
>  case CPTK_IS_CLASS:
>inform (loc, "  %qT is not a class", t1);
>break;
> +case CPTK_IS_CONST:
> +  inform (loc, "  %qT is not a const type", t1);
> +  break;
>  case CPTK_IS_CONSTRUCTIBLE:
>if (!t2)
>  inform (loc, "  %qT is not default constructible", t1);
> diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
> index 394f006f20f..36faed9c0b3 100644
> --- a/gcc/cp/cp-trait.def
> +++ b/gcc/cp/cp-trait.def
> @@ -64,6 +64,7 @@ DEFTRAIT_EXPR (IS_ASSIGNABLE, "__is_assignable", 2)
>  DEFTRAIT_EXPR (IS_BASE_OF, "__is_base_of", 2)
>  DEFTRAIT_EXPR (IS_BOUNDED_ARRAY, "__is_bounded_array", 1)
>  DEFTRAIT_EXPR (IS_CLASS, "__is_class", 1)
> +DEFTRAIT_EXPR (IS_CONST, "__is_const", 1)
>  DEFTRAIT_EXPR (IS_CONSTRUCTIBLE, "__is_constructible", -1)
>  DEFTRAIT_EXPR (IS_CONVERTIBLE, "__is_convertible", 2)
>  DEFTRAIT_EXPR (IS_EMPTY, "__is_empty", 1)
> diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> index e6dba29ee81..364d87ee34d 100644
> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -12415,6 +12415,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, 
> tree type2)
>  case CPTK_IS_CLASS:
>return NON_UNION_CLASS_TYPE_P (type1);
>
> +case CPTK_IS_CONST:
> +  return CP_TYPE_CONST_P (type1);
> +
>  case CPTK_IS_CONSTRUCTIBLE:
>return is_xible (INIT_EXPR, type1, type2);
>
> @@ -12657,6 +12660,7 @@ finish_trait_expr (location_t loc, cp_trait_kind 
> kind, tree type1, tree type2)
>  case CPTK_IS_ARRAY:
>  case CPTK_IS_BOUNDED_ARRAY:
>  case CPTK_IS_CLASS:
> +case CPTK_IS_CONST:
>  case CPTK_IS_ENUM:
>  case CPTK_IS_FUNCTION:
>  case CPTK_IS_MEMBER_FUNCTION_POINTER:
> diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
> b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> index 02b4b4d745d..e3640faeb96 100644
> --- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> +++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
> @@ -71,6 +71,9 @@
>  #if !__has_builtin (__is_class)
>  # error "__has_builtin (__is_class) failed"
>  #endif
> +#if !__has_builtin (__is_const)
> +# error "__has_builtin (__is_const) failed"
> +#endif
>  #if !__has_builtin (__is_constructible)
>  # error "__has_builtin (__is_constructible) failed"
>  #endif
> diff --git a/gcc/testsuite/g++.dg/ext/is_const.C 
> b/gcc/testsuite/g++.dg/ext/is_const.C
> new file mode 100644
> index 000..8f2d7c2fce9
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/ext/is_const.C
> @@ -0,0 +1,19 @@
> +// { dg-do compile { target c++11 } }
> +
> +#include <testsuite_tr1.h>
> +

Please ignore this patch series.  I should have removed testsuite_tr1.h.

> +using namespace __gnu_test;
> +
> +#define SA(X) static_assert((X),#X)
> +
> +// Positive tests.
> +SA(__is_const(const int));
> +SA(__is_const(const volatile int));
> +SA(__is_const(cClassType));
> +SA(__is_const(cvClassType));
> +
> +// Negative tests.
> +SA(!__is_const(int));
> +SA(!__is_const(volatile int));
> +SA(!__is_const(ClassType));
> +SA(!__is_const(vClassType));
> --
> 2.43.0
>


[PATCH 2/8] libstdc++: Optimize std::is_const compilation performance

2023-12-23 Thread Ken Matsui
This patch optimizes the compilation performance of std::is_const
by dispatching to the new __is_const built-in trait.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_const): Use __is_const built-in
trait.
(is_const_v): Likewise.

Signed-off-by: Ken Matsui 
---
 libstdc++-v3/include/std/type_traits | 12 
 1 file changed, 12 insertions(+)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index f00c07f94f9..f40831de838 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -835,6 +835,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Type properties.
 
   /// is_const
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_const)
+  template<typename _Tp>
+    struct is_const
+    : public __bool_constant<__is_const(_Tp)>
+    { };
+#else
   template<typename _Tp>
     struct is_const
     : public false_type { };
@@ -842,6 +848,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
     struct is_const<_Tp const>
     : public true_type { };
+#endif
 
   /// is_volatile
   template<typename _Tp>
@@ -3315,10 +3322,15 @@ template <typename _Tp>
   inline constexpr bool is_member_pointer_v = is_member_pointer<_Tp>::value;
 #endif
 
+#if _GLIBCXX_USE_BUILTIN_TRAIT(__is_const)
+template <typename _Tp>
+  inline constexpr bool is_const_v = __is_const(_Tp);
+#else
 template <typename _Tp>
   inline constexpr bool is_const_v = false;
 template <typename _Tp>
   inline constexpr bool is_const_v<const _Tp> = true;
+#endif
 
 #if _GLIBCXX_USE_BUILTIN_TRAIT(__is_function)
 template <typename _Tp>
-- 
2.43.0



[PATCH 1/8] c++: Implement __is_const built-in trait

2023-12-23 Thread Ken Matsui
This patch implements a built-in trait for std::is_const.

gcc/cp/ChangeLog:

* cp-trait.def: Define __is_const.
* constraint.cc (diagnose_trait_expr): Handle CPTK_IS_CONST.
* semantics.cc (trait_expr_value): Likewise.
(finish_trait_expr): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/has-builtin-1.C: Test existence of __is_const.
* g++.dg/ext/is_const.C: New test.

Signed-off-by: Ken Matsui 
---
 gcc/cp/constraint.cc |  3 +++
 gcc/cp/cp-trait.def  |  1 +
 gcc/cp/semantics.cc  |  4 
 gcc/testsuite/g++.dg/ext/has-builtin-1.C |  3 +++
 gcc/testsuite/g++.dg/ext/is_const.C  | 19 +++
 5 files changed, 30 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_const.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index eeacead52a5..f1b07aa2853 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -3734,6 +3734,9 @@ diagnose_trait_expr (tree expr, tree args)
 case CPTK_IS_CLASS:
   inform (loc, "  %qT is not a class", t1);
   break;
+case CPTK_IS_CONST:
+  inform (loc, "  %qT is not a const type", t1);
+  break;
 case CPTK_IS_CONSTRUCTIBLE:
   if (!t2)
 inform (loc, "  %qT is not default constructible", t1);
diff --git a/gcc/cp/cp-trait.def b/gcc/cp/cp-trait.def
index 394f006f20f..36faed9c0b3 100644
--- a/gcc/cp/cp-trait.def
+++ b/gcc/cp/cp-trait.def
@@ -64,6 +64,7 @@ DEFTRAIT_EXPR (IS_ASSIGNABLE, "__is_assignable", 2)
 DEFTRAIT_EXPR (IS_BASE_OF, "__is_base_of", 2)
 DEFTRAIT_EXPR (IS_BOUNDED_ARRAY, "__is_bounded_array", 1)
 DEFTRAIT_EXPR (IS_CLASS, "__is_class", 1)
+DEFTRAIT_EXPR (IS_CONST, "__is_const", 1)
 DEFTRAIT_EXPR (IS_CONSTRUCTIBLE, "__is_constructible", -1)
 DEFTRAIT_EXPR (IS_CONVERTIBLE, "__is_convertible", 2)
 DEFTRAIT_EXPR (IS_EMPTY, "__is_empty", 1)
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index e6dba29ee81..364d87ee34d 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -12415,6 +12415,9 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree 
type2)
 case CPTK_IS_CLASS:
   return NON_UNION_CLASS_TYPE_P (type1);
 
+case CPTK_IS_CONST:
+  return CP_TYPE_CONST_P (type1);
+
 case CPTK_IS_CONSTRUCTIBLE:
   return is_xible (INIT_EXPR, type1, type2);
 
@@ -12657,6 +12660,7 @@ finish_trait_expr (location_t loc, cp_trait_kind kind, 
tree type1, tree type2)
 case CPTK_IS_ARRAY:
 case CPTK_IS_BOUNDED_ARRAY:
 case CPTK_IS_CLASS:
+case CPTK_IS_CONST:
 case CPTK_IS_ENUM:
 case CPTK_IS_FUNCTION:
 case CPTK_IS_MEMBER_FUNCTION_POINTER:
diff --git a/gcc/testsuite/g++.dg/ext/has-builtin-1.C 
b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
index 02b4b4d745d..e3640faeb96 100644
--- a/gcc/testsuite/g++.dg/ext/has-builtin-1.C
+++ b/gcc/testsuite/g++.dg/ext/has-builtin-1.C
@@ -71,6 +71,9 @@
 #if !__has_builtin (__is_class)
 # error "__has_builtin (__is_class) failed"
 #endif
+#if !__has_builtin (__is_const)
+# error "__has_builtin (__is_const) failed"
+#endif
 #if !__has_builtin (__is_constructible)
 # error "__has_builtin (__is_constructible) failed"
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/is_const.C 
b/gcc/testsuite/g++.dg/ext/is_const.C
new file mode 100644
index 000..8f2d7c2fce9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_const.C
@@ -0,0 +1,19 @@
+// { dg-do compile { target c++11 } }
+
+#include <testsuite_tr1.h>
+
+using namespace __gnu_test;
+
+#define SA(X) static_assert((X),#X)
+
+// Positive tests.
+SA(__is_const(const int));
+SA(__is_const(const volatile int));
+SA(__is_const(cClassType));
+SA(__is_const(cvClassType));
+
+// Negative tests.
+SA(!__is_const(int));
+SA(!__is_const(volatile int));
+SA(!__is_const(ClassType));
+SA(!__is_const(vClassType));
-- 
2.43.0



[PATCH 0/8] Optimize more type traits

2023-12-23 Thread Ken Matsui
This patch series implements the __is_const, __is_volatile, __is_pointer,
and __is_unbounded_array built-in traits, which were split out of my
previous patch series "Optimize type traits compilation performance"
because they showed performance regressions there.  I have confirmed that
this patch series does not cause any performance regression.  The main
causes of the earlier regressions were the exhaustiveness of the
benchmarks and the instability of the benchmark results.  Here are the
new benchmark results:

is_const: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_const.md#sat-dec-23-090605-am-pst-2023

time: -4.36603%, peak memory: -0.300891%, total memory: -0.247934%

is_volatile_v: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_volatile_v.md#sat-dec-23-091518-am-pst-2023

time: -4.06816%, peak memory: -0.609298%, total memory: -0.659134%

is_pointer: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_pointer.md#sat-dec-23-124903-pm-pst-2023

time: -2.47124%, peak memory: -2.98207%, total memory: -4.0811%

is_unbounded_array_v: 
https://github.com/ken-matsui/gcc-bench/blob/main/is_unbounded_array_v.md#sat-dec-23-010046-pm-pst-2023

time: -1.50025%, peak memory: -1.07386%, total memory: -2.32394%

Ken Matsui (8):
  c++: Implement __is_const built-in trait
  libstdc++: Optimize std::is_const compilation performance
  c++: Implement __is_volatile built-in trait
  libstdc++: Optimize std::is_volatile compilation performance
  c++: Implement __is_pointer built-in trait
  libstdc++: Optimize std::is_pointer compilation performance
  c++: Implement __is_unbounded_array built-in trait
  libstdc++: Optimize std::is_unbounded_array compilation performance

 gcc/cp/constraint.cc  | 12 +++
 gcc/cp/cp-trait.def   |  4 +
 gcc/cp/semantics.cc   | 16 
 gcc/testsuite/g++.dg/ext/has-builtin-1.C  | 12 +++
 gcc/testsuite/g++.dg/ext/is_const.C   | 19 +
 gcc/testsuite/g++.dg/ext/is_pointer.C | 51 +
 gcc/testsuite/g++.dg/ext/is_unbounded_array.C | 37 ++
 gcc/testsuite/g++.dg/ext/is_volatile.C| 19 +
 libstdc++-v3/include/bits/cpp_type_traits.h   | 29 
 libstdc++-v3/include/std/type_traits  | 73 +--
 10 files changed, 264 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_const.C
 create mode 100644 gcc/testsuite/g++.dg/ext/is_pointer.C
 create mode 100644 gcc/testsuite/g++.dg/ext/is_unbounded_array.C
 create mode 100644 gcc/testsuite/g++.dg/ext/is_volatile.C

-- 
2.43.0



Re: [PATCH RFC] c++/modules: __class_type_info and modules

2023-12-23 Thread Nathan Sidwell

On 12/18/23 17:10, Jason Merrill wrote:

On 12/18/23 16:57, Nathan Sidwell wrote:

On 12/18/23 16:31, Jason Merrill wrote:

Tested x86_64-pc-linux-gnu.  Does this make sense?  Did you have another theory
about how to merge these?


Why isn't push_abi_namespace doing the right setup here? (and I think 
get_global_binding might be similarly problematic?)


What would the right setup be?  It pushes into the global module, but before 
this change lookup doesn't find things imported into the global module, and so 
we get two independent (and so non-equivalent) declarations.


The comment for get_namespace_binding says "Users of this who, having found 
nothing, push a new decl must be prepared for that pushing to match an existing 
decl."  But if lookup_elaborated_type fails, so we pushtag a new type, 
check_module_override doesn't try to merge them because TREE_PUBLIC isn't set on 
the TYPE_DECL yet at that point, and they coexist until we complain about 
redeclaring __dynamic_cast with non-matching parameter types.


I tried setting TREE_PUBLIC on the TYPE_DECL, and then check_module_override 
called duplicate_decls, and rejected the redeclaration as a different type.


Sigh, it seems that doesn't work as intended.  I guess your approach is a 
pragmatic workaround, much as I dislike special-casing a particular identifier. 
Perhaps comment with an appropriate FIXME?


I've realized there are problems with completeness here -- the 'invisible' type 
may be complete, but the current TU only forward-declares it.  Our AST can't 
represent that right now.  And I'm not sure if there are template instantiation 
issues -- is the type complete or not in any particular instantiation?


nathan




-- 8< --

Doing a dynamic_cast in both TUs broke because we were declaring a new
__class_type_info in _b that conflicted with the one imported in the global
module from _a.  lookup_elaborated_type has a comment that we probably don't
want to find such imports in general, but in this case it seems necessary to
make the artificial lazy declarations of RTTI types work.

gcc/cp/ChangeLog:

* name-lookup.cc (lookup_elaborated_type): Look for bindings
in the global namespace in the ABI namespace.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr106304_b.C: Add dynamic_cast.
---
  gcc/cp/name-lookup.cc | 10 ++
  gcc/testsuite/g++.dg/modules/pr106304_b.C |  1 +
  2 files changed, 11 insertions(+)

diff --git a/gcc/cp/name-lookup.cc b/gcc/cp/name-lookup.cc
index 09dc6ef3e5a..f15b338025d 100644
--- a/gcc/cp/name-lookup.cc
+++ b/gcc/cp/name-lookup.cc
@@ -8092,6 +8092,16 @@ lookup_elaborated_type (tree name, TAG_how how)
    // FIXME: This isn't quite right, if we find something
    // here, from the language PoV we're not supposed to
    // know it?
+  // We at least need to do this in __cxxabiv1 to unify lazy
+  // declarations of __class_type_info in build_dynamic_cast_1.
+  if (current_namespace == abi_node)
+    {
+  tree g = (BINDING_VECTOR_CLUSTER (*slot, 0)
+    .slots[BINDING_SLOT_GLOBAL]);
+  for (ovl_iterator iter (g); iter; ++iter)
+    if (qualify_lookup (*iter, LOOK_want::TYPE))
+  return *iter;
+    }
  }
  }
  }
diff --git a/gcc/testsuite/g++.dg/modules/pr106304_b.C 
b/gcc/testsuite/g++.dg/modules/pr106304_b.C

index e8333909c8d..0d1da086176 100644
--- a/gcc/testsuite/g++.dg/modules/pr106304_b.C
+++ b/gcc/testsuite/g++.dg/modules/pr106304_b.C
@@ -5,4 +5,5 @@ module pr106304;
  void f(A& a) {
    as_b(a);
+  dynamic_cast();
  }

base-commit: 5347263b347d02e875879ca40ca6e289ac178919
prerequisite-patch-id: 66735c0c7beb22586ed4b632d10ec9094bb9920c






--
Nathan Sidwell
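
For context, a minimal non-modules sketch of the kind of code the quoted testcase change exercises. The as_b name is borrowed from the testcase, but its definition here, like the A and B types, is an assumption made for illustration. Per the quoted commit message, a dynamic_cast like this one is what makes the front end lazily declare __class_type_info.

```cpp
#include <cassert>

// Hypothetical polymorphic hierarchy; the real testcase splits
// declarations like these across module units.
struct A { virtual ~A() = default; };
struct B : A { int tag = 42; };

// Assumed shape of a helper that needs the RTTI support types.
inline B* as_b(A& a) { return dynamic_cast<B*>(&a); }
```

Calling as_b on a B object yields a non-null pointer and on a plain A yields nullptr. In the modules scenario from the thread, the problem arises when both translation units perform such a cast.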



[PATCH] reassoc vs uninitialized variable [PR112581]

2023-12-23 Thread Andrew Pinski
Like r14-2293-g11350734240dba and r14-2289-gb083203f053f16,
reassociation can combine across a few basic blocks.  If one of the
uses is an uninitialized variable, going from a conditional use to
an unconditional use can cause wrong code.
This uses ssa_name_maybe_undef_p like other passes where this can happen.

Note if-to-switch uses the function (init_range_entry) provided
by reassociation, so we need to call mark_ssa_maybe_undefs there;
otherwise we assume almost all SSA names are uninitialized.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

PR tree-optimization/112581
* gimple-if-to-switch.cc (pass_if_to_switch::execute): Call
mark_ssa_maybe_undefs.
* tree-ssa-reassoc.cc (can_reassociate_op_p): Uninitialized
variables can not be reassociated.
(init_range_entry): Check for uninitialized variables too.
(init_reassoc): Call mark_ssa_maybe_undefs.

gcc/testsuite/ChangeLog:

PR tree-optimization/112581
* gcc.c-torture/execute/pr112581-1.c: New test.

Signed-off-by: Andrew Pinski 
---
 gcc/gimple-if-to-switch.cc|  3 ++
 .../gcc.c-torture/execute/pr112581-1.c| 37 +++
 gcc/tree-ssa-reassoc.cc   |  7 +++-
 3 files changed, 46 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr112581-1.c

diff --git a/gcc/gimple-if-to-switch.cc b/gcc/gimple-if-to-switch.cc
index 7792a6024cd..af8d6684d32 100644
--- a/gcc/gimple-if-to-switch.cc
+++ b/gcc/gimple-if-to-switch.cc
@@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "alloc-pool.h"
 #include "tree-switch-conversion.h"
 #include "tree-ssa-reassoc.h"
+#include "tree-ssa.h"
 
 using namespace tree_switch_conversion;
 
@@ -494,6 +495,8 @@ pass_if_to_switch::execute (function *fun)
   auto_vec all_candidates;
   hash_map conditions_in_bbs;
 
+  mark_ssa_maybe_undefs ();
+
   basic_block bb;
   FOR_EACH_BB_FN (bb, fun)
 find_conditions (bb, &conditions_in_bbs);
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr112581-1.c 
b/gcc/testsuite/gcc.c-torture/execute/pr112581-1.c
new file mode 100644
index 000..14081c96d58
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr112581-1.c
@@ -0,0 +1,37 @@
+/* { dg-require-effective-target int32plus } */
+/* PR tree-optimization/112581 */
+/* reassociation, used to combine 2 bb to together,
+   that made an unitialized variable unconditional used
+   which then at runtime would cause an infinite loop.  */
+int a = -1, b = 2501896061, c, d, e, f = 3, g;
+int main() {
+  unsigned h;
+  int i;
+  d = 0;
+  for (; d < 1; d++) {
+int j = ~-((6UL ^ a) / b);
+if (b)
+L:
+  if (!f)
+continue;
+if (c)
+  i = 1;
+if (j) {
+  i = 0;
+  while (e)
+;
+}
+g = -1 % b;
+h = ~(b || h);
+f = g || 0;
+a = a || 0;
+if (!a)
+  h = 0;
+while (h > 4294967294)
+  if (i)
+break;
+if (c)
+  goto L;
+  }
+  return 0;
+}
diff --git a/gcc/tree-ssa-reassoc.cc b/gcc/tree-ssa-reassoc.cc
index cdef9f7cdc3..94873745928 100644
--- a/gcc/tree-ssa-reassoc.cc
+++ b/gcc/tree-ssa-reassoc.cc
@@ -647,6 +647,9 @@ can_reassociate_op_p (tree op)
 {
   if (TREE_CODE (op) == SSA_NAME && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op))
 return false;
+  /* Uninitialized variables can't participate in reassociation. */
+  if (TREE_CODE (op) == SSA_NAME && ssa_name_maybe_undef_p (op))
+return false;
   /* Make sure asm goto outputs do not participate in reassociation since
  we have no way to find an insertion place after asm goto.  */
   if (TREE_CODE (op) == SSA_NAME
@@ -2600,7 +2603,8 @@ init_range_entry (struct range_entry *r, tree exp, gimple 
*stmt)
}
 
   if (TREE_CODE (arg0) != SSA_NAME
- || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (arg0))
+ || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (arg0)
+ || ssa_name_maybe_undef_p (arg0))
break;
   loc = gimple_location (stmt);
   switch (code)
@@ -7418,6 +7422,7 @@ init_reassoc (void)
   free (bbs);
   calculate_dominance_info (CDI_POST_DOMINATORS);
   plus_negates = vNULL;
+  mark_ssa_maybe_undefs ();
 }
 
 /* Cleanup after the reassociation pass, and print stats if
-- 
2.39.3
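
The hazard described in the commit message can be sketched with a much smaller example than the testcase. This is an illustration under assumptions, not a reduction of PR112581: the pick function and its parameter names are hypothetical.

```cpp
#include <cassert>

// 'flag' is initialized only when 'have_flag' is true, and the
// short-circuit '&&' guarantees it is read only on that path.
static int pick(bool have_flag, int value)
{
  int flag;                    // deliberately uninitialized on one path
  if (have_flag)
    flag = value & 1;
  if (have_flag && flag)       // conditional use: well defined as written
    return 1;
  return 0;
}
```

As written this is well defined. If an optimization merged the two tests into a single unconditional test, flag would be read even when have_flag is false. That is the conditional-to-unconditional change the patch guards against by checking ssa_name_maybe_undef_p.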



Re: [PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

2023-12-23 Thread Jeff Law




On 12/23/23 04:07, pan2...@intel.com wrote:

From: Pan Li 

This patch XFAILs the test case pr30957-1.c for RVV when building the
ELF with some configurations (listed at the end of this log).
The loop is vectorized during vect_transform_loop with a variable factor.
It won't benefit from unrolling/peeling, so loop->unroll is marked as 1.
Consequently, unroll_loops does nothing when loop->unroll is 1.

After this patch, loops vectorized with a variable factor on RVV
will be treated as XFAIL in the tree dump check.

That is, the configurations below will be treated as XFAIL, and we still
need further investigation into the failures of other configurations.

* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow
* 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow
* 
riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax

gcc/testsuite/ChangeLog:

* 

Re: [PATCH] LoongArch: Expand left rotate to right rotate with negated amount

2023-12-23 Thread Xi Ruoyao
On Sun, 2023-12-24 at 00:56 +0800, Xi Ruoyao wrote:
> On Sat, 2023-12-23 at 15:00 +0800, chenglulu wrote:
> > Hi,
> > 
> > This patch will cause the following tests to fail:
> > 
> > +FAIL: gcc.dg/vect/pr97081-2.c (internal compiler error: in extract_insn, 
> > at recog.cc:2812)
> > +FAIL: gcc.dg/vect/pr97081-2.c (test for excess errors)
> > +FAIL: gcc.dg/vect/pr97081-2.c -flto -ffat-lto-objects (internal compiler 
> > error: in extract_insn, at recog.cc:2812)
> > +FAIL: gcc.dg/vect/pr97081-2.c -flto -ffat-lto-objects (test for excess 
> > errors)
> 
> I can reproduce it now but it did not happen when I submitted the patch.
> The difference may be caused by a different binutils version or some
> other changes in GCC.  I'll figure it out...

Phew, it was simple.  I uploaded an earlier draft version of this patch
onto the dev box :(.
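
For readers who have not looked at the patch itself, the identity named in the subject line can be sketched in plain C. This is an illustration only, not code from the patch: the function names are hypothetical, and the assumption is simply that a target offering only a rotate-right operation can express a left rotate by N as a right rotate by the negated amount, taken modulo the width.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate X right by N bits; N is reduced modulo 32 first.  */
static uint32_t
rotr32 (uint32_t x, unsigned n)
{
  n &= 31;
  return (x >> n) | (x << ((32 - n) & 31));
}

/* Left rotate expressed as a right rotate with the negated amount.
   (0u - n) & 31 equals (32 - n) mod 32.  */
static uint32_t
rotl32_via_rotr (uint32_t x, unsigned n)
{
  return rotr32 (x, (0u - n) & 31);
}

/* Straightforward left rotate, used only as a reference for comparison.  */
static uint32_t
rotl32_reference (uint32_t x, unsigned n)
{
  n &= 31;
  return (x << n) | (x >> ((32 - n) & 31));
}
```

Comparing rotl32_via_rotr against rotl32_reference for every amount from 0 to 31 is a cheap way to convince oneself that the negation is taken modulo the width and not modulo something larger.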

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH] LoongArch: Expand left rotate to right rotate with negated amount

2023-12-23 Thread Xi Ruoyao
On Sat, 2023-12-23 at 15:00 +0800, chenglulu wrote:
> Hi,
> 
> This patch will cause the following tests to fail:
> 
> +FAIL: gcc.dg/vect/pr97081-2.c (internal compiler error: in extract_insn, at 
> recog.cc:2812)
> +FAIL: gcc.dg/vect/pr97081-2.c (test for excess errors)
> +FAIL: gcc.dg/vect/pr97081-2.c -flto -ffat-lto-objects (internal compiler 
> error: in extract_insn, at recog.cc:2812)
> +FAIL: gcc.dg/vect/pr97081-2.c -flto -ffat-lto-objects (test for excess 
> errors)

I can reproduce it now but it did not happen when I submitted the patch.
The difference may be caused by a different binutils version or some
other changes in GCC.  I'll figure it out...

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-23 Thread Jeff Law




On 12/23/23 01:58, YunQiang Su wrote:

On platforms where TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) is true,
if bit 31 or any higher bit is polluted by a bit operation, we need an
explicit truncate.  Let's emit one, using the same hard register as both
input and output.  The RTL may look like:

(insn 21 20 24 2 (set (subreg/s/u:SI (reg/v:DI 200 [ val ]) 0)
 (truncate:SI (reg/v:DI 200 [ val ]))) "../xx.c":7:29 -1
  (nil))

We use the /s/u flags to mark the insn as really needed, because
combine_simplify_rtx may consider the value already truncated;
so let's skip this combination.

gcc/ChangeLog:
 PR: 104914.
 * combine.cc (try_combine): Skip combine with truncate if
dest is subreg and has /u/s flags on platforms
TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)) == true.
* expr.cc (expand_assignment): Emit a truncate insn, if
31+ bits is polluted for SImode.

gcc/testsuite/ChangeLog:
PR: 104914.
* gcc.target/mips/pr104914.c: New testcase.

I would suggest you show the RTL before/after whatever transformation 
has caused problems on your target and explain why you think the 
transformation is incorrect.


Focus on the RTL semantics as well as the target specific semantics 
because both are critically important here.


I strongly suspect you're just papering over a problem elsewhere.



---
  gcc/combine.cc   | 23 +-
  gcc/expr.cc  | 17 
  gcc/testsuite/gcc.target/mips/pr104914.c | 25 
  3 files changed, 64 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/gcc.target/mips/pr104914.c

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 1cda4dd57f2..04b9c414053 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -3294,6 +3294,28 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
n_occurrences = 0;  /* `subst' counts here */
subst_low_luid = DF_INSN_LUID (i2);
  
+  /* Don't try to combine a TRUNCATE INSN, if it's DEST is SUBREG and has
+FLAG /s/u.  We use these 2 flags to mark this INSN as really needed:
+normally, it means that the bits of 31+ of this variable is polluted
+by a bitops.  The reason of existing of case (subreg:SI (reg:DI)) is
+that, the same hardreg may act as src and dest.  */
+  if (TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)
+ && INSN_P (i2))
+   {
+ rtx i2dest_o = SET_DEST (PATTERN (i2));
+ rtx i2src_o = SET_SRC (PATTERN (i2));
+ if (GET_CODE (i2dest_o) == SUBREG
+ && GET_MODE (i2dest_o) == SImode
+ && GET_MODE (SUBREG_REG (i2dest_o)) == DImode
+ && SUBREG_PROMOTED_VAR_P (i2dest_o)
+ && SUBREG_PROMOTED_GET (i2dest_o) == SRP_SIGNED
+ && GET_CODE (i2src_o) == TRUNCATE
+ && GET_MODE (i2src_o) == SImode
+ && rtx_equal_p (SUBREG_REG (i2dest_o), XEXP (i2src_o, 0))
+ )
+   return 0;
+   }
So checking SI/DI like this is just wrong.  There's nothing special 
about SI/DI.  Checking for equality between the destination and source 
also seems wrong -- if the state of the sign bit is wrong, it's wrong 
regardless of whether or not the source/destination register is the same.
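
To illustrate both objections, here is a small C sketch (not GCC code; the helper names are hypothetical). It assumes a target that keeps a narrow value sign-extended in a 64-bit register, and it phrases the validity check in terms of a precision parameter, so nothing in it is specific to 32 and 64 bits, and nothing depends on which register holds the value.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low PREC bits of REG to 64 bits (1 <= PREC <= 64).  */
static uint64_t
sign_extend_low (uint64_t reg, unsigned prec)
{
  if (prec >= 64)
    return reg;
  uint64_t sign = (uint64_t) 1 << (prec - 1);
  uint64_t low = reg & ((sign << 1) - 1);
  return (low ^ sign) - sign;
}

/* Nonzero if REG is a valid image of a PREC-bit value under the
   "always sign-extended" convention.  The check looks only at the bits
   in the register, not at which register is being examined.  */
static int
narrow_value_valid_p (uint64_t reg, unsigned prec)
{
  return reg == sign_extend_low (reg, prec);
}
```

A register image of 0x80000000 fails the check for a precision of 32 whether it is inspected in place or after being copied into another variable, and the same predicate works unchanged for precisions of 8 or 16.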





@@ -5326,7 +5348,6 @@ find_split_point (rtx *loc, rtx_insn *insn, bool set_src)
  
 UNIQUE_COPY is true if each substitution must be unique.  We do this
 by copying if `n_occurrences' is nonzero.  */
-
  static rtx
  subst (rtx x, rtx from, rtx to, bool in_dest, bool in_cond, bool unique_copy)
  {
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 9fef2bf6585..f7236040a34 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -6284,6 +6284,23 @@ expand_assignment (tree to, tree from, bool nontemporal)
nontemporal, reversep);
  convert_move (SUBREG_REG (to_rtx), to_rtx1,
SUBREG_PROMOTED_SIGN (to_rtx));
+
+ rtx last = get_last_insn ();
+ if (TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)
+ && known_ge (bitregion_end, 31)
+ && SUBREG_PROMOTED_VAR_P (to_rtx)
+ && SUBREG_PROMOTED_SIGN (to_rtx) == SRP_SIGNED
+ && GET_MODE (to_rtx) == SImode
+ && GET_MODE (SUBREG_REG (to_rtx)) == DImode
+ && GET_CODE (SET_SRC (PATTERN (last))) == SIGN_EXTEND
+ )
+   {
+ insn_code icode = convert_optab_handler
+   (trunc_optab, SImode, DImode);
+ if (icode != CODE_FOR_nothing)
+   emit_unop_insn (icode, to_rtx,
+   SUBREG_REG (to_rtx), TRUNCATE);
+   }

Similar comments about the modes apply here.

But again, my sense is there's a higher 

Re: [PATCH v2] RISC-V: XFail the signbit-5 run test for RVV

2023-12-23 Thread Jeff Law




On 12/23/23 05:39, pan2...@intel.com wrote:

From: Pan Li 

This patch XFAILs the signbit-5 run test for RVV.  The test states a
limitation at the top of the file: "This test does not work when the
truth type does not match vector type."  That applies to RVV, whose
vector truth type is not an integer type.

The riscv-sim target boards, such as the one below, pass `-march=rv64gcv`
when building the run-test ELF.  RVV therefore cannot bypass this test
the way aarch64_sve does with the additional option `-march=armv8-a`.

   riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow

For RVV, we use dg-xfail-run-if for this case, as is done for `amdgcn`.

The signbit-5.c test passes with the configurations below, but the
failures in other configurations need further investigation.

* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* 

[PATCH v2] RISC-V: XFail the signbit-5 run test for RVV

2023-12-23 Thread pan2 . li
From: Pan Li 

This patch XFAILs the signbit-5 run test for RVV.  The test states a
limitation at the top of the file: "This test does not work when the
truth type does not match vector type."  That applies to RVV, whose
vector truth type is not an integer type.

The riscv-sim target boards, such as the one below, pass `-march=rv64gcv`
when building the run-test ELF.  RVV therefore cannot bypass this test
the way aarch64_sve does with the additional option `-march=armv8-a`.

  riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow

For RVV, we use dg-xfail-run-if for this case, as is done for `amdgcn`.

The signbit-5.c test passes with the configurations below, but the
failures in other configurations need further investigation.

* riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8
* 

[PATCH v1] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

2023-12-23 Thread pan2 . li
From: Pan Li 

This patch XFAILs the test case pr30957-1.c for RVV when the ELF is
built with certain configurations (listed at the end of this log).
In those configurations the loop is vectorized during vect_transform_loop
with a variable factor.  Such a loop won't benefit from unrolling/peeling,
so loop->unroll is set to 1, and unroll_loops then does nothing for it.

After this patch, loops vectorized with a variable factor on RVV are
treated as XFAIL by the tree dump scan.

That is, the configurations below are treated as XFAIL; the failures in
other configurations still need further investigation.

* riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow
* 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2
* 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gc_zve32f/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow
* 
riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gc_zve64d/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl1024b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl256b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=dynamic/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m2/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m4/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-lmul=m8/--param=riscv-autovec-preference=fixed-vlmax
* 
riscv-sim/-march=rv64gcv_zvl512b/-mabi=lp64d/-mcmodel=medlow/--param=riscv-autovec-preference=fixed-vlmax

gcc/testsuite/ChangeLog:

* gcc.dg/pr30957-1.c: Add XFAIL for RVV when vectorized with

RE: Re: [PATCH v1] RISC-V: XFail the signbit-5 run test for RVV

2023-12-23 Thread Li, Pan2
Thanks all for the comments.  I will try riscv_v and send a V2 if
everything goes well.

Pan

From: 钟居哲 
Sent: Friday, December 22, 2023 6:44 AM
To: Jeff Law; Li, Pan2; gcc-patches
Cc: Wang, Yanzhang; kito.cheng; richard.guenther; Tamar Christina
Subject: Re: Re: [PATCH v1] RISC-V: XFail the signbit-5 run test for RVV

Maybe use riscv_v ?


juzhe.zh...@rivai.ai

From: Jeff Law
Date: 2023-12-22 03:16
To: pan2.li; gcc-patches
CC: juzhe.zhong; yanzhang.wang; kito.cheng; richard.guenther; tamar.christina
Subject: Re: [PATCH v1] RISC-V: XFail the signbit-5 run test for RVV


On 12/20/23 19:25, pan2...@intel.com wrote:
> From: Pan Li <pan2...@intel.com>
>
> This patch would like to XFail the signbit-5 run test case for
> the RVV.  Given the case has one limitation like "This test does not
> work when the truth type does not match vector type." in the beginning
> of the test file.  Aka, the RVV vector truth type is not integer type.
>
> The target board of riscv-sim like below will pick up `-march=rv64gcv`
> when building the run test elf. Thus, the RVV cannot bypass this test
> case like aarch64_sve with additional option `-march=armv8-a`.
>
>riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow
>
> For RVV, we leverage dg-xfail-run-if for this case like `amdgcn`.
But isn't that just going to turn this into an XPASS when vector is not
enabled?

Looking at a recent rv64gc run of mine:

> PASS: gcc.dg/signbit-5.c (test for excess errors)
> PASS: gcc.dg/signbit-5.c execution test


Ideally we'd find a way to handle with and without vector.

jeff



Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-23 Thread Xi Ruoyao
On Sat, 2023-12-23 at 18:44 +0800, Xi Ruoyao wrote:
> On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote:
> > > The performance drop has nothing to do with this patch. I found that the 
> > > h264 performance compiled 
> > > by r14-6787 compared to r14-6421 dropped by 6.4%. 
> 
> Then I guess we should create a bug report...
> 
> >  But there is a problem. My regression test has the following two fail 
> > items.(based on r14-6787)
> 
> > +FAIL: gcc.dg/cpp/_Pragma3.c (test for excess errors)

I guess this is https://gcc.gnu.org/PR28123.

> > +FAIL: gcc.dg/pr86617.c scan-rtl-dump-times final "mem/v" 6

I'll take a look at this.  Maybe it will show up with Binutils trunk (I
just realized I tested this patch with Binutils 2.41, which is not
sufficient to really test the change).

> Strange.  I didn't see them on r14-6650 (with or without the patch).

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-23 Thread Xi Ruoyao
On Sat, 2023-12-23 at 10:29 +0800, chenglulu wrote:
> > The performance drop has nothing to do with this patch. I found that the 
> > h264 performance compiled 
> > by r14-6787 compared to r14-6421 dropped by 6.4%. 

Then I guess we should create a bug report...

>  But there is a problem. My regression test has the following two fail 
> items.(based on r14-6787)

> +FAIL: gcc.dg/cpp/_Pragma3.c (test for excess errors)
> +FAIL: gcc.dg/pr86617.c scan-rtl-dump-times final "mem/v" 6

Strange.  I didn't see them on r14-6650 (with or without the patch).

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: OpenMP offloading vs. C++ static local variables

2023-12-23 Thread Thomas Schwinge
Hi!

On 2023-12-21T13:58:23+0100, Jakub Jelinek  wrote:
> On Thu, Dec 21, 2023 at 01:31:19PM +0100, Thomas Schwinge wrote:
>> [...] the gimplification-level code re
>> 'Static locals [...] need to be "omp declare target"' runs *after*
>> 'omp_discover_implicit_declare_target'.  Thus my "move" idea above.
>
> Can't we mark the static locals already during that discovery?

Well, that's precisely what I had tried to communicate, earlier on.  ;-)

I'll work on that, as a refactoring, after I've gotten the current
implementation idea working.

> The addition during gimplification was probably made when we didn't have
> that at all.


>> OK to push, for a start, the attached
>> "GCN, nvptx: Basic '__cxa_guard_{acquire,abort,release}' for C++ static 
>> local variables support"?
>> That's now in libgcc not libgomp, so that it's also usable for GCN, nvptx
>> target testing, where we thus see a number of FAIL -> PASS progressions.
>
>> For now, for single-threaded GCN, nvptx target use only; extension for
>> multi-threaded offloading use to follow later.
>>
>>  libgcc/
>>  * c++-minimal/README: New.
>>  * c++-minimal/guard.c: New.
>>  * config/gcn/t-amdgcn (LIB2ADD): Add it.
>>  * config/nvptx/t-nvptx (LIB2ADD): Likewise.
>
>> +/* Copy'n'paste/edit from 'libstdc++-v3/libsupc++/cxxabi.h'.  */
>> +
>> +  int
>> +  __cxa_guard_acquire(__guard*);
>> +
>> +  void
>> +  __cxa_guard_release(__guard*);
>> +
>> +  void
>> +  __cxa_guard_abort(__guard*);
>
> When all this isn't inside a namespace, shouldn't it be indented by
> 2 spaces less?
>
>> +
>> +/* Copy'n'paste/edit from 'libstdc++-v3/libsupc++/guard.cc'.  */
>> +
>> +# undef _GLIBCXX_GUARD_TEST_AND_ACQUIRE
>> +# undef _GLIBCXX_GUARD_SET_AND_RELEASE
>> +# define _GLIBCXX_GUARD_SET_AND_RELEASE(G) _GLIBCXX_GUARD_SET (G)
>
> And without a space after # here?

Well, those were just un-edited copy'n'pastes from the original files;
now indentation/space-corrected for viewing pleasure.

> Otherwise LGTM, but hope that one day we'll get rid of it again.

Yep.

Pushed to master branch commit c0bf7ea189ecf252152fe15134f70f576bcd20b2
"GCN, nvptx: Basic '__cxa_guard_{acquire,abort,release}' for C++ static local 
variables support",
see attached.
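
As a rough picture of what a single-threaded guard amounts to, here is a minimal C sketch. It is not the committed guard.c: the sketch_ names are hypothetical, and the layout assumption (a 64-bit guard object whose first byte is nonzero once initialization has completed) follows the usual Itanium C++ ABI convention. Recursive-initialization detection and any locking are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a 64-bit guard object; the first byte is nonzero once the
   guarded variable has been initialized.  */
typedef int64_t guard_t;

/* Return 1 if the caller must run the initializer, 0 if it already ran.  */
static int
sketch_guard_acquire (guard_t *g)
{
  return *(const char *) g == 0;
}

/* Mark initialization as complete.  */
static void
sketch_guard_release (guard_t *g)
{
  *(char *) g = 1;
}

/* Initialization failed; with a single thread there is nothing to undo,
   so the next acquire simply retries.  */
static void
sketch_guard_abort (guard_t *g)
{
  (void) g;
}

/* Roughly what a compiler emits for 'static int value = 42;' inside a
   function: the initializer runs only on the first call.  */
static guard_t value_guard;
static int value;
static int init_calls;

static int
get_value (void)
{
  if (sketch_guard_acquire (&value_guard))
    {
      value = 42;
      init_calls++;
      sketch_guard_release (&value_guard);
    }
  return value;
}
```

Calling get_value twice runs the initializer once; the second call sees the first guard byte set and skips it.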


Grüße
 Thomas


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
From c0bf7ea189ecf252152fe15134f70f576bcd20b2 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 20 Dec 2023 12:27:48 +0100
Subject: [PATCH] GCN, nvptx: Basic '__cxa_guard_{acquire,abort,release}' for
 C++ static local variables support

For now, for single-threaded GCN, nvptx target use only; extension for
multi-threaded offloading use is to follow later.  Eventually switch to
libstdc++-v3/libsupc++ proper.

	libgcc/
	* c++-minimal/README: New.
	* c++-minimal/guard.c: New.
	* config/gcn/t-amdgcn (LIB2ADD): Add it.
	* config/nvptx/t-nvptx (LIB2ADD): Likewise.
---
 libgcc/c++-minimal/README   |  2 +
 libgcc/c++-minimal/guard.c  | 97 +
 libgcc/config/gcn/t-amdgcn  |  3 ++
 libgcc/config/nvptx/t-nvptx |  3 ++
 4 files changed, 105 insertions(+)
 create mode 100644 libgcc/c++-minimal/README
 create mode 100644 libgcc/c++-minimal/guard.c

diff --git a/libgcc/c++-minimal/README b/libgcc/c++-minimal/README
new file mode 100644
index 000..832f1265f7e
--- /dev/null
+++ b/libgcc/c++-minimal/README
@@ -0,0 +1,2 @@
+Minimal hacked-up version of some C++ support for offload devices, until we
+have libstdc++-v3/libsupc++ proper.
diff --git a/libgcc/c++-minimal/guard.c b/libgcc/c++-minimal/guard.c
new file mode 100644
index 000..e9937b07a62
--- /dev/null
+++ b/libgcc/c++-minimal/guard.c
@@ -0,0 +1,97 @@
+/* 'libstdc++-v3/libsupc++/guard.cc' for offload devices, until we have
+   libstdc++-v3/libsupc++ proper.
+
+   Copyright (C) 2002-2023 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+#if defined __AMDGCN__

Re: [commit v3 1/2] MIPS: Put the ret to the end of args of reconcat [PR112759]

2023-12-23 Thread Andreas Schwab
On Dez 23 2023, YunQiang Su wrote:

> diff --git a/gcc/config/mips/driver-native.cc 
> b/gcc/config/mips/driver-native.cc
> index afc276f5278..4ef48e14916 100644
> --- a/gcc/config/mips/driver-native.cc
> +++ b/gcc/config/mips/driver-native.cc
> @@ -44,6 +44,8 @@ const char *
>  host_detect_local_cpu (int argc, const char **argv)
>  {
>const char *cpu = NULL;
> +  /* Don't assigne any static string to ret.  If you need to do so,
              assign

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


[PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-23 Thread YunQiang Su
On platforms where TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) is true,
if bit 31 or any higher bit is polluted by a bit operation, we need an
explicit truncate.  Let's emit one, using the same hard register as both
input and output.  The RTL may look like:

(insn 21 20 24 2 (set (subreg/s/u:SI (reg/v:DI 200 [ val ]) 0)
(truncate:SI (reg/v:DI 200 [ val ]))) "../xx.c":7:29 -1
 (nil))

We use the /s/u flags to mark it as really needed: in
combine_simplify_rtx this insn may otherwise be treated as an
already-truncated value, so let's skip this combination.
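
The invariant behind this can be modelled outside the compiler. The following is a minimal sketch, not GCC code: it assumes a 64-bit register holds a valid SImode value only when bits 63..32 are copies of bit 31, and the helper names (`sign_extend_low32`, `holds_valid_si`, `insert_byte`) are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* What an explicit DImode->SImode truncate has to produce in this model:
   keep the low 32 bits and copy bit 31 into bits 63..32.  */
static uint64_t sign_extend_low32(uint64_t reg)
{
    uint64_t low = reg & 0xffffffffull;
    if (low & 0x80000000ull)
        low |= 0xffffffff00000000ull;
    return low;
}

/* A register satisfies the invariant if truncating it changes nothing.  */
static int holds_valid_si(uint64_t reg)
{
    return reg == sign_extend_low32(reg);
}

/* Hypothetical model of a bit-field insert: only the eight selected bits
   change, every other bit of the register is left untouched.  */
static uint64_t insert_byte(uint64_t reg, unsigned index, uint8_t byte)
{
    assert(index < 8);
    unsigned shift = index * 8;
    reg &= ~(0xffull << shift);
    reg |= (uint64_t)byte << shift;
    return reg;
}
```

In this model, `insert_byte (0, 3, 0x80)` yields 0x0000000080000000, which no longer satisfies `holds_valid_si`; `sign_extend_low32` restores the invariant by producing 0xffffffff80000000. That is the situation in which treating the truncation as a no-op gives a wrong result.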

gcc/ChangeLog:
PR: 104914.
* combine.cc (try_combine): Skip combining with a truncate if
dest is a subreg with the /u/s flags set, on platforms where
TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode) is true.
* expr.cc (expand_assignment): Emit a truncate insn if
bits 31+ are polluted for SImode.

gcc/testsuite/ChangeLog:
PR: 104914.
* gcc.target/mips/pr104914.c: New testcase.
---
 gcc/combine.cc   | 23 +-
 gcc/expr.cc  | 17 
 gcc/testsuite/gcc.target/mips/pr104914.c | 25 
 3 files changed, 64 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/pr104914.c

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 1cda4dd57f2..04b9c414053 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -3294,6 +3294,28 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
   n_occurrences = 0;   /* `subst' counts here */
   subst_low_luid = DF_INSN_LUID (i2);
 
+  /* Don't try to combine a TRUNCATE insn if its DEST is a SUBREG with the
+     /s/u flags set.  We use these two flags to mark the insn as really
+     needed: normally it means that bits 31 and above of this variable
+     were polluted by a bit operation.  The (subreg:SI (reg:DI)) form
+     exists because the same hard register may act as both src and dest.  */
+  if (TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)
+ && INSN_P (i2))
+   {
+ rtx i2dest_o = SET_DEST (PATTERN (i2));
+ rtx i2src_o = SET_SRC (PATTERN (i2));
+ if (GET_CODE (i2dest_o) == SUBREG
+ && GET_MODE (i2dest_o) == SImode
+ && GET_MODE (SUBREG_REG (i2dest_o)) == DImode
+ && SUBREG_PROMOTED_VAR_P (i2dest_o)
+ && SUBREG_PROMOTED_GET (i2dest_o) == SRP_SIGNED
+ && GET_CODE (i2src_o) == TRUNCATE
+ && GET_MODE (i2src_o) == SImode
+ && rtx_equal_p (SUBREG_REG (i2dest_o), XEXP (i2src_o, 0))
+ )
+   return 0;
+   }
+
   /* If I1 feeds into I2 and I1DEST is in I1SRC, we need to make a unique
 copy of I2SRC each time we substitute it, in order to avoid creating
 self-referential RTL when we will be substituting I1SRC for I1DEST
@@ -5326,7 +5348,6 @@ find_split_point (rtx *loc, rtx_insn *insn, bool set_src)
 
UNIQUE_COPY is true if each substitution must be unique.  We do this
by copying if `n_occurrences' is nonzero.  */
-
 static rtx
 subst (rtx x, rtx from, rtx to, bool in_dest, bool in_cond, bool unique_copy)
 {
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 9fef2bf6585..f7236040a34 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -6284,6 +6284,23 @@ expand_assignment (tree to, tree from, bool nontemporal)
nontemporal, reversep);
  convert_move (SUBREG_REG (to_rtx), to_rtx1,
SUBREG_PROMOTED_SIGN (to_rtx));
+
+ rtx last = get_last_insn ();
+ if (TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)
+ && known_ge (bitregion_end, 31)
+ && SUBREG_PROMOTED_VAR_P (to_rtx)
+ && SUBREG_PROMOTED_SIGN (to_rtx) == SRP_SIGNED
+ && GET_MODE (to_rtx) == SImode
+ && GET_MODE (SUBREG_REG (to_rtx)) == DImode
+ && GET_CODE (SET_SRC (PATTERN (last))) == SIGN_EXTEND
+ )
+   {
+ insn_code icode = convert_optab_handler
+   (trunc_optab, SImode, DImode);
+ if (icode != CODE_FOR_nothing)
+   emit_unop_insn (icode, to_rtx,
+   SUBREG_REG (to_rtx), TRUNCATE);
+   }
}
}
  else
diff --git a/gcc/testsuite/gcc.target/mips/pr104914.c b/gcc/testsuite/gcc.target/mips/pr104914.c
new file mode 100644
index 000..5dd10e84c17
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/pr104914.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-mabi=64" } */
+
+extern void abort (void);
+extern void exit (int);
+
+NOMIPS16 int test (const unsigned char *buf)
+{
+  int val;
+  ((unsigned char*)&val)[0] = *buf++;
+  ((unsigned char*)&val)[1] = *buf++;
+  ((unsigned char*)&val)[2] = *buf++;
+  

[commit v3 2/2] MIPS: Don't add nan2008 option for -mtune=native

2023-12-23 Thread YunQiang Su
Users may wish to use -mtune=native for performance tuning only.
Let's not cause trouble in that case.

gcc/

* config/mips/driver-native.cc (host_detect_local_cpu):
Don't add the nan2008 option for -mtune=native.
---
 gcc/config/mips/driver-native.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/mips/driver-native.cc b/gcc/config/mips/driver-native.cc
index 4ef48e14916..b8c37d69215 100644
--- a/gcc/config/mips/driver-native.cc
+++ b/gcc/config/mips/driver-native.cc
@@ -93,7 +93,8 @@ host_detect_local_cpu (int argc, const char **argv)
 fallback_cpu:
 #if defined (__mips_nan2008)
   /* Put the ret to the end of list, since it may be NULL.  */
-  ret = reconcat (ret, " -mnan=2008 ", ret, NULL);
+  if (arch)
+ret = reconcat (ret, " -mnan=2008 ", ret, NULL);
 #endif
 
 #ifdef HAVE_GETAUXVAL
-- 
2.39.2



[commit v3 1/2] MIPS: Put the ret to the end of args of reconcat [PR112759]

2023-12-23 Thread YunQiang Su
The function `reconcat` cannot append string(s) to NULL,
as the concat process will stop at the first NULL.

Let's always put `ret` at the end, as it may be NULL.
We keep using reconcat here because it makes it easier to add
detection of more hardware features later, for example
via hwcap.
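
The "stops at the first NULL" behaviour can be seen with any NULL-terminated varargs concatenation. The following is a minimal sketch under that assumption: `toy_concat` is a hypothetical stand-in written for this illustration, not libiberty's implementation, and it does not attempt to model what reconcat does with the old buffer.

```c
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Concatenate the arguments up to (not including) the first NULL.
   Returns a malloc'd string, or NULL if allocation fails.  */
static char *toy_concat(const char *first, ...)
{
    va_list ap;
    size_t len = 0;

    /* Pass 1: sum the lengths up to the first NULL argument.  */
    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, const char *))
        len += strlen(s);
    va_end(ap);

    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    /* Pass 2: copy the same arguments into the result.  */
    char *p = out;
    va_start(ap, first);
    for (const char *s = first; s != NULL; s = va_arg(ap, const char *)) {
        size_t n = strlen(s);
        memcpy(p, s, n);
        p += n;
    }
    va_end(ap);
    *p = '\0';
    return out;
}
```

With a NULL `ret`, `toy_concat (ret, " -mnan=2008 ", (char *) NULL)` returns an empty string because the list ends at the first argument, while `toy_concat (" -mnan=2008 ", ret, (char *) NULL)` keeps the option. This mirrors why the patch moves `ret` to the end of the argument list.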

gcc/

PR target/112759
* config/mips/driver-native.cc (host_detect_local_cpu):
Put the ret to the end of args of reconcat.
---
 gcc/config/mips/driver-native.cc | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/config/mips/driver-native.cc b/gcc/config/mips/driver-native.cc
index afc276f5278..4ef48e14916 100644
--- a/gcc/config/mips/driver-native.cc
+++ b/gcc/config/mips/driver-native.cc
@@ -44,6 +44,8 @@ const char *
 host_detect_local_cpu (int argc, const char **argv)
 {
   const char *cpu = NULL;
+  /* Don't assigne any static string to ret.  If you need to do so,
+ use concat.  */
   char *ret = NULL;
   char buf[128];
   FILE *f;
@@ -90,7 +92,8 @@ host_detect_local_cpu (int argc, const char **argv)
 
 fallback_cpu:
 #if defined (__mips_nan2008)
-  ret = reconcat (ret, " -mnan=2008 ", NULL);
+  /* Put the ret to the end of list, since it may be NULL.  */
+  ret = reconcat (ret, " -mnan=2008 ", ret, NULL);
 #endif
 
 #ifdef HAVE_GETAUXVAL
@@ -104,7 +107,7 @@ fallback_cpu:
 #endif
 
   if (cpu)
-ret = reconcat (ret, ret, "-m", argv[0], "=", cpu, NULL);
+ret = reconcat (ret, " -m", argv[0], "=", cpu, ret, NULL);
 
   return ret;
 }
-- 
2.39.2



Re: [PATCH v2] MIPS: Put the ret to the end of args of reconcat [PR112759]

2023-12-23 Thread YunQiang Su
Jakub Jelinek wrote on Tue, Dec 19, 2023 at 16:40:
>
> On Tue, Dec 19, 2023 at 09:30:49AM +0800, YunQiang Su wrote:
> > The function `reconcat` cannot append string(s) to NULL,
> > as the concat process will stop at the first NULL.
> >
> > Let's always put the `ret` to the end, as it may be NULL.
> > We keep use reconcat here, due to that reconcat can make it
> > easier if we add more hardware features detecting, for example
> > by hwcap.
> >
> > gcc/
> >
> > PR target/112759
> > * config/mips/driver-native.cc (host_detect_local_cpu):
> >   Put the ret to the end of args of reconcat.
> > ---
> >  gcc/config/mips/driver-native.cc | 7 +--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/gcc/config/mips/driver-native.cc 
> > b/gcc/config/mips/driver-native.cc
> > index afc276f5278..9a224b3f401 100644
> > --- a/gcc/config/mips/driver-native.cc
> > +++ b/gcc/config/mips/driver-native.cc
> > @@ -44,6 +44,8 @@ const char *
> >  host_detect_local_cpu (int argc, const char **argv)
> >  {
> >const char *cpu = NULL;
> > +  /* Don't assigne any static string to ret.  If you need to do so,
> > + use concat.  */
> >char *ret = NULL;
> >char buf[128];
> >FILE *f;
> > @@ -90,7 +92,8 @@ host_detect_local_cpu (int argc, const char **argv)
> >
> >  fallback_cpu:
> >  #if defined (__mips_nan2008)
> > -  ret = reconcat (ret, " -mnan=2008 ", NULL);
> > +  /* Put the ret to the end of list, since it maybe NULL.  */
> > +  ret = reconcat (ret, "-mnan=2008", ret, NULL);
> >  #endif
> >
> >  #ifdef HAVE_GETAUXVAL
> > @@ -104,7 +107,7 @@ fallback_cpu:
> >  #endif
> >
> >if (cpu)
> > -ret = reconcat (ret, ret, "-m", argv[0], "=", cpu, NULL);
> > +ret = reconcat (ret, "-m", argv[0], "=", cpu, ret, NULL);
>
> I think if you don't put any spaces, the above could return
> -march=loongson3a-mnan=2008
> which will not work.

Thanks.

> If you want to emit no spurious spaces around but emit them when needed,
> one way is to put there the space when needed, so
> ret = reconcat (ret, "-mnan=2008", ret ? " " : "", ret, NULL);
> or
> ret = reconcat (ret, "-m", argv[0], "=", cpu, ret ? " " : "", ret, NULL);
> would do it.
>
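
The conditional-separator idea quoted above can be tried in isolation. The following is a minimal sketch using snprintf instead of reconcat; `prepend_option` and its parameters are hypothetical names chosen for this illustration.

```c
#include <stdio.h>

/* Write OPT followed by REST into OUT, inserting a single space only when
   REST is non-NULL and non-empty, so no spurious spaces are emitted.  */
static void prepend_option(char *out, size_t size,
                           const char *opt, const char *rest)
{
    int has_rest = rest != NULL && rest[0] != '\0';
    snprintf(out, size, "%s%s%s",
             opt, has_rest ? " " : "", has_rest ? rest : "");
}
```

Calling it with "-march=loongson3a" and "-mnan=2008" gives "-march=loongson3a -mnan=2008", and calling it with a NULL `rest` gives just the option with no trailing space.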
> I must say I'm also surprised by determining whether to use -mnan=2008 or
> not by how has the host compiler been configured, shouldn't that be
> querying properties of the hardware (say, perform some floating point
> operation that should result in a quiet NaN and see if it has the mantissa
> MSB set or clear)?  And, do you really want to add that -mnan=2008 twice

In fact, we cannot, since the operating system environment can use only one
NaN encoding. Currently, the OS environment can be NaN-legacy and still run on
NaN2008 hardware if an extra kernel option is set, and vice versa.

If we detect the hardware, users will run into problems when linking, such as
  linking a -mnan=2008 module with existing -mnan=legacy modules.

Also, binaries for NaN2008 and NaN-legacy use different dynamic linkers.
If we don't pass -mnan=2008 here when GCC is configured with --with-nan=2008,
the NaN-legacy dynamic linker will be tried, and that will fail.
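
For illustration only, the hardware probe discussed above (which is not what the patch does) might look like the following sketch. It assumes `double` is an IEEE 754 binary64 value; `default_nan_mantissa_msb` is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Return the most significant mantissa bit (bit 51) of the NaN produced
   by 0.0 / 0.0.  Under the 2008 encoding a quiet NaN has this bit set;
   under the legacy MIPS encoding a quiet NaN has it clear.  */
static int default_nan_mantissa_msb(void)
{
    volatile double zero = 0.0;   /* volatile: force a run-time division */
    double nan_value = zero / zero;
    uint64_t bits;

    memcpy(&bits, &nan_value, sizeof bits);
    return (int)((bits >> 51) & 1);
}
```

Such a probe reports how the running environment behaves, which, for the reasons given above, does not have to match the NaN mode the installed libraries and dynamic linker were built for.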

Maybe it needs a better fix.
Let's keep it for now, and try to find a better solution.

> for -march=native -mtune=native, or just for one of those (I assume
> -mnan=2008 is an ABI option, so shouldn't be about tuning but about
> -march=).
>

Thank you. Your suggestion is correct.
I will skip -mtune, as a user may use it separately just for tuning.

> That said, don't really know anything about MIPS, so these are just random
> comments.
>
> Jakub
>