date:20231221

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-21 Thread chenglulu




On 2023/12/22 at 3:09 PM, Xi Ruoyao wrote:

On Fri, 2023-12-22 at 11:44 +0800, chenglulu wrote:

On 2023/12/21 at 8:00 PM, chenglulu wrote:

Sorry, I've been busy with something else these two days. I don't
think there's anything wrong with the code,

but I need to test the spec.:-)

Hi, Ruoyao:

After applying this patch, spec2006 464.h264 ref will have a 6.4%
performance drop. So I'm going to retest it.

I think 6.4% is large enough not to be a random error.

Is there an example showing the code regression?

And I'm wondering if keeping the peephole besides the new
define_insn_and_split produces a better result instead of solely relying
on define_insn_and_split?

I haven't debugged this yet. I'm retesting; if there is still such a big
performance gap, I think I need to find the reason.

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-21 Thread Xi Ruoyao

On Fri, 2023-12-22 at 11:44 +0800, chenglulu wrote:
> 
> On 2023/12/21 at 8:00 PM, chenglulu wrote:
> > Sorry, I've been busy with something else these two days. I don't 
> > think there's anything wrong with the code,
> > 
> > but I need to test the spec.:-)
> 
> Hi, Ruoyao:
> 
> After applying this patch, spec2006 464.h264 ref will have a 6.4% 
> performance drop. So I'm going to retest it.

I think 6.4% is large enough not to be a random error.

Is there an example showing the code regression?

And I'm wondering if keeping the peephole besides the new
define_insn_and_split produces a better result instead of solely relying
on define_insn_and_split?

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University

Re: RE: [PATCH v7] libgfortran: Replace mutex with rwlock

2023-12-21 Thread Lipeng Zhu


Hi Thomas,

On 2023/12/21 19:42, Thomas Schwinge wrote:

Hi!

On 2023-12-13T21:52:29+0100, I wrote:

On 2023-12-12T02:05:26+, "Zhu, Lipeng"  wrote:

On 2023/12/12 1:45, H.J. Lu wrote:

On Sat, Dec 9, 2023 at 7:25 PM Zhu, Lipeng  wrote:

On 2023/12/9 23:23, Jakub Jelinek wrote:

On Sat, Dec 09, 2023 at 10:39:45AM -0500, Lipeng Zhu wrote:

This patch tries to introduce an rwlock and split the reads and writes to
the unit_root tree and unit_cache with the rwlock instead of the mutex, to
increase CPU efficiency. In the get_gfc_unit function, the
percentage of calls that step into the insert_unit function is around 30%; in
most instances, we can get the unit in the phase of reading the
unit_cache or unit_root tree. So splitting the read/write phases with an
rwlock would be an approach to make it more parallel.

BTW, the IPC metrics can gain around 9x in our test server with
220 cores. The benchmark we used is
https://github.com/rwesson/NEAT
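
The read/write split described above can be sketched roughly as follows. This is only an illustration in C++17 using std::shared_mutex; the UnitTable class, its members, and the string payload are hypothetical and are not libgfortran's actual data structures or locking macros.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <string>

// Hypothetical unit table; a stand-in for the unit cache / unit tree.
class UnitTable {
public:
  // Fast path: any number of readers may hold the shared lock at once.
  std::optional<std::string> find(int unit_number) const {
    std::shared_lock<std::shared_mutex> read_lock(lock_);
    auto it = units_.find(unit_number);
    if (it == units_.end())
      return std::nullopt;
    return it->second;
  }

  // Slow path: the exclusive lock is taken only when an insert is needed.
  std::string find_or_insert(int unit_number, const std::string &name) {
    if (auto found = find(unit_number))
      return *found;
    std::unique_lock<std::shared_mutex> write_lock(lock_);
    // emplace does not overwrite, so if another thread inserted the unit
    // between the two locks, the existing entry is returned.
    auto result = units_.emplace(unit_number, name);
    return result.first->second;
  }

private:
  mutable std::shared_mutex lock_;
  std::map<int, std::string> units_;
};
```

With this shape, lookups that hit an existing unit never serialize against each other; only the insert path takes the exclusive lock.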



Ok for trunk, thanks.



Thanks! Looking forward to landing to trunk.



Pushed for you.



I've just filed 
"'libgomp.fortran/rwlock_1.f90', 'libgomp.fortran/rwlock_3.f90' execution test 
timeouts".
Would you be able to look into that?


See my update in there.


Grüße
  Thomas
-- > Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 
201, 80634 München; Gesellschaft mit beschränkter Haftung; 
Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: 
München; Registergericht München, HRB 106955




Since I don't have a GCC Bugzilla account, I'm replying in this thread:
Would limiting the tests to a lower 'OMP_NUM_THREADS', or increasing the
execution timeout, be an option?


However, I can't reproduce the execution timeout failure on either powerpc9 or
powerpc10 machines. I also tried decreasing the CPU frequency
from 2.6G to 800M, and these test cases still ran successfully.


> so only a little bit of an improvement of the new "rwlock"
> libgfortran vs. old "mutex" GCC 10 one, curiously.  (But supposedly that
> depends on the hardware or other factors?)


The rwlock can increase the IPC a lot; perhaps the gain is just not
obvious in the wall time you listed.


$ grep ^cpu < /proc/cpuinfo | uniq -c

192 cpu : POWER10 (architected), altivec supported

Native configuration is powerpc64le-unknown-linux-gnu

Schedule of variations:
unix

PASS: libgomp.fortran/rwlock_1.f90   -O0  (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90   -O0  execution test
PASS: libgomp.fortran/rwlock_1.f90   -O1  (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90   -O1  execution test
PASS: libgomp.fortran/rwlock_1.f90   -O2  (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90   -O2  execution test
PASS: libgomp.fortran/rwlock_1.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
PASS: libgomp.fortran/rwlock_1.f90   -O3 -g  (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90   -O3 -g  execution test
PASS: libgomp.fortran/rwlock_1.f90   -Os  (test for excess errors)
PASS: libgomp.fortran/rwlock_1.f90   -Os  execution test
PASS: libgomp.fortran/rwlock_2.f90   -O0  (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90   -O0  execution test
PASS: libgomp.fortran/rwlock_2.f90   -O1  (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90   -O1  execution test
PASS: libgomp.fortran/rwlock_2.f90   -O2  (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90   -O2  execution test
PASS: libgomp.fortran/rwlock_2.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
PASS: libgomp.fortran/rwlock_2.f90   -O3 -g  (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90   -O3 -g  execution test
PASS: libgomp.fortran/rwlock_2.f90   -Os  (test for excess errors)
PASS: libgomp.fortran/rwlock_2.f90   -Os  execution test
PASS: libgomp.fortran/rwlock_3.f90   -O0  (test for excess errors)
PASS: libgomp.fortran/rwlock_3.f90   -O0  execution test
PASS: libgomp.fortran/rwlock_3.f90   -O1  (test for excess errors)
PASS: libgomp.fortran/rwlock_3.f90   -O1  execution test
PASS: libgomp.fortran/rwlock_3.f90   -O2  (test for excess errors)
PASS: libgomp.fortran/rwlock_3.f90   -O2  execution test
PASS: libgomp.fortran/rwlock_3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
PASS: libgomp.fortran/rwlock_3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test


Lipeng Zhu

Re: [PATCH v2] LoongArch: Implement FCCmode reload and cstore4

2023-12-21 Thread Jiahao Xu

SPECCPU 2017 and SPECCPU 2006 were built and tested successfully. This
patch gives a 1.3% improvement in SPECCPU 2017 fprate on 3A6000, and no
performance regression was found. This is an effective optimization and
looks good.


On 2023/12/15 at 4:57 PM, Xi Ruoyao wrote:

We used a branch to load floating-point comparison results into GPR.
This is very slow when the branch is not predictable.

Implement movfcc so we can reload FCCmode into GPRs, FPRs, and MEM.
Then implement cstore4.
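
As a rough illustration of the kind of source code this affects (a sketch, not one of the tests from the patch): storing the result of a floating-point comparison into an integer is the operation a cstore pattern covers. The function names below are hypothetical, and whether a particular compiler emits a branch for either form depends on the target and options.

```cpp
#include <cassert>

// Materialize a floating-point comparison as an integer (0 or 1).
// With a cstore pattern available, the compiler can produce the value
// directly from the comparison result instead of branching on it.
int less_than(double a, double b) { return a < b; }

// The same computation written with an explicit branch, for comparison.
int less_than_branchy(double a, double b) {
  if (a < b)
    return 1;
  return 0;
}
```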

gcc/ChangeLog:

* config/loongarch/loongarch-tune.h
(loongarch_rtx_cost_data::movcf2gr): New field.
(loongarch_rtx_cost_data::movcf2gr_): New method.
(loongarch_rtx_cost_data::use_movcf2gr): New method.
* config/loongarch/loongarch-def.cc
(loongarch_rtx_cost_data::loongarch_rtx_cost_data): Set movcf2gr
to COSTS_N_INSNS (7) and movgr2cf to COSTS_N_INSNS (15), based
on timing on LA464.
(loongarch_cpu_rtx_cost_data): Set movcf2gr and movgr2cf to
COSTS_N_INSNS (1) for LA664.
(loongarch_rtx_cost_optimize_size): Set movcf2gr and movgr2cf to
COSTS_N_INSNS (1) + 1.
* config/loongarch/predicates.md (loongarch_fcmp_operator): New
predicate.
* config/loongarch/loongarch.md (movfcc): Change to
define_expand.
(movfcc_internal): New define_insn.
(fcc_to_): New define_insn.
(cstore4): New define_expand.
* config/loongarch/loongarch.cc
(loongarch_hard_regno_mode_ok_uncached): Allow FCCmode in FPRs
and GPRs.
(loongarch_secondary_reload): Reload FCCmode via FPR and/or GPR.
(loongarch_emit_float_compare): Call gen_reg_rtx instead of
loongarch_allocate_fcc.
(loongarch_allocate_fcc): Remove.
(loongarch_move_to_gpr_cost): Handle FCC_REGS -> GR_REGS.
(loongarch_move_from_gpr_cost): Handle GR_REGS -> FCC_REGS.
(loongarch_register_move_cost): Handle FCC_REGS -> FCC_REGS,
FCC_REGS -> FP_REGS, and FP_REGS -> FCC_REGS.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/movcf2gr.c: New test.
* gcc.target/loongarch/movcf2gr-via-fr.c: New test.
---

Supersedes
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640497.html.

Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?

  gcc/config/loongarch/loongarch-def.cc | 13 +++-
  gcc/config/loongarch/loongarch-tune.h | 15 +++-
  gcc/config/loongarch/loongarch.cc | 70 ---
  gcc/config/loongarch/loongarch.md | 69 --
  gcc/config/loongarch/predicates.md|  4 ++
  .../gcc.target/loongarch/movcf2gr-via-fr.c| 10 +++
  gcc/testsuite/gcc.target/loongarch/movcf2gr.c |  9 +++
  7 files changed, 157 insertions(+), 33 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/loongarch/movcf2gr-via-fr.c
  create mode 100644 gcc/testsuite/gcc.target/loongarch/movcf2gr.c

diff --git a/gcc/config/loongarch/loongarch-def.cc b/gcc/config/loongarch/loongarch-def.cc
index 4a8885e8343..843be78e46e 100644
--- a/gcc/config/loongarch/loongarch-def.cc
+++ b/gcc/config/loongarch/loongarch-def.cc
@@ -101,15 +101,21 @@ loongarch_rtx_cost_data::loongarch_rtx_cost_data ()
  int_mult_di (COSTS_N_INSNS (4)),
  int_div_si (COSTS_N_INSNS (5)),
  int_div_di (COSTS_N_INSNS (5)),
+movcf2gr (COSTS_N_INSNS (7)),
+movgr2cf (COSTS_N_INSNS (15)),
  branch_cost (6),
  memory_latency (4) {}
  
  /* The following properties cannot be looked up directly using "cpucfg".

   So it is necessary to provide a default value for "unknown native"
   tune targets (i.e. -mtune=native while PRID does not correspond to
- any known "-mtune" type).  Currently all numbers are default.  */
+ any known "-mtune" type).  */
  array_tune loongarch_cpu_rtx_cost_data =
-  array_tune ();
+  array_tune ()
+.set (CPU_LA664,
+ loongarch_rtx_cost_data ()
+   .movcf2gr_ (COSTS_N_INSNS (1))
+   .movgr2cf_ (COSTS_N_INSNS (1)));
  
  /* RTX costs to use when optimizing for size.

 We use a value slightly larger than COSTS_N_INSNS (1) for all of them
@@ -125,7 +131,8 @@ const loongarch_rtx_cost_data loongarch_rtx_cost_optimize_size =
  .int_mult_si_ (COST_COMPLEX_INSN)
  .int_mult_di_ (COST_COMPLEX_INSN)
  .int_div_si_ (COST_COMPLEX_INSN)
-.int_div_di_ (COST_COMPLEX_INSN);
+.int_div_di_ (COST_COMPLEX_INSN)
+.movcf2gr_ (COST_COMPLEX_INSN);
  
  array_tune loongarch_cpu_issue_rate = array_tune ()

.set (CPU_NATIVE, 4)
diff --git a/gcc/config/loongarch/loongarch-tune.h b/gcc/config/loongarch/loongarch-tune.h
index 4aa01c54c08..7a75c8dd9d9 100644
--- a/gcc/config/loongarch/loongarch-tune.h
+++ b/gcc/config/loongarch/loongarch-tune.h
@@ -35,6 +35,8 @@ struct loongarch_rtx_cost_data
unsigned short int_mult_di;
unsigned short int_div_si;
unsigned short int_div_di;
+  unsigned short movcf2gr;
+  unsigned short movgr2cf;
unsigned short branch_

Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible

2023-12-21 Thread YunQiang Su

> >
> >> (insn 20 19 23 2 (set (reg/v:DI 200 [ val+-4 ])
> >> (sign_extend:DI (subreg:SI (reg/v:DI 200 [ val+-4 ]) 4))) 
> >> "/app/example.cpp":7:29 -1
> >>  (nil))
>
> Haven't had chance to compile and look at it properly, but this subreg
> seems suspicious for MIPS, given the definition of TRULY_NOOP_TRUNCATION.
> We should instead use a truncdisi2 to narrow reg:DI 200 to an SI register,
> and then sign_extend it.
>
> This is easily missed in target-independent code because so few targets
> define TRULY_NOOP_TRUNCATION.
>
> Where is the subreg being generated?
>

It's from expand_assignment(tree to, tree from, bool nontemporal) in expr.cc.
to_rtx = expand_expr (tem, NULL_RTX, VOIDmode, EXPAND_WRITE);
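
The concern quoted above about TRULY_NOOP_TRUNCATION can be modeled outside the compiler. The sketch below assumes the usual MIPS64 convention that a 32-bit value lives in a 64-bit register in sign-extended form, so a plain "take the low half" subreg is not enough when the upper bits are arbitrary. The helper names are hypothetical, and the unsigned-to-signed conversion relies on two's-complement behaviour (guaranteed since C++20, and what common compilers do before that).

```cpp
#include <cassert>
#include <cstdint>

// Model of an explicit DI -> SI truncate followed by sign extension:
// keep the low 32 bits and copy bit 31 into bits 63..32.
std::uint64_t truncate_di_to_si(std::uint64_t reg) {
  auto low = static_cast<std::int32_t>(static_cast<std::uint32_t>(reg));
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(low));
}

// True if the 64-bit register image already holds a canonical
// (sign-extended) 32-bit value.
bool is_sign_extended_32(std::uint64_t reg) {
  return truncate_di_to_si(reg) == reg;
}
```

A register holding 0x0000000080000000 is not canonical under this model, which is why reading it through a bare subreg and treating it as an SImode value can give a wrong answer.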




-- 
YunQiang Su

Re: [PATCH v2] Store_bit_field_1: Use SUBREG instead of REG if possible

2023-12-21 Thread YunQiang Su

>
> Note I think Andrew's comment #7 in the PR is spot-on then, the issue
> isn't the bitfield inserts but the compare where combine elides
> the sign_extend in favor of a subreg.  That's likely some wrongdoing
> in simplify-rtx in the context of WORD_REGISTER_OPERATIONS.
>

Yes. There are 2 problems here. Either one of them can cause this problem.
1) jump_insn eats sign_extend (and truncate) in
  /* Simplify X, an IF_THEN_ELSE expression.  Return the new expression.  */
  static rtx simplify_if_then_else (rtx x)

2) The MIPS backend allows the sign_extend to be deleted.
(define_insn_and_split "extendsidi2"
  [(set (match_operand:DI 0 "register_operand" "=d,l,d")
(sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "0,0,m")))]
  "TARGET_64BIT"
  "@
   #
   #
   lw\t%0,%1"
  "&& reload_completed && register_operand (operands[1], VOIDmode)"
  [(const_int 0)]
{
  emit_note (NOTE_INSN_DELETED);
  DONE;
}
  [(set_attr "move_type" "move,move,load")
   (set_attr "mode" "DI")])

-- 
YunQiang Su

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-21 Thread chenglulu




On 2023/12/21 at 8:00 PM, chenglulu wrote:
Sorry, I've been busy with something else these two days. I don't 
think there's anything wrong with the code,


but I need to test the spec.:-)


Hi, Ruoyao:

After applying this patch, spec2006 464.h264 ref will have a 6.4% 
performance drop. So I'm going to retest it.




On 2023/12/21 at 7:56 PM, Xi Ruoyao wrote:

Ping :).

Re: [PATCH v3 0/6] RISC-V: Support XTheadVector extension

2023-12-21 Thread joshua

Hi Juzhe,
Thank you for your comprehensive comments.
Classifying the theadvector intrinsics into 3 kinds is really important for
making our patchset more organized.
For 1) and 3), I will split out the patches soon and hope they will be merged
quickly.
For 2), according to the differences between vector and xtheadvector, it can be
classified into 3 kinds.
The first is renamed load/store, renamed narrowing integer right shift, renamed
narrowing fixed-point clip, etc. I think we can use the ASM targethook to
rewrite the whole string of the instructions, although it will still be heavy
work.
The second is missing pseudo instructions like vneg/vfneg. We will add these
pseudo instructions in binutils to make xtheadvector more compatible with vector.
The third is that the destination vector register cannot overlap the source
vector register group for vmadc/vmsbc/widening arithmetic/narrowing arithmetic.
Currently I cannot come up with any better way than copying patterns. Do you
have any suggestions?
Joshua
--
From: 钟居哲
Sent: Thursday, December 21, 2023 07:04
To: "cooper.joshua"; "gcc-patches"
Cc: "jim.wilson.gcc"; palmer; andrew; "philipp.tomsich"; Jeff Law;
"Christoph Müllner"; "cooper.joshua"; jinma; Cooper Qu
Subject: Re: [PATCH v3 0/6] RISC-V: Support XTheadVector extension
Hi, Joshua.
Thanks for working hard on cleaning up the code and for the tons of work on
supporting theadvector.
After fully reviewing this patch, I understand you have 3 kinds of theadvector
intrinsics relative to the codebase of current RVV1.0 GCC.
1). instructions that can leverage all the current code of the RVV1.0
intrinsics by simply adding the "th." prefix directly.
2). instructions that leverage current MD patterns but with some tweaks and
copied patterns, since they are not simply prefixed with "th.".
3). new instructions that current RVV1.0 doesn't have, like the vlb instructions.
Overall, 1) and 3) look reasonable to me. But for 2) I need some time to figure
out a better way to do it (the current approach in this patch, copying
patterns, is not one I like).
So, I hope you can break this big patch into 3 different patch series.
1. Support the theadvector instructions that can be leveraged directly from
current RVV1.0 by simply adding the "th." prefix.
2. Support theadvector instructions that have totally different names but share
the same patterns as RVV1.0 instructions.
3. Support new theadvector instructions like vlb, etc.
I think the separate patches for 1 and 3 can be merged quickly once I have
reviewed them in more detail and approved them in the following patches you
send, like V4.
For 2, it's a bit more complicated, but I think we can handle it like ARM and
other targets do: use the ASM targethook to rewrite the whole string of the
instructions.
For example, for strided load/store, you can identify these instructions from
the attribute:
(set_attr "type" "vlds")
juzhe.zh...@rivai.ai
From: Jun Sha (Joshua) 
Date: 2023-12-20 20:20
To: gcc-patches 
CC: jim.wilson.gcc ; palmer 
; andrew ; 
philipp.tomsich ; jeffreyalaw 
; christoph.muellner 
; juzhe.zhong ; Jun Sha (Joshua) ; Jin Ma 
; Xianmiao Qu 

Subject: [PATCH v3 0/6] RISC-V: Support XTheadVector extension
This patch series presents gcc implementation of the XTheadVector
extension [1].
[1] https://github.com/T-head-Semi/thead-extension-spec/ 

For some vector patterns that cannot be avoided, we use
"!TARGET_XTHEADVECTOR" to disable them in order not to
generate instructions that xtheadvector does not support,
causing 36 changes in vector.md.
For the th. prefix issue, we use current_output_insn and
the ASM_OUTPUT_OPCODE hook instead of directly modifying
patterns in vector.md.
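
The idea of adding the prefix at output time, rather than in each pattern, can be sketched in isolation. The function below is a hypothetical stand-in written for illustration only: it is not the ASM_OUTPUT_OPCODE hook itself, and the rule "prefix every mnemonic that starts with 'v'" is an assumption, since the exact condition used by the series is not shown here.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: when the extension is enabled, prepend "th." to
// mnemonics that start with 'v'; leave everything else unchanged.
std::string add_th_prefix(const std::string &mnemonic,
                          bool xtheadvector_enabled) {
  if (xtheadvector_enabled && !mnemonic.empty() && mnemonic[0] == 'v')
    return "th." + mnemonic;
  return mnemonic;
}
```

Doing the rewrite in one place keeps the machine description patterns shared between the two vector flavours, which is the motivation given above.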
We have run the GCC test suite and can confirm that there
are no regressions.
All the test results can be found in the following links,
Run without xtheadvector:
https://gcc.gnu.org/pipermail/gcc-testresults/2023-December/803686.html 

Run with xtheadvector:
https://gcc.gnu.org/pipermail/gcc-testresults/2023-December/803687.html 

Furthermore, we have run the tests in 
https://github.com/riscv-non-isa/rvv-intrinsic-doc/tree/main/examples 
, 
and all the tests passed.
Co-authored-by: Jin Ma 
Co-authored-by: Xianmiao Qu 
Co-authored-by: Christoph Müllner 
RISC-V: Refactor riscv-vector-builtins-bases.cc
RISC-V: Split csr_operand in predicates.md for vector patterns
RISC-V: Introduce XTheadVector as a subset of V1.0.0
RISC-

[pushed] c++: computed goto from catch block [PR81438]

2023-12-21 Thread Jason Merrill

Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

As with 37722, we don't clean up the exception object if a computed goto
leaves a catch block, but we can warn about that.

PR c++/81438

gcc/cp/ChangeLog:

* decl.cc (poplevel_named_label_1): Handle leaving catch.
(check_previous_goto_1): Likewise.
(check_goto_1): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/ext/label15.C: Require indirect_jumps.
* g++.dg/ext/label16.C: New test.
---
 gcc/cp/decl.cc | 42 --
 gcc/testsuite/g++.dg/ext/label15.C |  1 +
 gcc/testsuite/g++.dg/ext/label16.C | 34 
 3 files changed, 69 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/label16.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index e044bfa6701..6b4d89e7115 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -571,10 +571,14 @@ poplevel_named_label_1 (named_label_entry **slot, cp_binding_level *bl)
if (use->binding_level == bl)
  {
if (auto &cg = use->computed_goto)
- for (tree d = use->names_in_scope; d; d = DECL_CHAIN (d))
-   if (TREE_CODE (d) == VAR_DECL && !TREE_STATIC (d)
-   && TYPE_HAS_NONTRIVIAL_DESTRUCTOR (TREE_TYPE (d)))
- vec_safe_push (cg, d);
+ {
+   if (bl->kind == sk_catch)
+ vec_safe_push (cg, get_identifier ("catch"));
+   for (tree d = use->names_in_scope; d; d = DECL_CHAIN (d))
+ if (TREE_CODE (d) == VAR_DECL && !TREE_STATIC (d)
+ && TYPE_HAS_NONTRIVIAL_DESTRUCTOR (TREE_TYPE (d)))
+   vec_safe_push (cg, d);
+ }
 
use->binding_level = obl;
use->names_in_scope = obl->names;
@@ -3820,7 +3824,12 @@ check_previous_goto_1 (tree decl, cp_binding_level* level, tree names,
   identified = 2;
   if (complained)
for (tree d : computed)
- inform (DECL_SOURCE_LOCATION (d), "  does not destroy %qD", d);
+ {
+   if (DECL_P (d))
+ inform (DECL_SOURCE_LOCATION (d), "  does not destroy %qD", d);
+   else if (d == get_identifier ("catch"))
+ inform (*locus, "  does not clean up handled exception");
+ }
 }
 
   return !identified;
@@ -3963,15 +3972,32 @@ check_goto_1 (named_label_entry *ent, bool computed)
   auto names = ent->names_in_scope;
   for (auto b = current_binding_level; ; b = b->level_chain)
{
+ if (b->kind == sk_catch)
+   {
+ if (!identified)
+   {
+ complained
+   = identify_goto (decl, DECL_SOURCE_LOCATION (decl),
+&input_location, DK_ERROR, computed);
+ identified = 2;
+   }
+ if (complained)
+   inform (input_location,
+   "  does not clean up handled exception");
+   }
  tree end = b == level ? names : NULL_TREE;
  for (tree d = b->names; d != end; d = DECL_CHAIN (d))
{
  if (TREE_CODE (d) == VAR_DECL && !TREE_STATIC (d)
  && TYPE_HAS_NONTRIVIAL_DESTRUCTOR (TREE_TYPE (d)))
{
- complained = identify_goto (decl, DECL_SOURCE_LOCATION (decl),
- &input_location, DK_ERROR,
- computed);
+ if (!identified)
+   {
+ complained
+   = identify_goto (decl, DECL_SOURCE_LOCATION (decl),
+&input_location, DK_ERROR, computed);
+ identified = 2;
+   }
  if (complained)
inform (DECL_SOURCE_LOCATION (d),
"  does not destroy %qD", d);
diff --git a/gcc/testsuite/g++.dg/ext/label15.C b/gcc/testsuite/g++.dg/ext/label15.C
index f9d6a0dd626..5a23895d52d 100644
--- a/gcc/testsuite/g++.dg/ext/label15.C
+++ b/gcc/testsuite/g++.dg/ext/label15.C
@@ -1,4 +1,5 @@
 // PR c++/37722
+// { dg-do compile { target indirect_jumps } }
 // { dg-options "" }
 
 extern "C" int printf (const char *, ...);
diff --git a/gcc/testsuite/g++.dg/ext/label16.C b/gcc/testsuite/g++.dg/ext/label16.C
new file mode 100644
index 000..ea79b6ef1fc
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/label16.C
@@ -0,0 +1,34 @@
+// PR c++/81438
+// { dg-do compile { target indirect_jumps } }
+// { dg-options "" }
+
+bool b;
+int main()
+{
+  try
+{
+  try { throw 3; }
+  catch(...) {
+  h:;  // { dg-warning "jump to label" }
+   try { throw 7; }
+   catch(...) {
+ if (b)
+   goto *&&h;  // { dg-message "computed goto" }
+   // { dg-message "handled exception" "" { target 
*-*-* }

[Committed, obvious] Testsuite: Fix failures in g++.dg/analyzer/placement-new-size.C

2023-12-21 Thread Sandra Loosemore

This testcase was failing on uses of int8_t, int64_t, etc without
including .

gcc/testsuite/ChangeLog
* g++.dg/analyzer/placement-new-size.C: Include .  Also
add missing newline to end of file.
---
 gcc/testsuite/g++.dg/analyzer/placement-new-size.C | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/g++.dg/analyzer/placement-new-size.C b/gcc/testsuite/g++.dg/analyzer/placement-new-size.C
index 75a5a159282..f6c7bd4de5f 100644
--- a/gcc/testsuite/g++.dg/analyzer/placement-new-size.C
+++ b/gcc/testsuite/g++.dg/analyzer/placement-new-size.C
@@ -2,6 +2,7 @@
 
 #include 
 #include 
+#include 
 
 extern int get_buf_size ();
 
@@ -34,4 +35,4 @@ void test_binop ()
   int32_t *i = ::new (p + 1) int32_t; /* { dg-warning "heap-based buffer overflow" } */
   *i = 42; /* { dg-warning "heap-based buffer overflow" } */
   free (p);
-}
\ No newline at end of file
+}
-- 
2.31.1

Re: [PATCH] treat argp-based mem as frame related in dse

2023-12-21 Thread Hans-Peter Nilsson

> From: Jiufu Guo 
> Date: Wed,  6 Dec 2023 17:27:58 +0800

> Hi,
> 
> The issue mentioned in PR112525 could be handled by
> updating dse.cc to treat arg_pointer_rtx similarly to frame_pointer_rtx.
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30271#c10 also mentioned
> this idea.
> 
> One thing: the argp area may be used to pass arguments to the callee. So it
> is necessary to check whether call insns are using that mem.
> 
> Bootstrap & regtest pass on ppc64{,le} and x86_64.
> Is this ok for trunk?
> 
> BR,
> Jeff (Jiufu Guo)
> 
> 
>   PR rtl-optimization/112525
> 
> gcc/ChangeLog:
> 
>   * dse.cc (get_group_info): Add arg_pointer_rtx as frame_related.
>   (check_mem_read_rtx): Add parameter to indicate if it is checking mem
>   for call insn.
>   (scan_insn): Add mem checking on call usage.

This, when committed as r14-6674-g4759383245ac97, caused all
or most tests that "throw" to fail for cris-elf at execution
time, but I don't see other targets failing (cf. gcc-testresults
archives).  I opened PR113109 and will dig a little deeper.

brgds, H-P

[PATCH] libgfortran: Bugfix if not define HAVE_ATOMIC_FETCH_ADD

2023-12-21 Thread Lipeng Zhu

This patch tries to fix the bug in the dec_waiting_unlocked function
when HAVE_ATOMIC_FETCH_ADD is not defined.
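
The two code paths involved can be illustrated with a small standalone sketch. It uses std::atomic and std::mutex in place of the __atomic builtins and gthread primitives, and the Unit struct, unit_lock, and function names are hypothetical stand-ins rather than the libgfortran definitions.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a unit with a waiter count, kept in two forms
// so both paths can be shown side by side.
struct Unit {
  std::atomic<int> waiting_atomic{0};
  int waiting_plain = 0;
};

std::mutex unit_lock; // stand-in for the global unit lock

// Path taken when an atomic fetch-and-add is available: no lock needed.
void dec_waiting_atomic(Unit &u) {
  u.waiting_atomic.fetch_sub(1, std::memory_order_relaxed);
}

// Fallback path: protect the plain decrement with the lock.
void dec_waiting_locked(Unit &u) {
  std::lock_guard<std::mutex> guard(unit_lock);
  --u.waiting_plain;
}
```

Both paths must leave the counter in the same state; the fallback only differs in how it achieves mutual exclusion.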

libgfortran/ChangeLog:

* io/io.h (dec_waiting_unlocked): Use
__gthread_rwlock_wrlock/__gthread_rwlock_unlock or
__gthread_mutex_lock/__gthread_mutex_unlock functions
to replace WRLOCK and RWUNLOCK macros.

Signed-off-by: Lipeng Zhu 
---
 libgfortran/io/io.h | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/libgfortran/io/io.h b/libgfortran/io/io.h
index 15daa0995b1..c7f0f7d7d9e 100644
--- a/libgfortran/io/io.h
+++ b/libgfortran/io/io.h
@@ -1020,9 +1020,15 @@ dec_waiting_unlocked (gfc_unit *u)
 #ifdef HAVE_ATOMIC_FETCH_ADD
   (void) __atomic_fetch_add (&u->waiting, -1, __ATOMIC_RELAXED);
 #else
-  WRLOCK (&unit_rwlock);
+#ifdef __GTHREAD_RWLOCK_INIT
+  __gthread_rwlock_wrlock (&unit_rwlock);
+  u->waiting--;
+  __gthread_rwlock_unlock (&unit_rwlock);
+#else
+  __gthread_mutex_lock (&unit_rwlock);
   u->waiting--;
-  RWUNLOCK (&unit_rwlock);
+  __gthread_mutex_unlock (&unit_rwlock);
+#endif
 #endif
 }
 
-- 
2.39.3

Re: [PATCH] RISC-V: Add crypto machine descriptions

2023-12-21 Thread Feng Wang

2023-12-22 09:59 Feng Wang  wrote:

Sorry for forgetting to add the patch version number. It should be
[PATCH v8 2/3]

>Patch v8: Remove unused iterator and add newline at the end.
>Patch v7: Remove mode of const_int_operand and typo. Add
>  newline at the end and comment at the beginning.
>Patch v6: Swap the operator order of vandn.vv
>Patch v5: Add vec_duplicate operator.
>Patch v4: Add process of SEW=64 in RV32 system.
>Patch v3: Modify constraints for crypto vector.
>Patch v2: Add crypto vector ins into RATIO attr and use vr as
>destination register.
>
>This patch adds the crypto machine descriptions (vector-crypto.md) and
>some new iterators which are used by crypto vector ext.
>
>Co-Authored by: Songhe Zhu 
>Co-Authored by: Ciyan Pan 
>gcc/ChangeLog:
>
>   * config/riscv/iterators.md: Add rotate insn name.
>   * config/riscv/riscv.md: Add new insns name for crypto vector.
>   * config/riscv/vector-iterators.md: Add new iterators for crypto vector.
>   * config/riscv/vector.md: Add the corresponding attr for crypto vector.
>   * config/riscv/vector-crypto.md: New file. The machine descriptions for
>   crypto vector.
>---
> gcc/config/riscv/iterators.md        |   4 +-
> gcc/config/riscv/riscv.md            |  33 +-
> gcc/config/riscv/vector-crypto.md    | 654 +++
> gcc/config/riscv/vector-iterators.md |  36 ++
> gcc/config/riscv/vector.md           |  55 ++-
> 5 files changed, 761 insertions(+), 21 deletions(-)
> create mode 100755 gcc/config/riscv/vector-crypto.md
>
>diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
>index ecf033f2fa7..f332fba7031 100644
>--- a/gcc/config/riscv/iterators.md
>+++ b/gcc/config/riscv/iterators.md
>@@ -304,7 +304,9 @@
>(umax "maxu")
>(clz "clz")
>(ctz "ctz")
>-   (popcount "cpop")])
>+   (popcount "cpop")
>+   (rotate "rol")
>+   (rotatert "ror")])
>
> ;; ---
> ;; Int Iterators.
>diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
>index ee8b71c22aa..88019a46a53 100644
>--- a/gcc/config/riscv/riscv.md
>+++ b/gcc/config/riscv/riscv.md
>@@ -427,6 +427,34 @@
> ;; vcompress   vector compress instruction
> ;; vmov        whole vector register move
> ;; vector      unknown vector instruction
>+;; 17. Crypto Vector instructions
>+;; vandn       crypto vector bitwise and-not instructions
>+;; vbrev       crypto vector reverse bits in elements instructions
>+;; vbrev8      crypto vector reverse bits in bytes instructions
>+;; vrev8       crypto vector reverse bytes instructions
>+;; vclz        crypto vector count leading Zeros instructions
>+;; vctz        crypto vector count lrailing Zeros instructions
>+;; vrol        crypto vector rotate left instructions
>+;; vror        crypto vector rotate right instructions
>+;; vwsll       crypto vector widening shift left logical instructions
>+;; vclmul      crypto vector carry-less multiply - return low half instructions
>+;; vclmulh     crypto vector carry-less multiply - return high half instructions
>+;; vghsh       crypto vector add-multiply over GHASH Galois-Field instructions
>+;; vgmul       crypto vector multiply over GHASH Galois-Field instrumctions
>+;; vaesef      crypto vector AES final-round encryption instructions
>+;; vaesem      crypto vector AES middle-round encryption instructions
>+;; vaesdf      crypto vector AES final-round decryption instructions
>+;; vaesdm      crypto vector AES middle-round decryption instructions
>+;; vaeskf1     crypto vector AES-128 Forward KeySchedule generation instructions
>+;; vaeskf2     crypto vector AES-256 Forward KeySchedule generation instructions
>+;; vaesz       crypto vector AES round zero encryption/decryption instructions
>+;; vsha2ms     crypto vector SHA-2 message schedule instructions
>+;; vsha2ch     crypto vector SHA-2 two rounds of compression instructions
>+;; vsha2cl     crypto vector SHA-2 two rounds of compression instructions
>+;; vsm4k       crypto vector SM4 KeyExpansion instructions
>+;; vsm4r       crypto vector SM4 Rounds instructions
>+;; vsm3me      crypto vector SM3 Message Expansion instructions
>+;; vsm3c       crypto vector SM3 Compression instructions
> (define_attr "type"
>   "unknown,branch,jump,jalr,ret,call,load,fpload,store,fpstore,
>    mtc,mfc,const,arith,logical,shift,slt,imul,idiv,move,fmove,fadd,fmul,
>@@ -446,7 +474,9 @@
>    vired,viwred,vfredu,vfredo,vfwredu,vfwredo,
>    vmalu,vmpop,vmffs

Re: [PATCH] RISC-V: Add crypto machine descriptions

2023-12-21 Thread juzhe.zh...@rivai.ai

The machine description part is OK from my side.

But I don't know the plan for vector crypto.

I'd like to wait for Kito or Jeff to make sure we allow vector-crypto
intrinsics as part of the GCC-14 release.

Thanks.


juzhe.zh...@rivai.ai
 
From: Feng Wang
Date: 2023-12-22 09:59
To: gcc-patches
CC: kito.cheng; jeffreyalaw; juzhe.zhong; Feng Wang
Subject: [PATCH] RISC-V: Add crypto machine descriptions
Patch v8: Remove unused iterator and add newline at the end.
Patch v7: Remove mode of const_int_operand and typo. Add
  newline at the end and comment at the beginning.
Patch v6: Swap the operator order of vandn.vv
Patch v5: Add vec_duplicate operator.
Patch v4: Add process of SEW=64 in RV32 system.
Patch v3: Modify constraints for crypto vector.
Patch v2: Add crypto vector ins into RATIO attr and use vr as
destination register.
 
This patch adds the crypto machine descriptions (vector-crypto.md) and
some new iterators which are used by crypto vector ext.
 
Co-Authored by: Songhe Zhu 
Co-Authored by: Ciyan Pan 
gcc/ChangeLog:
 
* config/riscv/iterators.md: Add rotate insn name.
* config/riscv/riscv.md: Add new insns name for crypto vector.
* config/riscv/vector-iterators.md: Add new iterators for crypto vector.
* config/riscv/vector.md: Add the corresponding attr for crypto vector.
* config/riscv/vector-crypto.md: New file. The machine descriptions for
crypto vector.
---
gcc/config/riscv/iterators.md|   4 +-
gcc/config/riscv/riscv.md|  33 +-
gcc/config/riscv/vector-crypto.md| 654 +++
gcc/config/riscv/vector-iterators.md |  36 ++
gcc/config/riscv/vector.md   |  55 ++-
5 files changed, 761 insertions(+), 21 deletions(-)
create mode 100755 gcc/config/riscv/vector-crypto.md
 
diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index ecf033f2fa7..f332fba7031 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -304,7 +304,9 @@
(umax "maxu")
(clz "clz")
(ctz "ctz")
- (popcount "cpop")])
+ (popcount "cpop")
+ (rotate "rol")
+ (rotatert "ror")])
;; ---
;; Int Iterators.
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index ee8b71c22aa..88019a46a53 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -427,6 +427,34 @@
;; vcompress   vector compress instruction
;; vmov        whole vector register move
;; vector      unknown vector instruction
+;; 17. Crypto Vector instructions
+;; vandn       crypto vector bitwise and-not instructions
+;; vbrev       crypto vector reverse bits in elements instructions
+;; vbrev8      crypto vector reverse bits in bytes instructions
+;; vrev8       crypto vector reverse bytes instructions
+;; vclz        crypto vector count leading Zeros instructions
+;; vctz        crypto vector count trailing Zeros instructions
+;; vrol        crypto vector rotate left instructions
+;; vror        crypto vector rotate right instructions
+;; vwsll       crypto vector widening shift left logical instructions
+;; vclmul      crypto vector carry-less multiply - return low half instructions
+;; vclmulh     crypto vector carry-less multiply - return high half instructions
+;; vghsh       crypto vector add-multiply over GHASH Galois-Field instructions
+;; vgmul       crypto vector multiply over GHASH Galois-Field instructions
+;; vaesef      crypto vector AES final-round encryption instructions
+;; vaesem      crypto vector AES middle-round encryption instructions
+;; vaesdf      crypto vector AES final-round decryption instructions
+;; vaesdm      crypto vector AES middle-round decryption instructions
+;; vaeskf1     crypto vector AES-128 Forward KeySchedule generation instructions
+;; vaeskf2     crypto vector AES-256 Forward KeySchedule generation instructions
+;; vaesz       crypto vector AES round zero encryption/decryption instructions
+;; vsha2ms     crypto vector SHA-2 message schedule instructions
+;; vsha2ch     crypto vector SHA-2 two rounds of compression instructions
+;; vsha2cl     crypto vector SHA-2 two rounds of compression instructions
+;; vsm4k       crypto vector SM4 KeyExpansion instructions
+;; vsm4r       crypto vector SM4 Rounds instructions
+;; vsm3me      crypto vector SM3 Message Expansion instructions
+;; vsm3c       crypto vector SM3 Compression instructions
(define_attr "type"
   "unknown,branch,jump,jalr,ret,call,load,fpload,store,fpstore,
mtc,mfc,const,arith,logical,shift,slt,imul,idiv,move,fmove,fadd,fmul,
@@ -446,7 +474,9 @@
vired,viwred,vfredu,vfredo,vfwredu,vfwredo,
vmalu,vmpop,vmffs,vmsfs,vmiota,vmidx,vimovvx,vimovxv,vfmovvf,vfmovfv,
vslideup,vslidedown,vislide1up,vislide1down,vfslide1up,vfslide1down,
-   vgather,vcompress,vmov,vector"
+   vgather,vcompress,vmov,vector,vandn,vbrev,vbrev8,vrev8,vclz,vctz,vcpop,vrol,vror,vwsll,
+   vclmul,vclmulh,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaeskf1,vaeskf2,vaesz,
+   vsha2m

[PATCH] RISC-V: Add crypto machine descriptions

2023-12-21 Thread Feng Wang

Patch v8: Remove unused iterator and add newline at the end.
Patch v7: Remove mode of const_int_operand and typo. Add
  newline at the end and comment at the beginning.
Patch v6: Swap the operator order of vandn.vv
Patch v5: Add vec_duplicate operator.
Patch v4: Add process of SEW=64 in RV32 system.
Patch v3: Modify constraints for crypto vector.
Patch v2: Add crypto vector ins into RATIO attr and use vr as
destination register.

This patch adds the crypto machine descriptions (vector-crypto.md) and
some new iterators which are used by the crypto vector extension.

Co-Authored by: Songhe Zhu 
Co-Authored by: Ciyan Pan 
gcc/ChangeLog:

* config/riscv/iterators.md: Add rotate insn name.
* config/riscv/riscv.md: Add new insns name for crypto vector.
* config/riscv/vector-iterators.md: Add new iterators for crypto vector.
* config/riscv/vector.md: Add the corresponding attr for crypto vector.
* config/riscv/vector-crypto.md: New file. The machine descriptions for
crypto vector.
---
 gcc/config/riscv/iterators.md|   4 +-
 gcc/config/riscv/riscv.md|  33 +-
 gcc/config/riscv/vector-crypto.md| 654 +++
 gcc/config/riscv/vector-iterators.md |  36 ++
 gcc/config/riscv/vector.md   |  55 ++-
 5 files changed, 761 insertions(+), 21 deletions(-)
 create mode 100755 gcc/config/riscv/vector-crypto.md

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index ecf033f2fa7..f332fba7031 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -304,7 +304,9 @@
 (umax "maxu")
 (clz "clz")
 (ctz "ctz")
-(popcount "cpop")])
+(popcount "cpop")
+(rotate "rol")
+(rotatert "ror")])
 
 ;; ---
 ;; Int Iterators.
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index ee8b71c22aa..88019a46a53 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -427,6 +427,34 @@
 ;; vcompress   vector compress instruction
 ;; vmov        whole vector register move
 ;; vector      unknown vector instruction
+;; 17. Crypto Vector instructions
+;; vandn       crypto vector bitwise and-not instructions
+;; vbrev       crypto vector reverse bits in elements instructions
+;; vbrev8      crypto vector reverse bits in bytes instructions
+;; vrev8       crypto vector reverse bytes instructions
+;; vclz        crypto vector count leading Zeros instructions
+;; vctz        crypto vector count trailing Zeros instructions
+;; vrol        crypto vector rotate left instructions
+;; vror        crypto vector rotate right instructions
+;; vwsll       crypto vector widening shift left logical instructions
+;; vclmul      crypto vector carry-less multiply - return low half instructions
+;; vclmulh     crypto vector carry-less multiply - return high half instructions
+;; vghsh       crypto vector add-multiply over GHASH Galois-Field instructions
+;; vgmul       crypto vector multiply over GHASH Galois-Field instructions
+;; vaesef      crypto vector AES final-round encryption instructions
+;; vaesem      crypto vector AES middle-round encryption instructions
+;; vaesdf      crypto vector AES final-round decryption instructions
+;; vaesdm      crypto vector AES middle-round decryption instructions
+;; vaeskf1     crypto vector AES-128 Forward KeySchedule generation instructions
+;; vaeskf2     crypto vector AES-256 Forward KeySchedule generation instructions
+;; vaesz       crypto vector AES round zero encryption/decryption instructions
+;; vsha2ms     crypto vector SHA-2 message schedule instructions
+;; vsha2ch     crypto vector SHA-2 two rounds of compression instructions
+;; vsha2cl     crypto vector SHA-2 two rounds of compression instructions
+;; vsm4k       crypto vector SM4 KeyExpansion instructions
+;; vsm4r       crypto vector SM4 Rounds instructions
+;; vsm3me      crypto vector SM3 Message Expansion instructions
+;; vsm3c       crypto vector SM3 Compression instructions
 (define_attr "type"
   "unknown,branch,jump,jalr,ret,call,load,fpload,store,fpstore,
mtc,mfc,const,arith,logical,shift,slt,imul,idiv,move,fmove,fadd,fmul,
@@ -446,7 +474,9 @@
vired,viwred,vfredu,vfredo,vfwredu,vfwredo,
vmalu,vmpop,vmffs,vmsfs,vmiota,vmidx,vimovvx,vimovxv,vfmovvf,vfmovfv,
vslideup,vslidedown,vislide1up,vislide1down,vfslide1up,vfslide1down,
-   vgather,vcompress,vmov,vector"
+   vgather,vcompress,vmov,vector,vandn,vbrev,vbrev8,vrev8,vclz,vctz,vcpop,vrol,vror,vwsll,
+   vclmul,vclmulh,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaeskf1,vaeskf2,vaesz,
+   vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r,vsm3me,vsm3c"
   (cond [(eq_attr "got" "load") (const_string "load")
 
 ;; If a do
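
The iterators.md hunk in this patch maps the RTL codes rotate and rotatert to the mnemonic fragments "rol" and "ror". As a reminder of what the two codes compute, here is a minimal C++ sketch for 32-bit values. The helper names rol32 and ror32 are made up for illustration and do not appear in the patch, and reducing the rotate count modulo the element width is an assumption of this sketch.

```cpp
#include <cstdint>

// Rotate left: bits shifted out at the top re-enter at the bottom.
// The count is reduced modulo 32 (an assumption of this sketch).
inline std::uint32_t rol32(std::uint32_t x, unsigned n) {
    n &= 31u;
    // Guard n == 0 so we never shift by the full width (undefined in C++).
    return n == 0 ? x : (x << n) | (x >> (32u - n));
}

// Rotate right: bits shifted out at the bottom re-enter at the top.
inline std::uint32_t ror32(std::uint32_t x, unsigned n) {
    n &= 31u;
    return n == 0 ? x : (x >> n) | (x << (32u - n));
}
```

Rotating left and then right by the same count gives back the input, which is a quick sanity check for any pattern built on these two codes.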

Re: [PATCH v7 2/3] RISC-V: Add crypto machine descriptions

2023-12-21 Thread juzhe.zh...@rivai.ai

Also, the copyright line is incorrect:
+;; Copyright (C) 2022-23 Free Software Foundation, Inc.

It should be:
Copyright (C) 2023 Free Software Foundation, Inc.



juzhe.zh...@rivai.ai
 
From: Feng Wang
Date: 2023-12-22 09:38
To: gcc-patches
CC: kito.cheng; jeffreyalaw; juzhe.zhong; Feng Wang
Subject: [PATCH v7 2/3] RISC-V: Add crypto machine descriptions
Patch v7: Remove mode of const_int_operand and typo. Add
  newline at the end and comment at the beginning.
Patch v6: Swap the operator order of vandn.vv
Patch v5: Add vec_duplicate operator.
Patch v4: Add process of SEW=64 in RV32 system.
Patch v3: Modify constraints for crypto vector.
Patch v2: Add crypto vector ins into RATIO attr and use vr as
destination register.

This patch adds the crypto machine descriptions (vector-crypto.md) and
some new iterators which are used by the crypto vector extension.
 
Co-Authored by: Songhe Zhu 
Co-Authored by: Ciyan Pan 
gcc/ChangeLog:
 
* config/riscv/iterators.md: Add rotate insn name.
* config/riscv/riscv.md: Add new insns name for crypto vector.
* config/riscv/vector-iterators.md: Add new iterators for crypto vector.
* config/riscv/vector.md: Add the corresponding attr for crypto vector.
* config/riscv/vector-crypto.md: New file. The machine descriptions for
crypto vector.
---
gcc/config/riscv/iterators.md|   4 +-
gcc/config/riscv/riscv.md|  33 +-
gcc/config/riscv/vector-crypto.md| 654 +++
gcc/config/riscv/vector-iterators.md |  41 ++
gcc/config/riscv/vector.md   |  55 ++-
5 files changed, 766 insertions(+), 21 deletions(-)
create mode 100755 gcc/config/riscv/vector-crypto.md
 
diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index ecf033f2fa7..f332fba7031 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -304,7 +304,9 @@
(umax "maxu")
(clz "clz")
(ctz "ctz")
- (popcount "cpop")])
+ (popcount "cpop")
+ (rotate "rol")
+ (rotatert "ror")])
;; ---
;; Int Iterators.
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index ee8b71c22aa..88019a46a53 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -427,6 +427,34 @@
;; vcompress   vector compress instruction
;; vmov        whole vector register move
;; vector      unknown vector instruction
+;; 17. Crypto Vector instructions
+;; vandn       crypto vector bitwise and-not instructions
+;; vbrev       crypto vector reverse bits in elements instructions
+;; vbrev8      crypto vector reverse bits in bytes instructions
+;; vrev8       crypto vector reverse bytes instructions
+;; vclz        crypto vector count leading Zeros instructions
+;; vctz        crypto vector count trailing Zeros instructions
+;; vrol        crypto vector rotate left instructions
+;; vror        crypto vector rotate right instructions
+;; vwsll       crypto vector widening shift left logical instructions
+;; vclmul      crypto vector carry-less multiply - return low half instructions
+;; vclmulh     crypto vector carry-less multiply - return high half instructions
+;; vghsh       crypto vector add-multiply over GHASH Galois-Field instructions
+;; vgmul       crypto vector multiply over GHASH Galois-Field instructions
+;; vaesef      crypto vector AES final-round encryption instructions
+;; vaesem      crypto vector AES middle-round encryption instructions
+;; vaesdf      crypto vector AES final-round decryption instructions
+;; vaesdm      crypto vector AES middle-round decryption instructions
+;; vaeskf1     crypto vector AES-128 Forward KeySchedule generation instructions
+;; vaeskf2     crypto vector AES-256 Forward KeySchedule generation instructions
+;; vaesz       crypto vector AES round zero encryption/decryption instructions
+;; vsha2ms     crypto vector SHA-2 message schedule instructions
+;; vsha2ch     crypto vector SHA-2 two rounds of compression instructions
+;; vsha2cl     crypto vector SHA-2 two rounds of compression instructions
+;; vsm4k       crypto vector SM4 KeyExpansion instructions
+;; vsm4r       crypto vector SM4 Rounds instructions
+;; vsm3me      crypto vector SM3 Message Expansion instructions
+;; vsm3c       crypto vector SM3 Compression instructions
(define_attr "type"
   "unknown,branch,jump,jalr,ret,call,load,fpload,store,fpstore,
mtc,mfc,const,arith,logical,shift,slt,imul,idiv,move,fmove,fadd,fmul,
@@ -446,7 +474,9 @@
vired,viwred,vfredu,vfredo,vfwredu,vfwredo,
vmalu,vmpop,vmffs,vmsfs,vmiota,vmidx,vimovvx,vimovxv,vfmovvf,vfmovfv,
vslideup,vslidedown,vislide1up,vislide1down,vfslide1up,vfslide1down,
-   vgather,vcompress,vmov,vector"
+   vgather,vcompress,vmov,vector,vandn,vbrev,vbrev8,vrev8,vclz,vctz,vcpop,vrol,vror,vwsll,
+   vclmul,vclmulh,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaeskf1,vaeskf2,vaesz,
+   vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r,vsm3me,vsm3c"
   (cond [(eq_attr "got" "load") (const_string "load")
;;

Re: [PATCH v7 2/3] RISC-V: Add crypto machine descriptions

2023-12-21 Thread juzhe.zh...@rivai.ai

\ No newline at end of file

There is still no trailing newline in vector-iterators.md.


juzhe.zh...@rivai.ai
 
From: Feng Wang
Date: 2023-12-22 09:38
To: gcc-patches
CC: kito.cheng; jeffreyalaw; juzhe.zhong; Feng Wang
Subject: [PATCH v7 2/3] RISC-V: Add crypto machine descriptions
Patch v7: Remove mode of const_int_operand and typo. Add
  newline at the end and comment at the beginning.
Patch v6: Swap the operator order of vandn.vv
Patch v5: Add vec_duplicate operator.
Patch v4: Add process of SEW=64 in RV32 system.
Patch v3: Modify constraints for crypto vector.
Patch v2: Add crypto vector ins into RATIO attr and use vr as
destination register.

This patch adds the crypto machine descriptions (vector-crypto.md) and
some new iterators which are used by the crypto vector extension.
 
Co-Authored by: Songhe Zhu 
Co-Authored by: Ciyan Pan 
gcc/ChangeLog:
 
* config/riscv/iterators.md: Add rotate insn name.
* config/riscv/riscv.md: Add new insns name for crypto vector.
* config/riscv/vector-iterators.md: Add new iterators for crypto vector.
* config/riscv/vector.md: Add the corresponding attr for crypto vector.
* config/riscv/vector-crypto.md: New file. The machine descriptions for
crypto vector.
---
gcc/config/riscv/iterators.md|   4 +-
gcc/config/riscv/riscv.md|  33 +-
gcc/config/riscv/vector-crypto.md| 654 +++
gcc/config/riscv/vector-iterators.md |  41 ++
gcc/config/riscv/vector.md   |  55 ++-
5 files changed, 766 insertions(+), 21 deletions(-)
create mode 100755 gcc/config/riscv/vector-crypto.md
 
diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index ecf033f2fa7..f332fba7031 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -304,7 +304,9 @@
(umax "maxu")
(clz "clz")
(ctz "ctz")
- (popcount "cpop")])
+ (popcount "cpop")
+ (rotate "rol")
+ (rotatert "ror")])
;; ---
;; Int Iterators.
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index ee8b71c22aa..88019a46a53 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -427,6 +427,34 @@
;; vcompress   vector compress instruction
;; vmov        whole vector register move
;; vector      unknown vector instruction
+;; 17. Crypto Vector instructions
+;; vandn       crypto vector bitwise and-not instructions
+;; vbrev       crypto vector reverse bits in elements instructions
+;; vbrev8      crypto vector reverse bits in bytes instructions
+;; vrev8       crypto vector reverse bytes instructions
+;; vclz        crypto vector count leading Zeros instructions
+;; vctz        crypto vector count trailing Zeros instructions
+;; vrol        crypto vector rotate left instructions
+;; vror        crypto vector rotate right instructions
+;; vwsll       crypto vector widening shift left logical instructions
+;; vclmul      crypto vector carry-less multiply - return low half instructions
+;; vclmulh     crypto vector carry-less multiply - return high half instructions
+;; vghsh       crypto vector add-multiply over GHASH Galois-Field instructions
+;; vgmul       crypto vector multiply over GHASH Galois-Field instructions
+;; vaesef      crypto vector AES final-round encryption instructions
+;; vaesem      crypto vector AES middle-round encryption instructions
+;; vaesdf      crypto vector AES final-round decryption instructions
+;; vaesdm      crypto vector AES middle-round decryption instructions
+;; vaeskf1     crypto vector AES-128 Forward KeySchedule generation instructions
+;; vaeskf2     crypto vector AES-256 Forward KeySchedule generation instructions
+;; vaesz       crypto vector AES round zero encryption/decryption instructions
+;; vsha2ms     crypto vector SHA-2 message schedule instructions
+;; vsha2ch     crypto vector SHA-2 two rounds of compression instructions
+;; vsha2cl     crypto vector SHA-2 two rounds of compression instructions
+;; vsm4k       crypto vector SM4 KeyExpansion instructions
+;; vsm4r       crypto vector SM4 Rounds instructions
+;; vsm3me      crypto vector SM3 Message Expansion instructions
+;; vsm3c       crypto vector SM3 Compression instructions
(define_attr "type"
   "unknown,branch,jump,jalr,ret,call,load,fpload,store,fpstore,
mtc,mfc,const,arith,logical,shift,slt,imul,idiv,move,fmove,fadd,fmul,
@@ -446,7 +474,9 @@
vired,viwred,vfredu,vfredo,vfwredu,vfwredo,
vmalu,vmpop,vmffs,vmsfs,vmiota,vmidx,vimovvx,vimovxv,vfmovvf,vfmovfv,
vslideup,vslidedown,vislide1up,vislide1down,vfslide1up,vfslide1down,
-   vgather,vcompress,vmov,vector"
+   vgather,vcompress,vmov,vector,vandn,vbrev,vbrev8,vrev8,vclz,vctz,vcpop,vrol,vror,vwsll,
+   vclmul,vclmulh,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaeskf1,vaeskf2,vaesz,
+   vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r,vsm3me,vsm3c"
   (cond [(eq_attr "got" "load") (const_string "load")
;; If a doubleword move uses these expensive instructions,
@@ -3777,6 +3807,7 @@
(include "

[PATCH v7 2/3] RISC-V: Add crypto machine descriptions

2023-12-21 Thread Feng Wang

Patch v7: Remove mode of const_int_operand and typo. Add
  newline at the end and comment at the beginning.
Patch v6: Swap the operator order of vandn.vv
Patch v5: Add vec_duplicate operator.
Patch v4: Add process of SEW=64 in RV32 system.
Patch v3: Modify constraints for crypto vector.
Patch v2: Add crypto vector ins into RATIO attr and use vr as
destination register.

This patch adds the crypto machine descriptions (vector-crypto.md) and
some new iterators which are used by the crypto vector extension.

Co-Authored by: Songhe Zhu 
Co-Authored by: Ciyan Pan 
gcc/ChangeLog:

* config/riscv/iterators.md: Add rotate insn name.
* config/riscv/riscv.md: Add new insns name for crypto vector.
* config/riscv/vector-iterators.md: Add new iterators for crypto vector.
* config/riscv/vector.md: Add the corresponding attr for crypto vector.
* config/riscv/vector-crypto.md: New file. The machine descriptions for
crypto vector.
---
 gcc/config/riscv/iterators.md|   4 +-
 gcc/config/riscv/riscv.md|  33 +-
 gcc/config/riscv/vector-crypto.md| 654 +++
 gcc/config/riscv/vector-iterators.md |  41 ++
 gcc/config/riscv/vector.md   |  55 ++-
 5 files changed, 766 insertions(+), 21 deletions(-)
 create mode 100755 gcc/config/riscv/vector-crypto.md

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index ecf033f2fa7..f332fba7031 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -304,7 +304,9 @@
 (umax "maxu")
 (clz "clz")
 (ctz "ctz")
-(popcount "cpop")])
+(popcount "cpop")
+(rotate "rol")
+(rotatert "ror")])
 
 ;; ---
 ;; Int Iterators.
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index ee8b71c22aa..88019a46a53 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -427,6 +427,34 @@
 ;; vcompress   vector compress instruction
 ;; vmov        whole vector register move
 ;; vector      unknown vector instruction
+;; 17. Crypto Vector instructions
+;; vandn       crypto vector bitwise and-not instructions
+;; vbrev       crypto vector reverse bits in elements instructions
+;; vbrev8      crypto vector reverse bits in bytes instructions
+;; vrev8       crypto vector reverse bytes instructions
+;; vclz        crypto vector count leading Zeros instructions
+;; vctz        crypto vector count trailing Zeros instructions
+;; vrol        crypto vector rotate left instructions
+;; vror        crypto vector rotate right instructions
+;; vwsll       crypto vector widening shift left logical instructions
+;; vclmul      crypto vector carry-less multiply - return low half instructions
+;; vclmulh     crypto vector carry-less multiply - return high half instructions
+;; vghsh       crypto vector add-multiply over GHASH Galois-Field instructions
+;; vgmul       crypto vector multiply over GHASH Galois-Field instructions
+;; vaesef      crypto vector AES final-round encryption instructions
+;; vaesem      crypto vector AES middle-round encryption instructions
+;; vaesdf      crypto vector AES final-round decryption instructions
+;; vaesdm      crypto vector AES middle-round decryption instructions
+;; vaeskf1     crypto vector AES-128 Forward KeySchedule generation instructions
+;; vaeskf2     crypto vector AES-256 Forward KeySchedule generation instructions
+;; vaesz       crypto vector AES round zero encryption/decryption instructions
+;; vsha2ms     crypto vector SHA-2 message schedule instructions
+;; vsha2ch     crypto vector SHA-2 two rounds of compression instructions
+;; vsha2cl     crypto vector SHA-2 two rounds of compression instructions
+;; vsm4k       crypto vector SM4 KeyExpansion instructions
+;; vsm4r       crypto vector SM4 Rounds instructions
+;; vsm3me      crypto vector SM3 Message Expansion instructions
+;; vsm3c       crypto vector SM3 Compression instructions
 (define_attr "type"
   "unknown,branch,jump,jalr,ret,call,load,fpload,store,fpstore,
mtc,mfc,const,arith,logical,shift,slt,imul,idiv,move,fmove,fadd,fmul,
@@ -446,7 +474,9 @@
vired,viwred,vfredu,vfredo,vfwredu,vfwredo,
vmalu,vmpop,vmffs,vmsfs,vmiota,vmidx,vimovvx,vimovxv,vfmovvf,vfmovfv,
vslideup,vslidedown,vislide1up,vislide1down,vfslide1up,vfslide1down,
-   vgather,vcompress,vmov,vector"
+   vgather,vcompress,vmov,vector,vandn,vbrev,vbrev8,vrev8,vclz,vctz,vcpop,vrol,vror,vwsll,
+   vclmul,vclmulh,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaeskf1,vaeskf2,vaesz,
+   vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r,vsm3me,vsm3c"
   (cond [(eq_attr "got" "load") (const_string "load")
 
 ;; If a doubleword move uses these expensive instructions,
@@ -3777,6 +
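
The type comments added to riscv.md describe several bit-manipulation instructions only by name. The following C++ sketch models what a single 32-bit element would compute for vandn, vbrev8, vrev8, vbrev, vclz and vctz. All function names are hypothetical. The operand order of andn32 (vs2 AND NOT vs1) and the result of 32 for a zero input to clz32/ctz32 are assumptions based on one reading of the RISC-V vector crypto specification; nothing in this thread states them.

```cpp
#include <cstdint>

// and-not: keep the bits of vs2 that are clear in vs1 (assumed operand order).
inline std::uint32_t andn32(std::uint32_t vs2, std::uint32_t vs1) {
    return vs2 & ~vs1;
}

// Reverse the bit order inside each byte; bytes stay where they are.
inline std::uint32_t brev8_32(std::uint32_t x) {
    x = ((x & 0x55555555u) << 1) | ((x >> 1) & 0x55555555u);  // swap adjacent bits
    x = ((x & 0x33333333u) << 2) | ((x >> 2) & 0x33333333u);  // swap bit pairs
    x = ((x & 0x0F0F0F0Fu) << 4) | ((x >> 4) & 0x0F0F0F0Fu);  // swap nibbles
    return x;
}

// Reverse the byte order of the element.
inline std::uint32_t rev8_32(std::uint32_t x) {
    return (x << 24) | ((x & 0x0000FF00u) << 8) |
           ((x >> 8) & 0x0000FF00u) | (x >> 24);
}

// Reverse all bits of the element: bits within bytes, then the bytes.
inline std::uint32_t brev_32(std::uint32_t x) {
    return rev8_32(brev8_32(x));
}

// Count leading zero bits; yields 32 for x == 0 (assumed behaviour).
inline unsigned clz32(std::uint32_t x) {
    unsigned n = 0;
    for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1)
        ++n;
    return n;
}

// Count trailing zero bits; yields 32 for x == 0 (assumed behaviour).
inline unsigned ctz32(std::uint32_t x) {
    unsigned n = 0;
    for (std::uint32_t mask = 1u; mask != 0 && (x & mask) == 0; mask <<= 1)
        ++n;
    return n;
}
```

Because and-not is not commutative, a scalar model like andn32 is one way to cross-check an operand-order change such as the one listed under "Patch v6".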

Re: [PATCH v6 2/3] RISC-V: Add crypto machine descriptions

2023-12-21 Thread juzhe.zh...@rivai.ai

+  (match_operand: 3 "const_int_operand"" i,  i")] UNSPEC_CRYPTO_VI)
+   (match_operand: 3 "const_int_operand" " i")] UNSPEC_CRYPTO_VI1)

I don't think we need to specify the mode for const_int_operand; the other RVV
intrinsic patterns leave it out:

+  (match_operand 5 "const_int_operand" "  i,  i")
+  (match_operand 6 "const_int_operand" "  i,  i")

\ No newline at end of file
Each file needs a newline.


+   (match_operand:VSI 1 "vector_merge_operand" " svu, 0")))]
What is "svu" ?





juzhe.zh...@rivai.ai
 
From: Feng Wang
Date: 2023-12-22 08:59
To: gcc-patches
CC: kito.cheng; jeffreyalaw; juzhe.zhong; Feng Wang
Subject: [PATCH v6 2/3] RISC-V: Add crypto machine descriptions
Patch v6: Swap the operator order of vandn.vv. "make report" for riscv.exp with
  "riscv-sim/-march=rv64gc/-mabi=lp64d/-mcmodel=medlow" passes.
Patch v5: Add vec_duplicate operator.
Patch v4: Add process of SEW=64 in RV32 system.
Patch v3: Modify constraints for crypto vector.
Patch v2: Add crypto vector ins into RATIO attr and use vr as
destination register.

This patch adds the crypto machine descriptions (vector-crypto.md) and
some new iterators which are used by the crypto vector extension.
 
Co-Authored by: Songhe Zhu 
Co-Authored by: Ciyan Pan 
 
gcc/ChangeLog:
 
* config/riscv/iterators.md: Add rotate insn name.
* config/riscv/riscv.md: Add new insns name for crypto vector.
* config/riscv/vector-iterators.md: Add new iterators for crypto vector.
* config/riscv/vector.md: Add the corresponding attr for crypto vector.
* config/riscv/vector-crypto.md: New file. The machine descriptions for
crypto vector.
---
gcc/config/riscv/iterators.md|   4 +-
gcc/config/riscv/riscv.md|  33 +-
gcc/config/riscv/vector-crypto.md| 635 +++
gcc/config/riscv/vector-iterators.md |  41 ++
gcc/config/riscv/vector.md   |  55 ++-
5 files changed, 747 insertions(+), 21 deletions(-)
create mode 100755 gcc/config/riscv/vector-crypto.md
 
diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index ecf033f2fa7..f332fba7031 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -304,7 +304,9 @@
(umax "maxu")
(clz "clz")
(ctz "ctz")
- (popcount "cpop")])
+ (popcount "cpop")
+ (rotate "rol")
+ (rotatert "ror")])
;; ---
;; Int Iterators.
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index ee8b71c22aa..88019a46a53 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -427,6 +427,34 @@
;; vcompress   vector compress instruction
;; vmov        whole vector register move
;; vector      unknown vector instruction
+;; 17. Crypto Vector instructions
+;; vandn       crypto vector bitwise and-not instructions
+;; vbrev       crypto vector reverse bits in elements instructions
+;; vbrev8      crypto vector reverse bits in bytes instructions
+;; vrev8       crypto vector reverse bytes instructions
+;; vclz        crypto vector count leading Zeros instructions
+;; vctz        crypto vector count trailing Zeros instructions
+;; vrol        crypto vector rotate left instructions
+;; vror        crypto vector rotate right instructions
+;; vwsll       crypto vector widening shift left logical instructions
+;; vclmul      crypto vector carry-less multiply - return low half instructions
+;; vclmulh     crypto vector carry-less multiply - return high half instructions
+;; vghsh       crypto vector add-multiply over GHASH Galois-Field instructions
+;; vgmul       crypto vector multiply over GHASH Galois-Field instructions
+;; vaesef      crypto vector AES final-round encryption instructions
+;; vaesem      crypto vector AES middle-round encryption instructions
+;; vaesdf      crypto vector AES final-round decryption instructions
+;; vaesdm      crypto vector AES middle-round decryption instructions
+;; vaeskf1     crypto vector AES-128 Forward KeySchedule generation instructions
+;; vaeskf2     crypto vector AES-256 Forward KeySchedule generation instructions
+;; vaesz       crypto vector AES round zero encryption/decryption instructions
+;; vsha2ms     crypto vector SHA-2 message schedule instructions
+;; vsha2ch     crypto vector SHA-2 two rounds of compression instructions
+;; vsha2cl     crypto vector SHA-2 two rounds of compression instructions
+;; vsm4k       crypto vector SM4 KeyExpansion instructions
+;; vsm4r       crypto vector SM4 Rounds instructions
+;; vsm3me      crypto vector SM3 Message Expansion instructions
+;; vsm3c       crypto vector SM3 Compression instructions
(define_attr "type"
   "unknown,branch,jump,jalr,ret,call,load,fpload,store,fpstore,
mtc,mfc,const,arith,logical,shift,slt,imul,idiv,move,fmove,fadd,fmul,
@@ -446,7 +474,9 @@
vired,viwred,vfredu,vfredo,vfwredu,vfwredo,
vmalu,vmpop,vmffs,vmsfs,vmiota,vmidx,vimovvx,vimovxv,vfmovvf,vfmovfv,
vslideup,vslidedown,vislide1up

[PATCH v6 2/3] RISC-V: Add crypto machine descriptions

2023-12-21 Thread Feng Wang

Patch v6: Swap the operator order of vandn.vv. "make report" for riscv.exp with
  "riscv-sim/-march=rv64gc/-mabi=lp64d/-mcmodel=medlow" passes.
Patch v5: Add vec_duplicate operator.
Patch v4: Add process of SEW=64 in RV32 system.
Patch v3: Modify constraints for crypto vector.
Patch v2: Add crypto vector ins into RATIO attr and use vr as
destination register.

This patch adds the crypto machine descriptions (vector-crypto.md) and
some new iterators which are used by the crypto vector extension.

Co-Authored by: Songhe Zhu 
Co-Authored by: Ciyan Pan 

gcc/ChangeLog:

* config/riscv/iterators.md: Add rotate insn name.
* config/riscv/riscv.md: Add new insns name for crypto vector.
* config/riscv/vector-iterators.md: Add new iterators for crypto vector.
* config/riscv/vector.md: Add the corresponding attr for crypto vector.
* config/riscv/vector-crypto.md: New file. The machine descriptions for
crypto vector.
---
 gcc/config/riscv/iterators.md|   4 +-
 gcc/config/riscv/riscv.md|  33 +-
 gcc/config/riscv/vector-crypto.md| 635 +++
 gcc/config/riscv/vector-iterators.md |  41 ++
 gcc/config/riscv/vector.md   |  55 ++-
 5 files changed, 747 insertions(+), 21 deletions(-)
 create mode 100755 gcc/config/riscv/vector-crypto.md

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index ecf033f2fa7..f332fba7031 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -304,7 +304,9 @@
 (umax "maxu")
 (clz "clz")
 (ctz "ctz")
-(popcount "cpop")])
+(popcount "cpop")
+(rotate "rol")
+(rotatert "ror")])
 
 ;; ---
 ;; Int Iterators.
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index ee8b71c22aa..88019a46a53 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -427,6 +427,34 @@
 ;; vcompress   vector compress instruction
 ;; vmov        whole vector register move
 ;; vector      unknown vector instruction
+;; 17. Crypto Vector instructions
+;; vandn       crypto vector bitwise and-not instructions
+;; vbrev       crypto vector reverse bits in elements instructions
+;; vbrev8      crypto vector reverse bits in bytes instructions
+;; vrev8       crypto vector reverse bytes instructions
+;; vclz        crypto vector count leading Zeros instructions
+;; vctz        crypto vector count trailing Zeros instructions
+;; vrol        crypto vector rotate left instructions
+;; vror        crypto vector rotate right instructions
+;; vwsll       crypto vector widening shift left logical instructions
+;; vclmul      crypto vector carry-less multiply - return low half instructions
+;; vclmulh     crypto vector carry-less multiply - return high half instructions
+;; vghsh       crypto vector add-multiply over GHASH Galois-Field instructions
+;; vgmul       crypto vector multiply over GHASH Galois-Field instructions
+;; vaesef      crypto vector AES final-round encryption instructions
+;; vaesem      crypto vector AES middle-round encryption instructions
+;; vaesdf      crypto vector AES final-round decryption instructions
+;; vaesdm      crypto vector AES middle-round decryption instructions
+;; vaeskf1     crypto vector AES-128 Forward KeySchedule generation instructions
+;; vaeskf2     crypto vector AES-256 Forward KeySchedule generation instructions
+;; vaesz       crypto vector AES round zero encryption/decryption instructions
+;; vsha2ms     crypto vector SHA-2 message schedule instructions
+;; vsha2ch     crypto vector SHA-2 two rounds of compression instructions
+;; vsha2cl     crypto vector SHA-2 two rounds of compression instructions
+;; vsm4k       crypto vector SM4 KeyExpansion instructions
+;; vsm4r       crypto vector SM4 Rounds instructions
+;; vsm3me      crypto vector SM3 Message Expansion instructions
+;; vsm3c       crypto vector SM3 Compression instructions
 (define_attr "type"
   "unknown,branch,jump,jalr,ret,call,load,fpload,store,fpstore,
mtc,mfc,const,arith,logical,shift,slt,imul,idiv,move,fmove,fadd,fmul,
@@ -446,7 +474,9 @@
vired,viwred,vfredu,vfredo,vfwredu,vfwredo,
vmalu,vmpop,vmffs,vmsfs,vmiota,vmidx,vimovvx,vimovxv,vfmovvf,vfmovfv,
vslideup,vslidedown,vislide1up,vislide1down,vfslide1up,vfslide1down,
-   vgather,vcompress,vmov,vector"
+   vgather,vcompress,vmov,vector,vandn,vbrev,vbrev8,vrev8,vclz,vctz,vcpop,vrol,vror,vwsll,
+   vclmul,vclmulh,vghsh,vgmul,vaesef,vaesem,vaesdf,vaesdm,vaeskf1,vaeskf2,vaesz,
+   vsha2ms,vsha2ch,vsha2cl,vsm4k,vsm4r,vsm3me,vsm3c"
   (cond [(eq_attr "got" "load") (const_string "load")
 
 ;; If a doubleword move uses these expensive instructions,
@@ -3777,6 +3807,7 @@
 (include "
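
The riscv.md comments describe vclmul as returning the low half and vclmulh as returning the high half of a carry-less multiply. As a rough illustration of that split, the C++ sketch below computes the full 128-bit carry-less product of two 64-bit values. The struct Clmul128 and the function clmul64 are hypothetical names chosen for this example, and the 64-bit element width is an assumption.

```cpp
#include <cstdint>

// Full 128-bit carry-less product, split the way the two type
// comments describe it: lo for vclmul, hi for vclmulh.
struct Clmul128 {
    std::uint64_t lo;
    std::uint64_t hi;
};

// Carry-less multiply: like schoolbook multiplication, but the partial
// products are combined with XOR instead of addition, so no carries
// propagate between bit positions.
inline Clmul128 clmul64(std::uint64_t a, std::uint64_t b) {
    Clmul128 r{0, 0};
    for (unsigned i = 0; i < 64; ++i) {
        if ((b >> i) & 1u) {
            r.lo ^= a << i;
            // Bits of a that were shifted past bit 63 land in the high half.
            // Skip i == 0 to avoid an undefined shift by 64.
            if (i != 0)
                r.hi ^= a >> (64 - i);
        }
    }
    return r;
}
```

For example, clmul64(3, 3) yields lo == 5 and hi == 0, because (x + 1) squared is x^2 + 1 when coefficients are reduced modulo 2.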

[pushed] c++: sizeof... mangling with alias template [PR95298]

2023-12-21 Thread Jason Merrill

Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

We were getting sizeof... mangling wrong when the argument after
substitution was a pack expansion that is not a simple T..., such as
list... in variadic-mangle4.C or (A+1)... in variadic-mangle5.C.  In the
former case we ICEd; in the latter case we wrongly mangled it as sZ
.
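
To make the distinction between a simple T... and other pack expansions concrete, here is a minimal sketch. It is not the content of variadic-mangle4.C or variadic-mangle5.C, which are not reproduced in this message, and the names list, type_count, value_count, simple, wrapped and plus_one are invented for illustration. It only evaluates sizeof... at compile time through alias templates and deliberately declares no function templates, so nothing here exercises the mangler.

```cpp
#include <cstddef>
#include <type_traits>

template <class... T> struct list {};

// Alias templates whose result depends on sizeof... of their pack.
template <class... T>
using type_count = std::integral_constant<std::size_t, sizeof...(T)>;

template <int... I>
using value_count = std::integral_constant<std::size_t, sizeof...(I)>;

// Simple expansion: the alias argument is just Ts...
template <class... Ts>
struct simple { using type = type_count<Ts...>; };

// Non-simple type expansion: after substitution into the alias the
// argument is list<Ts>..., in the spirit of the list... case above.
template <class... Ts>
struct wrapped { using type = type_count<list<Ts>...>; };

// Non-simple expression expansion, in the spirit of the (A+1)... case.
template <int... A>
struct plus_one { using type = value_count<(A + 1)...>; };

static_assert(simple<int, char>::type::value == 2, "simple T...");
static_assert(wrapped<int, char, long>::type::value == 3, "list<T>...");
static_assert(plus_one<1, 2, 3, 4>::type::value == 4, "(A+1)...");
```

In all three cases the value of sizeof... is simply the pack length; the difference that matters for the fix is how the unexpanded form would be spelled in a mangled name when such an alias appears in a function template signature.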

PR c++/95298

gcc/cp/ChangeLog:

* mangle.cc (write_expression): Handle v18 sizeof... bug.
* pt.cc (tsubst_pack_expansion): Keep TREE_VEC for sizeof...
(tsubst_expr): Don't strip TREE_VEC here.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/variadic-mangle2.C: Add non-member.
* g++.dg/cpp0x/variadic-mangle4.C: New test.
* g++.dg/cpp0x/variadic-mangle5.C: New test.
* g++.dg/cpp0x/variadic-mangle5a.C: New test.
---
 gcc/cp/mangle.cc  | 14 +
 gcc/cp/pt.cc  | 12 ++--
 gcc/testsuite/g++.dg/cpp0x/variadic-mangle2.C |  8 +
 gcc/testsuite/g++.dg/cpp0x/variadic-mangle4.C | 29 +++
 gcc/testsuite/g++.dg/cpp0x/variadic-mangle5.C | 13 +
 .../g++.dg/cpp0x/variadic-mangle5a.C  | 13 +
 6 files changed, 86 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-mangle4.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-mangle5.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-mangle5a.C

diff --git a/gcc/cp/mangle.cc b/gcc/cp/mangle.cc
index 365d470f46e..36c5ac5c4da 100644
--- a/gcc/cp/mangle.cc
+++ b/gcc/cp/mangle.cc
@@ -3444,6 +3444,7 @@ write_expression (tree expr)
 
   if (PACK_EXPANSION_P (op))
{
+sizeof_pack:
  if (abi_check (11))
{
  /* sZ rather than szDp.  */
@@ -3464,6 +3465,19 @@ write_expression (tree expr)
  int length = TREE_VEC_LENGTH (args);
  if (abi_check (10))
{
+ /* Before v19 we wrongly mangled all single pack expansions with
+sZ, but now only for expressions, as types ICEd (95298).  */
+ if (length == 1)
+   {
+ tree arg = TREE_VEC_ELT (args, 0);
+ if (TREE_CODE (arg) == EXPR_PACK_EXPANSION
+ && !abi_check (19))
+   {
+ op = arg;
+ goto sizeof_pack;
+   }
+   }
+
  /* sP * E # sizeof...(T), size of a captured
 template parameter pack from an alias template */
  write_string ("sP");
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 2817657a8bb..5278ef6e981 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -13572,7 +13572,15 @@ tsubst_pack_expansion (tree t, tree args, 
tsubst_flags_t complain,
   /* If the argument pack is a single pack expansion, pull it out.  */
   if (TREE_VEC_LENGTH (args) == 1
  && pack_expansion_args_count (args))
-   return TREE_VEC_ELT (args, 0);
+   {
+ tree arg = TREE_VEC_ELT (args, 0);
+ if (PACK_EXPANSION_SIZEOF_P (t)
+ && !TEMPLATE_PARM_P (PACK_EXPANSION_PATTERN (arg)))
+   /* Except if this isn't a simple sizeof...(T) which gets sZ
+  mangling, keep the TREE_VEC to get sP mangling.  */;
+ else
+   return TREE_VEC_ELT (args, 0);
+   }
 
   /* Types need no adjustment, nor does sizeof..., and if we still have
 some pack expansion args we won't do anything yet.  */
@@ -20261,8 +20269,6 @@ tsubst_expr (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
{
  if (PACK_EXPANSION_P (expanded))
/* OK.  */;
- else if (TREE_VEC_LENGTH (expanded) == 1)
-   expanded = TREE_VEC_ELT (expanded, 0);
  else
expanded = make_argument_pack (expanded);
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-mangle2.C 
b/gcc/testsuite/g++.dg/cpp0x/variadic-mangle2.C
index ea96ef87308..596242ab8b7 100644
--- a/gcc/testsuite/g++.dg/cpp0x/variadic-mangle2.C
+++ b/gcc/testsuite/g++.dg/cpp0x/variadic-mangle2.C
@@ -8,6 +8,11 @@ struct A {
   template using M = int[sizeof...(T)];
   template void g(M &);
 };
+
+template using N = int[sizeof...(T)];
+template void f(N &);
+// equivalent to template void f(int(&)[sizeof...(T)])
+
 void g(A a)
 {
   int arr[3];
@@ -15,4 +20,7 @@ void g(A a)
   a.f<1,2,3>(arr);
   // { dg-final { scan-assembler "_ZN1A1gIJiiiEEEvRAsZT__i" } }
   a.g(arr);
+  // { dg-final { scan-assembler "_Z1fIJiiiEEvRAsZT__i" } }
+  f(arr);
 }
+
diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-mangle4.C 
b/gcc/testsuite/g++.dg/cpp0x/variadic-mangle4.C
new file mode 100644
index 000..6930180d777
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/variadic-mangle4.C
@@ -0,0 +1,29 @@
+// PR c++/95298
+// { dg-do compile { target c++11 } }
+// { dg-additional-options -fabi-compat-version=0 }
+
+template
+struct list{};
+
+template
+struct _func_select
+{
+

[pushed] testsuite: suppress mangling compatibility aliases

2023-12-21 Thread Jason Merrill

Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

Recently a mangling test failed on a target with no mangling alias support
because I hadn't updated the expected mangling, but it was still passing on
x86_64-pc-linux-gnu because of the alias for the old mangling.  So let's
avoid these aliases in mangling tests.
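
As an illustration only, a mangling test written in this style might look like the sketch below. The function is hypothetical and is not one of the files listed in the ChangeLog; the only mangled name used is the well-known Itanium ABI encoding of f(int).

```cpp
// { dg-do compile }
// { dg-additional-options -fabi-compat-version=0 }

// Hypothetical function under test.  The return type is not part of the
// mangled name of a non-template function, so the symbol is _Z1fi.
int f(int x) { return x; }

// { dg-final { scan-assembler "_Z1fi" } }
```

With the compatibility aliases out of the way, a stale expected mangling should fail on every target instead of being satisfied by an alias symbol on targets that support them.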

gcc/testsuite/ChangeLog:

* g++.dg/abi/mangle-arm-crypto.C: Specify -fabi-compat-version.
* g++.dg/abi/mangle-concepts1.C
* g++.dg/abi/mangle-neon-aarch64.C
* g++.dg/abi/mangle-neon.C
* g++.dg/abi/mangle-regparm.C
* g++.dg/abi/mangle-regparm1a.C
* g++.dg/abi/mangle-ttp1.C
* g++.dg/abi/mangle-union1.C
* g++.dg/abi/mangle1.C
* g++.dg/abi/mangle13.C
* g++.dg/abi/mangle15.C
* g++.dg/abi/mangle16.C
* g++.dg/abi/mangle18-1.C
* g++.dg/abi/mangle19-1.C
* g++.dg/abi/mangle20-1.C
* g++.dg/abi/mangle22.C
* g++.dg/abi/mangle23.C
* g++.dg/abi/mangle24.C
* g++.dg/abi/mangle25.C
* g++.dg/abi/mangle26.C
* g++.dg/abi/mangle27.C
* g++.dg/abi/mangle28.C
* g++.dg/abi/mangle29.C
* g++.dg/abi/mangle3-2.C
* g++.dg/abi/mangle3.C
* g++.dg/abi/mangle30.C
* g++.dg/abi/mangle31.C
* g++.dg/abi/mangle32.C
* g++.dg/abi/mangle33.C
* g++.dg/abi/mangle34.C
* g++.dg/abi/mangle35.C
* g++.dg/abi/mangle36.C
* g++.dg/abi/mangle37.C
* g++.dg/abi/mangle39.C
* g++.dg/abi/mangle40.C
* g++.dg/abi/mangle43.C
* g++.dg/abi/mangle44.C
* g++.dg/abi/mangle45.C
* g++.dg/abi/mangle46.C
* g++.dg/abi/mangle47.C
* g++.dg/abi/mangle48.C
* g++.dg/abi/mangle49.C
* g++.dg/abi/mangle5.C
* g++.dg/abi/mangle50.C
* g++.dg/abi/mangle51.C
* g++.dg/abi/mangle52.C
* g++.dg/abi/mangle53.C
* g++.dg/abi/mangle54.C
* g++.dg/abi/mangle55.C
* g++.dg/abi/mangle56.C
* g++.dg/abi/mangle57.C
* g++.dg/abi/mangle58.C
* g++.dg/abi/mangle59.C
* g++.dg/abi/mangle6.C
* g++.dg/abi/mangle60.C
* g++.dg/abi/mangle61.C
* g++.dg/abi/mangle62.C
* g++.dg/abi/mangle62a.C
* g++.dg/abi/mangle63.C
* g++.dg/abi/mangle64.C
* g++.dg/abi/mangle65.C
* g++.dg/abi/mangle66.C
* g++.dg/abi/mangle68.C
* g++.dg/abi/mangle69.C
* g++.dg/abi/mangle7.C
* g++.dg/abi/mangle70.C
* g++.dg/abi/mangle71.C
* g++.dg/abi/mangle72.C
* g++.dg/abi/mangle73.C
* g++.dg/abi/mangle74.C
* g++.dg/abi/mangle75.C
* g++.dg/abi/mangle76.C
* g++.dg/abi/mangle77.C
* g++.dg/abi/mangle78.C
* g++.dg/abi/mangle8.C
* g++.dg/abi/mangle9.C: Likewise.
---
 gcc/testsuite/g++.dg/abi/mangle-arm-crypto.C   | 1 +
 gcc/testsuite/g++.dg/abi/mangle-concepts1.C| 1 +
 gcc/testsuite/g++.dg/abi/mangle-neon-aarch64.C | 1 +
 gcc/testsuite/g++.dg/abi/mangle-neon.C | 1 +
 gcc/testsuite/g++.dg/abi/mangle-regparm.C  | 2 +-
 gcc/testsuite/g++.dg/abi/mangle-regparm1a.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle-ttp1.C | 1 +
 gcc/testsuite/g++.dg/abi/mangle-union1.C   | 1 +
 gcc/testsuite/g++.dg/abi/mangle1.C | 2 +-
 gcc/testsuite/g++.dg/abi/mangle13.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle15.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle16.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle18-1.C  | 2 +-
 gcc/testsuite/g++.dg/abi/mangle19-1.C  | 2 +-
 gcc/testsuite/g++.dg/abi/mangle20-1.C  | 2 +-
 gcc/testsuite/g++.dg/abi/mangle22.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle23.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle24.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle25.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle26.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle27.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle28.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle29.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle3-2.C   | 2 +-
 gcc/testsuite/g++.dg/abi/mangle3.C | 2 +-
 gcc/testsuite/g++.dg/abi/mangle30.C| 1 +
 gcc/testsuite/g++.dg/abi/mangle31.C| 1 +
 gcc/testsuite/g++.dg/abi/mangle32.C| 1 +
 gcc/testsuite/g++.dg/abi/mangle33.C| 1 +
 gcc/testsuite/g++.dg/abi/mangle34.C| 1 +
 gcc/testsuite/g++.dg/abi/mangle35.C| 1 +
 gcc/testsuite/g++.dg/abi/mangle36.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle37.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle39.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle40.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle43.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle44.C| 1 +
 gcc/testsuite/g++.dg/abi/mangle45.C| 2 +-
 gcc/testsuite/g++.dg/abi/mangle46.C

Re: Re: [PATCH v1] RISC-V: XFail the signbit-5 run test for RVV

2023-12-21 Thread 钟居哲

Maybe use riscv_v ?



juzhe.zh...@rivai.ai
 
From: Jeff Law
Date: 2023-12-22 03:16
To: pan2.li; gcc-patches
CC: juzhe.zhong; yanzhang.wang; kito.cheng; richard.guenther; tamar.christina
Subject: Re: [PATCH v1] RISC-V: XFail the signbit-5 run test for RVV
 
 
On 12/20/23 19:25, pan2...@intel.com wrote:
> From: Pan Li 
> 
> This patch would like to XFail the signbit-5 run test case for
> the RVV.  Given the case has one limitation like "This test does not
> work when the truth type does not match vector type." in the beginning
> of the test file.  Aka, the RVV vector truth type is not integer type.
> 
> The target board of riscv-sim like below will pick up `-march=rv64gcv`
> when building the run test elf. Thus, the RVV cannot bypass this test
> case like aarch64_sve with additional option `-march=armv8-a`.
> 
>riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow
> 
> For RVV, we leverage dg-xfail-run-if for this case like `amdgcn`.
But isn't that just going to turn this into an XPASS when vector is not 
enabled?
 
Looking at a recent rv64gc run of mine:
 
> PASS: gcc.dg/signbit-5.c (test for excess errors)
> PASS: gcc.dg/signbit-5.c execution test
 
 
Ideally we'd find a way to handle with and without vector.
 
jeff

Re: [PATCH 4/5][_Hashtable] Generalize the small size optimization

2023-12-21 Thread Jonathan Wakely

On Thu, 23 Nov 2023 at 22:00, François Dumont  wrote:
>
>  libstdc++: [_Hashtable] Extend the small size optimization
>
>  A number of methods were still not using the small size optimization,
>  which is to prefer an O(N) search to a hash computation as long as N
>  is small.
>
>  libstdc++-v3/ChangeLog:
>
>  * include/bits/hashtable.h: Move comment about all
> equivalent values
>  being next to each other in the class documentation header.
>  (_M_reinsert_node, _M_merge_unique): Implement small size
> optimization.
>  (_M_find_tr, _M_count_tr, _M_equal_range_tr): Likewise.
>
> Tested under Linux x64
>
> Ok to commit ?

Yes, this one seems safe to commit now. Thanks.
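
A rough sketch of the small size optimization described in the quoted text, under stated assumptions: ToySet, its threshold of 8 and its fixed bucket count are all made up for illustration and do not reflect how _Hashtable is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy set, purely illustrative; this is not libstdc++'s _Hashtable.
class ToySet {
  static constexpr std::size_t small_size_threshold = 8;  // assumed cut-off
  std::vector<std::string> elems_;                 // all elements, in order
  std::vector<std::vector<std::size_t>> buckets_;  // indices into elems_

  std::size_t bucket_of(const std::string& key) {
    ++hash_calls;  // count every hash computation
    return std::hash<std::string>{}(key) % buckets_.size();
  }

public:
  std::size_t hash_calls = 0;

  ToySet() : buckets_(16) {}  // fixed bucket count, no rehashing

  bool contains(const std::string& key) {
    if (elems_.size() <= small_size_threshold) {
      // Small container: compare against every element, no hashing.
      for (const std::string& e : elems_)
        if (e == key) return true;
      return false;
    }
    // Larger container: hash once and scan a single bucket.
    for (std::size_t i : buckets_[bucket_of(key)])
      if (elems_[i] == key) return true;
    return false;
  }

  bool insert(const std::string& key) {
    if (contains(key)) return false;
    buckets_[bucket_of(key)].push_back(elems_.size());
    elems_.push_back(key);
    return true;
  }
};

void demo_small_size() {
  ToySet s;
  s.insert("a"); s.insert("b"); s.insert("c");
  std::size_t before = s.hash_calls;  // one hash per successful insert
  assert(s.contains("b"));
  assert(!s.contains("z"));
  assert(s.hash_calls == before);     // lookups did not hash at all
}
```

The trade-off only pays off when hashing is expensive relative to a handful of equality comparisons, which is why it is tied to N being small.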

Re: [PATCH 3/5][_Hashtable] Avoid redundant usage of rehash policy

2023-12-21 Thread Jonathan Wakely

On Thu, 23 Nov 2023 at 21:59, François Dumont  wrote:
>
>  libstdc++: [_Hashtable] Avoid redundant usage of rehash policy
>
>  Bypass call to __detail::__distance_fwd and the check if rehash is
> needed when
>  assigning an initializer_list to an unordered_multimap or
> unordered_multiset.

I find this patch and the description a bit confusing. It would help
if the new _Hashtable::_M_insert_range function had a comment (or a
different name!) explaining how it's different from the existing
_Insert_base::_M_insert_range functions.


>
>  libstdc++-v3/ChangeLog:
>
>  * include/bits/hashtable.h
>  (_Hashtable<>::_M_insert_range(_InputIte, _InputIte,
> _NodeGen&)): New.
> (_Hashtable<>::operator=(initializer_list)): Use latter.
>  (_Hashtable<>::_Hashtable(_InputIte, _InputIte, size_type,
> const _Hash&, const _Equal&,
>  const allocator_type&, false_type)): Use latter.
>  * include/bits/hashtable_policy.h
>  (_Insert_base<>::_M_insert_range(_InputIte, _InputIte,
> true_type)): Use latter.
>  (_Insert_base<>::_M_insert_range(_InputIte, _InputIte,
> false_type)): Likewise.
>
> Tested under Linux x64
>
> Ok to commit ?
>
> François

Re: [PATCH 2/5][_Hashtable] Fix implementation inconsistencies

2023-12-21 Thread Jonathan Wakely

On Thu, 23 Nov 2023 at 21:59, François Dumont  wrote:
>
>  libstdc++: [_Hashtable] Fix some implementation inconsistencies
>
>  Get rid of the different usages of the mutable keyword. Because
>  _Prime_rehash_policy methods are exported from the library, we need to
>  keep their const qualifier, so adapt the implementation to update the
>  previously mutable member.

If anybody ever declares a const _Prime_rehash_policy and then calls
its _M_next_bkt member or _M_need_rehash member they'll get undefined
behaviour, which seems bad. Probably nobody will ever do that, but if
we just leave the mutable member then that problem doesn't exist.

It would be possible to add non-const overloads of _M_next_bkt and
_M_need_rehash, and then make the const ones do:

return const_cast<_Prime_rehash_policy*>(this)->_M_next_bkt(n);

or even just define the const symbol as an alias of the non-const
symbol, on targets that support that.  That would still be undefined
if somebody uses a const _Prime_rehash_policy object somewhere, but it
would mean the definitions of the member functions don't contain nasty
surprises, and new code would call the non-const version, which
doesn't use the unsafe const_cast.
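
A minimal sketch of that forwarding idea, using a hypothetical ToyRehashPolicy rather than the real _Prime_rehash_policy (and powers of two instead of primes, just to keep it short):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a rehash policy; not the libstdc++ class.
struct ToyRehashPolicy {
  std::size_t next_resize = 0;  // plain member, no longer 'mutable'

  // New code calls this overload; it updates state without any cast.
  std::size_t next_bkt(std::size_t n) {
    std::size_t b = 1;
    while (b < n) b *= 2;
    next_resize = b;
    return b;
  }

  // Kept for callers of the old const signature.  This is only
  // well-defined when the object was not originally declared const.
  std::size_t next_bkt(std::size_t n) const {
    return const_cast<ToyRehashPolicy*>(this)->next_bkt(n);
  }
};

void demo_policy() {
  ToyRehashPolicy p;                // the object itself is not const
  const ToyRehashPolicy& cref = p;  // so going through a const ref is fine
  assert(cref.next_bkt(5) == 8);
  assert(p.next_resize == 8);
}
```

The cast is confined to the one legacy overload, so the body that actually mutates the member contains no surprises.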

>
>  Remove useless noexcept qualification on _Hashtable _M_bucket_index
> overload.
>  Fix comment to explain that we need the computation of bucket index
> to be
>  noexcept to be able to rehash the container when needed. For Standard
>  instantiations through std::unordered_xxx containers we already force
>  usage of hash code cache when hash functor is not noexcept so it is
> guaranteed.
>  The static_assert purpose in _Hashtable on _M_bucket_index is thus
> limited
>  to usages of _Hashtable with exotic _Hashtable_traits.
>
>  libstdc++-v3/ChangeLog:
>
>  * include/bits/hashtable_policy.h
> (_NodeBuilder<>::_S_build): Remove
>  const qualification on _NodeGenerator instance.
> (_ReuseOrAllocNode<>::operator()(_Args&&...)): Remove const qualification.
>  (_ReuseOrAllocNode<>::_M_nodes): Remove mutable.
>  (_Prime_rehash_policy::max_load_factor()): Remove noexcept.

Why?

>  (_Prime_rehash_policy::_M_reset()): Remove noexcept.

Why?

Those functions really are noexcept, right? We should remove noexcept
where it's incorrect or misleading, but here it's OK, isn't it? Or am
I forgetting the problem being solved here?
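
One way to check a claim like this mechanically is the noexcept operator. The sketch below uses a hypothetical ToyPolicy, not the real policy classes, so it only shows the technique:

```cpp
#include <cassert>
#include <utility>

// Hypothetical policy type used only to demonstrate the check.
struct ToyPolicy {
  float max_load = 1.0f;
  float max_load_factor() const noexcept { return max_load; }
  void reset() noexcept { max_load = 1.0f; }
  void may_throw() {}  // deliberately not declared noexcept
};

// noexcept(expr) is a compile-time bool; the expression is not evaluated.
constexpr bool reset_is_noexcept =
    noexcept(std::declval<ToyPolicy&>().reset());
constexpr bool load_factor_is_noexcept =
    noexcept(std::declval<const ToyPolicy&>().max_load_factor());
constexpr bool may_throw_is_noexcept =
    noexcept(std::declval<ToyPolicy&>().may_throw());

static_assert(reset_is_noexcept, "reset() should not throw");
static_assert(load_factor_is_noexcept, "max_load_factor() should not throw");
static_assert(!may_throw_is_noexcept, "may_throw() has no noexcept");
```

Note that the operator reports what the declaration promises, not what the body does, so dropping the specifier changes the answer even if the function still cannot throw.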


>  (_Prime_rehash_policy::_M_next_resize): Remove mutable.
>  (_Power2_rehash_policy::_M_next_bkt(size_t)): Remove noexcept.
>  (_Power2_rehash_policy::_M_bkt_for_elements(size_t)):
> Remove noexcept.
>  (_Power2_rehash_policy::_M_need_rehash): Remove noexcept.
>  (_Power2_rehash_policy::_M_reset): Remove noexcept.
>  (_Insert_base<>::_M_insert_range): Remove _NodeGetter const
> qualification.
>  (_Hash_code_base<>::_M_bucket_index(const
> _Hash_node_value<>&, size_t)):
>  Simplify noexcept declaration, we already static_assert
> that _RangeHash functor
>  is noexcept.
>  * include/bits/hashtable.h: Rework comments. Remove const
> qualifier on
>  _NodeGenerator& arguments.
>  (_Hashtable<>::_M_bucket_index(const __node_value_type&)):
> Remove useless
>  noexcept qualification.
>  * src/c++11/hashtable_c++0x.cc (_Prime_rehash_policy):
> Workaround
>  _M_next_resize not being mutable anymore.
>
> Tested under Linux x86_64,
>
> ok to commit ?
>
> François

Re: [PATCH 1/5][_Hashtable] Add benches

2023-12-21 Thread Jonathan Wakely

On Thu, 23 Nov 2023 at 21:58, François Dumont  wrote:
>
> libstdc++: [_Hashtable] Enhance/Add performance benches

This one is OK for trunk now, thanks.

Re: [PATCH 5/5][_Hashtable] Prefer to insert after last node

2023-12-21 Thread Jonathan Wakely

I think this should wait for the next stage 1. It's a big patch
affecting the default -std mode (not just experimental C++20/23/26
material), and was first posted after the end of stage 1.

Do we really need the changes for versioned namespace? How much
difference does that extra member make to performance, compared with
the version for the default config?

On Wed, 20 Dec 2023 at 06:10, François Dumont  wrote:
>
> Here is a new version of this patch.
>
> The previous one had some flaws that went unnoticed by the testsuite;
> only the performance tests spotted them. So I'm adding checks on the
> consistency of the unordered containers in this patch.
>
> I also forgot to mention that after this patch the gnu versioned namespace
> version is supposed to be bumped. But I hope that's already the plan because
> of the move to the cxx11 abi in this mode.
>
> Note for reviewer, after application of the patch, a 'git diff -b' is
> much more readable.
>
> And some benches results:
>
> before:
>
> unordered_set_range_insert.cc-thread  hash code NOT cached 2 X 100 inserts individually 1990 calls   44r   44u    0s   95999760mem    0pf
> unordered_set_range_insert.cc-thread  hash code NOT cached 2 X 100 inserts in range 2000 calls       43r   43u    0s   95999760mem    0pf
> unordered_set_range_insert.cc-thread  hash code NOT cached 100 X inserts individually 1990 calls     44r   44u    0s   95999760mem    0pf
> unordered_set_range_insert.cc-thread  hash code NOT cached 100 X inserts in range 2000 calls         43r   43u    0s   95999760mem    0pf
> unordered_set_range_insert.cc-thread  hash code cached 2 X 100 inserts individually 1000 calls       30r   30u    0s  111999328mem    0pf
> unordered_set_range_insert.cc-thread  hash code cached 2 X 100 inserts in range 1010 calls           33r   32u    0s  111999328mem    0pf
> unordered_set_range_insert.cc-thread  hash code cached 100 X inserts individually 1000 calls         30r   31u    0s  111999328mem    0pf
> unordered_set_range_insert.cc-thread  hash code cached 100 X inserts in range 1010 calls             32r   32u    0s  111999328mem    0pf
>
> after:
>
> unordered_set_range_insert.cc-thread  hash code NOT cached 2 X 100 inserts individually 1990 calls   44r   44u    0s   95999760mem    0pf
> unordered_set_range_insert.cc-thread  hash code NOT cached 2 X 100 inserts in range 1020 calls       26r   25u    0s   95999760mem    0pf
> unordered_set_range_insert.cc-thread  hash code NOT cached 100 X inserts individually 1990 calls     43r   44u    0s   95999760mem    0pf
> unordered_set_range_insert.cc-thread  hash code NOT cached 100 X inserts in range 1020 calls         26r   26u    0s   95999760mem    0pf
> unordered_set_range_insert.cc-thread  hash code cached 2 X 100 inserts individually 1000 calls       35r   35u    0s  111999328mem    0pf
> unordered_set_range_insert.cc-thread  hash code cached 2 X 100 inserts in range 1010 calls           32r   33u    0s  111999328mem    0pf
> unordered_set_range_insert.cc-thread  hash code cached 100 X inserts individually 1000 calls         31r   32u    0s  111999328mem    0pf
> unordered_set_range_insert.cc-thread  hash code cached 100 X inserts in range 1010 calls             31r   31u    0s  111999328mem    0pf
>
>
>  libstdc++: [_Hashtable] Prefer to insert after last node
>
>  When inserting an element into an empty bucket we currently insert
> the new node
>  after the before-begin node so in first position. The drawback of
> doing this is
>  that we are forced to update the bucket that was containing this
> before-begin
>  node to point to the newly inserted node. To do so we need at best
> to do a modulo
>  to find this bucket and at worst, when hash code is not cached,
> also compute it.
>
>  To avoid this side effect it is better to insert after the last
> node. To do so
>  we are introducing a helper type _HintPolicy that has 3
> responsibilities.
>
>  1. When the gnu versioned namespace is used we add a _M_last member
> to _Hashtable,
>  _HintPolicy is then in charge of maintaining it. For this purpose
> _HintPolicy is
>  using the RAII pattern, it resets the _M_last at destruction level.
> It also maintains
>  its own _M_last, all mutable operations are updating it when needed.
>
>  2. When the gnu versioned namespace is not used _HintPolicy will
> still maintain its
>  _M_last member using initially the user provided hint if any and if
> it is actually
>  the container last node that is to say a dereferenceable node with
> its next node being
>  null. All mutable operations can also update the contextual
> _HintPolicy instance
>  whenever they detect the last node during their process.
>
>  3. As long as we haven't been able to detect the container last
> node, _HintPolicy
>  is used to keep a cache of the before-begi

Re: [PATCH v2 2/2] libstdc++: implement std::generator

2023-12-21 Thread Jonathan Wakely

On Thu, 21 Dec 2023 at 21:26, Arsen Arsenović wrote:
>
> libstdc++-v3/ChangeLog:
>
> * include/Makefile.am: Install std/generator, bits/elements_of.h
> as freestanding.
> * include/Makefile.in: Regenerate.
> * include/bits/version.def: Add __cpp_lib_generator.
> * include/bits/version.h: Regenerate.
> * include/precompiled/stdc++.h: Include <generator>.
> * include/std/ranges: Include bits/elements_of.h
> * include/bits/elements_of.h: New file.
> * include/std/generator: New file.
> * testsuite/24_iterators/range_generators/01.cc: New test.
> * testsuite/24_iterators/range_generators/02.cc: New test.
> * testsuite/24_iterators/range_generators/copy.cc: New test.
> * testsuite/24_iterators/range_generators/except.cc: New test.
> * testsuite/24_iterators/range_generators/synopsis.cc: New test.
> * testsuite/24_iterators/range_generators/subrange.cc: New test.

OK

> ---
>  libstdc++-v3/include/Makefile.am  |   2 +
>  libstdc++-v3/include/Makefile.in  |   2 +
>  libstdc++-v3/include/bits/elements_of.h   |  72 ++
>  libstdc++-v3/include/bits/version.def |   9 +
>  libstdc++-v3/include/bits/version.h   |  11 +
>  libstdc++-v3/include/precompiled/stdc++.h |   1 +
>  libstdc++-v3/include/std/generator| 812 ++
>  libstdc++-v3/include/std/ranges   |   4 +
>  .../24_iterators/range_generators/01.cc   |  55 ++
>  .../24_iterators/range_generators/02.cc   | 219 +
>  .../24_iterators/range_generators/copy.cc |  97 +++
>  .../24_iterators/range_generators/except.cc   |  97 +++
>  .../24_iterators/range_generators/subrange.cc |  45 +
>  .../24_iterators/range_generators/synopsis.cc |  38 +
>  14 files changed, 1464 insertions(+)
>  create mode 100644 libstdc++-v3/include/bits/elements_of.h
>  create mode 100644 libstdc++-v3/include/std/generator
>  create mode 100644 libstdc++-v3/testsuite/24_iterators/range_generators/01.cc
>  create mode 100644 libstdc++-v3/testsuite/24_iterators/range_generators/02.cc
>  create mode 100644 
> libstdc++-v3/testsuite/24_iterators/range_generators/copy.cc
>  create mode 100644 
> libstdc++-v3/testsuite/24_iterators/range_generators/except.cc
>  create mode 100644 
> libstdc++-v3/testsuite/24_iterators/range_generators/subrange.cc
>  create mode 100644 
> libstdc++-v3/testsuite/24_iterators/range_generators/synopsis.cc
>
> diff --git a/libstdc++-v3/include/Makefile.am 
> b/libstdc++-v3/include/Makefile.am
> index 368b92eafbc7..ca76afbcc77f 100644
> --- a/libstdc++-v3/include/Makefile.am
> +++ b/libstdc++-v3/include/Makefile.am
> @@ -35,6 +35,7 @@ std_freestanding = \
> ${std_srcdir}/coroutine \
> ${std_srcdir}/expected \
> ${std_srcdir}/functional \
> +   ${std_srcdir}/generator \
> ${std_srcdir}/iterator \
> ${std_srcdir}/limits \
> ${std_srcdir}/memory \
> @@ -123,6 +124,7 @@ bits_freestanding = \
> ${bits_srcdir}/concept_check.h \
> ${bits_srcdir}/char_traits.h \
> ${bits_srcdir}/cpp_type_traits.h \
> +   ${bits_srcdir}/elements_of.h \
> ${bits_srcdir}/enable_special_members.h \
> ${bits_srcdir}/functexcept.h \
> ${bits_srcdir}/functional_hash.h \
> diff --git a/libstdc++-v3/include/Makefile.in 
> b/libstdc++-v3/include/Makefile.in
> index a31588c01002..4fa4a259fef3 100644
> --- a/libstdc++-v3/include/Makefile.in
> +++ b/libstdc++-v3/include/Makefile.in
> @@ -393,6 +393,7 @@ std_freestanding = \
> ${std_srcdir}/coroutine \
> ${std_srcdir}/expected \
> ${std_srcdir}/functional \
> +   ${std_srcdir}/generator \
> ${std_srcdir}/iterator \
> ${std_srcdir}/limits \
> ${std_srcdir}/memory \
> @@ -478,6 +479,7 @@ bits_freestanding = \
> ${bits_srcdir}/concept_check.h \
> ${bits_srcdir}/char_traits.h \
> ${bits_srcdir}/cpp_type_traits.h \
> +   ${bits_srcdir}/elements_of.h \
> ${bits_srcdir}/enable_special_members.h \
> ${bits_srcdir}/functexcept.h \
> ${bits_srcdir}/functional_hash.h \
> diff --git a/libstdc++-v3/include/bits/elements_of.h 
> b/libstdc++-v3/include/bits/elements_of.h
> new file mode 100644
> index ..663e15a94aa7
> --- /dev/null
> +++ b/libstdc++-v3/include/bits/elements_of.h
> @@ -0,0 +1,72 @@
> +// Tag type for yielding ranges rather than values in   -*- C++ 
> -*-
> +
> +// Copyright (C) 2023 Free Software Foundation, Inc.
> +//
> +// This file is part of the GNU ISO C++ Library.  This library is free
> +// software; you can redistribute it and/or modify it under the
> +// terms of the GNU General Public License as published by the
> +// Free Software Foundation; either version 3, or (at your option)
> +// any later version.
> +
> +// This library is distributed in the hope that it will be useful,
> +// but WITHOUT ANY WARRANTY; without even the imp

[V6] c23: construct composite type for tagged types

2023-12-21 Thread Martin Uecker



This version now sets  DECL_NONADDRESSABLE_P, DECL_PADDING_P 
and C_DECL_VARIABLE_SIZE and adds three new tests:
c23-tag-alias-7.c, c23-tag-composite-10.c, and 
gnu23-tag-composite-5.c.

Martin



Support for constructing composite types for structs and unions
in C23.

gcc/c:
* c-typeck.cc (composite_type_internal): Adapted from
composite_type to support structs and unions.
(composite_type): New wrapper function.
(build_conditional_operator): Return composite type.
* c-decl.cc (finish_struct): Allow NULL for
enclosing_struct_parse_info.

gcc/testsuite:
* gcc.dg/c23-tag-alias-6.c: New test.
* gcc.dg/c23-tag-alias-7.c: New test.
* gcc.dg/c23-tag-composite-1.c: New test.
* gcc.dg/c23-tag-composite-2.c: New test.
* gcc.dg/c23-tag-composite-3.c: New test.
* gcc.dg/c23-tag-composite-4.c: New test.
* gcc.dg/c23-tag-composite-5.c: New test.
* gcc.dg/c23-tag-composite-6.c: New test.
* gcc.dg/c23-tag-composite-7.c: New test.
* gcc.dg/c23-tag-composite-8.c: New test.
* gcc.dg/c23-tag-composite-9.c: New test.
* gcc.dg/c23-tag-composite-10.c: New test.
* gcc.dg/gnu23-tag-composite-1.c: New test.
* gcc.dg/gnu23-tag-composite-2.c: New test.
* gcc.dg/gnu23-tag-composite-3.c: New test.
* gcc.dg/gnu23-tag-composite-4.c: New test.
* gcc.dg/gnu23-tag-composite-5.c: New test.
---
 gcc/c/c-decl.cc  |  21 +--
 gcc/c/c-typeck.cc| 140 ---
 gcc/testsuite/gcc.dg/c23-tag-alias-6.c   |  32 +
 gcc/testsuite/gcc.dg/c23-tag-alias-7.c   |  34 +
 gcc/testsuite/gcc.dg/c23-tag-composite-1.c   |  26 
 gcc/testsuite/gcc.dg/c23-tag-composite-10.c  |  35 +
 gcc/testsuite/gcc.dg/c23-tag-composite-2.c   |  16 +++
 gcc/testsuite/gcc.dg/c23-tag-composite-3.c   |  50 +++
 gcc/testsuite/gcc.dg/c23-tag-composite-4.c   |  21 +++
 gcc/testsuite/gcc.dg/c23-tag-composite-5.c   |  25 
 gcc/testsuite/gcc.dg/c23-tag-composite-6.c   |  18 +++
 gcc/testsuite/gcc.dg/c23-tag-composite-7.c   |  20 +++
 gcc/testsuite/gcc.dg/c23-tag-composite-8.c   |  15 ++
 gcc/testsuite/gcc.dg/c23-tag-composite-9.c   |  19 +++
 gcc/testsuite/gcc.dg/gnu23-tag-composite-1.c |  45 ++
 gcc/testsuite/gcc.dg/gnu23-tag-composite-2.c |  30 
 gcc/testsuite/gcc.dg/gnu23-tag-composite-3.c |  24 
 gcc/testsuite/gcc.dg/gnu23-tag-composite-4.c |  28 
 gcc/testsuite/gcc.dg/gnu23-tag-composite-5.c |  29 
 19 files changed, 601 insertions(+), 27 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-alias-6.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-alias-7.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-1.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-10.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-2.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-3.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-4.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-5.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-6.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-7.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-8.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-tag-composite-9.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-composite-1.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-composite-2.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-composite-3.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-composite-4.c
 create mode 100644 gcc/testsuite/gcc.dg/gnu23-tag-composite-5.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 6639ec35e5f..b72738ea04a 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9674,7 +9674,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
tree attributes,
 }
 
   /* Check for consistency with previous definition.  */
-  if (flag_isoc23)
+  if (flag_isoc23 && NULL != enclosing_struct_parse_info)
 {
   tree vistype = previous_tag (t);
   if (vistype
@@ -9744,16 +9744,19 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
tree attributes,
   if (warn_cxx_compat)
 warn_cxx_compat_finish_struct (fieldlist, TREE_CODE (t), loc);
 
-  delete struct_parse_info;
+  if (NULL != enclosing_struct_parse_info)
+{
+  delete struct_parse_info;
 
-  struct_parse_info = enclosing_struct_parse_info;
+  struct_parse_info = enclosing_struct_parse_info;
 
-  /* If this struct is defined inside a struct, add it to
- struct_types.  */
-  if (warn_cxx_compat
-  && struct_parse_info != NULL
-  && !in_sizeof && !in_typeof && !in_alignof)
-struct_parse_info->struct_types.safe_push (t);
+  /* If this struct is defined inside a struct, add it to
+struct_types.  */
+  if (warn_cxx_compat
+ && struct_parse_info != NULL
+ && !in_sizeof && !in_typeof && !in_alignof)

Re: [PATCH v2 1/2] libstdc++: add missing include in ranges_util.h

2023-12-21 Thread Jonathan Wakely

On Thu, 21 Dec 2023 at 21:26, Arsen Arsenović wrote:
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/ranges_util.h: Add missing 
> include.

OK


> ---
>  libstdc++-v3/include/bits/ranges_util.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/libstdc++-v3/include/bits/ranges_util.h 
> b/libstdc++-v3/include/bits/ranges_util.h
> index 185e46ec7a94..ad61a19dd33d 100644
> --- a/libstdc++-v3/include/bits/ranges_util.h
> +++ b/libstdc++-v3/include/bits/ranges_util.h
> @@ -33,6 +33,7 @@
>  #if __cplusplus > 201703L
>  # include 
>  # include 
> +# include 
>
>  #ifdef __glibcxx_ranges
>  namespace std _GLIBCXX_VISIBILITY(default)
> --
> 2.43.0
>

[PATCH v2 2/2] libstdc++: implement std::generator

2023-12-21 Thread Arsen Arsenović

libstdc++-v3/ChangeLog:

* include/Makefile.am: Install std/generator, bits/elements_of.h
as freestanding.
* include/Makefile.in: Regenerate.
* include/bits/version.def: Add __cpp_lib_generator.
* include/bits/version.h: Regenerate.
* include/precompiled/stdc++.h: Include <generator>.
* include/std/ranges: Include bits/elements_of.h
* include/bits/elements_of.h: New file.
* include/std/generator: New file.
* testsuite/24_iterators/range_generators/01.cc: New test.
* testsuite/24_iterators/range_generators/02.cc: New test.
* testsuite/24_iterators/range_generators/copy.cc: New test.
* testsuite/24_iterators/range_generators/except.cc: New test.
* testsuite/24_iterators/range_generators/synopsis.cc: New test.
* testsuite/24_iterators/range_generators/subrange.cc: New test.
---
 libstdc++-v3/include/Makefile.am  |   2 +
 libstdc++-v3/include/Makefile.in  |   2 +
 libstdc++-v3/include/bits/elements_of.h   |  72 ++
 libstdc++-v3/include/bits/version.def |   9 +
 libstdc++-v3/include/bits/version.h   |  11 +
 libstdc++-v3/include/precompiled/stdc++.h |   1 +
 libstdc++-v3/include/std/generator| 812 ++
 libstdc++-v3/include/std/ranges   |   4 +
 .../24_iterators/range_generators/01.cc   |  55 ++
 .../24_iterators/range_generators/02.cc   | 219 +
 .../24_iterators/range_generators/copy.cc |  97 +++
 .../24_iterators/range_generators/except.cc   |  97 +++
 .../24_iterators/range_generators/subrange.cc |  45 +
 .../24_iterators/range_generators/synopsis.cc |  38 +
 14 files changed, 1464 insertions(+)
 create mode 100644 libstdc++-v3/include/bits/elements_of.h
 create mode 100644 libstdc++-v3/include/std/generator
 create mode 100644 libstdc++-v3/testsuite/24_iterators/range_generators/01.cc
 create mode 100644 libstdc++-v3/testsuite/24_iterators/range_generators/02.cc
 create mode 100644 libstdc++-v3/testsuite/24_iterators/range_generators/copy.cc
 create mode 100644 
libstdc++-v3/testsuite/24_iterators/range_generators/except.cc
 create mode 100644 
libstdc++-v3/testsuite/24_iterators/range_generators/subrange.cc
 create mode 100644 
libstdc++-v3/testsuite/24_iterators/range_generators/synopsis.cc

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 368b92eafbc7..ca76afbcc77f 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -35,6 +35,7 @@ std_freestanding = \
${std_srcdir}/coroutine \
${std_srcdir}/expected \
${std_srcdir}/functional \
+   ${std_srcdir}/generator \
${std_srcdir}/iterator \
${std_srcdir}/limits \
${std_srcdir}/memory \
@@ -123,6 +124,7 @@ bits_freestanding = \
${bits_srcdir}/concept_check.h \
${bits_srcdir}/char_traits.h \
${bits_srcdir}/cpp_type_traits.h \
+   ${bits_srcdir}/elements_of.h \
${bits_srcdir}/enable_special_members.h \
${bits_srcdir}/functexcept.h \
${bits_srcdir}/functional_hash.h \
diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index a31588c01002..4fa4a259fef3 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -393,6 +393,7 @@ std_freestanding = \
${std_srcdir}/coroutine \
${std_srcdir}/expected \
${std_srcdir}/functional \
+   ${std_srcdir}/generator \
${std_srcdir}/iterator \
${std_srcdir}/limits \
${std_srcdir}/memory \
@@ -478,6 +479,7 @@ bits_freestanding = \
${bits_srcdir}/concept_check.h \
${bits_srcdir}/char_traits.h \
${bits_srcdir}/cpp_type_traits.h \
+   ${bits_srcdir}/elements_of.h \
${bits_srcdir}/enable_special_members.h \
${bits_srcdir}/functexcept.h \
${bits_srcdir}/functional_hash.h \
diff --git a/libstdc++-v3/include/bits/elements_of.h b/libstdc++-v3/include/bits/elements_of.h
new file mode 100644
index ..663e15a94aa7
--- /dev/null
+++ b/libstdc++-v3/include/bits/elements_of.h
@@ -0,0 +1,72 @@
+// Tag type for yielding ranges rather than values in   -*- C++ -*-
+
+// Copyright (C) 2023 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1

[PATCH v2 0/2] libstdc++: generators v2

2023-12-21 Thread Arsen Arsenović

Hi,

This is v2 of my generators patch.  It addresses Jonathan's review
comments, but does not add more tests yet :-/

Original series:
https://inbox.sourceware.org/20231118195008.579211-1-ar...@aarsen.me/

Changes since v1:
- Uglify some symbols
- Convert _Is_generator concept to __is_generator CE bool
- Add "libstdc++: add missing include in ranges_util.h" - this can be
  pushed separately, really, but I forgot to send it.
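
To illustrate the second item: a check that only asks "is this a
specialization of the template?" does not need a concept; a constexpr bool
variable template with a partial specialization is enough.  A minimal sketch,
using hypothetical non-reserved names (toy_generator, is_toy_generator)
instead of the ones in the patch:

```cpp
// Stand-in for std::generator; the name and parameters are hypothetical.
template<typename Ref, typename Val = void, typename Alloc = void>
  class toy_generator { };

// Primary template: by default a type is not a toy_generator.
template<typename T>
  constexpr bool is_toy_generator = false;

// Partial specialization: true for every toy_generator specialization.
template<typename Ref, typename Val, typename Alloc>
  constexpr bool is_toy_generator<toy_generator<Ref, Val, Alloc>> = true;
```

Note that cv- or reference-qualified types do not match the partial
specialization, so a caller would have to strip those first (for example
with std::remove_cvref_t).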

Range-diff:

1:  feab374887e5 = 1212:  c4286af0c70f libstdc++: add missing include in ranges_util.h
2:  010eab271755 ! 1213:  fb589641656f libstdc++: implement std::generator
 @@ libstdc++-v3/include/std/generator (new)
  +#define __glibcxx_want_generator
  +#include 
  +
 -+#if __cplusplus < 202302L
 -+# error "std::generator is a C++23 extension"
 -+#endif
 -+
  +#ifdef __cpp_lib_generator  // C++ >= 23 && __glibcxx_coroutine
  +#include 
  +#include 
 @@ libstdc++-v3/include/std/generator (new)
  +  {
  +/// _Reference type for a generator whose reference (first argument) and
  +/// value (second argument) types are _Ref and _V.
 -+template
 -+using _Reference_t = __conditional_t,
 ++template
 ++using _Reference_t = __conditional_t,
  + _Ref&&, _Ref>;
  +
  +/// Type yielded by a generator whose _Reference type is _Reference.
 @@ libstdc++-v3/include/std/generator (new)
  + const _Reference&>;
  +
  +/// _Yield_t * _Reference_t
 -+template
 -+using _Yield2_t = _Yield_t<_Reference_t<_Ref, _V>>;
 -+
 -+template struct _Is_generator_t : std::false_type {};
 -+template
 -+  struct _Is_generator_t<::std::generator<_V, _R, _A>> : std::true_type {};
 -+
 -+template
 -+concept _Is_generator = _Is_generator_t>::value;
 ++template
 ++using _Yield2_t = _Yield_t<_Reference_t<_Ref, _Val>>;
  +
 ++template constexpr bool __is_generator = false;
 ++template
 ++constexpr bool __is_generator> = true;
  +
  +/// Allocator and value type erased generator promise type.
  +/// \tparam _Yielded The corresponding generators yielded type.
 @@ libstdc++-v3/include/std/generator (new)
  + using _Coro_handle = std::coroutine_handle<_Promise_erased>;
  +
  + template
 -+ friend struct std::generator;
 ++ friend class std::generator;
  +
  + template
  +  struct _Recursive_awaiter;
 @@ libstdc++-v3/include/std/generator (new)
  +  __rn._M_top() = __new;
  +
  +  // Presume we're the second frame...
 -+  auto& __bott = __rest;
 ++  auto __bott = __rest;
  +  if (auto __f = std::get_if<_Frame>(&__rn._M_stack))
  +// But, if we aren't, get the actual bottom.  We're only the second
  +// frame if our parent is the bottom frame, i.e. it doesn't have a
 @@ libstdc++-v3/include/std/generator (new)
  +  struct _Promise_erased<_Yielded>::_Recursive_awaiter
  +  {
  + _Gen _M_gen;
 -+ static_assert(_Is_generator<_Gen>);
 ++ static_assert(__is_generator<_Gen>);
  + static_assert(std::same_as);
  +
  + _Recursive_awaiter(_Gen __gen) noexcept
 @@ libstdc++-v3/include/std/generator (new)
  +  requires default_initializable<_Rebound> // _Alloc is non-void
  + { return _M_allocate({}, __sz); }
  +
 -+ template
 ++ template
  + void*
  + operator new(std::size_t __sz,
 -+ allocator_arg_t, const _NA& __na,
 ++ allocator_arg_t, const _Na& __na,
  + const _Args&...)
 -+  requires convertible_to
 ++  requires convertible_to
  + {
  +  return _M_allocate(static_cast<_Rebound>(static_cast<_Alloc>(__na)),
  + __sz);
  + }
  +
 -+ template
 ++ template
  + void*
  + operator new(std::size_t __sz,
  + const _This&,
 -+ allocator_arg_t, const _NA& __na,
 ++ allocator_arg_t, const _Na& __na,
  + const _Args&...)
 -+  requires convertible_to
 ++  requires convertible_to
  + {
  +  return _M_allocate(static_cast<_Rebound>(static_cast<_Alloc>(__na)),
  + __sz);
 @@ libstdc++-v3/include/std/generator (new)
  +}
  + }
  +
 -+ template
 ++ template
  + static void*
 -+ _M_allocate(const _NA& __na, std::size_t __csz)
 ++ _M_allocate(const _Na& __na, std::size_t __csz)
  + {
 -+  using _Rebound = typename std::allocator_traits<_NA>
 ++  using _Rebound = typename std::allocator_traits<_Na>
  +::template rebind_alloc<_Alloc_block>;
 -+  using _Rebound_ATr = typename std::allocator_traits<_NA>
 ++  using _Rebound_ATr = typename std::allocator_traits<_Na>
  +::template rebind_traits<_Alloc_block>;
  +
  +  static_assert(is_pointer_v,
 @@ libstdc++-v3/include/std/generator (new)
  +  return __p;
  + }
  +
 -+ template
 ++ template
  + void*
  + operator new(std::size_t __sz,
 -+

[PATCH v2 1/2] libstdc++: add missing include in ranges_util.h

2023-12-21 Thread Arsen Arsenović

libstdc++-v3/ChangeLog:

* include/bits/ranges_util.h: Add missing 
include.
---
 libstdc++-v3/include/bits/ranges_util.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libstdc++-v3/include/bits/ranges_util.h b/libstdc++-v3/include/bits/ranges_util.h
index 185e46ec7a94..ad61a19dd33d 100644
--- a/libstdc++-v3/include/bits/ranges_util.h
+++ b/libstdc++-v3/include/bits/ranges_util.h
@@ -33,6 +33,7 @@
 #if __cplusplus > 201703L
 # include 
 # include 
+# include 
 
 #ifdef __glibcxx_ranges
 namespace std _GLIBCXX_VISIBILITY(default)
-- 
2.43.0

[r14-6770 Regression] FAIL: gcc.dg/gnu23-tag-4.c (test for excess errors) on Linux/x86_64

2023-12-21 Thread haochen.jiang

On Linux/x86_64,

23fee88f84873b0b8b41c8e5a9b229d533fb4022 is the first bad commit
commit 23fee88f84873b0b8b41c8e5a9b229d533fb4022
Author: Martin Uecker 
Date:   Tue Aug 15 14:58:32 2023 +0200

c23: tag compatibility rules for struct and unions

caused

FAIL: gcc.dg/gnu23-tag-4.c (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-6770/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/gnu23-tag-4.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/gnu23-tag-4.c 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at haochen dot jiang at intel.com.)
(If you run into cascadelake-related problems, disabling AVX512F on the command 
line might help.)
(However, please make sure that there are no potential problems with AVX512.)

[PATCH] libgccjit: Add convert vector

2023-12-21 Thread Antoni Boucher

Hi.
This patch adds support for the convert vector internal function.
I'll need to double-check that making the decl a register is necessary.
Thanks for the review.
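
For context: IFN_VEC_CONVERT is, as far as I know, the internal function
behind __builtin_convertvector in C and C++ (the patch points at
c_build_vec_convert for comparison).  Below is a minimal sketch of the same
element-wise conversion written with that builtin.  The typedef and function
names are made up for illustration, and it assumes the GCC/Clang vector
extensions are available:

```cpp
// Two 4-element vector types of the same element count (hypothetical names).
typedef int   v4si __attribute__ ((vector_size (16)));
typedef float v4sf __attribute__ ((vector_size (16)));

// Convert each int lane to float; the element counts must match.
inline v4sf
to_float (v4si v)
{
  return __builtin_convertvector (v, v4sf);
}

// Small self-check: every lane keeps its value after conversion.
inline bool
convert_demo ()
{
  v4si in = { 1, -2, 3, 40 };
  v4sf out = to_float (in);
  return out[0] == 1.0f && out[1] == -2.0f
	 && out[2] == 3.0f && out[3] == 40.0f;
}
```

This is a value conversion per lane, unlike a bitcast, which would
reinterpret the bits instead.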
From ca4b3606c853b3425cf4ef9e88fbd5939f0e8f7c Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Sat, 14 May 2022 17:24:29 -0400
Subject: [PATCH] libgccjit: Add convert vector

gcc/jit/ChangeLog:

	* docs/topics/compatibility.rst (LIBGCCJIT_ABI_26): New ABI tag.
	* docs/topics/expressions.rst: Document gcc_jit_context_convert_vector.
	* jit-playback.cc (convert_vector): New method.
	* jit-playback.h: New method.
	* jit-recording.cc (recording::context::new_convert_vector,
	recording::convert_vector::replay_into,
	recording::convert_vector::visit_children,
	recording::convert_vector::make_debug_string,
	recording::convert_vector::write_reproducer): New methods.
	* jit-recording.h (class convert_vector): New class.
	(context::new_convert_vector): New method.
	* libgccjit.cc (gcc_jit_context_convert_vector): New function.
	* libgccjit.h (gcc_jit_context_convert_vector): New function.
	* libgccjit.map: New function.

gcc/testsuite/ChangeLog:

	* jit.dg/all-non-failing-tests.h: New test.
	* jit.dg/test-convert-vector.c: New test.
---
 gcc/jit/docs/topics/compatibility.rst|  7 ++
 gcc/jit/docs/topics/expressions.rst  | 18 +
 gcc/jit/jit-playback.cc  | 31 +
 gcc/jit/jit-playback.h   |  5 ++
 gcc/jit/jit-recording.cc | 72 
 gcc/jit/jit-recording.h  | 34 +
 gcc/jit/libgccjit.cc | 36 ++
 gcc/jit/libgccjit.h  |  8 +++
 gcc/jit/libgccjit.map|  5 ++
 gcc/testsuite/jit.dg/all-non-failing-tests.h | 10 +++
 gcc/testsuite/jit.dg/test-convert-vector.c   | 56 +++
 11 files changed, 282 insertions(+)
 create mode 100644 gcc/testsuite/jit.dg/test-convert-vector.c

diff --git a/gcc/jit/docs/topics/compatibility.rst b/gcc/jit/docs/topics/compatibility.rst
index ebede440ee4..33eb1a0bc06 100644
--- a/gcc/jit/docs/topics/compatibility.rst
+++ b/gcc/jit/docs/topics/compatibility.rst
@@ -378,3 +378,10 @@ alignment of a variable:
 
 ``LIBGCCJIT_ABI_25`` covers the addition of
 :func:`gcc_jit_type_get_restrict`
+
+.. _LIBGCCJIT_ABI_26:
+
+``LIBGCCJIT_ABI_26``
+
+``LIBGCCJIT_ABI_26`` covers the addition of
+:func:`gcc_jit_context_convert_vector`
diff --git a/gcc/jit/docs/topics/expressions.rst b/gcc/jit/docs/topics/expressions.rst
index 42cfee36302..a75e69bf1f4 100644
--- a/gcc/jit/docs/topics/expressions.rst
+++ b/gcc/jit/docs/topics/expressions.rst
@@ -699,6 +699,24 @@ Type-coercion
 
   #ifdef LIBGCCJIT_HAVE_gcc_jit_context_new_bitcast
 
+.. function:: gcc_jit_rvalue *
+  gcc_jit_context_convert_vector (gcc_jit_context *ctxt, \
+  gcc_jit_location *loc, \
+  gcc_jit_rvalue *vector, \
+  gcc_jit_type *type)
+
+   Given a vector rvalue, cast it to the type ``type``.
+
+   The number of elements in ``vector`` and ``type`` must match.
+   The ``type`` must be a vector type.
+
+   This entrypoint was added in :ref:`LIBGCCJIT_ABI_26`; you can test for
+   its presence using
+
+   .. code-block:: c
+
+  #ifdef LIBGCCJIT_HAVE_gcc_jit_context_convert_vector
+
 Lvalues
 ---
 
diff --git a/gcc/jit/jit-playback.cc b/gcc/jit/jit-playback.cc
index 537f3b1..48901d71418 100644
--- a/gcc/jit/jit-playback.cc
+++ b/gcc/jit/jit-playback.cc
@@ -1527,6 +1527,37 @@ new_array_access (location *loc,
 }
 }
 
+/* Construct a playback::rvalue instance (wrapping a tree) for a
+   vector conversion.  */
+
+playback::rvalue *
+playback::context::
+convert_vector (location *loc,
+		   rvalue *vector,
+		   type *type)
+{
+  gcc_assert (vector);
+  gcc_assert (type);
+
+  /* For comparison, see:
+   c/c-common.cc: c_build_vec_convert
+  */
+
+  tree t_vector = vector->as_tree ();
+
+  /* It seems IFN_VEC_CONVERT only works on registers, not on memory.  */
+  if (TREE_CODE (t_vector) == VAR_DECL)
+DECL_REGISTER (t_vector) = 1;
+  tree t_result =
+build_call_expr_internal_loc (UNKNOWN_LOCATION, IFN_VEC_CONVERT,
+  type->as_tree (), 1, t_vector);
+
+  if (loc)
+set_tree_location (t_result, loc);
+
+  return new rvalue (this, t_result);
+}
+
 /* Construct a tree for a field access.  */
 
 tree
diff --git a/gcc/jit/jit-playback.h b/gcc/jit/jit-playback.h
index b0166f8f6ce..59ffd739875 100644
--- a/gcc/jit/jit-playback.h
+++ b/gcc/jit/jit-playback.h
@@ -191,6 +191,11 @@ public:
 		rvalue *ptr,
 		rvalue *index);
 
+  rvalue *
+  convert_vector (location *loc,
+		  rvalue *vector,
+		  type *type);
+
   void
   set_str_option (enum gcc_jit_str_option opt,
 		  const char *value);
diff --git a/gcc/jit/jit-recording.cc b/gcc/jit/jit-recording.cc
index 9b5b8

Re: [PATCH] c++: visibility wrt template and ptrmem targs [PR70413]

2023-12-21 Thread Jason Merrill


On 12/21/23 14:12, Patrick Palka wrote:

On Sat, 16 Sep 2023, Jason Merrill wrote:


On 9/15/23 12:03, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


Thanks a lot.  Testing on cmcstl2 revealed that we don't maintain
visibility flags on alias templates properly, and so we can't trust
them when constraining visibility.  So I went ahead and pushed the
following restricted version of the patch which excludes alias templates
(and adds a couple of alias template targ linkage tests):


OK.


-- >8 --

Subject: [PATCH] c++: visibility wrt template and ptrmem targs [PR70413]

When constraining the visibility of an instantiation, we weren't
properly considering the visibility of PTRMEM_CST and TEMPLATE_DECL
template arguments.

This patch fixes this.  It turns out we don't maintain the relevant
visibility flags for alias templates (e.g. TREE_PUBLIC is never set),
so continue to ignore alias template template arguments for now.
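
For illustration, these are the kinds of template arguments in question:
pointers to members of a class in an anonymous namespace, and a template
declared in one.  This is a sketch with made-up names (hidden, box, takes_*),
on the assumption that such instantiations are meant to be constrained to
internal linkage rather than emitted as weak/global symbols:

```cpp
namespace {
  // Everything in here has internal linkage.
  struct hidden {
    void f () { }
    int m = 0;
  };
  template<typename T> struct box { };
}

// Pointer-to-member-function argument (a PTRMEM_CST in the front end).
template<void (hidden::*)()> constexpr int takes_pmf () { return 1; }

// Pointer-to-data-member argument (also a PTRMEM_CST).
template<int hidden::*> constexpr int takes_pmd () { return 2; }

// Template template argument (a TEMPLATE_DECL in the front end).
template<template<typename> class Q> constexpr int takes_tmpl () { return 3; }
```

Instantiating takes_pmf<&hidden::f>, takes_pmd<&hidden::m> or
takes_tmpl<box> names an entity that is local to the translation unit, so
the instantiation should not need to be visible outside it either.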

PR c++/70413
PR c++/107906

gcc/cp/ChangeLog:

* decl2.cc (min_vis_expr_r): Handle PTRMEM_CST and TEMPLATE_DECL
other than those for alias templates.

gcc/testsuite/ChangeLog:

* g++.dg/template/linkage2.C: New test.
* g++.dg/template/linkage3.C: New test.
* g++.dg/template/linkage4.C: New test.
* g++.dg/template/linkage4a.C: New test.
---
  gcc/cp/decl2.cc   | 22 ++
  gcc/testsuite/g++.dg/template/linkage2.C  | 13 +
  gcc/testsuite/g++.dg/template/linkage3.C  | 17 +
  gcc/testsuite/g++.dg/template/linkage4.C  | 16 
  gcc/testsuite/g++.dg/template/linkage4a.C | 14 ++
  5 files changed, 78 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/template/linkage2.C
  create mode 100644 gcc/testsuite/g++.dg/template/linkage3.C
  create mode 100644 gcc/testsuite/g++.dg/template/linkage4.C
  create mode 100644 gcc/testsuite/g++.dg/template/linkage4a.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index 0850d3f5bce..4777ceb8af7 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -2655,7 +2655,10 @@ min_vis_expr_r (tree *tp, int */*walk_subtrees*/, void *data)
int *vis_p = (int *)data;
int tpvis = VISIBILITY_DEFAULT;
  
-  switch (TREE_CODE (*tp))

+  tree t = *tp;
+  if (TREE_CODE (t) == PTRMEM_CST)
+t = PTRMEM_CST_MEMBER (t);
+  switch (TREE_CODE (t))
  {
  case CAST_EXPR:
  case IMPLICIT_CONV_EXPR:
@@ -2666,15 +2669,26 @@ min_vis_expr_r (tree *tp, int */*walk_subtrees*/, void *data)
  case NEW_EXPR:
  case CONSTRUCTOR:
  case LAMBDA_EXPR:
-  tpvis = type_visibility (TREE_TYPE (*tp));
+  tpvis = type_visibility (TREE_TYPE (t));
break;
  
+case TEMPLATE_DECL:

+  if (DECL_ALIAS_TEMPLATE_P (t))
+   /* FIXME: We don't maintain TREE_PUBLIC / DECL_VISIBILITY for
+  alias templates so we can't trust it here (PR107906).  */
+   break;
+  t = DECL_TEMPLATE_RESULT (t);
+  /* Fall through.  */
  case VAR_DECL:
  case FUNCTION_DECL:
-  if (! TREE_PUBLIC (*tp))
+  if (! TREE_PUBLIC (t))
tpvis = VISIBILITY_ANON;
else
-   tpvis = DECL_VISIBILITY (*tp);
+   tpvis = DECL_VISIBILITY (t);
+  break;
+
+case FIELD_DECL:
+  tpvis = type_visibility (DECL_CONTEXT (t));
break;
  
  default:

diff --git a/gcc/testsuite/g++.dg/template/linkage2.C b/gcc/testsuite/g++.dg/template/linkage2.C
new file mode 100644
index 000..08fb6930262
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/linkage2.C
@@ -0,0 +1,13 @@
+// PR c++/70413
+// { dg-do compile { target c++11 } }
+// { dg-final { scan-assembler-not "(weak|glob)\[^\n\]*_Z" } }
+
+namespace {
+  template struct A;
+}
+
+template class Q> void f() { }
+
+int main() {
+  f();
+}
diff --git a/gcc/testsuite/g++.dg/template/linkage3.C b/gcc/testsuite/g++.dg/template/linkage3.C
new file mode 100644
index 000..257aab33b38
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/linkage3.C
@@ -0,0 +1,17 @@
+// PR c++/70413
+// { dg-final { scan-assembler-not "(weak|glob)\[^\n\]*_Z" } }
+
+namespace {
+  struct A {
+void f();
+int m;
+  };
+}
+
+template void g() { }
+template void h() { }
+
+int main() {
+  g<&A::f>();
+  h<&A::m>();
+}
diff --git a/gcc/testsuite/g++.dg/template/linkage4.C b/gcc/testsuite/g++.dg/template/linkage4.C
new file mode 100644
index 000..03630eebd3d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/linkage4.C
@@ -0,0 +1,16 @@
+// PR c++/107906
+// { dg-do compile { target c++11 } }
+// { dg-final { scan-assembler-not "(weak|glob)\[^\n\]*_Z" { xfail *-*-* } } }
+
+namespace {
+  template using X = int;
+  struct A {
+template using X = int;
+  };
+}
+template class Q> void f() { }
+
+int main() {
+  f();
+  f();
+}
diff --git a/gcc/testsuite/g++.dg/template/linkage4a.C b/gcc/testsuite/g++.dg/template/linkage4a.C
new file mode 100644
i

Re: [PATCH] Document cond_copysign and cond_len_copysign optabs [PR112951]

2023-12-21 Thread Richard Biener




> On 21.12.2023 at 21:11, Andrew Pinski wrote:
> 
> This adds the documentation for cond_copysign and cond_len_copysign optabs.
> Also reorders optabs.def so that the entries follow the same order as the
> internal function definitions.

Ok

> gcc/ChangeLog:
> 
>PR middle-end/112951
>* doc/md.texi (cond_copysign): Document.
>(cond_len_copysign): Likewise.
>* optabs.def: Reorder cond_copysign to be before
>cond_fmin. Likewise for cond_len_copysign.
> 
> Signed-off-by: Andrew Pinski 
> ---
> gcc/doc/md.texi | 10 +-
> gcc/optabs.def  |  4 ++--
> 2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 536ce997f01..030a9bf4c3d 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -7315,6 +7315,7 @@ form of @samp{@var{op}@var{mode}2}.
> @cindex @code{cond_smax@var{mode}} instruction pattern
> @cindex @code{cond_umin@var{mode}} instruction pattern
> @cindex @code{cond_umax@var{mode}} instruction pattern
> +@cindex @code{cond_copysign@var{mode}} instruction pattern
> @cindex @code{cond_fmin@var{mode}} instruction pattern
> @cindex @code{cond_fmax@var{mode}} instruction pattern
> @cindex @code{cond_ashl@var{mode}} instruction pattern
> @@ -7334,6 +7335,7 @@ form of @samp{@var{op}@var{mode}2}.
> @itemx @samp{cond_smax@var{mode}}
> @itemx @samp{cond_umin@var{mode}}
> @itemx @samp{cond_umax@var{mode}}
> +@itemx @samp{cond_copysign@var{mode}}
> @itemx @samp{cond_fmin@var{mode}}
> @itemx @samp{cond_fmax@var{mode}}
> @itemx @samp{cond_ashl@var{mode}}
> @@ -7371,6 +7373,8 @@ form of @samp{@var{op}@var{mode}3}.  As an exception, the vector forms
> of shifts correspond to patterns like @code{vashl@var{mode}3} rather
> than patterns like @code{ashl@var{mode}3}.
> 
> +@samp{cond_copysign@var{mode}} is only defined for floating point modes.
> +
> @cindex @code{cond_fma@var{mode}} instruction pattern
> @cindex @code{cond_fms@var{mode}} instruction pattern
> @cindex @code{cond_fnma@var{mode}} instruction pattern
> @@ -7432,6 +7436,7 @@ form of @samp{@var{op}@var{mode}2}.
> @cindex @code{cond_len_smax@var{mode}} instruction pattern
> @cindex @code{cond_len_umin@var{mode}} instruction pattern
> @cindex @code{cond_len_umax@var{mode}} instruction pattern
> +@cindex @code{cond_len_copysign@var{mode}} instruction pattern
> @cindex @code{cond_len_fmin@var{mode}} instruction pattern
> @cindex @code{cond_len_fmax@var{mode}} instruction pattern
> @cindex @code{cond_len_ashl@var{mode}} instruction pattern
> @@ -7451,6 +7456,7 @@ form of @samp{@var{op}@var{mode}2}.
> @itemx @samp{cond_len_smax@var{mode}}
> @itemx @samp{cond_len_umin@var{mode}}
> @itemx @samp{cond_len_umax@var{mode}}
> +@itemx @samp{cond_len_copysign@var{mode}}
> @itemx @samp{cond_len_fmin@var{mode}}
> @itemx @samp{cond_len_fmax@var{mode}}
> @itemx @samp{cond_len_ashl@var{mode}}
> @@ -7478,11 +7484,13 @@ integer if @var{m} is scalar, otherwise it has the mode returned by
> @code{TARGET_VECTORIZE_GET_MASK_MODE}.  Operand 5 has whichever
> integer mode the target prefers.
> 
> -@samp{cond_@var{op}@var{mode}} generally corresponds to a conditional
> +@samp{cond_len_@var{op}@var{mode}} generally corresponds to a conditional
> form of @samp{@var{op}@var{mode}3}.  As an exception, the vector forms
> of shifts correspond to patterns like @code{vashl@var{mode}3} rather
> than patterns like @code{ashl@var{mode}3}.
> 
> +@samp{cond_len_copysign@var{mode}} is only defined for floating point modes.
> +
> @cindex @code{cond_len_fma@var{mode}} instruction pattern
> @cindex @code{cond_len_fms@var{mode}} instruction pattern
> @cindex @code{cond_len_fnma@var{mode}} instruction pattern
> diff --git a/gcc/optabs.def b/gcc/optabs.def
> index 07c06ba8cbb..92acec73b3a 100644
> --- a/gcc/optabs.def
> +++ b/gcc/optabs.def
> @@ -249,6 +249,7 @@ OPTAB_D (cond_smin_optab, "cond_smin$a")
> OPTAB_D (cond_smax_optab, "cond_smax$a")
> OPTAB_D (cond_umin_optab, "cond_umin$a")
> OPTAB_D (cond_umax_optab, "cond_umax$a")
> +OPTAB_D (cond_copysign_optab, "cond_copysign$F$a")
> OPTAB_D (cond_fmin_optab, "cond_fmin$a")
> OPTAB_D (cond_fmax_optab, "cond_fmax$a")
> OPTAB_D (cond_fma_optab, "cond_fma$a")
> @@ -256,7 +257,6 @@ OPTAB_D (cond_fms_optab, "cond_fms$a")
> OPTAB_D (cond_fnma_optab, "cond_fnma$a")
> OPTAB_D (cond_fnms_optab, "cond_fnms$a")
> OPTAB_D (cond_neg_optab, "cond_neg$a")
> -OPTAB_D (cond_copysign_optab, "cond_copysign$F$a")
> OPTAB_D (cond_one_cmpl_optab, "cond_one_cmpl$a")
> OPTAB_D (cond_len_add_optab, "cond_len_add$a")
> OPTAB_D (cond_len_sub_optab, "cond_len_sub$a")
> @@ -275,6 +275,7 @@ OPTAB_D (cond_len_smin_optab, "cond_len_smin$a")
> OPTAB_D (cond_len_smax_optab, "cond_len_smax$a")
> OPTAB_D (cond_len_umin_optab, "cond_len_umin$a")
> OPTAB_D (cond_len_umax_optab, "cond_len_umax$a")
> +OPTAB_D (cond_len_copysign_optab, "cond_len_copysign$F$a")
> OPTAB_D (cond_len_fmin_optab, "cond_len_fmin$a")
> OPTAB_D (cond_len_fmax_optab, "cond_len_fmax$a")
> OPTAB_D (cond_len_fma_optab, "cond_len_fma$a")
> @@ -282,

[PATCH] Document cond_copysign and cond_len_copysign optabs [PR112951]

2023-12-21 Thread Andrew Pinski

This adds the documentation for cond_copysign and cond_len_copysign optabs.
Also reorders optabs.def so that the entries follow the same order as the
internal function definitions.
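
As a reminder of what the documented pattern computes, here is a scalar
model of a conditional copysign.  It is only a sketch: the function name and
the std::array interface are made up for illustration, and it assumes the
usual cond_@var{op} convention that inactive lanes take the value of a
separate fallback operand:

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Scalar model of a conditional copysign over N lanes (illustrative only):
// lane i gets copysign (mag[i], sgn[i]) where mask[i] is set, and
// fallback[i] otherwise.
template<std::size_t N>
  std::array<double, N>
  cond_copysign_model (const std::array<bool, N>& mask,
		       const std::array<double, N>& mag,
		       const std::array<double, N>& sgn,
		       const std::array<double, N>& fallback)
  {
    std::array<double, N> out{};
    for (std::size_t i = 0; i < N; ++i)
      out[i] = mask[i] ? std::copysign (mag[i], sgn[i]) : fallback[i];
    return out;
  }
```

The cond_len_ form additionally limits the operation by a length operand,
which this model leaves out.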

gcc/ChangeLog:

PR middle-end/112951
* doc/md.texi (cond_copysign): Document.
(cond_len_copysign): Likewise.
* optabs.def: Reorder cond_copysign to be before
cond_fmin. Likewise for cond_len_copysign.

Signed-off-by: Andrew Pinski 
---
 gcc/doc/md.texi | 10 +-
 gcc/optabs.def  |  4 ++--
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 536ce997f01..030a9bf4c3d 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -7315,6 +7315,7 @@ form of @samp{@var{op}@var{mode}2}.
 @cindex @code{cond_smax@var{mode}} instruction pattern
 @cindex @code{cond_umin@var{mode}} instruction pattern
 @cindex @code{cond_umax@var{mode}} instruction pattern
+@cindex @code{cond_copysign@var{mode}} instruction pattern
 @cindex @code{cond_fmin@var{mode}} instruction pattern
 @cindex @code{cond_fmax@var{mode}} instruction pattern
 @cindex @code{cond_ashl@var{mode}} instruction pattern
@@ -7334,6 +7335,7 @@ form of @samp{@var{op}@var{mode}2}.
 @itemx @samp{cond_smax@var{mode}}
 @itemx @samp{cond_umin@var{mode}}
 @itemx @samp{cond_umax@var{mode}}
+@itemx @samp{cond_copysign@var{mode}}
 @itemx @samp{cond_fmin@var{mode}}
 @itemx @samp{cond_fmax@var{mode}}
 @itemx @samp{cond_ashl@var{mode}}
@@ -7371,6 +7373,8 @@ form of @samp{@var{op}@var{mode}3}.  As an exception, the vector forms
 of shifts correspond to patterns like @code{vashl@var{mode}3} rather
 than patterns like @code{ashl@var{mode}3}.
 
+@samp{cond_copysign@var{mode}} is only defined for floating point modes.
+
 @cindex @code{cond_fma@var{mode}} instruction pattern
 @cindex @code{cond_fms@var{mode}} instruction pattern
 @cindex @code{cond_fnma@var{mode}} instruction pattern
@@ -7432,6 +7436,7 @@ form of @samp{@var{op}@var{mode}2}.
 @cindex @code{cond_len_smax@var{mode}} instruction pattern
 @cindex @code{cond_len_umin@var{mode}} instruction pattern
 @cindex @code{cond_len_umax@var{mode}} instruction pattern
+@cindex @code{cond_len_copysign@var{mode}} instruction pattern
 @cindex @code{cond_len_fmin@var{mode}} instruction pattern
 @cindex @code{cond_len_fmax@var{mode}} instruction pattern
 @cindex @code{cond_len_ashl@var{mode}} instruction pattern
@@ -7451,6 +7456,7 @@ form of @samp{@var{op}@var{mode}2}.
 @itemx @samp{cond_len_smax@var{mode}}
 @itemx @samp{cond_len_umin@var{mode}}
 @itemx @samp{cond_len_umax@var{mode}}
+@itemx @samp{cond_len_copysign@var{mode}}
 @itemx @samp{cond_len_fmin@var{mode}}
 @itemx @samp{cond_len_fmax@var{mode}}
 @itemx @samp{cond_len_ashl@var{mode}}
@@ -7478,11 +7484,13 @@ integer if @var{m} is scalar, otherwise it has the mode returned by
 @code{TARGET_VECTORIZE_GET_MASK_MODE}.  Operand 5 has whichever
 integer mode the target prefers.
 
-@samp{cond_@var{op}@var{mode}} generally corresponds to a conditional
+@samp{cond_len_@var{op}@var{mode}} generally corresponds to a conditional
 form of @samp{@var{op}@var{mode}3}.  As an exception, the vector forms
 of shifts correspond to patterns like @code{vashl@var{mode}3} rather
 than patterns like @code{ashl@var{mode}3}.
 
+@samp{cond_len_copysign@var{mode}} is only defined for floating point modes.
+
 @cindex @code{cond_len_fma@var{mode}} instruction pattern
 @cindex @code{cond_len_fms@var{mode}} instruction pattern
 @cindex @code{cond_len_fnma@var{mode}} instruction pattern
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 07c06ba8cbb..92acec73b3a 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -249,6 +249,7 @@ OPTAB_D (cond_smin_optab, "cond_smin$a")
 OPTAB_D (cond_smax_optab, "cond_smax$a")
 OPTAB_D (cond_umin_optab, "cond_umin$a")
 OPTAB_D (cond_umax_optab, "cond_umax$a")
+OPTAB_D (cond_copysign_optab, "cond_copysign$F$a")
 OPTAB_D (cond_fmin_optab, "cond_fmin$a")
 OPTAB_D (cond_fmax_optab, "cond_fmax$a")
 OPTAB_D (cond_fma_optab, "cond_fma$a")
@@ -256,7 +257,6 @@ OPTAB_D (cond_fms_optab, "cond_fms$a")
 OPTAB_D (cond_fnma_optab, "cond_fnma$a")
 OPTAB_D (cond_fnms_optab, "cond_fnms$a")
 OPTAB_D (cond_neg_optab, "cond_neg$a")
-OPTAB_D (cond_copysign_optab, "cond_copysign$F$a")
 OPTAB_D (cond_one_cmpl_optab, "cond_one_cmpl$a")
 OPTAB_D (cond_len_add_optab, "cond_len_add$a")
 OPTAB_D (cond_len_sub_optab, "cond_len_sub$a")
@@ -275,6 +275,7 @@ OPTAB_D (cond_len_smin_optab, "cond_len_smin$a")
 OPTAB_D (cond_len_smax_optab, "cond_len_smax$a")
 OPTAB_D (cond_len_umin_optab, "cond_len_umin$a")
 OPTAB_D (cond_len_umax_optab, "cond_len_umax$a")
+OPTAB_D (cond_len_copysign_optab, "cond_len_copysign$F$a")
 OPTAB_D (cond_len_fmin_optab, "cond_len_fmin$a")
 OPTAB_D (cond_len_fmax_optab, "cond_len_fmax$a")
 OPTAB_D (cond_len_fma_optab, "cond_len_fma$a")
@@ -282,7 +283,6 @@ OPTAB_D (cond_len_fms_optab, "cond_len_fms$a")
 OPTAB_D (cond_len_fnma_optab, "cond_len_fnma$a")
 OPTAB_D (cond_len_fnms_optab, "cond_len_fnms$a")
 OPTAB_D (cond_len_neg_optab, "

Re: [PATCH] RISC-V: Add --with-cmodel configure-time argument

2023-12-21 Thread Jeff Law





On 12/21/23 12:35, Palmer Dabbelt wrote:

On Thu, 21 Dec 2023 11:18:22 PST (-0800), jeffreya...@gmail.com wrote:



On 12/20/23 11:41, Palmer Dabbelt wrote:

I couldn't find another way to set the default code model.

gcc/ChangeLog:

* config.gcc (RISC-V): Add --with-cmodel
* config/riscv/riscv.h (TARGET_DEFAULT_CMODEL): Use
TARGET_RISCV_DEFAULT_CMODEL

OK once it's sniff tested.


Thanks.  A few of us were chatting in the office yesterday, looks like 
it should be pretty manageable to get the large code model stuff into CI 
for testing.  With the holidays stuff might be a little clunky, but 
Patrick or Edwin should be able to get this going eventually.

Yea.  100% expected.



So I'm going to do nothing for now ;)

Likewise.

jeff

Re: [PATCH] Allow overriding EXPECT

2023-12-21 Thread Mike Stump

On Dec 21, 2023, at 8:49 AM, Christophe Lyon  wrote:
> 
> While investigating possible race conditions in the GCC testsuites
> caused by bufferization issues, I wanted to investigate workarounds
> similar to GDB's READ1 [1], and I noticed it was not always possible
> to override EXPECT when running 'make check'.
> 
> This patch adds the missing support in various Makefiles.

Ok.

Re: [PATCH] testsuite: Remove testsuite_tr1.h

2023-12-21 Thread Jason Merrill


On 12/21/23 10:52, Patrick Palka wrote:

On Thu, Dec 21, 2023 at 8:29 AM Patrick Palka  wrote:


On Wed, 20 Dec 2023, Ken Matsui wrote:


This patch removes the testsuite_tr1.h dependency from g++.dg/ext/is_*.C
tests since the header is supposed to be used only by libstdc++, not
front-end.  This also includes test code consistency fixes.


For the record this fixes the test failures reported at
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641058.html



LGTM


Very minor but let's use the commit title

   c++: testsuite: Remove testsuite_tr1.h includes

to convey that the commit only touches C++ tests, and isn't removing
the file testsuite_tr1.h but rather #includes of it :)


OK with that change.

Jason

Re: [PATCH] c++: fix -Wparentheses with boolean-like class types

2023-12-21 Thread Jason Merrill


On 12/20/23 20:01, Patrick Palka wrote:

On Wed, 20 Dec 2023, Jason Merrill wrote:


On 12/20/23 17:54, Patrick Palka wrote:

On Wed, 20 Dec 2023, Jason Merrill wrote:


On 12/20/23 17:07, Patrick Palka wrote:

Bootstrap and regtesting in progress on x86_64-pc-linux-gnu, does this
look OK for trunk if successful?

-- >8 --

Since r14-4977-g0f2e2080685e75 the -Wparentheses warning now undesirably
warns on the idiom

Wparentheses-34.C:9:14: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
   9 |   b = v[i] = true;
 |  ^~~~

where v has type std::vector<bool>.  That commit intended to only extend
the existing diagnostics so that they happen in a template context as
well, but the refactoring of is_assignment_op_expr_p caused us, for this
particular -Wparentheses warning (from convert_for_assignment), to now
consider user-defined operator= instead of just built-in operator=.  And
since std::vector<bool> is really a bitset, whose operator[] returns a
class type with such a user-defined operator= (taking bool), we now
warn.

But arguably "boolish" class types should be treated like ordinary bool
as far as the warning is concerned.  To that end this patch relaxes the
warning for such types, specifically when the (class) type can be
(implicitly) converted to and assigned from a bool.  This should cover
at least implementations of std::vector<bool>::reference.
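
To make the "boolish" criterion concrete, here is a rough user-level
analogue written with standard type traits.  It is only a sketch: the name
is_boolish is hypothetical and the patch performs its check inside the front
end, not with library traits.  The helper function shows the idiom that
triggers the warning:

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Rough analogue of the criterion described above: the type converts
// implicitly to bool and can be assigned from a bool.
template<typename T>
  constexpr bool is_boolish
    = std::is_convertible<T, bool>::value
      && std::is_assignable<T&, bool>::value;

// The idiom from the warning: the inner assignment goes through the
// user-defined operator=(bool) of std::vector<bool>::reference.
inline bool
set_and_report (std::vector<bool>& v, std::size_t i)
{
  bool b;
  b = v[i] = true;
  return b;
}
```

With these definitions, std::vector<bool>::reference satisfies is_boolish
while a class such as std::string does not.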

gcc/cp/ChangeLog:

* cp-tree.h (maybe_warn_unparenthesized_assignment): Add
'nested_p' bool parameter.
* semantics.cc (is_assignment_op_expr_p): Add 'rhs' bool
parameter and set it accordingly.
(class_type_is_boolish_cache): Define.
(class_type_is_boolish): Define.
(maybe_warn_unparenthesized_assignment): Add 'nested_p'
bool parameter.  Relax the warning for nested assignments
to boolean-like class types.
(maybe_convert_cond): Pass nested_p=false to
maybe_warn_unparenthesized_assignment.
* typeck.cc (convert_for_assignment): Pass nested_p=true to
maybe_warn_unparenthesized_assignment.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wparentheses-34.C: New test.
---
gcc/cp/cp-tree.h|   2 +-
gcc/cp/semantics.cc | 106
++--
gcc/cp/typeck.cc|   2 +-
gcc/testsuite/g++.dg/warn/Wparentheses-34.C |  31 ++
4 files changed, 129 insertions(+), 12 deletions(-)
create mode 100644 gcc/testsuite/g++.dg/warn/Wparentheses-34.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 1979572c365..97065cccf3d 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7928,7 +7928,7 @@ extern tree lambda_regenerating_args
(tree);
extern tree most_general_lambda (tree);
extern tree finish_omp_target   (location_t, tree,
tree, bool);
extern void finish_omp_target_clauses   (location_t, tree,
tree *);
-extern void maybe_warn_unparenthesized_assignment (tree,
tsubst_flags_t);
+extern void maybe_warn_unparenthesized_assignment (tree, bool,
tsubst_flags_t);
extern tree cp_check_pragma_unroll  (location_t, tree);
  /* in tree.cc */
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 64839b1ac87..92acd560fa4 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -839,10 +839,11 @@ finish_goto_stmt (tree destination)
  return add_stmt (build_stmt (input_location, GOTO_EXPR,
destination));
}
-/* Returns true if T corresponds to an assignment operator
expression.
*/
+/* Returns true if T corresponds to an assignment operator expression,
+   and sets *LHS to its left-hand-side operand if so.  */
  static bool
-is_assignment_op_expr_p (tree t)
+is_assignment_op_expr_p (tree t, tree *lhs)
{
  if (t == NULL_TREE)
return false;
@@ -850,7 +851,10 @@ is_assignment_op_expr_p (tree t)
  if (TREE_CODE (t) == MODIFY_EXPR
  || (TREE_CODE (t) == MODOP_EXPR
  && TREE_CODE (TREE_OPERAND (t, 1)) == NOP_EXPR))
-return true;
+{
+  *lhs = TREE_OPERAND (t, 0);
+  return true;
+}
tree call = extract_call_expr (t);
  if (call == NULL_TREE
@@ -859,26 +863,107 @@ is_assignment_op_expr_p (tree t)
return false;
tree fndecl = cp_get_callee_fndecl_nofold (call);
-  return fndecl != NULL_TREE
-&& DECL_ASSIGNMENT_OPERATOR_P (fndecl)
-&& DECL_OVERLOADED_OPERATOR_IS (fndecl, NOP_EXPR);
+  if (fndecl != NULL_TREE
+  && DECL_ASSIGNMENT_OPERATOR_P (fndecl)
+  && DECL_OVERLOADED_OPERATOR_IS (fndecl, NOP_EXPR))
+{
+  *lhs = get_nth_callarg (call, 0);
+  return true;
+}
+
+  return false;
}
+static GTY((deletable)) hash_map
*class_type_is_boolish_cache;
+
+/* Return true if the class type TYPE can be converted to and assigned
+   from a boolean.  */
+
+static bool
+class_type_is_boolish (tree type)
+{
+  type = TYPE_MAIN_VARIANT (type)

[PATCH] toplevel: don't override gettext-runtime/configure-discovered build args

2023-12-21 Thread Arsen Arsenović

ChangeLog:
PR bootstrap/112534
* Makefile.def (host-gettext): Set all_args_override="".
* Makefile.in: Regenerate.
* Makefile.tpl (all--args): Define as a helper macro for
computing extra arguments to make.
(all): Use all--args over args.
---
Hi,

This patch fixes the build failure noted in PR112534 by preventing the
build system from overriding the compiler flags the gettext-runtime
subconfigure discovers.

This would appear to keep bootstrap working correctly, as shown in:

  [arsen@gcc2-power8 build2]$ readelf -p .comment */intl/dcgettext.o

  File: gettext/intl/dcgettext.o

  String dump of section '.comment':
[ 1]  GCC: (GNU) 14.0.0 20231220 (experimental)


  File: prev-gettext/intl/dcgettext.o

  String dump of section '.comment':
[ 1]  GCC: (GNU) 14.0.0 20231220 (experimental)


  File: stage1-gettext/intl/dcgettext.o

  String dump of section '.comment':
[ 1]  GCC: (GNU) 4.8.5 20150623 (Red Hat 4.8.5-44)

Tested on ppc64le-redhat-linux, OK for trunk?

TIA, have a lovely day.

 Makefile.def |  2 ++
 Makefile.in  | 40 +---
 Makefile.tpl |  8 +---
 3 files changed, 28 insertions(+), 22 deletions(-)

diff --git a/Makefile.def b/Makefile.def
index 3f8962c28032..93222bfd7a4b 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -76,6 +76,8 @@ host_modules= { module= gprof; };
 host_modules= { module= gprofng; };
 host_modules= { module= gettext; bootstrap=true; no_install=true;
 module_srcdir= "gettext/gettext-runtime";
+   // Don't override configure-discovered build arguments
+   all_args_override="";
// We always build gettext with pic, because some packages 
(e.g. gdbserver)
// need it in some configuratons, which is determined via 
nontrivial tests.
// Always enabling pic seems to make sense for something tied to
diff --git a/Makefile.in b/Makefile.in
index c6313048c914..e67a50750c76 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -3168,6 +3168,8 @@ TAGS: do-TAGS
 
 
 
+
+
 # --
 # Modules which run on the build machine
 # --
@@ -20189,7 +20191,7 @@ all-gettext: configure-gettext
s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
$(HOST_EXPORTS)  \
(cd $(HOST_SUBDIR)/gettext && \
- $(MAKE) $(BASE_FLAGS_TO_PASS) $(EXTRA_HOST_FLAGS) 
$(STAGE1_FLAGS_TO_PASS)  \
+ $(MAKE) $(BASE_FLAGS_TO_PASS)  $(STAGE1_FLAGS_TO_PASS)  \
$(TARGET-gettext))
 @endif gettext
 
@@ -20219,7 +20221,7 @@ all-stage1-gettext: configure-stage1-gettext
CFLAGS_FOR_TARGET="$(CFLAGS_FOR_TARGET)" \
CXXFLAGS_FOR_TARGET="$(CXXFLAGS_FOR_TARGET)" \
LIBCFLAGS_FOR_TARGET="$(LIBCFLAGS_FOR_TARGET)" \
-   $(EXTRA_HOST_FLAGS)  \
+ \
$(STAGE1_FLAGS_TO_PASS)  \
TFLAGS="$(STAGE1_TFLAGS)"  \
$(TARGET-stage1-gettext)
@@ -20234,7 +20236,7 @@ clean-stage1-gettext:
  $(MAKE) stage1-start; \
fi; \
cd $(HOST_SUBDIR)/gettext && \
-   $(MAKE) $(EXTRA_HOST_FLAGS)  \
+   $(MAKE)   \
$(STAGE1_FLAGS_TO_PASS)  clean
 @endif gettext-bootstrap
 
@@ -20264,7 +20266,7 @@ all-stage2-gettext: configure-stage2-gettext
CFLAGS_FOR_TARGET="$(CFLAGS_FOR_TARGET)" \
CXXFLAGS_FOR_TARGET="$(CXXFLAGS_FOR_TARGET)" \
LIBCFLAGS_FOR_TARGET="$(LIBCFLAGS_FOR_TARGET)" \
-   $(EXTRA_HOST_FLAGS) $(POSTSTAGE1_FLAGS_TO_PASS)  \
+$(POSTSTAGE1_FLAGS_TO_PASS)  \
TFLAGS="$(STAGE2_TFLAGS)"  \
$(TARGET-stage2-gettext)
 
@@ -20278,7 +20280,7 @@ clean-stage2-gettext:
  $(MAKE) stage2-start; \
fi; \
cd $(HOST_SUBDIR)/gettext && \
-   $(MAKE) $(EXTRA_HOST_FLAGS) $(POSTSTAGE1_FLAGS_TO_PASS)  clean
+   $(MAKE)  $(POSTSTAGE1_FLAGS_TO_PASS)  clean
 @endif gettext-bootstrap
 
 
@@ -20307,7 +20309,7 @@ all-stage3-gettext: configure-stage3-gettext
CFLAGS_FOR_TARGET="$(CFLAGS_FOR_TARGET)" \
CXXFLAGS_FOR_TARGET="$(CXXFLAGS_FOR_TARGET)" \
LIBCFLAGS_FOR_TARGET="$(LIBCFLAGS_FOR_TARGET)" \
-   $(EXTRA_HOST_FLAGS) $(POSTSTAGE1_FLAGS_TO_PASS)  \
+$(POSTSTAGE1_FLAGS_TO_PASS)  \
TFLAGS="$(STAGE3_TFLAGS)"  \
$(TARGET-stage3-gettext)
 
@@ -20321,7 +20323,7 @@ clean-stage3-gettext:
  $(MAKE) stage3-start; \
fi; \
cd $(HOST_SUBDIR)/gettext && \
-   $(MAKE) $(EXTRA_HOST_FLAGS) $(POSTSTAGE1_FLAGS_TO_PASS)  clean
+   $(MAKE)  $(POSTSTAGE1_FLAGS_TO_PASS)  clean
 @endif gettext-bootstrap
 
 
@@ -20350,7 +20352,7 @@ all-stage4-gettext: configure-stage4-gettext
CFLAGS_FOR_TARGET="$(CFLAGS_FOR_TARGET)" \
CXXFLAGS_FOR_TARGET="$(CXXFLAGS_FOR_TARGET)"

Re: [PATCH] RISC-V: Add --with-cmodel configure-time argument

2023-12-21 Thread Palmer Dabbelt


On Thu, 21 Dec 2023 11:18:22 PST (-0800), jeffreya...@gmail.com wrote:



On 12/20/23 11:41, Palmer Dabbelt wrote:

I couldn't find another way to set the default code model.

gcc/ChangeLog:

* config.gcc (RISC-V): Add --with-cmodel
* config/riscv/riscv.h (TARGET_DEFAULT_CMODEL): Use
TARGET_RISCV_DEFAULT_CMODEL

OK once it's sniff tested.


Thanks.  A few of us were chatting in the office yesterday, looks like 
it should be pretty manageable to get the large code model stuff into CI 
for testing.  With the holidays stuff might be a little clunky, but 
Patrick or Edwin should be able to get this going eventually.


So I'm going to do nothing for now ;)



jeff

Re: [PATCH] RISC-V: Add --with-cmodel configure-time argument

2023-12-21 Thread Jeff Law





On 12/20/23 11:41, Palmer Dabbelt wrote:

I couldn't find another way to set the default code model.

gcc/ChangeLog:

* config.gcc (RISC-V): Add --with-cmodel
* config/riscv/riscv.h (TARGET_DEFAULT_CMODEL): Use
TARGET_RISCV_DEFAULT_CMODEL

OK once it's sniff tested.

jeff

Re: [PATCH v1] RISC-V: XFail the signbit-5 run test for RVV

2023-12-21 Thread Jeff Law





On 12/20/23 19:25, pan2...@intel.com wrote:

From: Pan Li 

This patch XFAILs the signbit-5 run test for RVV.  The test file states
a limitation at its top: "This test does not work when the truth type
does not match vector type."  That applies here because the RVV vector
truth type is not an integer type.

A riscv-sim target board such as the one below passes `-march=rv64gcv`
when building the run-test ELF.  Thus RVV cannot bypass this test the
way aarch64_sve does with the additional option `-march=armv8-a`.

   riscv-sim/-march=rv64gcv/-mabi=lp64d/-mcmodel=medlow

For RVV, we use dg-xfail-run-if for this case, as `amdgcn` does.

But isn't that just going to turn this into an XPASS when vector is not
enabled?


Looking at a recent rv64gc run of mine:


PASS: gcc.dg/signbit-5.c (test for excess errors)
PASS: gcc.dg/signbit-5.c execution test



Ideally we'd find a way to handle with and without vector.

jeff

Re: [PATCH] c++: visibility wrt template and ptrmem targs [PR70413]

2023-12-21 Thread Patrick Palka

On Sat, 16 Sep 2023, Jason Merrill wrote:

> On 9/15/23 12:03, Patrick Palka wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?
> 
> OK.

Thanks a lot.  Testing on cmcstl2 revealed that we don't maintain
visibility flags on alias templates properly, and so we can't trust
them when constraining visibility.  So I went ahead and pushed the
following restricted version of the patch which excludes alias templates
(and adds a couple of alias template targ linkage tests):

-- >8 --

Subject: [PATCH] c++: visibility wrt template and ptrmem targs [PR70413]

When constraining the visibility of an instantiation, we weren't
properly considering the visibility of PTRMEM_CST and TEMPLATE_DECL
template arguments.

This patch fixes this.  It turns out we don't maintain the relevant
visibility flags for alias templates (e.g. TREE_PUBLIC is never set),
so continue to ignore alias template template arguments for now.
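
As a rough illustration of the situation described above (a sketch only, not one of the new tests; `call_member` and `read_member` are hypothetical names), the templates below are instantiated with pointer-to-member arguments that refer to a class in an unnamed namespace.  The function types themselves do not mention `A`, so only the template argument ties each specialization to the internal-linkage class.  The expectation stated by the patch is that such specializations are not emitted as global or weak symbols.

```cpp
#include <cassert>

namespace {
  // Everything declared in an unnamed namespace has internal linkage.
  struct A
  {
    int f () { return 1; }
    int m = 2;
  };
}

// Non-type template parameter: pointer to member function of A.
template<int (A::*Fn) ()>
int call_member ()
{
  A a;
  return (a.*Fn) ();
}

// Non-type template parameter: pointer to data member of A.
template<int A::*Mp>
int read_member ()
{
  A a;
  return a.*Mp;
}
```

The run-time behaviour is unremarkable; the interesting part is the symbol linkage, which the patch's tests check with scan-assembler-not.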

PR c++/70413
PR c++/107906

gcc/cp/ChangeLog:

* decl2.cc (min_vis_expr_r): Handle PTRMEM_CST and TEMPLATE_DECL
other than those for alias templates.

gcc/testsuite/ChangeLog:

* g++.dg/template/linkage2.C: New test.
* g++.dg/template/linkage3.C: New test.
* g++.dg/template/linkage4.C: New test.
* g++.dg/template/linkage4a.C: New test.
---
 gcc/cp/decl2.cc   | 22 ++
 gcc/testsuite/g++.dg/template/linkage2.C  | 13 +
 gcc/testsuite/g++.dg/template/linkage3.C  | 17 +
 gcc/testsuite/g++.dg/template/linkage4.C  | 16 
 gcc/testsuite/g++.dg/template/linkage4a.C | 14 ++
 5 files changed, 78 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/template/linkage2.C
 create mode 100644 gcc/testsuite/g++.dg/template/linkage3.C
 create mode 100644 gcc/testsuite/g++.dg/template/linkage4.C
 create mode 100644 gcc/testsuite/g++.dg/template/linkage4a.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index 0850d3f5bce..4777ceb8af7 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -2655,7 +2655,10 @@ min_vis_expr_r (tree *tp, int */*walk_subtrees*/, void 
*data)
   int *vis_p = (int *)data;
   int tpvis = VISIBILITY_DEFAULT;
 
-  switch (TREE_CODE (*tp))
+  tree t = *tp;
+  if (TREE_CODE (t) == PTRMEM_CST)
+t = PTRMEM_CST_MEMBER (t);
+  switch (TREE_CODE (t))
 {
 case CAST_EXPR:
 case IMPLICIT_CONV_EXPR:
@@ -2666,15 +2669,26 @@ min_vis_expr_r (tree *tp, int */*walk_subtrees*/, void 
*data)
 case NEW_EXPR:
 case CONSTRUCTOR:
 case LAMBDA_EXPR:
-  tpvis = type_visibility (TREE_TYPE (*tp));
+  tpvis = type_visibility (TREE_TYPE (t));
   break;
 
+case TEMPLATE_DECL:
+  if (DECL_ALIAS_TEMPLATE_P (t))
+   /* FIXME: We don't maintain TREE_PUBLIC / DECL_VISIBILITY for
+  alias templates so we can't trust it here (PR107906).  */
+   break;
+  t = DECL_TEMPLATE_RESULT (t);
+  /* Fall through.  */
 case VAR_DECL:
 case FUNCTION_DECL:
-  if (! TREE_PUBLIC (*tp))
+  if (! TREE_PUBLIC (t))
tpvis = VISIBILITY_ANON;
   else
-   tpvis = DECL_VISIBILITY (*tp);
+   tpvis = DECL_VISIBILITY (t);
+  break;
+
+case FIELD_DECL:
+  tpvis = type_visibility (DECL_CONTEXT (t));
   break;
 
 default:
diff --git a/gcc/testsuite/g++.dg/template/linkage2.C 
b/gcc/testsuite/g++.dg/template/linkage2.C
new file mode 100644
index 000..08fb6930262
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/linkage2.C
@@ -0,0 +1,13 @@
+// PR c++/70413
+// { dg-do compile { target c++11 } }
+// { dg-final { scan-assembler-not "(weak|glob)\[^\n\]*_Z" } }
+
+namespace {
+  template struct A;
+}
+
+template class Q> void f() { }
+
+int main() {
+  f();
+}
diff --git a/gcc/testsuite/g++.dg/template/linkage3.C 
b/gcc/testsuite/g++.dg/template/linkage3.C
new file mode 100644
index 000..257aab33b38
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/linkage3.C
@@ -0,0 +1,17 @@
+// PR c++/70413
+// { dg-final { scan-assembler-not "(weak|glob)\[^\n\]*_Z" } }
+
+namespace {
+  struct A {
+void f();
+int m;
+  };
+}
+
+template void g() { }
+template void h() { }
+
+int main() {
+  g<&A::f>();
+  h<&A::m>();
+}
diff --git a/gcc/testsuite/g++.dg/template/linkage4.C 
b/gcc/testsuite/g++.dg/template/linkage4.C
new file mode 100644
index 000..03630eebd3d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/linkage4.C
@@ -0,0 +1,16 @@
+// PR c++/107906
+// { dg-do compile { target c++11 } }
+// { dg-final { scan-assembler-not "(weak|glob)\[^\n\]*_Z" { xfail *-*-* } } }
+
+namespace {
+  template using X = int;
+  struct A {
+template using X = int;
+  };
+}
+template class Q> void f() { }
+
+int main() {
+  f();
+  f();
+}
diff --git a/gcc/testsuite/g++.dg/template/linkage4a.C 
b/gcc/testsuite/g++.dg/template/linkage4a.C
new file mode 100644
index 000..f1934fd6557
--- /dev/null
+++ b/gcc/testsuite/g++.d

Re: [PATCH v5 2/3] RISC-V: Add crypto machine descriptions

2023-12-21 Thread Jeff Law





On 12/20/23 20:50, juzhe.zh...@rivai.ai wrote:

+   (and:VI
+ (match_operand:VI 3 "register_operand" "vr, vr, vr, vr")
+ (not:VI (match_operand:VI 4 "register_operand" "vr, vr, vr, vr")))

This order should be swapped like ARM SVE:

(define_expand "@cond_bic"
   [(set (match_operand:SVE_FULL_I 0 "register_operand")
   (unspec:SVE_FULL_I
     [(match_operand: 1 "register_operand")
      (and:SVE_FULL_I
        (not:SVE_FULL_I (match_operand:SVE_FULL_I 3 "register_operand"))
        (match_operand:SVE_FULL_I 2 "register_operand"))
      (match_operand:SVE_FULL_I 4 "aarch64_simd_reg_or_zero")]
     UNSPEC_SEL))]
"TARGET_SVE"
)


Correct.  This case is even noted in the internals manual ;-)


A machine that has an instruction that performs a bitwise logical-and of one
operand with the bitwise negation of the other should specify the pattern
for that instruction as

@smallexample
(define_insn ""
  [(set (match_operand:@var{m} 0 @dots{})
(and:@var{m} (not:@var{m} (match_operand:@var{m} 1 @dots{}))
 (match_operand:@var{m} 2 @dots{})))]
  "@dots{}"
  "@dots{}")
@end smallexample
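
As a plain C++ analogy for the operand order (a sketch only; the function name `and_not` is hypothetical and nothing here is specific to the RISC-V or SVE patterns), the complemented operand is written first, mirroring the `(and (not op1) op2)` shape quoted above:

```cpp
#include <cassert>
#include <cstdint>

// Bitwise AND of Y with the complement of X.  The complemented operand
// comes first, matching the (and (not X) Y) shape quoted above.
constexpr std::uint32_t
and_not (std::uint32_t x, std::uint32_t y)
{
  return ~x & y;
}
```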



Jeff

[PATCH] libgccjit: Allow sending a const pointer as argument

2023-12-21 Thread Antoni Boucher

Hi.
This patch adds the ability to send const pointer as argument to a
function.
Thanks for the review.
From f53c4600d8103a5612e7de6cb8205cad37421074 Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Tue, 24 May 2022 17:44:53 -0400
Subject: [PATCH] libgccjit: Allow sending a const pointer as argument

gcc/jit/ChangeLog:

	* jit-recording.h: Remove memento_of_get_const::accepts_writes_from.

gcc/testsuite/ChangeLog:

	* jit.dg/all-non-failing-tests.h: Add test-const-pointer-argument.c.
	* jit.dg/test-const-pointer-argument.c: New test.
---
 gcc/jit/jit-recording.h   |  6 --
 gcc/testsuite/jit.dg/all-non-failing-tests.h  | 10 +++
 .../jit.dg/test-const-pointer-argument.c  | 76 +++
 3 files changed, 86 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/jit.dg/test-const-pointer-argument.c

diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h
index 4a8082991fb..21aeb7d0bbd 100644
--- a/gcc/jit/jit-recording.h
+++ b/gcc/jit/jit-recording.h
@@ -720,12 +720,6 @@ public:
   memento_of_get_const (type *other_type)
   : decorated_type (other_type) {}
 
-  bool accepts_writes_from (type */*rtype*/) final override
-  {
-/* Can't write to a "const".  */
-return false;
-  }
-
   /* Strip off the "const", giving the underlying type.  */
   type *unqualified () final override { return m_other_type; }
 
diff --git a/gcc/testsuite/jit.dg/all-non-failing-tests.h b/gcc/testsuite/jit.dg/all-non-failing-tests.h
index e762563f9bd..6ac9173ea7d 100644
--- a/gcc/testsuite/jit.dg/all-non-failing-tests.h
+++ b/gcc/testsuite/jit.dg/all-non-failing-tests.h
@@ -126,6 +126,13 @@
 #undef create_code
 #undef verify_code
 
+/* test-const-pointer-argument.c */
+#define create_code create_code_const_pointer_argument
+#define verify_code verify_code_const_pointer_argument
+#include "test-const-pointer-argument.c"
+#undef create_code
+#undef verify_code
+
 /* test-debug-strings.c */
 #define create_code create_code_debug_strings
 #define verify_code verify_code_debug_strings
@@ -437,6 +444,9 @@ const struct testcase testcases[] = {
   {"constants",
create_code_constants,
verify_code_constants},
+  {"const_pointer_argument",
+   create_code_const_pointer_argument,
+   verify_code_const_pointer_argument},
   {"debug_strings",
create_code_debug_strings,
verify_code_debug_strings},
diff --git a/gcc/testsuite/jit.dg/test-const-pointer-argument.c b/gcc/testsuite/jit.dg/test-const-pointer-argument.c
new file mode 100644
index 000..836634f1dd0
--- /dev/null
+++ b/gcc/testsuite/jit.dg/test-const-pointer-argument.c
@@ -0,0 +1,76 @@
+#include 
+#include 
+#include 
+#include 
+
+#include "libgccjit.h"
+
+#include "harness.h"
+
+void
+create_code (gcc_jit_context *ctxt, void *user_data)
+{
+  /* Let's try to inject the equivalent of:
+
+ int test_ptr(const int* value)
+ {
+  return *value;
+ }
+
+ int main (void)
+ {
+   int value = 10;
+   const int *ptr = &value;
+   int res = test_ptr (ptr);
+   return res;
+ }
+  */
+  gcc_jit_type *int_type =
+gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_INT);
+  gcc_jit_type *int_ptr_type =
+gcc_jit_type_get_pointer (int_type);
+  gcc_jit_type *const_ptr_type =
+gcc_jit_type_get_const (int_ptr_type);
+
+  /* Build the test_ptr.  */
+  gcc_jit_param *param =
+gcc_jit_context_new_param (ctxt, NULL, const_ptr_type, "value");
+  gcc_jit_function *test_ptr =
+gcc_jit_context_new_function (ctxt, NULL,
+  GCC_JIT_FUNCTION_EXPORTED,
+  int_type,
+  "test_ptr",
+  1, &param,
+  0);
+  gcc_jit_block *block = gcc_jit_function_new_block (test_ptr, NULL);
+  gcc_jit_block_end_with_return (block,
+NULL,
+gcc_jit_lvalue_as_rvalue (
+  gcc_jit_rvalue_dereference (gcc_jit_param_as_rvalue (param), NULL)));
+
+  /* Build main.  */
+  gcc_jit_function *main =
+gcc_jit_context_new_function (ctxt, NULL,
+  GCC_JIT_FUNCTION_EXPORTED,
+  int_type,
+  "main",
+  0, NULL,
+  0);
+  gcc_jit_block *main_block = gcc_jit_function_new_block (main, NULL);
+  gcc_jit_lvalue *variable =
+gcc_jit_function_new_local (main, NULL, int_type, "value");
+  gcc_jit_lvalue *pointer =
+gcc_jit_function_new_local (main, NULL, const_ptr_type, "ptr");
+  gcc_jit_block_add_assignment (main_block, NULL, pointer,
+gcc_jit_lvalue_get_address (variable, NULL));
+  gcc_jit_rvalue *ptr_rvalue = gcc_jit_lvalue_as_rvalue (pointer);
+  gcc_jit_rvalue *res =
+gcc_jit_context_new_call (ctxt, NULL, test_ptr, 1, &ptr_rvalue);
+  gcc_jit_block_end_with_return (main_block, NULL, res);
+}
+
+void
+verify_code (gcc_jit_context *ctxt, gcc_jit_result *result)
+{
+  CHECK_NON_NULL (result);
+}
-- 
2.43.0

[PATCH] Allow overriding EXPECT

2023-12-21 Thread Christophe Lyon

While investigating possible race conditions in the GCC testsuites
caused by buffering issues, I wanted to investigate workarounds
similar to GDB's READ1 [1], and I noticed it was not always possible
to override EXPECT when running 'make check'.

This patch adds the missing support in various Makefiles.

I was not able to test the patch for all the libraries updated here,
but I confirmed it works as intended/needed for libstdc++.

libatomic, libitm, libgomp already work as intended because their
Makefiles do not have:
MAKEOVERRIDES=

Tested on (native) aarch64-linux-gnu, confirmed the patch introduces
the behaviour I want in gcc, g++, gfortran and libstdc++.

I updated (but could not test) libgm2, libphobos, libquadmath and
libssp for consistency since their Makefiles have MAKEOVERRIDES=

libffi, libgo, libsanitizer seem to need a similar update, but they
are imported from their respective upstream repo, so should not be
patched here.

[1] https://github.com/bminor/binutils-gdb/blob/master/gdb/testsuite/README#L269

2023-12-21  Christophe Lyon  

gcc/
* Makefile.in: Allow overriding EXPECT.

libgm2/
* Makefile.am: Allow overriding EXPECT.
* Makefile.in: Regenerate.

libphobos/
* Makefile.am: Allow overriding EXPECT.
* Makefile.in: Regenerate.

libquadmath/
* Makefile.am: Allow overriding EXPECT.
* Makefile.in: Regenerate.

libssp/
* Makefile.am: Allow overriding EXPECT.
* Makefile.in: Regenerate.

libstdc++-v3/
* Makefile.am: Allow overriding EXPECT.
* Makefile.in: Regenerate.
---
 gcc/Makefile.in  | 3 +++
 libgm2/Makefile.am   | 1 +
 libgm2/Makefile.in   | 1 +
 libphobos/Makefile.am| 1 +
 libphobos/Makefile.in| 1 +
 libquadmath/Makefile.am  | 1 +
 libquadmath/Makefile.in  | 1 +
 libssp/Makefile.am   | 1 +
 libssp/Makefile.in   | 1 +
 libstdc++-v3/Makefile.am | 1 +
 libstdc++-v3/Makefile.in | 1 +
 11 files changed, 13 insertions(+)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index f284c1387e2..bc35a1bd237 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -4303,6 +4303,7 @@ $(lang_checks_parallel): site.exp
vardots=`echo "$$variant" | sed 's,/,.,g'`; \
$(MAKE) TESTSUITEDIR="testsuite.$$vardots" \
  RUNTESTFLAGS="--target_board=$$variant $(RUNTESTFLAGS)" \
+ EXPECT=$(EXPECT) \
  "$$target"
 
 TESTSUITEDIR = testsuite
@@ -4368,6 +4369,7 @@ $(lang_checks_parallelized): check-% : site.exp
  
GCC_RUNTEST_PARALLELIZE_DIR=`${PWD_COMMAND}`/$(TESTSUITEDIR)/$(check_p_tool)-parallel
 ; \
  export GCC_RUNTEST_PARALLELIZE_DIR ; \
  $(MAKE) TESTSUITEDIR="$(TESTSUITEDIR)" RUNTESTFLAGS="$(RUNTESTFLAGS)" 
\
+   EXPECT=$(EXPECT) \
check-parallel-$* \
$(patsubst %,check-parallel-$*_%, $(check_p_subdirs)); \
  sums= ; logs= ; \
@@ -4386,6 +4388,7 @@ $(lang_checks_parallelized): check-% : site.exp
  rm -rf $(TESTSUITEDIR)/$*-parallel || true; \
else \
  $(MAKE) TESTSUITEDIR="$(TESTSUITEDIR)" RUNTESTFLAGS="$(RUNTESTFLAGS)" 
\
+   EXPECT=$(EXPECT) \
check_$*_parallelize= check-parallel-$*; \
fi
 
diff --git a/libgm2/Makefile.am b/libgm2/Makefile.am
index d2eadfc51aa..72391d01291 100644
--- a/libgm2/Makefile.am
+++ b/libgm2/Makefile.am
@@ -69,6 +69,7 @@ AM_MAKEFLAGS = \
"CFLAGS_FOR_BUILD=$(CFLAGS_FOR_BUILD)" \
"CFLAGS_FOR_TARGET=$(CFLAGS_FOR_TARGET)" \
 "CFLAGS_LONGDOUBLE=$(CFLAGS_LONGDOUBLE)" \
+   "EXPECT=$(EXPECT)" \
"INSTALL=$(INSTALL)" \
"INSTALL_DATA=$(INSTALL_DATA)" \
"INSTALL_PROGRAM=$(INSTALL_PROGRAM)" \
diff --git a/libgm2/Makefile.in b/libgm2/Makefile.in
index 5a96f98edc9..4c30d2b034f 100644
--- a/libgm2/Makefile.in
+++ b/libgm2/Makefile.in
@@ -371,6 +371,7 @@ AM_MAKEFLAGS = \
"CFLAGS_FOR_BUILD=$(CFLAGS_FOR_BUILD)" \
"CFLAGS_FOR_TARGET=$(CFLAGS_FOR_TARGET)" \
 "CFLAGS_LONGDOUBLE=$(CFLAGS_LONGDOUBLE)" \
+   "EXPECT=$(EXPECT)" \
"INSTALL=$(INSTALL)" \
"INSTALL_DATA=$(INSTALL_DATA)" \
"INSTALL_PROGRAM=$(INSTALL_PROGRAM)" \
diff --git a/libphobos/Makefile.am b/libphobos/Makefile.am
index d46cfef533e..307c57c8b22 100644
--- a/libphobos/Makefile.am
+++ b/libphobos/Makefile.am
@@ -38,6 +38,7 @@ AM_MAKEFLAGS = \
"CXXFLAGS=$(CXXFLAGS)" \
"CFLAGS_FOR_BUILD=$(CFLAGS_FOR_BUILD)" \
"CFLAGS_FOR_TARGET=$(CFLAGS_FOR_TARGET)" \
+   "EXPECT=$(EXPECT)" \
"GDC_FOR_TARGET=$(GDC_FOR_TARGET)" \
"GDC=$(GDC)" \
"GDCFLAGS=$(GDCFLAGS)" \
diff --git a/libphobos/Makefile.in b/libphobos/Makefile.in
index 8d62c31dab0..eef750bc46e 100644
--- a/libphobos/Makefile.in
+++ b/libphobos/Makefile.in
@@ -365,6 +365,7 @@ AM_MAKEFLAGS = \
"CXXFLAGS=$(CXXFLAGS)" \
"CFLAGS_FOR_BUILD=$(CFLAGS_FOR_BUILD)" \
"CFLAGS_FOR_TARGET=$(CFLAGS_FOR_TARGET)" \
+   "

Re: [PATCH] OpenMP: Support accelerated 2D/3D memory copies for AMD GCN

2023-12-21 Thread Tobias Burnus


Hi Julian,

On 20.09.23 13:44, Julian Brown wrote:


This patch adds support for 2D/3D memory copies for omp_target_memcpy_rect
and "target update", using AMD extensions to the HSA API.  I've just
committed a version of this patch to the og13 branch, but this is the
mainline version.

Support is also added for 1-dimensional strided accesses: these are
treated as a special case of 2-dimensional transfers, where the innermost
dimension is formed from the stride length (in bytes).
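
The mapping from a 1D strided copy onto a 2D rectangular copy can be sketched on the host as follows.  This is only an illustration under stated assumptions: `copy_rect_2d` and `copy_strided_1d` are hypothetical helpers built on memcpy, not the plugin's GOMP_OFFLOAD_memcpy2d or the HSA API, and the sketch assumes each "row" is one element wide with the stride in bytes serving as the row pitch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copy ROWS rows of ROW_BYTES bytes each; consecutive rows start
// DST_PITCH / SRC_PITCH bytes apart in the destination / source.
inline void
copy_rect_2d (void *dst, const void *src, std::size_t rows,
              std::size_t row_bytes, std::size_t dst_pitch,
              std::size_t src_pitch)
{
  char *d = static_cast<char *> (dst);
  const char *s = static_cast<const char *> (src);
  for (std::size_t r = 0; r < rows; ++r)
    std::memcpy (d + r * dst_pitch, s + r * src_pitch, row_bytes);
}

// A 1D strided copy of COUNT elements expressed as a 2D copy: each element
// is one "row", and the stride in bytes plays the role of the pitch.
inline void
copy_strided_1d (void *dst, const void *src, std::size_t count,
                 std::size_t elem_size, std::size_t dst_stride_bytes,
                 std::size_t src_stride_bytes)
{
  copy_rect_2d (dst, src, count, elem_size, dst_stride_bytes,
                src_stride_bytes);
}
```

For example, gathering every second int from a source array into a dense destination uses a source stride of 2 * sizeof (int) and a destination stride of sizeof (int).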

This patch has (somewhat awkwardly from a review perspective) been merged
on top of the following list of in-review series:

"OpenMP/OpenACC: map clause and OMP gimplify rework":
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627895.html

[That's now in mainline.]

"OpenMP: lvalue parsing and "declare mapper" support":
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629363.html

[Review in progress for the lvalue patches (C++; then C); mapper part
should not be required for this patch.]

"OpenMP: Array-shaping operator and strided/rectangular 'target update'
support":
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629422.html

Here, 2/5 and 3/5 are required (and 1/5 is committed). [4/5 and 5/5 make
it more useful but should not be required for this patch.]

"OpenMP: Enable 'declare mapper' mappers for 'target update' directives":
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629432.html


I think this patch is not required for that patch, but obviously still
useful.

* * *

I think it makes sense to split this patch into two parts:

* The libgomp/plugin/plugin-gcn.c part, which is independent and would
  already be used by omp_target_memcpy_rect.
* The libgomp/target.c part, which depends on other patches and adds
  1D strided transfer support.

I will ignore the latter change in this review. And add a note to:
https://gcc.gnu.org/wiki/openmpPendingPatches to ensure that it
will eventually be reviewed and added.

* * *


Though it only depends directly on parts of that work (regarding
strided/rectangular updates).  A stand-alone version that just works
for the OpenMP API routine omp_target_memcpy_rect could be prepared to
apply separately, if preferable.

This version has been re-tested and bootstrapped.  OK?

2023-09-20  Julian Brown  

libgomp/
  * plugin/plugin-gcn.c (hsa_runtime_fn_info): Add
  hsa_amd_memory_lock_fn, hsa_amd_memory_unlock_fn,
  hsa_amd_memory_async_copy_rect_fn function pointers.
  (init_hsa_runtime_functions): Add above functions, with
  DLSYM_OPT_FN.
  (GOMP_OFFLOAD_memcpy2d, GOMP_OFFLOAD_memcpy3d): New functions.
  * target.c (omp_target_memcpy_rect_worker): Add 1D strided transfer
  support.
---
  libgomp/plugin/plugin-gcn.c | 359 

The plugin change LGTM. Thanks.

  libgomp/target.c|  31 


I defer the review of this part until the required other patches are in.

Cf. https://gcc.gnu.org/wiki/openmpPendingPatches

Thanks,

Tobias


diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index ef22d48da79..95c0a57e792 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -196,6 +196,16 @@ struct hsa_runtime_fn_info
hsa_status_t (*hsa_code_object_deserialize_fn)
  (void *serialized_code_object, size_t serialized_code_object_size,
   const char *options, hsa_code_object_t *code_object);
+  hsa_status_t (*hsa_amd_memory_lock_fn)
+(void *host_ptr, size_t size, hsa_agent_t *agents, int num_agent,
+ void **agent_ptr);
+  hsa_status_t (*hsa_amd_memory_unlock_fn) (void *host_ptr);
+  hsa_status_t (*hsa_amd_memory_async_copy_rect_fn)
+(const hsa_pitched_ptr_t *dst, const hsa_dim3_t *dst_offset,
+ const hsa_pitched_ptr_t *src, const hsa_dim3_t *src_offset,
+ const hsa_dim3_t *range, hsa_agent_t copy_agent,
+ hsa_amd_copy_direction_t dir, uint32_t num_dep_signals,
+ const hsa_signal_t *dep_signals, hsa_signal_t completion_signal);
  };

  /* Structure describing the run-time and grid properties of an HSA kernel
@@ -1398,6 +1408,9 @@ init_hsa_runtime_functions (void)
DLSYM_FN (hsa_signal_load_acquire)
DLSYM_FN (hsa_queue_destroy)
DLSYM_FN (hsa_code_object_deserialize)
+  DLSYM_OPT_FN (hsa_amd_memory_lock)
+  DLSYM_OPT_FN (hsa_amd_memory_unlock)
+  DLSYM_OPT_FN (hsa_amd_memory_async_copy_rect)
return true;
  #undef DLSYM_FN
  }
@@ -3790,6 +3803,352 @@ GOMP_OFFLOAD_dev2dev (int device, void *dst, const void 
*src, size_t n)
return true;
  }

+/* Here _size refers to  multiplied by size -- i.e.
+   measured in bytes.  So we have:
+
+   dim1_size: number of bytes to copy on innermost dimension ("row")
+   dim0_len: number of rows to copy
+   dst: base pointer for destination of copy
+   dst_offset1_size: innermost row offset (for dest), in bytes
+   dst_offset0_len: offset, number of rows (for dest)
+   dst_dim1_size: whole-array dest row length, in bytes (pitch)
+   src: base pointer for source of copy
+   src_offset1_size: innermos

Re: [PATCH] testsuite: Remove testsuite_tr1.h

2023-12-21 Thread Patrick Palka

On Thu, Dec 21, 2023 at 8:29 AM Patrick Palka  wrote:
>
> On Wed, 20 Dec 2023, Ken Matsui wrote:
>
> > This patch removes the testsuite_tr1.h dependency from g++.dg/ext/is_*.C
> > tests since the header is supposed to be used only by libstdc++, not
> > front-end.  This also includes test code consistency fixes.

For the record this fixes the test failures reported at
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641058.html

>
> LGTM

Very minor but let's use the commit title

  c++: testsuite: Remove testsuite_tr1.h includes

to convey that the commit only touches C++ tests, and isn't removing
the file testsuite_tr1.h but rather #includes of it :)

>
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * g++.dg/ext/is_array.C: Remove testsuite_tr1.h.  Add necessary
> >   definitions accordingly.  Tweak macros for consistency across
> >   test codes.
> >   * g++.dg/ext/is_bounded_array.C: Likewise.
> >   * g++.dg/ext/is_function.C: Likewise.
> >   * g++.dg/ext/is_member_function_pointer.C: Likewise.
> >   * g++.dg/ext/is_member_object_pointer.C: Likewise.
> >   * g++.dg/ext/is_member_pointer.C: Likewise.
> >   * g++.dg/ext/is_object.C: Likewise.
> >   * g++.dg/ext/is_reference.C: Likewise.
> >   * g++.dg/ext/is_scoped_enum.C: Likewise.
> >
> > Signed-off-by: Ken Matsui 
> > ---
> >  gcc/testsuite/g++.dg/ext/is_array.C   | 15 ---
> >  gcc/testsuite/g++.dg/ext/is_bounded_array.C   | 20 -
> >  gcc/testsuite/g++.dg/ext/is_function.C| 41 +++
> >  .../g++.dg/ext/is_member_function_pointer.C   | 14 +++
> >  .../g++.dg/ext/is_member_object_pointer.C | 26 ++--
> >  gcc/testsuite/g++.dg/ext/is_member_pointer.C  | 29 ++---
> >  gcc/testsuite/g++.dg/ext/is_object.C  | 21 --
> >  gcc/testsuite/g++.dg/ext/is_reference.C   | 28 +++--
> >  gcc/testsuite/g++.dg/ext/is_scoped_enum.C | 12 ++
> >  9 files changed, 101 insertions(+), 105 deletions(-)
> >
> > diff --git a/gcc/testsuite/g++.dg/ext/is_array.C 
> > b/gcc/testsuite/g++.dg/ext/is_array.C
> > index facfed5c7cb..f1a6e08b87a 100644
> > --- a/gcc/testsuite/g++.dg/ext/is_array.C
> > +++ b/gcc/testsuite/g++.dg/ext/is_array.C
> > @@ -1,15 +1,14 @@
> >  // { dg-do compile { target c++11 } }
> >
> > -#include <testsuite_tr1.h>
> > +#define SA(X) static_assert((X),#X)
> >
> > -using namespace __gnu_test;
> > +#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)\
> > +  SA(TRAIT(TYPE) == EXPECT); \
> > +  SA(TRAIT(const TYPE) == EXPECT);   \
> > +  SA(TRAIT(volatile TYPE) == EXPECT);\
> > +  SA(TRAIT(const volatile TYPE) == EXPECT)
> >
> > -#define SA(X) static_assert((X),#X)
> > -#define SA_TEST_CATEGORY(TRAIT, X, expect) \
> > -  SA(TRAIT(X) == expect);  \
> > -  SA(TRAIT(const X) == expect);\
> > -  SA(TRAIT(volatile X) == expect); \
> > -  SA(TRAIT(const volatile X) == expect)
> > +class ClassType { };
> >
> >  SA_TEST_CATEGORY(__is_array, int[2], true);
> >  SA_TEST_CATEGORY(__is_array, int[], true);
> > diff --git a/gcc/testsuite/g++.dg/ext/is_bounded_array.C 
> > b/gcc/testsuite/g++.dg/ext/is_bounded_array.C
> > index 346790eba12..b5fe435de95 100644
> > --- a/gcc/testsuite/g++.dg/ext/is_bounded_array.C
> > +++ b/gcc/testsuite/g++.dg/ext/is_bounded_array.C
> > @@ -1,21 +1,19 @@
> >  // { dg-do compile { target c++11 } }
> >
> > -#include 
> > -
> > -using namespace __gnu_test;
> > -
> >  #define SA(X) static_assert((X),#X)
> >
> > -#define SA_TEST_CONST(TRAIT, TYPE, EXPECT)   \
> > +#define SA_TEST_FN(TRAIT, TYPE, EXPECT)  \
> >SA(TRAIT(TYPE) == EXPECT); \
> > -  SA(TRAIT(const TYPE) == EXPECT)
> > +  SA(TRAIT(const TYPE) == EXPECT);
> >
> >  #define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)\
> > -  SA(TRAIT(TYPE) == EXPECT); \
> > -  SA(TRAIT(const TYPE) == EXPECT);   \
> > -  SA(TRAIT(volatile TYPE) == EXPECT);\
> > +  SA(TRAIT(TYPE) == EXPECT); \
> > +  SA(TRAIT(const TYPE) == EXPECT);   \
> > +  SA(TRAIT(volatile TYPE) == EXPECT);\
> >SA(TRAIT(const volatile TYPE) == EXPECT)
> >
> > +class ClassType { };
> > +
> >  SA_TEST_CATEGORY(__is_bounded_array, int[2], true);
> >  SA_TEST_CATEGORY(__is_bounded_array, int[], false);
> >  SA_TEST_CATEGORY(__is_bounded_array, int[2][3], true);
> > @@ -31,8 +29,8 @@ SA_TEST_CATEGORY(__is_bounded_array, ClassType[][3], 
> > false);
> >  SA_TEST_CATEGORY(__is_bounded_array, int(*)[2], false);
> >  SA_TEST_CATEGORY(__is_bounded_array, int(*)[], false);
> >  SA_TEST_CATEGORY(__is_bounded_array, int(&)[2], false);
> > -SA_TEST_CONST(__is_bounded_array, int(&)[], false);
> > +SA_TEST_FN(__is_bounded_array, int(&)[], false);
> >
> >  // Sanity check.
> >  SA_TEST_CATEGORY(__is_bounded_array, ClassType, false);
> > -SA_TEST_CONST(__is_bounded_array, void(), false);
> > +SA_TEST

[committed] i386: Fix shifts with high register input operand [PR113044]

2023-12-21 Thread Uros Bizjak

The move to the output operand should use the high register of the input operand.

   PR target/113044

   gcc/ChangeLog:

   * config/i386/i386.md (*ashlqi_ext_1): Move from the
   high register of the input operand.
   (*qi_ext_1): Ditto.

   gcc/testsuite/ChangeLog:

   * gcc.target/i386/pr113044.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros,
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 710068e9093..4c6368bf3b7 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -15527,7 +15527,8 @@ (define_insn_and_split "*ashlqi_ext_1"
&& !(rtx_equal_p (operands[0], operands[1]))"
   [(set (zero_extract:SWI248
  (match_dup 0) (const_int 8) (const_int 8))
-   (match_dup 1))
+   (zero_extract:SWI248
+ (match_dup 1) (const_int 8) (const_int 8)))
(parallel
  [(set (zero_extract:SWI248
 (match_dup 0) (const_int 8) (const_int 8))
@@ -16689,7 +16690,8 @@ (define_insn_and_split "*qi_ext_1"
&& !(rtx_equal_p (operands[0], operands[1]))"
   [(set (zero_extract:SWI248
  (match_dup 0) (const_int 8) (const_int 8))
-   (match_dup 1))
+   (zero_extract:SWI248
+ (match_dup 1) (const_int 8) (const_int 8)))
(parallel
  [(set (zero_extract:SWI248
 (match_dup 0) (const_int 8) (const_int 8))
diff --git a/gcc/testsuite/gcc.target/i386/pr113044.c 
b/gcc/testsuite/gcc.target/i386/pr113044.c
new file mode 100644
index 000..923e7f48299
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr113044.c
@@ -0,0 +1,24 @@
+/* PR target/113044 */
+/* { dg-do run } */
+/* { dg-options "-O" } */
+
+typedef unsigned char __attribute__((__vector_size__ (2))) V;
+
+V
+foo (char c, V v)
+{
+  V x = v >> (v & 8);
+  volatile char d = c;
+  if (!d)
+__builtin_abort();
+  return x;
+}
+
+int
+main (void)
+{
+  V x = foo (10, (V){3});
+  if (x[0] != 3 || x[1])
+__builtin_abort();
+  return 0;
+}

[PATCH] libgccjit: Support signed char flag

2023-12-21 Thread Antoni Boucher

Hi.
This patch adds support for the -fsigned-char flag.
I'm not sure how to test it since I stumbled upon this bug when I found
this other bug (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107863)
which is now fixed.
Any idea how I could test this patch?
Thanks for the review.
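
For context, here is a minimal sketch of what the signedness of plain char changes in practice. It is plain C++ and does not use the libgccjit API; `widen_as_signed` and `widen_as_unsigned` are hypothetical helpers that only model how the same byte widens to int under each choice.

```cpp
#include <limits>

// Hypothetical helper: widen a byte the way a signed plain `char` would
// (values 128..255 become negative).
constexpr int widen_as_signed (unsigned char byte)
{
  return byte >= 128 ? static_cast<int> (byte) - 256 : static_cast<int> (byte);
}

// Hypothetical helper: widen a byte the way an unsigned plain `char` would.
constexpr int widen_as_unsigned (unsigned char byte)
{
  return static_cast<int> (byte);
}

// What the host compiler itself picked for plain `char`.
constexpr bool host_char_is_signed ()
{
  return std::numeric_limits<char>::is_signed;
}
```

The byte 0xFF compares as negative in one mode and as 255 in the other, so a test for the flag would probably need to observe a comparison or widening of a char value with the high bit set.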
From 45719be81ab71983ab10ecb67139eaf02955e4db Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Mon, 3 Oct 2022 19:11:39 -0400
Subject: [PATCH] libgccjit: Support signed char flag

gcc/jit/ChangeLog:

	* dummy-frontend.cc (jit_langhook_init): Send flag_signed_char
	argument to build_common_tree_nodes.
---
 gcc/jit/dummy-frontend.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/jit/dummy-frontend.cc b/gcc/jit/dummy-frontend.cc
index 9f71bca44b1..11615b30f40 100644
--- a/gcc/jit/dummy-frontend.cc
+++ b/gcc/jit/dummy-frontend.cc
@@ -607,7 +607,7 @@ jit_langhook_init (void)
   diagnostic_starter (global_dc) = jit_begin_diagnostic;
   diagnostic_finalizer (global_dc) = jit_end_diagnostic;

-  build_common_tree_nodes (false);
+  build_common_tree_nodes (flag_signed_char);

   build_common_builtin_nodes ();

-- 
2.43.0

Re: [PATCH] libgccjit: Add type checks in gcc_jit_block_add_assignment_op

2023-12-21 Thread Antoni Boucher

Hi.
Here's the updated patch.
Thanks.

On Thu, 2023-12-07 at 20:15 -0500, David Malcolm wrote:
> On Thu, 2023-12-07 at 17:34 -0500, Antoni Boucher wrote:
> > Hi.
> > This patch adds checks to gcc_jit_block_add_assignment_op to make sure
> > it is only ever called on numeric types.
> > 
> > With the previous patch, this might require a change to also allow
> > vector types here.
> > 
> > Thanks for the review.
> 
> Thanks for the patch.
> 
> [...snip...]
> 
> > @@ -2890,6 +2900,17 @@ gcc_jit_block_add_assignment_op
> > (gcc_jit_block *block,
> >  lvalue->get_type ()->get_debug_string (),
> >  rvalue->get_debug_string (),
> >  rvalue->get_type ()->get_debug_string ());
> > +  // TODO: check if it is a numeric vector?
> > +  RETURN_IF_FAIL_PRINTF3 (
> > +    lvalue->get_type ()->is_numeric () && rvalue->get_type ()-
> > >is_numeric (), ctxt, loc,
> > +    "gcc_jit_block_add_assignment_op %s has non-numeric lvalue %s
> > (type: %s)",
> > +    gcc::jit::binary_op_reproducer_strings[op],
> > +    lvalue->get_debug_string (), lvalue->get_type ()-
> > >get_debug_string ());
> 
> The condition being tested here should probably just be:
> 
>    lvalue->get_type ()->is_numeric ()
> 
> since otherwise if the lvalue's type is numeric and the rvalue's type
> fails to be, then the user would incorrectly get a message about the
> lvalue.
> 
> > +  RETURN_IF_FAIL_PRINTF3 (
> > +    rvalue->get_type ()->is_numeric () && rvalue->get_type ()-
> > >is_numeric (), ctxt, loc,
> > +    "gcc_jit_block_add_assignment_op %s has non-numeric rvalue %s
> > (type: %s)",
> > +    gcc::jit::binary_op_reproducer_strings[op],
> > +    rvalue->get_debug_string (), rvalue->get_type ()-
> > >get_debug_string ());
> 
> The condition being tested here seems to have a redundant repeated:
>   && rvalue->get_type ()->is_numeric ()
> 
> Am I missing something, or is that a typo?
> 
> [...snip...]
> 
> The patch is OK otherwise.
> 
> Thanks
> Dave
> 
> 
> 
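
The review point above (one check per operand, so the message names the operand that actually failed) can be sketched as follows. This is a toy model under stated assumptions: `ToyType` and `check_assignment_op` are hypothetical and are not part of libgccjit.

```cpp
#include <string>

// Hypothetical stand-in for a recorded value's type; not libgccjit API.
struct ToyType
{
  std::string name;
  bool numeric;
};

// Returns an empty string when both operands are numeric; otherwise a
// message naming the operand whose own check failed.
std::string check_assignment_op (const ToyType &lvalue, const ToyType &rvalue)
{
  if (!lvalue.numeric)
    return "non-numeric lvalue (type: " + lvalue.name + ")";
  if (!rvalue.numeric)
    return "non-numeric rvalue (type: " + rvalue.name + ")";
  return "";
}
```

With a combined condition in the first check, a numeric lvalue paired with a non-numeric rvalue would be reported as a bad lvalue; testing each operand on its own avoids that.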

From a93b029db4622ff6385715ff9cdaf1be5ffa5657 Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Wed, 18 Oct 2023 18:33:18 -0400
Subject: [PATCH] libgccjit: Add type checks in gcc_jit_block_add_assignment_op

gcc/jit/ChangeLog:

	* libgccjit.cc (RETURN_IF_FAIL_PRINTF3): New macro.
	(gcc_jit_block_add_assignment_op): Add numeric checks.

gcc/testsuite/ChangeLog:

	* jit.dg/test-error-bad-assignment-op.c: New test.
---
 gcc/jit/libgccjit.cc  | 21 +++
 .../jit.dg/test-error-bad-assignment-op.c | 57 +++
 2 files changed, 78 insertions(+)
 create mode 100644 gcc/testsuite/jit.dg/test-error-bad-assignment-op.c

diff --git a/gcc/jit/libgccjit.cc b/gcc/jit/libgccjit.cc
index 0451b4df7f9..10d23e7fcf6 100644
--- a/gcc/jit/libgccjit.cc
+++ b/gcc/jit/libgccjit.cc
@@ -267,6 +267,16 @@ struct gcc_jit_extended_asm : public gcc::jit::recording::extended_asm
   }\
   JIT_END_STMT
 
+#define RETURN_IF_FAIL_PRINTF3(TEST_EXPR, CTXT, LOC, ERR_FMT, A0, A1, A2) \
+  JIT_BEGIN_STMT			\
+if (!(TEST_EXPR))			\
+  {\
+	jit_error ((CTXT), (LOC), "%s: " ERR_FMT,			\
+		   __func__, (A0), (A1), (A2));			\
+	return;			\
+  }\
+  JIT_END_STMT
+
 #define RETURN_IF_FAIL_PRINTF4(TEST_EXPR, CTXT, LOC, ERR_FMT, A0, A1, A2, A3) \
   JIT_BEGIN_STMT			\
 if (!(TEST_EXPR))			\
@@ -2890,6 +2900,17 @@ gcc_jit_block_add_assignment_op (gcc_jit_block *block,
 lvalue->get_type ()->get_debug_string (),
 rvalue->get_debug_string (),
 rvalue->get_type ()->get_debug_string ());
+  // TODO: check if it is a numeric vector?
+  RETURN_IF_FAIL_PRINTF3 (
+lvalue->get_type ()->is_numeric (), ctxt, loc,
+"gcc_jit_block_add_assignment_op %s has non-numeric lvalue %s (type: %s)",
+gcc::jit::binary_op_reproducer_strings[op],
+lvalue->get_debug_string (), lvalue->get_type ()->get_debug_string ());
+  RETURN_IF_FAIL_PRINTF3 (
+rvalue->get_type ()->is_numeric (), ctxt, loc,
+"gcc_jit_block_add_assignment_op %s has non-numeric rvalue %s (type: %s)",
+gcc::jit::binary_op_reproducer_strings[op],
+rvalue->get_debug_string (), rvalue->get_type ()->get_debug_string ());
 
   gcc::jit::recording::statement *stmt = block->add_assignment_op (loc, lvalue, op, rvalue);
 
diff --git a/gcc/testsuite/jit.dg/test-error-bad-assignment-op.c b/gcc/testsuite/jit.dg/test-error-bad-assignment-op.c
new file mode 100644
index 000..683ebbfb1fe
--- /dev/null
+++ b/gcc/testsuite/jit.dg/test-error-bad-assignment-op.c
@@ -0,0 +1,57 @@
+#include 
+#include 
+
+#include "libgccjit.h"
+
+#include "harness.h"
+
+void
+create_code (gcc_jit_context *ctxt, void *user_data)
+{
+  /* Let's try to inject the equivalent of:
+
+ void
+ test_fn ()
+ {
+const char *variable;
+variable += "test";
+ }
+
+ and verify that the API complains about the mismatching types
+ in the assignments.
+  */
+  gcc_jit_type *void_type =
+gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_VOID);
+  gcc_jit

[PATCH] libgccjit: Allow comparing aligned int types

2023-12-21 Thread Antoni Boucher

Hi.
This patch allows comparing aligned integer types as equal.
There's a TODO in the code about whether we should check that the
alignment is equal.
What are your thoughts on this?

Thanks for the review.
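
To make the open question concrete, here is a toy model of the comparison, assuming a simplified type record. `ToyType`, `strip_alignment`, and `same_int_type` are hypothetical names, not the classes in jit-recording.h; the `compare_alignment` switch stands for the TODO.

```cpp
// Toy model, not the libgccjit recording classes.
struct ToyType
{
  int size_in_bytes;
  bool is_signed;
  const ToyType *aligned_of;  // non-null when this wraps another type
  int alignment;              // 0 when no explicit alignment was requested
};

// Strip any alignment wrappers, giving the underlying integer type.
const ToyType *strip_alignment (const ToyType *t)
{
  while (t->aligned_of)
    t = t->aligned_of;
  return t;
}

// Integer types match when size and signedness match after stripping
// alignment.  With compare_alignment set, the outermost alignments must
// also be equal (the stricter behaviour the TODO asks about).
bool same_int_type (const ToyType &a, const ToyType &b,
                    bool compare_alignment = false)
{
  const ToyType *ua = strip_alignment (&a);
  const ToyType *ub = strip_alignment (&b);
  if (ua->size_in_bytes != ub->size_in_bytes
      || ua->is_signed != ub->is_signed)
    return false;
  return !compare_alignment || a.alignment == b.alignment;
}
```

In this model the lenient mode treats a 4-byte-aligned and a 16-byte-aligned 64-bit integer as the same type, while the strict mode rejects the pair.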
From b1db2e31729876d313061a94c13b155bcd552c02 Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Sun, 8 Oct 2023 09:12:12 -0400
Subject: [PATCH] libgccjit: Allow comparing aligned int types

gcc/jit/ChangeLog:

	* jit-recording.h (type::is_same_type_as): Compare integer
	types.
	(type::is_aligned, memento_of_get_aligned::is_same_type_as,
	memento_of_get_aligned::is_aligned): New methods.

gcc/testsuite/ChangeLog:

	* jit.dg/test-types.c: Add checks comparing aligned types.
---
 gcc/jit/jit-recording.h   | 28 +---
 gcc/testsuite/jit.dg/test-types.c | 10 +++---
 2 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h
index 4a8082991fb..97f39f3fc98 100644
--- a/gcc/jit/jit-recording.h
+++ b/gcc/jit/jit-recording.h
@@ -555,6 +555,14 @@ public:
 
   virtual bool is_same_type_as (type *other)
   {
+if (is_int ()
+		 && other->is_int ()
+		 && get_size () == other->get_size ()
+		 && is_signed () == other->is_signed ())
+{
+  /* LHS (this) is an integer of the same size and sign as rtype.  */
+  return true;
+}
 return this == other;
   }
 
@@ -571,6 +579,7 @@ public:
   virtual type *is_volatile () { return NULL; }
   virtual type *is_restrict () { return NULL; }
   virtual type *is_const () { return NULL; }
+  virtual type *is_aligned () { return NULL; }
   virtual type *is_array () = 0;
   virtual struct_ *is_struct () { return NULL; }
   virtual bool is_union () const { return false; }
@@ -625,13 +634,6 @@ public:
 	   accept it:  */
 	return true;
 	  }
-  } else if (is_int ()
-		 && rtype->is_int ()
-		 && get_size () == rtype->get_size ()
-		 && is_signed () == rtype->is_signed ())
-  {
-	/* LHS (this) is an integer of the same size and sign as rtype.  */
-	return true;
   }
 
 return type::accepts_writes_from (rtype);
@@ -805,6 +807,18 @@ public:
   : decorated_type (other_type),
 m_alignment_in_bytes (alignment_in_bytes) {}
 
+  bool is_same_type_as (type *other) final override
+  {
+// TODO: check if outermost alignment is equal?
+if (!other->is_aligned ())
+{
+  return m_other_type->is_same_type_as (other);
+}
+return m_other_type->is_same_type_as (other->is_aligned ());
+  }
+
+  type *is_aligned () final override { return m_other_type; }
+
   /* Strip off the alignment, giving the underlying type.  */
   type *unqualified () final override { return m_other_type; }
 
diff --git a/gcc/testsuite/jit.dg/test-types.c b/gcc/testsuite/jit.dg/test-types.c
index a01944e35fa..c2f4d2bcb3d 100644
--- a/gcc/testsuite/jit.dg/test-types.c
+++ b/gcc/testsuite/jit.dg/test-types.c
@@ -485,11 +485,15 @@ verify_code (gcc_jit_context *ctxt, gcc_jit_result *result)
 
   CHECK_VALUE (z.m_FILE_ptr, stderr);
 
+  gcc_jit_type *long_type = gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_LONG);
+  gcc_jit_type *int64_type = gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_INT64_T);
   if (sizeof(long) == 8)
-CHECK (gcc_jit_compatible_types (
-  gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_LONG),
-  gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_INT64_T)));
+CHECK (gcc_jit_compatible_types (long_type, int64_type));
 
   CHECK_VALUE (gcc_jit_type_get_size (gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_FLOAT)), sizeof (float));
   CHECK_VALUE (gcc_jit_type_get_size (gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_DOUBLE)), sizeof (double));
+
+  gcc_jit_type *aligned_long = gcc_jit_type_get_aligned (long_type, 4);
+  gcc_jit_type *aligned_int64 = gcc_jit_type_get_aligned (int64_type, 4);
+  CHECK (gcc_jit_compatible_types (aligned_long, aligned_int64));
 }
-- 
2.43.0

Re: [PATCH] testsuite: Remove testsuite_tr1.h

2023-12-21 Thread Patrick Palka

On Wed, 20 Dec 2023, Ken Matsui wrote:

> This patch removes the testsuite_tr1.h dependency from g++.dg/ext/is_*.C
> tests since the header is supposed to be used only by libstdc++, not
> front-end.  This also includes test code consistency fixes.

LGTM
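
For readers without the tree at hand, this is roughly what a self-contained test looks like once the header is gone. It is a sketch only: `IS_ARRAY` is a hypothetical wrapper around `std::is_array` so the snippet does not depend on the `__is_array` built-in that the real tests exercise.

```cpp
#include <type_traits>

#define SA(X) static_assert((X), #X)

// Check a trait against a type and its three cv-qualified variants.
#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT) \
  SA(TRAIT(TYPE) == EXPECT);                  \
  SA(TRAIT(const TYPE) == EXPECT);            \
  SA(TRAIT(volatile TYPE) == EXPECT);         \
  SA(TRAIT(const volatile TYPE) == EXPECT)

// Hypothetical stand-in for the built-in trait under test.
#define IS_ARRAY(T) std::is_array<T>::value

// Local definition replacing the one formerly pulled in from the header.
class ClassType { };

SA_TEST_CATEGORY(IS_ARRAY, int[2], true);
SA_TEST_CATEGORY(IS_ARRAY, int[], true);
SA_TEST_CATEGORY(IS_ARRAY, ClassType[2], true);
SA_TEST_CATEGORY(IS_ARRAY, ClassType, false);
SA_TEST_CATEGORY(IS_ARRAY, int(*)[2], false);
```

Everything the assertions need is defined in the file itself, which is the point of the change.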

> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/ext/is_array.C: Remove testsuite_tr1.h.  Add necessary
>   definitions accordingly.  Tweak macros for consistency across
>   test codes.
>   * g++.dg/ext/is_bounded_array.C: Likewise.
>   * g++.dg/ext/is_function.C: Likewise.
>   * g++.dg/ext/is_member_function_pointer.C: Likewise.
>   * g++.dg/ext/is_member_object_pointer.C: Likewise.
>   * g++.dg/ext/is_member_pointer.C: Likewise.
>   * g++.dg/ext/is_object.C: Likewise.
>   * g++.dg/ext/is_reference.C: Likewise.
>   * g++.dg/ext/is_scoped_enum.C: Likewise.
> 
> Signed-off-by: Ken Matsui 
> ---
>  gcc/testsuite/g++.dg/ext/is_array.C   | 15 ---
>  gcc/testsuite/g++.dg/ext/is_bounded_array.C   | 20 -
>  gcc/testsuite/g++.dg/ext/is_function.C| 41 +++
>  .../g++.dg/ext/is_member_function_pointer.C   | 14 +++
>  .../g++.dg/ext/is_member_object_pointer.C | 26 ++--
>  gcc/testsuite/g++.dg/ext/is_member_pointer.C  | 29 ++---
>  gcc/testsuite/g++.dg/ext/is_object.C  | 21 --
>  gcc/testsuite/g++.dg/ext/is_reference.C   | 28 +++--
>  gcc/testsuite/g++.dg/ext/is_scoped_enum.C | 12 ++
>  9 files changed, 101 insertions(+), 105 deletions(-)
> 
> diff --git a/gcc/testsuite/g++.dg/ext/is_array.C 
> b/gcc/testsuite/g++.dg/ext/is_array.C
> index facfed5c7cb..f1a6e08b87a 100644
> --- a/gcc/testsuite/g++.dg/ext/is_array.C
> +++ b/gcc/testsuite/g++.dg/ext/is_array.C
> @@ -1,15 +1,14 @@
>  // { dg-do compile { target c++11 } }
>  
> -#include 
> +#define SA(X) static_assert((X),#X)
>  
> -using namespace __gnu_test;
> +#define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)\
> +  SA(TRAIT(TYPE) == EXPECT); \
> +  SA(TRAIT(const TYPE) == EXPECT);   \
> +  SA(TRAIT(volatile TYPE) == EXPECT);\
> +  SA(TRAIT(const volatile TYPE) == EXPECT)
>  
> -#define SA(X) static_assert((X),#X)
> -#define SA_TEST_CATEGORY(TRAIT, X, expect) \
> -  SA(TRAIT(X) == expect);  \
> -  SA(TRAIT(const X) == expect);\
> -  SA(TRAIT(volatile X) == expect); \
> -  SA(TRAIT(const volatile X) == expect)
> +class ClassType { };
>  
>  SA_TEST_CATEGORY(__is_array, int[2], true);
>  SA_TEST_CATEGORY(__is_array, int[], true);
> diff --git a/gcc/testsuite/g++.dg/ext/is_bounded_array.C 
> b/gcc/testsuite/g++.dg/ext/is_bounded_array.C
> index 346790eba12..b5fe435de95 100644
> --- a/gcc/testsuite/g++.dg/ext/is_bounded_array.C
> +++ b/gcc/testsuite/g++.dg/ext/is_bounded_array.C
> @@ -1,21 +1,19 @@
>  // { dg-do compile { target c++11 } }
>  
> -#include 
> -
> -using namespace __gnu_test;
> -
>  #define SA(X) static_assert((X),#X)
>  
> -#define SA_TEST_CONST(TRAIT, TYPE, EXPECT)   \
> +#define SA_TEST_FN(TRAIT, TYPE, EXPECT)  \
>SA(TRAIT(TYPE) == EXPECT); \
> -  SA(TRAIT(const TYPE) == EXPECT)
> +  SA(TRAIT(const TYPE) == EXPECT);
>  
>  #define SA_TEST_CATEGORY(TRAIT, TYPE, EXPECT)\
> -  SA(TRAIT(TYPE) == EXPECT); \
> -  SA(TRAIT(const TYPE) == EXPECT);   \
> -  SA(TRAIT(volatile TYPE) == EXPECT);\
> +  SA(TRAIT(TYPE) == EXPECT); \
> +  SA(TRAIT(const TYPE) == EXPECT);   \
> +  SA(TRAIT(volatile TYPE) == EXPECT);\
>SA(TRAIT(const volatile TYPE) == EXPECT)
>  
> +class ClassType { };
> +
>  SA_TEST_CATEGORY(__is_bounded_array, int[2], true);
>  SA_TEST_CATEGORY(__is_bounded_array, int[], false);
>  SA_TEST_CATEGORY(__is_bounded_array, int[2][3], true);
> @@ -31,8 +29,8 @@ SA_TEST_CATEGORY(__is_bounded_array, ClassType[][3], false);
>  SA_TEST_CATEGORY(__is_bounded_array, int(*)[2], false);
>  SA_TEST_CATEGORY(__is_bounded_array, int(*)[], false);
>  SA_TEST_CATEGORY(__is_bounded_array, int(&)[2], false);
> -SA_TEST_CONST(__is_bounded_array, int(&)[], false);
> +SA_TEST_FN(__is_bounded_array, int(&)[], false);
>  
>  // Sanity check.
>  SA_TEST_CATEGORY(__is_bounded_array, ClassType, false);
> -SA_TEST_CONST(__is_bounded_array, void(), false);
> +SA_TEST_FN(__is_bounded_array, void(), false);
> diff --git a/gcc/testsuite/g++.dg/ext/is_function.C 
> b/gcc/testsuite/g++.dg/ext/is_function.C
> index 2e1594b12ad..1fc3c96df1f 100644
> --- a/gcc/testsuite/g++.dg/ext/is_function.C
> +++ b/gcc/testsuite/g++.dg/ext/is_function.C
> @@ -1,16 +1,19 @@
>  // { dg-do compile { target c++11 } }
>  
> -#include 
> +#define SA(X) static_assert((X),#X)
>  
> -using namespace __gnu_test;
> +#define SA_TEST_FN(TRAIT, TYPE, EXPECT)  \
> +  SA(TRAIT(TYPE) == EXPECT); \
> +  SA(TRAIT(const TYPE) == EXPECT);
>  
> -#define SA(X) static_as

Re: OpenMP offloading vs. C++ static local variables

2023-12-21 Thread Jakub Jelinek

On Thu, Dec 21, 2023 at 01:31:19PM +0100, Thomas Schwinge wrote:
> These three: implicitly, or explicit '#pragma omp declare target' etc.,
> or inside '#pragma omp begin declare target' region are the only OpenMP
> facilities to get things 'omp declare target'ed, right?

I think so.
> That doesn't generally work, as the gimplification-level code re
> 'Static locals [...] need to be "omp declare target"' runs *after*
> 'omp_discover_implicit_declare_target'.  Thus my "move" idea above.

Can't we mark the static locals already during that discovery?
The addition during gimplification was probably made when we didn't have
that at all.

> OK to push, for a start, the attached
> "GCN, nvptx: Basic '__cxa_guard_{acquire,abort,release}' for C++ static local 
> variables support"?
> That's now in libgcc not libgomp, so that it's also usable for GCN, nvptx
> target testing, where we thus see a number of FAIL -> PASS progressions.

> For now, for single-threaded GCN, nvptx target use only; extension for
> multi-threaded offloading use to follow later.
> 
>   libgcc/
>   * c++-minimal/README: New.
>   * c++-minimal/guard.c: New.
>   * config/gcn/t-amdgcn (LIB2ADD): Add it.
>   * config/nvptx/t-nvptx (LIB2ADD): Likewise.

> +/* Copy'n'paste/edit from 'libstdc++-v3/libsupc++/cxxabi.h'.  */
> +
> +  int
> +  __cxa_guard_acquire(__guard*);
> +
> +  void
> +  __cxa_guard_release(__guard*);
> +
> +  void
> +  __cxa_guard_abort(__guard*);

When all this isn't inside a namespace, shouldn't it be indented by
2 spaces less?

> +
> +/* Copy'n'paste/edit from 'libstdc++-v3/libsupc++/guard.cc'.  */
> +
> +# undef _GLIBCXX_GUARD_TEST_AND_ACQUIRE
> +# undef _GLIBCXX_GUARD_SET_AND_RELEASE
> +# define _GLIBCXX_GUARD_SET_AND_RELEASE(G) _GLIBCXX_GUARD_SET (G)

And without a space after # here?

Otherwise LGTM, but hope that one day we'll get rid of it again.

Jakub

[PATCH] libcpp: Fix __has_include_next ICE in the last directory of the path [PR80755]

2023-12-21 Thread Lewis Hyatt

Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80755

Here is a short fix for the ICE in libcpp noted in the PR. Bootstrap +
regtest all languages on x86-64 Linux. Is it OK please? Thanks!

-Lewis

-- >8 --

In libcpp/files.cc, the function _cpp_has_header(), which implements
__has_include and __has_include_next, does not check for a NULL return value
from search_path_head(), leading to an ICE tripping an assert when
_cpp_find_file() tries to use it. Fix it by checking for that case and
silently returning false instead.
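
The shape of the fix can be sketched with a toy search path. `ToyDir`, `next_dir`, and `toy_has_include_next` are hypothetical and only model the control flow; they are not libcpp's data structures or functions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one directory on the include path.
struct ToyDir
{
  std::vector<std::string> files;
};

// Directory following `current`, or nullptr when `current` is the last one.
const ToyDir *next_dir (const std::vector<ToyDir> &path, std::size_t current)
{
  return current + 1 < path.size () ? &path[current + 1] : nullptr;
}

// Model of __has_include_next: search only the directories after `current`.
bool toy_has_include_next (const std::vector<ToyDir> &path,
                           std::size_t current, const std::string &name)
{
  const ToyDir *start = next_dir (path, current);
  if (!start)
    return false;  // Path exhausted: answer "not found", no diagnostic.
  for (const ToyDir *d = start; d != path.data () + path.size (); ++d)
    for (const std::string &f : d->files)
      if (f == name)
        return true;
  return false;
}
```

The early return is the important part: when the including file already lives in the last directory, there is nothing left to search, and the query simply evaluates to false.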

As suggested by the PR author, it is easiest to make a testcase by using
the -idirafter option. To enable that, also modify the dg-additional-options
testsuite procedure to make the global $srcdir available, since -idirafter
requires the full path.

libcpp/ChangeLog:

PR preprocessor/80755
* files.cc (search_path_head): Add SUPPRESS_DIAGNOSTIC argument
defaulting to false.
(_cpp_has_header): Silently return false if the search path has been
exhausted, rather than issuing a diagnostic and then hitting an
assert.

gcc/testsuite/ChangeLog:

* lib/gcc-defs.exp (dg-additional-options): Make $srcdir usable in a
dg-additional-options directive.
* c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h: New 
test.
* c-c++-common/cpp/has-include-next-2.c: New test.
---
 libcpp/files.cc  | 12 
 .../cpp/has-include-next-2-dir/has-include-next-2.h  |  3 +++
 gcc/testsuite/c-c++-common/cpp/has-include-next-2.c  |  4 
 gcc/testsuite/lib/gcc-defs.exp   |  1 +
 4 files changed, 16 insertions(+), 4 deletions(-)
 create mode 100644 
gcc/testsuite/c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h
 create mode 100644 gcc/testsuite/c-c++-common/cpp/has-include-next-2.c

diff --git a/libcpp/files.cc b/libcpp/files.cc
index 27301d79fa4..aaab4b13a6a 100644
--- a/libcpp/files.cc
+++ b/libcpp/files.cc
@@ -181,7 +181,8 @@ static bool read_file_guts (cpp_reader *pfile, _cpp_file 
*file,
 static bool read_file (cpp_reader *pfile, _cpp_file *file,
   location_t loc);
 static struct cpp_dir *search_path_head (cpp_reader *, const char *fname,
-int angle_brackets, enum include_type);
+int angle_brackets, enum include_type,
+bool suppress_diagnostic = false);
 static const char *dir_name_of_file (_cpp_file *file);
 static void open_file_failed (cpp_reader *pfile, _cpp_file *file, int,
  location_t);
@@ -1041,7 +1042,7 @@ _cpp_mark_file_once_only (cpp_reader *pfile, _cpp_file 
*file)
nothing left in the path, returns NULL.  */
 static struct cpp_dir *
 search_path_head (cpp_reader *pfile, const char *fname, int angle_brackets,
- enum include_type type)
+ enum include_type type, bool suppress_diagnostic)
 {
   cpp_dir *dir;
   _cpp_file *file;
@@ -1070,7 +1071,7 @@ search_path_head (cpp_reader *pfile, const char *fname, 
int angle_brackets,
 return make_cpp_dir (pfile, dir_name_of_file (file),
 pfile->buffer ? pfile->buffer->sysp : 0);
 
-  if (dir == NULL)
+  if (dir == NULL && !suppress_diagnostic)
 cpp_error (pfile, CPP_DL_ERROR,
   "no include path in which to search for %s", fname);
 
@@ -2164,7 +2165,10 @@ bool
 _cpp_has_header (cpp_reader *pfile, const char *fname, int angle_brackets,
 enum include_type type)
 {
-  cpp_dir *start_dir = search_path_head (pfile, fname, angle_brackets, type);
+  cpp_dir *start_dir = search_path_head (pfile, fname, angle_brackets, type,
+/* suppress_diagnostic = */ true);
+  if (!start_dir)
+return false;
   _cpp_file *file = _cpp_find_file (pfile, fname, start_dir, angle_brackets,
_cpp_FFK_HAS_INCLUDE, 0);
   return file->err_no != ENOENT;
diff --git 
a/gcc/testsuite/c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h 
b/gcc/testsuite/c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h
new file mode 100644
index 000..1e4be6ce7a3
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/has-include-next-2-dir/has-include-next-2.h
@@ -0,0 +1,3 @@
+#if __has_include_next()
+/* This formerly led to an ICE when the current directory was the last one in 
the path.  */
+#endif
diff --git a/gcc/testsuite/c-c++-common/cpp/has-include-next-2.c 
b/gcc/testsuite/c-c++-common/cpp/has-include-next-2.c
new file mode 100644
index 000..4928d3e992c
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/has-include-next-2.c
@@ -0,0 +1,4 @@
+/* PR preprocessor/80755 */
+/* { dg-do preprocess } */
+/* { dg-additional-options "-idirafter 
$srcdir/c-c++-common/cpp/has-include-next-2-dir" } */
+#include 
diff --git a/gcc/testsuite/lib/gcc-defs.exp b/gcc/testsuite/lib/gcc-defs.exp
index fc

Re: OpenMP offloading vs. C++ static local variables

2023-12-21 Thread Thomas Schwinge

Hi Jakub!

On 2023-12-07T16:33:08+0100, Jakub Jelinek  wrote:
> On Thu, Dec 07, 2023 at 04:09:04PM +0100, Thomas Schwinge wrote:
>> > Yeah, I believe we should in the omp_discover_* sub-pass handle with
>> > a help of a langhook automatically mark the guard variables (possibly
>> > iff the guarded variable is marked?),
>>
>> Looking at 'gcc/omp-offload.cc:omp_discover_implicit_declare_target' left
>> me confused how that would be the code that marks up 'static' variables
>> as implicit 'omp declare target'.  Working through a simple POD example
>> (say, 's%static S s%static int i') it turns out, indeed that's not where
>> that is happening, but instead 'gcc/gimplify.cc:gimplify_bind_expr' is
>> the place:
>
> Sure, that is for the case where those local statics should be marked
> implicitly because they appear in a target function.
> They can be also marked explicitly by the user through
> #pragma omp declare target enter (name_of_static_var)
> or
> [[omp::decl (declare target)]] attribute on it etc.

These three: implicitly, or explicit '#pragma omp declare target' etc.,
or inside '#pragma omp begin declare target' region are the only OpenMP
facilities to get things 'omp declare target'ed, right?

>> That said...  Couldn't we indeed move this gimplification-level code re
>> 'Static locals [...] need to be "omp declare target"' into
>> 'gcc/omp-offload.cc:omp_discover_implicit_declare_target'?
>
> The omp-offload.cc discovery stuff was added for stuff where the OpenMP
> standard says something is implicitly declare target because there is
> some use of it satisfying some rule.
> Like, calls to functions defined in current compilation unit referenced in
> target region or something similar, or such calls referenced in declare
> target static var initializers.
> So, that feels to me like the right spot to handle the guards as well.
> Of course, the middle-end doesn't know about C++ FE's get_guard variable,
> so it should be some new language hook which would take care of it.
> The omp_discover_declare* functions can add further VAR_DECLs to the
> worklist, so I'd probably call the new language hook in the
> omp_discover_implicit_declare_target last loop.
> Or maybe even better just handle that in the
> cxx_omp_finish_decl_inits hook.  You can just
>   FOR_EACH_VARIABLE (vnode)
> if (DECL_FUNCTION_SCOPE_P (vnode->decl)
>   && omp_declare_target_var_p (vnode->decl))
>   {
>   tree sname = mangle_guard_variable (decl);
>   tree guard = get_global_binding (sname);
>   if (guard)
> ... mark guard as declare target if not yet marked ...
>   }
> because guard var initializers don't really mention anything and so
> their addition doesn't need to trigger further worklist changes.

That doesn't generally work, as the gimplification-level code re
'Static locals [...] need to be "omp declare target"' runs *after*
'omp_discover_implicit_declare_target'.  Thus my "move" idea above.
However, let's defer the latter one; I've now got a simple setup where
the new language hook is invoked in all necessary places.  (Will post
later.)

>> > And sure, __cxa_guard_* would need to be implemented in the offloading
>> > libsupc++.a or libstdc++.a.
>>
>> Until proper libstdc++/libsupc++ support emerges (I'm working on it...),
>> my idea was to add a temporary 'libgomp/config/accel/*.c' implementation
>> (based on 'libstdc++-v3/libsupc++/guard.cc').
>
> That looks reasonable.

OK to push, for a start, the attached
"GCN, nvptx: Basic '__cxa_guard_{acquire,abort,release}' for C++ static local 
variables support"?
That's now in libgcc not libgomp, so that it's also usable for GCN, nvptx
target testing, where we thus see a number of FAIL -> PASS progressions.
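
As a rough illustration of what the single-threaded variant has to do, here is a sketch with hypothetical `toy_guard_*` names standing in for the `__cxa_guard_*` entry points. It assumes the Itanium C++ ABI convention that the first byte of the guard is nonzero once initialization has completed, and it leaves out the recursion detection that libstdc++ performs.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the ABI's guard type and entry points.
using toy_guard = std::uint64_t;

// Returns 1 if the caller must run the initializer, 0 if it already ran.
int toy_guard_acquire (toy_guard *g)
{
  return *reinterpret_cast<unsigned char *> (g) == 0 ? 1 : 0;
}

// Mark initialization as complete.
void toy_guard_release (toy_guard *g)
{
  *reinterpret_cast<unsigned char *> (g) = 1;
}

// Initialization failed; with no lock held there is nothing to undo here.
void toy_guard_abort (toy_guard *)
{
}

int compute_calls = 0;

int compute ()
{
  ++compute_calls;
  return 42;
}

// Roughly the code a compiler emits for `static int value = compute ();`.
int get_value ()
{
  static toy_guard guard;  // Zero-initialized.
  static int value;
  if (toy_guard_acquire (&guard))
    {
      value = compute ();
      toy_guard_release (&guard);
    }
  return value;
}
```

Calling `get_value` repeatedly runs `compute` only once; a multi-threaded version would additionally need the acquire step to block concurrent initializers.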


Grüße
 Thomas


From d40678768ae90c3fe1208cffd7d92e7058db5bbf Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 20 Dec 2023 12:27:48 +0100
Subject: [PATCH] GCN, nvptx: Basic '__cxa_guard_{acquire,abort,release}' for
 C++ static local variables support

For now, for single-threaded GCN, nvptx target use only; extension for
multi-threaded offloading use to follow later.

	libgcc/
	* c++-minimal/README: New.
	* c++-minimal/guard.c: New.
	* config/gcn/t-amdgcn (LIB2ADD): Add it.
	* config/nvptx/t-nvptx (LIB2ADD): Likewise.
---
 libgcc/c++-minimal/README   |  2 +
 libgcc/c++-minimal/guard.c  | 97 +
 libgcc/config/gcn/t-amdgcn  |  3 ++
 libgcc/config/nvptx/t-nvptx |  3 ++
 4 files changed, 105 insertions(+)
 create mode 100644 libgcc/c++-minimal/README
 create mode 100644 libgcc/c++-minimal/guard.c

diff --git a/libgcc/c++-minimal/README b/libgcc/c++-minimal/README
new file mode 100644
index 000..832f1265f7e
--- /dev/null
+++ b/libgcc/c++-mi

Re: Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-21 Thread chenglulu

Sorry, I've been busy with something else for the past two days. I don't think
there's anything wrong with the code, but I need to run the SPEC tests first. :-)

On 2023/12/21 at 7:56 PM, Xi Ruoyao wrote:

Ping :).

On Tue, 2023-12-12 at 14:47 +0800, Xi Ruoyao wrote:

The problem with peephole2 is it uses a naive sliding-window algorithm
and misses many cases.  For example:

     float a[1];
     float t() { return a[0] + a[8000]; }

is compiled to:

     la.local    $r13,a
     la.local    $r12,a+32768
     fld.s   $f1,$r13,0
     fld.s   $f0,$r12,-768
     fadd.s  $f0,$f1,$f0

by trunk.  But as we've explained in r14-4851, the following would be
better with -mexplicit-relocs=auto:

     pcalau12i   $r13,%pc_hi20(a)
     pcalau12i   $r12,%pc_hi20(a+32000)
     fld.s   $f1,$r13,%pc_lo12(a)
     fld.s   $f0,$r12,%pc_lo12(a+32000)
     fadd.s  $f0,$f1,$f0

However, the sliding-window algorithm just won't detect the pcalau12i/fld
pairs to be optimized here, because the two instructions of each pair are
not adjacent.  Using a define_insn_and_split in the combine pass works
around the issue.
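
The difference can be illustrated with a toy model. This is a sketch under stated assumptions: `ToyInsn` and both counting functions are hypothetical, they are not GCC's RTL or its passes, and the second function only approximates the idea of following data dependencies instead of adjacency.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction: "la" defines `reg` with an address, "ld" loads through
// `reg` as its base.  Hypothetical model, not GCC's RTL.
struct ToyInsn
{
  std::string op;
  int reg;
};

// Two-instruction sliding window: a pair is found only when the load
// directly follows the "la" that defines its base register.
int count_window_matches (const std::vector<ToyInsn> &insns)
{
  int n = 0;
  for (std::size_t i = 0; i + 1 < insns.size (); ++i)
    if (insns[i].op == "la" && insns[i + 1].op == "ld"
        && insns[i].reg == insns[i + 1].reg)
      ++n;
  return n;
}

// Dependency-based matching: pair each load with the most recent "la"
// defining its base register, however far back that definition is.
int count_def_use_matches (const std::vector<ToyInsn> &insns)
{
  int n = 0;
  for (std::size_t i = 0; i < insns.size (); ++i)
    {
      if (insns[i].op != "ld")
        continue;
      for (std::size_t j = i; j-- > 0;)
        if (insns[j].op == "la" && insns[j].reg == insns[i].reg)
          {
            ++n;
            break;
          }
    }
  return n;
}
```

For the sequence from the example above (la $r13, la $r12, fld via $r13, fld via $r12), the window finds no pair at all, while the dependency-based search finds both.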

gcc/ChangeLog:

* config/loongarch/loongarch.md:
(simple_load): New
define_insn_and_split.
(simple_load_off): Likewise.
(simple_load_ext): Likewise.
(simple_load_offext):
Likewise.
(simple_store): Likewise.
(simple_store_off): Likewise.
(define_peephole2): Remove la.local/[f]ld peepholes.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/explicit-relocs-auto-single-load-store-2.c:
New test.
---

Bootstrapped & regtested on loongarch64-linux-gnu.  Ok for trunk?

  gcc/config/loongarch/loongarch.md | 165 +-
  ...explicit-relocs-auto-single-load-store-2.c |  11 ++
  2 files changed, 98 insertions(+), 78 deletions(-)
  create mode 100644 
gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-single-load-store-2.c

diff --git a/gcc/config/loongarch/loongarch.md 
b/gcc/config/loongarch/loongarch.md
index 7b26d15aa4e..4009de408fb 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -4033,101 +4033,110 @@ (define_insn "loongarch_crcc_w__w"
  ;;
  ;; And if the pseudo op cannot be relaxed, we'll get a worse result (with
  ;; 3 instructions).
-(define_peephole2
-  [(set (match_operand:P 0 "register_operand")
-   (match_operand:P 1 "symbolic_pcrel_operand"))
-   (set (match_operand:LD_AT_LEAST_32_BIT 2 "register_operand")
-   (mem:LD_AT_LEAST_32_BIT (match_dup 0)))]
-  "la_opt_explicit_relocs == EXPLICIT_RELOCS_AUTO \
-   && (TARGET_CMODEL_NORMAL || TARGET_CMODEL_MEDIUM) \
-   && (peep2_reg_dead_p (2, operands[0]) \
-   || REGNO (operands[0]) == REGNO (operands[2]))"
-  [(set (match_dup 2)
-   (mem:LD_AT_LEAST_32_BIT (lo_sum:P (match_dup 0) (match_dup 1))))]
+(define_insn_and_split "simple_load"
+  [(set (match_operand:LD_AT_LEAST_32_BIT 0 "register_operand" "=r,f")
+   (mem:LD_AT_LEAST_32_BIT
+     (match_operand:P 1 "symbolic_pcrel_operand" "")))]
+  "loongarch_pre_reload_split () \
+   && la_opt_explicit_relocs == EXPLICIT_RELOCS_AUTO \
+   && (TARGET_CMODEL_NORMAL || TARGET_CMODEL_MEDIUM)"
+  "#"
+  ""
+  [(set (match_dup 0)
+   (mem:LD_AT_LEAST_32_BIT (lo_sum:P (match_dup 2) (match_dup 1))))]
    {
-    emit_insn (gen_pcalau12i_gr (operands[0], operands[1]));
+    operands[2] = gen_reg_rtx (Pmode);
+    emit_insn (gen_pcalau12i_gr (operands[2], operands[1]));
    })
  
-(define_peephole2
-  [(set (match_operand:P 0 "register_operand")
-   (match_operand:P 1 "symbolic_pcrel_operand"))
-   (set (match_operand:LD_AT_LEAST_32_BIT 2 "register_operand")
-   (mem:LD_AT_LEAST_32_BIT (plus (match_dup 0)
-   (match_operand 3 "const_int_operand"))))]
-  "la_opt_explicit_relocs == EXPLICIT_RELOCS_AUTO \
-   && (TARGET_CMODEL_NORMAL || TARGET_CMODEL_MEDIUM) \
-   && (peep2_reg_dead_p (2, operands[0]) \
-   || REGNO (operands[0]) == REGNO (operands[2]))"
-  [(set (match_dup 2)
-   (mem:LD_AT_LEAST_32_BIT (lo_sum:P (match_dup 0) (match_dup 1))))]
+(define_insn_and_split "simple_load_off"
+  [(set (match_operand:LD_AT_LEAST_32_BIT 0 "register_operand" "=r,f")
+   (mem:LD_AT_LEAST_32_BIT
+     (plus (match_operand:P 1 "symbolic_pcrel_operand" "")
+   (match_operand 2 "const_int_operand" ""))))]
+  "loongarch_pre_reload_split () \
+   && la_opt_explicit_relocs == EXPLICIT_RELOCS_AUTO \
+   && (TARGET_CMODEL_NORMAL || TARGET_CMODEL_MEDIUM)"
+  "#"
+  ""
+  [(set (match_dup 0)
+   (mem:LD_AT_LEAST_32_BIT (lo_sum:P (match_dup 2) (match_dup 1))))]
    {
-    operands[1] = plus_constant (Pmode, operands[1], INTVAL (operands[3]));
-    emit_insn (gen_pcalau12i_gr (operands[0], operands[1]));
+    HOST_WIDE_INT offset = INTVAL (operands[2]);
+    operands[2] = gen_reg_rtx (Pmode);
+    operands[1] = plus_constant (Pmode, operands[1], offset);
+    emit_insn (gen_pcalau12i_gr (operands[2], operands[1]));
    })
  
-(define_peephole2
-  [(set (match_operand:P 0 "

Ping: [PATCH] LoongArch: Replace -mexplicit-relocs=auto simple-used address peephole2 with combine

2023-12-21 Thread Xi Ruoyao

Ping :).

On Tue, 2023-12-12 at 14:47 +0800, Xi Ruoyao wrote:
> The problem with peephole2 is it uses a naive sliding-window algorithm
> and misses many cases.  For example:
> 
>     float a[1];
>     float t() { return a[0] + a[8000]; }
> 
> is compiled to:
> 
>     la.local    $r13,a
>     la.local    $r12,a+32768
>     fld.s   $f1,$r13,0
>     fld.s   $f0,$r12,-768
>     fadd.s  $f0,$f1,$f0
> 
> by trunk.  But as we've explained in r14-4851, the following would be
> better with -mexplicit-relocs=auto:
> 
>     pcalau12i   $r13,%pc_hi20(a)
>     pcalau12i   $r12,%pc_hi20(a+32000)
>     fld.s   $f1,$r13,%pc_lo12(a)
>     fld.s   $f0,$r12,%pc_lo12(a+32000)
>     fadd.s  $f0,$f1,$f0
> 
> However, the sliding-window algorithm just won't detect the pcalau12i/fld
> pair to be optimized.  Using a define_insn_and_split in the combine pass
> will work around the issue.
> 
> gcc/ChangeLog:
> 
>   * config/loongarch/loongarch.md:
>   (simple_load): New
>   define_insn_and_split.
>   (simple_load_off): Likewise.
>   (simple_load_ext): Likewise.
>   (simple_load_offext):
>   Likewise.
>   (simple_store): Likewise.
>   (simple_store_off): Likewise.
>   (define_peephole2): Remove la.local/[f]ld peepholes.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/loongarch/explicit-relocs-auto-single-load-store-2.c:
>   New test.
> ---
> 
> Bootstrapped & regtested on loongarch64-linux-gnu.  Ok for trunk?
> 
>  gcc/config/loongarch/loongarch.md | 165 +-
>  ...explicit-relocs-auto-single-load-store-2.c |  11 ++
>  2 files changed, 98 insertions(+), 78 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-single-load-store-2.c
> 
> diff --git a/gcc/config/loongarch/loongarch.md 
> b/gcc/config/loongarch/loongarch.md
> index 7b26d15aa4e..4009de408fb 100644
> --- a/gcc/config/loongarch/loongarch.md
> +++ b/gcc/config/loongarch/loongarch.md
> @@ -4033,101 +4033,110 @@ (define_insn "loongarch_crcc_w__w"
>  ;;
>  ;; And if the pseudo op cannot be relaxed, we'll get a worse result (with
>  ;; 3 instructions).
> -(define_peephole2
> -  [(set (match_operand:P 0 "register_operand")
> - (match_operand:P 1 "symbolic_pcrel_operand"))
> -   (set (match_operand:LD_AT_LEAST_32_BIT 2 "register_operand")
> - (mem:LD_AT_LEAST_32_BIT (match_dup 0)))]
> -  "la_opt_explicit_relocs == EXPLICIT_RELOCS_AUTO \
> -   && (TARGET_CMODEL_NORMAL || TARGET_CMODEL_MEDIUM) \
> -   && (peep2_reg_dead_p (2, operands[0]) \
> -   || REGNO (operands[0]) == REGNO (operands[2]))"
> -  [(set (match_dup 2)
> - (mem:LD_AT_LEAST_32_BIT (lo_sum:P (match_dup 0) (match_dup 1))))]
> +(define_insn_and_split "simple_load"
> +  [(set (match_operand:LD_AT_LEAST_32_BIT 0 "register_operand" "=r,f")
> + (mem:LD_AT_LEAST_32_BIT
> +   (match_operand:P 1 "symbolic_pcrel_operand" "")))]
> +  "loongarch_pre_reload_split () \
> +   && la_opt_explicit_relocs == EXPLICIT_RELOCS_AUTO \
> +   && (TARGET_CMODEL_NORMAL || TARGET_CMODEL_MEDIUM)"
> +  "#"
> +  ""
> +  [(set (match_dup 0)
> + (mem:LD_AT_LEAST_32_BIT (lo_sum:P (match_dup 2) (match_dup 1))))]
>    {
> -    emit_insn (gen_pcalau12i_gr (operands[0], operands[1]));
> +    operands[2] = gen_reg_rtx (Pmode);
> +    emit_insn (gen_pcalau12i_gr (operands[2], operands[1]));
>    })
>  
> -(define_peephole2
> -  [(set (match_operand:P 0 "register_operand")
> - (match_operand:P 1 "symbolic_pcrel_operand"))
> -   (set (match_operand:LD_AT_LEAST_32_BIT 2 "register_operand")
> - (mem:LD_AT_LEAST_32_BIT (plus (match_dup 0)
> - (match_operand 3 "const_int_operand"))))]
> -  "la_opt_explicit_relocs == EXPLICIT_RELOCS_AUTO \
> -   && (TARGET_CMODEL_NORMAL || TARGET_CMODEL_MEDIUM) \
> -   && (peep2_reg_dead_p (2, operands[0]) \
> -   || REGNO (operands[0]) == REGNO (operands[2]))"
> -  [(set (match_dup 2)
> - (mem:LD_AT_LEAST_32_BIT (lo_sum:P (match_dup 0) (match_dup 1))))]
> +(define_insn_and_split "simple_load_off"
> +  [(set (match_operand:LD_AT_LEAST_32_BIT 0 "register_operand" "=r,f")
> + (mem:LD_AT_LEAST_32_BIT
> +   (plus (match_operand:P 1 "symbolic_pcrel_operand" "")
> + (match_operand 2 "const_int_operand" ""))))]
> +  "loongarch_pre_reload_split () \
> +   && la_opt_explicit_relocs == EXPLICIT_RELOCS_AUTO \
> +   && (TARGET_CMODEL_NORMAL || TARGET_CMODEL_MEDIUM)"
> +  "#"
> +  ""
> +  [(set (match_dup 0)
> + (mem:LD_AT_LEAST_32_BIT (lo_sum:P (match_dup 2) (match_dup 1))))]
>    {
> -    operands[1] = plus_constant (Pmode, operands[1], INTVAL (operands[3]));
> -    emit_insn (gen_pcalau12i_gr (operands[0], operands[1]));
> +    HOST_WIDE_INT offset = INTVAL (operands[2]);
> +    operands[2] = gen_reg_rtx (Pmode);
> +    operands[1] = plus_constant (Pmode, operands[1], offset);
> +    emit_insn (gen_pcalau12i_gr (operands[2], operands[1]));
>    })
>  
> -(define_peephole2
> -  [(set (match_operand:P 0 "register_o

RE: [PATCH v7] libgfortran: Replace mutex with rwlock

2023-12-21 Thread Thomas Schwinge

Hi!

On 2023-12-13T21:52:29+0100, I wrote:
> On 2023-12-12T02:05:26+, "Zhu, Lipeng"  wrote:
>> On 2023/12/12 1:45, H.J. Lu wrote:
>>> On Sat, Dec 9, 2023 at 7:25 PM Zhu, Lipeng  wrote:
>>> > On 2023/12/9 23:23, Jakub Jelinek wrote:
>>> > > On Sat, Dec 09, 2023 at 10:39:45AM -0500, Lipeng Zhu wrote:
>>> > > > This patch tries to introduce the rwlock and split the read/write to
>>> > > > unit_root tree and unit_cache with rwlock instead of the mutex to
>>> > > > increase CPU efficiency. In the get_gfc_unit function, the
>>> > > > percentage to step into the insert_unit function is around 30%, in
>>> > > > most instances, we can get the unit in the phase of reading the
>>> > > > unit_cache or unit_root tree. So split the read/write phase by
>>> > > > rwlock would be an approach to make it more parallel.
>>> > > >
>>> > > > BTW, the IPC metrics can gain around 9x in our test server with
>>> > > > 220 cores. The benchmark we used is
>>> > > > https://github.com/rwesson/NEAT
>
>>> > > Ok for trunk, thanks.
>
>>> > Thanks! Looking forward to landing to trunk.
>
>>> Pushed for you.

> I've just filed 
> "'libgomp.fortran/rwlock_1.f90', 'libgomp.fortran/rwlock_3.f90' execution 
> test timeouts".
> Would you be able to look into that?

See my update in there.


Grüße
 Thomas
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955

[pushed] aarch64: Fix early RA handling of deleted insns [PR113094]

2023-12-21 Thread Richard Sandiford

The testcase constructs a sequence of insns that are fully dead
and yet (due to forced options) are not removed as such.  This
triggered a case where we would emit a meaningless reload for a
to-be-deleted insn.

We can't delete the insns first because that might disrupt the
iteration ranges.  So this patch turns them into notes before
the walk and then continues to delete them properly afterwards.
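
The ordering described above (stub out first, walk second, delete last) can be sketched in plain C. This is only an illustration under assumptions: `toy_insn`, `stub_out` and `walk_range` are hypothetical names, and the real pass works on GCC's rtx_insn chain rather than on this toy list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an insn in a singly linked chain.  */
struct toy_insn
{
  struct toy_insn *next;
  bool deleted;   /* Plays the role of "turned into a deleted note".  */
  int reloads;    /* Number of reloads emitted for this insn.  */
};

/* Step 1: stub out the dead insns without unlinking them, so that any
   recorded [first, last] range still starts and ends at a valid node.  */
static void
stub_out (struct toy_insn **dead, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dead[i]->deleted = true;
}

/* Step 2: walk a recorded range and "emit a reload" only for live insns.
   Returns the number of insns that were processed.  */
static int
walk_range (struct toy_insn *first, struct toy_insn *last)
{
  int processed = 0;
  for (struct toy_insn *insn = first; insn; insn = insn->next)
    {
      if (!insn->deleted)
        {
          insn->reloads++;
          processed++;
        }
      if (insn == last)
        break;
    }
  return processed;
}
```

Because the stubbed-out node stays linked, a range whose first or last element is dead can still be walked; actually unlinking the nodes is left until after the walk.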

Tested on aarch64-linux-gnu & pushed.

Richard


gcc/
PR target/113094
* config/aarch64/aarch64-early-ra.cc (apply_allocation): Stub
out instructions that are going to be deleted before iterating
over the rest.

gcc/testsuite/
PR target/113094
* gcc.target/aarch64/pr113094.c: New test.
---
 gcc/config/aarch64/aarch64-early-ra.cc  |  3 +++
 gcc/testsuite/gcc.target/aarch64/pr113094.c | 10 ++
 2 files changed, 13 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr113094.c

diff --git a/gcc/config/aarch64/aarch64-early-ra.cc 
b/gcc/config/aarch64/aarch64-early-ra.cc
index 24415bd829c..5d2da3e1110 100644
--- a/gcc/config/aarch64/aarch64-early-ra.cc
+++ b/gcc/config/aarch64/aarch64-early-ra.cc
@@ -3210,6 +3210,9 @@ early_ra::maybe_convert_to_strided_access (rtx_insn *insn)
 void
 early_ra::apply_allocation ()
 {
+  for (auto *insn : m_dead_insns)
+    set_insn_deleted (insn);
+
   rtx_insn *prev;
   for (auto insn_range : m_insn_ranges)
     for (rtx_insn *insn = insn_range.first;
diff --git a/gcc/testsuite/gcc.target/aarch64/pr113094.c 
b/gcc/testsuite/gcc.target/aarch64/pr113094.c
new file mode 100644
index 000..b79e1f744ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr113094.c
@@ -0,0 +1,10 @@
+/* { dg-options "-fno-tree-dse -Ofast -fno-tree-coalesce-vars -fno-dce -fno-tree-dce" } */
+
+struct TV4 {
+  __attribute__((vector_size(sizeof(int) * 4))) int v;
+};
+void modify() {
+  struct TV4 __trans_tmp_1, temp;
+  temp.v[0] = temp.v[3] = 0;
+  __trans_tmp_1 = temp;
+}
-- 
2.25.1

[pushed] aarch64: Fix cut-&-pasto in early RA pass [PR112948]

2023-12-21 Thread Richard Sandiford

As the PR notes, there was a cut-&-pasto in find_strided_accesses.
I've not been able to find a testcase that shows the problem.
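
For readers who want to see the effect of the fix in isolation, here is a small C sketch of the corrected branch structure. The `toy_group` struct and `link_polarities` function are hypothetical reductions; the meaning given to `pref` (+1 for "same polarity", -1 for "opposite") and the second assignment in the final `else` are assumptions, since the diff context below is cut off at that point.

```c
/* Hypothetical reduced version of the group record: 0 means "polarity not
   yet chosen", otherwise the value is +1 or -1.  */
struct toy_group { int strided_polarity; };

/* Make the two groups' polarities consistent.  PREF is assumed to be +1 if
   they should match and -1 if they should be opposite.  */
static void
link_polarities (struct toy_group *group1, struct toy_group *group2, int pref)
{
  if (group1->strided_polarity)
    group2->strided_polarity = group1->strided_polarity * pref;
  else if (group2->strided_polarity)
    /* Before the fix this branch repeated the first condition, so it was
       unreachable and GROUP2's existing polarity was never propagated.  */
    group1->strided_polarity = group2->strided_polarity * pref;
  else
    {
      group1->strided_polarity = 1;
      /* Assumption: not visible in the truncated diff context.  */
      group2->strided_polarity = pref;
    }
}
```

With group1 unset, group2 at -1 and a `pref` of 1, the fixed code copies -1 into group1 and leaves group2 alone.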

Tested on aarch64-linux-gnu & pushed.

Richard


gcc/
PR target/112948
* config/aarch64/aarch64-early-ra.cc (find_strided_accesses): Fix
cut-&-pasto.
---
 gcc/config/aarch64/aarch64-early-ra.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-early-ra.cc 
b/gcc/config/aarch64/aarch64-early-ra.cc
index 484db94513d..24415bd829c 100644
--- a/gcc/config/aarch64/aarch64-early-ra.cc
+++ b/gcc/config/aarch64/aarch64-early-ra.cc
@@ -2072,8 +2072,8 @@ early_ra::find_strided_accesses ()
 
   if (group1->strided_polarity)
     group2->strided_polarity = group1->strided_polarity * pref;
-  else if (group1->strided_polarity)
-    group2->strided_polarity = group1->strided_polarity * pref;
+  else if (group2->strided_polarity)
+    group1->strided_polarity = group2->strided_polarity * pref;
   else
     {
       group1->strided_polarity = 1;
-- 
2.25.1

Re: [PATCH] aarch64: Prevent moving throwing accesses in ldp/stp pass [PR113093]

2023-12-21 Thread Richard Sandiford

Alex Coplan  writes:
> As the PR shows, there was nothing to prevent the ldp/stp pass from
> trying to move throwing insns, which led to an RTL verification
> failure.
>
> This patch fixes that.
>
> Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk?
>
> Thanks,
> Alex
>
> gcc/ChangeLog:
>
>   PR target/113093
>   * config/aarch64/aarch64-ldp-fusion.cc (latest_hazard_before):
>   If the insn is throwing, record the previous insn as a hazard to
>   prevent moving it from the end of the BB.
>
> gcc/testsuite/ChangeLog:
>
>   PR target/113093
>   * gcc.dg/pr113093.c: New test.

OK, thanks.

Richard

> diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc 
> b/gcc/config/aarch64/aarch64-ldp-fusion.cc
> index 0e2c299a0bf..59db70e9cd0 100644
> --- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
> +++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
> @@ -618,6 +618,13 @@ latest_hazard_before (insn_info *insn, rtx *ignore,
>  {
>insn_info *result = nullptr;
>  
> +  // If the insn can throw then it is at the end of a BB and we can't
> +  // move it, model this by recording a hazard in the previous insn
> +  // which will prevent moving the insn up.
> +  if (cfun->can_throw_non_call_exceptions
> +  && find_reg_note (insn->rtl (), REG_EH_REGION, NULL_RTX))
> +return insn->prev_nondebug_insn ();
> +
>// Return true if we registered the hazard.
>auto hazard = [&](insn_info *h) -> bool
>  {
> diff --git a/gcc/testsuite/gcc.dg/pr113093.c b/gcc/testsuite/gcc.dg/pr113093.c
> new file mode 100644
> index 000..af2a334b45d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr113093.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Os -fharden-control-flow-redundancy -fnon-call-exceptions" } */
> +_Complex long *c;
> +void init() { *c = 1.0; }

Re: [pushed][PATCH v1] LoongArch: Fix builtin function prototypes for LASX in doc.

2023-12-21 Thread chenglulu


Pushed to r14-6776.

On 2023/12/19 at 4:43 PM, chenxiaolong wrote:

gcc/ChangeLog:

* doc/extend.texi: Fix two problems found in the previously submitted
documentation: wrong function return types, and variable names used
in place of the actual parameter types.
---
  gcc/doc/extend.texi | 24 
  1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 61c560a1cd3..cce6862b82b 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -18660,14 +18660,14 @@ __m256 __lasx_xvfnmsub_s (__m256, __m256, __m256);
  __m256d __lasx_xvfrecip_d (__m256d);
  __m256 __lasx_xvfrecip_s (__m256);
  __m256d __lasx_xvfrint_d (__m256d);
-__m256i __lasx_xvfrintrm_d (__m256d);
-__m256i __lasx_xvfrintrm_s (__m256);
-__m256i __lasx_xvfrintrne_d (__m256d);
-__m256i __lasx_xvfrintrne_s (__m256);
-__m256i __lasx_xvfrintrp_d (__m256d);
-__m256i __lasx_xvfrintrp_s (__m256);
-__m256i __lasx_xvfrintrz_d (__m256d);
-__m256i __lasx_xvfrintrz_s (__m256);
+__m256d __lasx_xvfrintrm_d (__m256d);
+__m256 __lasx_xvfrintrm_s (__m256);
+__m256d __lasx_xvfrintrne_d (__m256d);
+__m256 __lasx_xvfrintrne_s (__m256);
+__m256d __lasx_xvfrintrp_d (__m256d);
+__m256 __lasx_xvfrintrp_s (__m256);
+__m256d __lasx_xvfrintrz_d (__m256d);
+__m256 __lasx_xvfrintrz_s (__m256);
  __m256 __lasx_xvfrint_s (__m256);
  __m256d __lasx_xvfrsqrt_d (__m256d);
  __m256 __lasx_xvfrsqrt_s (__m256);
@@ -19134,10 +19134,10 @@ __m256i __lasx_xvssub_hu (__m256i, __m256i);
  __m256i __lasx_xvssub_w (__m256i, __m256i);
  __m256i __lasx_xvssub_wu (__m256i, __m256i);
  void __lasx_xvst (__m256i, void *, imm_n2048_2047);
-void __lasx_xvstelm_b (__m256i, void *, imm_n128_127, idx);
-void __lasx_xvstelm_d (__m256i, void *, imm_n128_127, idx);
-void __lasx_xvstelm_h (__m256i, void *, imm_n128_127, idx);
-void __lasx_xvstelm_w (__m256i, void *, imm_n128_127, idx);
+void __lasx_xvstelm_b (__m256i, void *, imm_n128_127, imm0_31);
+void __lasx_xvstelm_d (__m256i, void *, imm_n128_127, imm0_3);
+void __lasx_xvstelm_h (__m256i, void *, imm_n128_127, imm0_15);
+void __lasx_xvstelm_w (__m256i, void *, imm_n128_127, imm0_7);
  void __lasx_xvstx (__m256i, void *, long int);
  __m256i __lasx_xvsub_b (__m256i, __m256i);
  __m256i __lasx_xvsub_d (__m256i, __m256i);

Re:[pushed] [PATCH v2] extend.texi: Fix typos in LSX intrinsics

2023-12-21 Thread chenglulu


Pushed to r14-6775.

Thank you so much!

On 2023/12/13 at 11:26 PM, Jiajie Chen wrote:

Several typos have been found and fixed: missing semicolons, using
variable name instead of type, duplicate functions and wrong types.

gcc/ChangeLog:

* doc/extend.texi(__lsx_vabsd_di): remove extra `i' in name.
(__lsx_vfrintrm_d, __lsx_vfrintrm_s, __lsx_vfrintrne_d,
__lsx_vfrintrne_s, __lsx_vfrintrp_d, __lsx_vfrintrp_s, __lsx_vfrintrz_d,
__lsx_vfrintrz_s): fix return types.
(__lsx_vld, __lsx_vldi, __lsx_vldrepl_b, __lsx_vldrepl_d,
__lsx_vldrepl_h, __lsx_vldrepl_w, __lsx_vmaxi_b, __lsx_vmaxi_d,
__lsx_vmaxi_h, __lsx_vmaxi_w, __lsx_vmini_b, __lsx_vmini_d,
__lsx_vmini_h, __lsx_vmini_w, __lsx_vsrani_d_q, __lsx_vsrarni_d_q,
__lsx_vsrlni_d_q, __lsx_vsrlrni_d_q, __lsx_vssrani_d_q,
__lsx_vssrarni_d_q, __lsx_vssrarni_du_q, __lsx_vssrlni_d_q,
__lsx_vssrlrni_du_q, __lsx_vst, __lsx_vstx, __lsx_vssrani_du_q,
__lsx_vssrlni_du_q, __lsx_vssrlrni_d_q): add missing semicolon.
(__lsx_vpickve2gr_bu, __lsx_vpickve2gr_hu): fix typo in return
type.
(__lsx_vstelm_b, __lsx_vstelm_d, __lsx_vstelm_h,
__lsx_vstelm_w): use imm type for the last argument.
(__lsx_vsigncov_b, __lsx_vsigncov_h, __lsx_vsigncov_w,
__lsx_vsigncov_d): remove duplicate definitions.
---
  gcc/doc/extend.texi | 90 ++---
  1 file changed, 43 insertions(+), 47 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index f0c789f6cb4..ba1317c3510 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -17563,7 +17563,7 @@ int __lsx_bz_v (__m128i);
  int __lsx_bz_w (__m128i);
  __m128i __lsx_vabsd_b (__m128i, __m128i);
  __m128i __lsx_vabsd_bu (__m128i, __m128i);
-__m128i __lsx_vabsd_di (__m128i, __m128i);
+__m128i __lsx_vabsd_d (__m128i, __m128i);
  __m128i __lsx_vabsd_du (__m128i, __m128i);
  __m128i __lsx_vabsd_h (__m128i, __m128i);
  __m128i __lsx_vabsd_hu (__m128i, __m128i);
@@ -17769,14 +17769,14 @@ __m128 __lsx_vfnmsub_s (__m128, __m128, __m128);
  __m128d __lsx_vfrecip_d (__m128d);
  __m128 __lsx_vfrecip_s (__m128);
  __m128d __lsx_vfrint_d (__m128d);
-__m128i __lsx_vfrintrm_d (__m128d);
-__m128i __lsx_vfrintrm_s (__m128);
-__m128i __lsx_vfrintrne_d (__m128d);
-__m128i __lsx_vfrintrne_s (__m128);
-__m128i __lsx_vfrintrp_d (__m128d);
-__m128i __lsx_vfrintrp_s (__m128);
-__m128i __lsx_vfrintrz_d (__m128d);
-__m128i __lsx_vfrintrz_s (__m128);
+__m128d __lsx_vfrintrm_d (__m128d);
+__m128 __lsx_vfrintrm_s (__m128);
+__m128d __lsx_vfrintrne_d (__m128d);
+__m128 __lsx_vfrintrne_s (__m128);
+__m128d __lsx_vfrintrp_d (__m128d);
+__m128 __lsx_vfrintrp_s (__m128);
+__m128d __lsx_vfrintrz_d (__m128d);
+__m128 __lsx_vfrintrz_s (__m128);
  __m128 __lsx_vfrint_s (__m128);
  __m128d __lsx_vfrsqrt_d (__m128d);
  __m128 __lsx_vfrsqrt_s (__m128);
@@ -17845,12 +17845,12 @@ __m128i __lsx_vinsgr2vr_b (__m128i, int, imm0_15);
  __m128i __lsx_vinsgr2vr_d (__m128i, long int, imm0_1);
  __m128i __lsx_vinsgr2vr_h (__m128i, int, imm0_7);
  __m128i __lsx_vinsgr2vr_w (__m128i, int, imm0_3);
-__m128i __lsx_vld (void *, imm_n2048_2047)
-__m128i __lsx_vldi (imm_n1024_1023)
-__m128i __lsx_vldrepl_b (void *, imm_n2048_2047)
-__m128i __lsx_vldrepl_d (void *, imm_n256_255)
-__m128i __lsx_vldrepl_h (void *, imm_n1024_1023)
-__m128i __lsx_vldrepl_w (void *, imm_n512_511)
+__m128i __lsx_vld (void *, imm_n2048_2047);
+__m128i __lsx_vldi (imm_n1024_1023);
+__m128i __lsx_vldrepl_b (void *, imm_n2048_2047);
+__m128i __lsx_vldrepl_d (void *, imm_n256_255);
+__m128i __lsx_vldrepl_h (void *, imm_n1024_1023);
+__m128i __lsx_vldrepl_w (void *, imm_n512_511);
  __m128i __lsx_vldx (void *, long int);
  __m128i __lsx_vmadd_b (__m128i, __m128i, __m128i);
  __m128i __lsx_vmadd_d (__m128i, __m128i, __m128i);
@@ -17886,13 +17886,13 @@ __m128i __lsx_vmax_d (__m128i, __m128i);
  __m128i __lsx_vmax_du (__m128i, __m128i);
  __m128i __lsx_vmax_h (__m128i, __m128i);
  __m128i __lsx_vmax_hu (__m128i, __m128i);
-__m128i __lsx_vmaxi_b (__m128i, imm_n16_15)
+__m128i __lsx_vmaxi_b (__m128i, imm_n16_15);
  __m128i __lsx_vmaxi_bu (__m128i, imm0_31);
-__m128i __lsx_vmaxi_d (__m128i, imm_n16_15)
+__m128i __lsx_vmaxi_d (__m128i, imm_n16_15);
  __m128i __lsx_vmaxi_du (__m128i, imm0_31);
-__m128i __lsx_vmaxi_h (__m128i, imm_n16_15)
+__m128i __lsx_vmaxi_h (__m128i, imm_n16_15);
  __m128i __lsx_vmaxi_hu (__m128i, imm0_31);
-__m128i __lsx_vmaxi_w (__m128i, imm_n16_15)
+__m128i __lsx_vmaxi_w (__m128i, imm_n16_15);
  __m128i __lsx_vmaxi_wu (__m128i, imm0_31);
  __m128i __lsx_vmax_w (__m128i, __m128i);
  __m128i __lsx_vmax_wu (__m128i, __m128i);
@@ -17902,13 +17902,13 @@ __m128i __lsx_vmin_d (__m128i, __m128i);
  __m128i __lsx_vmin_du (__m128i, __m128i);
  __m128i __lsx_vmin_h (__m128i, __m128i);
  __m128i __lsx_vmin_hu (__m128i, __m128i);
-__m128i __lsx_vmini_b (__m128i, imm_n16_15)
+__m128i __lsx_vmini_b (__m128i, imm_n16_15);
  __m128i __lsx_vmini_bu (__

Re: [PATCH v3 4/6] RISC-V: Adds the prefix "th." for the instructions of XTheadVector.

2023-12-21 Thread Kito Cheng

Why not just check whether the mnemonic starts with 'v'? I don't think
xtheadvector can work together with other vector extensions such as vector
crypto or any other new vector extension, so we don't need an extra attribute.
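
As a rough illustration of the prefix check suggested above, here is a minimal C sketch of the string rule only. `add_th_prefix` is a hypothetical helper, not a GCC function, and where such a check would hook into the RISC-V backend's assembly output is left open.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write MNEMONIC into BUF (of size LEN), prefixed
   with "th." when it starts with 'v', unchanged otherwise.  Returns BUF.  */
static const char *
add_th_prefix (const char *mnemonic, char *buf, size_t len)
{
  if (mnemonic[0] == 'v')
    snprintf (buf, len, "th.%s", mnemonic);
  else
    snprintf (buf, len, "%s", mnemonic);
  return buf;
}
```

The rule is cheap because it needs no per-pattern bookkeeping, but it relies on the assumption stated in the message: every mnemonic starting with 'v' belongs to the vector extension that needs the prefix.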

On Thu, Dec 21, 2023 at 12:42 PM Jeff Law  wrote:
>
>
>
> On 12/20/23 15:48, 钟居哲 wrote:
> >  >> So rather than looking at the mode, would it make more sense to have an
> >>>attribute (or re-use an existing attribute) to identify which opcodes
> >>>are going to need prefixing?  We've got access to the INSN via
> >>>current_output_insn.  So we can lookup attributes trivially.
> >
> > Yes, I totally agree with Jeff's idea. We have added many attributes
> > for each RVV instruction.
> > For example, the VSETVL pass depends heavily on those attributes to do
> > the optimizations.
> Also note that with attributes, we can potentially even deal with cases
> where some alternatives need special handling while other alternatives
> simply aren't available with the thead extension.  Not sure if that's
> going to be needed or not, but it's worth remembering.
>
> Jeff

Re:[pushed] [PATCH v2] LoongArch: Fix incorrect code generation for sad pattern

2023-12-21 Thread chenglulu


Pushed to r14-6773.

On 2023/12/14 at 8:49 PM, Jiahao Xu wrote:

When I attempt to enable the vect_usad_char effective target for LoongArch,
the slp-reduc-sad.c and vect-reduc-sad*.c tests fail because the sad
patterns generate bad code. This patch fixes them: the sad patterns now use
zero extension instead of sign extension for the reduction.

Currently, we are fixing failed vectorized tests, and in the future, we will
enable more tests of "vect" for LoongArch.
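
To see why the widening step has to be a zero extension, consider a scalar C sketch of one lane. The helper names are hypothetical and this is not the vector code from the patch. An absolute difference of two bytes can be as large as 255, so its top bit may be set; widening that bit pattern as a signed value turns it negative. The signed result shown relies on the conversion to `int8_t` wrapping modulo 256, which is implementation-defined before C23 and is what GCC does.

```c
#include <stdint.h>

/* Absolute difference of two unsigned bytes; the result always fits in
   8 bits but may have the top bit set.  */
static uint8_t
absdiff_u8 (uint8_t a, uint8_t b)
{
  return a > b ? a - b : b - a;
}

/* Widen an 8-bit absolute difference as an unsigned value (zero
   extension).  */
static int32_t
widen_zero (uint8_t absdiff)
{
  return (int32_t) absdiff;
}

/* Widen the same bit pattern as a signed value (sign extension).  The
   conversion to int8_t is assumed to wrap, as it does with GCC.  */
static int32_t
widen_sign (uint8_t absdiff)
{
  return (int32_t) (int8_t) absdiff;
}
```

For the inputs 250 and 50 the absolute difference is 200; zero extension keeps 200, whereas sign extension yields -56 and would corrupt the sum. The same reasoning applies to the signed patterns, since the absolute difference of two signed bytes can also reach 255.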

gcc/ChangeLog:

* config/loongarch/lasx.md: Use zero expansion instruction.
* config/loongarch/lsx.md: Ditto.

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index eeac8cd984b..db6871507e2 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -5097,8 +5097,8 @@ (define_expand "usadv32qi"
rtx t2 = gen_reg_rtx (V16HImode);
rtx t3 = gen_reg_rtx (V8SImode);
emit_insn (gen_lasx_xvabsd_u_bu (t1, operands[1], operands[2]));
-  emit_insn (gen_lasx_xvhaddw_h_b (t2, t1, t1));
-  emit_insn (gen_lasx_xvhaddw_w_h (t3, t2, t2));
+  emit_insn (gen_lasx_xvhaddw_hu_bu (t2, t1, t1));
+  emit_insn (gen_lasx_xvhaddw_wu_hu (t3, t2, t2));
emit_insn (gen_addv8si3 (operands[0], t3, operands[3]));
DONE;
  })
@@ -5114,8 +5114,8 @@ (define_expand "ssadv32qi"
rtx t2 = gen_reg_rtx (V16HImode);
rtx t3 = gen_reg_rtx (V8SImode);
emit_insn (gen_lasx_xvabsd_s_b (t1, operands[1], operands[2]));
-  emit_insn (gen_lasx_xvhaddw_h_b (t2, t1, t1));
-  emit_insn (gen_lasx_xvhaddw_w_h (t3, t2, t2));
+  emit_insn (gen_lasx_xvhaddw_hu_bu (t2, t1, t1));
+  emit_insn (gen_lasx_xvhaddw_wu_hu (t3, t2, t2));
emit_insn (gen_addv8si3 (operands[0], t3, operands[3]));
DONE;
  })
diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
index dbdb423011b..5e5e2503636 100644
--- a/gcc/config/loongarch/lsx.md
+++ b/gcc/config/loongarch/lsx.md
@@ -3468,8 +3468,8 @@ (define_expand "usadv16qi"
rtx t2 = gen_reg_rtx (V8HImode);
rtx t3 = gen_reg_rtx (V4SImode);
emit_insn (gen_lsx_vabsd_u_bu (t1, operands[1], operands[2]));
-  emit_insn (gen_lsx_vhaddw_h_b (t2, t1, t1));
-  emit_insn (gen_lsx_vhaddw_w_h (t3, t2, t2));
+  emit_insn (gen_lsx_vhaddw_hu_bu (t2, t1, t1));
+  emit_insn (gen_lsx_vhaddw_wu_hu (t3, t2, t2));
emit_insn (gen_addv4si3 (operands[0], t3, operands[3]));
DONE;
  })
@@ -3485,8 +3485,8 @@ (define_expand "ssadv16qi"
rtx t2 = gen_reg_rtx (V8HImode);
rtx t3 = gen_reg_rtx (V4SImode);
emit_insn (gen_lsx_vabsd_s_b (t1, operands[1], operands[2]));
-  emit_insn (gen_lsx_vhaddw_h_b (t2, t1, t1));
-  emit_insn (gen_lsx_vhaddw_w_h (t3, t2, t2));
+  emit_insn (gen_lsx_vhaddw_hu_bu (t2, t1, t1));
+  emit_insn (gen_lsx_vhaddw_wu_hu (t3, t2, t2));
emit_insn (gen_addv4si3 (operands[0], t3, operands[3]));
DONE;
  })

Re:[pushed] [PATCH v2] LoongArch: Modify the check type of the vector builtin function.

2023-12-21 Thread chenglulu


Pushed to r14-6774.

On 2023/12/13 at 9:31 AM, chenxiaolong wrote:

When running the regression tests with the latest GCC 14 on LoongArch, the
test cases in the vector directory FAIL because of mismatched pointer
types. To fix this, the types of the variables used to check the results
are changed to match the parameter types defined by the vector builtin
functions.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vector/simd_correctness_check.h: Change the
variable types used for checking results to match the parameter
types defined by the vector builtin functions.
---
v1->v2:
If an error occurs, output the data in hexadecimal format, and fill the
high part of the result with 0.
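
The v2 behaviour (hexadecimal output, zero-filled high part) can be sketched as a stand-alone C helper. This is not the macro from the patch: `check_lane64` is a hypothetical function, and it uses `memcpy` plus the `PRIx64` format macro where the patch uses pointer casts and `%016lx`.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: compare two 64-bit lanes given as raw bytes and,
   on mismatch, format a zero-padded hexadecimal report into MSG.
   Returns 1 on mismatch, 0 otherwise.  */
static int
check_lane64 (const void *ref, const void *res, char *msg, size_t len)
{
  uint64_t r, g;
  memcpy (&r, ref, sizeof r);   /* Sidesteps pointer-type mismatches.  */
  memcpy (&g, res, sizeof g);
  if (r == g)
    return 0;
  snprintf (msg, len, "expected 0x%016" PRIx64 ", got 0x%016" PRIx64, r, g);
  return 1;
}
```

Padding to 16 hex digits keeps every lane the same width in the log, so differing high bits are visible at a glance.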
---
  .../loongarch/vector/simd_correctness_check.h   | 13 +++--
  1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.target/loongarch/vector/simd_correctness_check.h 
b/gcc/testsuite/gcc.target/loongarch/vector/simd_correctness_check.h
index eb7fbd59cc7..551340bd51f 100644
--- a/gcc/testsuite/gcc.target/loongarch/vector/simd_correctness_check.h
+++ b/gcc/testsuite/gcc.target/loongarch/vector/simd_correctness_check.h
@@ -8,11 +8,12 @@
int fail = 0; \
for (size_t i = 0; i < sizeof (res) / sizeof (res[0]); ++i) \
  { \
-  long *temp_ref = &ref[i], *temp_res = &res[i]; \
+  long long *temp_ref = (long long *)&ref[i], \
+   *temp_res = (long long *)&res[i]; \
if (abs (*temp_ref - *temp_res) > 0) \
  { \
printf (" error: %s at line %ld , expected " #ref \
-  "[%ld]:0x%lx, got: 0x%lx\n", \
+  "[%ld]:0x%016lx, got: 0x%016lx\n", \
__FILE__, line, i, *temp_ref, *temp_res); \
fail = 1; \
  } \
@@ -28,11 +29,11 @@
int fail = 0; \
for (size_t i = 0; i < sizeof (res) / sizeof (res[0]); ++i) \
  { \
-  int *temp_ref = &ref[i], *temp_res = &res[i]; \
+  int *temp_ref = (int *)&ref[i], *temp_res = (int *)&res[i]; \
if (abs (*temp_ref - *temp_res) > 0) \
  { \
printf (" error: %s at line %ld , expected " #ref \
-  "[%ld]:0x%x, got: 0x%x\n", \
+  "[%ld]:0x%08x, got: 0x%08x\n", \
__FILE__, line, i, *temp_ref, *temp_res); \
fail = 1; \
  } \
@@ -47,8 +48,8 @@
  { \
if (ref != res) \
  { \
-  printf (" error: %s at line %ld , expected %d, got %d\n", __FILE__, \
-  line, ref, res); \
+  printf (" error: %s at line %ld , expected 0x:%016x", \
+ "got 0x:%016x\n", __FILE__, line, ref, res); \
  } \
  } \
while (0)

[Committed] RISC-V: Add dynamic LMUL test for x264

2023-12-21 Thread Juzhe-Zhong

While evaluating x264 performance, I noticed that the best LMUL for this
case with -march=rv64gcv is LMUL = 2.

LMUL = 1:

x264_pixel_8x8:
add a4,a1,a2
addi    a6,a0,16
vsetivli    zero,4,e8,mf4,ta,ma
add a5,a4,a2
vle8.v  v12,0(a6)
vle8.v  v2,0(a4)
addi    a6,a0,4
addi    a4,a4,4
vle8.v  v11,0(a6)
vle8.v  v9,0(a4)
addi    a6,a1,4
addi    a4,a0,32
vle8.v  v13,0(a0)
vle8.v  v1,0(a1)
vle8.v  v4,0(a6)
vle8.v  v8,0(a4)
vle8.v  v7,0(a5)
vwsubu.vv   v3,v13,v1
add a3,a5,a2
addi    a6,a0,20
addi    a4,a0,36
vle8.v  v10,0(a6)
vle8.v  v6,0(a4)
addi    a5,a5,4
vle8.v  v5,0(a5)
vsetvli zero,zero,e16,mf2,ta,mu
vmslt.vi    v0,v3,0
vneg.v  v3,v3,v0.t
vsetvli zero,zero,e8,mf4,ta,ma
vwsubu.vv   v1,v12,v2
vsetvli zero,zero,e16,mf2,ta,mu
vmslt.vi    v0,v1,0
vneg.v  v1,v1,v0.t
vmv1r.v v2,v1
vwadd.vv    v1,v3,v2
vsetvli zero,zero,e8,mf4,ta,ma
vwsubu.vv   v2,v11,v4
vsetvli zero,zero,e16,mf2,ta,mu
vmslt.vi    v0,v2,0
vneg.v  v2,v2,v0.t
vsetvli zero,zero,e8,mf4,ta,ma
vwsubu.vv   v3,v10,v9
vsetvli zero,zero,e16,mf2,ta,mu
vmv1r.v v4,v2
vmslt.vi    v0,v3,0
vneg.v  v3,v3,v0.t
vwadd.vv    v2,v4,v3
vsetvli zero,zero,e8,mf4,ta,ma
vwsubu.vv   v3,v8,v7
vsetvli zero,zero,e16,mf2,ta,mu
add a4,a3,a2
vmslt.vi    v0,v3,0
vneg.v  v3,v3,v0.t
vwadd.wv    v1,v1,v3
vsetvli zero,zero,e8,mf4,ta,ma
add a5,a4,a2
vwsubu.vv   v3,v6,v5
addi    a6,a0,48
vsetvli zero,zero,e16,mf2,ta,mu
vle8.v  v16,0(a3)
vle8.v  v12,0(a4)
addi    a3,a3,4
addi    a4,a4,4
vle8.v  v17,0(a6)
vle8.v  v14,0(a3)
vle8.v  v10,0(a4)
vle8.v  v8,0(a5)
add a6,a5,a2
addi    a3,a0,64
addi    a4,a0,80
addi    a5,a5,4
vle8.v  v13,0(a3)
vle8.v  v4,0(a5)
vle8.v  v9,0(a4)
vle8.v  v6,0(a6)
vmslt.vi    v0,v3,0
addi    a7,a0,52
vneg.v  v3,v3,v0.t
vle8.v  v15,0(a7)
vwadd.wv    v2,v2,v3
addi    a3,a0,68
addi    a4,a0,84
vle8.v  v11,0(a3)
vle8.v  v5,0(a4)
addi    a5,a0,96
vle8.v  v7,0(a5)
vsetvli zero,zero,e8,mf4,ta,ma
vwsubu.vv   v3,v17,v16
vsetvli zero,zero,e16,mf2,ta,mu
vmslt.vi    v0,v3,0
vneg.v  v3,v3,v0.t
vwadd.wv    v1,v1,v3
vsetvli zero,zero,e8,mf4,ta,ma
vwsubu.vv   v3,v15,v14
vsetvli zero,zero,e16,mf2,ta,mu
vmslt.vi    v0,v3,0
vneg.v  v3,v3,v0.t
vwadd.wv    v2,v2,v3
vsetvli zero,zero,e8,mf4,ta,ma
vwsubu.vv   v3,v13,v12
vsetvli zero,zero,e16,mf2,ta,mu
slli    a4,a2,3
vmslt.vi    v0,v3,0
vneg.v  v3,v3,v0.t
vwadd.wv    v1,v1,v3
vsetvli zero,zero,e8,mf4,ta,ma
sub a4,a4,a2
vwsubu.vv   v3,v11,v10
vsetvli zero,zero,e16,mf2,ta,mu
add a1,a1,a4
vmslt.vi    v0,v3,0
vneg.v  v3,v3,v0.t
vwadd.wv    v2,v2,v3
vsetvli zero,zero,e8,mf4,ta,ma
lbu a7,0(a1)
vwsubu.vv   v3,v9,v8
lbu a5,112(a0)
vsetvli zero,zero,e16,mf2,ta,mu
subw    a5,a5,a7
vmslt.vi    v0,v3,0
lbu a3,113(a0)
vneg.v  v3,v3,v0.t
lbu a4,1(a1)
vwadd.wv    v1,v1,v3
addi    a6,a6,4
vsetvli zero,zero,e8,mf4,ta,ma
subw    a3,a3,a4
vwsubu.vv   v3,v5,v4
addi    a2,a0,100
vsetvli zero,zero,e16,mf2,ta,mu
vle8.v  v4,0(a6)
sraiw   a6,a5,31
vle8.v  v5,0(a2)
sraiw   a7,a3,31
vmslt.vi    v0,v3,0
xor a2,a5,a6
vneg.v  v3,v3,v0.t
vwadd.wv    v2,v2,v3
vsetvli zero,zero,e8,mf4,ta,ma
lbu a4,114(a0)
vwsubu.vv   v3,v7,v6
lbu t1,2(a1)
vsetvli zero,zero,e16,mf2,ta,mu
subw    a2,a2,a6
xor a6,a3,a7
vmslt.vi    v0,v3,0
subw    a4,a4,t1
vneg.v  v3,v3,v0.t
lbu t1,3(a1)
vwadd.wv    v1,v1,v3
lbu a5,115(a0)
subw    a6,a6,a7
vsetvli zero,zero,e8,mf4,ta,ma
li  a7,0
vwsubu.vv   v3,v5,v4
sraiw   t3,a4,31
vsetvli zero,zero,e16,mf2,ta,mu
subw    a5,a5,t1
vmslt.vi    v0,v3,0
vneg.v  v3,v3,v0.t
vwadd.wv    v2,v2,v3
sraiw   t1,a5,31
vsetvli zero,zero,e32,m1,ta,ma

Re: [PATCH v7 5/5] OpenMP/OpenACC: Reorganise OMP map clause handling in gimplify.cc

2023-12-21 Thread Tobias Burnus


Hi Julian,

On 20.12.23 22:29, Julian Brown wrote:

Thanks for review! Here's a new version of the patch which hopefully
addresses this round of comments.


Thanks for the patch. LGTM now.

Tobias


On Tue, 19 Dec 2023 16:41:54 +0100
Tobias Burnus  wrote:


On 16.12.23 14:25, Julian Brown wrote:

--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -10107,6 +10114,20 @@ omp_segregate_mapping_groups (omp_mapping_group *inlist)
 ard_tail = &w->next;
 break;

+ case GOMP_MAP_PRESENT_ALLOC:
+   *pa_tail = w;
+   w->next = NULL;
+   pa_tail = &w->next;
+   break;
+
+ case GOMP_MAP_PRESENT_FROM:
+ case GOMP_MAP_PRESENT_TO:
+ case GOMP_MAP_PRESENT_TOFROM:
+   *ptf_tail = w;
+   w->next = NULL;
+   ptf_tail = &w->next;
+   break;
+

First, I note that GOMP_MAP_PRESENT_ALLOC and
GOMP_MAP_PRESENT_{FROM,TO,TOFROM} are semantically identical: If the
variable is not present, error termination will happen - otherwise, if
present, no data movement will happen. Hence, they will be changed to
GOMP_MAP_FORCE_PRESENT in gimplify_adjust_omp_clauses.

That's also the reason that the old code handled all of them
identically.

However, besides a plain 'present', there is also 'always' +
'present'. Those are different as after a normal 'present' check
(abort if not present), the data copying will happen:
GOMP_MAP_ALWAYS_PRESENT_TO, GOMP_MAP_ALWAYS_PRESENT_FROM,
GOMP_MAP_ALWAYS_PRESENT_TOFROM.

(Note that: always + present + alloc = GOMP_MAP_PRESENT_ALLOC (w/o
'always') as already done in the FE.)

Thus, all 'case' from your patch should go to a single group (possibly
adding a comment about it). The question is what to do with the
'present,always' case. I think leaving them under 'default:' is fine,
but I might have missed something.

I've made this change (i.e.: grouping all "GOMP_MAP_PRESENT_*" nodes
together), and in fact that restores the dump output for the
gfortran.dg/gomp/map-12.f90 that needed to be adjusted for the previous
version of the patch (so that hunk has now disappeared).
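
The grouping discussed above can be illustrated with a minimal, self-contained sketch. It is not the GCC implementation: MapKind, Group, Buckets and segregate are hypothetical stand-ins for gomp_map_kind, omp_mapping_group and omp_segregate_mapping_groups, and only the tail-pointer idiom is taken from the quoted hunk. All plain 'present' kinds go to one list, while 'always,present' kinds stay in the default list, as suggested in the review.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for a few gomp_map_kind values.
enum class MapKind {
  To, From,
  PresentAlloc, PresentTo, PresentFrom, PresentTofrom,
  AlwaysPresentTo
};

// Hypothetical, simplified mapping group: just a kind and a link.
struct Group {
  MapKind kind;
  Group *next;
};

struct Buckets {
  Group *present = nullptr;  // every plain 'present' kind
  Group *other = nullptr;    // everything else, incl. 'always,present'
};

// Split INLIST into two lists, preserving relative order, using the
// same "*tail = w; w->next = NULL; tail = &w->next" idiom as the hunk.
inline Buckets segregate (Group *inlist)
{
  Buckets out;
  Group **present_tail = &out.present;
  Group **other_tail = &out.other;

  for (Group *w = inlist, *next; w; w = next)
    {
      next = w->next;  // save the link before it is overwritten
      switch (w->kind)
        {
        // Semantically identical: error if absent, no data movement.
        case MapKind::PresentAlloc:
        case MapKind::PresentTo:
        case MapKind::PresentFrom:
        case MapKind::PresentTofrom:
          *present_tail = w;
          w->next = nullptr;
          present_tail = &w->next;
          break;

        default:
          *other_tail = w;
          w->next = nullptr;
          other_tail = &w->next;
          break;
        }
    }
  return out;
}
```

Because each list is appended through its own tail pointer, the relative order of groups within a list is the order in which they appeared in the input.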


   default:
 *tf_tail = w;
 w->next = NULL;
@@ -10118,8 +10139,10 @@ omp_segregate_mapping_groups (omp_mapping_group *inlist)

* * *

@@ -11922,119 +11945,30 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 break;
 }

-  if (code == OMP_TARGET
-  || code == OMP_TARGET_DATA
-  || code == OMP_TARGET_ENTER_DATA
-  || code == OMP_TARGET_EXIT_DATA)
-{
-  vec *groups;
-  groups = omp_gather_mapping_groups (list_p);
-  if (groups)
- {
-   hash_map *grpmap;
-   grpmap = omp_index_mapping_groups (groups);
+  vec *groups = omp_gather_mapping_groups (list_p);
+  hash_map *grpmap = NULL;
+  unsigned grpnum = 0;
+  tree *grp_start_p = NULL, grp_end = NULL_TREE;

...


-  else if (region_type & ORT_ACC)
-{

I wonder whether it would be better to call
'omp_gather_mapping_groups' only for the 'code == OMP_TARGET...' and
for ORT_ACC (or some subset of OACC *), given that this function is
also called by gimplify_omp_parallel, gimplify_omp_task,
gimplify_omp_for, ...

This avoids some memory allocation and list_p walking, i.e. it is not
too bad - but also not really needed for task, parallel, for, ...

I've made that change -- OpenACC uses OMP_CLAUSE_MAP in quite a wide
range of directives, but the new version of the patch lists them
individually anyway, rather than using a catch-all for ORT_ACC regions.
That seems OK, I think.
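
The idea of gathering mapping groups only for constructs that can carry map clauses can be sketched as a simple predicate. This is a hedged illustration only: the Construct enum and the wants_mapping_groups helper are hypothetical, and the set of OpenACC constructs listed is illustrative rather than the list used in the patch.

```cpp
#include <cassert>

// Hypothetical stand-ins for the tree codes seen by
// gimplify_scan_omp_clauses; not GCC's real enumeration.
enum class Construct {
  OmpParallel, OmpTask, OmpFor,
  OmpTarget, OmpTargetData, OmpTargetEnterData, OmpTargetExitData,
  OaccData, OaccEnterData, OaccExitData, OaccHostData
};

// Return true only for constructs listed individually below, so the
// caller can skip the allocation and clause-list walk for the rest.
inline bool wants_mapping_groups (Construct code)
{
  switch (code)
    {
    case Construct::OmpTarget:
    case Construct::OmpTargetData:
    case Construct::OmpTargetEnterData:
    case Construct::OmpTargetExitData:
    case Construct::OaccData:
    case Construct::OaccEnterData:
    case Construct::OaccExitData:
    case Construct::OaccHostData:
      return true;
    default:
      return false;
    }
}
```

A caller would test the predicate once and leave the group vector null for parallel, task and loop constructs.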


@@ -14008,26 +13926,73 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p,
 default:
 break;
   }
-   if (code == OMP_TARGET_EXIT_DATA
-   && OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ALWAYS_POINTER)
+   switch (code)
   {
+ case OMP_TARGET:
+   break;
+ case OACC_DATA:
+   if (TREE_CODE (TREE_TYPE (decl)) != ARRAY_TYPE)
+ break;
+   goto check_firstprivate;
+ case OACC_ENTER_DATA:
+ case OACC_EXIT_DATA:
+ case OMP_TARGET_DATA:
+ case OMP_TARGET_ENTER_DATA:
+ case OMP_TARGET_EXIT_DATA:
+ case OACC_HOST_DATA:
+ check_firstprivate:
+   if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER

I think it looks nicer if OACC_HOST_DATA comes before OMP_* such that all
OACC_* are together. (In the old code, oacc_enter/exit was treated
differently than OMP_* and OACC_HOST_DATA; your order is a leftover
from that code movement/change.)

I've fixed this bit -- which actually doesn't need the goto any more
either, so that's now a fallthrough instead.
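
The shape of that change, a fallthrough replacing the goto once the OACC_* cases are adjacent, can be sketched as follows. This is an assumption-laden illustration, not the revised patch: the Code enum and the needs_firstprivate_check helper are hypothetical, and decl_is_array stands in for the TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE test.

```cpp
#include <cassert>

// Hypothetical stand-ins for the tree codes in the quoted switch.
enum class Code {
  OmpTarget,
  OaccData, OaccEnterData, OaccExitData, OaccHostData,
  OmpTargetData, OmpTargetEnterData, OmpTargetExitData
};

// Return true when the firstprivate-pointer check should run.
inline bool needs_firstprivate_check (Code code, bool decl_is_array)
{
  switch (code)
    {
    case Code::OmpTarget:
      return false;
    case Code::OaccData:
      if (!decl_is_array)
        return false;
      /* FALLTHRU */
    case Code::OaccEnterData:
    case Code::OaccExitData:
    case Code::OaccHostData:
    case Code::OmpTargetData:
    case Code::OmpTargetEnterData:
    case Code::OmpTargetExitData:
      return true;
    }
  return false;
}
```

Placing the OaccData case directly above the shared group is what makes the label and the goto unnecessary.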


+ flags = GOVD_MAP | GOVD_EXPLICIT;
+ if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ALWAYS_TO
+ || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ALWAYS_TOFROM)
+   flags |= GOVD_MAP_ALWAYS_TO;

I know that the code has only been moved, but I wonder whether that
should also include GOMP_MAP_ALWAYS_PRESENT_{TO,TOFROM} as condition.

I've added it (caveat: without any test

[gcc-wwwdocs PATCH v2] gcc-13/14: Mention recent update for x86_64 backend

2023-12-21 Thread Haochen Jiang

Hi all,

This is the v2 patch for the wwwdocs change, updated to address the review comments.

If there is no objection, I will push this change next Tuesday.

Changes in v2:

  - Remove RAO-INT from Grand Ridge
  - Remove the mask register restriction for -mno-evex512
  - Arrange the options alphabetically
  - Other minor text changes

Thx,
Haochen

Messages in v1:

This patch will mention the following changes in wwwdocs for x86_64 backend:

  - AVX10.1 support
  - APX EGPR, PUSH2POP2, PPX and NDD support
  - Xeon Phi ISAs deprecated

I also adjusted the wording of the x86_64 part for GCC 13.

---
Mention AVX10.1 support, APX support and Xeon Phi deprecation in GCC 14.
Also adjust documentation in GCC 13.
---
 htdocs/gcc-13/changes.html | 38 --
 htdocs/gcc-14/changes.html | 27 ++-
 2 files changed, 42 insertions(+), 23 deletions(-)

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index d3bacc16..b4b1a39a 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -543,24 +543,28 @@ You may also want to check out our
   __bf16 type to x86 psABI. Users need to adjust their
   AVX512BF16-related source code when upgrading GCC12 to GCC13.
   
-  New ISA extension support for Intel AVX-IFMA was added.
-  AVX-IFMA intrinsics are available via the -mavxifma
+  New ISA extension support for Intel AMX-COMPLEX was added.
+  AMX-COMPLEX intrinsics are available via the -mamx-complex
   compiler switch.
   
-  New ISA extension support for Intel AVX-VNNI-INT8 was added.
-  AVX-VNNI-INT8 intrinsics are available via the -mavxvnniint8
+  New ISA extension support for Intel AMX-FP16 was added.
+  AMX-FP16 intrinsics are available via the -mamx-fp16
+  compiler switch.
+  
+  New ISA extension support for Intel AVX-IFMA was added.
+  AVX-IFMA intrinsics are available via the -mavxifma
   compiler switch.
   
   New ISA extension support for Intel AVX-NE-CONVERT was added.
   AVX-NE-CONVERT intrinsics are available via the
   -mavxneconvert compiler switch.
   
-  New ISA extension support for Intel CMPccXADD was added.
-  CMPccXADD intrinsics are available via the -mcmpccxadd
+  New ISA extension support for Intel AVX-VNNI-INT8 was added.
+  AVX-VNNI-INT8 intrinsics are available via the -mavxvnniint8
   compiler switch.
   
-  New ISA extension support for Intel AMX-FP16 was added.
-  AMX-FP16 intrinsics are available via the -mamx-fp16
+  New ISA extension support for Intel CMPccXADD was added.
+  CMPccXADD intrinsics are available via the -mcmpccxadd
   compiler switch.
   
   New ISA extension support for Intel PREFETCHI was added.
@@ -571,10 +575,6 @@ You may also want to check out our
   RAO-INT intrinsics are available via the -mraoint
   compiler switch.
   
-  New ISA extension support for Intel AMX-COMPLEX was added.
-  AMX-COMPLEX intrinsics are available via the -mamx-complex
-  compiler switch.
-  
   GCC now supports the Intel CPU named Raptor Lake through
 -march=raptorlake.
 Raptor Lake is based on Alder Lake.
@@ -585,13 +585,13 @@ You may also want to check out our
   
   GCC now supports the Intel CPU named Sierra Forest through
 -march=sierraforest.
-The switch enables the AVX-IFMA, AVX-VNNI-INT8, AVX-NE-CONVERT, CMPccXADD,
-ENQCMD and UINTR ISA extensions.
+Based on ISA extensions enabled on Alder Lake, the switch further enables
+the AVX-IFMA, AVX-NE-CONVERT, AVX-VNNI-INT8, CMPccXADD, ENQCMD and UINTR
+ISA extensions.
   
   GCC now supports the Intel CPU named Grand Ridge through
 -march=grandridge.
-The switch enables the AVX-IFMA, AVX-VNNI-INT8, AVX-NE-CONVERT, CMPccXADD,
-ENQCMD, UINTR and RAO-INT ISA extensions.
+Grand Ridge is based on Sierra Forest.
   
   GCC now supports the Intel CPU named Emerald Rapids through
 -march=emeraldrapids.
@@ -599,11 +599,13 @@ You may also want to check out our
   
   GCC now supports the Intel CPU named Granite Rapids through
 -march=graniterapids.
-The switch enables the AMX-FP16 and PREFETCHI ISA extensions.
+Based on Sapphire Rapids, the switch further enables the AMX-FP16 and
+PREFETCHI ISA extensions.
   
   GCC now supports the Intel CPU named Granite Rapids D through
 -march=graniterapids-d.
-The switch enables the AMX-FP16, PREFETCHI and AMX-COMPLEX ISA extensions.
+Based on Granite Rapids, the switch further enables the AMX-COMPLEX ISA
+extensions.
   
   GCC now supports AMD CPUs based on the znver4 core
 via -march=znver4.  The switch makes GCC consider
diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index 24e6409a..4b83037a 100644
--- a/htdocs/gcc-14/changes.html
+++ b/htdocs/gcc-14/changes.html
@@ -320,8 +320,18 @@ a work-in-progress.
 IA-32/x86-64
 
   New compiler option -m[no-]evex512 was added.
-  The compiler switch enables/disables 512 bit vector and 64 bit mask
-  register. It will
